# 4.1 Strings: Text in Python

> 📘 **Definition**  
> **Strings** are sequences of characters used to store text.  
> In scientific computing, they are essential for reading data files, labeling plot axes, and formatting output.

---

## 1. Creating Strings
- **Syntax:** Enclosed in single `'...'` or double `"..."` quotes
- **Rule:** The string must start and end with the **same** type of quote

> 💡 **The Apostrophe Trick**  
> If your text contains an apostrophe (e.g., *dog's*), use **double quotes** outside.
>
> ```python
> a = "My dog's name"
> ```

---

## 2. Operator Overloading (String Math)
Python reuses mathematical operators for strings, but with **different meanings**.

| Operator | Function | Example | Result |
| :--- | :--- | :--- | :--- |
| `+` | Concatenation | `"My " + "Dog"` | `"My Dog"` |
| `*` | Repetition | `"Susie" * 3` | `"SusieSusieSusie"` |
| `-` | Invalid | `"A" - "B"` | ❌ `TypeError` |
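
The three rows of the table can be checked directly. This is a minimal sketch; the variable names (`greeting`, `echo`, `subtraction_error`) are illustrative and not part of the original notes.

```python
# Concatenation and repetition build new strings; subtraction is undefined.
greeting = "My " + "Dog"        # "+" joins two strings
echo = "Susie" * 3              # "*" repeats a string

try:
    "A" - "B"                   # "-" has no meaning for strings
    subtraction_error = None
except TypeError as err:
    subtraction_error = type(err).__name__

print(greeting)
print(echo)
print(subtraction_error)
```

Catching the exception here is only for demonstration; in normal code the `TypeError` usually signals a value that should have been converted first.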

---

> ⚠️ **The “Number as String” Trap**  
> A number inside quotes is **text**, not numeric data.
>
> ```python
> d = "927"   # string
> e = 927     # integer
> ```
>
> - `d + e` → ❌ **Error** (string + int not allowed)  
> - `int(d) + e` → `1854` ✅ (convert string → int)  
> - `d + str(e)` → `"927927"` ✅ (convert int → string)
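
This trap usually shows up when numbers are read from a text file. The sketch below assumes a hypothetical line of text, `raw_line`, standing in for one line of a data file; it is not taken from the original notes.

```python
# Hypothetical line of text, as it might arrive from a data file.
raw_line = "927 3.5"

count_text, value_text = raw_line.split()   # both pieces are still strings
count = int(count_text)                     # "927" -> 927
value = float(value_text)                   # "3.5" -> 3.5

total = count + value                       # numeric addition now works
label = count_text + str(count)             # string concatenation

print(total)
print(label)
```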

---

## 3. Unicode & Special Characters
Python 3 strings are **Unicode** text (source files are UTF-8 by default), so Greek letters and scientific symbols can be used directly.

| Function | Code | Output | Description |
| :--- | :--- | :--- | :--- |
| `ord()` | `ord("a")` | `97` | Numeric code of character |
| `chr()` | `chr(945)` | `α` | Character from code |
| Unicode | `"\u03b1"` | `α` | Hex Unicode literal |

- **`ord()`** → ordinal value  
- **`chr()`** → character from ordinal  
- **Unicode literals** → useful for scientific symbols

> 💡 **Scientific Tip**  
> Unicode makes Python ideal for readable scientific code:
> ```python
> alpha = "\u03b1"
> omega = "\u03A9"
> ```
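
A short sketch of how `ord()`, `chr()`, and a Unicode escape relate to each other. The axis label text is a hypothetical example of where such a symbol might be used.

```python
# Round trip between a character and its code point.
code_point = ord("a")           # character -> integer code
alpha = chr(945)                # integer code -> character (Greek alpha)
same_alpha = "\u03b1"           # the same character written as a hex escape

# Hypothetical axis label that mixes plain text and a Greek symbol.
axis_label = f"Absorption coefficient {alpha} (1/m)"

print(code_point, alpha, alpha == same_alpha)
print(axis_label)
```

Decimal 945 and hexadecimal `03b1` are the same code point, which is why the two spellings compare equal.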

---


In [71]:
a = "My name is"
b = 'Shaurya Shukla'
c = a + ' ' + b      # join the two strings with a space
print(c)

My name is Shaurya Shukla


In [72]:
d = "927"
e = 927 
d + str(e)

'927927'

In [73]:
chr(945)

'α'

In [74]:
print(ord("a"))

ord("z")

97


122

# 4.2 Lists & Tuples in Python

> 📘 **Definition**  
> **Lists** and **Tuples** are sequence data structures used to store collections of items.
>
> - **Lists:** Mutable sequences defined by square brackets `[...]`
> - **Tuples:** Immutable sequences defined by parentheses `(...)`

---

## 1. Creating & Indexing
- **Syntax:** Lists use `[ ]`, Tuples use `( )`
- **Zero-Indexing:** Python starts counting at **0** (unlike MATLAB/Fortran)

> 💡 **The Negative Index Trick**  
> You can access elements from the **end** of a list using negative indices.
>
> - `b[-1]` → last element  
> - `b[-2]` → second-last element
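
A minimal sketch of zero-based and negative indexing. The list `readings` holds made-up sample values and is only an illustration.

```python
readings = [10.5, 11.2, 9.8, 12.1]   # hypothetical sample data

first = readings[0]                  # zero-based: index 0 is the first element
last = readings[-1]                  # -1 counts back from the end
second_last = readings[-2]
same_as_last = readings[len(readings) - 1]   # equivalent positive index

print(first, last, second_last)
```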

---

## 2. Modifying Lists (Mutability)
Lists are **mutable**, meaning individual elements can be changed after creation.

| Operation | Code | Result |
| :--- | :--- | :--- |
| Access | `b[0]` | Get first element |
| Modify | `b[0] = 7` | Change first element |
| Replace | `b[1] = "New"` | Replace element at index 1 |

---

## 3. Slicing Lists
You can extract subsets of a list using the colon operator `:`. The examples in the table assume `b = [5.0, 'girl', 2+0j, 3.14, 21]`.

- **Syntax:** `list[start : stop : step]`
- **Rule:** The `stop` index is **excluded**

| Slice | Meaning | Example Result |
| :--- | :--- | :--- |
| `b[1:4]` | Index 1 up to (not including) 4 | `['girl', 2+0j, 3.14]` |
| `b[:3]` | Start up to index 3 | `[5.0, 'girl', 2+0j]` |
| `b[2:]` | Index 2 to end | `[2+0j, 3.14, 21]` |
| `b[::2]` | Every 2nd element | `[5.0, 2+0j, 21]` |
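
The slices in the table can be reproduced as follows. The contents of `b` are an assumption inferred from the results shown in the table; the reversed slice is an extra illustration of a negative step.

```python
b = [5.0, "girl", 2+0j, 3.14, 21]   # assumed contents, inferred from the table

middle = b[1:4]        # indices 1, 2, 3 (index 4 is excluded)
head = b[:3]           # indices 0, 1, 2
tail = b[2:]           # index 2 through the end
every_other = b[::2]   # indices 0, 2, 4
reversed_b = b[::-1]   # a negative step walks backwards

print(middle)
print(every_other)
```

A list slice is a new list, so changing `middle` afterwards leaves `b` untouched.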

---

> ⚠️ **The “Append vs Plus” Trap**  
> When adding elements to lists, `.append()` and `+=` behave differently.
>
> ```python
> g = [1, 2, 3]
> ```
> - `g.append([4, 5])` → `[1, 2, 3, [4, 5]]` (**nested list**)  
> - `g += [4, 5]` → `[1, 2, 3, 4, 5]` (**extended list**)
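
The difference is easiest to see side by side. In this sketch each operation starts from a fresh copy of `[1, 2, 3]`; the names `g_append`, `g_plus`, and `g_extend` are illustrative.

```python
g_append = [1, 2, 3]
g_append.append([4, 5])     # adds ONE element, which happens to be a list

g_plus = [1, 2, 3]
g_plus += [4, 5]            # adds each element of the right-hand list

g_extend = [1, 2, 3]
g_extend.extend([4, 5])     # extend() is the method form of +=

print(g_append, len(g_append))
print(g_plus, len(g_plus))
```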

---

## 4. Multidimensional Lists
Lists can contain other lists (nested lists), useful for representing matrices.

- **Accessing elements:** `a[row][col]`

```python
a = [[3, 9], [8, 5]]
a[0]      # [3, 9]
a[1][0]   # 8
```


## 5. Tuples (Immutable Sequences)

> 📘 **Definition**  
> A **tuple** is an **ordered, immutable** sequence in Python.  
> Once created, its elements **cannot be changed**.

---

### Creating Tuples
Tuples are defined using **parentheses `()`**.

```python
t = (1, 2, 3)
```


In [75]:
# --- Part 1: Managing Components (Slicing) ---
# Scenario: You have a list of hydrocarbons sorted by carbon number.

components = ["Methane", "Ethane", "Propane", "Butane", "Pentane"]

# 1. We only need the C2, C3, and C4 components (Ethane to Butane).
print(f"C2-C4 cut {components[1:4]}")

C2-C4 cut ['Ethane', 'Propane', 'Butane']


In [76]:
# 2. We need the last component in the list (Heaviest).
print(f"heaviest: {components[-1]}")

heaviest: Pentane


In [77]:
# --- Part 2: Updating Process Conditions (Mutability) ---
# Scenario: A list of tank pressures (bar).

pressures = [1.0, 1.2, 0.9, 1.5]
# 1. The sensor for the third tank (Index 2) was recalibrated.
#    It should read 1.1 instead of 0.9.

pressures[2] = 1.1
print(f"updated pressures {pressures}")

updated pressures [1.0, 1.2, 1.1, 1.5]


In [78]:
# 2. SAFETY CHECK (Tuples):
#    Universal Gas Constant (R) should never change.

R_const = (8.314, "J/mol.K")
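
Storing the constant in a tuple means an accidental assignment fails loudly instead of silently changing the value. A minimal sketch of that behavior, reusing `R_const` from the cell above (the names `value`, `units`, and `outcome` are illustrative):

```python
R_const = (8.314, "J/mol.K")

value, units = R_const          # tuple unpacking
try:
    R_const[0] = 8.0            # item assignment is not allowed on tuples
    outcome = "modified"
except TypeError:
    outcome = "TypeError"

print(value, units, outcome)
```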

In [79]:
# --- Part 3: Multi-Component Streams (Nested Lists) ---
# Scenario: A table of [Flow_Rate (kg/hr), Temperature (C)] for 2 streams.
# Stream 0: Feed
# Stream 1: Product

streams = [
    [500.0, 25.0],
    [480.0, 60.0]
]
# 1. We need the Temperature of the Product stream (Stream 1). 
#    Action: Access row 1, then column 1.
product_temp = streams[1][1]
print(f"Temperature of product stream is: {product_temp}")


Temperature of product stream is: 60.0


In [80]:
# 2. The Feed flow rate (Stream 0) increases to 550.0.
#    Action: Update row 0, column 0.
streams[0][0] = 550.0
print(f"New Feed Flow Rate: {streams}")

New Feed Flow Rate: [[550.0, 25.0], [480.0, 60.0]]


> 🧪 **Practice Challenge: Lists Mastery**
>
> This challenge tests your understanding of **slicing**, **mutability**, and **list methods** in Python, using a **chemical engineering scenario**.

---

### 🧫 The *“Dirty Sensor”* Challenge

#### **Scenario**
You are logging temperature data from a **batch reactor**.  

Two quirks of the log:
- The sensor always records **two dummy zeros** during startup  
- The reaction sometimes finishes early, so you must **manually add a final "Cooling" tag** to the log

---

### 📊 Your Data
```python
temp_log = [0.0, 0.0, 65.5, 68.2, 70.1, 72.4]
```

### 🎯 Your Mission

Write a Python script that performs the following three steps in order:

1. **Clean:** Create a new list called `process_data` by slicing off the first two dummy zeros.
2. **Update:** The last temperature reading (`72.4`) is a sensor glitch. Overwrite it with `72.0` in your new list.
3. **Tag:** Append the string `"COOLING_STARTED"` to the end of the `process_data` list.

### ✅ Goal Output

After completing all three steps, your list should look exactly like this:

`[65.5, 68.2, 70.1, 72.0, 'COOLING_STARTED']`


In [81]:
temp_log = [0.0, 0.0, 65.5, 68.2, 70.1, 72.4]
process_data = temp_log[2:]
print(process_data)

process_data[-1] = 72.0
print(process_data)

process_data.append("COOLING_STARTED")
print(process_data)

[65.5, 68.2, 70.1, 72.4]
[65.5, 68.2, 70.1, 72.0]
[65.5, 68.2, 70.1, 72.0, 'COOLING_STARTED']


In [82]:
temp_log = temp_log + [2.0]   # build a new list with 2.0 added at the end
temp_log

[0.0, 0.0, 65.5, 68.2, 70.1, 72.4, 2.0]

# 4.4 NumPy Arrays: The Engine of Scientific Computing

<div style="background-color: #e7f3fe; border-left: 6px solid #2196F3; padding: 15px; margin-bottom: 20px;">
<strong>üìò Definition:</strong> The <strong>NumPy Array</strong> (<code>ndarray</code>) is a grid of values that are all of the <strong>same type</strong> (homogeneous). It is the standard for storing numerical data in science because it is significantly faster and more memory-efficient than Python Lists.
</div>

## 1. Creating Arrays
You must import the library first: `import numpy as np`

| Function | Syntax | Description | Example |
| :--- | :--- | :--- | :--- |
| **`np.array()`** | `np.array(list)` | Converts a standard list to an array. | `np.array([1, 2, 3])` |
| **`np.linspace()`** | `(start, stop, N)` | Generates **N** evenly spaced points (Inclusive). | `np.linspace(0, 10, 50)` |
| **`np.logspace()`** | `(start, stop, N)` | Logarithmic spacing from $10^{start}$ to $10^{stop}$. | `np.logspace(1, 3, 5)` |
| **`np.arange()`** | `(start, stop, step)` | Step-based generation (Stop is **excluded**). | `np.arange(0, 10, 0.5)` |
| **`np.zeros()`** | `(shape)` | Creates an array filled with zeros (float by default); `shape` is an int or a tuple. | `np.zeros(5)` |
| **`np.ones()`** | `(shape)` | Creates an array filled with ones. | `np.ones((2, 2))` |

<div style="background-color: #e7f3fe; padding: 10px; border-radius: 5px;">
<strong>üí° Linspace vs. Arange:</strong><br>
<ul>
    <li>Use <strong>linspace</strong> when you know <em>how many points</em> you want (e.g., for plotting).</li>
    <li>Use <strong>arange</strong> when you know the <em>step size</em> (e.g., time steps).</li>
</ul>
</div>
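
A small sketch of the creation functions from the table, using deliberately short arrays so the endpoint behavior is easy to see. The variable names are illustrative.

```python
import numpy as np

pts = np.linspace(0, 10, 5)      # 5 points, both endpoints included
steps = np.arange(0, 10, 2.5)    # step of 2.5, stop value excluded
decades = np.logspace(1, 3, 3)   # 10**1, 10**2, 10**3
zeros = np.zeros(3)              # float64 by default
ones = np.ones((2, 2))           # shape given as a tuple

print(pts)
print(steps)
print(decades)
```

Note that `pts` ends at 10 while `steps` stops one step short of it, which is the practical difference between the two functions.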

---

## 2. Vectorization (Math without Loops)
NumPy allows you to perform mathematical operations on entire arrays at once. This is called **Vectorization**.

| Operation | Standard Math | NumPy Syntax | Result (if `a=[1, 2, 3]`) |
| :--- | :--- | :--- | :--- |
| **Scalar Math** | $y = x + 2$ | `a + 2` | `[3, 4, 5]` |
| **Powers** | $y = x^2$ | `a ** 2` | `[1, 4, 9]` |
| **Functions** | $y = \sin(x)$ | `np.sin(a)` | `[0.84, 0.91, 0.14]` |
| **Array Math** | $z = x + y$ | `a + b` | Adds element-by-element |
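
The rows of the table can be reproduced with a few lines. This sketch also includes an equivalent list comprehension for comparison; the second array `b` is a made-up example.

```python
import numpy as np

a = np.array([1, 2, 3])

shifted = a + 2            # the scalar is applied to every element
squared = a ** 2
sines = np.sin(a)          # NumPy functions also work element-wise

# Equivalent pure-Python loop, shown only for comparison.
squared_loop = [x ** 2 for x in [1, 2, 3]]

b = np.array([10, 20, 30])   # hypothetical second array
summed = a + b               # element-by-element addition

print(shifted, squared, summed)
print(np.round(sines, 2))
```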



---

## 3. Slicing & Addressing
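Arrays use the same `[start:stop:step]` syntax as Python lists. One difference: slicing an array returns a **view** of the original data, not a copy, so changing the slice changes the original array.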

* **1D Array:** `a[1:5]`
* **2D Array (Matrices):** `matrix[row, col]`

<div style="background-color: #e6fffa; border-left: 6px solid #009688; padding: 15px; margin-bottom: 20px;">
<strong>✅ Best Practice for Matrices:</strong><br>
Although <code>a[row][col]</code> works, always use the comma syntax <code>a[row, col]</code>.<br>
It is faster and clearer.
</div>
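
A minimal sketch of 2D addressing with the comma syntax. The arrays `m` and `a` are made-up examples; the last three lines illustrate that a NumPy slice is a view of the same data, unlike a list slice.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

element = m[1, 2]        # row 1, column 2
same = m[1][2]           # works, but indexes twice
first_col = m[:, 0]      # every row, column 0
top_row = m[0, :]        # row 0, every column

# A slice of an array is a view, so writing to it changes the original.
a = np.arange(5)         # array([0, 1, 2, 3, 4])
view = a[1:3]
view[0] = 99             # this also changes a[1]

print(element, first_col, top_row)
print(a)
```

When an independent copy is needed, calling `.copy()` on the slice avoids this coupling.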

---

## 4. Boolean Indexing (Filtering)
A powerful feature to filter data based on conditions.

* **Syntax:** `array[condition]`
* **Example:** Get all values greater than 1.
    ```python
    b = np.array([0.5, 2.0, 0.1, 5.0])
    high_vals = b[b > 1]
    # Result: [2.0, 5.0]
    ```

You can also use this to **replace** data (e.g., removing noise):
```python
b[b < 0] = 0  # Replaces all negative numbers with 0
```



## 5. Multidimensional Arrays (Matrices)
Created using nested lists or `.reshape()`.

* **Create:** `m = np.array([[1, 2], [3, 4]])`
* **Reshape:** `a.reshape(rows, cols)` turns a flat array into a matrix.
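
A short sketch of `.reshape()` on a flat array. The use of `-1` to let NumPy infer one dimension is an addition not mentioned in the notes above.

```python
import numpy as np

flat = np.arange(6)              # array([0, 1, 2, 3, 4, 5])
matrix = flat.reshape(2, 3)      # 2 rows, 3 columns
auto = flat.reshape(3, -1)       # -1 lets NumPy infer the column count

print(matrix)
print(auto.shape)
```

The total number of elements must stay the same, so a 6-element array can become 2×3 or 3×2 but not 4×2.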



### Matrix Math vs. Array Math
<div style="background-color: #ffcccc; border-left: 6px solid #f44336; padding: 15px; margin-bottom: 20px;">
<strong>⚠️ The "Dot Product" Trap</strong><br>
For NumPy arrays, the asterisk <code>*</code> always means <strong>element-by-element</strong> multiplication.<br>
<br>
<ul>
    <li><code>A * B</code> → Multiplies matching items (not linear algebra).</li>
    <li><code>np.dot(A, B)</code> or <code>A @ B</code> → True matrix multiplication (rows × columns).</li>
</ul>
</div>
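
The difference is clearer with matrices that are not diagonal. This sketch uses two made-up 2×2 matrices and also shows the `@` operator, which gives the same result as `np.dot` for 2D arrays.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

elementwise = A * B          # multiplies matching entries
matrix_dot = np.dot(A, B)    # rows of A times columns of B
matrix_at = A @ B            # same result as np.dot for 2-D arrays

print(elementwise)
print(matrix_dot)
```

Here `A * B` keeps only the off-diagonal entries of `A`, while the matrix product swaps the columns of `A`.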

---

## 6. Summary: Lists vs. NumPy Arrays

| Feature | Python List `[]` | NumPy Array `np.array()` |
| :--- | :--- | :--- |
| **Data Types** | Mixed (Int, String, Object) | **Homogeneous** (Must be same type) |
| **Speed** | Slow (requires loops) | **Fast** (Vectorized C-code) |
| **Math** | Concatenation (`+` joins lists) | Addition (`+` adds numbers) |
| **Storage** | Scattered memory | Contiguous memory block |
| **Resizing** | Cheap (Append is easy) | Expensive (Rebuilds entire array) |

> 💡 **Pro Tip:**
> If you are building a list item-by-item (looping), use a **List** first.
> Once the data is collected, convert it to an **Array** for analysis.
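
A minimal sketch of that collect-then-convert pattern. The loop body stands in for a real measurement; the formula inside it is a made-up placeholder.

```python
import numpy as np

readings = []                              # start with an empty list
for step in range(5):
    readings.append(20.0 + 1.5 * step)     # hypothetical measurement

data = np.array(readings)                  # convert once, after collection
mean_value = data.mean()

print(data)
print(mean_value)
```

Appending to a list is cheap, whereas growing an array element by element would rebuild it on every step.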

In [83]:
# --- Part 1: Creating Arrays (The 3 Best Ways) ---
# Scenario: Setting up a time vector for a reaction simulation (0 to 10 sec)

import numpy as np

# Method A: Using a List (Old way)
time_list = [0, 1, 2, 3, 4, 5]
arr_from_list = np.array(time_list)
arr_from_list

array([0, 1, 2, 3, 4, 5])

In [84]:
# Method B: arange (Start, Stop, Step) - Good for time steps
# Note: Stop (10.1) is excluded, so this goes up to 10.0

time_step = np.arange(0,10.1,0.5)
time_step

array([ 0. ,  0.5,  1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5,  5. ,
        5.5,  6. ,  6.5,  7. ,  7.5,  8. ,  8.5,  9. ,  9.5, 10. ])

In [85]:
# Method C: linspace (Start, Stop, Number of Points) - Good for plotting
# We want exactly 20 data points between 0 and 10.
time_smooth = np.linspace(0, 10, 20)
time_smooth

array([ 0.        ,  0.52631579,  1.05263158,  1.57894737,  2.10526316,
        2.63157895,  3.15789474,  3.68421053,  4.21052632,  4.73684211,
        5.26315789,  5.78947368,  6.31578947,  6.84210526,  7.36842105,
        7.89473684,  8.42105263,  8.94736842,  9.47368421, 10.        ])

In [86]:
print(f"Time Step Array ({len(time_step)} pts):\n {time_step}")
print(f"Smooth Time Array ({len(time_smooth)} pts):\n {time_smooth}")
print("-" * 30)

Time Step Array (21 pts):
 [ 0.   0.5  1.   1.5  2.   2.5  3.   3.5  4.   4.5  5.   5.5  6.   6.5
  7.   7.5  8.   8.5  9.   9.5 10. ]
Smooth Time Array (20 pts):
 [ 0.          0.52631579  1.05263158  1.57894737  2.10526316  2.63157895
  3.15789474  3.68421053  4.21052632  4.73684211  5.26315789  5.78947368
  6.31578947  6.84210526  7.36842105  7.89473684  8.42105263  8.94736842
  9.47368421 10.        ]
------------------------------


In [87]:
# --- Part 2: Vectorization (Math without Loops) ---
# Scenario: Calculate Concentration C = C0 * e^(-k * t)
# We calculate 20 concentrations in ONE line of code.

C0 = 100.0  # Initial concentration
k = 0.5     # Rate constant

#The calculation applies to every number in 'time_smooth' at once 
cocn_profile = C0 * np.exp((-k * time_smooth))

print(f"Concentration profile:\n {cocn_profile}")
print(f"Final Concentration:\n {cocn_profile[-1]:.2f}")
print("-"*30)
                           

Concentration profile:
 [100.          76.86205266  59.07775139  45.40837238  34.90180709
  26.82624535  20.61920283  15.84834253  12.18136138   9.3628444
   7.19647439   5.53135794   4.25151525   3.26780189   2.51169961
   1.93054388   1.48385565   1.14052191   0.87662855   0.6737947 ]
Final Concentration:
 0.67
------------------------------


In [88]:
# --- Part 3: Boolean Filtering (The Safety System) ---
# Scenario: Find all concentration values that dropped below 10.0

# 1. Create a boolean mask (True/False)
low_cocn_mask = cocn_profile < 10.0

# 2. Apply the mask to get the actual values
safe_values = cocn_profile[low_cocn_mask]

print(f"Values below 10.0\n {safe_values}")
print(f"Number of values below target: {len(safe_values)}")
print("-"*30)

Values below 10.0
 [9.3628444  7.19647439 5.53135794 4.25151525 3.26780189 2.51169961
 1.93054388 1.48385565 1.14052191 0.87662855 0.6737947 ]
Number of values below target: 11
------------------------------


In [89]:
# --- Part 4: The 'Dot Product' Trap (Matrices) ---
# Scenario: Stoichiometry Calculation
# Matrix A (2x2) and Vector B (2x1)

A = np.array([[2,0],
             [0,2]]) #Diagonal Matrix
B = np.array([1,1])

# Wrong Way: Element-wise multiplication (B is broadcast across each row of A)
print(f"Wrong (A*B):\n {(A*B)}")
# Result: [[2, 0], [0, 2]] (multiplies matching entries, not linear algebra)

# Right Way: Matrix Multiplication (Rows X Columns)
print(f"Right (np.dot): {np.dot(A,B)}")

# Result: [2, 2] (Correct Linear Algebra)

Wrong (A*B):
 [[2 0]
 [0 2]]
Right (np.dot): [2 2]


## 🧪 Solo Challenge: The Arrhenius Equation

Now it’s time to test yourself!

### 🔍 Problem
Write a Python script using **NumPy** (**no for-loops allowed**) to calculate the reaction rate constant using the Arrhenius equation:

**k = A · exp(−Ea / (R · T))**

### 📊 Given Data
**Constants:**
- A = 1 × 10⁵  
- Ea = 50,000 J/mol  
- R = 8.314 J/(mol·K)

**Temperature Range:**
- Create a NumPy array of temperatures from **300 K to 500 K**
- Use **50 evenly spaced points**

### 🧮 Tasks
1. Create the temperature array using NumPy  
2. Calculate the corresponding array of rate constants `k_values`  
3. Find and print the **maximum rate constant**

🔹 *Try to solve it on your own first.*


In [110]:
import numpy as np

A = 1E5
E_a = 50000 # J/mol
R = 8.314   # J/(mol·K)

T = np.linspace(300, 600, 50)  # upper bound is 600 K here; the problem statement specifies 500 K
print(T)

k = (A*np.exp(-E_a/(R*T)))
print(f"Rate Constants:\n {k}")
print(f"maximum rate constant: {k.max()}")

[300.         306.12244898 312.24489796 318.36734694 324.48979592
 330.6122449  336.73469388 342.85714286 348.97959184 355.10204082
 361.2244898  367.34693878 373.46938776 379.59183673 385.71428571
 391.83673469 397.95918367 404.08163265 410.20408163 416.32653061
 422.44897959 428.57142857 434.69387755 440.81632653 446.93877551
 453.06122449 459.18367347 465.30612245 471.42857143 477.55102041
 483.67346939 489.79591837 495.91836735 502.04081633 508.16326531
 514.28571429 520.40816327 526.53061224 532.65306122 538.7755102
 544.89795918 551.02040816 557.14285714 563.26530612 569.3877551
 575.51020408 581.63265306 587.75510204 593.87755102 600.        ]
Rate Constants:
 [1.96748866e-04 2.93787960e-04 4.31844552e-04 6.25441266e-04
 8.93255020e-04 1.25901666e-03 1.75253729e-03 2.41086676e-03
 3.27958803e-03 4.41424965e-03 5.88193653e-03 7.76297751e-03
 1.01527862e-02 1.31638296e-02 1.69277177e-02 2.15974043e-02
 2.73494896e-02 3.43866121e-02 4.29399159e-02 5.32715807e-02
 6.56773978e-02 8.0