# Lecture 2: Computational Thinking with Python

## Key Concept Review

### String Operations

Before we begin the exercises, let's review some essential string operations you'll need:

#### Looping Over Strings
You can loop through each character in a string using a for loop:

In [1]:
formula = "H2O"
for character in formula:
    print(character)

H
2
O


#### Useful String Methods

```python
# string.isalpha() - returns True if every character is a letter (False for an empty string)
# string.isdigit() - returns True if every character is a digit (False for an empty string)
# string.upper()   - returns a copy converted to uppercase
# string.lower()   - returns a copy converted to lowercase
```

Get familiar with these methods by experimenting with them. For example, what is the output of the following?
```python
"H".isalpha()
```
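
As one way to experiment, the sketch below labels every character of a formula using the methods above. The helper name `describe_characters` and the label strings are made up for illustration, and `isupper()` is an extra string method not listed above.

```python
from typing import List, Tuple

def describe_characters(formula: str) -> List[Tuple[str, str]]:
    """Label each character as 'upper', 'lower', 'digit', or 'other'."""
    labels = []
    for character in formula:
        if character.isdigit():
            kind = "digit"
        elif character.isalpha() and character.isupper():
            kind = "upper"
        elif character.isalpha():
            kind = "lower"
        else:
            kind = "other"
        labels.append((character, kind))
    return labels

print(describe_characters("CaCl2"))
```

Seeing which characters are uppercase, lowercase, or digits is a useful first step when deciding where one element ends and the next begins.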

#### String Indexing and Slicing
```python
formula = "CH4"
print(formula[0])    # Output: "C" (first character)
print(formula[1:3])  # Output: "H4" (characters at indices 1 and 2; the end index 3 is excluded)
print(len(formula))  # Output: 3 (length of string)
```
Familiarise yourself with indexing, slicing, and `len()` by trying them on other strings.
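
One detail that matters when scanning a formula: indexing past the end of a string raises an `IndexError`, whereas slicing past the end simply returns an empty string. The sketch below uses that to look at the next character safely. The helper name `peek_next` is hypothetical.

```python
def peek_next(text: str, index: int) -> str:
    """Return the character after `index`, or "" if there is none."""
    # A slice never raises IndexError; out-of-range slices are just empty.
    return text[index + 1:index + 2]

formula = "CH4"
for index in range(len(formula)):
    print(index, formula[index], repr(peek_next(formula, index)))
```

An explicit bounds check such as `index + 1 < len(text)` achieves the same thing; which one reads better is a matter of taste.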


#### Converting Strings to Numbers
```python
number_string = "42"
number = int(number_string)  # Converts "42" to integer 42
```

Try `int()` on a few other digit strings to see how it behaves.
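
`int()` raises a `ValueError` when the string is not a valid integer, including the empty string. The sketch below wraps the conversion in a small helper. The name `to_count` is hypothetical, and treating an empty string as 1 is an assumption borrowed from how formulas are written (no subscript means one atom).

```python
def to_count(text: str) -> int:
    """Convert a digit string to an int; an empty string counts as 1."""
    if text == "":
        # Assumption: a missing subscript means a single atom.
        return 1
    if not text.isdigit():
        raise ValueError(f"not a whole number: {text!r}")
    return int(text)

print(to_count("42"))
print(to_count(""))
```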

## Exercise 1: Molecular Formula Parser

**Objective:** Write a function `parse_molecular_formula(formula)` that counts atoms in molecular formulas. Start with the simple cases, but design your solution to handle increasingly complex inputs without major rewrites.

**Think computationally:** How can you break this problem into smaller, reusable pieces that won't need to be completely rewritten when the requirements change?


In [2]:
from typing import Dict

def parse_molecular_formula(formula: str) -> Dict[str, int]:
    atom_dict = dict()

    # Write your code here
    
    return atom_dict

### Test cases: Set 1 Basic formulae

The test cases will increase in difficulty with each set. Start with the first set, but think about how your approach will scale to the later, more complex cases.

In [79]:
# Simple single-letter elements, single-digit counts
test_cases_level1 = [
    ("H2O", {"H": 2, "O": 1}),
    ("CH4", {"C": 1, "H": 4}),
    ("CO2", {"C": 1, "O": 2}),
    ("NH3", {"N": 1, "H": 3})
]

def parse_molecular_formula(formula):
    # Assumes single-letter elements followed by at most one digit
    chars = list(formula)
    counts = {}
    j = 0
    while j < len(chars):
        element = chars[j]
        if j + 1 < len(chars) and chars[j + 1].isdigit():
            # Element followed by a single-digit count
            counts[element] = counts.get(element, 0) + int(chars[j + 1])
            j += 2
        else:
            # No digit follows, so the count is 1
            counts[element] = counts.get(element, 0) + 1
            j += 1
    return counts

for formula, expected in test_cases_level1:
    result = parse_molecular_formula(formula)
    print(f"{formula}: {result} {'✓' if result == expected else '✗'}")


H2O: {'H': 2, 'O': 1} ✓
CH4: {'C': 1, 'H': 4} ✓
CO2: {'C': 1, 'O': 2} ✓
NH3: {'N': 1, 'H': 3} ✓
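
The same checking loop is repeated for every test set, so it can be wrapped in a small helper. The sketch below is one possible shape. The name `run_tests` and its return value (the number of failing cases) are assumptions made for illustration, not part of the exercise.

```python
from typing import Callable, Dict, List, Tuple

Case = Tuple[str, Dict[str, int]]

def run_tests(parser: Callable[[str], Dict[str, int]], cases: List[Case]) -> int:
    """Print one line per case and return the number of failures."""
    failures = 0
    for formula, expected in cases:
        result = parser(formula)
        passed = result == expected
        if not passed:
            failures += 1
        print(f"{formula}: {result} {'✓' if passed else '✗'}")
    return failures
```

For example, `run_tests(parse_molecular_formula, test_cases_level1)` would print the same four lines shown above and return `0`.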


In [75]:
atom = "H2O"

def parse_molecular_formula(formula):
    # Build the list and dictionary inside the function so that
    # repeated calls do not share state, and read from the
    # `formula` argument rather than the global `atom`.
    a = list(formula)
    resulty = {}
    j = 0
    while j < len(a):
        last_key = a[j]
        if j + 1 < len(a) and a[j + 1].isdigit():
            resulty[last_key] = resulty.get(last_key, 0) + int(a[j + 1])
            j += 2
        else:
            resulty[last_key] = resulty.get(last_key, 0) + 1
            j += 1
    return resulty

parse_molecular_formula(atom)

{'H': 2, 'O': 1}


### Test cases: Set 2 Multi-letter Elements

In [102]:
# Two-letter elements like Ca, Cl, Na
test_cases_level2 = [
    ("NaCl", {"Na": 1, "Cl": 1}),
    ("CaCl2", {"Ca": 1, "Cl": 2}),
    ("MgO", {"Mg": 1, "O": 1}),
    ("AlCl3", {"Al": 1, "Cl": 3}),
    ("H2SO4", {"H": 2, "S": 1, "O": 4})
]

def parse_molecular_formula2(formula):
    a = list(formula)
    resulty = {}
    j = 0
    while j < len(a):
        # An element symbol is one letter, optionally followed by one lowercase letter
        elem = a[j]
        if j+1 < len(a) and a[j+1].islower():
            elem += a[j+1]
            j += 1
        # Collect every digit that follows, so multi-digit counts also work
        num_str = ""
        while j+1 < len(a) and a[j+1].isdigit():
            num_str += a[j+1]
            j += 1
        # No digits means a count of 1
        count = int(num_str) if num_str else 1
        resulty[elem] = resulty.get(elem, 0) + count
        j += 1
    return resulty

for formula, expected in test_cases_level2:
    result = parse_molecular_formula2(formula)
    print(f"{formula}: {result} {'✓' if result == expected else '✗'}")


NaCl: {'Na': 1, 'Cl': 1} ✓
CaCl2: {'Ca': 1, 'Cl': 2} ✓
MgO: {'Mg': 1, 'O': 1} ✓
AlCl3: {'Al': 1, 'Cl': 3} ✓
H2SO4: {'H': 2, 'S': 1, 'O': 4} ✓


In [91]:
stri = "NaCl3"

# Record a cut position after every lowercase letter (the last character is skipped)
cut = []
for i in range(len(stri) - 1):
    if stri[i].islower():
        cut.append(i + 1)
print(cut)

# Slice the string at the cut positions
parts = []
prev = 0
for c in cut:
    parts.append(stri[prev:c])
    prev = c
parts.append(stri[prev:])
print(parts)

[2, 4]
['Na', 'Cl', '3']
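
Cutting after lowercase letters only works when every element has two letters: for `"H2O"` there is no lowercase letter, so the whole string comes back as a single part. A possible variation, assuming every element symbol starts with an uppercase letter, is to start a new part at each uppercase letter instead. The helper name `split_at_uppercase` is hypothetical.

```python
from typing import List

def split_at_uppercase(formula: str) -> List[str]:
    """Split a formula into chunks that each begin with an uppercase letter."""
    parts: List[str] = []
    for character in formula:
        if character.isupper() or not parts:
            # An uppercase letter (or the very first character) opens a new chunk
            parts.append(character)
        else:
            # Lowercase letters and digits belong to the current chunk
            parts[-1] += character
    return parts

print(split_at_uppercase("NaCl3"))
print(split_at_uppercase("H2SO4"))
```

Each chunk then holds one element symbol together with its count, which still has to be separated into letters and digits.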


### Test cases: Set 3 Large Numbers

In [103]:
# Multi-digit counts
test_cases_level3 = [
    ("C12H22O11", {"C": 12, "H": 22, "O": 11}),  # Sucrose
    ("Ca10P6O26", {"Ca": 10, "P": 6, "O": 26}),   # Hydroxyapatite
    ("C2H5OH", {"C": 2, "H": 6, "O": 1}),         # Ethanol
    ("Fe2O3", {"Fe": 2, "O": 3})                  # Iron oxide
]


for formula, expected in test_cases_level3:
    result = parse_molecular_formula2(formula)
    print(f"{formula}: {result} {'✓' if result == expected else '✗'}")


C12H22O11: {'C': 12, 'H': 22, 'O': 11} ✓
Ca10P6O26: {'Ca': 10, 'P': 6, 'O': 26} ✓
C2H5OH: {'C': 2, 'H': 6, 'O': 1} ✓
Fe2O3: {'Fe': 2, 'O': 3} ✓
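
As a cross-check on the loop-based parser, the same three test sets can also be handled with a regular expression. This is an alternative technique, not the approach used above. The function name `parse_with_regex` is hypothetical, and the pattern assumes an element symbol is one uppercase letter optionally followed by one lowercase letter.

```python
import re
from typing import Dict

def parse_with_regex(formula: str) -> Dict[str, int]:
    """Count atoms using a pattern of (element symbol)(optional digits)."""
    counts: Dict[str, int] = {}
    for symbol, digits in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        # An empty digit string means the element appears once
        counts[symbol] = counts.get(symbol, 0) + (int(digits) if digits else 1)
    return counts

print(parse_with_regex("C2H5OH"))
```

Characters that do not fit the pattern are silently skipped by `re.findall`, so this sketch does no validation of its input.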
