### COMPUTATIONAL CARPENTRY PROJECT

***PART A - DATA STRUCTURES AND FUNCTIONS***

Creating a dictionary of elements and atomic masses

In [1]:
import pandas as pd
df = pd.read_csv("periodic_table.csv")

element_dict = {}

# Map each element symbol to its atomic mass, one row at a time
for index, row in df.iterrows():
    symbol = row["Symbol"]
    mass = row["AtomicMass"]
    element_dict[symbol] = mass

# Check it works: print the first 10 entries
for symbol in list(element_dict.keys())[:10]:
    print(symbol, ":", element_dict[symbol])

H : 1.008
He : 4.0026
Li : 7.0
Be : 9.012183
B : 10.81
C : 12.011
N : 14.007
O : 15.999
F : 18.99840316
Ne : 20.18


- the CSV file of the periodic table was loaded into a pandas DataFrame, and an empty dictionary was created
- by iterating over the rows of the DataFrame, each element **symbol** was added as a key with its **atomic mass** as the value
- this yielded a dictionary that maps each element symbol to its atomic mass
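
The same symbol-to-mass mapping can also be sketched without pandas. The block below is a minimal illustration that uses only the standard-library `csv` module; the inline `SAMPLE_CSV` text and the helper name `load_element_dict` are hypothetical stand-ins (the real `periodic_table.csv` is not reproduced here), and only the column names `Symbol` and `AtomicMass` are taken from the cell above.

```python
import csv
import io

# Hypothetical three-row sample standing in for periodic_table.csv.
# The column names match those used in the notebook cell above.
SAMPLE_CSV = """Symbol,AtomicMass
H,1.008
He,4.0026
O,15.999
"""


def load_element_dict(handle):
    """Map each element symbol to its atomic mass (converted to float)."""
    reader = csv.DictReader(handle)
    return {row["Symbol"]: float(row["AtomicMass"]) for row in reader}


sample_dict = load_element_dict(io.StringIO(SAMPLE_CSV))
print(sample_dict)
```

With a real file, the same helper could be called on an open file handle instead of the `io.StringIO` wrapper.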





Now the goal is to write a function that takes a chemical formula and returns its molecular mass

In [3]:
def molecular_mass(formula):
    total_mass = 0
    i = 0
    while i < len(formula):
        symbol = formula[i]
        i += 1

        if i < len(formula) and formula[i].islower():
            symbol += formula[i]
            i += 1  

        num_str = ""
        while i < len(formula) and formula[i].isdigit():
            num_str += formula[i]
            i += 1
        count = int(num_str) if num_str else 1

        
        total_mass += element_dict[symbol] * count

    return total_mass


#check it works 
print("H2O :", molecular_mass("H2O"))     # ~18.015
print("He :", molecular_mass("He"))      # ~4.0026
print("C6H12O6:", molecular_mass("C6H12O6")) # ~180.156


H2O : 18.015
He : 4.0026
C6H12O6: 180.156


Calculating the Molecular Mass of a Formula

- a function molecular_mass(formula) was written that takes a chemical formula (e.g. "H2O") as input  
- inside the function, a variable total_mass is created and set to 0 → this will store the running sum of the molecular mass  
- a while loop goes through the formula one character at a time using an index i 

*Step 1: Identify the element symbol*
- the first character of each symbol is assumed to be an uppercase letter (like "H" in "H2O", or "C" in "CO2")  
- if the next character is a lowercase letter, it is appended to the symbol (so "He" or "Na" is treated as one element)  

*Step 2: Find how many atoms of that element are present*
- the code then looks for digits following the symbol (e.g. 2 in "H2O" or 12 in "C6H12O6")  
- if digits are found, they are combined into a number → this is the **count of atoms**  
- if no number is found, the count defaults to 1 

*Step 3: Look up the atomic mass*
- using the dictionary (element_dict), the atomic mass for the element symbol is found  
- the atomic mass is multiplied by the count of atoms  

*Step 4: Add to the total mass*
- the result for that element is added to total_mass  
- the loop continues until the whole formula has been processed  

*Final Step: Return the result*
- once the loop ends, the function returns total_mass → the total molecular mass of the formula  
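
The steps above can also be sketched with a regular expression that extracts every (symbol, count) pair in one pass. This is an alternative illustration, not the notebook's implementation: the names `MASSES`, `TOKEN` and `molecular_mass_regex` are hypothetical, and the masses are copied from the output printed earlier.

```python
import re

# Masses copied from the printed output earlier in this notebook.
MASSES = {"H": 1.008, "He": 4.0026, "C": 12.011, "O": 15.999}

# One uppercase letter, an optional lowercase letter, then optional digits.
TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def molecular_mass_regex(formula, masses=MASSES):
    """Sum mass * count over every (symbol, count) pair in a flat formula."""
    total = 0.0
    for symbol, digits in TOKEN.findall(formula):
        count = int(digits) if digits else 1  # no digits means one atom
        total += masses[symbol] * count
    return total


print(round(molecular_mass_regex("C6H12O6"), 3))
```

Note that `findall` silently ignores characters that do not match the pattern, so this sketch, like the loop version, is only suitable for flat formulas without parentheses.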

---

*Example with "H2O"*
- Symbol = "H", Count = 2 → mass = 2 × 1.008 = 2.016  
- Symbol = "O", Count = 1 → mass = 1 × 15.999 = 15.999  
- **Total = 18.015**  
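
To make the arithmetic of this example visible, the sketch below follows the same character-by-character loop but records each element's contribution instead of only the running total. The function name `breakdown` and the small `MASSES` dictionary are assumptions introduced for illustration.

```python
# Masses copied from the printed output earlier in this notebook.
MASSES = {"H": 1.008, "O": 15.999}


def breakdown(formula, masses=MASSES):
    """Return a list of (symbol, count, contribution) for a flat formula."""
    rows = []
    i = 0
    while i < len(formula):
        # Step 1: uppercase letter plus an optional lowercase letter
        symbol = formula[i]
        i += 1
        if i < len(formula) and formula[i].islower():
            symbol += formula[i]
            i += 1
        # Step 2: optional digits give the atom count (default 1)
        digits = ""
        while i < len(formula) and formula[i].isdigit():
            digits += formula[i]
            i += 1
        count = int(digits) if digits else 1
        # Step 3: look up the mass and store this element's contribution
        rows.append((symbol, count, masses[symbol] * count))
    return rows


rows = breakdown("H2O")
for symbol, count, contribution in rows:
    print(f"{symbol} x {count} = {contribution:.3f}")
print(f"Total = {sum(c for _, _, c in rows):.3f}")
```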


Now the idea is to extend the mass calculator so that it can handle parentheses in the molecular formula

In [4]:
def molecular_mass_parentheses(formula):
    partial = []   # will hold partial masses and markers
    i = 0

    while i < len(formula):
        # 1. Handle uppercase + optional lowercase (symbols)
        if formula[i].isupper():
            symbol = formula[i]
            i += 1
            if i < len(formula) and formula[i].islower():
                symbol += formula[i]
                i += 1

            # parse number
            num_str = ""
            while i < len(formula) and formula[i].isdigit():
                num_str += formula[i]
                i += 1
            count = int(num_str) if num_str else 1

            # push mass onto stack
            partial.append(element_dict[symbol] * count)

        # 2. Handle opening parenthesis
        elif formula[i] == "(":
            partial.append("(")
            i += 1

        # 3. Handle closing parenthesis
        elif formula[i] == ")":
            i += 1
            # read number after )
            num_str = ""
            while i < len(formula) and formula[i].isdigit():
                num_str += formula[i]
                i += 1
            count = int(num_str) if num_str else 1

            # pop until "("
            temp = 0
            while partial and partial[-1] != "(":
                temp += partial.pop()
            partial.pop()  # remove "("
            partial.append(temp * count)

        else:
            i += 1  # skip unexpected

    return sum(partial)

print("H2O:",molecular_mass_parentheses("H2O"))      # ~18
print("Ca(OH)2:", molecular_mass_parentheses("Ca(OH)2"))  # ~74


H2O: 18.015
Ca(OH)2: 74.094


Calculating Molecular Mass with Parentheses (Non-Recursive)

- the previous function worked for simple formulas (like *H2O* or *C6H12O6*), but it could not handle parentheses  
- to fix this, the function was extended to support groups like *(OH)2* in *Ca(OH)2*  
- the idea is to use a *stack*, which is like a list where we add and remove values from the end  

How it works
1. *Read element symbols*
   - just like before, the code finds symbols (like *Ca*, *O*, *H*) and numbers after them  
   - the atomic mass multiplied by the count is pushed onto the stack  

2. *Handle "("*  
   - when an opening parenthesis is found, a marker "(" is pushed onto the stack  
   - this marks the beginning of a grouped part of the formula  

3. *Handle ")"* 
   - when a closing parenthesis is found, the code collects everything from the stack until it reaches "(" 
   - these masses are added together to get the total for the group inside the parentheses  
   - the number after the ")" tells how many times to multiply the group (e.g. 2 in *(OH)2*)  
   - the multiplied result is pushed back onto the stack  

4. *Finish the formula* 
   - when the loop ends, the stack contains all the individual contributions  
   - adding them together gives the total molecular mass  
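
The stack approach described above assumes a well-formed formula. As a hedged sketch of how input checking could be added, the variant below raises a `ValueError` for unexpected characters and for unbalanced parentheses. The names `read_count` and `molecular_mass_checked` are hypothetical, and the Ca value of 40.08 is an assumption inferred from the printed result 74.094.

```python
# H and O are copied from the notebook output; Ca = 40.08 is an assumption
# inferred from the printed result for Ca(OH)2 (74.094).
MASSES = {"H": 1.008, "O": 15.999, "Ca": 40.08}


def read_count(formula, i):
    """Read optional digits starting at index i; return (count, next_index)."""
    start = i
    while i < len(formula) and formula[i].isdigit():
        i += 1
    return (int(formula[start:i]) if i > start else 1), i


def molecular_mass_checked(formula, masses=MASSES):
    """Stack-based mass calculation that rejects malformed formulas."""
    stack = []
    i = 0
    while i < len(formula):
        ch = formula[i]
        if ch.isupper():
            symbol = ch
            i += 1
            if i < len(formula) and formula[i].islower():
                symbol += formula[i]
                i += 1
            count, i = read_count(formula, i)
            stack.append(masses[symbol] * count)
        elif ch == "(":
            stack.append("(")
            i += 1
        elif ch == ")":
            count, i = read_count(formula, i + 1)
            group = 0.0
            while stack and stack[-1] != "(":
                group += stack.pop()
            if not stack:
                raise ValueError("unmatched ')' in formula")
            stack.pop()  # discard the "(" marker
            stack.append(group * count)
        else:
            raise ValueError(f"unexpected character {ch!r} in formula")
    if "(" in stack:
        raise ValueError("unmatched '(' in formula")
    return sum(stack)


print(round(molecular_mass_checked("Ca(OH)2"), 3))
```

Raising an error instead of skipping unknown characters is a design choice: it trades tolerance for typos against the risk of returning a plausible but wrong mass.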

Example: Ca(OH)2
- *Ca* → 40.08 pushed to the stack (the printed result 74.094 implies that the CSV stores Ca as 40.08 rather than 40.078)  
- "(" pushed as a marker  
- *O* → 15.999 pushed  
- *H* → 1.008 pushed  
- ")" closes the group → (15.999 + 1.008) × 2 = 34.014 pushed  
- Final stack = [40.08, 34.014] → total = 74.094  
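
The walkthrough above can be reproduced by recording a snapshot of the stack after each symbol or parenthesis. The sketch below is an illustration only: it tokenizes with a regular expression rather than the index loop used in the notebook, the name `stack_trace` is hypothetical, it assumes a well-formed formula, and the Ca value of 40.08 is an assumption inferred from the printed result 74.094.

```python
import re

# H and O are copied from the notebook output; Ca = 40.08 is an assumption
# inferred from the printed result for Ca(OH)2 (74.094).
MASSES = {"H": 1.008, "O": 15.999, "Ca": 40.08}


def stack_trace(formula, masses=MASSES):
    """Return stack snapshots, one after each symbol or parenthesis.

    Assumes a well-formed formula (balanced parentheses, known symbols).
    """
    tokens = re.findall(r"[A-Z][a-z]?|\d+|[()]", formula)
    stack, snapshots = [], []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        i += 1
        # A number directly after a symbol or ")" is its multiplier.
        count = 1
        if token != "(" and i < len(tokens) and tokens[i].isdigit():
            count = int(tokens[i])
            i += 1
        if token == "(":
            stack.append("(")
        elif token == ")":
            group = 0.0
            while stack[-1] != "(":
                group += stack.pop()
            stack.pop()  # discard the "(" marker
            stack.append(group * count)
        else:
            stack.append(masses[token] * count)
        snapshots.append(list(stack))
    return snapshots


for snapshot in stack_trace("Ca(OH)2"):
    print([x if x == "(" else round(x, 3) for x in snapshot])
```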

