# Python Fundamentals Tutorial for Chemistry Applications

Welcome to the Python Fundamentals tutorial tailored for chemistry!

This tutorial covers:
- Core Python syntax essentials
- Data types and variable use
- Control flow with conditionals and loops
- Functions and modular coding
- Introduction to key scientific libraries
- A simple workflow for chemical data processing

## 🚀 Setup: Installing Necessary Libraries

Before running the code examples in this notebook, please ensure all required scientific and cheminformatics libraries are installed. You can run the following command in a new code cell, or directly in your terminal/command prompt.

**Note:** If you manage your environment with Conda, you may prefer `conda install` over `pip install` for some libraries, such as RDKit.

In [None]:
#!pip install numpy pandas matplotlib seaborn scipy sympy rdkit pymatgen

# Note: Uncomment the line above and run this cell to install the libraries.
# If using Conda, RDKit is often installed via its own channel:
# conda install -c conda-forge rdkit
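
After installing, it can help to confirm that each library is actually importable in the active environment. The sketch below uses only the standard library; the `REQUIRED_MODULES` list and the `find_missing` helper are illustrative names, not part of any of the libraries above.

```python
from importlib.util import find_spec

# Import names for the libraries used in this tutorial
REQUIRED_MODULES = ["numpy", "pandas", "matplotlib", "seaborn",
                    "scipy", "sympy", "rdkit", "pymatgen"]

def find_missing(modules):
    """Return the module names that cannot be located in the current environment."""
    return [name for name in modules if find_spec(name) is None]

missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing libraries:", ", ".join(missing))
else:
    print("All required libraries are available.")
```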

### Variables and Data Types

In [None]:
chemical_name = "Caffeine"      # string to hold compound name
sublimation_point = 178.0       # float: approximate sublimation temperature in Celsius
atomic_numbers = [6, 1, 7, 8]   # list of atomic numbers of the elements present (C, H, N, O)
is_soluble = True               # boolean: caffeine is moderately soluble in water
atom_count = 24                 # integer number of atoms per molecule (C8H10N4O2)

print(f"Compound: {chemical_name}")
print(f"Sublimation point: {sublimation_point} °C")
print(f"Atomic numbers of elements present: {atomic_numbers}")
print(f"Water soluble?: {is_soluble}")
print(f"Total atom count: {atom_count}")
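
Values read from a file or typed by a user usually arrive as strings and must be converted before any arithmetic. The following minimal sketch assumes a hypothetical temperature reading supplied as text and shows how to inspect and convert its type.

```python
# Hypothetical reading, entered as text the way input() or a CSV file would provide it
raw_temperature = "25.0"

temperature_c = float(raw_temperature)   # str -> float
temperature_k = temperature_c + 273.15   # Celsius -> Kelvin

print(type(raw_temperature).__name__)    # name of the original type
print(type(temperature_c).__name__)      # name of the converted type
print(f"{temperature_c} °C = {temperature_k:.2f} K")
```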

### Basic Calculations and String Handling

In [None]:
mass_sample = 1.25              # grams
molecular_weight = 194.19      # g/mol (approximate MW of caffeine)
moles = mass_sample / molecular_weight
print(f"Moles in sample: {moles:.5f} mol")

formula = "C8H10N4O2"  # caffeine molecular formula
print(f"The molecular formula of {chemical_name} is {formula}.")
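
The molecular weight above was typed in by hand, but it can also be derived from the formula string. This is a rough sketch, assuming a simple formula without parentheses or charges; the `ATOMIC_MASS` table and the `molar_mass` helper are illustrative and cover only the four elements needed here.

```python
import re

# Standard atomic masses (g/mol), rounded to three decimals
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

def molar_mass(formula):
    """Sum atomic masses for a simple formula such as 'C8H10N4O2' (no parentheses)."""
    total = 0.0
    # Each match is (element symbol, optional count); a missing count means 1
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[symbol] * (int(count) if count else 1)
    return total

print(f"Molar mass of C8H10N4O2: {molar_mass('C8H10N4O2'):.2f} g/mol")
```

An element missing from the table raises a `KeyError`, which is a useful signal that the table needs extending.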

### Data Structures

In [None]:
solvent_list = ["methanol", "chloroform", "hexane"]  # mutable list
melting_points = (234.0, -64.7, -95.0)                # tuple, immutable
compound_props = {                                    # dictionary to hold data
    "Caffeine": {"MP": 234.0, "Solubility": "low"},
    "Water": {"MP": 0.0, "Solubility": "high"}
}
print("Solvents:", solvent_list)
print("Melting points (°C):", melting_points)
print("Compound properties dictionary:", compound_props)
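
The three structures differ mainly in how they are accessed and whether they can be changed. The sketch below uses separate illustrative variables (`solvents`, `properties`, `fixed_points`) so that it does not modify the ones defined above.

```python
solvents = ["methanol", "chloroform", "hexane"]
solvents.append("acetone")                 # lists can grow in place
print("Second solvent:", solvents[1])      # indexing starts at 0

properties = {"Caffeine": {"MP": 234.0}}
print("Caffeine MP:", properties["Caffeine"]["MP"])    # nested lookup
print("Unknown compound:", properties.get("Aspirin"))  # .get returns None instead of raising KeyError

fixed_points = (0.0, 100.0)
try:
    fixed_points[0] = 1.0                  # tuples cannot be modified
except TypeError as error:
    print("Tuple assignment failed:", error)
```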

### Control Flow

In [None]:
pH = 5.6
if pH < 7:
    print("Solution is acidic")
elif pH == 7:
    print("Solution is neutral")
else:
    print("Solution is basic")

element_symbols = ["Na", "Cl", "O", "H"]
print("\n--- Loops ---")
for element in element_symbols:
    print(f"Element: {element}")
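
Conditionals and loops are most useful together, for example when classifying a batch of measurements. The sketch below assumes hypothetical sample names and pH readings, and the `classify_ph` helper is an illustrative name.

```python
def classify_ph(ph):
    """Label a pH value as acidic, neutral, or basic (neutral taken as exactly 7)."""
    if ph < 7:
        return "acidic"
    elif ph == 7:
        return "neutral"
    return "basic"

# Hypothetical sample labels and readings
samples = {"rainwater": 5.6, "pure water": 7.0, "seawater": 8.1}

# enumerate() numbers the samples; .items() yields (name, value) pairs
for index, (name, ph) in enumerate(samples.items(), start=1):
    print(f"{index}. {name}: pH {ph} -> {classify_ph(ph)}")
```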

### Functions

In [None]:
def calculate_molarity(moles, volume_liters):
    """Calculate molarity given moles and solution volume in liters.""" 
    return moles / volume_liters

molarity = calculate_molarity(0.1, 0.5)
print(f"Molarity of solution: {molarity:.3f} M")
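
Functions become more robust when they validate their inputs, and small functions can be combined into larger calculations. The sketch below is one possible variant: `calculate_molarity_checked` and `dilute` are hypothetical names, and the dilution step assumes the usual C1·V1 = C2·V2 relationship.

```python
def calculate_molarity_checked(moles, volume_liters):
    """Return molarity in mol/L; reject volumes that are zero or negative."""
    if volume_liters <= 0:
        raise ValueError("volume_liters must be positive")
    return moles / volume_liters

def dilute(stock_molarity, stock_volume_l, final_volume_l):
    """Apply C1*V1 = C2*V2 to get the concentration after dilution."""
    moles = stock_molarity * stock_volume_l
    return calculate_molarity_checked(moles, final_volume_l)

# 50 mL of a 1.0 M stock solution diluted to 250 mL
print(f"Diluted concentration: {dilute(1.0, 0.05, 0.25):.3f} M")

try:
    calculate_molarity_checked(0.1, 0)
except ValueError as error:
    print("Invalid input:", error)
```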

### Using Key Scientific Libraries

### Pandas – Data Handling

In [None]:
import pandas as pd
import numpy as np

data = {
    "Compound": ["Ethanol", "Acetone", "Benzene", "Methanol"],
    "Boiling Point (°C)": [78.37, 56.05, 80.1, 64.7],
    "Molar Mass (g/mol)": [46.07, 58.08, 78.11, 32.04],
    "Density (g/mL)": [0.789, 0.784, 0.879, 0.791]
}
df = pd.DataFrame(data)
print("Original DataFrame:\n", df)

# Select a single column and print its summary statistics
print("\nBoiling Point Summary:\n", df['Boiling Point (°C)'].describe())

# Filter data based on a condition
high_bp_compounds = df[df['Boiling Point (°C)'] > 70]
print("\nCompounds with Boiling Point > 70°C:\n", high_bp_compounds)
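
Beyond selecting and filtering, a DataFrame can compute new columns from existing ones. The sketch below rebuilds part of the table under a separate name (`solvents_df`, an illustrative choice so the original `df` stays unchanged) and derives a molar volume column from molar mass and density.

```python
import pandas as pd

solvents_df = pd.DataFrame({
    "Compound": ["Ethanol", "Acetone", "Benzene", "Methanol"],
    "Molar Mass (g/mol)": [46.07, 58.08, 78.11, 32.04],
    "Density (g/mL)": [0.789, 0.784, 0.879, 0.791],
})

# Molar volume (mL/mol) = molar mass (g/mol) / density (g/mL), computed column-wise
solvents_df["Molar Volume (mL/mol)"] = (
    solvents_df["Molar Mass (g/mol)"] / solvents_df["Density (g/mL)"]
)

# Sort from largest to smallest molar volume
ranked = solvents_df.sort_values("Molar Volume (mL/mol)", ascending=False)
print(ranked[["Compound", "Molar Volume (mL/mol)"]].round(2))
```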


### Matplotlib – Visualization

In [None]:
import matplotlib.pyplot as plt

plt.figure(figsize=(8, 6))
plt.scatter(df['Molar Mass (g/mol)'], df['Boiling Point (°C)'], color='b', marker='o')
plt.title('Boiling Point vs. Molar Mass', fontsize=16)
plt.xlabel('Molar Mass (g/mol)', fontsize=12)
plt.ylabel('Boiling Point (°C)', fontsize=12)
plt.grid(True)
plt.show()

### Seaborn – Statistical Data Visualization

In [None]:
import seaborn as sns

sns.set_theme(style="whitegrid")
plt.figure(figsize=(10, 6))
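# Assigning hue avoids the palette deprecation warning; legend=False requires seaborn >= 0.13
sns.barplot(x='Compound', y='Boiling Point (°C)', hue='Compound', data=df, palette='viridis', legend=False)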
plt.title('Boiling Points of Selected Compounds', fontsize=16)
plt.xlabel('Compound', fontsize=12)
plt.ylabel('Boiling Point (°C)', fontsize=12)
plt.show()

### RDKit – Cheminformatics

In [None]:
from rdkit import Chem
from rdkit.Chem import Descriptors, Draw

# Create a molecule object from a SMILES string
mol = Chem.MolFromSmiles('CCC(=O)O') # SMILES for Propanoic acid
if mol is not None:
    print("Molecule name: Propanoic acid")
    print(f"Molecular Weight: {Descriptors.MolWt(mol):.2f} g/mol")
    print(f"Estimated LogP (octanol-water partition coefficient): {Descriptors.MolLogP(mol):.2f}")
    display(Draw.MolToImage(mol)) # Display the molecular structure in the notebook

### SciPy – Scientific Computing

In [None]:
import numpy as np
from scipy.optimize import curve_fit

# Let's fit a simple model for a chemical process
# Example: Modeling a first-order reaction decay
def first_order_decay(t, A, k):
    return A * np.exp(-k * t)

# Generate synthetic example data (seeded so the results are reproducible)
rng = np.random.default_rng(42)
t_data = np.linspace(0, 10, 50)
C_data = 10 * np.exp(-0.5 * t_data) + rng.normal(0, 0.5, t_data.size)

# Fit the model to the data using curve_fit
popt, pcov = curve_fit(first_order_decay, t_data, C_data)

A_fit, k_fit = popt

print(f"Fitted initial concentration (A): {A_fit:.2f}")
print(f"Fitted rate constant (k): {k_fit:.2f}")

plt.figure()
plt.scatter(t_data, C_data, label='Experimental Data')
plt.plot(t_data, first_order_decay(t_data, A_fit, k_fit), color='red', label='Fitted Curve')
plt.title('First-Order Reaction Fit with SciPy')
plt.xlabel('Time')
plt.ylabel('Concentration')
plt.legend()
plt.show()
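
The covariance matrix returned by `curve_fit` can be used to estimate parameter uncertainties, and the fitted rate constant gives the half-life directly. The sketch below regenerates its own synthetic data with an assumed seed and assumed true values (A = 10, k = 0.5); the variable names are illustrative.

```python
import numpy as np
from scipy.optimize import curve_fit

def decay_model(t, A, k):
    """First-order decay: C(t) = A * exp(-k * t)."""
    return A * np.exp(-k * t)

# Synthetic data with a fixed seed (assumed values: A = 10, k = 0.5, noise sd = 0.5)
rng_demo = np.random.default_rng(0)
t_demo = np.linspace(0, 10, 50)
c_demo = decay_model(t_demo, 10, 0.5) + rng_demo.normal(0, 0.5, t_demo.size)

params, covariance = curve_fit(decay_model, t_demo, c_demo, p0=[5.0, 1.0])
param_errors = np.sqrt(np.diag(covariance))   # one-standard-deviation uncertainties

k_est, k_err = params[1], param_errors[1]
half_life = np.log(2) / k_est                 # t_1/2 = ln(2) / k for first-order kinetics

print(f"k = {k_est:.3f} ± {k_err:.3f}")
print(f"Half-life = {half_life:.2f} (same time units as t)")
```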

### SymPy – Symbolic Mathematics

In [None]:
from sympy import symbols, Eq, solve

# Define symbolic variables
A, B, C = symbols('A B C')

# Represent a chemical equilibrium reaction: A + B <=> C
K_eq = 1.5 # Equilibrium constant

# Set up the equilibrium equation
# K_eq = [C] / ([A] * [B])
eq = Eq(K_eq, C / (A * B))

print("Equilibrium equation:", eq)

# Solve for the equilibrium concentration of C given [A] = 0.5 and [B] = 0.3
C_solved = solve(eq.subs({A: 0.5, B: 0.3}), C)

# Convert the SymPy result to a Python float before formatting
print(f"\nConcentration of C at equilibrium: {float(C_solved[0]):.2f}")
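
The example above takes the equilibrium concentrations of A and B as given. A common variation starts from initial concentrations and solves for the extent of reaction instead. The sketch below assumes initial concentrations of 0.5 and 0.3 with no C present; the names `x`, `A0`, `B0`, and `K_demo` are illustrative.

```python
from sympy import symbols, Eq, solve

x = symbols("x", real=True)      # extent of reaction (mol/L)
K_demo = 1.5                     # same equilibrium constant as above
A0, B0 = 0.5, 0.3                # assumed initial concentrations; no C present initially

# K = [C] / ([A][B]) with [A] = A0 - x, [B] = B0 - x, [C] = x
equation = Eq(K_demo, x / ((A0 - x) * (B0 - x)))
roots = solve(equation, x)

# Keep the root that leaves all concentrations non-negative
physical = [float(r) for r in roots if 0 <= r <= min(A0, B0)]
x_eq = physical[0]

print(f"[A] = {A0 - x_eq:.4f}, [B] = {B0 - x_eq:.4f}, [C] = {x_eq:.4f}")
```

The quadratic has two roots, but only one lies between zero and the limiting initial concentration, so the filter discards the unphysical one.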

### Pymatgen – Python Materials Genomics

In [None]:
from pymatgen.core import Composition, Structure

# Create a Composition object for a material
comp = Composition("Fe2O3") # Iron(III) oxide
print(f"Composition: {comp}")
print(f"Total atoms in one formula unit: {comp.reduced_composition.num_atoms:.0f}")
print(f"Molar Mass: {comp.weight:.2f} g/mol")

# Create a simple crystal structure (e.g., a cubic unit cell)
# Structure takes lattice, species, and fractional atomic coordinates as input
# Illustrative only: real Fe2O3 (hematite) is rhombohedral, not cubic
a = 3.84 # Angstroms, lattice parameter for a simple cubic cell
lattice = [[a, 0, 0], [0, a, 0], [0, 0, a]]  # three orthogonal lattice vectors
species = ["Fe", "Fe", "O", "O", "O"]
coords = [[0, 0, 0], [0.5, 0.5, 0.5], [0, 0.5, 0.5], [0.5, 0, 0.5], [0.5, 0.5, 0]]

unit_cell = Structure(lattice, species, coords)
print("\nExample crystal structure created:")
print(unit_cell)
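
As a cross-check on the composition values, the formula mass and elemental mass fractions of Fe2O3 can be worked out in plain Python. This sketch does not use pymatgen; the `atomic_mass_fe_o` table holds standard atomic masses rounded to three decimals, and all names are illustrative.

```python
# Standard atomic masses (g/mol), rounded to three decimals
atomic_mass_fe_o = {"Fe": 55.845, "O": 15.999}
fe2o3 = {"Fe": 2, "O": 3}        # atoms per formula unit

formula_mass = sum(atomic_mass_fe_o[el] * n for el, n in fe2o3.items())
mass_fractions = {
    el: atomic_mass_fe_o[el] * n / formula_mass for el, n in fe2o3.items()
}

print(f"Formula mass: {formula_mass:.2f} g/mol")
for el, fraction in mass_fractions.items():
    print(f"{el}: {fraction:.1%} by mass")
```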

### Final Note

In [None]:
print("""
This tutorial outlined fundamental Python concepts and provided examples of key libraries for chemical and materials science applications.
Practice implementing these concepts in your own workflows for data handling, analysis, and automation.
""")