# Notebook 02: Diffusion-Controlled Reactions
## **PROJECT: Solve the Mystery of the Slow Reaction**

---

## PROJECT SCENARIO

You're a research chemist at a pharmaceutical company. Your team synthesized a promising drug candidate, but there's a problem:

**The key synthesis reaction is 1000× slower than transition state theory predicts!**

Your manager suspects the reaction might be **diffusion-controlled** rather than activation-controlled. Your mission:

1. **Diagnose**: Determine if the reaction is diffusion or activation-controlled
2. **Analyze**: Use experimental data to quantify the rate-limiting step
3. **Optimize**: Recommend a solvent that maximizes reaction rate
4. **Validate**: Test your hypothesis using the Smoluchowski equation

By the end, you'll deliver a **solvent optimization report** with cost-benefit analysis.

---

## LEARNING OBJECTIVES

By the end of this project notebook, you will be able to:
- [ ] Explain the cage effect and encounter pairs in solution-phase reactions
- [ ] Calculate diffusion coefficients using the Stokes-Einstein equation
- [ ] Derive and apply the Smoluchowski equation for diffusion limits
- [ ] Diagnose whether a reaction is diffusion or activation-controlled
- [ ] Use viscosity dependence to distinguish rate-limiting mechanisms
- [ ] Optimize solvent selection for maximum reaction rate

**Self-Assessment**: Check off each objective as you complete it!

---

## PHASE 1: DISCOVER 🔍

Before investigating the data, test your intuition about solution-phase reactions.

### PRE-LAB QUESTIONS

**Question 1**: A reaction in water is 10× faster than in honey (which is ~1000× more viscous). What does this tell you?
- A) The reaction is activation-controlled (viscosity doesn't matter)
- B) The reaction is diffusion-controlled (viscosity slows it down)
- C) The reaction involves a catalyst
- D) Nothing - could be coincidence

<details>
<summary><strong>Click to reveal answer</strong></summary>

**Answer: B) The reaction is diffusion-controlled**

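If a reaction rate falls as viscosity rises, the step in which molecules diffuse together is at least partly rate-limiting, rather than the chemical activation barrier alone. In the fully diffusion-controlled limit, k ∝ 1/η. Activation-controlled reactions show Arrhenius temperature dependence but are essentially independent of viscosity. Note that a purely diffusion-controlled reaction would slow by roughly the full viscosity ratio (~1000×), so a 10× change suggests the reaction is not at the pure diffusion limit in both solvents.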
</details>

**Question 2**: Two molecules meet in solution and collide. In the gas phase, they might collide once and separate. What happens in solution?
- A) Same thing - single collision then separation
- B) They get "trapped" in a solvent cage and collide many times
- C) They immediately react
- D) The solvent prevents them from colliding

<details>
<summary><strong>Click to reveal answer</strong></summary>

**Answer: B) They get trapped in a solvent cage**

This is the **cage effect**! In solution, surrounding solvent molecules create a "cage" that traps reactive pairs together. They undergo 10-100 collisions in a single "encounter" before diffusing apart. This dramatically changes reaction kinetics compared to gas phase.
</details>

**Question 3**: The Smoluchowski diffusion limit for typical small molecules in water is ~10¹⁰ M⁻¹s⁻¹. If your measured rate is 10⁸ M⁻¹s⁻¹, what does this mean?
- A) Your reaction is diffusion-controlled
- B) Your reaction is activation-controlled (limited by chemistry, not diffusion)
- C) You made a measurement error
- D) The Smoluchowski equation is wrong

<details>
<summary><strong>Click to reveal answer</strong></summary>

**Answer: B) Activation-controlled**

If k_observed << k_diffusion, then diffusion is fast enough, and the rate-limiting step is the activation barrier. If k_observed ≈ k_diffusion, the reaction would be diffusion-controlled.
</details>

---

### YOUR CHALLENGE

**Initial Data**: Your reaction measured in various solvents shows:
- Acetone (η = 0.31 cP): k = 2.1 × 10⁹ M⁻¹s⁻¹
- Ethanol (η = 1.07 cP): k = 5.9 × 10⁸ M⁻¹s⁻¹
- DMSO (η = 2.00 cP): k = 3.2 × 10⁸ M⁻¹s⁻¹

**Make a prediction**: 
- Is this reaction diffusion-controlled or activation-controlled?
- What solvent would you recommend to maximize rate?

Write your hypothesis (we'll test it with data!):
- My hypothesis: _______________________

---

## 1. Introduction: Reactions in Solution

In the gas phase, molecules fly freely between collisions, and reaction rates are often determined by collision frequency and energy. In solution, the solvent changes everything:

| Gas Phase | Liquid Phase |
| :--- | :--- |
| Molecules fly freely | Molecules are crowded |
| Single collisions | **Cage Effect** (trapped encounter pairs) |
| Low density | High density, diffusion limits rate |
| $Z \approx 10^{34}$ m$^{-3}$s$^{-1}$ | $Z \approx 10^{36}$ m$^{-3}$s$^{-1}$ |

### The Cage Effect & Encounter Pairs
When reactants A and B meet in solution, they don't just collide once. They get trapped in a **solvent cage** and collide many times (an "encounter").

$$ A + B \underset{k_{-d}}{\overset{k_d}{\rightleftharpoons}} (AB) \xrightarrow{k_a} P $$

-   $k_d$: Rate of diffusing together to form the encounter pair $(AB)$.
-   $k_{-d}$: Rate of diffusing apart.
-   $k_a$: Rate of reaction within the cage.

![Solvent Cage](images/solvent_cage.png)

Let's visualize this phenomenon!

In [None]:
# ============================================================
# GOOGLE COLAB SETUP
# ============================================================
import sys
import os

# Check if running in Google Colab
IN_COLAB = 'google.colab' in sys.modules

if IN_COLAB:
    print("=" * 60)
    print("RUNNING IN GOOGLE COLAB")
    print("=" * 60)

    # Clone repository to access images
    repo_url = "https://github.com/mcbadlon31/Reaction-Dynamics-Physical-Chemistry.git"

    print(f"\nCloning repository: {repo_url}")
    print("This may take a minute...")

    !git clone {repo_url} --depth 1 --quiet

    # Change to repository directory
    os.chdir('Reaction-Dynamics-Physical-Chemistry')

    # Install additional packages if needed
    print("\nInstalling additional packages...")
    !pip install -q seaborn plotly ipywidgets

    print("\n" + "=" * 60)
    print("[SUCCESS] Colab setup complete!")
    print("=" * 60)
    print(f"Current directory: {os.getcwd()}")
    print("\nYou can now run all cells normally.")
    print("Images will load from the cloned repository.")

else:
    print("=" * 60)
    print("RUNNING IN LOCAL JUPYTER ENVIRONMENT")
    print("=" * 60)
    print("\nNo setup needed - using local files")

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation
from IPython.display import HTML, display
from scipy import constants
import ipywidgets as widgets

plt.style.use('seaborn-v0_8-darkgrid')
plt.rcParams.update({
    'figure.figsize': (10, 6),
    'figure.dpi': 120,
    'axes.titlesize': 14,
    'axes.labelsize': 12,
    'lines.linewidth': 2,
    'font.family': 'sans-serif',
    'font.sans-serif': ['Arial', 'DejaVu Sans'],
    'grid.alpha': 0.3
})

k_B = constants.Boltzmann
N_A = constants.Avogadro
pi = np.pi

print("Libraries loaded.")

---

## PHASE 2: INVESTIGATE 🔬

Time to dive into the data and diagnose your mystery reaction!

You'll complete **three investigations**:

1. **Cage Effect Quantification**: Measure how long molecules stay trapped together
2. **Diffusion Coefficient Analysis**: Calculate D using Stokes-Einstein equation
3. **Diagnostic Plot**: Test if your reaction is diffusion or activation-controlled

Each investigation includes:
- Data loading and visualization
- Guided calculations
- Interpretation questions
- Checkpoints with solutions

Let's solve the mystery!

### Visualizing the Cage Effect
The animation below simulates a "crowded" environment. The red particle is trapped by the blue solvent particles, forcing it to collide multiple times with neighbors before escaping.

---

## INVESTIGATION 1: Cage Effect Quantification 📊

### YOUR TASK
Analyze experimental data on cage lifetimes in different solvents to understand how viscosity affects molecular encounters.

### EXERCISE 1.1: Load and Visualize Cage Data

In [None]:
# EXERCISE 1.1: Analyze Cage Effect Data

import pandas as pd

# Load the cage lifetime data
try:
    cage_data = pd.read_csv('data/diffusion/cage_lifetime_data.csv')
    print("✓ Cage effect data loaded successfully!")
    print(f"\nData shape: {cage_data.shape}")
    print(f"\nFirst few rows:")
    print(cage_data.head())
except FileNotFoundError:
    print("⚠️ Data file not found. Make sure you're in the repository directory.")
    cage_data = None

# YOUR TASK: Create visualizations
if cage_data is not None:
    fig, axes = plt.subplots(1, 3, figsize=(16, 5))
    
    # Plot 1: Cage lifetime vs viscosity
    axes[0].scatter(cage_data['viscosity_cP'], cage_data['cage_lifetime_ps'], 
                    s=100, alpha=0.7, c='blue', edgecolors='black')
    axes[0].set_xlabel('Viscosity (cP)', fontsize=12)
    axes[0].set_ylabel('Cage Lifetime (ps)', fontsize=12)
    axes[0].set_title('Cage Lifetime vs. Viscosity', fontsize=14)
    axes[0].set_xscale('log')
    axes[0].set_yscale('log')
    axes[0].grid(True, alpha=0.3)
    
    # Annotate some points
    for idx in [0, 7, 10]:  # hexane, ethanol, DMSO
        row = cage_data.iloc[idx]
        axes[0].annotate(row['solvent'], 
                        (row['viscosity_cP'], row['cage_lifetime_ps']),
                        xytext=(5, 5), textcoords='offset points', fontsize=9)
    
    # Plot 2: Average collisions per cage
    axes[1].scatter(cage_data['viscosity_cP'], cage_data['average_collisions_per_cage'],
                    s=100, alpha=0.7, c='green', edgecolors='black')
    axes[1].set_xlabel('Viscosity (cP)', fontsize=12)
    axes[1].set_ylabel('Collisions per Cage', fontsize=12)
    axes[1].set_title('Cage Collision Count', fontsize=14)
    axes[1].set_xscale('log')
    axes[1].grid(True, alpha=0.3)
    
    # Plot 3: Escape probability
    axes[2].scatter(cage_data['viscosity_cP'], cage_data['escape_probability'],
                    s=100, alpha=0.7, c='red', edgecolors='black')
    axes[2].set_xlabel('Viscosity (cP)', fontsize=12)
    axes[2].set_ylabel('Escape Probability', fontsize=12)
    axes[2].set_title('Probability of Escaping Cage', fontsize=14)
    axes[2].set_xscale('log')
    axes[2].grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()
    
    print("\n" + "="*70)
    print("ANALYSIS QUESTIONS")
    print("="*70)
    print("\n1. How does cage lifetime scale with viscosity?")
    print("   Observation: As viscosity increases, cage lifetime ______ (increases/decreases)")
    
    print("\n2. In hexane (η=0.30 cP) vs DMSO (η=2.0 cP), how many times longer is the cage?")
    hexane_lifetime = cage_data[cage_data['solvent'] == 'hexane']['cage_lifetime_ps'].values[0]
    dmso_lifetime = cage_data[cage_data['solvent'] == 'DMSO']['cage_lifetime_ps'].values[0]
    print(f"   Hexane cage lifetime: {hexane_lifetime} ps")
    print(f"   DMSO cage lifetime: {dmso_lifetime} ps")
    print(f"   Ratio: DMSO/Hexane = {dmso_lifetime/hexane_lifetime:.1f}×")
    
    print("\n3. What does 'escape probability' mean?")
    print("   It's the fraction of encounter pairs that diffuse apart WITHOUT reacting")
    print("   Low escape probability → more likely to react while caged")
    print("="*70)

In [None]:
class CageEffectAnimator:
    """Simulate the cage effect in 2D"""
    
    def __init__(self, n_solvent=50):
        self.n_solvent = n_solvent
        self.fig, self.ax = plt.subplots(figsize=(6, 6))
        self.setup_simulation()
        
    def setup_simulation(self):
        self.ax.set_xlim(-5, 5)
        self.ax.set_ylim(-5, 5)
        self.ax.set_aspect('equal')
        self.ax.set_title('Solvent Cage Effect')
        self.ax.grid(False)
        
        # Solvent particles (blue)
        self.solvent_x = np.random.uniform(-5, 5, self.n_solvent)
        self.solvent_y = np.random.uniform(-5, 5, self.n_solvent)
        self.solvent_scat = self.ax.scatter(self.solvent_x, self.solvent_y, 
                                          c='lightblue', s=100, alpha=0.6)
        
        # Reactant particle (red)
        self.reactant_x = [0.0]
        self.reactant_y = [0.0]
        self.reactant_scat = self.ax.scatter([0], [0], c='red', s=150, edgecolors='black')
        
        # Trajectory line
        self.traj_line, = self.ax.plot([], [], 'r-', linewidth=1, alpha=0.5)
        
    def animate(self):
        def update(frame):
            # Random walk step for reactant
            step_size = 0.2
            dx = np.random.normal(0, step_size)
            dy = np.random.normal(0, step_size)
            
            # Simple hard-sphere repulsion from solvent
            new_x = self.reactant_x[-1] + dx
            new_y = self.reactant_y[-1] + dy
            
            # Check collisions with solvent (simplified)
            for i in range(self.n_solvent):
                dist = np.sqrt((new_x - self.solvent_x[i])**2 + (new_y - self.solvent_y[i])**2)
                if dist < 0.8: # Collision radius
                    # Bounce back
                    new_x = self.reactant_x[-1] - dx
                    new_y = self.reactant_y[-1] - dy
                    break
            
            # Boundary check
            if abs(new_x) > 5: new_x = self.reactant_x[-1]
            if abs(new_y) > 5: new_y = self.reactant_y[-1]
            
            self.reactant_x.append(new_x)
            self.reactant_y.append(new_y)
            
            # Keep trajectory short
            if len(self.reactant_x) > 50:
                self.reactant_x.pop(0)
                self.reactant_y.pop(0)
                
            self.reactant_scat.set_offsets(np.c_[new_x, new_y])
            self.traj_line.set_data(self.reactant_x, self.reactant_y)
            
            # Jiggle solvent slightly
            self.solvent_x += np.random.normal(0, 0.05, self.n_solvent)
            self.solvent_y += np.random.normal(0, 0.05, self.n_solvent)
            self.solvent_scat.set_offsets(np.c_[self.solvent_x, self.solvent_y])
            
            return self.reactant_scat, self.traj_line, self.solvent_scat
            
        anim = FuncAnimation(self.fig, update, frames=100, interval=50, blit=True)
        plt.close(self.fig)  # close only this figure so no static duplicate is rendered
        return HTML(anim.to_jshtml())

print("\n🎬 Cage Effect Animation:")
animator = CageEffectAnimator(n_solvent=40)
display(animator.animate())

## 2. Diffusion and the Stokes-Einstein Equation

### 2.1 Fick's First Law
Diffusion is the net movement of particles from high concentration to low concentration. The flux $J$ (particles per area per second) is proportional to the concentration gradient:
$$ J = -D \frac{\partial c}{\partial x} $$
where $D$ is the **diffusion coefficient** (m$^2$/s).

### 2.2 Stokes-Einstein Equation
For a spherical particle of radius $R$ moving in a solvent of viscosity $\eta$, the drag force is $F = 6\pi\eta R v$. Einstein showed that the diffusion coefficient is related to thermal energy and this drag:
$$ D = \frac{k_B T}{6 \pi \eta R} $$

**Implications**:
-   Smaller molecules diffuse faster.
-   Higher temperature increases diffusion (more thermal energy).
-   Higher viscosity slows diffusion (more drag).
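
The viscosity scaling can be made concrete with a short, non-interactive sketch. It applies the Stokes-Einstein equation to the three solvents from the challenge data; the 0.5 nm radius is an assumed, illustrative value, and `stokes_einstein` is a helper name introduced here, not part of the notebook.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)

def stokes_einstein(T, eta_cP, R_nm):
    """Return D in m^2/s for a sphere of radius R_nm (nm) in a solvent of viscosity eta_cP (cP)."""
    eta = eta_cP * 1e-3  # cP -> Pa s
    R = R_nm * 1e-9      # nm -> m
    return K_B * T / (6 * math.pi * eta * R)

# Viscosities (cP) taken from the challenge data; the radius is an assumption
solvents = {"acetone": 0.31, "ethanol": 1.07, "DMSO": 2.00}
D_values = {name: stokes_einstein(298, eta, 0.5) for name, eta in solvents.items()}

for name, D in D_values.items():
    # D * eta is the same for every solvent because D scales as 1/eta
    print(f"{name:8s} D = {D:.2e} m^2/s   D*eta = {D * solvents[name]:.3e}")
```

Because $D \propto 1/\eta$ at fixed $T$ and $R$, the product $D\eta$ printed in the last column should be identical for all three solvents.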

In [None]:
def calculate_diffusion(T, eta_cP, R_nm):
    # Convert units
    eta = eta_cP * 1e-3  # Pa s (1 cP = 10^-3 Pa s)
    R = R_nm * 1e-9  # m
    
    # Stokes-Einstein
    D = (k_B * T) / (6 * pi * eta * R)
    
    print(f"Temperature: {T} K")
    print(f"Viscosity: {eta_cP} cP")
    print(f"Hydrodynamic Radius: {R_nm} nm")
    print(f"Diffusion Coefficient D: {D:.2e} m^2/s")
    print(f"D: {D * 1e9:.2f} x 10^-9 m^2/s (typical units)")

widgets.interact(calculate_diffusion, 
                 T=widgets.FloatSlider(min=200, max=400, step=10, value=298, description='T (K)'),
                 eta_cP=widgets.FloatSlider(min=0.1, max=10, step=0.1, value=0.89, description='Viscosity (cP)'),
                 R_nm=widgets.FloatSlider(min=0.1, max=5, step=0.1, value=0.5, description='Radius (nm)'));

## 3. The Smoluchowski Limit

What is the maximum possible rate for a reaction A + B $\rightarrow$ P?
Smoluchowski solved the diffusion equation for particles B diffusing towards a "sink" (particle A) that absorbs them immediately upon contact (at distance $R^* = R_A + R_B$).

### Derivation Sketch
1.  **Fick's First Law**: Flux $J = -D \frac{\partial [B]}{\partial r}$.
2.  **Steady-State Profile**: The concentration of B around A is given by:
    $$ [B](r) = [B]_{bulk} \left(1 - \frac{R^*}{r}\right) $$
    (Boundary condition: $[B] = 0$ at $r = R^*$).
3.  **Total Flux**: The total flux into the sphere of radius $R^*$ is:
    $$ \Phi = 4\pi (R^*)^2 J|_{R^*} = 4\pi R^* D [B]_{bulk} $$
4.  **Rate Constant**: The rate constant is the flux per unit concentration:
    $$ k_d = 4\pi R^* D $$
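
The derivation sketch above can be checked numerically. The following is a minimal sketch, assuming illustrative values for $D$, $R^*$, and $[B]_{bulk}$ (none of them come from the dataset): it differentiates the steady-state profile by finite differences and confirms that the flux through any spherical shell around A equals $4\pi R^* D [B]_{bulk}$, as a steady state requires.

```python
import numpy as np

# Assumed, illustrative parameters
D = 2.0e-9       # m^2/s, diffusion coefficient
R_star = 0.5e-9  # m, reaction distance
B_bulk = 1.0     # bulk concentration of B (arbitrary units per m^3)

def conc_B(r):
    """Steady-state concentration of B at distance r from the absorbing sphere A."""
    return B_bulk * (1.0 - R_star / r)

# Evaluate the flux through shells at several radii outside R*
radii = R_star * np.array([1.5, 2.0, 5.0, 10.0])
h = radii * 1e-5
dBdr = (conc_B(radii + h) - conc_B(radii - h)) / (2 * h)  # central difference

shell_flux = 4 * np.pi * radii**2 * D * dBdr  # magnitude of the inward flux
k_d_numeric = shell_flux / B_bulk
k_d_analytic = 4 * np.pi * R_star * D

print(k_d_numeric / k_d_analytic)  # ratios should all be very close to 1
```

The flux is independent of the shell radius, which is the defining property of the steady-state profile in step 2.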

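Both reactants diffuse, so $D$ is replaced by the relative diffusion coefficient $D_A + D_B$. Multiplying by $N_A$ converts the per-molecule rate constant to molar units:
$$ k_d = 4 \pi R^* (D_A + D_B) N_A $$
In SI units this is in m$^3$ mol$^{-1}$ s$^{-1}$; multiply by $10^3$ to obtain M$^{-1}$s$^{-1}$.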

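Using the Stokes-Einstein equation ($D = \frac{k_B T}{6\pi\eta R}$) for each reactant and assuming equal radii ($R_A = R_B = R^*/2$), the radii cancel and we get a remarkable result:
$$ k_d \approx \frac{8RT}{3\eta} $$
Here $R$ is the gas constant ($R = N_A k_B$), not a molecular radius.

**Key Insight**: The diffusion limit depends mainly on **solvent viscosity** and **temperature**, not on the size of the reactants!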
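
To see how weak the size dependence is, here is a hedged sketch comparing $8RT/3\eta$ with the expression obtained without assuming equal radii, $k_d = \frac{2RT}{3\eta}\frac{(R_A+R_B)^2}{R_A R_B}$, which follows from substituting Stokes-Einstein for $D_A$ and $D_B$ separately. The function names are introduced here for illustration. The viscosity of 0.89 cP is the water default used in the slider above, and the radii 0.3 nm and 0.4 nm are the values used later in the capstone.

```python
import math

R_GAS = 8.314462618  # gas constant in J mol^-1 K^-1
T = 298.0            # K

def kd_equal_radii(eta_cP):
    """Diffusion limit 8RT/(3*eta) in M^-1 s^-1, assuming R_A = R_B."""
    eta = eta_cP * 1e-3                     # cP -> Pa s
    return 8 * R_GAS * T / (3 * eta) * 1e3  # m^3 mol^-1 s^-1 -> L mol^-1 s^-1

def kd_unequal_radii(eta_cP, R_A_nm, R_B_nm):
    """Smoluchowski limit with a Stokes-Einstein D for each reactant."""
    eta = eta_cP * 1e-3
    size_factor = (R_A_nm + R_B_nm) ** 2 / (R_A_nm * R_B_nm)  # equals 4 when R_A = R_B
    return 2 * R_GAS * T / (3 * eta) * size_factor * 1e3

kd_simple = kd_equal_radii(0.89)
kd_full = kd_unequal_radii(0.89, 0.3, 0.4)
print(f"8RT/(3 eta):        {kd_simple:.2e} M^-1 s^-1")
print(f"Unequal radii:      {kd_full:.2e} M^-1 s^-1")
print(f"Ratio full/simple:  {kd_full / kd_simple:.4f}")
```

For these radii the two estimates differ by about 2%, which is why the simplified formula is usually adequate for similarly sized reactants.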

In [None]:
def smoluchowski_limit(D_sum_10_9, R_star_nm):
    D_sum = D_sum_10_9 * 1e-9  # m^2/s
    R_star = R_star_nm * 1e-9  # m
    
    # Rate constant in m^3 molecule^-1 s^-1
    kd_SI = 4 * pi * R_star * D_sum
    
    # Convert to M^-1 s^-1 (L mol^-1 s^-1)
    # Multiply by N_A to get per mole
    # Multiply by 1000 to convert m^3 to L
    kd_M = kd_SI * N_A * 1000
    
    print(f"Sum of Diffusion Coeffs: {D_sum:.2e} m^2/s")
    print(f"Reaction Distance R*: {R_star_nm} nm")
    print(f"Smoluchowski Limit kd: {kd_M:.2e} M^-1 s^-1")
    
# The interact call must sit at module level, outside the function body
widgets.interact(smoluchowski_limit, 
                 D_sum_10_9=widgets.FloatSlider(min=0.1, max=10, step=0.1, value=2.0, description='D_sum (10^-9)'),
                 R_star_nm=widgets.FloatSlider(min=0.1, max=2, step=0.1, value=0.5, description='R* (nm)'));

## 4. Activation vs. Diffusion Control

Real reactions involve both diffusion ($k_d$) and chemical activation ($k_a$). Applying the **Steady-State Approximation** to the encounter pair $(AB)$:

$$ \frac{d[(AB)]}{dt} = k_d[A][B] - k_{-d}[(AB)] - k_a[(AB)] \approx 0 $$

Solving for $[(AB)]$ and substituting into Rate $= k_a[(AB)]$, we get:
$$ k_{eff} = \frac{k_a k_d}{k_{-d} + k_a} $$

-   **Diffusion Control** ($k_a \gg k_{-d}$): $k_{eff} \approx k_d$. Rate depends on viscosity ($k \propto T/\eta$).
-   **Activation Control** ($k_a \ll k_{-d}$): $k_{eff} \approx K_{eq} k_a$, where $K_{eq} = k_d/k_{-d}$ is the equilibrium constant for forming the encounter pair. Rate depends on activation energy ($k \propto e^{-E_a/RT}$).
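
The two limits can be verified with a few lines of Python. This is a minimal sketch with assumed, order-of-magnitude rate constants (not fitted to any data); `k_effective` and `escape_probability` are illustrative helper names. The escape probability is the fraction of encounter pairs that diffuse apart without reacting, $k_{-d}/(k_{-d} + k_a)$.

```python
def k_effective(k_d, k_minus_d, k_a):
    """Steady-state rate constant k_eff = k_a * k_d / (k_-d + k_a)."""
    return k_a * k_d / (k_minus_d + k_a)

def escape_probability(k_minus_d, k_a):
    """Fraction of encounter pairs that separate without reacting."""
    return k_minus_d / (k_minus_d + k_a)

# Assumed, illustrative values
k_d = 1e10        # M^-1 s^-1, formation of the encounter pair
k_minus_d = 1e10  # s^-1, separation of the encounter pair
k_a_fast = 1e13   # s^-1, reaction much faster than separation
k_a_slow = 1e7    # s^-1, reaction much slower than separation

k_fast = k_effective(k_d, k_minus_d, k_a_fast)
k_slow = k_effective(k_d, k_minus_d, k_a_slow)
K_eq = k_d / k_minus_d

print(f"Fast chemistry: k_eff = {k_fast:.3e} (k_d = {k_d:.3e})")
print(f"Slow chemistry: k_eff = {k_slow:.3e} (K_eq * k_a = {K_eq * k_a_slow:.3e})")
print(f"Escape probability, fast: {escape_probability(k_minus_d, k_a_fast):.4f}")
print(f"Escape probability, slow: {escape_probability(k_minus_d, k_a_slow):.4f}")
```

With fast chemistry almost every encounter leads to product and $k_{eff}$ approaches $k_d$; with slow chemistry almost every pair escapes and $k_{eff}$ approaches $K_{eq} k_a$.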

### Phase Diagram Explorer
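The plot below shows the transition between the two regimes. In the explorer, `ka` stands for the activation-limited bimolecular rate constant ($K_{eq} k_a$ with $K_{eq} = k_d/k_{-d}$), so the expression above becomes $k_{obs} = k_d k_a / (k_d + k_a)$. The "Observed Rate" (black line) follows the slower of the two limits.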

In [None]:
def plot_rate_control(kd_log, ka_log):
    kd = 10**kd_log
    ka = 10**ka_log
    
    k_obs = (kd * ka) / (kd + ka)
    
    print(f"Diffusion Limit kd = {kd:.2e}")
    print(f"Activation Rate ka = {ka:.2e}")
    print(f"Observed Rate k_obs = {k_obs:.2e}")
    
    if kd < 0.1 * ka:
        print("Regime: Diffusion Controlled (limited by transport)")
    elif ka < 0.1 * kd:
        print("Regime: Activation Controlled (limited by chemistry)")
    else:
        print("Regime: Mixed Control")
        
    # Plot dependence on viscosity
    viscosity = np.logspace(-2, 1, 100) # relative viscosity
    # kd scales as 1/eta
    kd_vals = kd / viscosity
    
    k_obs_vals = (kd_vals * ka) / (kd_vals + ka)
    
    plt.figure(figsize=(8, 5))
    plt.loglog(viscosity, kd_vals, '--', label='Diffusion Limit (kd ~ 1/eta)')
    plt.loglog(viscosity, [ka]*len(viscosity), '--', label='Activation Limit (ka)')
    plt.loglog(viscosity, k_obs_vals, 'k-', linewidth=2, label='Observed Rate (k_obs)')
    plt.xlabel('Relative Viscosity')
    plt.ylabel('Rate Constant')
    plt.title('Effect of Viscosity on Reaction Rate')
    plt.legend()
    plt.grid(True)
    plt.show()

widgets.interact(plot_rate_control, 
                 kd_log=widgets.FloatSlider(min=5, max=11, step=0.1, value=9, description='log(kd)'),
                 ka_log=widgets.FloatSlider(min=5, max=11, step=0.1, value=10, description='log(ka)'));

---

## PHASE 3: SYNTHESIZE 🎯

## CAPSTONE CHALLENGE: Solvent Optimization Report

Time to solve your pharmaceutical mystery and deliver your recommendation!

---

### THE COMPLETE DATASET

You now have rate constant measurements for your reaction in 13 different solvents. Your task is to:

1. **Diagnose**: Prove whether the reaction is diffusion or activation-controlled
2. **Calculate**: Determine the Smoluchowski diffusion limit for each solvent
3. **Compare**: Calculate the ratio k_observed / k_diffusion
4. **Optimize**: Recommend the best solvent considering rate, cost, and safety
5. **Report**: Write a technical memo justifying your choice

Let's analyze the complete dataset:

In [None]:
# CAPSTONE CHALLENGE: Comprehensive Solvent Analysis

# Load all datasets
try:
    mystery_data = pd.read_csv('data/diffusion/mystery_reaction_data.csv')
    solvent_props = pd.read_csv('data/diffusion/solvent_properties.csv')
    
    # Merge datasets
    full_data = mystery_data.merge(solvent_props, on='solvent', how='left')
    
    print("✓ All data loaded successfully!")
    print(f"\nDataset: {len(full_data)} solvents analyzed\n")
    
except FileNotFoundError:
    print("⚠️ Data files not found")
    full_data = None

# STEP 1: Calculate Smoluchowski diffusion limits
if full_data is not None:
    
    # Parameters for the reaction (typical small molecules)
    R_A = 0.3e-9  # m (molecule A radius)
    R_B = 0.4e-9  # m (molecule B radius)
    R_star = R_A + R_B  # Reaction distance
    T = 298  # K
    
    # Calculate diffusion coefficients using Stokes-Einstein
    def calculate_D(eta_cP, R_nm):
        eta = eta_cP * 1e-3  # Convert to Pa·s
        R = R_nm * 1e-9      # Convert to m
        D = (k_B * T) / (6 * pi * eta * R)
        return D
    
    # Calculate D for both molecules in each solvent
    full_data['D_A'] = full_data['viscosity_cP_298K'].apply(lambda eta: calculate_D(eta, R_A*1e9))
    full_data['D_B'] = full_data['viscosity_cP_298K'].apply(lambda eta: calculate_D(eta, R_B*1e9))
    full_data['D_sum'] = full_data['D_A'] + full_data['D_B']
    
    # Calculate Smoluchowski limit: k_d = 4πR*D N_A (in M^-1 s^-1)
    full_data['k_diffusion'] = 4 * pi * R_star * full_data['D_sum'] * N_A * 1000  # *1000 for m³ to L
    
    # Calculate the ratio (diagnostic for diffusion control)
    full_data['k_ratio'] = full_data['rate_constant_M_inv_s_inv'] / full_data['k_diffusion']
    
    # STEP 2: Diagnostic Analysis
    print("="*80)
    print("DIAGNOSTIC ANALYSIS: DIFFUSION vs. ACTIVATION CONTROL")
    print("="*80)
    
    print("\nKey Metrics by Solvent:")
    print(full_data[['solvent', 'viscosity_cP_298K', 'rate_constant_M_inv_s_inv', 
                      'k_diffusion', 'k_ratio']].to_string(index=False))
    
    # Calculate average ratio
    avg_ratio = full_data['k_ratio'].mean()
    
    print(f"\n📊 DIAGNOSIS:")
    print(f"   Average k_obs / k_diff = {avg_ratio:.3f}")
    
    if avg_ratio > 0.5:
        print(f"\n   ✓ DIFFUSION-CONTROLLED!")
        print(f"   The observed rate is {avg_ratio*100:.0f}% of the diffusion limit.")
        print(f"   Viscosity is the primary rate-limiting factor.")
    else:
        print(f"\n   ✗ ACTIVATION-CONTROLLED")
        print(f"   The observed rate is only {avg_ratio*100:.0f}% of what diffusion allows.")
        print(f"   Chemical activation barrier dominates.")
    
    print("\n" + "="*80)
    
    # STEP 3: Test viscosity dependence
    # For diffusion control: k ∝ 1/η (from Smoluchowski: k_d = 8RT/3η)
    # Let's check correlation
    import scipy.stats as stats
    
    # Log-log plot should give slope of -1 for diffusion control
    log_eta = np.log10(full_data['viscosity_cP_298K'])
    log_k = np.log10(full_data['rate_constant_M_inv_s_inv'])
    
    slope, intercept, r_value, p_value, std_err = stats.linregress(log_eta, log_k)
    
    print("\nVISCOSITY DEPENDENCE TEST:")
    print(f"   log(k) vs log(η) slope: {slope:.2f}")
    print(f"   R² = {r_value**2:.3f}")
    print(f"\n   Theory predicts:")
    print(f"   - Diffusion control: slope ≈ -1.0")
    print(f"   - Activation control: slope ≈ 0.0")
    print(f"\n   Your reaction: slope = {slope:.2f}")
    
    if abs(slope + 1.0) < 0.2:
        print(f"   ✓ Confirms diffusion control!")
    else:
        print(f"   → Mixed or activation control")
    
    print("="*80)
    
    # STEP 4: Optimization with constraints
    print("\nSOLVENT OPTIMIZATION")
    print("="*80)
    
    # Add scoring: rate (50%), cost (30%), safety (20%)
    # Normalize each metric
    full_data['rate_score'] = (full_data['rate_constant_M_inv_s_inv'] / 
                                 full_data['rate_constant_M_inv_s_inv'].max()) * 50
    full_data['cost_score'] = (1 - full_data['cost_per_L_USD'] / 
                                full_data['cost_per_L_USD'].max()) * 30
    full_data['safety_score'] = (full_data['safety_rating'] / 5) * 20
    full_data['total_score'] = (full_data['rate_score'] + full_data['cost_score'] + 
                                 full_data['safety_score'])
    
    # Sort by total score
    ranked = full_data.sort_values('total_score', ascending=False)
    
    print("\nTOP 5 SOLVENT RECOMMENDATIONS:")
    print(ranked[['solvent', 'rate_constant_M_inv_s_inv', 'cost_per_L_USD', 
                   'safety_rating', 'total_score']].head().to_string(index=False))
    
    optimal_solvent = ranked.iloc[0]['solvent']
    optimal_k = ranked.iloc[0]['rate_constant_M_inv_s_inv']
    optimal_cost = ranked.iloc[0]['cost_per_L_USD']
    optimal_safety = ranked.iloc[0]['safety_rating']
    
    print(f"\n🎯 RECOMMENDED SOLVENT: {optimal_solvent.upper()}")
    print(f"   Rate constant: {optimal_k:.2e} M⁻¹s⁻¹")
    print(f"   Cost: ${optimal_cost:.0f}/L")
    print(f"   Safety rating: {optimal_safety}/5")
    
    print("\n" + "="*80)
    
    # STEP 5: Visualizations
    fig, axes = plt.subplots(2, 2, figsize=(14, 10))
    
    # Plot 1: k_obs vs k_diffusion
    axes[0, 0].scatter(full_data['k_diffusion'], full_data['rate_constant_M_inv_s_inv'],
                       s=100, alpha=0.7, c='blue', edgecolors='black')
    # Add diagonal line (y=x) for reference
    max_k = max(full_data['k_diffusion'].max(), full_data['rate_constant_M_inv_s_inv'].max())
    axes[0, 0].plot([0, max_k], [0, max_k], 'r--', linewidth=2, 
                    label='k_obs = k_diff (diffusion limit)')
    axes[0, 0].set_xlabel('k_diffusion (M⁻¹s⁻¹)', fontsize=12)
    axes[0, 0].set_ylabel('k_observed (M⁻¹s⁻¹)', fontsize=12)
    axes[0, 0].set_title('Observed vs. Diffusion Limit', fontsize=14)
    axes[0, 0].legend()
    axes[0, 0].grid(True, alpha=0.3)
    
    # Plot 2: k vs 1/η
    axes[0, 1].scatter(1/full_data['viscosity_cP_298K'], full_data['rate_constant_M_inv_s_inv'],
                       s=100, alpha=0.7, c='green', edgecolors='black')
    axes[0, 1].set_xlabel('1/η (1/cP)', fontsize=12)
    axes[0, 1].set_ylabel('k (M⁻¹s⁻¹)', fontsize=12)
    axes[0, 1].set_title('Rate vs. Inverse Viscosity', fontsize=14)
    axes[0, 1].grid(True, alpha=0.3)
    
    # Plot 3: log-log plot for slope analysis
    axes[1, 0].scatter(log_eta, log_k, s=100, alpha=0.7, c='red', edgecolors='black')
    # Add regression line
    axes[1, 0].plot(log_eta, slope * log_eta + intercept, 'k--', linewidth=2,
                    label=f'Slope = {slope:.2f}')
    axes[1, 0].set_xlabel('log(η)', fontsize=12)
    axes[1, 0].set_ylabel('log(k)', fontsize=12)
    axes[1, 0].set_title('Log-Log Viscosity Dependence', fontsize=14)
    axes[1, 0].legend()
    axes[1, 0].grid(True, alpha=0.3)
    
    # Plot 4: Optimization scorecard
    top_5 = ranked.head(5)
    x_pos = np.arange(len(top_5))
    axes[1, 1].bar(x_pos, top_5['total_score'], alpha=0.7, color='purple', edgecolor='black')
    axes[1, 1].set_xticks(x_pos)
    axes[1, 1].set_xticklabels(top_5['solvent'], rotation=45, ha='right')
    axes[1, 1].set_ylabel('Total Score', fontsize=12)
    axes[1, 1].set_title('Solvent Optimization Ranking', fontsize=14)
    axes[1, 1].grid(True, alpha=0.3, axis='y')
    
    plt.tight_layout()
    plt.show()

### 📝 YOUR TECHNICAL MEMO

Based on your analysis, write a brief recommendation (3-4 sentences) to your manager:

**MEMO TEMPLATE:**

**TO**: Project Manager  
**FROM**: [Your Name]  
**RE**: Solvent Optimization for Drug Synthesis

**FINDINGS**:
1. Is the reaction diffusion or activation-controlled? (cite evidence: k_obs/k_diff ratio, viscosity slope)
2. What is the recommended solvent and why?
3. What are the expected benefits (rate increase, cost, safety)?
4. What are the risks or limitations?
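
When citing the viscosity slope as evidence, a quick cross-check on the three initial measurements from the challenge (acetone, ethanol, DMSO) can support the full-dataset result. The sketch below uses only those three published points and `numpy.polyfit`; with so few points it is a sanity check, not a substitute for the 13-solvent regression.

```python
import numpy as np

# Initial challenge data: viscosity (cP) and observed rate constant (M^-1 s^-1)
eta_cP = np.array([0.31, 1.07, 2.00])   # acetone, ethanol, DMSO
k_obs = np.array([2.1e9, 5.9e8, 3.2e8])

# Slope of log10(k) vs log10(eta); diffusion control predicts a slope near -1
slope, intercept = np.polyfit(np.log10(eta_cP), np.log10(k_obs), 1)

# Equivalent check: k * eta should be roughly constant if k scales as 1/eta
k_times_eta = k_obs * eta_cP

print(f"log-log slope = {slope:.2f}")
print("k * eta =", k_times_eta)
```

A slope close to -1 together with a nearly constant $k\eta$ product is consistent with diffusion control for these three solvents.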

---

### 🎓 LEARNING OBJECTIVES - REVIEW

Go back and check off your completed objectives:
- [x] Explain the cage effect and encounter pairs in solution-phase reactions
- [x] Calculate diffusion coefficients using the Stokes-Einstein equation
- [x] Derive and apply the Smoluchowski equation for diffusion limits
- [x] Diagnose whether a reaction is diffusion or activation-controlled
- [x] Use viscosity dependence to distinguish rate-limiting mechanisms
- [x] Optimize solvent selection for maximum reaction rate

**Excellent work! You've solved the pharmaceutical mystery!**

---

### 🤔 FINAL REFLECTION

1. **Real-World Application**: How might this analysis apply to biological systems (enzyme kinetics, cellular reactions)?
   - Hint: Viscosity in cells is ~3× higher than water!

2. **Temperature vs. Viscosity**: For a diffusion-controlled reaction, how would you speed it up?
   - Increase temperature (↑ D, ↓ η)?
   - Change solvent (↓ η)?
   - Which is more practical industrially?

3. **Connection to Next Topics**: How might Marcus electron transfer theory (Notebook 05) relate to diffusion control?
   - Preview: Long-range ET can avoid diffusion limits!

---

### 📚 EXTENSIONS

Want to go deeper?
- Calculate the Debye-Smoluchowski equation for charged reactants (ionic strength effects)
- Explore the Stokes-Einstein violation in supercooled liquids
- Model reaction-diffusion patterns (Turing patterns in biology)

---

## CONGRATULATIONS! 🎉

You've successfully:
- ✅ Diagnosed a diffusion-controlled pharmaceutical reaction
- ✅ Quantified the Smoluchowski diffusion limit across 13 solvents
- ✅ Optimized solvent selection with multi-criteria analysis
- ✅ Mastered the Stokes-Einstein equation and viscosity effects

**Ready for Notebook 03: Transition State Theory!**

### The Material-Balance Equation
For a more general description of concentration changes in space and time (e.g., in a flow reactor or biological cell), we combine diffusion, convection, and reaction:

$$ \frac{\partial [J]}{\partial t} = \underbrace{D \frac{\partial^2 [J]}{\partial x^2}}_{Diffusion} - \underbrace{v \frac{\partial [J]}{\partial x}}_{Convection} - \underbrace{k_r [J]}_{Reaction} $$

In [None]:
def solve_reaction_diffusion(D, kr, time_max):
    """Explicit finite-difference solver for the reaction-diffusion equation (no convection, v = 0)"""
    
    # Spatial domain
    L = 10.0
    nx = 100
    x = np.linspace(-L/2, L/2, nx)
    dx = x[1] - x[0]  # actual grid spacing, L / (nx - 1)
    
    # Time step (stability condition: dt < dx^2 / 2D)
    dt = 0.2 * dx**2 / D
    nt = int(time_max / dt)
    save_every = max(nt // 10, 1)  # guard against modulo-by-zero when nt < 10
    
    # Initial condition: Gaussian pulse
    u = np.exp(-x**2)
    
    # Store history for plotting
    history = [u.copy()]
    times = [0.0]
    
    # Time stepping loop
    for n in range(nt):
        un = u.copy()
        # Forward-Euler update on interior points (boundary values stay fixed):
        # u_new = u + dt * (D * d2u/dx2 - kr * u)
        diffusion = D * (un[2:] - 2*un[1:-1] + un[:-2]) / dx**2
        reaction = -kr * un[1:-1]
        u[1:-1] = un[1:-1] + dt * (diffusion + reaction)
            
        # Save a snapshot, labeled with the time actually reached
        if (n + 1) % save_every == 0:
            history.append(u.copy())
            times.append((n + 1) * dt)
            
    # Plotting
    plt.figure(figsize=(10, 6))
    
    # Plot evolution (later snapshots are drawn more opaque)
    for i, u_state in enumerate(history):
        alpha = 0.2 + 0.8 * (i / len(history))
        plt.plot(x, u_state, label=f't={times[i]:.2f}', color=plt.cm.viridis(i/len(history)),
                 alpha=alpha, linewidth=2)
        
    plt.xlabel('Position x')
    plt.ylabel('Concentration [J]')
    plt.title(f'Reaction-Diffusion Evolution (D={D}, kr={kr})')
    plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
    plt.grid(True)
    plt.tight_layout()
    plt.show()

print("\n📊 Reaction-Diffusion Solver:")
widgets.interact(solve_reaction_diffusion, 
                 D=widgets.FloatSlider(min=0.1, max=2.0, step=0.1, value=1.0, description='Diffusion D'),
                 kr=widgets.FloatSlider(min=0.0, max=1.0, step=0.1, value=0.2, description='Reaction k'),
                 time_max=widgets.FloatSlider(min=1.0, max=10.0, step=1.0, value=5.0, description='Max Time'));
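
One way to gain confidence in an explicit scheme like this is to compare it with a closed-form solution. Assuming no convection ($v = 0$) and a domain wide enough to act as infinite, the Gaussian pulse $e^{-x^2}$ evolves as $[J](x,t) = \frac{e^{-k_r t}}{\sqrt{1+4Dt}} \exp\left(-\frac{x^2}{1+4Dt}\right)$. The sketch below re-implements the update in a compact, plot-free form (`ftcs_reaction_diffusion` and `analytic_solution` are names introduced here, and the test parameters are arbitrary) and reports the largest deviation.

```python
import numpy as np

def ftcs_reaction_diffusion(D, kr, t_end, L=10.0, nx=201):
    """Explicit (FTCS) solution of du/dt = D d2u/dx2 - kr u with u(x, 0) = exp(-x^2)."""
    x = np.linspace(-L / 2, L / 2, nx)
    dx = x[1] - x[0]
    n_steps = int(np.ceil(t_end / (0.2 * dx**2 / D)))  # keeps dt <= 0.2 dx^2 / D
    dt = t_end / n_steps                                # land exactly on t_end
    u = np.exp(-x**2)
    for _ in range(n_steps):
        laplacian = (u[2:] - 2 * u[1:-1] + u[:-2]) / dx**2
        u[1:-1] += dt * (D * laplacian - kr * u[1:-1])  # boundary values stay fixed
    return x, u

def analytic_solution(x, t, D, kr):
    """Infinite-domain solution for the Gaussian initial pulse exp(-x^2)."""
    spread = 1.0 + 4.0 * D * t
    return np.exp(-kr * t) / np.sqrt(spread) * np.exp(-x**2 / spread)

# Arbitrary test parameters; t is kept short so the pulse stays away from the walls
D_test, kr_test, t_test = 1.0, 0.2, 0.5
x, u_num = ftcs_reaction_diffusion(D_test, kr_test, t_test)
u_exact = analytic_solution(x, t_test, D_test, kr_test)
max_error = np.max(np.abs(u_num - u_exact))
print(f"Max |numerical - analytical| = {max_error:.2e}")
```

The agreement degrades at longer times, when the pulse reaches the fixed boundaries and the infinite-domain assumption no longer holds.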

## Summary

1.  **Diffusion**: Molecules in solution move via random walks, described by Fick's laws and the Stokes-Einstein equation.
2.  **Smoluchowski Limit**: The maximum rate of reaction is determined by how fast reactants can diffuse together ($k_d \approx 10^{10}$ M$^{-1}$s$^{-1}$).
3.  **Rate Control**: Reactions can be diffusion-controlled (viscosity dependent) or activation-controlled (viscosity independent).
4.  **Material Balance**: Combines diffusion, convection, and reaction to describe concentration changes in space and time.