# 🧬 Complete Drug Discovery Pipeline Demo

A comprehensive demonstration of the modular `src` package for drug discovery workflows.

## 📋 Workflow Overview

1. **📦 Import Modules** - Load all components from `src`
2. **🏗️ Load Molecules** - Load drug molecules from CSV or examples
3. **🔮 Visualization** - 3D molecular viewers (py3Dmol, Plotly)
4. **✏️ Molecular Editor** - SMILES input with 2D/3D preview
5. **🔬 ADMET Analysis** - Property filtering & comparison
6. **⚛️ DFT Optimization** - Single molecule & batch processing
7. **📹 Trajectory Analysis** - Visualize optimization trajectories
8. **🧬 Wavefunction Visualization** - Molecular orbital analysis
9. **📊 Progress Monitoring** - Track optimization jobs
10. **💾 Export** - Save results to SDF format

---
**Every component used below is imported from `src`; nothing is reimplemented in the notebook.**

---
# 1️⃣ Import ALL Modules from `src`

In [1]:
# Standard libraries
import numpy as np
import pandas as pd
from pathlib import Path
import warnings
warnings.filterwarnings('ignore')

# Interactive widgets
import ipywidgets as widgets
from IPython.display import display, HTML, clear_output

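# ============================================================
# Reload the top-level src package so edits to src/__init__.py are
# picked up without restarting the kernel. importlib.reload does not
# recurse: submodules that are already imported keep their cached code.
# ============================================================
import sys
sys.path.insert(0, '.')

import importlib
import src
importlib.reload(src)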

from src import (
    # Core classes
    DrugMolecule, Config, ADMETPredictor,
    
    # Visualization
    MolecularViewer3D, PropertyPlotter,
    
    # Interactive widgets - Visualization
    InteractivePy3DmolViewer,         # py3Dmol 3D viewer
    TrajectoryViewer,                 # Trajectory visualization
    WavefunctionVisualizer,           # MO visualization
    
    # Interactive widgets - Editing
    MolecularEditor,                  # SMILES editor
    
    # Interactive widgets - ADMET Analysis
    InteractivePropertyFilter,        # Property filter with histograms
    MoleculeComparator,               # Side-by-side comparison
    
    # Interactive widgets - DFT Optimization
    DFTControlPanel,                  # Single molecule DFT
    BatchDFTControlPanel,             # Parallel batch DFT
    
    # Interactive widgets - Progress
    LiveProgressMonitor,              # Log file viewer
    
    # Export & utility functions
    export_molecules_to_sdf,
    load_xyz_trajectory,
    analyze_trajectory,
    plot_trajectory_analysis,
    
    # Utilities
    get_example_molecules, calculate_molecular_descriptors
)
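Because `importlib.reload(src)` re-executes only the package's `__init__.py`, edits made inside an already-imported submodule may not show up. The following is a minimal sketch of a deeper reload, not part of `src`: the helper name `reload_package` is hypothetical, and it assumes every submodule can be safely re-executed.

```python
import importlib
import sys
from types import ModuleType
from typing import List


def reload_package(package_name: str) -> List[str]:
    """Reload a package and its already-imported submodules; return the names reloaded."""
    prefix = package_name + "."
    names = [
        name
        for name, module in list(sys.modules.items())
        if (name == package_name or name.startswith(prefix))
        and isinstance(module, ModuleType)
    ]
    # Reload the deepest modules first so that a parent package re-runs its
    # `from .child import ...` statements against the refreshed child code.
    names.sort(key=lambda name: name.count("."), reverse=True)
    for name in names:
        importlib.reload(sys.modules[name])
    return names
```

In the notebook this would be called as `reload_package("src")` before the `from src import (...)` statement. Objects created before the reload still refer to the old class definitions, so widgets should be re-created afterwards.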

---
# 2️⃣ Load Drug Molecules

In [2]:
# Load FDA approved structures or use examples
data_file = Path('./FDA_Approved_structures.csv')

if data_file.exists():
    df = pd.read_csv(data_file)
    print(f"📊 Loaded {len(df)} structures from FDA Approved database")
    
    molecules = []
    max_molecules = 30  # only the first 30 rows are converted
    
    # Detect columns by case-insensitive substring match; None if no column matches
    smiles_col = next((c for c in df.columns if 'smiles' in c.lower() or 'structure' in c.lower()), None)
    name_col = next((c for c in df.columns if 'name' in c.lower() or 'drug' in c.lower()), None)
    cas_col = next((c for c in df.columns if 'cas' in c.lower()), None)
    
    if smiles_col:
        for idx, row in df.head(max_molecules).iterrows():
            try:
                mol = DrugMolecule(
                    name=str(row.get(name_col, f"Drug_{idx}")),
                    cas=str(row.get(cas_col, f"CAS-{idx}")),
                    smiles=str(row[smiles_col])
                )
                mol.calculate_descriptors()
                molecules.append(mol)
            except Exception:
                # Skip rows that fail to build (e.g. an unparsable SMILES)
                # without swallowing KeyboardInterrupt or SystemExit
                continue
        
        print(f"✅ Created {len(molecules)} DrugMolecule objects")
else:
    molecules = get_example_molecules()
    for mol in molecules:
        mol.calculate_descriptors()
    print(f"✅ Loaded {len(molecules)} example molecules")


📊 Loaded 2584 structures from FDA Approved database
✅ Created 28 DrugMolecule objects


[17:07:18] UFFTYPER: Unrecognized charge state for atom: 11
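The loading cell finds the SMILES, name, and CAS columns by substring match on the header. A minimal sketch of that heuristic as a standalone function may make its behaviour easier to check; `find_column` and the header row below are hypothetical, since the real CSV's column names are not shown in the notebook.

```python
from typing import Iterable, Optional


def find_column(columns: Iterable[str], keywords: Iterable[str]) -> Optional[str]:
    """Return the first column whose lowercased name contains any keyword, else None."""
    lowered = tuple(k.lower() for k in keywords)
    return next(
        (col for col in columns if any(k in col.lower() for k in lowered)),
        None,
    )


# Hypothetical header row, for illustration only.
header = ["Drug Name", "CAS Number", "SMILES"]

smiles_col = find_column(header, ("smiles", "structure"))
name_col = find_column(header, ("name", "drug"))
cas_col = find_column(header, ("cas",))
missing_col = find_column(header, ("inchi",))

print(smiles_col, name_col, cas_col, missing_col, sep=" | ")
# prints: SMILES | Drug Name | CAS Number | None
```

The match returns the first hit in column order, so a header such as "Drug Class" placed before "Drug Name" would be selected as the name column. Inspecting the detected columns before building molecules is a cheap safeguard.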


---
# 3️⃣ 3D Molecular Visualization

Interactive py3Dmol viewer with multiple style options.

In [3]:
# InteractivePy3DmolViewer - py3Dmol 3D viewer with style controls
py3dmol_viewer = InteractivePy3DmolViewer()
py3dmol_viewer.load_molecules(molecules)
display(py3dmol_viewer.display())

VBox(children=(HTML(value='<h4>🔮 Interactive 3D Molecule Viewer (py3Dmol)</h4>'), HBox(children=(Dropdown(desc…

---
# 4️⃣ Molecular Editor

Enter and edit molecules as SMILES strings with a real-time 2D/3D preview.

In [4]:
# MolecularEditor - SMILES input with 2D/3D preview
mol_editor = MolecularEditor()
#mol_editor.load_molecules(molecules, drug_molecule_class=DrugMolecule)
mol_editor.load_molecules(molecules)
display(mol_editor.display())

VBox(children=(HTML(value="\n        <div style='background: linear-gradient(135deg, #ec4899 0%, #f97316 100%)…

---
# 5️⃣ ADMET Property Analysis

## 5a. Interactive Property Filter

In [5]:
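# InteractivePropertyFilter - Filter with histograms
property_filter = InteractivePropertyFilter(molecules)
display(property_filter.display())

# Evaluated when this cell runs, i.e. before any interaction with the widget;
# re-run this line after adjusting the filters if the selection should reflect them.
filtered_molecules = property_filter.get_filtered_molecules()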

VBox(children=(HTML(value="\n        <div style='background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%)…
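The widget hides how the property filtering is done, so here is a minimal sketch of range-based filtering on plain dictionaries. It is not the `InteractivePropertyFilter` implementation. The descriptor keys and the two records are made up for illustration, and the thresholds are the upper bounds of Lipinski's rule of five, chosen only as a familiar example.

```python
from typing import Dict, List, Tuple

# Example thresholds (Lipinski's rule of five). The keys are hypothetical
# and need not match the descriptor names that `src` computes.
RULE_OF_FIVE: Dict[str, Tuple[float, float]] = {
    "mol_weight": (0.0, 500.0),
    "logp": (float("-inf"), 5.0),
    "h_donors": (0, 5),
    "h_acceptors": (0, 10),
}


def passes(descriptors: Dict[str, float], ranges: Dict[str, Tuple[float, float]]) -> bool:
    """True if every listed descriptor lies inside its inclusive [low, high] range."""
    return all(low <= descriptors[key] <= high for key, (low, high) in ranges.items())


def filter_by_ranges(records: List[dict], ranges: Dict[str, Tuple[float, float]]) -> List[dict]:
    """Keep only the records whose descriptors satisfy all ranges."""
    return [record for record in records if passes(record["descriptors"], ranges)]


# Placeholder records with made-up descriptor values, for illustration only.
records = [
    {"name": "mol_A",
     "descriptors": {"mol_weight": 180.2, "logp": 1.2, "h_donors": 1, "h_acceptors": 4}},
    {"name": "mol_B",
     "descriptors": {"mol_weight": 720.9, "logp": 6.3, "h_donors": 2, "h_acceptors": 12}},
]

kept = filter_by_ranges(records, RULE_OF_FIVE)
print([record["name"] for record in kept])
# prints: ['mol_A']
```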

## 5b. Molecule Comparison Tool

In [6]:
# MoleculeComparator - Side-by-side comparison with radar chart
comparator = MoleculeComparator(filtered_molecules)
display(comparator.display())

VBox(children=(HTML(value="\n        <div style='background: linear-gradient(135deg, #11998e 0%, #38ef7d 100%)…

---
# 6️⃣ DFT Geometry Optimization

## 6a. Single Molecule DFT

In [None]:
# DFTControlPanel - Single molecule DFT optimization
dft_panel = DFTControlPanel()
dft_panel.load_molecules(molecules)
display(dft_panel.display())

VBox(children=(HTML(value="\n        <div style='background: linear-gradient(135deg, #667eea 0%, #764ba2 100%)…

## 6b. Parallel Batch DFT Optimization

In [8]:
# BatchDFTControlPanel - Parallel multi-molecule DFT
batch_dft_panel = BatchDFTControlPanel()
batch_dft_panel.load_molecules(molecules)
display(batch_dft_panel.display())

VBox(children=(HTML(value="\n        <div style='background: linear-gradient(135deg, #667eea 0%, #764ba2 100%)…
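How `BatchDFTControlPanel` schedules its jobs is not shown in the notebook. As a rough illustration of the fan-out pattern behind parallel batch processing, the sketch below submits one job per molecule with `concurrent.futures` and records failures without stopping the batch. `run_batch` and `fake_optimization` are hypothetical stand-ins. Threads are used to keep the example simple, whereas real CPU-bound DFT runs would more likely use separate processes or external jobs.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable


def run_batch(names: Iterable[str], job: Callable[[str], object],
              max_workers: int = 4) -> Dict[str, object]:
    """Run `job` once per name in parallel; map each name to its result or its exception."""
    results: Dict[str, object] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(job, name): name for name in names}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as exc:
                # One failed molecule should not abort the rest of the batch.
                results[name] = exc
    return results


def fake_optimization(name: str) -> str:
    """Stand-in for a DFT job; a real job would call the quantum-chemistry code here."""
    if not name:
        raise ValueError("empty molecule name")
    return f"{name}_optimized"


results = run_batch(["Acetic acid", "Acetohydroxamic acid", ""],
                    fake_optimization, max_workers=2)
for name, outcome in sorted(results.items()):
    print(repr(name), "->", repr(outcome))
```

Storing the exception object next to the successful results keeps a per-molecule record of what went wrong, which is the kind of information a batch panel would need to report.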

---
# 7️⃣ Trajectory Visualization

Load and visualize geometry optimization trajectories with frame-by-frame playback.

In [9]:
# TrajectoryViewer - Load and visualize optimization trajectories
traj_viewer = TrajectoryViewer(base_dir='./optimized_molecules')
display(traj_viewer.display())

VBox(children=(HTML(value="\n        <div style='background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%)…
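An optimization trajectory is commonly stored as a multi-frame XYZ file: each frame starts with an atom count, followed by a comment line and one `symbol x y z` line per atom. The sketch below parses such text and measures how far atoms move between two frames. It assumes this plain XYZ layout and is not the `load_xyz_trajectory` implementation from `src`. The function names and the H2 coordinates are made up for illustration.

```python
from typing import List, Tuple

Atom = Tuple[str, float, float, float]


def parse_xyz_frames(text: str) -> List[List[Atom]]:
    """Split multi-frame XYZ text into frames of (symbol, x, y, z) tuples."""
    lines = text.splitlines()
    frames: List[List[Atom]] = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # tolerate blank lines between frames
            continue
        n_atoms = int(lines[i].strip())
        # lines[i + 1] is the comment line and is skipped.
        atom_lines = lines[i + 2 : i + 2 + n_atoms]
        if len(atom_lines) != n_atoms:
            raise ValueError("truncated frame in XYZ text")
        frame: List[Atom] = []
        for line in atom_lines:
            symbol, x, y, z = line.split()[:4]
            frame.append((symbol, float(x), float(y), float(z)))
        frames.append(frame)
        i += 2 + n_atoms
    return frames


def max_displacement(frame_a: List[Atom], frame_b: List[Atom]) -> float:
    """Largest distance any atom moves between two frames (same atom order assumed)."""
    return max(
        ((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2) ** 0.5
        for (_, xa, ya, za), (_, xb, yb, zb) in zip(frame_a, frame_b)
    )


# Two-frame H2 example with made-up coordinates (angstrom).
trajectory_text = """2
frame 0
H 0.000 0.000 0.000
H 0.000 0.000 0.800
2
frame 1
H 0.000 0.000 0.000
H 0.000 0.000 0.740
"""

frames = parse_xyz_frames(trajectory_text)
print(len(frames), round(max_displacement(frames[0], frames[1]), 3))
# prints: 2 0.06
```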

---
# 8️⃣ Wavefunction & Molecular Orbital Visualization

Load and visualize molecular orbitals (HOMO/LUMO) from saved DFT wavefunctions.

In [12]:
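# WavefunctionVisualizer was already imported in section 1; repeated so this cell runs on its own
from src.interactive import WavefunctionVisualizer
from src.analysis import OrbitalVisualizer

# Create the orbital visualizer and attach it to the wavefunction widget,
# which reads saved wavefunctions from ./optimized_molecules
mo_viz = OrbitalVisualizer()
wfn_viewer = WavefunctionVisualizer(base_dir='./optimized_molecules')
wfn_viewer.set_mo_visualizer(mo_viz)
display(wfn_viewer.display())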

VBox(children=(HTML(value="\n        <div style='background: linear-gradient(135deg, #667eea 0%, #764ba2 100%)…
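For readers unfamiliar with the terms, the HOMO is the highest-energy occupied orbital and the LUMO the lowest-energy unoccupied one. A minimal sketch of picking them out of a list of orbital energies and occupations follows. It does not use `src`, the function name `frontier_orbitals` is hypothetical, and the energies are made-up values for a closed-shell example.

```python
from typing import Sequence, Tuple

HARTREE_TO_EV = 27.211386  # approximate conversion factor


def frontier_orbitals(energies_ha: Sequence[float],
                      occupations: Sequence[float]) -> Tuple[int, int, float]:
    """Return (HOMO index, LUMO index, HOMO-LUMO gap in eV)."""
    occupied = [i for i, occ in enumerate(occupations) if occ > 0]
    virtual = [i for i, occ in enumerate(occupations) if occ == 0]
    if not occupied or not virtual:
        raise ValueError("need at least one occupied and one unoccupied orbital")
    homo = max(occupied, key=lambda i: energies_ha[i])
    lumo = min(virtual, key=lambda i: energies_ha[i])
    gap_ev = (energies_ha[lumo] - energies_ha[homo]) * HARTREE_TO_EV
    return homo, lumo, gap_ev


# Made-up orbital energies (hartree) and occupations, for illustration only.
energies = [-10.2, -0.9, -0.4, 0.1, 0.6]
occupations = [2, 2, 2, 0, 0]

homo, lumo, gap = frontier_orbitals(energies, occupations)
print(homo, lumo, round(gap, 4))
# prints: 2 3 13.6057
```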

---
# 9️⃣ Data Aggregation & Analysis

Collect the DFT results stored under `./optimized_molecules` into a single DataFrame and save it as a CSV file.

In [13]:
from src.analysis.dft_data_collector import DFTDataCollector

# 1. Collect Data
collector = DFTDataCollector(output_dir='./optimized_molecules')
df_results = collector.collect_data()

# 2. Display Preview
if not df_results.empty:
    print(f"✅ Collected results for {len(df_results)} molecules!")
    display(df_results.head())
    
    # 3. Save to CSV
    csv_path = "final_dft_analysis_results.csv"
    df_results.to_csv(csv_path, index=False)
    print(f"📦 Saved full dataset to '{csv_path}'")

Found 2 wavefunction files in optimized_molecules
✅ Collected results for 2 molecules!


|   | Molecule_Name | SMILES | Total_Energy_Ha | HOMO_eV | LUMO_eV | Gap_eV | Dipole_Moment_Debye | Polarizability_au3 | Volume_A3 | PSA_A2 | ... | Max_Fukui_f_minus | Max_Spin_Density | ESP_Min_au | ESP_Max_au | ESP_Variance | File_Path | Nucleophilicity_Index | Max_Fukui_f_radical | ESP_Avg_Pos_au | ESP_Avg_Neg_au |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Acetohydroxamic acid_optimized | CC(=O)NO | -280.33024 | -3.329935 | 2.494031 | 5.823967 | 2.538557 | 152.817609 | 73.421511 | 84.95 | ... | 0.250804 | 0.0 | -186.187402 | 11.183243 | 5910.245528 | optimized_molecules/Acetohydroxamic acid_optim... | 33.340036 | 0.176685 | 9.354826 | -124.86697 |
| 1 | Acetic acid_optimized | CC(O)=O | -225.801492 | -4.421893 | 3.207377 | 7.62927 | 0.850428 | 95.157269 | 59.719455 | 49.69 | ... | 0.347085 | 0.0 | -187.317596 | 8.728149 | 6370.761393 | optimized_molecules/Acetic acid_optimized/wave... | 20.688857 | 0.296148 | 7.885066 | -128.086088 |


📦 Saved full dataset to 'final_dft_analysis_results.csv'
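A quick sanity check on a collected dataset is to confirm that the derived columns agree with the ones they are computed from. The sketch below checks `Gap_eV` against `LUMO_eV - HOMO_eV`, using the two preview rows copied from the output above. The helper name `gap_mismatches` and the tolerance are assumptions, not part of `DFTDataCollector`.

```python
from typing import List, Tuple

# (Molecule_Name, HOMO_eV, LUMO_eV, Gap_eV) copied from the preview rows above.
rows: List[Tuple[str, float, float, float]] = [
    ("Acetohydroxamic acid_optimized", -3.329935, 2.494031, 5.823967),
    ("Acetic acid_optimized", -4.421893, 3.207377, 7.62927),
]


def gap_mismatches(rows: List[Tuple[str, float, float, float]],
                   tol: float = 1e-5) -> List[str]:
    """Names of molecules whose reported gap differs from LUMO - HOMO by more than tol."""
    return [
        name
        for name, homo_ev, lumo_ev, gap_ev in rows
        if abs((lumo_ev - homo_ev) - gap_ev) > tol
    ]


print(gap_mismatches(rows))
# prints: []
```

The tolerance absorbs the rounding of the displayed values. On the full DataFrame the same comparison could be written as a vectorized column expression.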


---
# 📊 Summary

## Complete Modular Architecture

| # | Feature | Class | Lines of code |
|---|---------|-------|-------|
| 1 | **3D Viewer** | `InteractivePy3DmolViewer` | 3 |
| 2 | **Molecular Editor** | `MolecularEditor` | 3 |
| 3 | **Property Filter** | `InteractivePropertyFilter` | 2 |
| 4 | **Molecule Comparator** | `MoleculeComparator` | 2 |
| 5 | **Single DFT** | `DFTControlPanel` | 3 |
| 6 | **Batch DFT** | `BatchDFTControlPanel` | 3 |
| 7 | **Trajectory Viewer** | `TrajectoryViewer` | 2 |
| 8 | **Wavefunction Viewer** | `WavefunctionVisualizer` | 2 |
| 9 | **Progress Monitor** | `LiveProgressMonitor` | 2 |

