# Notebook 10: Complete Research Workflow

## From Database to Publication-Ready Results

This notebook integrates everything from the workshop into a **complete, reproducible workflow** for computational materials research.

---

In [None]:
import numpy as np
from typing import Dict, List, Optional, Tuple
from dataclasses import dataclass
from datetime import datetime

# Import workshop utilities (when available)
# from qe_workshop_utils import (
#     check_charge_neutrality, generate_scf_input, parse_scf_output,
#     check_born_stability_cubic, fit_birch_murnaghan, generate_kpath_card
# )

RY_TO_EV = 13.605693122994
BOHR_TO_ANGSTROM = 0.529177210903

---

## The Complete DFT Workflow

```
╔═══════════════════════════════════════════════════════════════════════════════╗
║                        RIGOROUS DFT RESEARCH WORKFLOW                         ║
║                     "Garbage In → Garbage Out" Prevention                     ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║                                                                               ║
║  ┌─────────────────────────────────────────────────────────────────────────┐  ║
║  │ PHASE 1: STRUCTURE DISCOVERY & VALIDATION                               │  ║
║  ├─────────────────────────────────────────────────────────────────────────┤  ║
║  │                                                                         │  ║
║  │  1.1 Database Search ──────────────────────────────────────────────────│  ║
║  │      • Materials Project, OQMD, AFLOW, JARVIS, NOMAD                   │  ║
║  │      • Identify candidate structures                                    │  ║
║  │      • Check existing calculations (don't reinvent the wheel!)         │  ║
║  │                               ↓                                         │  ║
║  │  1.2 Structure Validation ─────────────────────────────────────────────│  ║
║  │      □ Charge neutrality check                                         │  ║
║  │      □ Bond length reasonableness (Shannon radii)                      │  ║
║  │      □ Space group consistency                                         │  ║
║  │      □ No atomic overlaps                                              │  ║
║  │                               ↓                                         │  ║
║  │  1.3 Structure Prediction (if novel material) ─────────────────────────│  ║
║  │      • CALYPSO, USPEX, AIRSS, XtalOpt                                  │  ║
║  │      • Generate multiple candidates                                     │  ║
║  │                                                                         │  ║
║  └─────────────────────────────────────────────────────────────────────────┘  ║
║                                     ↓                                         ║
║  ┌─────────────────────────────────────────────────────────────────────────┐  ║
║  │ PHASE 2: DFT SETUP & CONVERGENCE                                        │  ║
║  ├─────────────────────────────────────────────────────────────────────────┤  ║
║  │                                                                         │  ║
║  │  2.1 Choose Computational Parameters ──────────────────────────────────│  ║
║  │      • Functional: LDA / GGA-PBE / GGA+U / Hybrid                      │  ║
║  │      • Pseudopotentials: NC / US / PAW (use SSSP!)                     │  ║
║  │                               ↓                                         │  ║
║  │  2.2 Convergence Testing ──────────────────────────────────────────────│  ║
║  │      □ ecutwfc: energy converged to 1 meV/atom                         │  ║
║  │      □ k-points: energy converged to 1 meV/atom                        │  ║
║  │      □ smearing (metals): compare MV, Gaussian, FD                     │  ║
║  │                                                                         │  ║
║  └─────────────────────────────────────────────────────────────────────────┘  ║
║                                     ↓                                         ║
║  ┌─────────────────────────────────────────────────────────────────────────┐  ║
║  │ PHASE 3: STRUCTURE OPTIMIZATION                                         │  ║
║  ├─────────────────────────────────────────────────────────────────────────┤  ║
║  │                                                                         │  ║
║  │  3.1 Full Relaxation ──────────────────────────────────────────────────│  ║
║  │      • vc-relax: cell + positions (if needed)                          │  ║
║  │      • relax: positions only                                           │  ║
║  │      • Target: forces < 10⁻⁴ Ry/Bohr, stress < 0.5 kbar               │  ║
║  │                               ↓                                         │  ║
║  │  3.2 Equation of State (optional) ─────────────────────────────────────│  ║
║  │      • 5-7 volumes around equilibrium                                  │  ║
║  │      • Birch-Murnaghan fit → V₀, E₀, B₀, B₀'                          │  ║
║  │                               ↓                                         │  ║
║  │  3.3 Magnetic Ground State (if magnetic) ──────────────────────────────│  ║
║  │      • Compare FM, AFM, NM configurations                              │  ║
║  │      • Find lowest energy state                                         │  ║
║  │                                                                         │  ║
║  └─────────────────────────────────────────────────────────────────────────┘  ║
║                                     ↓                                         ║
║  ┌─────────────────────────────────────────────────────────────────────────┐  ║
║  │ PHASE 4: STABILITY VERIFICATION (CRITICAL!)                             │  ║
║  ├─────────────────────────────────────────────────────────────────────────┤  ║
║  │                                                                         │  ║
║  │  ⚠️  DO NOT SKIP THIS PHASE! ⚠️                                         │  ║
║  │                                                                         │  ║
║  │  4.1 Thermodynamic Stability ──────────────────────────────────────────│  ║
║  │      □ Formation energy ΔHf < 0                                        │  ║
║  │      □ Convex hull distance Ehull < 25 meV/atom                        │  ║
║  │                               ↓                                         │  ║
║  │  4.2 Dynamic Stability ────────────────────────────────────────────────│  ║
║  │      □ Phonon calculation (ph.x)                                       │  ║
║  │      □ NO imaginary frequencies                                        │  ║
║  │                               ↓                                         │  ║
║  │  4.3 Mechanical Stability ─────────────────────────────────────────────│  ║
║  │      □ Elastic constants calculated                                    │  ║
║  │      □ Born stability criteria satisfied                               │  ║
║  │                                                                         │  ║
║  │  ═══════════════════════════════════════════════════════════════════   │  ║
║  │  ALL PASSED? → Proceed     ANY FAILED? → STOP, structure invalid       │  ║
║  └─────────────────────────────────────────────────────────────────────────┘  ║
║                                     ↓                                         ║
║  ┌─────────────────────────────────────────────────────────────────────────┐  ║
║  │ PHASE 5: PROPERTY CALCULATIONS                                          │  ║
║  ├─────────────────────────────────────────────────────────────────────────┤  ║
║  │                                                                         │  ║
║  │  5.1 Electronic Properties ────────────────────────────────────────────│  ║
║  │      • Band structure (high-symmetry path)                             │  ║
║  │      • Density of states (DOS, PDOS)                                   │  ║
║  │      • Effective masses (for semiconductors)                           │  ║
║  │                               ↓                                         │  ║
║  │  5.2 Optical Properties ───────────────────────────────────────────────│  ║
║  │      • Dielectric function ε(ω)                                        │  ║
║  │      • Absorption spectrum                                              │  ║
║  │      • Refractive index                                                 │  ║
║  │                               ↓                                         │  ║
║  │  5.3 Thermal Properties ───────────────────────────────────────────────│  ║
║  │      • Phonon DOS → specific heat                                      │  ║
║  │      • Thermal conductivity (advanced)                                 │  ║
║  │                               ↓                                         │  ║
║  │  5.4 Transport Properties ─────────────────────────────────────────────│  ║
║  │      • Seebeck coefficient                                              │  ║
║  │      • Electrical conductivity                                         │  ║
║  │                                                                         │  ║
║  └─────────────────────────────────────────────────────────────────────────┘  ║
║                                     ↓                                         ║
║  ┌─────────────────────────────────────────────────────────────────────────┐  ║
║  │ PHASE 6: DOCUMENTATION & REPRODUCIBILITY                                │  ║
║  ├─────────────────────────────────────────────────────────────────────────┤  ║
║  │                                                                         │  ║
║  │  □ Record all computational parameters                                  │  ║
║  │  □ Archive input/output files                                          │  ║
║  │  □ Document software versions                                          │  ║
║  │  □ Create workflow scripts for reproducibility                         │  ║
║  │                                                                         │  ║
║  └─────────────────────────────────────────────────────────────────────────┘  ║
║                                                                               ║
╚═══════════════════════════════════════════════════════════════════════════════╝
```
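
The diagram asks for `ecutwfc` and the k-point mesh to be converged to 1 meV/atom. One way to apply that criterion to a series of test runs is sketched below. The function name `first_converged` and the sample energies are hypothetical, and the most accurate run is simply taken as the reference, which assumes the series was pushed far enough to be trustworthy.

```python
RY_TO_MEV = 13.605693122994 * 1000.0  # one Rydberg in meV (same constant as RY_TO_EV)


def first_converged(params, energies_ry, n_atoms, tol_mev_per_atom=1.0):
    """Return the smallest parameter value that is converged.

    A value counts as converged when its energy, and the energy of every
    larger value, deviates from the most accurate run by no more than the
    tolerance (in meV/atom). Returns None if no run besides the reference
    qualifies. Requires at least one run.
    """
    # Sort by parameter so that the last entry is the reference run
    runs = sorted(zip(params, energies_ry))
    e_ref = runs[-1][1]

    converged = None
    # Walk down from the second-most accurate run and stop at the first failure
    for param, energy in reversed(runs[:-1]):
        deviation = abs(energy - e_ref) * RY_TO_MEV / n_atoms
        if deviation > tol_mev_per_atom:
            break
        converged = param
    return converged


# Hypothetical cutoff scan for a 5-atom cell (total energies in Ry)
ecuts = [30, 40, 50, 60, 70]
energies = [-100.0100, -100.0020, -100.0003, -100.0001, -100.0000]
best_ecut = first_converged(ecuts, energies, n_atoms=5)
print(f"Smallest converged ecutwfc: {best_ecut} Ry")
```

Walking down from the reference, rather than up from the smallest value, avoids accepting a cutoff that happens to agree with the reference while a larger one does not. The same function can be reused for k-point meshes by passing, for example, the number of k-points per direction as `params`.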

---

## Workflow Tracking Structure

In [None]:
@dataclass
class CalculationRecord:
    """Record of a single calculation."""
    calc_type: str          # 'scf', 'relax', 'bands', 'phonon', etc.
    input_file: str
    output_file: str
    converged: bool
    energy_ry: Optional[float] = None
    notes: str = ""
    timestamp: str = ""

@dataclass
class MaterialWorkflow:
    """Complete workflow tracking for a material."""
    # Identification
    formula: str
    space_group: str
    source: str  # 'Materials Project', 'OQMD', 'predicted', etc.
    
    # Validation status
    charge_neutral: bool = False
    structure_validated: bool = False
    
    # Computational setup
    functional: str = "PBE"
    pseudopotentials: str = "SSSP-efficiency"
    ecutwfc: float = 0.0
    ecutrho: float = 0.0
    kpoints: Tuple[int, int, int] = (1, 1, 1)
    
    # Convergence verified
    ecut_converged: bool = False
    kpts_converged: bool = False
    
    # Structure optimization
    optimized_volume: Optional[float] = None
    bulk_modulus_gpa: Optional[float] = None
    magnetic_config: str = "NM"
    
    # Stability tests
    thermo_stable: Optional[bool] = None
    formation_energy_ev: Optional[float] = None
    ehull_mev: Optional[float] = None
    
    dynamic_stable: Optional[bool] = None
    min_phonon_freq: Optional[float] = None
    
    mechanical_stable: Optional[bool] = None
    
    # Properties
    band_gap_ev: Optional[float] = None
    is_metal: bool = False
    
    # Calculations log
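    calculations: Optional[List[CalculationRecord]] = None
    
    def __post_init__(self):
        # Give each instance its own list; a shared mutable default would leak
        # records between materials.
        if self.calculations is None:
            self.calculations = []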
    
    def is_fully_stable(self) -> bool:
        """Check if all stability tests passed."""
        return (self.thermo_stable is True and 
                self.dynamic_stable is True and 
                self.mechanical_stable is True)
    
    def can_calculate_properties(self) -> bool:
        """Check if ready for property calculations."""
        return (self.structure_validated and 
                self.ecut_converged and 
                self.kpts_converged and
                self.is_fully_stable())
    
    def summary(self) -> str:
        """Generate workflow summary."""
        lines = [
            f"Material: {self.formula}",
            f"Space Group: {self.space_group}",
            f"Source: {self.source}",
            "",
            "=== Validation ===",
            f"  Charge neutral: {'PASS' if self.charge_neutral else 'FAIL'}",
            f"  Structure validated: {'PASS' if self.structure_validated else 'PENDING'}",
            "",
            "=== Convergence ===",
            f"  ecutwfc: {self.ecutwfc} Ry {'(converged)' if self.ecut_converged else '(not verified)'}",
            f"  k-points: {self.kpoints} {'(converged)' if self.kpts_converged else '(not verified)'}",
            "",
            "=== Stability ===",
        ]
        
        if self.thermo_stable is not None:
            lines.append(f"  Thermodynamic: {'STABLE' if self.thermo_stable else 'UNSTABLE'}")
            if self.formation_energy_ev is not None:
                lines.append(f"    ΔHf = {self.formation_energy_ev:.3f} eV/atom")
        else:
            lines.append("  Thermodynamic: NOT TESTED")
            
        if self.dynamic_stable is not None:
            lines.append(f"  Dynamic: {'STABLE' if self.dynamic_stable else 'UNSTABLE'}")
        else:
            lines.append("  Dynamic: NOT TESTED")
            
        if self.mechanical_stable is not None:
            lines.append(f"  Mechanical: {'STABLE' if self.mechanical_stable else 'UNSTABLE'}")
        else:
            lines.append("  Mechanical: NOT TESTED")
        
        lines.extend([
            "",
            "=== Properties ===",
        ])
        
        if self.can_calculate_properties():
            if self.band_gap_ev is not None:
                lines.append(f"  Band gap: {self.band_gap_ev:.3f} eV")
            elif self.is_metal:
                lines.append("  Band gap: METAL")
            else:
                lines.append("  Band gap: NOT CALCULATED")
        else:
            lines.append("  NOT READY: complete validation, convergence, and stability checks first!")
        
        return "\n".join(lines)

# Section banner (a worked example follows in the next cell)
print("Material Workflow Tracking")
print("=" * 60)
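
The `mechanical_stable` flag has to be derived from computed elastic constants. For a cubic crystal the Born criteria reduce to three inequalities, C11 − C12 > 0, C11 + 2·C12 > 0 and C44 > 0, which the sketch below evaluates. The function name `born_stable_cubic` and the sample elastic constants are hypothetical and only illustrate the logic; other crystal systems need different conditions.

```python
from typing import Dict, Tuple


def born_stable_cubic(c11: float, c12: float, c44: float) -> Tuple[bool, Dict[str, bool]]:
    """Evaluate the Born stability criteria for a cubic crystal.

    All constants must share one unit (e.g. GPa). Returns the overall
    verdict plus the result of each individual inequality, so a failing
    structure can be diagnosed.
    """
    checks = {
        "C11 - C12 > 0": (c11 - c12) > 0,
        "C11 + 2*C12 > 0": (c11 + 2.0 * c12) > 0,
        "C44 > 0": c44 > 0,
    }
    return all(checks.values()), checks


# Illustrative values in GPa (not computed results)
stable, details = born_stable_cubic(c11=300.0, c12=100.0, c44=120.0)
unstable, why = born_stable_cubic(c11=100.0, c12=150.0, c44=50.0)

print("Case 1 stable:", stable)
print("Case 2 stable:", unstable, "| failed:", [k for k, ok in why.items() if not ok])
```

The boolean returned first can be assigned directly to `mechanical_stable`; keeping the per-criterion dictionary makes it easier to record why a structure was rejected.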

In [None]:
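# Example: create a workflow record for BaTiO3.
# The numeric values and stability flags below are illustrative placeholders, not
# computed results; for instance, 0 K DFT phonons of cubic BaTiO3 typically show a
# ferroelectric soft mode, so set each flag from your own calculations.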
bto_workflow = MaterialWorkflow(
    formula="BaTiO3",
    space_group="Pm-3m (221)",
    source="Materials Project (mp-2998)",
    charge_neutral=True,
    structure_validated=True,
    functional="PBE",
    pseudopotentials="SSSP-efficiency",
    ecutwfc=60.0,
    ecutrho=480.0,
    kpoints=(8, 8, 8),
    ecut_converged=True,
    kpts_converged=True,
    thermo_stable=True,
    formation_energy_ev=-3.21,
    ehull_mev=0.0,
    dynamic_stable=True,
    min_phonon_freq=0.5,
    mechanical_stable=True,
    band_gap_ev=2.3,
)

print(bto_workflow.summary())
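
The `min_phonon_freq` and `dynamic_stable` fields above are entered by hand. A rough sketch of how they could be extracted from phonon output follows. It assumes the frequency lines have the form `freq (    1) =  -1.234567 [THz] = ...`, with imaginary modes reported as negative numbers, as in typical `ph.x` output. The helper names and the 0.05 THz noise tolerance are hypothetical choices, not workshop utilities.

```python
import re
from typing import Optional

# Captures the THz value from lines such as
#   "     freq (    1) =      -1.234567 [THz] =     -41.180000 [cm-1]"
FREQ_LINE = re.compile(r"freq\s*\(\s*\d+\)\s*=\s*(-?\d+\.\d+)\s*\[THz\]")


def min_frequency_thz(text: str) -> Optional[float]:
    """Return the lowest frequency found in the text, or None if none is found."""
    freqs = [float(m.group(1)) for m in FREQ_LINE.finditer(text)]
    return min(freqs) if freqs else None


def is_dynamically_stable(text: str, tol_thz: float = 0.05) -> Optional[bool]:
    """True if no frequency lies below -tol_thz; None if nothing was parsed.

    The tolerance allows for slightly negative acoustic modes near Gamma,
    which are usually numerical noise (assumed threshold, adjust as needed).
    """
    fmin = min_frequency_thz(text)
    if fmin is None:
        return None
    return fmin > -tol_thz


# Two made-up output fragments for illustration
noisy_but_stable = """
     freq (    1) =      -0.012000 [THz] =      -0.400000 [cm-1]
     freq (    2) =       5.250000 [THz] =     175.120000 [cm-1]
"""
soft_mode = """
     freq (    1) =      -3.500000 [THz] =    -116.750000 [cm-1]
     freq (    2) =       5.250000 [THz] =     175.120000 [cm-1]
"""

print(min_frequency_thz(noisy_but_stable), is_dynamically_stable(noisy_but_stable))
print(min_frequency_thz(soft_mode), is_dynamically_stable(soft_mode))
```

Returning `None` when no frequencies are parsed mirrors the `Optional[bool]` convention of the workflow class, where `None` means "not tested" rather than "unstable".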

---

## Master Workflow Script

In [None]:
def create_workflow_script(material: MaterialWorkflow, 
                          workdir: str = "./") -> str:
    """
    Generate a complete workflow script for a material.
    """
    prefix = material.formula.lower().replace('(', '').replace(')', '')
    
    script = f"""#!/bin/bash
#===============================================================================
# DFT Workflow Script: {material.formula}
# Generated: {datetime.now().strftime('%Y-%m-%d %H:%M')}
#===============================================================================

PREFIX="{prefix}"
WORKDIR="{workdir}"
PSEUDO_DIR="./pseudo"
OUTDIR="./tmp"

# Run from the working directory and ensure the scratch directory exists
cd "$WORKDIR" || exit 1
mkdir -p "$OUTDIR"

echo "========================================"
echo "Starting workflow for {material.formula}"
echo "========================================"

#-------------------------------------------------------------------------------
# PHASE 1: CONVERGENCE (should be done already)
#-------------------------------------------------------------------------------
# Converged parameters:
# ecutwfc = {material.ecutwfc} Ry
# k-points = {material.kpoints}

#-------------------------------------------------------------------------------
# PHASE 2: STRUCTURE OPTIMIZATION
#-------------------------------------------------------------------------------
echo "Step 1: Variable-cell relaxation..."
pw.x < ${{PREFIX}}_vcrelax.in > ${{PREFIX}}_vcrelax.out

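# Check convergence. "JOB DONE" is also printed when the step limit is reached,
# so look for the BFGS convergence message instead (assumes BFGS relaxation).
if grep -q "bfgs converged" ${{PREFIX}}_vcrelax.out; then
    echo "  vc-relax converged successfully"
else
    echo "  ERROR: vc-relax did not converge!"
    exit 1
fi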

#-------------------------------------------------------------------------------
# PHASE 3: HIGH-PRECISION SCF
#-------------------------------------------------------------------------------
echo "Step 2: High-precision SCF..."
pw.x < ${{PREFIX}}_scf.in > ${{PREFIX}}_scf.out

if grep -q "convergence has been achieved" ${{PREFIX}}_scf.out; then
    echo "  SCF converged"
else
    echo "  ERROR: SCF did not converge!"
    exit 1
fi

#-------------------------------------------------------------------------------
# PHASE 4: STABILITY TESTS
#-------------------------------------------------------------------------------
echo "Step 3: Phonon calculation (dynamic stability)..."
ph.x < ${{PREFIX}}_ph.in > ${{PREFIX}}_ph.out

# Check for imaginary frequencies (ph.x reports them as negative values)
if grep -qE 'freq.*= +-[0-9]' ${{PREFIX}}_ph.out; then
    echo "  WARNING: Negative (imaginary) frequencies detected!"
    echo "  Material may be dynamically UNSTABLE (tiny values at Gamma can be numerical noise)"
    # Don't exit - user may want to analyze
else
    echo "  No imaginary frequencies - dynamically stable"
fi

# Generate force constants
q2r.x < ${{PREFIX}}_q2r.in > ${{PREFIX}}_q2r.out

# Calculate phonon dispersion
matdyn.x < ${{PREFIX}}_matdyn.in > ${{PREFIX}}_matdyn.out

#-------------------------------------------------------------------------------
# PHASE 5: ELECTRONIC PROPERTIES
#-------------------------------------------------------------------------------
echo "Step 4: Band structure calculation..."

# NSCF for bands
pw.x < ${{PREFIX}}_bands.in > ${{PREFIX}}_bands.out

# Extract bands
bands.x < ${{PREFIX}}_bands_pp.in > ${{PREFIX}}_bands_pp.out

echo "Step 5: DOS calculation..."

# NSCF for DOS (dense k-mesh)
pw.x < ${{PREFIX}}_nscf.in > ${{PREFIX}}_nscf.out

# Calculate DOS
dos.x < ${{PREFIX}}_dos.in > ${{PREFIX}}_dos.out

# Calculate PDOS
projwfc.x < ${{PREFIX}}_pdos.in > ${{PREFIX}}_pdos.out

#-------------------------------------------------------------------------------
# PHASE 6: SUMMARY
#-------------------------------------------------------------------------------
echo ""
echo "========================================"
echo "Workflow completed for {material.formula}"
echo "========================================"
echo ""
echo "Output files:"
echo "  Structure:   ${{PREFIX}}_vcrelax.out"
echo "  SCF:         ${{PREFIX}}_scf.out"
echo "  Phonons:     ${{PREFIX}}_ph.out"
echo "  Bands:       ${{PREFIX}}_bands.dat.gnu"
echo "  DOS:         ${{PREFIX}}.dos"
echo ""
"""
    return script

print("Master Workflow Script Generator")
print("=" * 60)
print("\nGenerated script for BaTiO3:")
print(create_workflow_script(bto_workflow)[:2000] + "...")
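
The generator above only returns the script as a string. To run it on a cluster it must be written to disk and marked executable. A minimal sketch using only the standard library follows; the helper name `save_executable` and the file names are hypothetical.

```python
import os
import stat
import tempfile
from pathlib import Path


def save_executable(script_text: str, path) -> Path:
    """Write a script to `path` and set the owner-execute bit."""
    path = Path(path)
    # Create the parent directory if it does not exist yet
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(script_text, encoding="utf-8")
    # Keep the existing permissions and add execute permission for the owner
    path.chmod(path.stat().st_mode | stat.S_IXUSR)
    return path


# Demonstration in a temporary directory with a stand-in script
with tempfile.TemporaryDirectory() as tmp:
    saved = save_executable("#!/bin/bash\necho hello\n", Path(tmp) / "run" / "workflow.sh")
    saved_is_executable = os.access(saved, os.X_OK)
    saved_text = saved.read_text(encoding="utf-8")

print("Executable:", saved_is_executable)
```

In this notebook the call would look like `save_executable(create_workflow_script(bto_workflow), "batio3_workflow.sh")`. Archiving the generated script next to the input files also serves the documentation phase of the workflow.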

---

## Documentation Template

In [None]:
def generate_calculation_report(material: MaterialWorkflow) -> str:
    """Generate a documentation report for publication."""
    
    report = f"""
================================================================================
COMPUTATIONAL METHODS - {material.formula}
================================================================================

STRUCTURE SOURCE
----------------
Initial structure obtained from: {material.source}
Space group: {material.space_group}

VALIDATION
----------
- Charge neutrality verified: {'Yes' if material.charge_neutral else 'No'}
- Structure validated (bond lengths against Shannon ionic radii, space group symmetry): {'Yes' if material.structure_validated else 'No'}

COMPUTATIONAL DETAILS
---------------------
All calculations performed with Quantum ESPRESSO v7.x

Exchange-correlation functional: {material.functional}
Pseudopotentials: {material.pseudopotentials}

Convergence parameters (verified to 1 meV/atom):
  - Plane-wave cutoff: {material.ecutwfc} Ry
  - Charge density cutoff: {material.ecutrho} Ry
  - Brillouin zone sampling: {material.kpoints[0]}×{material.kpoints[1]}×{material.kpoints[2]} Monkhorst-Pack grid

Structure optimization:
  - Force convergence: 10⁻⁴ Ry/Bohr
  - Stress convergence: 0.5 kbar
"""
    
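    if material.is_fully_stable():
        # Stability flags can be set without the matching numbers being recorded,
        # so format optional values defensively instead of raising TypeError.
        def fmt(value: Optional[float], spec: str) -> str:
            return format(value, spec) if value is not None else "n/a"

        report += f"""
STABILITY VERIFICATION
----------------------
Thermodynamic stability:
  - Formation energy: {fmt(material.formation_energy_ev, '.3f')} eV/atom
  - Distance from convex hull: {fmt(material.ehull_mev, '.1f')} meV/atom
  - Status: STABLE

Dynamic stability:
  - Phonon calculation performed on 4×4×4 q-grid
  - Minimum phonon frequency: {fmt(material.min_phonon_freq, '.2f')} THz
  - Status: STABLE (no imaginary modes)

Mechanical stability:
  - Born stability criteria: SATISFIED
"""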
    
    if material.band_gap_ev is not None:
        report += f"""
ELECTRONIC PROPERTIES
---------------------
Band gap: {material.band_gap_ev:.3f} eV (DFT-{material.functional})

Note: DFT typically underestimates band gaps. For accurate gaps,
consider GW calculations or hybrid functionals.
"""
    
    report += """
DATA AVAILABILITY
-----------------
Input files and calculation outputs are available in the
supplementary materials. All calculations can be reproduced
using the provided workflow scripts.
"""
    
    return report

print(generate_calculation_report(bto_workflow))
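
Alongside the human-readable report, a machine-readable record of parameters and software versions helps with the reproducibility items of Phase 6. The sketch below serializes a dataclass to JSON together with basic environment information. `RunParameters` is a hypothetical stand-in for a few fields of the workflow class, and the `code_version` string is a placeholder that should be replaced by the version printed in your own output files.

```python
import json
import platform
import sys
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Tuple


@dataclass
class RunParameters:
    """Hypothetical stand-in for a few fields of the workflow record."""
    formula: str
    functional: str
    ecutwfc: float
    kpoints: Tuple[int, int, int]


def provenance_record(params, code_version: str) -> dict:
    """Bundle parameters with environment details for archiving."""
    return {
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "code_version": code_version,   # placeholder, fill in from your outputs
        "parameters": asdict(params),   # works for any dataclass instance
    }


record = provenance_record(
    RunParameters("BaTiO3", "PBE", 60.0, (8, 8, 8)),
    code_version="Quantum ESPRESSO v7.x",
)
record_json = json.dumps(record, indent=2)
print(record_json)
```

Because `asdict` recurses into nested dataclasses, the same function should accept a full workflow object such as `bto_workflow`, including its list of calculation records. Note that tuples such as the k-point mesh become JSON lists.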

---

## Quick Reference Checklist

In [None]:
def print_workflow_checklist():
    """Print the complete workflow checklist."""
    checklist = """
╔═══════════════════════════════════════════════════════════════╗
║              DFT CALCULATION QUALITY CHECKLIST                ║
╠═══════════════════════════════════════════════════════════════╣
║                                                               ║
║  BEFORE STARTING:                                             ║
║  □ Structure from reliable source                             ║
║  □ Charge neutrality verified                                 ║
║  □ Bond lengths reasonable                                    ║
║  □ Literature search completed                                ║
║                                                               ║
║  DFT SETUP:                                                   ║
║  □ Appropriate functional selected                            ║
║  □ Pseudopotentials from verified library (SSSP)              ║
║  □ ecutwfc convergence tested (< 1 meV/atom)                  ║
║  □ k-points convergence tested (< 1 meV/atom)                 ║
║  □ Smearing appropriate for material type                     ║
║                                                               ║
║  OPTIMIZATION:                                                ║
║  □ Forces converged (< 10⁻⁴ Ry/Bohr)                          ║
║  □ Stress converged (< 0.5 kbar)                              ║
║  □ Final volume reasonable                                    ║
║  □ Magnetic ground state found (if applicable)                ║
║                                                               ║
║  STABILITY (MANDATORY!):                                      ║
║  □ Formation energy calculated                                ║
║  □ Convex hull distance < 25 meV/atom                         ║
║  □ Phonon calculation completed                               ║
║  □ No imaginary frequencies                                   ║
║  □ Elastic constants satisfy Born criteria                    ║
║                                                               ║
║  PROPERTIES:                                                  ║
║  □ Band structure on correct high-symmetry path               ║
║  □ DOS with sufficient k-point density                        ║
║  □ Band gap noted (with DFT limitation caveat)                ║
║                                                               ║
║  DOCUMENTATION:                                               ║
║  □ All parameters recorded                                    ║
║  □ Software versions documented                               ║
║  □ Input/output files archived                                ║
║  □ Results compared with experiment/literature                ║
║                                                               ║
╚═══════════════════════════════════════════════════════════════╝
"""
    print(checklist)

print_workflow_checklist()
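
The checklist item "Formation energy calculated" amounts to a short piece of arithmetic: subtract the reference energies of the constituent elements from the compound energy and divide by the number of atoms, ΔHf = (E_compound − Σ nᵢ·Eᵢ) / Σ nᵢ. A sketch under stated assumptions follows: energies are total energies in Ry per formula unit (compound) and per atom (elemental references), all computed with the same functional, pseudopotentials and cutoffs. The function name and the numbers are hypothetical.

```python
from typing import Dict

RY_TO_EV = 13.605693122994


def formation_energy_ev_per_atom(
    e_compound_ry: float,
    composition: Dict[str, int],
    e_ref_ry_per_atom: Dict[str, float],
) -> float:
    """Formation energy per atom in eV.

    e_compound_ry     : total energy of one formula unit (Ry)
    composition       : atoms per formula unit, e.g. {"A": 1, "B": 1}
    e_ref_ry_per_atom : energy per atom of each element in its reference phase (Ry)
    """
    n_atoms = sum(composition.values())
    # Energy of the separated elemental references for one formula unit
    e_elements = sum(n * e_ref_ry_per_atom[el] for el, n in composition.items())
    return (e_compound_ry - e_elements) * RY_TO_EV / n_atoms


# Made-up energies for a fictitious compound AB (not real data)
dhf = formation_energy_ev_per_atom(
    e_compound_ry=-10.2,
    composition={"A": 1, "B": 1},
    e_ref_ry_per_atom={"A": -4.0, "B": -6.0},
)
print(f"Formation energy: {dhf:.3f} eV/atom")
```

A negative value only shows stability against the elements. The convex hull distance in the checklist additionally compares against all competing compounds, which requires their energies as well.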

---

## Summary

### The Key Lesson

**"Garbage In → Garbage Out"**

No amount of computational sophistication can fix a fundamentally flawed structure or inappropriate methodology.

### Critical Steps (Never Skip!)

1. **Validate your structure** before any calculation
2. **Test convergence** of all computational parameters
3. **Verify stability** before calculating properties
4. **Document everything** for reproducibility
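
As a concrete instance of the first step, a charge neutrality check is only a few lines once formal oxidation states are assigned. The sketch below is a minimal illustration with hypothetical function names; it is not the workshop's `check_charge_neutrality` utility, and the oxidation states must be supplied by the user.

```python
from typing import Dict


def net_charge(composition: Dict[str, int], oxidation_states: Dict[str, float]) -> float:
    """Sum of (atom count × formal oxidation state) over one formula unit."""
    return sum(n * oxidation_states[el] for el, n in composition.items())


def is_charge_neutral(
    composition: Dict[str, int],
    oxidation_states: Dict[str, float],
    tol: float = 1e-8,
) -> bool:
    """True if the formal charges cancel within a small tolerance."""
    return abs(net_charge(composition, oxidation_states)) < tol


states = {"Ba": +2, "Ti": +4, "O": -2}

# BaTiO3: (+2) + (+4) + 3 × (-2) = 0
batio3_neutral = is_charge_neutral({"Ba": 1, "Ti": 1, "O": 3}, states)

# A deliberately wrong stoichiometry leaves a net charge of +2
batio2_charge = net_charge({"Ba": 1, "Ti": 1, "O": 2}, states)

print("BaTiO3 neutral:", batio3_neutral)
print("BaTiO2 net charge:", batio2_charge)
```

A failed check does not always mean the structure is wrong, since mixed-valence compounds and metals do not follow simple formal charges, but it is a cheap signal that the stoichiometry deserves a second look.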

### Workshop Notebooks Summary

| Notebook | Topic | Key Concepts |
|----------|-------|-------------|
| 00 | Overview | Philosophy, workflow, resources |
| 01 | Database Search | MP, OQMD, AFLOW APIs |
| 02 | Structure Validation | Charge neutrality, bond lengths |
| 03 | DFT Setup | Functionals, pseudopotentials |
| 04 | Convergence | ecutwfc, k-points, smearing |
| 05 | Optimization | vc-relax, EOS fitting |
| 06 | Magnetic Systems | FM, AFM, ground state |
| 07 | Stability | Thermodynamic, dynamic, mechanical |
| 08 | Electronic | Bands, DOS, PDOS |
| 09 | Advanced | Optical, phonon, transport |
| 10 | Workflow | Integration, documentation |

### Resources

- **Databases**: Materials Project, OQMD, AFLOW, JARVIS, NOMAD
- **Pseudopotentials**: SSSP (Materials Cloud)
- **Software**: Quantum ESPRESSO, VASP, CASTEP, etc.
- **Visualization**: VESTA, XCrySDen, Materials Studio

---

## Final Notes for Workshop Participants

### Your Next Steps

1. **Practice**: Run the example calculations on your HPC system
2. **Validate**: Always question your inputs and results
3. **Document**: Keep detailed records of all calculations
4. **Compare**: Check your results against literature
5. **Ask**: When in doubt, consult with experienced researchers

### Remember

DFT is a powerful tool, but only when used correctly. The time invested in proper validation and convergence testing will save you from publishing incorrect results.

**Good luck with your research!**