MLIP Coding Project README
============================
This project automates three steps: (1) downloading CIFs for a chemical
system, (2) batch-relaxing the structures with Fairchem eSEN (via ASE), and
(3) building a pymatgen PhaseDiagram from the resulting energies.
Contents
--------
- download_cifs_for_pd.py
- batch_relax_to_csv.py
- make_phase_diagram_from_csv.py
- environment.yml (conda environment)
- requirements.txt (pip-only alternative)
Prerequisites
-------------
1) Create and activate the conda environment:
$ conda env create -f environment.yml
$ conda activate convex
2) Materials Project API key (for CIF downloads):
Obtain a key from materialsproject.org and set it once for this env:
$ mkdir -p "$CONDA_PREFIX/etc/conda/activate.d" "$CONDA_PREFIX/etc/conda/deactivate.d"
$ cat > "$CONDA_PREFIX/etc/conda/activate.d/mpapi.sh" <<'EOF'
export MP_API_KEY="PUT_YOUR_KEY_HERE"
EOF
$ cat > "$CONDA_PREFIX/etc/conda/deactivate.d/mpapi.sh" <<'EOF'
unset MP_API_KEY
EOF
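The --api-key option and the MP_API_KEY variable suggest a lookup order like
the one below. This is a minimal sketch, not the script's actual code;
resolve_api_key is a hypothetical helper name, and treating the placeholder
string as "missing" is an assumption.

```python
import os


def resolve_api_key(cli_key=None, env=None):
    """Return the key passed via --api-key, else the MP_API_KEY env var."""
    env = os.environ if env is None else env
    key = cli_key or env.get("MP_API_KEY")
    # Assumption: an unedited placeholder counts as "no key".
    if not key or key == "PUT_YOUR_KEY_HERE":
        raise SystemExit(
            "No Materials Project API key: set MP_API_KEY or pass --api-key"
        )
    return key
```

A key given on the command line wins over the environment variable, so a
one-off run with a different key does not require editing the activate script.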
3) (Optional) Apple Silicon acceleration for Torch:
The environment installs torch with MPS support when available.
Verify with:
$ python - <<'PY'
import torch
print("Torch MPS available:", hasattr(torch.backends,'mps') and torch.backends.mps.is_available())
PY
Typical Workflow (CoSb example)
--------------------------------
Step 1: Download CIFs from Materials Project
---------------------------------------------
Script: download_cifs_for_pd.py
Description:
Given element symbols that define a chemical system (e.g., Co Sb), downloads
Materials Project structures as CIFs. Can standardize to conventional cells
and optionally keep only one CIF per pretty formula.
Usage:
$ python download_cifs_for_pd.py Co Sb --out CoSb_cifs [options]
Options:
elements (positional)  Space-separated element symbols (>= 2)
--out, -o OUT_DIR      Output folder for CIFs (default: derived from elements)
--api-key KEY          API key (defaults to env var MP_API_KEY if set)
--conventional         Write CIFs in conventional standard cells (spglib)
--dedupe-formula       Keep only one CIF per pretty formula
Examples:
$ python download_cifs_for_pd.py Co Sb --out CoSb_cifs
$ python download_cifs_for_pd.py Fe O --out FeO_cifs --conventional --dedupe-formula
Outputs:
- A folder with .cif files for the requested chemical system.
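For orientation, a download step of this kind could look roughly like the
sketch below. It is not the code in download_cifs_for_pd.py: the function
names are hypothetical, querying every sub-system (Co, Sb, Co-Sb) so that
elemental references are included is an assumption, and so is keeping the
first entry returned per formula when de-duplicating.

```python
from itertools import combinations
from pathlib import Path


def chemsys_subsets(elements):
    """All sub-systems of the given elements, e.g. ['Co', 'Sb', 'Co-Sb']."""
    els = sorted(set(elements))
    return [
        "-".join(combo)
        for n in range(1, len(els) + 1)
        for combo in combinations(els, n)
    ]


def download_cifs(elements, out_dir, api_key,
                  conventional=False, dedupe_formula=False):
    # Imported lazily so chemsys_subsets works without mp-api installed.
    from mp_api.client import MPRester
    from pymatgen.symmetry.analyzer import SpacegroupAnalyzer

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with MPRester(api_key) as mpr:
        docs = mpr.materials.summary.search(
            chemsys=chemsys_subsets(elements),
            fields=["material_id", "formula_pretty", "structure"],
        )
    seen = set()
    for doc in docs:
        # Assumption: when de-duplicating, keep the first entry per formula.
        if dedupe_formula and doc.formula_pretty in seen:
            continue
        seen.add(doc.formula_pretty)
        structure = doc.structure
        if conventional:
            structure = SpacegroupAnalyzer(
                structure
            ).get_conventional_standard_structure()
        structure.to(filename=str(out / f"{doc.material_id}.cif"), fmt="cif")
    return out
```

Naming each file after its material ID matches the material_id column
described in Step 2, which is taken from the filename stem.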
Step 2: Batch relax CIFs and export energies to CSV
----------------------------------------------------
Script: batch_relax_to_csv.py
Description:
Uses ASE + Fairchem eSEN to relax each CIF and records energies/metadata in
a single CSV for downstream PhaseDiagram calculations.
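The relaxation of one structure can be sketched with plain ASE calls, as
below. This is an illustration under stated assumptions, not the script's
implementation: relax_cif is a hypothetical name, the calculator is passed in
rather than built from a Fairchem model, and the choice of LBFGS with a
FrechetCellFilter (cell and positions relaxed together) is assumed.

```python
# 1 GPa expressed in eV/Å^3 (the pressure unit ASE works in).
EV_PER_A3_PER_GPA = 1.0e-21 / 1.602176634e-19


def gpa_to_ev_per_a3(pressure_gpa):
    """Convert a pressure in GPa to eV/Å^3."""
    return pressure_gpa * EV_PER_A3_PER_GPA


def relax_cif(cif_path, calc, fmax=0.05, steps=500, pressure_gpa=0.0):
    """Relax one CIF with any ASE calculator and return summary values."""
    # Imported lazily; ase.filters requires a recent ASE release.
    from ase.io import read
    from ase.filters import FrechetCellFilter
    from ase.optimize import LBFGS

    atoms = read(cif_path)
    atoms.calc = calc
    cell_filter = FrechetCellFilter(
        atoms, scalar_pressure=gpa_to_ev_per_a3(pressure_gpa)
    )
    opt = LBFGS(cell_filter, logfile=None)
    converged = opt.run(fmax=fmax, steps=steps)
    energy = atoms.get_potential_energy()
    return {
        "energy_eV": energy,
        "energy_per_atom_eV": energy / len(atoms),
        "n_atoms": len(atoms),
        "converged_by_fmax": bool(converged),
        "steps_used": opt.nsteps,
    }
```

The defaults mirror the documented --fmax, --steps, and --pressure defaults;
the returned keys match the CSV columns of the same names.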
Usage:
$ python batch_relax_to_csv.py INPUT_DIR OUTPUT_CSV [options]
Options:
input_dir (positional)   Folder containing .cif files
output_csv (positional)  Path to write the summary CSV (e.g., results.csv)
--model NAME             Fairchem model name (default: uma-s-1p1)
--fmax VALUE             Force convergence target in eV/Å (default: 0.05)
--steps N                Max relaxation steps (default: 500)
--pressure GPa           Target hydrostatic pressure in GPa (default: 0.0)
--device DEV             'cuda', 'cpu', or leave unset for auto-detect
--outputs DIR            Folder to write per-structure outputs (default: outputs_batch)
--pattern GLOB           CIF filename glob in INPUT_DIR (default: *.cif)
Example:
$ python batch_relax_to_csv.py CoSb_cifs results.csv --fmax 0.05 --steps 500 --outputs outputs_CoSb
Primary CSV columns (one row per CIF):
- material_id         stem of input filename
- formula_reduced     reduced formula (pymatgen)
- energy_eV           total energy (eV) at relaxed geometry
- energy_per_atom_eV  energy per atom (eV/atom)
- n_atoms             number of atoms
- elements            comma-separated element symbols
- input_path          original CIF path
- output_cif          path to relaxed CIF
- converged_by_fmax   True/False (optimizer convergence)
- steps_used          number of steps taken
(Plus any additional per-structure metadata the script writes.)
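As an illustration of the file layout, the columns listed above could be
written with the standard csv module as follows. The helper name
write_results is hypothetical, and the example row contains made-up
placeholder values, not real results.

```python
import csv
import io

CSV_COLUMNS = [
    "material_id", "formula_reduced", "energy_eV", "energy_per_atom_eV",
    "n_atoms", "elements", "input_path", "output_cif",
    "converged_by_fmax", "steps_used",
]


def write_results(rows, fh):
    """Write one CSV row per relaxed structure to an open text file."""
    # extrasaction="ignore" drops keys outside CSV_COLUMNS; a script that
    # records extra metadata would extend the column list instead.
    writer = csv.DictWriter(fh, fieldnames=CSV_COLUMNS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)


# Placeholder row (made-up values) to show the shape of the output.
example = {
    "material_id": "example", "formula_reduced": "CoSb3",
    "energy_eV": -1.0, "energy_per_atom_eV": -0.25, "n_atoms": 4,
    "elements": "Co,Sb", "input_path": "CoSb_cifs/example.cif",
    "output_cif": "outputs_CoSb/example_relaxed.cif",
    "converged_by_fmax": True, "steps_used": 10,
}
buffer = io.StringIO()
write_results([example], buffer)
header_line = buffer.getvalue().splitlines()[0]
```

Because the elements field itself contains commas, the csv module quotes it,
so downstream readers should parse the file with a CSV reader rather than
splitting lines on commas.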
Outputs:
- OUTPUT_CSV           e.g., results.csv
- outputs_* directory  relaxed CIFs and logs
Notes:
- Ensure fairchem-core and torch are installed (handled by environment.yml).
- On Apple Silicon there is no CUDA device, so auto-detection typically ends up on the CPU; pass --device cpu to force it.
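Device auto-detection of the kind described for --device can be sketched as
follows. This is an assumed policy (explicit choice first, then CUDA if
available, else CPU), not necessarily what batch_relax_to_csv.py does;
pick_device is a hypothetical name.

```python
def pick_device(requested=None):
    """Return the requested device, or auto-detect 'cuda' vs 'cpu'."""
    if requested:
        return requested
    try:
        import torch
    except ImportError:
        # Without torch there is nothing to detect; fall back to CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"
```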
Step 3: Build and plot a Phase Diagram from the CSV
----------------------------------------------------
Script: make_phase_diagram_from_csv.py
Description:
Reads the results CSV, filters entries (by elements, convergence, size), then
constructs a pymatgen PhaseDiagram. Saves a static PNG if kaleido is
available; otherwise falls back to an interactive HTML.
Also writes a JSON with key phase-stability info.
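The PNG-or-HTML fallback can be sketched as below for any Plotly-style figure
object that offers write_image and write_html. The helper name save_plot is
hypothetical, and catching every exception from write_image is a
simplification; the actual script may check for kaleido differently.

```python
from pathlib import Path


def save_plot(fig, out_dir=".", label="pd_plot.png"):
    """Try a static image first; fall back to interactive HTML."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    target = out / label
    try:
        fig.write_image(str(target))  # needs kaleido for Plotly figures
    except Exception:
        # Static export failed: keep the stem, switch the suffix to .html.
        target = target.with_suffix(".html")
        fig.write_html(str(target))
    return target
```

Returning the path actually written lets the caller report whether the plot
ended up as a PNG or as HTML.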
Usage:
$ python make_phase_diagram_from_csv.py CSV [options]
Options:
csv (positional)          Input CSV from batch_relax_to_csv.py
--elements EL1 EL2 [...]  Restrict to these element symbols (space-separated)
--only-converged          Use only rows with converged_by_fmax == True
--min-atoms N             Minimum number of atoms to include
--max-atoms N             Maximum number of atoms to include
--label FILE              Output plot filename (default: pd_plot.png)
--out-dir DIR, -o DIR     Output directory for all files (default: current directory)
--no-plot                 Compute PD but do not create a plot file
Examples:
# CoSb binary, use only converged relaxations, write PNG (or HTML fallback)
$ python make_phase_diagram_from_csv.py results.csv \
    --elements Co Sb --only-converged --label CoSb_PD.png
# Save all outputs to a specific directory
$ python make_phase_diagram_from_csv.py results.csv \
    --elements Co Sb --only-converged -o ./CoSb_phase_diagram
Outputs:
- pd_plot.png (or .html)   Phase diagram plot (up to quaternary plotted)
- energies_above_hull.csv  Table of energy above hull for all entries
- stable_phases.json       JSON summary of stable entries and hull distances
All outputs are saved to the directory specified by --out-dir (or current directory by default).
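The filtering and hull calculation can be sketched as follows, assuming rows
read with csv.DictReader from the Step 2 CSV (so every value is a string).
Both function names are hypothetical, and the element filter is assumed to
keep a row only if all of its elements are in the --elements list. Note that
pymatgen's PhaseDiagram raises an error if an elemental reference is missing
from the entries.

```python
def filter_rows(rows, elements=None, only_converged=False,
                min_atoms=None, max_atoms=None):
    """Apply the --elements, --only-converged, and atom-count filters."""
    allowed = set(elements) if elements else None
    kept = []
    for row in rows:
        row_els = {e.strip() for e in row["elements"].split(",") if e.strip()}
        n_atoms = int(row["n_atoms"])
        if allowed is not None and not row_els <= allowed:
            continue
        converged = str(row["converged_by_fmax"]).strip().lower() == "true"
        if only_converged and not converged:
            continue
        if min_atoms is not None and n_atoms < min_atoms:
            continue
        if max_atoms is not None and n_atoms > max_atoms:
            continue
        kept.append(row)
    return kept


def hull_energies(rows):
    """Return {material_id: energy above hull in eV/atom}."""
    # Imported lazily so filter_rows works without pymatgen installed.
    from pymatgen.core import Composition
    from pymatgen.analysis.phase_diagram import PDEntry, PhaseDiagram

    entries = []
    for row in rows:
        comp = Composition(row["formula_reduced"])
        # Scale the per-atom energy to the reduced formula unit so that
        # energy and composition describe the same number of atoms.
        energy = float(row["energy_per_atom_eV"]) * comp.num_atoms
        entries.append(PDEntry(comp, energy, name=row["material_id"]))
    diagram = PhaseDiagram(entries)
    return {e.name: diagram.get_e_above_hull(e) for e in entries}
```

Using energy_per_atom_eV rather than energy_eV avoids a mismatch between the
relaxed cell's atom count and the reduced formula.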
Troubleshooting
---------------
- RequestsDependencyWarning (chardet / charset-normalizer):
The environment includes 'charset-normalizer'. If you see this warning,
update requests & charset-normalizer inside the env:
$ pip install -U requests charset-normalizer
- Could not save PNG (kaleido issue):
The script falls back to writing an interactive HTML. You can also reinstall:
$ pip install -U kaleido plotly
- Torch device choice:
Specify --device cpu or --device cuda explicitly if auto-detection is not
what you want.
- Missing API key:
Set MP_API_KEY in the env (see Prerequisites), or pass --api-key to download_cifs_for_pd.py.
Data Flow Summary
-----------------
download_cifs_for_pd.py         --> folder_of_cifs/
batch_relax_to_csv.py           --> results.csv (+ relaxed CIFs in outputs_*)
make_phase_diagram_from_csv.py  --> pd_plot.png/html + energies_above_hull.csv + stable_phases.json
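To chain the three scripts, a small driver along these lines may be
convenient. It is a sketch, not part of the repository: pipeline_commands and
run_pipeline are hypothetical names, the folder naming is an assumption, and
only flags documented above are used.

```python
import subprocess
import sys


def pipeline_commands(elements, workdir="."):
    """Build the three command lines for one chemical system."""
    tag = "".join(elements)  # e.g. ['Co', 'Sb'] -> 'CoSb'
    cif_dir = f"{workdir}/{tag}_cifs"
    csv_path = f"{workdir}/results.csv"
    return [
        [sys.executable, "download_cifs_for_pd.py", *elements,
         "--out", cif_dir],
        [sys.executable, "batch_relax_to_csv.py", cif_dir, csv_path,
         "--outputs", f"{workdir}/outputs_{tag}"],
        [sys.executable, "make_phase_diagram_from_csv.py", csv_path,
         "--elements", *elements, "--only-converged",
         "--out-dir", f"{workdir}/{tag}_phase_diagram"],
    ]


def run_pipeline(elements, workdir="."):
    """Run the steps in order; check=True stops at the first failure."""
    for cmd in pipeline_commands(elements, workdir):
        subprocess.run(cmd, check=True)
```

Calling run_pipeline(["Co", "Sb"]) from the repository root, with the conda
environment active and MP_API_KEY set, would run the CoSb example end to end.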