FerroGen consists of three bash shell scripts that sequentially screen inorganic crystal structures generated by a diffusion model to identify promising ferroelectric candidates. The multi-stage filtering pipeline (1) keeps only structures in polar space groups, as identified with FINDSYM, (2) assesses thermodynamic stability through oxidation state validation and the energy above hull computed with SevenNet-0, and (3) estimates the band gap (a key property for ferroelectric photovoltaics) using CGCNN. The bash scripts manage the workflow, while the individual programs (e.g., FINDSYM, SevenNet-0, CGCNN) invoke helper Python scripts where provided.
The following Python packages are required:
python>=3.10
numpy
pandas
torch # PyTorch
ase>=3.25.0
pymatgen>=2025.4.24
sevenn>=0.10.4

References:
Yeo, B. C., Kang, S., & Lee, J.-H. (2025). Diffusion–Model–Driven Discovery of Ferroelectrics for Photocurrent Applications [Working paper]. ChemRxiv. https://chemrxiv.org/engage/chemrxiv/article-details/68d1110df416303770b9e33c
Yeo, B. C., Kang, S., & Lee, J.-H. (2025). Diffusion–Model–Driven Discovery of Ferroelectrics for Photocurrent Applications [Data set]. Zenodo. https://doi.org/10.5281/zenodo.17223175
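
The Python requirements listed above can be checked before running the scripts. The following is a minimal sketch, not part of FerroGen itself: the function names are hypothetical, and it assumes each package is installed under the same distribution name that appears in the requirements list.

```python
from __future__ import annotations

import re
import sys
from importlib import metadata

# Minimum versions copied from the requirements list; None means "any version".
REQUIREMENTS = {
    "numpy": None,
    "pandas": None,
    "torch": None,
    "ase": "3.25.0",
    "pymatgen": "2025.4.24",
    "sevenn": "0.10.4",
}


def version_tuple(version: str) -> tuple[int, ...]:
    """Turn '2025.4.24' or '2.3.0+cu121' into a tuple of its leading integers."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed: str, minimum: str | None) -> bool:
    """Return True if the installed version is at least the minimum version."""
    if minimum is None:
        return True
    have, need = version_tuple(installed), version_tuple(minimum)
    width = max(len(have), len(need))
    # Pad with zeros so that '3.25' compares equal to '3.25.0'.
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def check_environment() -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if sys.version_info < (3, 10):
        problems.append("python>=3.10 is required")
    for name, minimum in REQUIREMENTS.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if not meets_minimum(installed, minimum):
            problems.append(f"{name} {installed} is older than {minimum}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

Version strings are compared numerically component by component, so `2025.10.1` counts as newer than `2025.4.24`.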
Step 1: symmetry filtering
- Input:
  - generated_materials_cif/: folder of structures generated by the diffusion model (Zenodo dataset)
- Process:
  - Uses FINDSYM to analyze the symmetry of the generated structures
- Outputs:
  - 1.cif/: folder containing symmetry-processed CIF structures
  - 2.cif/: folder containing only CIF structures in polar space groups
  - result_space_group.txt: list of identified space groups
  - result_polar_group.txt: list of structures belonging to polar space groups
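
The polar space group selection in step 1 can be illustrated with a short sketch. Of the 230 space groups, 68 belong to the ten polar point groups (1, 2, m, mm2, 4, 4mm, 3, 3m, 6, 6mm), and they occupy contiguous ranges of space group numbers. The code below assumes the space group number of each structure is already known (for example, from the FINDSYM output). The function names and the dictionary input are hypothetical stand-ins, since the layout of `result_space_group.txt` is not described here.

```python
from __future__ import annotations

# Inclusive ranges of space group numbers for the ten polar point groups.
POLAR_RANGES = [
    (1, 1),      # point group 1
    (3, 5),      # point group 2
    (6, 9),      # point group m
    (25, 46),    # point group mm2
    (75, 80),    # point group 4
    (99, 110),   # point group 4mm
    (143, 146),  # point group 3
    (156, 161),  # point group 3m
    (168, 173),  # point group 6
    (183, 186),  # point group 6mm
]


def is_polar(space_group_number: int) -> bool:
    """Return True if the space group number belongs to a polar point group."""
    if not 1 <= space_group_number <= 230:
        raise ValueError(f"invalid space group number: {space_group_number}")
    return any(low <= space_group_number <= high for low, high in POLAR_RANGES)


def select_polar(space_groups: dict[str, int]) -> list[str]:
    """Keep the names of structures whose space group is polar (input order)."""
    return [name for name, number in space_groups.items() if is_polar(number)]


if __name__ == "__main__":
    # Hypothetical file names paired with space group numbers.
    example = {"a.cif": 99, "b.cif": 221, "c.cif": 161}
    print(select_polar(example))
```

In this example, space group 99 (P4mm) and 161 (R3c) are polar, while 221 (Pm-3m) is centrosymmetric and is dropped.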
Step 2: oxidation state validation and thermodynamic stability
- Input:
  - 2.cif/: folder containing only CIF structures in polar space groups (output from step 1)
- Process:
  - Performs oxidation state validation
  - Relaxes structures using SevenNet-0 to compute thermodynamic stability (energy above hull)
- Outputs:
  - result_oxidation_state.txt: file summarizing the oxidation state validation results
  - 4.cif/: folder containing structures relaxed with SevenNet-0
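
The idea behind oxidation state validation can be sketched as a charge-neutrality check: a composition passes if each element can be given one of its common oxidation states such that the total charge is zero. This is only an illustration of the concept, not the check FerroGen performs. The oxidation state table is a deliberately tiny, hypothetical one, the function names are made up, and mixed valence (one element in two different states) is not considered.

```python
from __future__ import annotations

from itertools import product

# Hypothetical, deliberately small table of common oxidation states.
COMMON_OXIDATION_STATES = {
    "Ba": (2,),
    "Ti": (2, 3, 4),
    "Pb": (2, 4),
    "Na": (1,),
    "O": (-2,),
    "Cl": (-1,),
}


def neutral_assignments(composition: dict[str, int]) -> list[dict[str, int]]:
    """Return every one-state-per-element assignment with zero total charge.

    `composition` maps element symbols to atom counts, e.g. BaTiO3 is
    {"Ba": 1, "Ti": 1, "O": 3}. Elements missing from the table raise KeyError.
    """
    elements = sorted(composition)
    choices = [COMMON_OXIDATION_STATES[element] for element in elements]
    results = []
    for states in product(*choices):
        charge = sum(
            state * composition[element]
            for element, state in zip(elements, states)
        )
        if charge == 0:
            results.append(dict(zip(elements, states)))
    return results


def is_charge_balanced(composition: dict[str, int]) -> bool:
    """Return True if at least one charge-neutral assignment exists."""
    return bool(neutral_assignments(composition))


if __name__ == "__main__":
    print(neutral_assignments({"Ba": 1, "Ti": 1, "O": 3}))
    print(is_charge_balanced({"Na": 1, "Cl": 2}))
```

For BaTiO3 the only neutral assignment is Ba 2+, Ti 4+, O 2-, whereas NaCl2 has none. For real screening, pymatgen's `Composition.oxi_state_guesses()` covers the full periodic table and ranks the candidate assignments.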
Step 3: band gap prediction and final candidates
- Input:
  - 4.cif/: folder containing structures relaxed with SevenNet-0 (output from step 2)
- Process:
  - Uses CGCNN to predict band gaps
  - Computes energy above hull values
  - Assigns new sequential numbering (1, 2, 3, …) to the processed structures
- Outputs:
  - 5.cif/: folder containing candidate CIF structures, renamed sequentially (1, 2, 3, …)
  - result_final_all.txt: file summarizing predicted band gaps (from CGCNN) and energy above hull results
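
The energy above hull reported in the final results measures how far a structure's energy lies above the convex hull of competing phases at the same composition. The sketch below shows the calculation for the simplest case, a binary A-B system. It is an illustration under stated assumptions rather than FerroGen's implementation: phases are given as hypothetical `(x, formation_energy_per_atom)` pairs, where `x` is the atomic fraction of B, and the pure elements define zero formation energy.

```python
from __future__ import annotations

Point = tuple[float, float]  # (fraction of B, formation energy per atom)


def _cross(o: Point, a: Point, b: Point) -> float:
    """Cross product of vectors o->a and o->b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def lower_hull(points: list[Point]) -> list[Point]:
    """Lower convex hull of the points, ordered by increasing x."""
    # Keep only the lowest energy at each composition.
    lowest: dict[float, float] = {}
    for x, energy in points:
        lowest[x] = min(energy, lowest.get(x, energy))
    hull: list[Point] = []
    for point in sorted(lowest.items()):
        # Drop the previous point while it lies on or above the line
        # from hull[-2] to the new point.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], point) <= 0:
            hull.pop()
        hull.append(point)
    return hull


def energy_above_hull(x: float, energy: float, references: list[Point]) -> float:
    """Energy of a candidate relative to the hull built from reference phases.

    A negative value means the candidate lies below the known hull.
    """
    # The pure elements A (x = 0) and B (x = 1) have zero formation energy.
    hull = lower_hull([(0.0, 0.0), (1.0, 0.0), *references])
    for (x0, e0), (x1, e1) in zip(hull, hull[1:]):
        if x0 <= x <= x1:
            hull_energy = e0 + (e1 - e0) * (x - x0) / (x1 - x0)
            return energy - hull_energy
    raise ValueError("x must lie between 0 and 1")


if __name__ == "__main__":
    # One stable reference compound AB with -1.0 eV/atom formation energy.
    print(energy_above_hull(0.25, -0.4, [(0.5, -1.0)]))
```

With a single stable compound at `x = 0.5` and -1.0 eV/atom, the hull energy at `x = 0.25` is -0.5 eV/atom, so a candidate at -0.4 eV/atom sits about 0.1 eV/atom above the hull. Systems with three or more elements need a higher-dimensional hull. pymatgen's `PhaseDiagram.get_e_above_hull` handles that general case.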