Skip to content

cyye001/Con-CDVAE

Repository files navigation

Con-CDVAE

中文版README

This code is improved on the basis of CDVAE, and implements the generation of crystals according to the target properties.

Ref: Cai-Yuan Ye, Hong-Ming Weng, Quan-Sheng Wu, Con-CDVAE: A method for the conditional generation of crystal structures, Computational Materials Today, 1, 100003 (2024).

arXiv: https://arxiv.org/abs/2403.12478

Update

  • v2.1.0 Updated the code for the prior module and the structure generation
  • v2.0.0 Use the new PyTorch environment
  • v1.0.0 Initial implementations of Con-CDVAE

Tip: Version 2.x is currently under active development and may be unstable.

Environment

We recommend using Anaconda to manage Python environments. First, create and activate a new Python environment:

conda create --name concdvae310 python=3.10
conda activate concdvae310

Then, use requirements.txt to install the Python packages.

pip install -r requirements.txt

Finally, the PyTorch-related libraries need to be installed according to your device and CUDA version. The version we used is:

torch                    2.3.0+cu118
torchaudio               2.3.0+cu118
torchvision              0.18.0+cu118

torch_geometric          2.5.3
torch_cluster            1.6.3+pt23cu118
torch_scatter            2.1.2+pt23cu118
torch_sparse             0.6.18+pt23cu118
torch_spline_conv        1.2.2+pt23cu118

pytorch-lightning        2.4.0
torchmetrics             1.6.3

For details, you can refer to PyTorch, pytorch-geometric, pytorch-lightning.

After setting up the environment, you can use the provided model checkpoint to run Con-CDVAE for conditional generation of materials. Before doing so, make sure to update the necessary environment paths. You can either run the following commands:

cp .env_bak .env
bash writeenv.sh

Or, if you prefer, modify the .env file manually. Update it with the following lines, replacing <YOUR_PATH_TO_CONCDVAE> with the absolute path to your Con-CDVAE directory:

export PROJECT_ROOT="<YOUR_PATH_TO_CONCDVAE>"
export HYDRA_JOBS="<YOUR_PATH_TO_CONCDVAE>/output/hydra"
export WABDB_DIR="<YOUR_PATH_TO_CONCDVAE>/output/wandb"

Datasets

You can find a small sample of the dataset in data/ (mptest/ and mptest4conz ), including the data used for Con-CDVAE two-step training. The complete data can be easily downloaded according to the API provided by the Materials Project (MP) and Open Quantum Materials Database (OQMD), and they can be used in the same format as the sample.

Use the pre-train model

A pre-trained model is available in src/model/mp20_format, trained on the mp_20 dataset. It can generate crystal structures based on formation energy. This model may not exactly match the results presented in the paper, as it was retrained using the modified code.

Use the following command to generate crystals using the default strategy:

python scripts/gen_crystal.py --config <YOUR_PATH_TO_CONCDVAE>/conf/gen/default.yaml

Use the following command to generate crystals using the full strategy:

python scripts/gen_crystal.py --config <YOUR_PATH_TO_CONCDVAE>/conf/gen/full.yaml

Use the following command to generate crystals using the less strategy:

python scripts/gen_crystal.py --config <YOUR_PATH_TO_CONCDVAE>/conf/gen/less.yaml

The configuration files for controlling the generation parameters are located in conf/gen/. You can refer to the two CSV files in src/model/mp20_format for the model input.

After crystal structures are generated, they are saved in the same directory as the model under filenames like eval_gen_xxx.pt, where xxx corresponds to the settings specified in your YAML and CSV files.

Training Con-CDVAE

Step-one training

To train a Con-CDVAE, run the following command first:

python concdvae/run.py data=mptest expname=test model=vae_mp_format

To use other dataset, user should prepare the data in the same forme as the sample, and edit a new configure files in conf/data/ folder, and use data=your_data_conf. To train model for other property, you can try model=vae_mp_gap.

If you want to accelerate with multiple gpus, you should run this command:

torchrun --nproc_per_node 4 concdvae/run.py \
    data=mptest \
    expname=test \
    model=vae_mp_gap \
    train.pl_trainer.accelerator=gpu  \
    train.pl_trainer.devices=4 \
    train.pl_trainer.strategy=ddp_find_unused_parameters_true 

After training, model checkpoints can be found in <YOUR_PATH_TO_CONCDVAE>/output/hydra/singlerun/YYYY-MM-DD/<expname>/epoch=xxx-step=xxx.ckpt.

Step-two training

After finishing step-one training, you can train the Prior block with the following command.

python concdvae/run_prior.py \
  --model_path <YOUR_PATH_TO_CONCDVAE>/output/hydra/singlerun/YYYY-MM-DD/<expname> \
  --model_file epoch=xxx-step=xxx.ckpt
  --prior_label prior_default

Then you can get the default condition Prior in <YOUR_PATH_TO_CONCDVAE>/output/hydra/singlerun/YYYY-MM-DD/<expname>/prior_default-epoch=xxx-step=xxx.ckpt.

If you want to train full conditon Prior, you should use:

python concdvae/run_prior.py \
  --model_path <YOUR_PATH_TO_CONCDVAE>/output/hydra/singlerun/YYYY-MM-DD/<expname> \
  --prior_label prior_full \
  --priorcondition_file mp_full \
  --data_file mptest4conz

Evaluating model

To evaluate crystal system, you can use the code concdvae/pt2CS.py.

To evaluate other properties, you should train a CGCNN with the following command:

python cgcnn/main.py /your_path_to_con-cdvae/cgcnn/data/mptest --prop band_gap --label your_label 

This code use the same dataset as Con-CDVAE, You can build the required database using the methods mentioned earlier. If you want to train CGCNN on other property, you can set --prop formation_energy_per_atom, --prop BG_type, --prop FM_type. It is important to note that if you are training for a classification task, you should set --task classification.

After training, model checkpoints can be found in your_labelmodel_best.pth.tar. The trained model can be found in cgcnn/pre-trained.

When you've generated crystals and need to evaluate, run the following command:

python cgcnn/predict.py --gendatapath /your_path_to_generated_crystal/ --modelpath /your_path_to_cgcnn_model/model_best.pth.tar --file your_crystal_file.pt --label your_label

Running the API service

We use FastAPI to deploy the Con-CDVAE model on our website (MaterialsGalaxy). Below is an example of how to launch the service:

cd fastapi
nohup uvicorn concdvae_api:app  --host '0.0.0.0' --port 8081 --reload > log_api 2>&1 &

After deployment, you can test the API using the following command:

python ../scripts/test_api.py

About

This is a conditionally generative model for crystal structures based on a modified version of CDVAE.

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors