Merge pull request #76 from usnistgov/develop

Develop
usnistgov · Nov 7, 2022 · b26fff6
2 parents 17f6883 + 12186ff

Showing 18 changed files with 3,475 additions and 23 deletions.
diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
@@ -31,7 +31,12 @@ jobs:
     - name: Test with pytest
       run: |
         export DGLBACKEND=pytorch
+        export CUDA_VISIBLE_DEVICES="-1"
+        #pip install dgl-cu111
         pip install flake8 pytest pycodestyle pydocstyle codecov pytest-cov coverage 
+        #pip uninstall -y torch nvidia-cublas-cu11 nvidia-cuda-nvrtc-cu11 nvidia-cuda-runtime-cu11 nvidia-cudnn-cu11
+        #conda install -y  pytorch-cpu
+        #pip install attrs==22.1.0 certifi==2022.9.24 charset-normalizer==2.1.1 codecov==2.1.12 contourpy==1.0.5 coverage==6.5.0 cycler==0.11.0 dgl==0.9.1 flake8==5.0.4 fonttools==4.38.0 idna==3.4 iniconfig==1.1.1 jarvis-tools==2022.9.16 joblib==1.2.0 kiwisolver==1.4.4 matplotlib==3.6.1 mccabe==0.7.0 networkx==3.0b1 numpy==1.23.4 packaging==21.3 pandas==1.5.1 Pillow==9.2.0 pluggy==1.0.0 psutil==5.9.3 py==1.11.0 pycodestyle==2.9.1 pydantic==1.10.2 pydocstyle==6.1.1 pyflakes==2.5.0 pyparsing==2.4.7 pytest==7.1.3 pytest-cov==4.0.0 python-dateutil==2.8.2 pytorch-ignite==0.5.0.dev20221024 pytz==2022.5 requests==2.28.1 scikit-learn==1.1.2 scipy==1.9.3 six==1.16.0 snowballstemmer==2.2.0 spglib==2.0.1 threadpoolctl==3.1.0 tomli==2.0.1 toolz==0.12.0 torch==1.12.1 tqdm==4.64.1 typing_extensions==4.4.0 urllib3==1.26.12 xmltodict==0.13.0
         echo 'PIP freeze'
         pip freeze
         coverage run -m pytest

diff --git a/README.md b/README.md
@@ -5,12 +5,14 @@
 ![GitHub tag (latest by date)](https://img.shields.io/github/v/tag/usnistgov/alignn)
 ![GitHub code size in bytes](https://img.shields.io/github/languages/code-size/usnistgov/alignn)
 ![GitHub commit activity](https://img.shields.io/github/commit-activity/y/usnistgov/alignn)
+[![Downloads](https://pepy.tech/badge/alignn)](https://pepy.tech/project/alignn)
+<!--
 [![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/atomistic-line-graph-neural-network-for/formation-energy-on-materials-project)](https://paperswithcode.com/sota/formation-energy-on-materials-project?p=atomistic-line-graph-neural-network-for)
 [![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/atomistic-line-graph-neural-network-for/band-gap-on-materials-project)](https://paperswithcode.com/sota/band-gap-on-materials-project?p=atomistic-line-graph-neural-network-for)
 [![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/atomistic-line-graph-neural-network-for/formation-energy-on-qm9)](https://paperswithcode.com/sota/formation-energy-on-qm9?p=atomistic-line-graph-neural-network-for)
 [![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/atomistic-line-graph-neural-network-for/formation-energy-on-jarvis-dft-formation)](https://paperswithcode.com/sota/formation-energy-on-jarvis-dft-formation?p=atomistic-line-graph-neural-network-for)
 [![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/atomistic-line-graph-neural-network-for/band-gap-on-jarvis-dft)](https://paperswithcode.com/sota/band-gap-on-jarvis-dft?p=atomistic-line-graph-neural-network-for)
-[![Downloads](https://pepy.tech/badge/alignn)](https://pepy.tech/project/alignn)
+-->
 
 # Table of Contents
 * [Introduction](#intro)
@@ -19,6 +21,7 @@
 * [Pre-trained models](#pretrained)
 * [Quick start using colab](#colab)
 * [JARVIS-ALIGNN webapp](#webapp)
+* [ALIGNN-FF](#alignnff)
 * [Performances on a few datasets](#performances)
 * [Useful notes](#notes)
 * [References](#refs)
@@ -155,6 +158,35 @@ A basic web-app is for direct-prediction available at [JARVIS-ALIGNN app](https:
 
 ![JARVIS-ALIGNN](https://github.com/usnistgov/alignn/blob/develop/alignn/tex/jalignn.PNG)
 
+
+
+<a name="alignnff"></a>
+ALIGNN-FF
+-------------------------
+
+To train ALIGNN-FF, use the `train_folder_ff.py` script, which uses the `alignn_atomwise` model.
+
+Atomwise training uses a setup similar to the one described above, but instead of `id_prop.csv` it requires an `id_prop.json` file (see the example in the `sample_data_ff` directory):
+
+```
+train_folder_ff.py --root_dir "alignn/examples/sample_data_ff" --config "alignn/examples/sample_data_ff/config_example_atomwise.json" --output_dir=temp
+```
+
+A pretrained ALIGNN-FF model (under active development) can be used to predict several properties, for example:
+
+```
+run_alignn_ff.py --file_path alignn/examples/sample_data/POSCAR-JVASP-10.vasp --task="unrelaxed_energy"
+run_alignn_ff.py --file_path alignn/examples/sample_data/POSCAR-JVASP-10.vasp --task="optimize"
+run_alignn_ff.py --file_path alignn/examples/sample_data/POSCAR-JVASP-10.vasp --task="ev_curve"
+```
+
+To list the other available tasks, run:
+
+```
+run_alignn_ff.py -h
+```
+
+
 <a name="performances"></a>
 
 Performances
@@ -293,6 +325,16 @@ Useful notes (based on some of the queries we received)
 References
 -----------------
 
+1) [Atomistic Line Graph Neural Network for improved materials property predictions](https://www.nature.com/articles/s41524-021-00650-1)
+2) [Prediction of the Electron Density of States for Crystalline Compounds with Atomistic Line Graph Neural Networks (ALIGNN)](https://link.springer.com/article/10.1007/s11837-022-05199-y)
+3) [Recent advances and applications of deep learning methods in materials science](https://www.nature.com/articles/s41524-022-00734-6)
+4) [Designing High-Tc Superconductors with BCS-inspired Screening, Density Functional Theory and Deep-learning](https://arxiv.org/abs/2205.00060)
+5) [A Deep-learning Model for Fast Prediction of Vacancy Formation in Diverse Materials](https://arxiv.org/abs/2205.08366)
+6) [Graph neural network predictions of metal organic framework CO2 adsorption properties](https://www.sciencedirect.com/science/article/pii/S092702562200163X)
+7) [Rapid Prediction of Phonon Structure and Properties using an Atomistic Line Graph Neural Network (ALIGNN)](https://arxiv.org/abs/2207.12510)
+8) [Unified graph neural network force-field for the periodic table](https://arxiv.org/abs/2209.05554)
+
+
 Please see detailed publications list [here](https://jarvis-tools.readthedocs.io/en/master/publications.html).
 
 <a name="contrib"></a>

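The three `run_alignn_ff.py` invocations added to the README differ only in the `--task` value. A minimal Python sketch of assembling those command lines follows; the helper name `build_command` is hypothetical and not part of the package, and only the script name, flags, and task names are taken from the README hunk above.

```python
import shlex

# Task names exactly as they appear in the README examples.
TASKS = ("unrelaxed_energy", "optimize", "ev_curve")


def build_command(file_path, task):
    """Return an argv list for one run_alignn_ff.py call (hypothetical helper)."""
    return ["run_alignn_ff.py", "--file_path", file_path, f"--task={task}"]


poscar = "alignn/examples/sample_data/POSCAR-JVASP-10.vasp"
commands = [build_command(poscar, task) for task in TASKS]

for cmd in commands:
    # shlex.join renders the argv list as a copy-pasteable shell line.
    print(shlex.join(cmd))
```

Each list could then be handed to `subprocess.run(cmd, check=True)` once the package and its scripts are installed.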
diff --git a/alignn/__init__.py b/alignn/__init__.py
@@ -1,2 +1,2 @@
 """Version number."""
-__version__ = "2022.10.23"
+__version__ = "2022.11.05"
diff --git a/alignn/config.py b/alignn/config.py
@@ -4,15 +4,14 @@
 from typing import Optional, Union
 import os
 from pydantic import root_validator
-
-# vfrom pydantic import Field, root_validator, validator
 from pydantic.typing import Literal
 from alignn.utils import BaseSettings
 from alignn.models.modified_cgcnn import CGCNNConfig
 from alignn.models.icgcnn import ICGCNNConfig
 from alignn.models.gcn import SimpleGCNConfig
 from alignn.models.densegcn import DenseGCNConfig
 from alignn.models.alignn import ALIGNNConfig
+from alignn.models.alignn_atomwise import ALIGNNAtomWiseConfig
 from alignn.models.dense_alignn import DenseALIGNNConfig
 from alignn.models.alignn_cgcnn import ACGCNNConfig
 from alignn.models.alignn_layernorm import ALIGNNConfig as ALIGNN_LN_Config
@@ -23,7 +22,7 @@
     VERSION = (
         subprocess.check_output(["git", "rev-parse", "HEAD"]).decode().strip()
     )
-except Exception as exp:
+except Exception:
     VERSION = "NA"
     pass
 
@@ -196,6 +195,7 @@ class TrainingConfig(BaseSettings):
     cutoff: float = 8.0
     max_neighbors: int = 12
     keep_data_order: bool = False
+    normalize_graph_level_loss: bool = False
     distributed: bool = False
     n_early_stopping: Optional[int] = None  # typically 50
     output_dir: str = os.path.abspath(".")  # typically 50
@@ -213,6 +213,7 @@ class TrainingConfig(BaseSettings):
         SimpleGCNConfig,
         DenseGCNConfig,
         ALIGNNConfig,
+        ALIGNNAtomWiseConfig,
         ALIGNN_LN_Config,
         DenseALIGNNConfig,
         ACGCNNConfig,

diff --git a/alignn/data.py b/alignn/data.py
@@ -12,7 +12,9 @@
 import numpy as np
 import pandas as pd
 from jarvis.core.atoms import Atoms
-from jarvis.core.graphs import Graph, StructureDataset
+from alignn.graphs import Graph, StructureDataset
+
+# from jarvis.core.graphs import Graph, StructureDataset
 from jarvis.db.figshare import data as jdata
 from torch.utils.data import DataLoader
 from tqdm import tqdm
@@ -55,7 +57,7 @@ def load_dataset(
     d = data
     if limit is not None:
         d = d[:limit]
-    d = pd.DataFrame(d)
+    # d = pd.DataFrame(d)
     # d = d.replace("na", np.nan)
     return d
 
@@ -67,7 +69,7 @@ def mean_absolute_deviation(data, axis=None):
 
 
 def load_graphs(
-    df: pd.DataFrame,
+    dataset=[],
     name: str = "dft_3d",
     neighbor_strategy: str = "k-nearest",
     cutoff: float = 8,
@@ -108,6 +110,8 @@ def atoms_to_graph(atoms):
     if cachefile is not None and cachefile.is_file():
         graphs, labels = dgl.load_graphs(str(cachefile))
     else:
+        df = pd.DataFrame(dataset)
+
         graphs = df["atoms"].progress_apply(atoms_to_graph).values
         if cachefile is not None:
             dgl.save_graphs(str(cachefile), graphs.tolist())
@@ -176,6 +180,9 @@ def get_torch_dataset(
     dataset=[],
     id_tag="jid",
     target="",
+    target_atomwise="",
+    target_grad="",
+    target_stress="",
     neighbor_strategy="",
     atom_features="",
     use_canonize="",
@@ -189,8 +196,9 @@ def get_torch_dataset(
 ):
     """Get Torch Dataset."""
     df = pd.DataFrame(dataset)
-    # print("df", df)
-    vals = df[target].values
+    # df['natoms']=df['atoms'].apply(lambda x: len(x['elements']))
+    print(" data df", df)
+    vals = np.array([ii[target] for ii in dataset])  # df[target].values
     print("data range", np.max(vals), np.min(vals))
     f = open(os.path.join(output_dir, tmp_name + "_data_range"), "w")
     line = "Max=" + str(np.max(vals)) + "\n"
@@ -207,11 +215,13 @@ def get_torch_dataset(
         cutoff=cutoff,
         max_neighbors=max_neighbors,
     )
-
     data = StructureDataset(
         df,
         graphs,
         target=target,
+        target_atomwise=target_atomwise,
+        target_grad=target_grad,
+        target_stress=target_stress,
         atom_features=atom_features,
         line_graph=line_graph,
         id_tag=id_tag,
@@ -224,6 +234,9 @@ def get_train_val_loaders(
     dataset: str = "dft_3d",
     dataset_array=[],
     target: str = "formation_energy_peratom",
+    target_atomwise: str = "",
+    target_grad: str = "",
+    target_stress: str = "",
     atom_features: str = "cgcnn",
     neighbor_strategy: str = "k-nearest",
     n_train=None,
@@ -447,6 +460,9 @@ def get_train_val_loaders(
             id_tag=id_tag,
             atom_features=atom_features,
             target=target,
+            target_atomwise=target_atomwise,
+            target_grad=target_grad,
+            target_stress=target_stress,
             neighbor_strategy=neighbor_strategy,
             use_canonize=use_canonize,
             name=dataset,
@@ -462,6 +478,9 @@ def get_train_val_loaders(
             id_tag=id_tag,
             atom_features=atom_features,
             target=target,
+            target_atomwise=target_atomwise,
+            target_grad=target_grad,
+            target_stress=target_stress,
             neighbor_strategy=neighbor_strategy,
             use_canonize=use_canonize,
             name=dataset,
@@ -477,6 +496,9 @@ def get_train_val_loaders(
             id_tag=id_tag,
             atom_features=atom_features,
             target=target,
+            target_atomwise=target_atomwise,
+            target_grad=target_grad,
+            target_stress=target_stress,
             neighbor_strategy=neighbor_strategy,
             use_canonize=use_canonize,
             name=dataset,
@@ -489,6 +511,7 @@ def get_train_val_loaders(
         )
 
         collate_fn = train_data.collate
+        # print("line_graph,line_dih_graph", line_graph, line_dih_graph)
         if line_graph:
             collate_fn = train_data.collate_line_graph
 

diff --git a/alignn/examples/sample_data_ff/config_example_atomwise.json b/alignn/examples/sample_data_ff/config_example_atomwise.json
@@ -0,0 +1,53 @@
+{
+    "version": "112bbedebdaecf59fb18e11c929080fb2f358246",
+    "dataset": "user_data",
+    "target": "target",
+    "atom_features": "cgcnn",
+    "neighbor_strategy": "k-nearest",
+    "id_tag": "jid",
+    "random_seed": 123,
+    "classification_threshold": null,
+    "n_val": null,
+    "n_test": null,
+    "n_train": null,
+    "train_ratio": 0.8,
+    "val_ratio": 0.1,
+    "test_ratio": 0.1,
+    "target_multiplication_factor": null,
+    "epochs": 3,
+    "batch_size": 2,
+    "weight_decay": 1e-05,
+    "learning_rate": 0.001,
+    "filename": "sample",
+    "warmup_steps": 2000,
+    "criterion": "l1",
+    "optimizer": "adamw",
+    "scheduler": "onecycle",
+    "pin_memory": false,
+    "save_dataloader": false,
+    "write_checkpoint": true,
+    "write_predictions": true,
+    "store_outputs": false,
+    "progress": true,
+    "log_tensorboard": false,
+    "standard_scalar_and_pca": false,
+    "use_canonize": false,
+    "num_workers": 0,
+    "cutoff": 8.0,
+    "max_neighbors": 12,
+    "keep_data_order": false,
+    "model": {
+        "name": "alignn_atomwise",
+        "atom_input_features": 92,
+        "calculate_gradient":true,
+        "atomwise_output_features":3,
+        "alignn_layers":4,
+        "gcn_layers":4,
+        "output_features": 1,
+        "graphwise_weight":0.85,
+        "gradwise_weight":0.05,
+        "atomwise_weight":0.05,
+        "stresswise_weight":0.05
+
+    }
+}
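
A quick way to sanity-check a config like the one above is to load it and confirm that the split ratios add up to one. This is a sketch only: the helpers `check_split_ratios` and `loss_weights` are hypothetical, the embedded JSON is a trimmed copy of the example config, and nothing in this diff shows that the package performs the same check.

```python
import json
import math

# Trimmed copy of config_example_atomwise.json (only the keys used here).
config_text = """
{
    "train_ratio": 0.8,
    "val_ratio": 0.1,
    "test_ratio": 0.1,
    "model": {
        "name": "alignn_atomwise",
        "graphwise_weight": 0.85,
        "gradwise_weight": 0.05,
        "atomwise_weight": 0.05,
        "stresswise_weight": 0.05
    }
}
"""


def check_split_ratios(config):
    """Return True if train/val/test ratios sum to 1 within float tolerance."""
    total = config["train_ratio"] + config["val_ratio"] + config["test_ratio"]
    return math.isclose(total, 1.0)


def loss_weights(config):
    """Collect the model entries whose key ends in '_weight'."""
    return {k: v for k, v in config["model"].items() if k.endswith("_weight")}


config = json.loads(config_text)
weights = loss_weights(config)
print(check_split_ratios(config), weights)
```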
diff --git a/alignn/examples/sample_data_ff/id_prop.json b/alignn/examples/sample_data_ff/id_prop.json
diff --git a/alignn/ff/best_model.pt b/alignn/ff/best_model.pt
diff --git a/alignn/ff/config.json b/alignn/ff/config.json
@@ -0,0 +1,64 @@
+{
+    "version": "112bbedebdaecf59fb18e11c929080fb2f358246",
+    "dataset": "user_data",
+    "target": "target",
+    "atom_features": "cgcnn",
+    "neighbor_strategy": "k-nearest",
+    "id_tag": "jid",
+    "random_seed": 123,
+    "classification_threshold": null,
+    "n_val": null,
+    "n_test": null,
+    "n_train": null,
+    "train_ratio": 0.9,
+    "val_ratio": 0.05,
+    "test_ratio": 0.05,
+    "target_multiplication_factor": null,
+    "epochs": 500,
+    "batch_size": 16,
+    "weight_decay": 1e-05,
+    "learning_rate": 0.001,
+    "filename": "sample",
+    "warmup_steps": 2000,
+    "criterion": "l1",
+    "optimizer": "adamw",
+    "scheduler": "onecycle",
+    "pin_memory": false,
+    "save_dataloader": false,
+    "write_checkpoint": true,
+    "write_predictions": true,
+    "store_outputs": false,
+    "progress": true,
+    "log_tensorboard": false,
+    "standard_scalar_and_pca": false,
+    "use_canonize": false,
+    "num_workers": 0,
+    "cutoff": 8.0,
+    "max_neighbors": 12,
+    "keep_data_order": false,
+    "normalize_graph_level_loss": false,
+    "distributed": false,
+    "n_early_stopping": null,
+    "output_dir": "out_continue",
+    "model": {
+        "name": "alignn_atomwise",
+        "alignn_layers": 4,
+        "gcn_layers": 4,
+        "atom_input_features": 92,
+        "edge_input_features": 80,
+        "triplet_input_features": 40,
+        "embedding_features": 64,
+        "hidden_features": 256,
+        "output_features": 1,
+        "grad_multiplier": -1,
+        "calculate_gradient": true,
+        "atomwise_output_features": 3,
+        "graphwise_weight": 1.0,
+        "gradwise_weight": 10.0,
+        "stresswise_weight": 0.0,
+        "atomwise_weight": 0.0,
+        "link": "identity",
+        "zero_inflated": false,
+        "classification": false
+    }
+}
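
The four `*_weight` entries suggest that the training objective mixes graph-level, gradient, stress, and atomwise terms. The sketch below assumes a plain weighted sum, which this diff does not confirm because the model code is not shown; `combine_losses` and the per-term loss values are invented for illustration.

```python
import math


def combine_losses(losses, weights):
    """Weighted sum of per-term losses (assumed form, not the package's code)."""
    return sum(weights[name] * losses[name] for name in weights)


# Weights copied from alignn/ff/config.json.
ff_weights = {
    "graphwise": 1.0,
    "gradwise": 10.0,
    "stresswise": 0.0,
    "atomwise": 0.0,
}

# Made-up per-term loss values for a single batch.
example_losses = {
    "graphwise": 0.02,
    "gradwise": 0.05,
    "stresswise": 0.30,
    "atomwise": 0.40,
}

total = combine_losses(example_losses, ff_weights)
print(total)
```

Under this assumption, the zero stress and atomwise weights drop those terms entirely, and the gradient term dominates the total.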