# LOAD-AtoMS

_**L**arge **O**pen **A**ccess **D**atasets for **Ato**mistic **M**aterials **S**cience_

This demo notebook shows how to use the `load-atoms` package to download and work with datasets of atomic structures.

The main function of the package is `load_dataset`:

In [1]:
from load_atoms import load_dataset

dataset = load_dataset("C-GAP-17")

Dataset C-GAP-17 not found. Downloading...
45.8MiB [00:00, 54.9MiB/s]
Loaded C-GAP-17, containing 4,080 structures and 256,628 atoms.
The use of this dataset is licensed under https://creativecommons.org/licenses/by-nc-sa/4.0/.

Datasets are just lists of `ase.Atoms` objects, so they can be indexed directly:

In [2]:
dataset[0]

Atoms(symbols='C64', pbc=True, cell=[9.483921, 9.483921, 9.483921], force=..., frac_pos=..., calculator=SinglePointCalculator(...))

`load_atoms` also exposes some useful functions for working with datasets:

In [3]:
from load_atoms import filter_by

bulk_amo = filter_by(dataset, config_type="bulk_amo")
small = filter_by(dataset, lambda atoms: len(atoms) < 64)
len(bulk_amo), len(small)

(3070, 1283)
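To make the two calling styles above concrete (a keyword match and an arbitrary predicate), here is a minimal sketch of how such filtering could work. It is not the implementation used by `load_atoms`: `Structure` and `filter_structures` are hypothetical names, `Structure` is a plain-Python stand-in for `ase.Atoms`, and the sketch assumes that keyword arguments are compared against each structure's `info` dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Structure:
    """Hypothetical stand-in for ase.Atoms: a list of symbols plus an info dict."""
    symbols: list
    info: dict = field(default_factory=dict)

    def __len__(self):
        return len(self.symbols)


def filter_structures(structures, *predicates, **info_matches):
    """Keep structures that pass every predicate and match every info key/value.

    Assumption: keyword arguments are looked up in `structure.info`.
    """
    def keep(s):
        return all(p(s) for p in predicates) and all(
            s.info.get(key) == value for key, value in info_matches.items()
        )

    return [s for s in structures if keep(s)]


# Three toy structures instead of the real 4,080-structure dataset.
toy = [
    Structure(["C"] * 64, {"config_type": "bulk_amo"}),
    Structure(["C"] * 2, {"config_type": "dimer"}),
    Structure(["C"] * 32, {"config_type": "bulk_amo"}),
]

bulk_amo = filter_structures(toy, config_type="bulk_amo")
small = filter_structures(toy, lambda s: len(s) < 64)
print(len(bulk_amo), len(small))  # 2 2
```

Because the result is again a plain list, filters can be chained or combined in a single call, for example `filter_structures(toy, lambda s: len(s) < 64, config_type="bulk_amo")`.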

In [4]:
from load_atoms import info

info("C-GAP-17")

# C-GAP-17 

## Description

Complete training dataset for the C-GAP-17 model. For details, see the supplementary information [here](https://www.repository.cam.ac.uk/handle/1810/262814).

## Citation

```
@article{Deringer-17,
    title = {Machine learning based interatomic potential for amorphous carbon},
    doi = {10.1103/PhysRevB.95.094203},
    volume = {95},
    number = {9},
    urldate = {2021-07-15},
    journal = {Physical Review B},
    author = {Deringer, Volker L. and Cs{\'a}nyi, G{\'a}bor},    
    year = {2017},
    pages = {094203},
}
```

## License

https://creativecommons.org/licenses/by-nc-sa/4.0/


In [5]:
from load_atoms import cross_validate_split

train, test = cross_validate_split(dataset, fold=0, folds=5, seed=42)
len(train), len(test)

(3264, 816)
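The sizes above are what a 5-fold split of 4,080 structures should give: one fifth (816) held out for testing and the remaining 3,264 used for training. The sketch below illustrates one common way to build such a seeded k-fold split using only the standard library. `k_fold_split` is a hypothetical helper, not the `load_atoms` implementation, and the actual shuffling scheme used by `cross_validate_split` may differ.

```python
import random


def k_fold_split(items, fold, folds, seed):
    """Return (train, test), where test is the `fold`-th of `folds` shuffled chunks."""
    if not 0 <= fold < folds:
        raise ValueError("fold must satisfy 0 <= fold < folds")
    indices = list(range(len(items)))
    # The same seed gives the same shuffle for every fold, so the folds never overlap.
    random.Random(seed).shuffle(indices)
    test_idx = set(indices[fold::folds])  # every `folds`-th shuffled index
    train = [x for i, x in enumerate(items) if i not in test_idx]
    test = [x for i, x in enumerate(items) if i in test_idx]
    return train, test


items = list(range(4080))  # integers standing in for the 4,080 structures
train, test = k_fold_split(items, fold=0, folds=5, seed=42)
print(len(train), len(test))  # 3264 816
```

Fixing the seed and varying only `fold` means every structure appears in exactly one test set across the five folds.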