# `oganesson` tutorial

`oganesson` helps materials scientists generate, modify, analyze and featurize structures, create structure datasets, create nudged elastic band (NEB) diffusion pathways along with `VASP` input files, and evolve structures using ML structure optimizers.

## Structural manipulation and preparation

The `OgStructure` class lets you perform various structure manipulation operations:

- You can adsorb an atom on a surface by calling the `add_atom_to_surface(atom)` method:

In [None]:
from oganesson.ogstructure import OgStructure

og=OgStructure(file_name='examples/structures/MoS2.vasp')
og.add_atom_to_surface('Li').structure.to('MoS2_Li.vasp','poscar')

- You can scan for multiple adsorption positions on a surface by using the `adsorption_scanner(atom)` method:

In [None]:
from oganesson.ogstructure import OgStructure
from ase.build import surface
s = surface('Au', (1, 1, 1), 9)
s.center(vacuum=10, axis=2)

og = OgStructure(s)
og.add_atom_to_surface('O').structure.to('Au_O.vasp', 'poscar')

og = OgStructure(s)
ad_structures = og.adsorption_scanner('O')
# Write each candidate adsorption structure to its own POSCAR file.
for counter, ad_structure in enumerate(ad_structures, start=1):
    ad_structure().to(str(counter) + '.vasp', 'poscar')

- You can add an interstitial defect to a structure by calling the `add_interstitial(atom)` method:

In [None]:
from oganesson.ogstructure import OgStructure
s = OgStructure(file_name='examples/structures/Li3PO4_mp-13725.cif')
s.add_interstitial('H')().to('H_interstitial.cif','cif')

- You can generate the full set of VASP `POSCAR` input files for the NEB diffusion pathway of an atom in the structure by calling the `generate_neb()` method:

In [None]:
from oganesson.ogstructure import OgStructure
og = OgStructure(file_name='examples/structures/Li3PO4_mp-13725.cif')
og.generate_neb('Li', r=3)
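
For intuition, the starting images of an NEB path are commonly built by linearly interpolating the moving atom between its initial and final sites. The sketch below shows that idea in plain Python. It is not `oganesson`'s implementation: `interpolate_images` and the coordinates are hypothetical, and periodic boundary conditions are ignored.

```python
def interpolate_images(start, end, n_images):
    """Return n_images positions evenly spaced between start and end (endpoints excluded)."""
    images = []
    for i in range(1, n_images + 1):
        t = i / (n_images + 1)  # fraction of the way along the path
        images.append(tuple(s + t * (e - s) for s, e in zip(start, end)))
    return images

# Hypothetical Cartesian coordinates (in angstrom) of two neighbouring Li sites.
li_start = (0.0, 0.0, 0.0)
li_end = (4.0, 0.0, 2.0)

for image in interpolate_images(li_start, li_end, n_images=3):
    print(image)
```

Each printed tuple is the position of the diffusing atom in one intermediate image; in a real NEB calculation every image is a full structure that is subsequently relaxed.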


- To generate an alloy supercell with multiple components based on a template supercell, you can use the `substitutions_random()` method to randomly replace atoms of a given element in the supercell with other atoms in specified amounts.

In [None]:
from oganesson.genetic_algorithms import GA
from oganesson.ogstructure import OgStructure
from ase.build import bulk

Cu = bulk('Cu', 'fcc', a=3.6)
structure = OgStructure(Cu.repeat([4,4,4]))
structure.substitutions_random('Cu',{'Al':16, 'Cr':16, 'Ti':16, 'V':16})
structure()
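
The idea behind random substitution can be sketched on a plain list of species labels. This is only an illustration under stated assumptions, not the code behind `substitutions_random()`: the helper name `substitute_randomly` and its `seed` parameter are hypothetical.

```python
import random

def substitute_randomly(species, target, replacements, seed=None):
    """Replace `target` entries of `species` at random sites according to the requested counts."""
    rng = random.Random(seed)
    sites = [i for i, s in enumerate(species) if s == target]
    total = sum(replacements.values())
    if total > len(sites):
        raise ValueError('more substitutions requested than available sites')
    chosen = rng.sample(sites, total)  # distinct random site indices
    result = list(species)
    position = 0
    for element, count in replacements.items():
        for index in chosen[position:position + count]:
            result[index] = element
        position += count
    return result

# A 64-site Cu template, mirroring the 4x4x4 supercell above.
alloy = substitute_randomly(['Cu'] * 64, 'Cu', {'Al': 16, 'Cr': 16, 'Ti': 16, 'V': 16}, seed=0)
print({element: alloy.count(element) for element in sorted(set(alloy))})
```

Sampling all site indices at once guarantees that no site is substituted twice, so the requested counts are met exactly.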

- To extract dynamics data from VASP `OUTCAR` files, you can rely on the `OUTCAR` file alone, assuming that the first line of the `POSCAR` used in the calculation listed the species correctly and that there were not too many of them. However, it is better to obtain the species from the `POSCAR` file used in the calculation, so typically you should provide both files to the `Outcar` class by passing the `poscar_file` parameter to the constructor.

In [None]:
from oganesson.io.vasp import Outcar

outcar = Outcar('examples/')
outcar.write_md_data()

A very handy method in `OgStructure` is `create_ripple()`, which generates a sinusoidal (rippled) 2D structure when your `OgStructure` is the unit cell of a 2D material. The `strain` argument specifies how much lateral strain is applied to the sheet to force it to buckle.

If you run `create_ripple()` with `relax=False`, it will generate a quick wave structure by moving the atoms to positions along a sine graph. This can be OK for flat 2D materials like graphene, but not for buckled ones like MoS2. In the latter case, it is better to call `create_ripple()` with `relax=True`. This will trigger the relaxation of intermediate structures using the m3gnet optimizer. You can set the number of intermediate structures via the argument `steps`.

If you want to observe the intermediate structures, you can set `write_intermediate=True`.

In [None]:
from oganesson.ogstructure import OgStructure

MoS2 = OgStructure(file_name='examples/structures/MoS2.vasp')
MoS2 = MoS2.create_ripple(axis='x',units=30,strain=0.8,steps=10,relax=True)
MoS2().to('MoS2ripple_relaxed.cif')
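
The unrelaxed (`relax=False`) construction described above, where atoms are moved onto a sine curve, can be sketched as follows. This is a simplified illustration, not the code behind `create_ripple()`: the function name, the `amplitude` parameter, and the reading of `strain` as a fractional compression along x are all assumptions.

```python
import math

def ripple_positions(positions, length, strain, amplitude):
    """Compress a flat sheet along x and displace its atoms along z on a single sine period."""
    new_length = length * (1.0 - strain)  # sheet length after lateral compression
    rippled = []
    for x, y, z in positions:
        x_new = x * (1.0 - strain)
        z_new = z + amplitude * math.sin(2.0 * math.pi * x_new / new_length)
        rippled.append((x_new, y, z_new))
    return rippled, new_length

# Hypothetical flat chain of 8 atoms spaced 1 angstrom apart along x.
flat = [(float(i), 0.0, 0.0) for i in range(8)]
rippled, new_length = ripple_positions(flat, length=8.0, strain=0.2, amplitude=1.0)
for atom in rippled:
    print(tuple(round(c, 3) for c in atom))
```

Because this mapping does not preserve bond lengths, it is only a starting guess, which is why relaxing the intermediate structures matters for buckled materials.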

## Calculation of the structural properties

- You can obtain the radial distribution function (RDF) using the `get_rdf()` method, which is borrowed from the ASE library.

In [None]:
from oganesson.ogstructure import OgStructure
from ase.build import bulk

Cu = bulk('Cu', 'fcc', a=3.6)
structure = OgStructure(Cu.repeat([4,4,4]))
structure.substitutions_random('Cu',{'Al':16, 'Cr':16, 'Ti':16, 'V':16})
print(structure.get_rdf(rmax=4,nbins=100,elements=[13,13])[0])
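
The raw ingredient of an RDF is a histogram of interatomic distances. The sketch below builds such a histogram for a small cluster. It is a simplified illustration rather than the ASE routine: it ignores periodic images and skips the shell-volume normalization, and `pair_distance_histogram` is a hypothetical helper.

```python
import math

def pair_distance_histogram(positions, rmax, nbins):
    """Count interatomic distances below rmax in nbins equal-width bins."""
    counts = [0] * nbins
    width = rmax / nbins
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            d = math.dist(positions[i], positions[j])
            if d < rmax:
                # Clamp to the last bin to guard against floating-point round-up.
                counts[min(int(d / width), nbins - 1)] += 1
    return counts

# Hypothetical square of four atoms with a 2 angstrom edge.
square = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (2.0, 2.0, 0.0)]
print(pair_distance_histogram(square, rmax=4.0, nbins=8))
```

The four edges fall into the bin starting at 2.0 and the two diagonals into the bin starting at 2.5.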

- You can plot the simulated XRD for your structure using the `xrd()` function.

In [None]:
from oganesson.ogstructure import OgStructure
from ase.build import bulk

Cu = bulk('Cu', 'fcc', a=3.6)
structure = OgStructure(Cu.repeat([4,4,4]))
structure.substitutions_random('Cu',{'Al':16, 'Cr':16, 'Ti':16, 'V':16})
structure.xrd()

# Machine learning

`oganesson` supports machine learning workflows by providing quick and easy tools to generate machine learning descriptors for materials.

## Machine learning descriptors

The following demonstrates how to convert a `CIF` file to a descriptor vector using the `BACD` descriptors and the `SineMatrix` features from `DScribe`:

In [None]:
from oganesson.descriptors import BACD, DScribeSineMatrix
from oganesson.ogstructure import OgStructure


bacd = BACD(OgStructure(file_name='examples/structures/Li3PO4_mp-13725.cif'))
print(bacd.describe())

sf = DScribeSineMatrix(OgStructure(file_name='examples/structures/Li3PO4_mp-13725.cif'))
print(sf.describe())


To make sure that the descriptors are indeed invariant with respect to translation, rotation and substitution, here is a test:

In [None]:
from oganesson.descriptors import BACD
from oganesson.ogstructure import OgStructure
bacd = BACD(OgStructure(file_name='examples/structures/Li3PO4_mp-13725.cif'))
print(bacd.is_invariant())
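
As a self-contained illustration of what such an invariance check involves, the sketch below applies a rigid rotation and translation to a small cluster and compares a toy descriptor before and after. The descriptor here (sorted interatomic distances) is a stand-in chosen for simplicity, not `BACD`, and all function names are hypothetical.

```python
import math

def sorted_distances(positions):
    """Toy descriptor: sorted list of all interatomic distances."""
    distances = []
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            distances.append(math.dist(positions[i], positions[j]))
    return sorted(distances)

def rotate_z_and_translate(positions, angle, shift):
    """Rotate positions about the z axis by `angle` (radians), then translate by `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
            for x, y, z in positions]

def is_invariant(positions, tolerance=1e-8):
    """Compare the descriptor before and after a rigid transformation."""
    moved = rotate_z_and_translate(positions, angle=0.7, shift=(1.0, -2.0, 0.5))
    before, after = sorted_distances(positions), sorted_distances(moved)
    return all(abs(a - b) < tolerance for a, b in zip(before, after))

# Hypothetical four-atom cluster (coordinates in angstrom).
cluster = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.5, 0.0), (0.3, 0.2, 2.0)]
print(is_invariant(cluster))
```

A descriptor built directly from raw Cartesian coordinates would fail this check, which is why invariance is worth testing explicitly.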

## Generation of machine learning datasets

If you need to featurize materials in bulk, here is an example of how to do that. The code extracts stable materials with the `ABO3` formula (the perovskite stoichiometry) from the Materials Project database via an API request. Note: the API key is hidden; replace the `YOUR_API_KEY` placeholder with your own key.

In [None]:
import requests
from oganesson.descriptors import Describe, BACD
from oganesson.ogstructure import OgStructure
from pymatgen.core.structure import Structure
import pandas as pd

headers = {
    'accept': 'application/json',
    'X-API-KEY': 'YOUR_API_KEY'  # placeholder: never commit a real key
}
materials_summary = requests.get('https://api.materialsproject.org/materials/summary/?formula=ABO3&deprecated=false&_per_page=1000&_skip=0&_limit=1000&_all_fields=true&is_stable=true', headers=headers)
materials_summary = materials_summary.json()['data']
datasets = {'material_ids':[],'structures':[],'bacd':[],'formation_energy_per_atom':[]}
for material in materials_summary:
    structure = OgStructure(Structure.from_dict(material['structure']))
    datasets['material_ids'].append(material['material_id'])
    datasets['structures'].append(structure)
    datasets['bacd'].append(Describe.describe(structure, descriptor=BACD))
    datasets['formation_energy_per_atom'].append(material['formation_energy_per_atom'])
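
One way to keep the API key out of the notebook is to read it from an environment variable. The sketch below is one possible approach; the helper `build_headers` and the variable name `MP_API_KEY` are assumptions made for illustration, not part of `oganesson`.

```python
import os

def build_headers(env_var='MP_API_KEY'):
    """Build the request headers, reading the API key from an environment variable."""
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f'Set the {env_var} environment variable to your API key')
    return {'accept': 'application/json', 'X-API-KEY': api_key}
```

With the variable set in your shell, `headers = build_headers()` can replace the hard-coded dictionary in the request above.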


We can now use the dataset to train an `Xgboost` ML model to predict the formation energy per atom (the only target quantity extracted above). The following code demonstrates such training using the `joltml` package: the first 800 entries are used for training with an 80/20 split, and predictions are then made for the remaining entries. Even with such a limited dataset, the prediction results are pretty good. The results are logged in the `jolt_lab` folder.

In [None]:

from joltml import Experiment, Xgboost

training_set = pd.DataFrame(datasets['bacd'])
experiment = Experiment(training_set.iloc[:800],experiment_id='bacd')
with_models = experiment.add_models([Xgboost()])
targets = pd.DataFrame(datasets['formation_energy_per_atom'][:800])
y = with_models.regression(targets=targets, splits=[0.8,0.2]).predict(training_set.iloc[800:])

# Performing AIMD simulations

Here we subject a lithium-rich material, LGPS (Li10GeP2S12), to a temperature of 1000 K, and then calculate the diffusivities of its elements. `m3gnet` is applied as the ML structure optimizer, and is currently not reliable for AIMD simulations. More work is being done to build better ML optimizers.

The following example performs a 1000-step AIMD simulation of LGPS, and then computes the diffusivity of the elements using the `diffusivity` package.

In [None]:
from oganesson.ogstructure import OgStructure
structure = OgStructure(file_name='examples/structures/LGPS_ChemMater_2018_30_4995_Opt.cif')
structure.simulate(temperature=1000,steps=1000,loginterval=1)
coeffs = structure.calculate_diffusivity()
print('Diffusion coefficients:', coeffs)
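
Diffusion coefficients are commonly estimated from the mean squared displacement (MSD) through the Einstein relation, MSD(t) = 2 d D t, where d is the number of dimensions. The sketch below fits the slope of synthetic MSD data; it illustrates the relation only and makes no claim about how the `diffusivity` package works internally. The function name and data are hypothetical.

```python
def diffusion_coefficient(times, msd, dimensions=3):
    """Estimate D from the least-squares slope of MSD versus time (MSD = 2 * dimensions * D * t)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_msd = sum(msd) / n
    covariance = sum((t - mean_t) * (m - mean_msd) for t, m in zip(times, msd))
    variance = sum((t - mean_t) ** 2 for t in times)
    slope = covariance / variance
    return slope / (2 * dimensions)

# Synthetic data: MSD grows as 6 * D * t with D = 0.05 A^2/ps.
times = [0.0, 1.0, 2.0, 3.0, 4.0]      # ps
msd = [6 * 0.05 * t for t in times]    # A^2
print(diffusion_coefficient(times, msd))
```

In practice the fit should use only the linear (diffusive) part of the MSD curve, which a short 1000-step trajectory may not reach.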

# Genetic algorithms

Using `m3gnet` as a structure optimizer as well as a calculator for the fitness function (the optimized total energy of a structure), `oganesson` generates structure populations based on the `ase.ga` library. The following example shows how the equilibrium structure of NaH can be found. All that is needed is to specify the species and their counts, create a `GA` object, and call `evolve()`.

First, we create the initial population. Note, this will not relax any structure yet.

In [None]:
from oganesson.genetic_algorithms import GA
ga = GA(species=['Na']*4 + ['H']*4, rmax=20, population_size=10)

Next, you evolve the population.

In [None]:
for i in range(10):
    ga.evolve(num_offsprings=5)

You can also create your own random population and evolve it. For example, you can use the `substitutions()` method in `OgStructure` to create candidate alloy materials based on a specific mix of elements in a given crystal structure, and then evolve that population.

In [None]:
from oganesson.genetic_algorithms import GA
from oganesson.ogstructure import OgStructure
from ase.build import bulk

Cu = bulk('Cu', 'fcc', a=3.6)
structure = OgStructure(Cu.repeat([2,2,2]))
print(structure())
ga = GA(population=structure.substitutions('Cu',{'Fe':4,'Cu':4}))
for i in range(10):
    ga.evolve()