# Primer on AiiDA

## * AiiDA stands for <u>A</u>utomated <u>I</u>nteractive <u>I</u>nfrastructure and <u>Da</u>tabase for Computational Science<br>
## * Great for reproducibility

#### AiiDA is a database composed of <b>Process Nodes</b> and <b>Data Nodes</b><br>
#### The <b>Nodes</b> are connected by directed <b>edges</b>, also called <b>links</b>

<img src="fig1.jpg" width="800" height="400">

<div style="text-align: center"><span style="font-size:0.9em;">Computational Materials Science
Volume 111, January 2016, Pages 218-230</span></div>

Here <b>Process Nodes</b> are represented by squares.<br>
They orchestrate the calculations by gathering the necessary input data and outputting the results<br><br>
The <b>Data Nodes</b> are represented by circles and store inputs and/or outputs of the calculations<br><br>
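To make the node and link picture concrete, here is a toy sketch in plain Python. It is not AiiDA's API: `ToyNode`, `ToyGraph`, and the pk values 1 to 3 are hypothetical names chosen only for illustration. It shows a process node sitting between an input data node and an output data node, connected by directed links.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToyNode:
    """Hypothetical stand-in for a node: primary key, kind, and label."""
    pk: int
    kind: str  # "process" or "data"
    label: str


@dataclass
class ToyGraph:
    """Nodes plus directed links stored as (source_pk, target_pk) pairs."""
    nodes: dict = field(default_factory=dict)
    links: list = field(default_factory=list)

    def add_node(self, node):
        self.nodes[node.pk] = node

    def add_link(self, source_pk, target_pk):
        self.links.append((source_pk, target_pk))

    def inputs_of(self, pk):
        # Nodes with a link pointing into the given node
        return [self.nodes[s] for s, t in self.links if t == pk]

    def outputs_of(self, pk):
        # Nodes that the given node points to
        return [self.nodes[t] for s, t in self.links if s == pk]


graph = ToyGraph()
graph.add_node(ToyNode(1, "data", "input structure"))
graph.add_node(ToyNode(2, "process", "calculation"))
graph.add_node(ToyNode(3, "data", "results"))
graph.add_link(1, 2)  # data -> process (input)
graph.add_link(2, 3)  # process -> data (output)

print([n.label for n in graph.inputs_of(2)])   # ['input structure']
print([n.label for n in graph.outputs_of(2)])  # ['results']
```

Note that the two data nodes are never linked to each other directly; they are only connected through the process node in between.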

### Example of <b>Node</b> attributes

<img src="fig3.jpg" width="800" height="400">

<div style="text-align: center"><span style="font-size:0.9em;">Computational Materials Science
Volume 111, January 2016, Pages 218-230</span></div>

Here we have a <b>Process Node</b> with the <u>Primary Key (pk)</u>: 7<br>
This node contains the information needed to run some kind of calculation.<br><br>
The second node with pk: 8 is a <b>Data Node</b> with the results of a calculation

### An example of a Calculation

<img src="fig4.jpg" width="800" height="400">

<div style="text-align: center"><span style="font-size:0.9em;">Computational Materials Science
Volume 111, January 2016, Pages 218-230</span></div>

### Overview of AiiDA

<img src="fig2.jpg" width="800" height="400">

<div style="text-align: center"><span style="font-size:0.9em;">Computational Materials Science
Volume 111, January 2016, Pages 218-230</span></div>

There are three ways of using the AiiDA database. Here we'll only look at the link between the AiiDA API and the Storage (to retrieve data) and omit the AiiDA daemon from our discussion.

First, install the necessary prerequisites and the AiiDA Python package (`aiida-core`, via pip); see https://aiida.readthedocs.io/projects/aiida-core/en/latest/intro/get_started.html <br>
(I'm not sure if installing PostgreSQL and RabbitMQ is necessary if we're only looking to access results from a database)

In [101]:
from aiida import load_profile
from aiida.orm import CifData, Data, Dict, Group, Node, QueryBuilder, load_node
import pandas as pd

# Load the default AiiDA profile so that queries can reach its storage
load_profile()

<aiida.manage.configuration.profile.Profile at 0x7f775273e150>

## We'll focus on the database created by Ongari et al. (https://pubs.acs.org/doi/full/10.1021/acscentsci.0c00988 and https://www.materialscloud.org/discover/curated-cofs and https://archive.materialscloud.org/record/2020.107)

<img src="fig5.jpeg" width="1600" height="800">

<div style="text-align: center"><span style="font-size:0.9em;">
ACS Cent. Sci. 2020, XXXX, XXX, XXX-XXX
</span></div>

## Querying (more on querying and filtering here https://aiida.readthedocs.io/projects/aiida-core/en/latest/topics/database.html)

In [33]:
qb = QueryBuilder()
qb.append(Dict)
qb.limit(10)
results = qb.all()

In [36]:
results[0][0].attributes

{'framework_1': {'general': {'box_ax_dev': 0.0,
   'box_by_dev': 0.0,
   'box_cz_dev': 0.0,
   'box_ax_unit': 'A^3',
   'box_by_unit': 'A^3',
   'box_cz_unit': 'A^3',
   'energy_unit': 'kJ/mol',
   'box_beta_dev': 0.0,
   'box_alpha_dev': 0.0,
   'box_beta_unit': 'degrees',
   'box_gamma_dev': 0.0,
   'box_alpha_unit': 'degrees',
   'box_ax_average': 54.8488,
   'box_by_average': 27.41776,
   'box_cz_average': 26.97482,
   'box_gamma_unit': 'degrees',
   'cell_volume_dev': 0.0,
   'box_beta_average': 58.8519,
   'cell_volume_unit': 'A^3',
   'box_alpha_average': 89.1655,
   'box_gamma_average': 90.7951,
   'exceeded_walltime': False,
   'framework_density': '747.682946025899',
   'cell_volume_average': 40565.57463,
   'adsorbate_density_dev': 0.51053,
   'adsorbate_density_unit': 'kg/m^3',
   'energy_ads/ads_tot_dev': 0.50066298811601,
   'energy_ads/ads_vdw_dev': 0.50604161547211,
   'framework_density_unit': 'kg/m^3',
   'energy_host/ads_tot_dev': 2.4135866486332,
   'energy_host/ads

In [37]:
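qb = QueryBuilder()
qb.append(Dict, tag='result_dictionary')
# with_outgoing asks for CifData nodes with a direct link to the Dict.
# Data nodes are linked only through process nodes, so nothing matches.
qb.append(CifData, with_outgoing='result_dictionary')
qb.limit(10)
results = qb.all()
results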

[]

In [17]:
qb = QueryBuilder()
qb.append(CifData, tag='cif_structure', filters={'uuid': '2a2a4822-aaba-4de5-bbae-dba4ccb7d191'})
qb.append(Dict, with_ancestors='cif_structure')
qb.limit(10)
results = qb.all()
results

[[<Dict: uuid: dab00f6a-a97a-43ac-a8bd-a42e78754164 (pk: 481548)>],
 [<Dict: uuid: 65e7f083-00a0-41e4-b4fb-d9238e8d6d71 (pk: 246430)>],
 [<Dict: uuid: 71a85f9f-a96d-4404-b8c8-4b902c374315 (pk: 270035)>],
 [<Dict: uuid: 59e0e875-3427-4ac1-8684-24585150c022 (pk: 222434)>],
 [<Dict: uuid: 59e0e875-3427-4ac1-8684-24585150c022 (pk: 222434)>],
 [<Dict: uuid: 152d33c3-7e74-4190-82cc-ea866f9f5692 (pk: 84588)>],
 [<Dict: uuid: 152d33c3-7e74-4190-82cc-ea866f9f5692 (pk: 84588)>],
 [<Dict: uuid: 6a95eda9-55fb-4683-8547-71bdfb5a881e (pk: 255909)>],
 [<Dict: uuid: 152d33c3-7e74-4190-82cc-ea866f9f5692 (pk: 84588)>],
 [<Dict: uuid: 6a95eda9-55fb-4683-8547-71bdfb5a881e (pk: 255909)>]]
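The output above repeats some rows (pk 84588 appears three times, for example), presumably because the same descendant can be reached along more than one path. A minimal sketch of dropping the repeats while keeping the original order is shown here; the pk values are copied from the output and used as stand-ins for the node objects, and `unique_in_order` is a hypothetical helper. AiiDA's `QueryBuilder` also offers a `distinct()` method that may do the same at the query level.

```python
# pk values copied from the query output, in the order returned
pks = [481548, 246430, 270035, 222434, 222434,
       84588, 84588, 255909, 84588, 255909]


def unique_in_order(items):
    """Return items with repeats removed, preserving first-seen order."""
    seen = set()
    unique = []
    for item in items:
        if item not in seen:
            seen.add(item)
            unique.append(item)
    return unique


unique_pks = unique_in_order(pks)
print(len(pks), len(unique_pks))  # 10 6
print(unique_pks)
```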

In [31]:
results

[[{'md5': 'e409872be2d869073a20b1a8f68ccae0',
   'filename': '16480N2.cif',
   'formulae': None,
   'scan_type': 'standard',
   'parse_policy': 'lazy',
   'spacegroup_numbers': None},
  {'Density': 0.58249,
   'Input_ha': 'DEF',
   'POAV_A^3': 2357.67,
   'isotherm': {'pressure': [1, 5, 10, 20, 30, 50, 80, 100, 140, 200],
    'pressure_unit': 'bar',
    'loading_absolute_dev': [0.0065109909529743,
     0.043252400216982,
     0.074954614137809,
     0.10160498687647,
     0.082171181833133,
     0.12954829764376,
     0.091573803512162,
     0.073603364552317,
     0.063229164814447,
     0.14498232845544],
    'loading_absolute_unit': 'mol/kg',
    'loading_absolute_average': [0.24622244104334,
     1.1741308675562,
     2.2198037282031,
     4.1451062819997,
     5.8225738003748,
     8.4799041127392,
     11.335709658305,
     12.757324213904,
     14.995289898901,
     17.26405703741],
    'enthalpy_of_adsorption_dev': [0.19284141910487,
     0.26572547454541,
     0.17861252317644
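The `isotherm` entry stores each quantity as a separate list, with one value per pressure point. A minimal sketch of pairing these lists into one row per pressure is shown here, using the values from the output above; `isotherm_rows` is a hypothetical helper, and only the keys visible in the output are assumed to exist.

```python
# Values copied from the 'isotherm' dictionary in the output above
isotherm = {
    "pressure": [1, 5, 10, 20, 30, 50, 80, 100, 140, 200],
    "pressure_unit": "bar",
    "loading_absolute_unit": "mol/kg",
    "loading_absolute_average": [
        0.24622244104334, 1.1741308675562, 2.2198037282031,
        4.1451062819997, 5.8225738003748, 8.4799041127392,
        11.335709658305, 12.757324213904, 14.995289898901,
        17.26405703741,
    ],
    "loading_absolute_dev": [
        0.0065109909529743, 0.043252400216982, 0.074954614137809,
        0.10160498687647, 0.082171181833133, 0.12954829764376,
        0.091573803512162, 0.073603364552317, 0.063229164814447,
        0.14498232845544,
    ],
}


def isotherm_rows(iso):
    """Pair pressure, average loading, and deviation point by point."""
    return list(zip(iso["pressure"],
                    iso["loading_absolute_average"],
                    iso["loading_absolute_dev"]))


rows = isotherm_rows(isotherm)
for pressure, loading, dev in rows:
    print(f"{pressure:>4} {isotherm['pressure_unit']}  "
          f"{loading:7.3f} +/- {dev:.3f} {isotherm['loading_absolute_unit']}")
```

Rows in this shape can be passed directly to `pd.DataFrame(rows, columns=[...])` if a table is preferred.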

## The above methods didn't really work well enough

It turns out that each CIF structure has its own <b>Group</b>, which contains the following results (the names below are the `extras.tag4` values used in the queries that follow):

|<span></span>|<span></span>|<span></span>|<span></span>|<span></span>|<span></span>|<span></span>|<span></span>|<span></span>|
|---------------|----------------|-----------------|--------------------|----------------|----------------|-------------|--------------|--------------|
| appl_ch4n2sel | appl_h2sh2osel | appl_ch4storage | appl_h2sh2osel_err | appl_h2storage | appl_o2storage | appl_pecoal | appl_peng    | appl_xekrsel |
| dftopt        | <b>isot_ch4</b>       | <b>isot_co2</b>        | <b>isot_n2</b>            | <b>isot_o2</b>        | <b>isotmt_h2</b>      | kh_co2qeq   | <b>kh_h2o</b>       | orig_zeopp   |
| kh_h2o_err    | <b>kh_h2s</b>         | <b>kh_kr</b>           | <b>kh_xe</b>              | opt_cif_ddec   | opt_zeopp      | <b>orig_cif</b>    | orig_cif_qeq |              |

In [92]:
qb = QueryBuilder()
qb.append(Group, tag='group', filters={'uuid': '000cd891-e0e9-41ca-9bb0-5be57038e54e'})
qb.append(Node, with_group='group', filters={'extras.tag4': 'appl_ch4storage'}, project=['attributes.wc_65bar_mol/kg_average'])
qb.append(Node, with_group='group', filters={'extras.tag4': 'appl_o2storage'}, project=['attributes.wc_140bar_wt%_dev'])
qb.limit(2)
res = qb.all()
res

[[13.332031016555, 0.3028980780736]]
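Each row returned by the query is a bare list whose order follows the `project` arguments. A small sketch of attaching names to those values is shown here; the column labels are hypothetical, chosen to mirror the projected attribute keys, and the row is copied from the output above.

```python
# Hypothetical labels, one per projected attribute, in query order
columns = ["ch4storage_wc_65bar_mol/kg_average", "o2storage_wc_140bar_wt%_dev"]

# Row copied from the query output
rows = [[13.332031016555, 0.3028980780736]]

# One dictionary per row, keyed by column label
records = [dict(zip(columns, row)) for row in rows]
print(records[0])
```

A list of dictionaries like this can also be handed to `pd.DataFrame(records)`.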

In [98]:
qb = QueryBuilder()
qb.append(Group, tag='group', project='uuid')
glist = qb.all()
print(len(glist))  # total number of groups
for i in range(len(glist)):
    if i % 25 == 0:
        print(i)

627
0
25
50
75
100
125
150
175
200
225
250
275
300
325
350
375
400
425
450
475
500
525
550
575
600
625


In [107]:
qb = QueryBuilder()
qb.append(Group, tag='group', project='uuid', filters={'type_string': {'!==': 'core.import'}})
group_list = qb.all()

In [108]:
group_list

[['000cd891-e0e9-41ca-9bb0-5be57038e54e'],
 ['00226a6f-0a54-4293-8f9e-6c53fa155ae5'],
 ['017d1e1a-7848-4251-92e3-56ac6656639a'],
 ['017f3d02-3934-47b5-bf65-97b42776e8b7'],
 ['018d9bcf-a998-4175-82bc-ef43a7dc763b'],
 ['020fdc98-bae1-4625-b1f2-aa4305cc85d9'],
 ['0275a7da-fae2-46e2-8f6c-22823057cf14'],
 ['02ce9652-1152-497f-826e-5cb452f92179'],
 ['02d76331-8f15-4c96-a105-8d86a22e9428'],
 ['02f98ba0-23e6-45fa-93ce-e11bff385427'],
 ['03432e3f-4ee6-4dfc-8da7-0483b85d8d05'],
 ['034b8ecd-980e-46ec-ae98-dc84618c6e62'],
 ['03a66bce-7d4d-4f5c-90f5-b5b619b60209'],
 ['03caf051-8376-4806-b9de-53586d75fb40'],
 ['053be00e-1928-433a-bf79-0ac7a80cac9c'],
 ['053eccd1-8a66-4388-9535-3bbff9aa525c'],
 ['05e5eb77-340e-4d81-b8ad-ea809512e458'],
 ['05f2d0b8-9595-452c-8eb5-8f24311008b8'],
 ['06127d3d-16cb-4991-bbbd-830f9b9c1e3d'],
 ['0655a2fc-14db-40e3-b48a-9e50195fca51'],
 ['06611df0-22f4-49fb-a0d2-081a25e2723c'],
 ['07234570-3d35-4e63-be95-d8762a608865'],
 ['075e5f47-f4d5-4633-a51b-0470bef2be25'],
 ['076961be
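Because only one field is projected, every row is a one-element list, which is why `group_list[0][0]` is needed to reach a UUID. A minimal sketch of flattening such rows in plain Python is shown here, using the first three UUIDs from the output above; `group_rows` and `group_uuids` are illustrative names. `QueryBuilder.all()` also accepts a `flat=True` argument in recent AiiDA versions, which should give the same flat list directly.

```python
# First three rows copied from the output above
group_rows = [
    ["000cd891-e0e9-41ca-9bb0-5be57038e54e"],
    ["00226a6f-0a54-4293-8f9e-6c53fa155ae5"],
    ["017d1e1a-7848-4251-92e3-56ac6656639a"],
]

# Take the single projected value out of each row
group_uuids = [row[0] for row in group_rows]
print(group_uuids[0])  # 000cd891-e0e9-41ca-9bb0-5be57038e54e
```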

In [113]:
qb = QueryBuilder()
qb.append(Group, tag='cif_group', filters={'uuid': group_list[0][0]})
qb.append(CifData, with_group='cif_group', filters={'extras.tag4': 'orig_cif'}, project='attributes')
qb.append(Dict, with_group='cif_group', filters={'extras.tag4': 'isot_ch4'}, project='attributes')
results = qb.all()
results

[[{'md5': 'd50454ca17af699615b416fca5747e07',
   'filename': '20040N2.cif',
   'formulae': [None],
   'scan_type': 'standard',
   'parse_policy': 'eager',
   'spacegroup_numbers': [None]},
  {'Density': 0.464398,
   'Input_ha': 'DEF',
   'POAV_A^3': 6518.73,
   'isotherm': {'pressure': [1.0, 5.8, 20, 35, 50, 65],
    'pressure_unit': 'bar',
    'loading_absolute_dev': [0.091465782971059,
     0.10643056579762,
     0.2515710518384,
     0.11813181790562,
     0.098621988028515,
     0.20258721012854],
    'loading_absolute_unit': 'mol/kg',
    'loading_absolute_average': [1.1149499874431,
     4.2150522857816,
     9.7188967840883,
     13.199184394717,
     15.661772457412,
     17.547083302337],
    'enthalpy_of_adsorption_dev': [0.66341837320097,
     1.040702393721,
     0.3604837245267,
     0.56759690869938,
     0.4397069333932,
     0.34233739908375],
    'enthalpy_of_adsorption_unit': 'kJ/mol',
    'enthalpy_of_adsorption_average': [-17.233875765966,
     -13.621280420377,
   

In [None]:
resul