# NOMAD Metainfo 2.0 demonstration

More complete documentation is available [here](https://labdev-nomad.esc.rzg.mpg.de/fairdi/nomad/testing/docs/metainfo.html).

In [2]:
from nomad.metainfo import MSection, SubSection, Quantity, Datetime, units
import numpy as np
import datetime

## Sections and quantities

To define sections and their quantities, we use Python classes and attributes. Quantities have *type*, *shape*, and *unit*.

In [3]:
class System(MSection):
    """ The simulated system """
    number_of_atoms = Quantity(type=int, derived=lambda system: len(system.atom_labels))
    atom_labels = Quantity(type=str, shape=['number_of_atoms'])
    atom_positions = Quantity(type=np.dtype(np.float64), shape=['number_of_atoms', 3], unit=units.m)

Such *section classes* can then be instantiated like regular Python classes. Accordingly, *section instances* are just regular Python objects, and section quantities can be read and set like regular Python object attributes.
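One way to picture why quantities declared as class attributes can behave like ordinary instance attributes is Python's descriptor protocol. The following is a minimal sketch under that assumption. The names `SketchQuantity` and `SketchSystem` are hypothetical, and the code is not taken from `nomad.metainfo`; it only illustrates typed and derived attributes in plain Python.

```python
# Illustrative sketch only: NOT the nomad.metainfo implementation.
# SketchQuantity and SketchSystem are hypothetical names.

class SketchQuantity:
    """A descriptor that stores a type-checked value on each instance."""

    def __init__(self, type=None, derived=None):
        self.type = type
        self.derived = derived
        self.name = None

    def __set_name__(self, owner, name):
        # Python calls this when the owning class body is evaluated.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # class-level access returns the definition itself
        if self.derived is not None:
            return self.derived(instance)  # computed on every access
        return instance.__dict__.get(self.name)

    def __set__(self, instance, value):
        if self.derived is not None:
            raise AttributeError(f'{self.name} is derived and cannot be set')
        items = value if isinstance(value, list) else [value]
        for item in items:
            if self.type is not None and not isinstance(item, self.type):
                raise TypeError(f'{self.name} expects {self.type.__name__}')
        instance.__dict__[self.name] = value


class SketchSystem:
    number_of_atoms = SketchQuantity(type=int, derived=lambda s: len(s.atom_labels))
    atom_labels = SketchQuantity(type=str)


sketch = SketchSystem()
sketch.atom_labels = ['H', 'H', 'O']
print(sketch.number_of_atoms)
```

Because the descriptor defines `__set__`, every read and write on an instance goes through the definition object, which is where type checks and derived values can be applied.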

In [4]:
system = System()
system.atom_labels = ['H', 'H', 'O']
system.atom_positions = np.array([[6, 0, 0], [0, 0, 0], [3, 2, 0]]) * units.angstrom

Of course, the metainfo is not just about handling physics data in Python. It is also about storing and managing data in various file formats and databases. Therefore, the created data can be serialized, e.g. to JSON. All *section instances* have a set of `m_`-methods that provide additional functions. Note the unit conversion: positions assigned in angstrom are serialized in meters.

In [5]:
system.m_to_json()

'{"atom_labels": ["H", "H", "O"], "atom_positions": [[6e-10, 0.0, 0.0], [0.0, 0.0, 0.0], [3e-10, 2e-10, 0.0]]}'
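The unit conversion seen in the serialized output can be reproduced by hand. The sketch below uses only the standard library and assumes the conversion is a plain multiplication by 1e-10 (1 angstrom = 1e-10 m). The helper `to_si_json` is a hypothetical name, not part of the metainfo API.

```python
import json

ANGSTROM_IN_METERS = 1e-10  # 1 angstrom = 1e-10 m


def to_si_json(labels, positions_angstrom):
    """Convert positions from angstrom to meters and serialize to JSON."""
    positions_m = [
        [coordinate * ANGSTROM_IN_METERS for coordinate in position]
        for position in positions_angstrom
    ]
    return json.dumps({'atom_labels': labels, 'atom_positions': positions_m})


serialized = to_si_json(['H', 'H', 'O'], [[6, 0, 0], [0, 0, 0], [3, 2, 0]])

# Reading the JSON back yields plain lists of floats in meters.
restored = json.loads(serialized)
print(restored['atom_labels'])
print(restored['atom_positions'][0])
```

Storing values in a single base unit keeps the serialized data unambiguous, regardless of the unit used when the values were assigned.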

## Sub-sections to form hierarchies of data

*Section instances* can be nested to form data hierarchies. To achieve this, we first have to create *section definitions* that have sub-sections.

In [6]:
class Run(MSection):
    timestamp = Quantity(type=Datetime, description='The time that this run was conducted.')
    systems = SubSection(sub_section=System, repeats=True)

Now we can add *section instances* for `System` to *instances* of `Run`.

In [7]:
run = Run()
run.timestamp = datetime.datetime.now()

system = run.m_create(System)
system.atom_labels = ['H', 'H', 'O']
system.atom_positions = np.array([[6, 0, 0], [0, 0, 0], [3, 2, 0]]) * units.angstrom

system = run.m_create(System)
system.atom_labels = ['H', 'H', 'O']
system.atom_positions = np.array([[5, 0, 0], [0, 0, 0], [2.5, 2, 0]]) * units.angstrom
run.m_to_json()

'{"timestamp": "2019-10-09T14:48:43.663363", "systems": [{"atom_labels": ["H", "H", "O"], "atom_positions": [[6e-10, 0.0, 0.0], [0.0, 0.0, 0.0], [3e-10, 2e-10, 0.0]]}, {"atom_labels": ["H", "H", "O"], "atom_positions": [[5e-10, 0.0, 0.0], [0.0, 0.0, 0.0], [2.5e-10, 2e-10, 0.0]]}]}'
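The shape of this nested output, a parent object holding a repeating list of children, can be mimicked with standard-library dataclasses. This is only an analogy under stated assumptions: `SketchRun`, `SketchSystem`, and `create_system` are hypothetical names, and the metainfo's own `m_create` and `m_to_json` may work quite differently internally.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class SketchSystem:
    atom_labels: List[str] = field(default_factory=list)


@dataclass
class SketchRun:
    timestamp: Optional[datetime] = None
    systems: List[SketchSystem] = field(default_factory=list)

    def create_system(self, **kwargs):
        """Create a child, attach it to this parent, and return it."""
        system = SketchSystem(**kwargs)
        self.systems.append(system)
        return system


run_sketch = SketchRun(timestamp=datetime(2019, 10, 9, 14, 48, 43))
run_sketch.create_system(atom_labels=['H', 'H', 'O'])
run_sketch.create_system(atom_labels=['H', 'H', 'O'])

# asdict() recurses into nested dataclasses; the datetime needs an explicit
# conversion because json cannot serialize it directly.
payload = asdict(run_sketch)
payload['timestamp'] = payload['timestamp'].isoformat()
serialized_run = json.dumps(payload)
print(serialized_run)
```

Creating the child through the parent keeps the two linked from the start, which is the same idea as calling `m_create` on the `Run` instance.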

The whole data hierarchy can be navigated with regular Python object/attribute style programming, and values can be used for calculations as usual.

In [8]:
(run.systems[1].atom_positions - run.systems[0].atom_positions).to(units.angstrom)
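The arithmetic behind this expression can be checked without any libraries. The sketch below starts from the positions in meters as they appear in the serialized data, subtracts them element-wise, and converts the result back to angstrom. The rounding to six decimals is an assumption added here to hide floating-point noise.

```python
ANGSTROM_IN_METERS = 1e-10  # 1 angstrom = 1e-10 m

# Positions of the two systems in meters, as stored.
first = [[6e-10, 0.0, 0.0], [0.0, 0.0, 0.0], [3e-10, 2e-10, 0.0]]
second = [[5e-10, 0.0, 0.0], [0.0, 0.0, 0.0], [2.5e-10, 2e-10, 0.0]]

# Element-wise difference (second minus first), expressed in angstrom.
displacement = [
    [round((b - a) / ANGSTROM_IN_METERS, 6) for a, b in zip(p0, p1)]
    for p0, p1 in zip(first, second)
]
print(displacement)
```

The first atom moves by 1 angstrom and the third by 0.5 angstrom in the negative x direction, while the second atom stays in place.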

## Reflection, inspection, and code-completion

Since all definitions are available as *section classes*, Python already knows about all possible quantities. We can use this in Python notebooks via *tab* completion or the `?`-operator. Furthermore, you can access the *section definition* of any *section instance* with `m_def`. Since a *section definition* is itself just a piece of metainfo data, you can use it to explore the definition programmatically.

In [9]:
run.systems[0].m_def.quantities

[number_of_atoms:Quantity, atom_labels:Quantity, atom_positions:Quantity]

In [10]:
run.m_def.all_quantities['timestamp'].description

'The time that this run was conducted.'

In [11]:
System.atom_labels.shape

['number_of_atoms']
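The idea that definitions are themselves inspectable data can be illustrated in plain Python by collecting definition objects from a class namespace. This is a hedged sketch: `SketchQuantity`, `SketchSystem`, and `quantities_of` are hypothetical names, and the lookup via `vars()` is an assumption made for illustration, not a description of how `m_def` is built.

```python
# Illustrative sketch only: NOT the nomad.metainfo implementation.

class SketchQuantity:
    """A plain definition object carrying metadata about one quantity."""

    def __init__(self, type=None, shape=None, description=None):
        self.type = type
        self.shape = shape if shape is not None else []
        self.description = description


class SketchSystem:
    atom_labels = SketchQuantity(type=str, shape=['number_of_atoms'])
    atom_positions = SketchQuantity(type=float, shape=['number_of_atoms', 3])


def quantities_of(section_cls):
    """Return the quantity definitions of a class, in definition order."""
    return {
        name: attribute
        for name, attribute in vars(section_cls).items()
        if isinstance(attribute, SketchQuantity)
    }


definitions = quantities_of(SketchSystem)
print(list(definitions))
print(definitions['atom_positions'].shape)
```

Since the definitions are ordinary objects, tools such as documentation generators or validators can read them without instantiating any data.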

In [32]:
t = np.dtype(np.int64)

In [33]:
t.type

numpy.int64