# Introduction

`pyiron_ontology` uses the `owlready2` library to build up pyiron-specific ontologies, and provides some extra tools to help you leverage these.

At present, the only ontology implemented is for the realm of atomistic calculations, and the scope of this ontology is still fairly limited.

First, let's import `pyiron_ontology` and grab the atomistics ontology (an `owlready2.namespace.Ontology` object) defined there:

In [1]:
import pyiron_ontology as po

In [2]:
onto = po.dynamic.atomistics()

We can look at various properties of the ontology, just like other OWL ontologies, e.g. the classes and individuals defined in this space.

There are four key classes common to all ontologies made in the scope of `pyiron_ontology`: `Generic`, `Input`, `Function`, and `Output`. `Generic` is the parent class used for defining domain knowledge; the remaining three are used to represent how computations are performed in any knowledge-space. (You'll also see `PyironOntoThing`, `Parameter`, `WorkflowThing`, and `IO`; these are parent classes used under the hood.)

`Generic` will be heavily sub-classed in each specific ontology, and then instantiated and paired with inputs and outputs so that we know what sort of information is moving around our computation graphs. The workflow elements will (so far) only be instantiated, defining all the possible calculations available.
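
To make the roles of these classes a little more concrete, here is a minimal plain-Python sketch of the idea. It is not how `pyiron_ontology` is implemented (that is built on `owlready2`), and every class and attribute below is a hypothetical stand-in that only borrows names from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Generic:
    """Stand-in for a piece of domain knowledge, e.g. a structure."""
    kind: str


@dataclass
class IO:
    """Common parent of inputs and outputs: a name plus the Generic it carries."""
    name: str
    generic: Generic


class Input(IO):
    pass


class Output(IO):
    pass


@dataclass
class Function:
    """Stand-in for a workflow step with typed inputs and outputs."""
    name: str
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


# Illustrative individuals only; the pairing of names and kinds is an assumption.
vasp = Function(
    name="vasp",
    inputs=[Input("vasp_input_structure", Generic("Structure"))],
    outputs=[Output("vasp_output_job", Generic("Vasp"))],
)

input_kinds = [i.generic.kind for i in vasp.inputs]
output_kinds = [o.generic.kind for o in vasp.outputs]
print(input_kinds, output_kinds)
```

The point of the pairing is that each input and output carries a piece of domain knowledge, so we can later ask which outputs are able to feed which inputs.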

Let's first look at the classes for our atomistics knowledge-space:

In [3]:
list(onto.classes())

[atomistics.PyironOntoThing,
 atomistics.Parameter,
 atomistics.Generic,
 atomistics.WorkflowThing,
 atomistics.Function,
 atomistics.IO,
 atomistics.Output,
 atomistics.Input,
 atomistics.AtomisticsFunction,
 atomistics.UserInput,
 atomistics.PyironObject,
 atomistics.PhysicalProperty,
 atomistics.ChemicalElement,
 atomistics.MaterialProperty,
 atomistics.BulkModulus,
 atomistics.BPrime,
 atomistics.SurfaceEnergy,
 atomistics.Dimensional,
 atomistics.OneD,
 atomistics.TwoD,
 atomistics.ThreeD,
 atomistics.Structure,
 atomistics.Defected,
 atomistics.HasDislocation,
 atomistics.HasVacancy,
 atomistics.HasInterface,
 atomistics.HasGB,
 atomistics.HasSurface,
 atomistics.HasPB,
 atomistics.Bulk,
 atomistics.PyironProject,
 atomistics.AtomisticsProject,
 atomistics.PyironJob,
 atomistics.AtomisticsJob,
 atomistics.Lammps,
 atomistics.Vasp]

We can also look at the individuals. Some of these should have very descriptive names -- these are the `Input`, `Output`, and `Function` individuals. The rest are instances of our `Generic` class (and its children) and receive their name automatically.

In [4]:
list(onto.individuals())

[atomistics.project,
 atomistics.project_input_name,
 atomistics.project_output_atomistics_project,
 atomistics.bulk_structure,
 atomistics.generic1,
 atomistics.bulk_structure_input_element,
 atomistics.structure1,
 atomistics.bulk_structure_output_structure,
 atomistics.surface_structure,
 atomistics.generic2,
 atomistics.surface_structure_input_element,
 atomistics.structure2,
 atomistics.surface_structure_output_structure,
 atomistics.lammps,
 atomistics.structure3,
 atomistics.lammps_input_structure,
 atomistics.lammps1,
 atomistics.lammps_output_job,
 atomistics.vasp,
 atomistics.generic3,
 atomistics.vasp_input_structure,
 atomistics.vasp1,
 atomistics.vasp_output_job,
 atomistics.murnaghan,
 atomistics.atomisticsproject1,
 atomistics.murnaghan_input_project,
 atomistics.atomisticsjob1,
 atomistics.structure4,
 atomistics.murnaghan_input_job,
 atomistics.bulkmodulus1,
 atomistics.murnaghan_output_bulk_modulus,
 atomistics.bprime1,
 atomistics.murnaghan_output_b_prime,
 atomistic

We can make the usual owlready queries of these objects, e.g.

In [5]:
onto.vasp_input_structure.INDIRECT_is_a

[atomistics.Parameter,
 atomistics.WorkflowThing,
 atomistics.Input,
 owl.Thing,
 atomistics.PyironOntoThing,
 atomistics.IO]

In [6]:
onto.vasp_input_structure.generic.INDIRECT_is_a

[atomistics.Structure,
 atomistics.Parameter,
 atomistics.Generic,
 atomistics.Dimensional,
 atomistics.PyironObject,
 owl.Thing,
 atomistics.ThreeD,
 atomistics.PyironOntoThing]

We can also look into some of the atomistics-specific relationships that have been defined:

In [7]:
onto.vasp.mandatory_inputs, onto.optional_inputs, onto.vasp.inputs

([atomistics.vasp_input_structure], None, [atomistics.vasp_input_structure])

In [8]:
onto.vasp.outputs

[atomistics.vasp_output_job]

and we can chain these queries together in meaningful ways:

In [9]:
some_code = onto.vasp
first_input = some_code.mandatory_inputs[0]
appears_elsewhere = first_input.generic.parameters
can_come_from = first_input.get_sources()
which_is_produced_by = can_come_from[0].output_of
print('some_code', some_code)
print('first_input', first_input)
print('appears_elsewhere', appears_elsewhere)
print('can_come_from', can_come_from)
print('which_is_produced_by', which_is_produced_by)

some_code atomistics.vasp
first_input atomistics.vasp_input_structure
appears_elsewhere [atomistics.vasp_input_structure]
can_come_from [atomistics.bulk_structure_output_structure, atomistics.surface_structure_output_structure]
which_is_produced_by atomistics.bulk_structure


This is powerful, but can be a bit unwieldy.

`pyiron_ontology` also comes with helper tools for building this sort of chain, or "workflow", in a guided or automatic way.

First, let's see all the possible chains for getting input to a Lammps calculation:

In [10]:
onto.lammps.get_source_tree().render()

lammps
	lammps_input_structure
		bulk_structure_output_structure
			bulk_structure
		surface_structure_output_structure
			surface_structure
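
The tab-indented layout above is just a depth-first walk of the upstream relationships. As a rough illustration (a stand-alone sketch, not the library's `render` implementation; the `children` mapping and the `render` helper are hypothetical and simply copy the tree shown above), it could be reproduced like this:

```python
def render(node, children, depth=0):
    """Return the tab-indented lines for `node` and everything upstream of it."""
    lines = ["\t" * depth + node]
    for child in children.get(node, []):
        lines.extend(render(child, children, depth + 1))
    return lines


# Upstream neighbours copied by hand from the rendered Lammps tree.
children = {
    "lammps": ["lammps_input_structure"],
    "lammps_input_structure": [
        "bulk_structure_output_structure",
        "surface_structure_output_structure",
    ],
    "bulk_structure_output_structure": ["bulk_structure"],
    "surface_structure_output_structure": ["surface_structure"],
}

rendered = "\n".join(render("lammps", children))
print(rendered)
```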


This tool also passes requirements upstream in the workflow. For instance, we see above that Lammps can take either bulk-like or non-bulk-like structure input. Instead of querying the ontology about what's needed to run a particular code, let's ask for a workflow to produce a particular material property: the bulk modulus. In this case, we know the workflow only makes sense if the structures going into it are bulk-like!

When we ask for this workflow, we again see Lammps (and Vasp) coming up as part of our tree, but now we see that they are precluded from taking surface structures because the condition for a bulk-like structure got passed up through our workflow!

(Note: these tools only work on _individuals_, so we'll instantiate a new individual of our `BulkModulus` generic class and query that.)

In [11]:
onto.BulkModulus().get_source_tree().render()

bulkmodulus2
	murnaghan_output_bulk_modulus
		murnaghan
			murnaghan_input_project
			murnaghan_input_job
				vasp_output_job
					vasp
						vasp_input_structure
							bulk_structure_output_structure
								bulk_structure
				lammps_output_job
					lammps
						lammps_input_structure
							bulk_structure_output_structure
								bulk_structure
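
One way to picture how a requirement travels upstream is as a set of conditions that grows as we walk back through the workflow, with each candidate source checked against the accumulated set. The sketch below is only an illustration of that idea under stated assumptions: the tag sets, `step_requirements`, and `allowed_sources` are all hypothetical, and the real ontology expresses these conditions through its class hierarchy rather than through strings.

```python
# Hypothetical tags describing what each structure-producing output guarantees.
output_tags = {
    "bulk_structure_output_structure": {"Structure", "Bulk"},
    "surface_structure_output_structure": {"Structure", "HasSurface"},
}

# Hypothetical requirements each step places on the structure it ends up using.
step_requirements = {
    "lammps": {"Structure"},
    "murnaghan": {"Bulk"},
}


def allowed_sources(*steps):
    """Outputs satisfying the combined requirements of every step on the path."""
    required = set().union(*(step_requirements[s] for s in steps))
    return sorted(name for name, tags in output_tags.items() if required <= tags)


lammps_only = allowed_sources("lammps")
via_murnaghan = allowed_sources("murnaghan", "lammps")
print(lammps_only)
print(via_murnaghan)
```

Asking about Lammps on its own leaves both structure sources available, while reaching Lammps by way of the Murnaghan step adds the bulk condition and leaves only the bulk structure.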


Instead of seeing *all* possible paths, we can build one particular path iteratively, looking at the choices available at each step and selecting which one we want:

In [12]:
b_prime = onto.BPrime()

In [13]:
b_prime.get_source_path()

(<pyiron_ontology.workflow.NodeTree at 0x7faeff903cd0>,
 [atomistics.murnaghan_output_b_prime])

In [14]:
b_prime.get_source_path(0)

(<pyiron_ontology.workflow.NodeTree at 0x7faeff900580>, [atomistics.murnaghan])

In [15]:
b_prime.get_source_path(0, 0) 

(<pyiron_ontology.workflow.NodeTree at 0x7faeff9032b0>,
 [atomistics.murnaghan_input_project, atomistics.murnaghan_input_job])

The project is a bit of a boring path to follow, so let's choose `1` here to follow the job path:

In [16]:
b_prime.get_source_path(0, 0, 1)

(<pyiron_ontology.workflow.NodeTree at 0x7faeff9034f0>,
 [atomistics.vasp_output_job, atomistics.lammps_output_job])

In [17]:
b_prime.get_source_path(0, 0, 1, 1)

(<pyiron_ontology.workflow.NodeTree at 0x7faeff88b700>, [atomistics.lammps])

In [18]:
b_prime.get_source_path(0, 0, 1, 1, 0)

(<pyiron_ontology.workflow.NodeTree at 0x7faeff9003a0>,
 [atomistics.lammps_input_structure])

In [19]:
b_prime.get_source_path(0, 0, 1, 1, 0, 0)

(<pyiron_ontology.workflow.NodeTree at 0x7faeff88b970>,
 [atomistics.bulk_structure_output_structure])

In [20]:
b_prime.get_source_path(0, 0, 1, 1, 0, 0, 0)

(<pyiron_ontology.workflow.NodeTree at 0x7faeff688250>,
 [atomistics.bulk_structure])

In [21]:
b_prime.get_source_path(0, 0, 1, 1, 0, 0, 0, 0)

(<pyiron_ontology.workflow.NodeTree at 0x7faeff6662b0>, [])

In [22]:
path, _ = b_prime.get_source_path(0, 0, 1, 1, 0, 0, 0, 0)
path.render()

bprime2
	murnaghan_output_b_prime
		murnaghan
			murnaghan_input_job
				lammps_output_job
					lammps
						lammps_input_structure
							bulk_structure_output_structure
								bulk_structure


Note: this traces only _one path_ of the required input back from the result we originally queried. As we saw above when we skipped the project input, a complete workflow requires choosing a path for _all_ the required input at each `Function` step.
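
The index-by-index selection can be pictured as repeatedly picking one entry from the list of options offered at the current step. Here is a small stand-alone sketch of that idea; the `sources` mapping is copied by hand from the outputs above, and this `get_source_path` function is a hypothetical stand-in rather than the method on the ontology individuals (it returns a plain list instead of a `NodeTree`).

```python
# Upstream options copied from the BPrime walk; a missing key means "no options".
sources = {
    "bprime": ["murnaghan_output_b_prime"],
    "murnaghan_output_b_prime": ["murnaghan"],
    "murnaghan": ["murnaghan_input_project", "murnaghan_input_job"],
    "murnaghan_input_job": ["vasp_output_job", "lammps_output_job"],
    "vasp_output_job": ["vasp"],
    "lammps_output_job": ["lammps"],
    "vasp": ["vasp_input_structure"],
    "lammps": ["lammps_input_structure"],
    "vasp_input_structure": ["bulk_structure_output_structure"],
    "lammps_input_structure": ["bulk_structure_output_structure"],
    "bulk_structure_output_structure": ["bulk_structure"],
}


def get_source_path(start, *choices):
    """Follow one index per step; return the path so far and the next options."""
    path = [start]
    for index in choices:
        options = sources.get(path[-1], [])
        path.append(options[index])
    return path, sources.get(path[-1], [])


path, options = get_source_path("bprime", 0, 0, 1, 1, 0, 0, 0, 0)
print("\n".join("\t" * depth + name for depth, name in enumerate(path)))
print(options)
```

An empty list of options signals that the chosen path has reached a step with nothing further upstream.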

# Working with pyiron data

We also have tools for leveraging the ontology to search through existing pyiron data in your storage and database.

Here we'll need to import `pyiron_atomistics.Project` so we can create some data to work with.

In [23]:
from pyiron_ontology import AtomisticsReasoner
from pyiron_atomistics import Project
import numpy as np



In [24]:
reasoner = AtomisticsReasoner(onto) 

Next, we'll produce some data and then use a tool on the reasoner to search it for entries that match a particular ontological property.

Let's calculate the bulk modulus for a few Cu alloys with varying Ag and Ni content. On a single-core laptop, this might take two or three minutes.

In [25]:
pr = Project('example')
pr.remove_jobs(silently=True, recursive=True)

So that we can compare results for different compositions, let's quickly find a potential that contains all our elements and use it consistently.

In [26]:
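host = "Cu"
solutes = ["Ag", "Ni"]
# Dummy structure holding the host and both solutes; used only for the potential lookup
all_species = pr.atomistics.structure.bulk(host, cubic=True)
all_species[0] = solutes[0]
all_species[1] = solutes[1]

from pyiron_atomistics.lammps import list_potentials
# Use the first potential listed for this set of species
potential = list_potentials(all_species)[0]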

In [27]:
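for solute in solutes:
    for frac in [0., 0.10, 0.25]:
        ref = pr.atomistics.job.Lammps(f"Lammps_{host}_{solute}_frac{frac:.2f}".replace(".", "d"))
        ref.structure = pr.atomistics.structure.bulk(host, cubic=True).repeat(3)
        # Pick sites to turn into solute atoms. np.random.choice samples with
        # replacement by default, so repeated indices can leave the actual
        # solute count slightly below int(frac * len(ref.structure)).
        random_ids = np.random.choice(
            np.arange(len(ref.structure), dtype=int),
            int(frac * len(ref.structure))
        )
        ref.structure[random_ids] = solute
        ref.potential = potential

        # Murnaghan job wrapping the Lammps reference job, with 7 volume points
        murn = pr.atomistics.job.Murnaghan(f"Murn_{host}_{solute}_frac{frac:.2f}".replace(".", "d"))
        murn.input['num_points'] = 7
        murn.ref_job = ref
        murn.run()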

The job Murn_Cu_Ag_frac0d00 was saved and received the ID: 296
The job Murn_Cu_Ag_frac0d00_0_9 was saved and received the ID: 297
The job Murn_Cu_Ag_frac0d00_0_9333333 was saved and received the ID: 298
The job Murn_Cu_Ag_frac0d00_0_9666667 was saved and received the ID: 299
The job Murn_Cu_Ag_frac0d00_1_0 was saved and received the ID: 300
The job Murn_Cu_Ag_frac0d00_1_0333333 was saved and received the ID: 301
The job Murn_Cu_Ag_frac0d00_1_0666667 was saved and received the ID: 302
The job Murn_Cu_Ag_frac0d00_1_1 was saved and received the ID: 303
The job Murn_Cu_Ag_frac0d10 was saved and received the ID: 304
The job Murn_Cu_Ag_frac0d10_0_9 was saved and received the ID: 305
The job Murn_Cu_Ag_frac0d10_0_9333333 was saved and received the ID: 306
The job Murn_Cu_Ag_frac0d10_0_9666667 was saved and received the ID: 307
The job Murn_Cu_Ag_frac0d10_1_0 was saved and received the ID: 308
The job Murn_Cu_Ag_frac0d10_1_0333333 was saved and received the ID: 309
The job Murn_Cu_Ag_frac0d10_



The job Murn_Cu_Ni_frac0d00 was saved and received the ID: 320
The job Murn_Cu_Ni_frac0d00_0_9 was saved and received the ID: 321
The job Murn_Cu_Ni_frac0d00_0_9333333 was saved and received the ID: 322
The job Murn_Cu_Ni_frac0d00_0_9666667 was saved and received the ID: 323
The job Murn_Cu_Ni_frac0d00_1_0 was saved and received the ID: 324
The job Murn_Cu_Ni_frac0d00_1_0333333 was saved and received the ID: 325
The job Murn_Cu_Ni_frac0d00_1_0666667 was saved and received the ID: 326
The job Murn_Cu_Ni_frac0d00_1_1 was saved and received the ID: 327
The job Murn_Cu_Ni_frac0d10 was saved and received the ID: 328
The job Murn_Cu_Ni_frac0d10_0_9 was saved and received the ID: 329
The job Murn_Cu_Ni_frac0d10_0_9333333 was saved and received the ID: 330
The job Murn_Cu_Ni_frac0d10_0_9666667 was saved and received the ID: 331
The job Murn_Cu_Ni_frac0d10_1_0 was saved and received the ID: 332
The job Murn_Cu_Ni_frac0d10_1_0333333 was saved and received the ID: 333
The job Murn_Cu_Ni_frac0d10_
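
Two details of the loop above are easy to miss: how the job names are formed, and the fact that `np.random.choice` draws with replacement unless told otherwise. The stand-alone sketch below illustrates both using only the standard library, with `random.choices` and `random.sample` standing in for the NumPy call with and without replacement; the seed and variable names are arbitrary choices made for this illustration.

```python
import random

host, solute, frac = "Cu", "Ag", 0.10

# The notebook swaps "." for "d", presumably to keep the job name free of dots.
job_name = f"Lammps_{host}_{solute}_frac{frac:.2f}".replace(".", "d")
print(job_name)

n_atoms = 108                   # 4-atom cubic cell repeated 3x3x3
n_solute = int(frac * n_atoms)  # number of sites requested

rng = random.Random(0)
with_replacement = rng.choices(range(n_atoms), k=n_solute)
without_replacement = rng.sample(range(n_atoms), n_solute)

# Duplicated indices collapse, so fewer than n_solute distinct sites may be hit.
n_unique_with = len(set(with_replacement))
n_unique_without = len(set(without_replacement))
print(n_solute, n_unique_with, n_unique_without)
```

Drawing without replacement always yields exactly the requested number of distinct sites, whereas drawing with replacement can yield fewer.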

Now let's search the pyiron database for instances of some of our physically-meaningful properties:

In [28]:
reasoner.search_database_for_property(onto.BulkModulus(), pr)

| | Chemical Formula | atomistics.BulkModulus | unit | Engine |
|---|---|---|---|---|
| 0 | Cu108 | 141.949581 | GPa | Lammps |
| 1 | Ag9Cu99 | 134.436849 | GPa | Lammps |
| 2 | Ag23Cu85 | | GPa | Lammps |
| 3 | Cu108 | 141.949581 | GPa | Lammps |
| 4 | Cu100Ni8 | 145.466199 | GPa | Lammps |
| 5 | Cu86Ni22 | 151.484981 | GPa | Lammps |


We can also filter our search by chemistry:

In [29]:
reasoner.search_database_for_property(onto.BPrime(), pr, select_alloy="Cu")

| | Chemical Formula | atomistics.BPrime | unit | Engine |
|---|---|---|---|---|
| 0 | Cu108 | 4.393195 | | Lammps |
| 1 | Ag9Cu99 | 6.169798 | | Lammps |
| 2 | Ag23Cu85 | | | Lammps |
| 3 | Cu108 | 4.393195 | | Lammps |
| 4 | Cu100Ni8 | 4.320935 | | Lammps |
| 5 | Cu86Ni22 | 4.167212 | | Lammps |
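
How `select_alloy` matches entries is not shown here. As a rough stand-alone illustration of what chemistry-based filtering on the `Chemical Formula` column could look like, the sketch below parses the formula strings from the table and keeps those containing a given element. The helpers `parse_formula` and `fraction` are hypothetical and not part of `pyiron_ontology`.

```python
import re


def parse_formula(formula):
    """Split a formula such as 'Ag9Cu99' into {'Ag': 9, 'Cu': 99}."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + int(number or 1)
    return counts


def fraction(formula, element):
    """Atomic fraction of `element` in the formula, or 0.0 if it is absent."""
    counts = parse_formula(formula)
    return counts.get(element, 0) / sum(counts.values())


formulas = ["Cu108", "Ag9Cu99", "Ag23Cu85", "Cu100Ni8", "Cu86Ni22"]
contains_ni = [f for f in formulas if "Ni" in parse_formula(f)]
ni_fractions = {f: round(fraction(f, "Ni"), 3) for f in contains_ni}
print(ni_fractions)
```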


# Cleanup

In [30]:
pr.remove_jobs(silently=True, recursive=True)
pr.remove(enable=True)
