Importing packages

In [2]:
import os, glob
from ase.io import read
from mofchecker import MOFChecker
from oximachinerunner import OximachineRunner
from moffragmentor import MOF
import pymatgen
from pymatgen.core.structure import Structure

We will start with [UiO66](https://pubs.acs.org/doi/10.1021/acs.cgd.9b00955) as an example. You can practice converting the CIF file "UiO_66_orig.cif" into an XYZ file, and then back into a CIF file (see section [3.](https://github.com/bmourino/ch359/blob/main/1-MOF101/README.md#3-managing-structure-files) Managing structure files). Name the final CIF file "UiO_66.cif". This might be very useful in the future, so don't skip it!

Now we are ready to proceed, and we can load the structures we want to evaluate. Can you print the names of all the structures that we have? *Hint*: you can use the string method [`.split()`](https://python-reference.readthedocs.io/en/latest/docs/str/split.html).
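As a minimal sketch of the hint, the example below splits a made-up path into pieces. The path `example_path` is hypothetical and is not one of the course files; adapt the idea to the loop in the next cell.

```python
import os

# Hypothetical path, used only for illustration (not one of the course files).
example_path = "/home/user/structures/example_mof.cif"

# str.split breaks the string at every separator and returns a list of pieces.
pieces = example_path.split("/")
file_name = pieces[-1]          # the last piece is the file name
stem = file_name.split(".")[0]  # the text before the first dot

print(pieces)
print(file_name, stem)

# os.path.basename extracts the same file name without manual splitting.
assert os.path.basename(example_path) == file_name
```

Splitting on "/" assumes POSIX-style paths; `os.path.basename` is the more portable choice if the notebook is run on another platform.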

In [3]:
# Notebooks do not define __file__, so use the current working directory.
THIS_DIR = os.getcwd()
s_folder = os.path.join(THIS_DIR, 'structures')
all_structures = glob.glob(os.path.join(s_folder, "*.cif"))

for s in all_structures:
    print('##') #complete '##'

[Chung](https://pubs.acs.org/doi/10.1021/cm502594j) et al. have stated that "A publicly available database of computation-ready MOF structures would be an invaluable tool for researchers interested in metal–organic frameworks". Think about this (*Q$*): what does it mean for a structure to be computation-ready, and why is this important for us?



MOFChecker is a tool developed by KM [Jablonka](https://doi.org/10.26434/chemrxiv-2022-4g7rx) et al. that allows for quick detection of structures that are not computation-ready. Run a general check in the cell below and analyze the outputs. Can you explain what they mean?

In [5]:
for structure in all_structures:
    if '##' == 'UiO_66.cif': #complete '##'
        s = structure

mofchecker = MOFChecker.from_cif(s)
descriptors = mofchecker.get_mof_descriptors()
print(descriptors)


Apart from performing boolean checks, you can also obtain some basic information and pinpoint possible issues, e.g., where a carbon atom has more bonds than allowed. Try to obtain more information for our UiO66. The package [documentation](https://mofchecker.readthedocs.io/en/latest/api.html) can help you work out how to do it.

In [12]:
### type your code here

When structures are not clean, you might have to clean them yourself to avoid misleading results (if you manage to get anything at all!)

Follow the steps in section [1.](https://github.com/bmourino/ch359/blob/main/1-MOF101/README.md#cleaning-on-avogadro) Visualizing structures on Avogadro (subsection "Cleaning on Avogadro") to try to clean UiO66.

After this, you can export a new CIF file and run the checks again to see if you solved the puzzle!

Moving forward...

Oximachine is a machine-learning-based tool developed by KM [Jablonka](https://www.nature.com/articles/s41557-021-00717-y) et al. that predicts oxidation states for the metal sites in MOFs. It predicts the oxidation state with 4 models, and also reports the maximum confidence of each model.

Each model is based on a different approach (extremely randomized decision trees, boosted decision trees, nearest neighbours and linear functions). This variety is intended "to make the estimates maximally uncorrelated, similarly to chemists who use different ways to rationalize oxidation states".
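To make the idea of comparing several models concrete, here is a minimal sketch with made-up numbers. The model names, predictions and the majority-vote rule are assumptions for illustration only; they are not oximachine's output format or its actual combination scheme.

```python
from collections import Counter

# Made-up predictions from four hypothetical models for one metal site.
# Each entry is (predicted oxidation state, maximum class probability).
predictions = {
    "model_a": (2, 0.97),
    "model_b": (2, 0.91),
    "model_c": (2, 0.88),
    "model_d": (3, 0.55),
}

# Count how many models vote for each oxidation state.
votes = Counter(state for state, _ in predictions.values())
consensus, n_votes = votes.most_common(1)[0]

# Fraction of models that agree with the most common prediction.
agreement = n_votes / len(predictions)

# The model with the lowest maximum probability is the least certain one.
least_confident = min(predictions, key=lambda name: predictions[name][1])

print(f"consensus state: {consensus}, agreement: {agreement:.2f}")
print(f"least confident model: {least_confident}")
```

When the models disagree, as the hypothetical `model_d` does here, its low confidence is a useful signal that the site deserves a closer look.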

Let's try to figure out the oxidation state of Zr in UiO66! Complete the line below (check [here](https://github.com/kjappelbaum/oximachinerunner) for tips), run it and take note of the oxidation state that was predicted. Is there a difference between models?

*Q$*: Why do we care about oxidation states? Does it change anything when performing a DFT calculation? In which way could this be relevant for ML?

In [12]:
runner = ### complete here
s_ase = read(s)
mof_ox = runner.run_oximachine(structure=s_ase)
print(mof_ox)

Metal-organic frameworks (MOFs) are put together through reticular chemistry, the science of linking discrete chemical entities by strong bonds to make extended structures. (OM Yaghi 2019, DOI:10.1002/9783527821099)

Those chemical entities are what we call "building blocks": organic linkers and metal nodes or clusters. 

When we are dealing with a MOF computationally, we usually have a cif file of the whole periodic structure, where the building blocks are already put together - thanks to either experimental efforts or *in silico* design.

Sometimes, however, we want to be able to separate them again, i.e., to fragment the MOF.

One of the huge advantages of crystalline, periodic structures made of building blocks is that they make our lives easier when we want to establish structure-property relationships.

By fragmenting many MOFs into their building blocks, we can try to see if there is any correlation between their properties and the presence of a fragment.
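A minimal sketch of such a comparison is shown below. The MOF names, fragment labels and property values are invented toy data, and `mean_property_by_fragment` is a hypothetical helper, not part of any package used in this notebook.

```python
from statistics import mean

# Toy records: every name, fragment label and property value is invented.
records = [
    {"name": "mof_1", "fragments": {"linker_A", "node_X"}, "property": 1.0},
    {"name": "mof_2", "fragments": {"linker_A", "node_Y"}, "property": 2.0},
    {"name": "mof_3", "fragments": {"linker_B", "node_X"}, "property": 4.0},
    {"name": "mof_4", "fragments": {"linker_B", "node_Y"}, "property": 5.0},
]


def mean_property_by_fragment(records, fragment):
    """Return (mean with fragment, mean without fragment) for a property."""
    with_fragment = [r["property"] for r in records if fragment in r["fragments"]]
    without_fragment = [r["property"] for r in records if fragment not in r["fragments"]]
    return mean(with_fragment), mean(without_fragment)


# Compare the average property for structures with and without each fragment.
for fragment in ("linker_B", "node_X"):
    with_mean, without_mean = mean_property_by_fragment(records, fragment)
    print(f"{fragment}: with = {with_mean}, without = {without_mean}")
```

In this toy data set the linker separates the property values much more clearly than the node does. A real analysis would need many more structures and a proper statistical test before drawing any conclusion.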

In this regard, moffragmentor is a useful tool developed by KM [Jablonka](https://www.nature.com/articles/s41557-021-00717-y) et al. It can split the MOF into its fragments. Let's try to do this with UiO66! Check [here](https://github.com/kjappelbaum/moffragmentor) for tips.

*Q$*: Can you point out common linking points between organic linkers and metal nodes? If you think about this from an experimental perspective, where would you expect a reaction to occur? Why is this important, from a computational perspective?

In [15]:
mof = MOF.from_cif(s)
fragments = ### complete here
print(fragments)

We can visualize the fragments:

In [16]:
fragments.linkers[0].show_molecule()

In [None]:
fragments.nodes[0].show_molecule()

And we can also search PubChem for the building blocks:

In [None]:
fragments.linkers[0].search_pubchem()

We can also get the topology/net. Try it yourself! What is the observed net for UiO66?

In [None]:
fragments### complete here

Try to obtain a standardized name for UiO66, and its topology, by using the tools mentioned in section [4.](https://github.com/bmourino/ch359/blob/main/1-MOF101/README.md#4-other-tools) Other tools.

Compare the topologies obtained with moffragmentor and topcryst.

*Q$*: How relevant is it to know the topology? Can properties change drastically if we have similar building blocks, but different topologies? If so, what would you expect to change?

Finally, pymatgen is a very versatile tool for materials of different kinds.
It allows you to deal with defects, surfaces, slabs, get the primitive structure, and can even serve as an interface with some software for DFT calculations.

We will try to get some basic information with pymatgen, and the primitive cell. A primitive cell is a unit cell that contains *one* lattice point, and is therefore the smallest possible unit cell.
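The relation between a conventional and a primitive cell can be checked with a few lines of arithmetic. The sketch below assumes a face-centred cubic lattice and an arbitrary edge length, both chosen only for illustration; `cell_volume` is a hypothetical helper written for this example.

```python
def cell_volume(a, b, c):
    """Volume of the cell spanned by three lattice vectors: |a . (b x c)|."""
    cross = (
        b[1] * c[2] - b[2] * c[1],
        b[2] * c[0] - b[0] * c[2],
        b[0] * c[1] - b[1] * c[0],
    )
    return abs(a[0] * cross[0] + a[1] * cross[1] + a[2] * cross[2])


edge = 10.0  # arbitrary cubic edge length in angstrom, for illustration only
half = edge / 2

# Conventional face-centred cubic cell: a cube containing 4 lattice points.
fcc_conventional = [(edge, 0.0, 0.0), (0.0, edge, 0.0), (0.0, 0.0, edge)]

# Standard primitive vectors of the same lattice: exactly 1 lattice point.
fcc_primitive = [(0.0, half, half), (half, 0.0, half), (half, half, 0.0)]

v_conventional = cell_volume(*fcc_conventional)
v_primitive = cell_volume(*fcc_primitive)

print(v_conventional, v_primitive, v_conventional / v_primitive)
```

The volume ratio equals the number of lattice points in the conventional cell, so for this lattice type the primitive cell describes the same crystal with a quarter of the atoms.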

Feel free to explore different functionalities of pymatgen. You can get more information [here](https://pymatgen.org/introduction.html) and you can find tutorials [here](https://matgenb.materialsvirtuallab.org/).

In [13]:
pmg_structure = Structure.from_file(s)

Print the following information: composition, density and charge.

In [None]:
print(pmg_structure'###') ### complete here '###'
print('###') ### complete here '###'
print('###') ### complete here '###'

Get the primitive cell and visualize it (for example, you can save the structure and open it with your software of choice). Does anything change?

In [16]:
primitive = pmg_structure.get_primitive_structure()

Your turn: analyze structures 1-6 from scratch. Are they computation-ready? If not, clean them.
Finally, do the same for MOF-5 and save the structure for the future: we will use it for our DFT calculations in part 3.