# MatSE580 Guest Lecture 1
## Introduction

In this guest lecture we will cover:
1. [Manipulating and analyzing materials](#Manipulating-and-analyzing-materials) - using [pymatgen](https://github.com/materialsproject/pymatgen)
2. [Setting up a small NoSQL database on the cloud to synchronize decentralized processing](#Setting-up-MongoDB) - using [MongoDB Atlas](https://www.mongodb.com/atlas) Free Tier
3. [Interacting with the database](#pymongo) and visualizing the results - using [pymongo](https://github.com/mongodb/mongo-python-driver) library and [MongoDB Charts](https://www.mongodb.com/docs/charts/) service
4. [Installing machine learning (ML) tools](#pysipfenn-install) to predict stability of materials - using [pySIPFENN](https://pysipfenn.readthedocs.io/en/stable/)

## Setting everything up

In [2]:
!pip install pymatgen

Collecting pymatgen
  Downloading pymatgen-2023.10.11.tar.gz (7.3 MB)
     7.3/7.3 MB 22.9 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
Collecting matplotlib>=1.5 (from pymatgen)
  Obtaining dependency information for matplotlib>=1.5 from https://files.pythonhosted.org/packages/98/a7/3883b2bd4e5cff02bdb578eadf09910581220660257183145b6d2253e018/matplotlib-3.8.0-cp310-cp310-macosx_11_0_arm64.whl.metadata
  Downloading matplotlib-3.8.0-cp310-cp310-macosx_11_0_arm64.whl.metadata (5.8 kB)
Collecting monty>=3.0.2 (from pymatgen)
  Obtaining dependency information for monty>=3.0.2 from https://files.pythonhosted.org/

In [106]:
!pip install pysipfenn

Collecting pysipfenn
  Obtaining dependency information for pysipfenn from https://files.pythonhosted.org/packages/97/70/d3e58845aef72c328e24ef966f3c68adaf67cec6806bf01d77b3fabf1a85/pysipfenn-0.13.0-py3-none-any.whl.metadata
  Downloading pysipfenn-0.13.0-py3-none-any.whl.metadata (9.9 kB)
Collecting torch>=1.11.0 (from pysipfenn)
  Obtaining dependency information for torch>=1.11.0 from https://files.pythonhosted.org/packages/ab/6a/0debe1ec3c63b1fd7487ec7dd8fb1adf19898bef5a8dc151265d79ffd915/torch-2.1.0-cp310-none-macosx_11_0_arm64.whl.metadata
  Downloading torch-2.1.0-cp310-none-macosx_11_0_arm64.whl.metadata (24 kB)
Collecting onnx2torch>=1.5.2 (from pysipfenn)
  Obtaining dependency information for onnx2torch>=1.5.2 from https://files.pythonhosted.org/packages/72/92/70b5cc3658d8abf77c968bbcf191b52d5c85707ba0bcb6a4d5fc1bb08613/onnx2torch-1.5.12-py3-none-any.whl.metadata
  Downloading onnx2torch-1.5.12-py3-none-any.whl.metadata (22 kB)
Collecting onnx>=1.13.0 (from pysipfenn)
  Obta

## Manipulating and analyzing materials

To start working with atomic structures, often referred to as atomic configurations or simply materials, we need to be able to represent and manipulate them. One of the most powerful and mature tools for this is [pymatgen](https://github.com/materialsproject/pymatgen), which we just installed. The critical component of pymatgen is its library of representations of fundamental materials objects, such as `Structure` and `Molecule`, contained in the `pymatgen.core` module. Let's import it and create a simple cubic cell of Al with four atoms on face-centered cubic (FCC) sites, like we did in the DFTTK tutorial last week:

### Basics

In [38]:
from pymatgen.core import Structure

s = Structure(lattice=[[4.0384, 0, 0], [0, 4.0384, 0], [0, 0, 4.0384]],
              species=['Al', 'Al', 'Al', 'Al'],
              coords=[[0.0, 0.0, 0.0], [0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]])

Now `s` holds our initialized structure, and we can call `print` on it to see what it looks like:

In [39]:
print(s)

Full Formula (Al4)
Reduced Formula: Al
abc   :   4.038400   4.038400   4.038400
angles:  90.000000  90.000000  90.000000
pbc   :       True       True       True
Sites (4)
  #  SP      a    b    c
---  ----  ---  ---  ---
  0  Al    0    0    0
  1  Al    0    0.5  0.5
  2  Al    0.5  0    0.5
  3  Al    0.5  0.5  0


**Initialized** is a critical word here, because the `Structure` object is not just a collection of "numbers". It holds a lot of information we can access using the `Structure` object's attributes and methods. For example, the density of the material (in g/cm³) is immediately available:

In [40]:
s.density

2.721120664587368
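As a sanity check on that number, the density can be reproduced by hand from the cell volume and the atomic mass. The snippet below is a minimal sketch in plain Python, not pymatgen's implementation; the molar mass of Al and Avogadro's constant are standard values entered by hand, and the variable names are only illustrative.

```python
# Hand calculation of the density of the 4-atom Al cell defined above.
AVOGADRO = 6.02214076e23      # 1/mol (exact SI value)
AL_MOLAR_MASS = 26.9815385    # g/mol, standard atomic weight of Al (entered by hand)

a_angstrom = 4.0384           # cubic lattice parameter used for `s`
n_atoms = 4                   # atoms in the cell

volume_cm3 = (a_angstrom * 1e-8) ** 3           # 1 angstrom = 1e-8 cm
mass_g = n_atoms * AL_MOLAR_MASS / AVOGADRO     # mass of the atoms in one cell
density = mass_g / volume_cm3                   # g/cm^3

print(f"{density:.4f} g/cm^3")
```

The result should land at about 2.72 g/cm³, in line with the value reported by `s.density`.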

We can also "mutate" the object with a few intuitive methods like `apply_strain`:

In [41]:
s.apply_strain(0.1)

Structure Summary
Lattice
    abc : 4.442240000000001 4.442240000000001 4.442240000000001
 angles : 90.0 90.0 90.0
 volume : 87.66092623767148
      A : 4.442240000000001 0.0 0.0
      B : 0.0 4.442240000000001 0.0
      C : 0.0 0.0 4.442240000000001
    pbc : True True True
PeriodicSite: Al (0.0, 0.0, 0.0) [0.0, 0.0, 0.0]
PeriodicSite: Al (0.0, 2.221, 2.221) [0.0, 0.5, 0.5]
PeriodicSite: Al (2.221, 0.0, 2.221) [0.5, 0.0, 0.5]
PeriodicSite: Al (2.221, 2.221, 0.0) [0.5, 0.5, 0.0]
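Each `PeriodicSite` line above lists Cartesian coordinates in parentheses, followed by fractional coordinates in square brackets. The relation between the two can be sketched in plain Python as below; `frac_to_cart` is a hypothetical helper written for illustration (pymatgen handles this conversion internally), and the lattice vectors are taken as rows, matching the `A`, `B`, `C` lines of the output.

```python
def frac_to_cart(frac, lattice):
    """Cartesian position = sum over i of frac[i] * (i-th lattice vector)."""
    return [sum(frac[i] * lattice[i][k] for i in range(3)) for k in range(3)]

a = 4.0384 * 1.1  # lattice parameter after the 10% strain
lattice = [[a, 0.0, 0.0], [0.0, a, 0.0], [0.0, 0.0, a]]
frac_coords = [[0.0, 0.0, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]

cart_coords = [frac_to_cart(f, lattice) for f in frac_coords]
for f, c in zip(frac_coords, cart_coords):
    print(f, "->", [round(x, 3) for x in c])
```

Straining the cell changes the Cartesian positions (0.5 * 4.44224 Å rounds to the 2.221 shown above), while the fractional coordinates stay the same.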

Importantly, as you can see, `s` was printed out when we ran the command, as if `s.apply_strain` returned a modified `Structure` object. This is true! However, by default, pymatgen also strains the original object in place, as you can see by looking at the density of `s`:

In [42]:
s.density

2.0444182303436262
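The drop in density follows directly from the geometry: a strain of 0.1 applied equally along all three axes scales every lattice vector by 1.1, and therefore the volume by 1.1³. A short check in plain Python, using only the two density values printed above (the variable names are illustrative):

```python
strain = 0.1
density_before = 2.721120664587368    # s.density before apply_strain (from the output above)
density_after = 2.0444182303436262    # s.density after apply_strain (from the output above)

linear_scale = 1 + strain             # each lattice vector grows by this factor
volume_scale = linear_scale ** 3      # the volume grows by the cube of it
predicted_after = density_before / volume_scale  # same mass, larger volume

print(volume_scale, predicted_after)
```

The predicted value agrees with the density reported by pymatgen, which confirms that the original object itself was strained.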

This is a very convenient feature, but it can be dangerous if you are not careful and, for instance, try to generate 10 structures with increasing strains:

In [43]:
strainedList = [s.apply_strain(0.1 * i) for i in range(1, 11)]
for strained in strainedList[:2]:
    print(strained)

Full Formula (Al4)
Reduced Formula: Al
abc   : 297.826681 297.826681 297.826681
angles:  90.000000  90.000000  90.000000
pbc   :       True       True       True
Sites (4)
  #  SP      a    b    c
---  ----  ---  ---  ---
  0  Al    0    0    0
  1  Al    0    0.5  0.5
  2  Al    0.5  0    0.5
  3  Al    0.5  0.5  0
Full Formula (Al4)
Reduced Formula: Al
abc   : 297.826681 297.826681 297.826681
angles:  90.000000  90.000000  90.000000
pbc   :       True       True       True
Sites (4)
  #  SP      a    b    c
---  ----  ---  ---  ---
  0  Al    0    0    0
  1  Al    0    0.5  0.5
  2  Al    0.5  0    0.5
  3  Al    0.5  0.5  0


we end up with a single object, repeated 10 times, whose lattice parameter is about 67 times larger than before the loop (1.1 * 1.2 * ... * 2.0 ≈ 67.04), rather than 10 distinct structures. To avoid this, we can regenerate the original `s` and use the `copy` method to create a new object each time:

In [68]:
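# Recreate the unstrained structure, since `s` was modified in place above
s = Structure(lattice=[[4.0384, 0, 0], [0, 4.0384, 0], [0, 0, 4.0384]],
              species=['Al', 'Al', 'Al', 'Al'],
              coords=[[0.0, 0.0, 0.0], [0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]])

In [82]:
# s.copy() gives each strain its own independent Structure (i = 0 keeps the unstrained cell)
strainedList = [s.copy().apply_strain(0.1 * i) for i in range(0, 11)]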
for strained in strainedList[:2]:
    print(strained)

Full Formula (Al4)
Reduced Formula: Al
abc   :   4.038400   4.038400   4.038400
angles:  90.000000  90.000000  90.000000
pbc   :       True       True       True
Sites (4)
  #  SP      a    b    c
---  ----  ---  ---  ---
  0  Al    0    0    0
  1  Al    0    0.5  0.5
  2  Al    0.5  0    0.5
  3  Al    0.5  0.5  0
Full Formula (Al4)
Reduced Formula: Al
abc   :   4.442240   4.442240   4.442240
angles:  90.000000  90.000000  90.000000
pbc   :       True       True       True
Sites (4)
  #  SP      a    b    c
---  ----  ---  ---  ---
  0  Al    0    0    0
  1  Al    0    0.5  0.5
  2  Al    0.5  0    0.5
  3  Al    0.5  0.5  0
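The difference between the two loops can be reproduced without pymatgen. The sketch below uses a hypothetical `ToyCell` class that stores only a cubic lattice parameter and, like the behavior observed above, mutates itself in `apply_strain` and returns the same object; it is an illustration of the aliasing pitfall, not pymatgen code.

```python
import math

class ToyCell:
    """Hypothetical stand-in for a structure: stores only a cubic lattice parameter."""

    def __init__(self, a):
        self.a = a

    def apply_strain(self, strain):
        self.a *= 1 + strain  # modifies this object in place...
        return self           # ...and returns the very same object

    def copy(self):
        return ToyCell(self.a)  # independent object with the same lattice parameter

# Pitfall: every list entry is the same object, strained ten times over.
cell = ToyCell(4.44224)  # lattice parameter after the earlier 10% strain
aliased = [cell.apply_strain(0.1 * i) for i in range(1, 11)]
all_same_object = all(item is cell for item in aliased)
growth = math.prod(1 + 0.1 * i for i in range(1, 11))  # cumulative linear scaling

# Fix: strain a fresh copy each time, leaving the original untouched.
fresh = ToyCell(4.0384)
independent = [fresh.copy().apply_strain(0.1 * i) for i in range(0, 11)]

print(all_same_object, round(growth, 2), round(cell.a, 3))
print([round(item.a, 4) for item in independent[:3]], fresh.a)
```

In the first loop the strains compound, so the lattice parameter grows by the product of all ten factors (about 67), which reproduces the 297.83 Å seen earlier. In the second loop each entry is strained independently and the original stays at 4.0384 Å.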


And now everything works as expected! We can also easily modify the structure, for example by replacing one of the atoms with another:

In [83]:
s.replace(0, "Au")
print(s)

Full Formula (Al3 Au1)
Reduced Formula: Al3Au
abc   :   4.038400   4.038400   4.038400
angles:  90.000000  90.000000  90.000000
pbc   :       True       True       True
Sites (4)
  #  SP      a    b    c
---  ----  ---  ---  ---
  0  Au    0    0    0
  1  Al    0    0.5  0.5
  2  Al    0.5  0    0.5
  3  Al    0.5  0.5  0


or all of the atoms of a given element at once:

In [71]:
s.replace_species({"Al": "Ni"})

Structure Summary
Lattice
    abc : 4.0384 4.0384 4.0384
 angles : 90.0 90.0 90.0
 volume : 65.860951343104
      A : 4.0384 0.0 0.0
      B : 0.0 4.0384 0.0
      C : 0.0 0.0 4.0384
    pbc : True True True
PeriodicSite: Au (0.0, 0.0, 0.0) [0.0, 0.0, 0.0]
PeriodicSite: Ni (0.0, 2.019, 2.019) [0.0, 0.5, 0.5]
PeriodicSite: Ni (2.019, 0.0, 2.019) [0.5, 0.0, 0.5]
PeriodicSite: Ni (2.019, 2.019, 0.0) [0.5, 0.5, 0.0]

Lastly, with `Structure` objects, we also have access to lower-order primitives, such as `Composition`:

In [72]:
c = s.composition
c

Composition('Au1 Ni3')

which may look like a simple string but is actually a powerful object that can be used to do things like calculate the fraction of each element in the structure:

In [73]:
c.fractional_composition

Composition('Au0.25 Ni0.75')

including the weight fractions (I wrote this part of pymatgen 🙂):

In [75]:
c.to_weight_dict

{'Au': 0.5279943035775228, 'Ni': 0.47200569642247725}
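The conversion behind these numbers is simple: each atomic fraction is multiplied by the element's atomic mass and the results are normalized to sum to one. The sketch below does this by hand in plain Python; the atomic masses are standard values entered by hand and the dictionary names are illustrative, so this is a cross-check rather than pymatgen's own code.

```python
# Atomic masses in g/mol, standard values entered by hand for this check
ATOMIC_MASS = {"Au": 196.966569, "Ni": 58.6934}

# Atomic fractions from c.fractional_composition above
atomic_fraction = {"Au": 0.25, "Ni": 0.75}

# Mass contributed by each element per mole of atoms, then normalized
weighted = {el: x * ATOMIC_MASS[el] for el, x in atomic_fraction.items()}
total = sum(weighted.values())
weight_fraction = {el: w / total for el, w in weighted.items()}

print(weight_fraction)
```

Although Au makes up only a quarter of the atoms, it accounts for a little over half of the mass, matching the `to_weight_dict` output.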

### Symmetry Analysis

With some basics out of the way, let's look at some more advanced features of pymatgen that come from integration with third-party libraries like [spglib](https://spglib.readthedocs.io/en/latest/index.html), which is a high-performance library for symmetry analysis (1) written in C, (2) wrapped in Python by the authors, and finally (3) wrapped in pymatgen for convenience.

Such an approach introduces a lot of performance bottlenecks (4-20x slower and 50x higher RAM needs compared to my interface written in [Nim](https://nim-lang.org)), but it allows us to get started with symmetry analysis in just one line of code, where `SpacegroupAnalyzer` puts `s` in a new context:

In [76]:
from pymatgen.symmetry.analyzer import SpacegroupAnalyzer
spgA = SpacegroupAnalyzer(s)

Now many useful methods are available to us, allowing us to quickly get `crystal_system`, `space_group_symbol`, and `point_group_symbol`:

In [89]:
spgA.get_crystal_system()

'cubic'

In [90]:
spgA.get_space_group_symbol()

'Pm-3m'

In [91]:
spgA.get_point_group_symbol()

'm-3m'

We can also do some more advanced operations involving symmetry. For example, as some may have noticed, the `s` structure we created is primitive, but once its symmetry is identified, we can describe it with just 1 face-centered atom instead of 3, as they are symmetrically equivalent. We can do this with the `get_symmetrized_structure` method:

In [99]:
symmetrized = spgA.get_symmetrized_structure()
symmetrized

SymmetrizedStructure
Full Formula (Ni3 Au1)
Reduced Formula: Ni3Au
Spacegroup: Pm-3m (221)
abc   :   4.038400   4.038400   4.038400
angles:  90.000000  90.000000  90.000000
Sites (4)
  #  SP      a    b    c  Wyckoff
---  ----  ---  ---  ---  ---------
  0  Au      0  0    0    1a
  1  Ni      0  0.5  0.5  3c
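To see why the three Ni sites collapse into a single `3c` entry while Au sits alone on `1a`, we can apply the point operations of the cubic group by hand. The sketch below is a plain-Python illustration and not how spglib works internally: it builds the 48 signed permutation matrices of m-3m, ignores translations (reasonable here under the assumption that the origin sits on the Au atom, since Pm-3m is symmorphic), and uses hypothetical helper names.

```python
from itertools import permutations, product

def cubic_point_operations():
    """Build the 48 signed permutation matrices that make up the point group m-3m."""
    ops = []
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            ops.append([[signs[r] if c == perm[r] else 0 for c in range(3)]
                        for r in range(3)])
    return ops

def transform(op, frac):
    """Apply a 3x3 integer matrix to fractional coordinates and wrap them into [0, 1)."""
    moved = (sum(op[r][c] * frac[c] for c in range(3)) for r in range(3))
    return tuple(round(x, 6) % 1.0 for x in moved)

sites = [("Au", (0.0, 0.0, 0.0)), ("Ni", (0.0, 0.5, 0.5)),
         ("Ni", (0.5, 0.0, 0.5)), ("Ni", (0.5, 0.5, 0.0))]
ops = cubic_point_operations()

# Orbit of a site: every position it is mapped onto by some operation
au_orbit = {transform(op, sites[0][1]) for op in ops}
ni_orbit = {transform(op, sites[1][1]) for op in ops}

# Operations that map the whole decorated cell onto itself
original = set(sites)
preserved = [op for op in ops
             if {(sp, transform(op, xyz)) for sp, xyz in sites} == original]

print(len(ops), len(preserved), len(au_orbit), len(ni_orbit))
```

The orbit sizes correspond to the Wyckoff multiplicities in the output above: one position for Au and three equivalent positions for Ni.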

which we can then use to get the primitive or conventional structure back (here they are the same):

In [104]:
symmetrized.to_primitive()

Structure Summary
Lattice
    abc : 4.0384 4.0384 4.0384
 angles : 90.0 90.0 90.0
 volume : 65.860951343104
      A : 4.0384 0.0 2.472806816838336e-16
      B : -2.472806816838336e-16 4.0384 2.472806816838336e-16
      C : 0.0 0.0 4.0384
    pbc : True True True
PeriodicSite: Ni (-1.236e-16, 2.019, 2.019) [0.0, 0.5, 0.5]
PeriodicSite: Ni (2.019, 0.0, 2.019) [0.5, 0.0, 0.5]
PeriodicSite: Ni (2.019, 2.019, 2.473e-16) [0.5, 0.5, 0.0]
PeriodicSite: Au (0.0, 0.0, 0.0) [0.0, 0.0, 0.0]

In [105]:
symmetrized.to_conventional()

Structure Summary
Lattice
    abc : 4.0384 4.0384 4.0384
 angles : 90.0 90.0 90.0
 volume : 65.860951343104
      A : 4.0384 0.0 2.472806816838336e-16
      B : -2.472806816838336e-16 4.0384 2.472806816838336e-16
      C : 0.0 0.0 4.0384
    pbc : True True True
PeriodicSite: Ni (-1.236e-16, 2.019, 2.019) [0.0, 0.5, 0.5]
PeriodicSite: Ni (2.019, 0.0, 2.019) [0.5, 0.0, 0.5]
PeriodicSite: Ni (2.019, 2.019, 2.473e-16) [0.5, 0.5, 0.0]
PeriodicSite: Au (0.0, 0.0, 0.0) [0.0, 0.0, 0.0]

### More Complex Structures

Armed with all the basics, let's look at some more complex structures.

<p align="center">
  <img src="assets/112-Cr12Fe10Ni8.png" width="500"/>
</p>

## Setting up MongoDB

## Pymongo

## pySIPFENN Install