### Materials Project Workshop – August 8–10 2018, Berkeley, California
#### Link to notebook: [http://workshop.materialsproject.org/pymatgen/core/pymatgen_core.ipynb](http://workshop.materialsproject.org/pymatgen/core/pymatgen_core.ipynb)

![pymatgen logo](http://pymatgen.org/_images/pymatgen.png)

# 0. What is pymatgen?

Pymatgen (Python Materials Genomics) is the code that powers all of the scientific analysis behind the Materials Project. It includes robust and efficient libraries for handling crystallographic structures and molecules, in addition to various mathematical and scientific tools for handling and generating materials data.

## 0.0 Core functionality

Here are a few things you can do with pymatgen:

- Create, identify, and manipulate crystal structures and molecules
- Write input and output files for most electronic structure codes
- Analyze densities of states, band structures and X-ray diffraction spectra
- Perform tensor-based analysis, including elastic and piezoelectric tensors
- Characterize the local chemical environment of structural sites
- Create Pourbaix diagrams and phase diagrams
- Match crystal structures to each other and perform symmetry analysis
- Match substrates based on geometry and elastic behavior
- Create and manipulate surfaces
- Do unit conversions
- Get basic information about chemical identity
- Use a wide variety of other analysis tools, such as estimating the cost of a material from the abundance of its elements, or examining the geographical distribution of those elements


## 0.1 How do I install pymatgen?

For the workshop, pymatgen has been pre-installed for use in your Jupyter notebooks.

Otherwise, pymatgen can be installed via pip:

`pip install pymatgen`

or conda:

`conda install --channel matsci pymatgen`

We recommend using Python 3.6 or above. Until 2018, pymatgen was developed simultaneously for Python 2.x and 3.x, but following the rest of the Python community we are phasing out support for Python 2.x, and will be developing exclusively for Python 3.x from version 2019.1.1.


## 0.2 Where can I find help and how do I get involved?

* **For general help:** The [pymatgen Google Groups mailing list](https://groups.google.com/forum/#!forum/pymatgen) is a place to ask questions.

* **To report bugs:** The [GitHub Issues](https://github.com/materialsproject/pymatgen/issues) page is a good place to report bugs.

* **For Materials Project data and website discussions:** The Materials Project has its own community forum, [Materials Project Discussion](https://discuss.materialsproject.org).

* **For more example notebooks:** [matgenb](http://matgenb.materialsvirtuallab.org) is a new resource of Jupyter notebooks demonstrating various pymatgen functionality.

If you want specific new features, you're welcome to ask! We try to respond to community needs. If you're a developer and can add the feature yourself, we actively encourage you to do so by creating a Pull Request on GitHub with your additional functionality. To date, pymatgen has seen over 15,000 commits and over 100 contributors, and we try to have an inclusive and welcoming development community. All contributors are also individually acknowledged on [materialsproject.org/about](https://materialsproject.org/about).


# 1. Verify we have pymatgen installed

First, let's verify we have pymatgen installed. The following command should produce no error or warning:

In [None]:
import pymatgen

We can show the specific version of pymatgen installed:

In [None]:
print(pymatgen.__version__)

For a list of new features, bug fixes and other changes, consult the [changelog on pymatgen.org](http://pymatgen.org/change_log.html).

You can also see where pymatgen is installed on your computer:

In [None]:
print(pymatgen.__file__)

We can also see which version of the Python programming language we are using:

In [None]:
import sys
print(sys.version)

If you have problems or need to report bugs when using pymatgen after the workshop, the above information is often very useful to help us identify the problem.
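
As a convenience, the version details above can be gathered in one step. The following is a minimal sketch using only the Python standard library (it assumes Python 3.8 or above for `importlib.metadata`); the helper name `environment_report` is hypothetical and is not part of pymatgen.

```python
import platform
import sys
from importlib import metadata


def environment_report(package="pymatgen"):
    """Collect version details that are worth pasting into a bug report."""
    try:
        package_version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed in this Python environment
        package_version = "not installed"
    return {
        "package": package,
        "package_version": package_version,
        "python_version": sys.version.split()[0],
        "platform": platform.platform(),
    }


for key, value in environment_report().items():
    print(f"{key}: {value}")
```

Copying this output into a bug report saves a round of follow-up questions.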

# 2. Structures and Molecules

Most of the fundamentals of pymatgen are expressed in terms of [**`Structure`**](http://pymatgen.org/pymatgen.core.structure.html#pymatgen.core.structure.Structure) and [**`Molecule`**](http://pymatgen.org/pymatgen.core.structure.html#pymatgen.core.structure.Molecule) objects.

While we will mostly be using `Structure`, `Structure` and `Molecule` are very similar conceptually. The main difference is that `Structure` supports the full periodicity required to describe crystallographic structures.

Creating a `Structure` is very easy, and can be done in one line, even for complicated crystallographic structures. However, let's start by introducing the simpler `Molecule` object, and then use this understanding of `Molecule` to introduce `Structure`.


## 2.0 Creating a Molecule

Start by importing `Molecule`:

In [None]:
from pymatgen import Molecule

In a Jupyter notebook, you can show help for any Python object by clicking on the object and pressing **Shift+Tab**. This will give you a list of arguments and keyword arguments necessary to construct the object, as well as the documentation ('docstring') which gives more information on what each argument means.

Molecule takes input **arguments** `species` and `coords`, and input **keyword arguments** `charge`, `spin_multiplicity`, `validate_proximity` and `site_properties`.

Keyword arguments come with a default value (the value after the equals sign), and so keyword arguments are optional.

Arguments (without default values) are mandatory.
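
The distinction between mandatory arguments and optional keyword arguments can be checked programmatically. The sketch below uses a hypothetical stand-in function, `build_molecule`, with the same calling pattern as `Molecule`; the function and its default values are assumptions made for illustration, not pymatgen's own code.

```python
import inspect


def build_molecule(species, coords, charge=0, spin_multiplicity=None):
    """Hypothetical stand-in with the same calling pattern as Molecule."""
    return {"species": species, "coords": coords,
            "charge": charge, "spin_multiplicity": spin_multiplicity}


signature = inspect.signature(build_molecule)

# Parameters without a default value must always be supplied
mandatory = [name for name, p in signature.parameters.items()
             if p.default is inspect.Parameter.empty]

# Parameters with a default value may be omitted
optional = {name: p.default for name, p in signature.parameters.items()
            if p.default is not inspect.Parameter.empty}

print("mandatory:", mandatory)
print("optional:", optional)
```

The same `inspect.signature` call works on any Python callable, which is handy outside a notebook where **Shift+Tab** is not available.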

In [None]:
my_molecule = Molecule(['C','O'], [[0, 0, 0], [0, 0, 1.2]])

In [None]:
print(my_molecule)

## 2.1 What's in a Molecule? Introducing Sites, Elements and Species

You can access properties of the molecule, such as the Cartesian co-ordinates of its sites:

In [None]:
print(my_molecule.cart_coords)

or properties that are computed on-the-fly such as its centre of mass:

In [None]:
print(my_molecule.center_of_mass)
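
To make clear what "computed on-the-fly" involves, here is a minimal standard-library sketch of a center of mass calculation for the same CO geometry. The rounded atomic masses are an assumption of this sketch, so the result may differ from pymatgen's value in the last few digits.

```python
# Standard atomic masses in atomic mass units (rounded; an assumption of this sketch)
ATOMIC_MASS = {"C": 12.011, "O": 15.999}

species = ["C", "O"]
coords = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.2]]  # Cartesian co-ordinates in Å

total_mass = sum(ATOMIC_MASS[s] for s in species)

# Mass-weighted average position, evaluated separately along x, y and z
center_of_mass = [
    sum(ATOMIC_MASS[s] * xyz[axis] for s, xyz in zip(species, coords)) / total_mass
    for axis in range(3)
]
print(center_of_mass)
```

Because oxygen is heavier than carbon, the center of mass lies closer to the oxygen atom than to the midpoint of the bond.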

To see the full list of available properties and methods, press **Tab** after typing `my_molecule.` in your Jupyter notebook. There are methods used to modify the molecule, and these take additional argument(s), for example to add a charge to the molecule:

In [None]:
my_molecule.set_charge_and_spin(1)

In [None]:
print(my_molecule)

A molecule is essentially a list of `Site` objects. We can access these sites like we would a list in Python. For example, to obtain the total number of sites in the molecule:

In [None]:
len(my_molecule)

Or to access the first site (note that Python is a 0-indexed programming language, so the first site is site 0):

In [None]:
my_molecule[0]

Each site holds information on its position in space as well as what the site contains.

In [None]:
# as shorthand, we assign the first site of our molecule to a new variable, site0
site0 = my_molecule[0]

In [None]:
site0.coords

In [None]:
site0.specie

Here, the site holds the element C. In general, a site can hold either an [**`Element`**](http://pymatgen.org/pymatgen.core.periodic_table.html#pymatgen.core.periodic_table.Element), a [**`Specie`**](http://pymatgen.org/pymatgen.core.periodic_table.html#pymatgen.core.periodic_table.Specie) or a [**`Composition`**](http://pymatgen.org/pymatgen.core.composition.html#pymatgen.core.composition.Composition). Let's look at each of these in turn.

In [None]:
from pymatgen import Element, Specie, Composition

An `Element` is simply an element from the Periodic Table.

In [None]:
my_element = Element('C')

Elements have properties such as atomic mass, average ionic radius and more:

In [None]:
my_element.average_ionic_radius

A `Specie` can contain additional information, such as oxidation state:

In [None]:
Specie('O', oxidation_state=-2)

Or, for convenience:

In [None]:
Specie.from_string('O2-')
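
To illustrate what such a string encodes, the sketch below splits a label like `'O2-'` into an element symbol and a signed oxidation state. The function `parse_species` and its regular expression are hypothetical and only handle charged labels of this simple form; this is not how pymatgen itself parses species.

```python
import re

# Element symbol, optional magnitude, then a mandatory sign (e.g. 'O2-', 'Na+')
SPECIES_PATTERN = re.compile(r"^([A-Z][a-z]?)(\d+(?:\.\d+)?)?([+-])$")


def parse_species(label):
    """Split a label such as 'O2-' into (element symbol, oxidation state)."""
    match = SPECIES_PATTERN.match(label)
    if match is None:
        raise ValueError(f"Cannot parse species label: {label!r}")
    symbol, magnitude, sign = match.groups()
    # A missing magnitude means a charge of one, as in 'Na+' or 'Cl-'
    value = float(magnitude) if magnitude else 1.0
    return symbol, value if sign == "+" else -value


for label in ["O2-", "Na+", "Fe3+", "Cl-"]:
    print(label, parse_species(label))
```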

Finally, a `Composition` is an object that can hold certain amounts of different elements or species. This is most useful in a disordered `Structure`, and would rarely be used in a `Molecule`. For example, a site that holds 50% Au and 50% Cu would be set as follows:

In [None]:
Composition({'Au': 0.5, 'Cu': 0.5})

When we construct a `Molecule`, the input argument will automatically be converted into one of `Element`, `Specie` or `Composition`. Thus, in the previous example, the input `['C', 'O']` was converted to `[Element C, Element O]`.

## 2.2 Creating a Structure and Lattice

Creating a `Structure` is very similar to creating a `Molecule`, except we now also have to specify a `Lattice`. 

In [None]:
from pymatgen import Structure, Lattice

A `Lattice` can be created in one of several ways, such as by inputting a 3x3 matrix whose rows are the individual lattice vectors. For example, a cubic lattice with a lattice parameter of 5 Å:

In [None]:
my_lattice = Lattice([[5, 0, 0], [0, 5, 0], [0, 0, 5]])

In [None]:
my_lattice

Equivalently, we can create it from its lattice parameters:

In [None]:
my_lattice_2 = Lattice.from_parameters(5, 5, 5, 90, 90, 90)  # a, b, c, alpha, beta, gamma

Or, since we know that this lattice is cubic (a == b == c and alpha == beta == gamma == 90 degrees), we can simply write:

In [None]:
my_lattice_3 = Lattice.cubic(5)

In all cases, these lattices are the same:

In [None]:
my_lattice == my_lattice_2 == my_lattice_3

Now, we can create a crystal structure very easily. Let's start with body-centered-cubic iron:

In [None]:
bcc_fe = Structure(Lattice.cubic(2.8), ["Fe", "Fe"], [[0, 0, 0], [0.5, 0.5, 0.5]])

In [None]:
print(bcc_fe)

Creating this `Structure` was similar to creating a `Molecule`: we provided a list of elements and a list of positions. However, there are two key differences from `Molecule`. First, we had to include our `Lattice` object when creating the structure. Second, since we have a lattice, we can define the positions of our sites in *fractional co-ordinates* with respect to that lattice instead of Cartesian co-ordinates.

It's also possible to create an equivalent `Structure` using Cartesian co-ordinates:

In [None]:
bcc_fe_from_cart = Structure(Lattice.cubic(2.8), ["Fe", "Fe"], [[0, 0, 0], [1.4, 1.4, 1.4]],
                             coords_are_cartesian=True)

In [None]:
print(bcc_fe_from_cart)

We can check that both structures are equivalent:

In [None]:
bcc_fe == bcc_fe_from_cart
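
The relationship between the two co-ordinate systems is a single matrix product: each fractional co-ordinate scales the corresponding lattice vector. A minimal standard-library sketch, where the helper `frac_to_cart` is hypothetical and the lattice matrix stores one lattice vector per row:

```python
def frac_to_cart(frac, lattice_matrix):
    """Cartesian position = sum over i of frac[i] * (lattice vector i)."""
    return [
        sum(frac[i] * lattice_matrix[i][axis] for i in range(3))
        for axis in range(3)
    ]


# Cubic lattice with a = 2.8 Å, one lattice vector per row
cubic = [[2.8, 0.0, 0.0],
         [0.0, 2.8, 0.0],
         [0.0, 0.0, 2.8]]

print(frac_to_cart([0.5, 0.5, 0.5], cubic))  # body-centre site of BCC Fe
```

For a cubic cell this reduces to multiplying each fractional co-ordinate by the lattice parameter, but the same function also handles non-orthogonal lattices.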

As with `Molecule`, we can access properties of the structure, such as its volume:

In [None]:
bcc_fe.volume  # in Ångstroms^3
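
For reference, the volume of a unit cell is the absolute value of the scalar triple product of its lattice vectors, |a · (b × c)|. The sketch below evaluates this with the standard library only; `cell_volume` is a hypothetical helper, not a pymatgen function.

```python
def cell_volume(lattice_matrix):
    """Return |a . (b x c)| for a lattice given as three row vectors."""
    a, b, c = lattice_matrix
    b_cross_c = [
        b[1] * c[2] - b[2] * c[1],
        b[2] * c[0] - b[0] * c[2],
        b[0] * c[1] - b[1] * c[0],
    ]
    return abs(sum(a[i] * b_cross_c[i] for i in range(3)))


bcc_fe_lattice = [[2.8, 0.0, 0.0],
                  [0.0, 2.8, 0.0],
                  [0.0, 0.0, 2.8]]
print(cell_volume(bcc_fe_lattice))  # equals 2.8 cubed for this cubic cell, in Å^3
```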

## 2.3 Modifying a Structure

We can create a supercell by multiplying the structure by a number of repeats:

In [None]:
bcc_fe_repeated = bcc_fe*(2,2,2)

In [None]:
bcc_fe_repeated
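
Conceptually, a supercell repeats every site once per copy of the original cell and rescales the fractional co-ordinates to the enlarged lattice. The following is a minimal sketch of that bookkeeping under those assumptions; `make_supercell` is a hypothetical helper and only handles the site positions, not the lattice itself.

```python
from itertools import product


def make_supercell(frac_coords, repeats):
    """Repeat fractional co-ordinates over an na x nb x nc grid of cells."""
    na, nb, nc = repeats
    new_coords = []
    for site in frac_coords:
        for i, j, k in product(range(na), range(nb), range(nc)):
            # Shift the site into cell (i, j, k), then rescale so that the
            # result is fractional with respect to the enlarged lattice
            new_coords.append([(site[0] + i) / na,
                               (site[1] + j) / nb,
                               (site[2] + k) / nc])
    return new_coords


supercell = make_supercell([[0, 0, 0], [0.5, 0.5, 0.5]], (2, 2, 2))
print(len(supercell))  # 2 sites x 8 cells
```

The lattice vectors of the supercell are the original vectors multiplied by the same repeat counts.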

There are many methods to modify the structure, such as scaling the volume or substituting one species with another. There are also various *transformations* in pymatgen that can do more complicated structure manipulations, such as creating surfaces or grain boundaries, or creating ordered approximations of disordered structures.

## 2.4 Creating Structure from Spacegroups

Structures can also be created directly from their spacegroup:

In [None]:
bcc_fe = Structure.from_spacegroup("Im-3m", Lattice.cubic(2.8), ["Fe"], [[0, 0, 0]])
print(bcc_fe)

In [None]:
nacl = Structure.from_spacegroup("Fm-3m", Lattice.cubic(5.692), ["Na+", "Cl-"],
                                 [[0, 0, 0], [0.5, 0.5, 0.5]])
print(nacl)

## 2.5 Creating a Disordered Structure

Disordered structures are created using the syntax for compositions shown earlier. Here, we create a CuAu solid solution:

In [None]:
composition = {"Cu": 0.5, "Au":0.5}
cu_au = Structure.from_spacegroup("Fm-3m", Lattice.cubic(3.677), [composition], [[0, 0, 0]])
print(cu_au)

# 3. Input and Output

## 3.0 Input/output from other standard file formats

Pymatgen supports input and output for a wide range of codes and file formats:

* **Plane-wave DFT codes** including:
  * [VASP](https://www.vasp.at)
  * [Quantum ESPRESSO pwscf](https://www.quantum-espresso.org)
  * [ABINIT](https://www.abinit.org)
  * [exciting](http://exciting-code.org)
* **Quantum chemistry codes** including:
  * [Q-Chem](http://www.q-chem.com)
  * [Gaussian](http://gaussian.com)
  * [NWChem](http://www.nwchem-sw.org/index.php/Main_Page)
* **Visualization and standard file formats** including:
  * [CIF](https://www.iucr.org/resources/cif)
  * [XCrySDen](http://www.xcrysden.org)
  * [xyz](https://en.wikipedia.org/wiki/XYZ_file_format)
* **Many others, including ...**
    * [AiiDA](http://www.aiida.net)
    * [FEFF](http://feff.phys.washington.edu)
    * [ADF](https://www.scm.com/doc/ADF/index.html)
    * [LAMMPS](https://lammps.sandia.gov)
    * [Zeo++](http://www.zeoplusplus.org)
    * [Fiesta](http://perso.neel.cnrs.fr/xavier.blase/fiesta/)
    * [Phonopy](https://atztogo.github.io/phonopy/)
    * CSSR
    * xr
    * [ATAT (mcsqs)](https://www.brown.edu/Departments/Engineering/Labs/avdw/atat/)
    * [LOBSTER](http://www.cohp.de)
* **and also adaptors to use input/output routines from other codes** including:
  * [Atomic Simulation Environment (ASE)](https://wiki.fysik.dtu.dk/ase/)
  * [Open Babel](http://openbabel.org/wiki/Main_Page)

For example, let's import from a CIF file:

In [None]:
struct = Structure.from_file('Nb2O5.cif')
print(struct)

There is a lot of additional functionality in pymatgen for several of these codes, including automating the generation of sensible input *sets* (configuration files for various codes, including VASP).

## 3.1 Input/output within Materials Project codes

Most objects, like `Structure`, in Materials Project codes (pymatgen, and also [atomate](https://atomate.org), [custodian](https://pythonhosted.org/custodian/) and [FireWorks](https://materialsproject.github.io/fireworks/)) are "MSONable", named after "Monty JSON" from the [monty](http://guide.materialsvirtuallab.org/monty/) package. This means they can be easily converted to and from the JSON file format, which makes it easy to pass objects between the different codes, to store them in a database like [MongoDB](https://www.mongodb.com), or to save them to disk.

Generally, objects that are MSONable will have `.as_dict()` and `.from_dict()` methods. To save or load from disk, the helpful `dumpfn` and `loadfn` functions from Monty can be used. These allow you to save not just, for example, a single `Structure` object, but a list or dictionary of many kinds of objects in Materials Project codes.

In [None]:
from monty.serialization import dumpfn, loadfn

In [None]:
dumpfn(bcc_fe, "bcc_fe.json")
new_struct = loadfn("bcc_fe.json")
bandstructure = loadfn("li2o_bs.json")

In [None]:
bcc_fe.as_dict()
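
The `as_dict()` / `from_dict()` pattern is simple enough to sketch with the standard library. The class below, `LabelledSite`, is hypothetical and exists only to show the round trip through JSON; the `@module` and `@class` keys mimic the bookkeeping entries that Monty stores so that it knows which class to rebuild.

```python
import json


class LabelledSite:
    """Hypothetical class following the as_dict / from_dict convention."""

    def __init__(self, label, coords):
        self.label = label
        self.coords = list(coords)

    def as_dict(self):
        # Only JSON-friendly types (str, list, float) appear in the dictionary
        return {"@module": self.__class__.__module__,
                "@class": self.__class__.__name__,
                "label": self.label,
                "coords": self.coords}

    @classmethod
    def from_dict(cls, d):
        return cls(label=d["label"], coords=d["coords"])

    def __eq__(self, other):
        return isinstance(other, LabelledSite) and self.as_dict() == other.as_dict()


original = LabelledSite("Fe", [0.0, 0.0, 0.0])
text = json.dumps(original.as_dict())            # object -> dict -> JSON string
restored = LabelledSite.from_dict(json.loads(text))  # JSON string -> dict -> object
print(restored == original)
```

Because the intermediate representation is a plain dictionary, the same data can be written to a file, stored as a database document, or sent between programs.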

## 3.2 A Note on Visualizing Structures

The usual way to visualize your structure is to export it to a format like CIF and open it in a program like VESTA or CrystalMaker. For this workshop, we have a new tool (in beta) to make this easier. To view your structure:

In [None]:
from mp_workshop import get_viewer_link

In [None]:
get_viewer_link(struct)

# 4. Symmetry Analysis with SpacegroupAnalyzer and PointGroupAnalyzer

In addition to book-keeping of structures using `Structure` objects, pymatgen contains powerful tools for analyzing crystal symmetry and comparing structures. The `SpacegroupAnalyzer` object uses the spglib symmetry analysis library, which is written in C, to efficiently determine symmetry operations and crystal symmetry.
It can be used to get primitive and standardized conventional cell settings of structures.

These examples show how to get the primitive structure of BCC iron using `SpacegroupAnalyzer` and how to get the point group of the CO molecule created above using `PointGroupAnalyzer`.

In [None]:
from pymatgen.symmetry.analyzer import SpacegroupAnalyzer, PointGroupAnalyzer

In [None]:
sga = SpacegroupAnalyzer(bcc_fe)
prim = sga.get_primitive_standard_structure()
print(prim)  # note the primitive structure has only a single site

In [None]:
std = sga.get_conventional_standard_structure()  # whereas the conventional structure has two
print(std)

In [None]:
print("Crystal system:", sga.get_crystal_system())
print("Spacegroup symbol:", sga.get_space_group_symbol())

Similarly, we can use `PointGroupAnalyzer` to get the point group of a molecule:

In [None]:
pga = PointGroupAnalyzer(my_molecule)
print(pga.get_pointgroup())

# 5. Example of calculating X-ray Diffraction (XRD) Pattern

There are various tools in pymatgen that allow for the analysis and plotting of structural and electronic information.  The `XRDCalculator` is perhaps the most straightforward of these tools, since it only requires a `Structure` object.

In [None]:
from pymatgen.analysis.diffraction.xrd import XRDCalculator
xrdc = XRDCalculator()

In [None]:
xrdc.get_plot(nacl)  # plot the XRD pattern of NaCl
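
The peak positions in such a pattern follow from Bragg's law, λ = 2d sin θ. For a cubic lattice the plane spacing is d = a / sqrt(h² + k² + l²), so the 2θ angle of a reflection can be estimated by hand. The sketch below assumes Cu Kα radiation with a wavelength of about 1.54184 Å; the helper `two_theta_cubic` is hypothetical, and it gives positions only, not intensities or systematic absences.

```python
import math

CU_K_ALPHA = 1.54184  # wavelength in Å; an assumption of this sketch


def two_theta_cubic(a, hkl, wavelength=CU_K_ALPHA):
    """Return the Bragg angle 2-theta in degrees for plane hkl of a cubic cell."""
    h, k, l = hkl
    d_spacing = a / math.sqrt(h * h + k * k + l * l)
    # Bragg's law: wavelength = 2 * d * sin(theta)
    theta = math.asin(wavelength / (2 * d_spacing))
    return 2 * math.degrees(theta)


# NaCl with the lattice parameter used above (5.692 Å)
for hkl in [(1, 1, 1), (2, 0, 0), (2, 2, 0)]:
    print(hkl, round(two_theta_cubic(5.692, hkl), 2))
```

Comparing these angles with the labelled peaks in the plot is a quick sanity check on the lattice parameter.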

## 5.0 Matching an unknown structure to an XRD pattern

Suppose an experimentalist obtains an XRD pattern of a cathode material with composition Li$_x$S$_y$ but unknown crystal structure:

![LiS XRD](https://raw.githubusercontent.com/materialsproject/workshop-2017/master/pymatgen/core/LiS_XRD.png)

We can generate a series of XRD plots for different structures in the Li-S chemical system to find one that matches. Later, we will show how to obtain these structures from the Materials Project database, but for now let's load them from a file:

In [None]:
lis_structures = loadfn("li_s_structures.json")

In [None]:
xrdc.get_plot(lis_structures[0])  # let's examine each in turn, starting with the first (0)

In [None]:
for structure in lis_structures:
    xrdc.get_plot(structure)

# 6. Example: Creating a surface

Here, we show how to generate all of the low-index facets for BCC Fe.

In [None]:
from pymatgen.core.surface import generate_all_slabs

In [None]:
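# Miller indices up to 1, slabs at least 4 Å thick, separated by at least 10 Å of vacuum
slabs = generate_all_slabs(bcc_fe, max_index=1, min_slab_size=4, min_vacuum_size=10)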

In [None]:
first_slab = slabs[0]
print(first_slab)

In [None]:
for slab in slabs:
    print(slab.miller_index)
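
For a crystal with full cubic symmetry, such as BCC Fe, the symmetrically distinct low-index facets can be listed by hand: Miller indices that differ only by sign changes or permutations describe equivalent planes. The sketch below relies on that cubic-only assumption, and `cubic_miller_families` is a hypothetical helper; pymatgen instead uses the actual symmetry operations of the structure.

```python
from itertools import product
from math import gcd


def cubic_miller_families(max_index):
    """List distinct Miller index families, assuming full cubic symmetry."""
    families = set()
    for hkl in product(range(-max_index, max_index + 1), repeat=3):
        if hkl == (0, 0, 0):
            continue  # not a valid plane
        # Divide out common factors so that (2, 0, 0) is treated as (1, 0, 0)
        divisor = gcd(gcd(abs(hkl[0]), abs(hkl[1])), abs(hkl[2]))
        # Signs and order do not matter in a cubic crystal
        reduced = tuple(sorted((abs(i) // divisor for i in hkl), reverse=True))
        families.add(reduced)
    return sorted(families)


print(cubic_miller_families(1))
```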

# 7. Example: Manipulating Tensors

Here, we show how to fit a "noisy" tensor to a provided crystal structure:

In [None]:
import numpy as np
from pymatgen.analysis.elasticity.elastic import ElasticTensor

In [None]:
data = loadfn("sample_elastic.json")
print(data)

In [None]:
si_struct = data[0]
elastic_tensor = ElasticTensor.from_voigt(data[1])

In [None]:
print(np.array(data[1]))

In [None]:
print(elastic_tensor.fit_to_structure(si_struct).voigt.round(2))
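
One way to picture this kind of fit is as an average of the tensor over every symmetry operation of the crystal, which removes the components that the symmetry forbids. The sketch below applies that idea to a rank-2 tensor and the 48 operations of the full cubic point group. Both simplifications are assumptions made for brevity (an elastic tensor is rank 4), and the code illustrates the averaging idea rather than pymatgen's implementation.

```python
from itertools import permutations, product


def signed_permutation_matrices():
    """The 48 operations of the full cubic point group as 3x3 matrices."""
    ops = []
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            ops.append([[signs[row] if col == perm[row] else 0
                         for col in range(3)] for row in range(3)])
    return ops


def transform(tensor, op):
    """Rotate a rank-2 tensor: T'_ij = sum over k, l of R_ik R_jl T_kl."""
    return [[sum(op[i][k] * op[j][l] * tensor[k][l]
                 for k in range(3) for l in range(3))
             for j in range(3)] for i in range(3)]


def symmetrize(tensor, ops):
    """Average the tensor over all symmetry operations."""
    total = [[0.0] * 3 for _ in range(3)]
    for op in ops:
        rotated = transform(tensor, op)
        for i in range(3):
            for j in range(3):
                total[i][j] += rotated[i][j]
    return [[value / len(ops) for value in row] for row in total]


noisy = [[10.2, 0.3, -0.1],
         [0.2, 9.9, 0.4],
         [0.0, -0.3, 10.1]]
fitted = symmetrize(noisy, signed_permutation_matrices())
for row in fitted:
    print([round(value, 3) for value in row])
```

For cubic symmetry a rank-2 tensor must be a multiple of the identity, so the averaged result has equal diagonal entries and vanishing off-diagonal entries.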

# Summary

This notebook is intended to provide a short introduction to some of the functionality of pymatgen. We've examined the building blocks of pymatgen: the `Structure` and `Molecule` objects, and the `Lattice`, `Element`, `Specie` and `Composition` objects used to make them. We have also seen some simple examples of pymatgen's analysis capabilities.