# Notebook basics

The Jupyter notebook is an interactive Python environment that lets users enter and execute commands in "cells," much like the MATLAB interactive environment.  We use it primarily as a teaching tool, since live coding is much easier to follow in the notebook and Markdown documentation can be inserted inline.  Here, we cover a few basic features of the notebook environment and Python before we begin with any of the materials science.

To execute a cell, you can either click the "play" button in the toolbar, or press `shift+enter` while the cell is active.  For example:

In [1]:
print("Hello world!")

Hello world!


In [2]:
a = 3
b = 4
print(a + b)

7


In [4]:
list_of_numbers = [1, 2, 3]
list_sum = 0
# A basic for loop
for number in list_of_numbers:
    list_sum = list_sum + number

print("Sum of list is", list_sum)

Sum of list is 6


Note that `print` is a *function* whose arguments are enclosed in parentheses.  We'll be using a variety of functions in this lesson.  You can also import functions from libraries and modules you have installed.  For example, from the math library, you might want to import the cosine function:

In [5]:
from math import cos

You can find information about functions and modules via the following commands in the Jupyter notebook.  Use the built-in function `help` to print information about a function or object.  Appending a question mark to a name opens a documentation pane at the bottom of the screen.  The shortcut I use most often is `shift+tab`, pressed while the cursor is inside a function call's parentheses.

In [10]:
help(cos)
cos?
cos(5) # press shift+tab with the cursor inside the parentheses

Help on built-in function cos in module math:

cos(...)
    cos(x)
    
    Return the cosine of x (measured in radians).



0.2836621854632263
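The text that `help` displays comes from a function's docstring, so functions you write yourself can be documented the same way.  The following is a minimal sketch; `cube_volume` is a made-up function for illustration only.

```python
import inspect


def cube_volume(a):
    """Return the volume of a cube with edge length a."""
    return a ** 3


# help(cube_volume) would print this signature and docstring in a notebook
print(inspect.signature(cube_volume))
print(inspect.getdoc(cube_volume))
```

Writing a one-line docstring for each helper function means that `help`, the question mark, and `shift+tab` all have something useful to show.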

Python has a rich ecosystem of packages, of which pymatgen is one, that can be installed through both conda and the Python Package Index (via pip).  Some of the most popular include **numpy** (for matrix/vector numerical operations), **scipy** (for various scientific computing tools, including ODE solvers and numerical optimization), **matplotlib** (for plotting), and **pandas** (for dataframe-based analysis akin to R).  The following links might be useful if you're interested in learning more about Python or scientific computing with Python generally.

* [PyPI - the Python Package Index](https://pypi.python.org/pypi)
* [Scipy and Numpy documentation](https://docs.scipy.org/doc/)
* [Matplotlib examples](http://matplotlib.org/gallery.html)
* [A gallery of interesting notebooks](https://github.com/ipython/ipython/wiki/A-gallery-of-interesting-IPython-Notebooks)


# Pymatgen core functionality

Pymatgen is the code that powers all of the analysis that's used in the Materials Project.  It includes robust and efficient libraries for handling structures and molecules, in addition to various mathematical and scientific tools for handling and generating materials data.  Here are a few things you can do with pymatgen:

- Create, identify, and manipulate crystal structures and molecules
- Write input and output files for most electronic structure codes
- Analyze density of states, XRD spectra, and bandstructure data
- Perform tensor-based analysis, including elastic and piezoelectric tensors
- Analyze the local chemical environment of structural sites
- Create Pourbaix and phase diagrams
- Match substrates based on geometry and elastic behavior
- Create and manipulate surfaces
- Do unit conversions
- Get basic information about chemical identity
- Estimate the cost of a material based on chemical abundance

## Structures, sites, and lattices

Most of the fundamentals of pymatgen are expressed in the Structure and Lattice objects.  These objects contain data on the lattice parameters and the location of individual sites within lattices.  Let's start by importing those objects, along with the MPRester in case we want to find data online.

In [None]:
from pymatgen import Structure, Lattice, MPRester

The general lattice constructor takes a 3x3 array as its argument, which consists of the vectors that compose the unit cell.  There are also convenience constructors that allow you to construct lattices from lengths and angles, as well as from specific crystal systems with appropriate input parameters.

In [None]:
# Making lattices
lattice = Lattice([[2.8, 0, 0], [0, 2.8, 0], [0, 0, 2.8]])
lattice = Lattice.from_lengths_and_angles([2.8, 2.8, 2.8], [90, 90, 90])
lattice.cubic(2.8)

lattice.hexagonal(a = 2.8, c = 3.6)
lattice.rhombohedral(a = 2.8, alpha = 60)

# Getting lattice info
print("a =", lattice.a)
print("alpha =", lattice.alpha)
print("volume =", lattice.volume)
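As a rough illustration of what the lattice object reports, the sketch below derives lengths, angles, and volume directly from a 3x3 matrix of row vectors using only the standard library.  The helper name `lattice_parameters` is made up for this example and is not part of pymatgen.

```python
import math


def lattice_parameters(matrix):
    """Return (lengths, angles in degrees, volume) for three row vectors."""
    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    def cross(u, v):
        return [u[1] * v[2] - u[2] * v[1],
                u[2] * v[0] - u[0] * v[2],
                u[0] * v[1] - u[1] * v[0]]

    a, b, c = matrix
    lengths = [math.sqrt(dot(v, v)) for v in (a, b, c)]
    # Convention: alpha lies between b and c, beta between a and c,
    # gamma between a and b
    pairs = [(b, c, 1, 2), (a, c, 0, 2), (a, b, 0, 1)]
    angles = [math.degrees(math.acos(dot(u, v) / (lengths[i] * lengths[j])))
              for u, v, i, j in pairs]
    # The cell volume is the absolute value of the scalar triple product
    volume = abs(dot(a, cross(b, c)))
    return lengths, angles, volume


lengths, angles, volume = lattice_parameters(
    [[2.8, 0, 0], [0, 2.8, 0], [0, 0, 2.8]])
print(lengths, angles, volume)
```

For the cubic cell used above, this should give three equal lengths, three right angles, and a volume of about 2.8 cubed.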

Structure objects are lattices with the addition of contained species.  Structures are constructed from a lattice, a list of species, and a list of coordinates that correspond to each species.  Note that species in this list can contain occupancies (and sometimes must in order to use other tools!).  You can also create structures from space groups and from CIF files.

In [None]:
# Making structures
bcc_fe = Structure(lattice, ["Fe", "Fe"], [[0, 0, 0], [0.5, 0.5, 0.5]])
site0 = bcc_fe[0]
site0.coords
site0.species_string
site0.x



bcc_fe = Structure.from_spacegroup("Im-3m", Lattice.cubic(2.8), ["Fe"], [[0, 0, 0]])
print(bcc_fe)
nacl = Structure.from_spacegroup("Fm-3m", Lattice.cubic(5.692), ["Na+", "Cl-"],
                                 [[0, 0, 0], [0.5, 0.5, 0.5]])
big_structure = Structure.from_file("Nb2O5.cif")
big_structure.formula
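The coordinates passed to the structure constructor above are fractional, while `site0.coords` is Cartesian.  The conversion between the two is a weighted sum of the lattice vectors, sketched here in plain Python; `frac_to_cart` is a hypothetical helper, not a pymatgen function.

```python
def frac_to_cart(matrix, frac):
    """Cartesian position = f1*a + f2*b + f3*c, with lattice vectors as rows."""
    return [sum(frac[i] * matrix[i][k] for i in range(3)) for k in range(3)]


cubic = [[2.8, 0.0, 0.0], [0.0, 2.8, 0.0], [0.0, 0.0, 2.8]]

# The body-centered site of the BCC cell, in angstroms
body_center = frac_to_cart(cubic, [0.5, 0.5, 0.5])
print(body_center)
```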

Disordered structures can also be constructed using dictionaries that map each species on a site to its occupancy.

In [None]:
# Making disordered structures
specie = {"Cu0+": 0.5, "Au0+":0.5}
cu_au = Structure.from_spacegroup("Fm-3m", Lattice.cubic(3.677), [specie], [[0, 0, 0]])
print(cu_au)
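To make the bookkeeping behind a disordered site concrete, here is a small standard-library sketch that checks the occupancies on each site and sums them into an average cell composition.  Both function names are invented for this example, and the four-site count assumes the conventional fcc cell.

```python
def site_total_occupancy(species):
    """Sum the occupancies on one site and check they do not exceed 1."""
    total = sum(species.values())
    if total > 1.0 + 1e-8:
        raise ValueError("Occupancies on a site cannot sum to more than 1")
    return total


def average_composition(sites):
    """Add up occupancies over all sites to get the average composition."""
    composition = {}
    for species in sites:
        site_total_occupancy(species)
        for name, occupancy in species.items():
            composition[name] = composition.get(name, 0.0) + occupancy
    return composition


# The conventional fcc cell has four symmetry-equivalent sites
specie = {"Cu0+": 0.5, "Au0+": 0.5}
cell_composition = average_composition([specie] * 4)
print(cell_composition)
```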

You can also assign site properties flexibly.  Some site properties, like `selective_dynamics`, are used by other methods, such as writing a POSCAR file.

In [None]:
# Manipulating structures and assigning properties to sites
big_structure[0] = "V"
big_structure.formula
big_structure[0] = "Nb"

bcc_fe.append("C", [0.25, 0.25, 0.25])
bcc_fe.pop(-1)
bcc_fe.make_supercell([2, 2, 2])

# Selective dynamics flags: False freezes a site along that axis, True lets it relax
sd = []
for n in range(big_structure.num_sites):
    if big_structure[n].species_string == "Nb":
        sd.append([False, False, False])
    else:
        sd.append([True, True, True])
big_structure.add_site_property("selective_dynamics", sd)
big_structure.to(filename="POSCAR")

### Exercise 1: 

You're studying materials used in the chlor-alkali process for the production of Cl<sub>2</sub>.  Find your favorite oxide using the Materials Project REST interface (`MPRester`).  Replace each oxygen atom with chlorine.

In [None]:
mpr = MPRester()
# Potential solution:
structure = mpr.get_structures("BaNiO3")[0]
for n in range(structure.num_sites):
    if structure[n].species_string == 'O':
        structure[n] = 'Cl'

# Bonus solution: replace every O with Cl in a single call (use instead of the loop)
structure.replace_species({'O': 'Cl'})

## Transformations

Sometimes it's useful to store a transformation, like the one we've just used to replace species, and have it operate on various structures in a workflow.  Pymatgen has transformation objects which can be used to achieve this.  Transformations can be used to replace or modify sites, deform or rotate structures, or even create a set of orderings for disordered structures.

In [None]:
# Using transformations
from pymatgen.transformations.standard_transformations import SubstitutionTransformation, DeformStructureTransformation, \
OrderDisorderedStructureTransformation, RotationTransformation
structure = mpr.get_structures("BaNiO3")[0]
st = SubstitutionTransformation({"O":"F"})
new_structure = st.apply_transformation(structure)
old_structure = st.inverse.apply_transformation(new_structure)
old_structure == structure

# Order disorder
odst = OrderDisorderedStructureTransformation()
ss = odst.apply_transformation(cu_au)
len(ss)
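The pattern behind transformation objects (store the parameters once, apply them to many inputs, and expose an inverse) can be sketched without pymatgen.  The class below is a toy stand-in that operates on a plain list of species strings; it is not pymatgen's `SubstitutionTransformation`, and its inverse only makes sense when the mapping is one-to-one and the target species is not already present.

```python
class SimpleSubstitution:
    """Toy substitution transformation acting on a list of species names."""

    def __init__(self, species_map):
        self.species_map = dict(species_map)

    def apply_transformation(self, species):
        # Return a new list; the input is left unchanged
        return [self.species_map.get(s, s) for s in species]

    @property
    def inverse(self):
        # Swap keys and values to undo the substitution
        return SimpleSubstitution({v: k for k, v in self.species_map.items()})


st = SimpleSubstitution({"O": "F"})
original = ["Ba", "Ni", "O", "O", "O"]
fluorinated = st.apply_transformation(original)
restored = st.inverse.apply_transformation(fluorinated)
print(fluorinated, restored == original)
```

Because the transformation returns a new object rather than editing its input, the same instance can be reused safely across every structure in a workflow.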

---------
## SpacegroupAnalyzer and StructureMatcher

In addition to bookkeeping of structures using Structure objects, pymatgen contains powerful tools for analyzing crystal symmetry and comparing structures.  The SpacegroupAnalyzer object is essentially a wrapper around spglib, which is written in C for more efficient determination of invariant symmetry operations and thus crystal symmetry.  The analyzer can be used to get primitive and standardized conventional cell settings of structures.

In [None]:
# Get primitive structure of BCC iron
from pymatgen.symmetry.analyzer import SpacegroupAnalyzer
sga = SpacegroupAnalyzer(bcc_fe)
prim = sga.get_primitive_standard_structure() # Note only a single site
sga.get_conventional_standard_structure()
sga.get_crystal_system()
sga.get_spacegroup_symbol()

The StructureMatcher object allows you to check whether a "fit" between two structures can be achieved.

In [None]:
from pymatgen.analysis.structure_matcher import StructureMatcher
sm = StructureMatcher()
sm.fit(bcc_fe, prim)

### Exercise 2:

A collaborator is interested in testing materials with high piezoelectric moduli.  You look at the Materials Project and find that the highest piezoelectric response corresponds to Pr<sub>3</sub>NF<sub>6</sub> (mp-33319).  Use the structure matcher to find similar structures with the same anonymous formula.

In [None]:
# Could limit structures to the same number of sites, just to make it go faster,
# e.g. by adding "nsites": {"$lte": 30} to the criteria
query = mpr.query({"anonymous_formula": {"A": 1, "B": 3, "C": 6}},
                  {"structure": 1})

In [None]:
structures = [q["structure"] for q in query]
pr3nf6 = mpr.get_structure_by_material_id("mp-33319")
sm = StructureMatcher()

matches = []

# Keep every structure that matches Pr3NF6 once species labels are ignored
for structure in structures:
    if sm.fit_anonymous(structure, pr3nf6):
        matches.append(structure)
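The anonymous formula used in the query simply replaces element names with letters, ordered by amount.  A minimal sketch of that idea for integer compositions follows; `anonymous_formula` is a hypothetical helper written for this example, and the order in which it labels elements with equal amounts is arbitrary.

```python
from functools import reduce
from math import gcd
from string import ascii_uppercase


def anonymous_formula(composition):
    """Map integer element amounts to letters A, B, C... in increasing order."""
    amounts = sorted(composition.values())
    # Reduce to the smallest integer ratio, e.g. 2:2:6 becomes 1:1:3
    divisor = reduce(gcd, amounts)
    return {letter: amount // divisor
            for letter, amount in zip(ascii_uppercase, amounts)}


print(anonymous_formula({"Pr": 3, "N": 1, "F": 6}))
print(anonymous_formula({"Ba": 2, "Ni": 2, "O": 6}))
```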

--------
## XRD, Bandstructure, and Density of States

Pymatgen has various tools that allow for the analysis and plotting of structural and electronic information.  The XRDCalculator is perhaps the most straightforward of these tools, since it only requires a structure object.

In [None]:
from pymatgen.analysis.diffraction.xrd import XRDCalculator, ATOMIC_SCATTERING_PARAMS
XRDCalculator.AVAILABLE_RADIATION
xrdc = XRDCalculator()
big_xrd = xrdc.get_xrd_data(big_structure)

In [None]:
%matplotlib inline
#xrdc.show_xrd_plot(nacl)
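Peak positions in a calculated pattern follow from Bragg's law.  As a hedged illustration for a cubic cell, the sketch below computes 2θ for a few reflections; it does not compute intensities, the Cu Kα wavelength is an assumed value, and `two_theta_cubic` is a name invented here rather than part of pymatgen.

```python
import math

CU_KA_WAVELENGTH = 1.54184  # angstroms; assumed Cu K-alpha value


def two_theta_cubic(a, hkl, wavelength=CU_KA_WAVELENGTH):
    """Return 2-theta in degrees for plane (h, k, l) of a cubic cell, or None."""
    h, k, l = hkl
    # Interplanar spacing for a cubic lattice with edge length a
    d = a / math.sqrt(h * h + k * k + l * l)
    sin_theta = wavelength / (2 * d)
    if sin_theta > 1:
        return None  # reflection not reachable at this wavelength
    return 2 * math.degrees(math.asin(sin_theta))


# For BCC, only reflections with h + k + l even are allowed
for hkl in [(1, 1, 0), (2, 0, 0), (2, 1, 1)]:
    print(hkl, round(two_theta_cubic(2.8, hkl), 2))
```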

### Exercise 3: XRD spectra example

Your experimental collaborator finds an interesting Li-S cathode and performs powder XRD on it, resulting in the spectrum below.  Identify the structure using the pymatgen XRD calculator.
![title](LiS_XRD.png)


In [None]:
lis_structures = mpr.get_structures("Li-S")
for lis_structure in lis_structures:
    data = xrdc.get_xrd_data(lis_structure, two_theta_range=[10, 80])
    # Print candidates whose calculated pattern has 8 peaks in this range
    if len(data) == 8:
        print(lis_structure)
    #xrdc.show_xrd_plot(lis_structure)

### Exercise 4:

Plot the bandstructure of TiO<sub>2</sub>.  Plot the p- and d-projected electronic DOS of rutile TiO<sub>2</sub> (mp-2657) on O and Ti, respectively.

In [None]:
from pymatgen.electronic_structure.plotter import BSPlotter, DosPlotter
from pymatgen.electronic_structure.core import OrbitalType

bs = mpr.get_bandstructure_by_material_id("mp-2657")
print(bs.get_band_gap())
plotter=BSPlotter(bs)
#plotter.get_plot().show()
#plotter.plot_brillouin()

dos = mpr.get_dos_by_material_id("mp-2657")
dp = DosPlotter()
dos_ti = dos.get_element_spd_dos("Ti")
dos_o = dos.get_element_spd_dos("O")
dp.add_dos("O p-states", dos_o[OrbitalType.p])
dp.add_dos("Ti d-states", dos_ti[OrbitalType.d])
#dp.get_plot().show()

## Tensors

### Exercise 5:

Fit a "noisy" tensor to a particular crystal structure

In [None]:
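import json
from pymatgen.analysis.elasticity.elastic import ElasticTensor

# sample_elastic.json holds a structure dict followed by a noisy Voigt-notation tensor
with open("sample_elastic.json") as f:
    data = json.load(f)
si_struct = Structure.from_dict(data[0])
et = ElasticTensor.from_voigt(data[1])

print(data[1])
# Symmetrize the tensor according to the symmetry of the structure
et.fit_to_structure(si_struct).voigt.round(2)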
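To give a feel for what fitting a noisy tensor to a structure means, the sketch below handles only the simplest case: a cubic crystal whose axes are aligned with the Cartesian axes, where symmetry leaves three independent constants (C11, C12, C44) in the 6x6 Voigt matrix.  This is a simplified stand-in and not how `fit_to_structure` is implemented; the function name and the numbers in `noisy` are made up for illustration.

```python
def symmetrize_cubic(voigt):
    """Project a 6x6 Voigt matrix onto cubic form by averaging its entries."""
    c11 = sum(voigt[i][i] for i in range(3)) / 3
    c12 = sum(voigt[i][j] for i in range(3) for j in range(3) if i != j) / 6
    c44 = sum(voigt[i][i] for i in range(3, 6)) / 3
    result = [[0.0] * 6 for _ in range(6)]
    for i in range(3):
        for j in range(3):
            result[i][j] = c11 if i == j else c12
    for i in range(3, 6):
        result[i][i] = c44
    return result


# Illustrative values only (GPa), with small noise added by hand
noisy = [
    [165.0, 63.0, 65.0, 1.0, 0.0, -1.0],
    [64.0, 167.0, 64.0, 0.0, 0.5, 0.0],
    [66.0, 62.0, 166.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 79.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 80.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 81.0],
]
fitted = symmetrize_cubic(noisy)
for row in fitted:
    print(row)
```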

## Surfaces

### Generate all of the low-index facets for BCC Fe

In [None]:
from pymatgen.symmetry.analyzer import SpacegroupAnalyzer
from pymatgen.core.surface import generate_all_slabs
lattice = Lattice.cubic(2.85)
structure = Structure(lattice, ["Fe", "Fe"],
                         [[0, 0, 0], [0.5, 0.5, 0.5]])

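# Arguments: maximum Miller index, minimum slab size, minimum vacuum size
slabs = generate_all_slabs(structure, max_index=1, min_slab_size=4, min_vacuum_size=10)
first_slab = slabs[0]
first_slab.miller_index
for slab in slabs:
    print(slab.miller_index)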
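For a cubic crystal with full cubic point symmetry, Miller indices that differ only by permutation or sign describe equivalent facets, which is why so few distinct low-index surfaces exist.  The sketch below enumerates those families in plain Python; `distinct_cubic_miller_indices` is a hypothetical helper, and the actual slab generator may return more than one slab per index when several terminations are possible.

```python
from functools import reduce
from itertools import product
from math import gcd


def distinct_cubic_miller_indices(max_index):
    """List symmetry-distinct (h, k, l) families for a cubic crystal."""
    seen = set()
    for hkl in product(range(-max_index, max_index + 1), repeat=3):
        if hkl == (0, 0, 0):
            continue
        # Skip multiples such as (2, 0, 0), which repeat (1, 0, 0)
        if reduce(gcd, (abs(i) for i in hkl)) != 1:
            continue
        # Permutations and sign changes are equivalent under cubic symmetry
        seen.add(tuple(sorted((abs(i) for i in hkl), reverse=True)))
    return sorted(seen)


print(distinct_cubic_miller_indices(1))
```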