# pymatgen Usage Guide

**UsagePage**: http://pymatgen.org/usage.html;  
**Module index**: http://pymatgen.org/genindex.html;  
**API**: http://pymatgen.org/modules.html;  

## Overview

Example tutorials (Jupyter notebooks): https://github.com/materialsvirtuallab/matgenb/tree/master/notebooks    
pymatgen is object-oriented: almost everything is an object, for example Element, Site, and Structure.

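The core package **pymatgen.core** consists of:  
1. `core.periodic_table`:  
    * defines the Element object and the Specie object (an element with an oxidation state);
2. `core.lattice`:   
    * defines the Lattice object, which holds the three lattice vectors;  
    * converts between fractional and Cartesian coordinates;
3. `core.sites`:  
    * defines the Site object and the PeriodicSite object;  
    * a Site is a coordinate point occupied by a species; a PeriodicSite additionally carries a lattice;
4. `core.structure`:  
    * defines the Structure and Molecule objects;
    * both behave like lists of sites: a Structure holds PeriodicSite objects, a Molecule holds plain (non-periodic) Site objects;
5. `core.composition`:  
    * a Composition object maps elements to amounts;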
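
The fractional-to-Cartesian conversion handled by `core.lattice` can be illustrated without pymatgen. The block below is a minimal standard-library sketch, not pymatgen's implementation; the helper name `frac_to_cart` is hypothetical, and the lattice matrix is assumed to store one lattice vector per row.

```python
import math


def frac_to_cart(matrix, frac):
    """Cartesian point = sum over i of frac[i] * (i-th lattice vector).

    `matrix` is assumed to hold one lattice vector per row.
    """
    return [sum(frac[i] * matrix[i][k] for i in range(3)) for k in range(3)]


# Cubic lattice with a = 4.2, the same cell as the CsCl example in this guide.
cubic = [[4.2, 0.0, 0.0], [0.0, 4.2, 0.0], [0.0, 0.0, 4.2]]
cart = frac_to_cart(cubic, [0.5, 0.5, 0.5])

# A non-orthogonal (hexagonal) lattice with a = 3.0 and c = 5.0.
a, c = 3.0, 5.0
hexagonal = [[a, 0.0, 0.0], [-a / 2, a * math.sqrt(3) / 2, 0.0], [0.0, 0.0, c]]
hex_cart = frac_to_cart(hexagonal, [1 / 3, 2 / 3, 0.5])

print(cart)
print(hex_cart)
```

For the cubic cell the conversion is just a scaling by the lattice constant; the hexagonal case shows why the full matrix product is needed once the lattice vectors are not orthogonal.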

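pymatgen's documentation describes all quantities as being in "atomic units", by which it means angstroms for lengths and eV for energies; the objects themselves do not store or enforce units.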

## Side-note : as_dict / from_dict

In [85]:
import json, yaml
import pymatgen as mg

In [41]:
lattice = mg.Lattice.cubic(4.2)
structure = mg.Structure(lattice, ['Cs','Cl'], [[0,0,0],[0.5,0.5,0.5]])
%C lattice;; structure

          lattice          
---------------------------
Lattice                    
    abc : 4.2 4.2 4.2      
 angles : 90.0 90.0 90.0   
 volume : 74.08800000000001
      A : 4.2 0.0 0.0      
      B : 0.0 4.2 0.0      
      C : 0.0 0.0 4.2      

                            structure                             
------------------------------------------------------------------
Structure Summary                                                 
Lattice                                                           
    abc : 4.2 4.2 4.2                                             
 angles : 90.0 90.0 90.0                                          
 volume : 74.08800000000001                                       
      A : 4.2 0.0 0.0                                             
      B : 0.0 4.2 0.0                                             
      C : 0.0 0.0 4.2                                             
PeriodicSite: Cs (0.0000, 0.0000, 0.0000) [0.0000, 0.0000, 0.0000]
PeriodicSi

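### JSON and YAML

JSON and YAML are two text formats whose contents map directly onto Python dictionaries.

A crystal structure can be converted to and from a dictionary:
* **as_dict** method: returns the structure as a dictionary;  
* **from_dict** method: rebuilds the structure from a dictionary;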
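
The as_dict / from_dict convention is easy to reproduce on a small scale. The following sketch uses a hypothetical toy class, `Point`, that is not part of pymatgen; it only shows the round trip from object to dictionary to JSON text and back, under the assumption that every attribute is already JSON-serializable.

```python
import json


class Point:
    """Hypothetical toy class that follows the as_dict / from_dict convention."""

    def __init__(self, label, coords):
        self.label = label
        self.coords = list(coords)

    def as_dict(self):
        # Store the class name next to the data, similar in spirit to the
        # "@module" / "@class" keys visible in the structure.json output.
        return {
            "@class": self.__class__.__name__,
            "label": self.label,
            "coords": self.coords,
        }

    @classmethod
    def from_dict(cls, d):
        return cls(d["label"], d["coords"])


p = Point("Cs", [0.0, 0.0, 0.0])
text = json.dumps(p.as_dict())            # object -> dict -> JSON text
q = Point.from_dict(json.loads(text))     # JSON text -> dict -> object
print(text)
```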

Save a structure to a JSON file, then read it back:

In [42]:
with open('structure.json','w') as f:
    json.dump(structure.as_dict(), f)

In [43]:
br.showfile('structure.json',nhead=4)

structure.json:>>                                 
  1: {"@module": "pymatgen.core.structure", "@class": "Structure", "charge": null, "lattice": {"matrix": [[4.2, 0.0, 0.0], [0.0, 4.2, 0.0], [0.0, 0.0, 4.2]], "a": 4.2, "b": 4.2, "c": 4.2, "alpha": 90.0, "beta": 90.0, "gamma": 90.0, "volume": 74.08800000000001}, "sites": [{"species": [{"element": "Cs", "occu": 1}], "abc": [0.0, 0.0, 0.0], "xyz": [0.0, 0.0, 0.0], "label": "Cs"}, {"species": [{"element": "Cl", "occu": 1}], "abc": [0.5, 0.5, 0.5], "xyz": [2.1, 2.1, 2.1], "label": "Cl"}]}

In [44]:
with open('structure.json', 'r') as f:
    d = json.load(f)
    structure = mg.Structure.from_dict(d)

Save the structure to a YAML file and read it back:

In [45]:
with open('structure.yaml','w') as f:
    yaml.dump(structure.as_dict(), f)

In [46]:
with open('structure.yaml', 'r') as f:
    d = yaml.safe_load(f)  # yaml.load() requires an explicit Loader in recent PyYAML
    structure = mg.Structure.from_dict(d)

In [47]:
br.showfile('structure.yaml',nhead=4)

structure.yaml:>>                                 
  1: '@class': Structure
  2: '@module': pymatgen.core.structure
  3: charge: null
  4: lattice:


### MontyEncoder/Decoder

`MontyEncoder` wraps the structure-to-dictionary conversion, so a structure can be dumped directly to a JSON string:

In [48]:
json_string = json.dumps(structure, cls=mg.MontyEncoder)

`MontyDecoder` wraps the reverse conversion, so a JSON string can be loaded directly back into a structure:

In [49]:
structure = json.loads(json_string, cls=mg.MontyDecoder)
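
The idea behind such an encoder/decoder pair can be sketched with the standard `json` module alone. This is an illustration of the pattern, not Monty's code: `Site`, `REGISTRY`, `SketchEncoder`, and `sketch_object_hook` are hypothetical names, and decoding is done here with an `object_hook` function rather than a decoder class.

```python
import json


class Site:
    """Hypothetical toy class standing in for a serializable object."""

    def __init__(self, element, frac_coords):
        self.element = element
        self.frac_coords = list(frac_coords)

    def as_dict(self):
        return {"@class": "Site", "element": self.element,
                "frac_coords": self.frac_coords}

    @classmethod
    def from_dict(cls, d):
        return cls(d["element"], d["frac_coords"])


# Maps the "@class" marker back to the class that can rebuild the object.
REGISTRY = {"Site": Site}


class SketchEncoder(json.JSONEncoder):
    def default(self, o):
        # Called only for objects json cannot serialize by itself.
        if hasattr(o, "as_dict"):
            return o.as_dict()
        return super().default(o)


def sketch_object_hook(d):
    # Called for every decoded JSON object, innermost first.
    cls = REGISTRY.get(d.get("@class"))
    return cls.from_dict(d) if cls else d


json_string = json.dumps({"sites": [Site("Cl", [0.5, 0.5, 0.5])]}, cls=SketchEncoder)
restored = json.loads(json_string, object_hook=sketch_object_hook)
print(type(restored["sites"][0]).__name__)
```

Because the hook runs on nested objects before their parents, containers of serializable objects are rebuilt without any extra code.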

## Structures and Molecules

### Creating a Structure manually

Create a basic silicon crystal:

In [50]:
from pymatgen import Lattice, Structure, Molecule

In [51]:
coords = [[0, 0, 0], [0.75,0.5,0.75]]
lattice = Lattice.from_parameters(a=3.84, b=3.84, c=3.84, alpha=120,
                                  beta=90, gamma=60)
struct = Structure(lattice, ["Si", "Si"], coords)
print(struct)

Full Formula (Si2)
Reduced Formula: Si
abc   :   3.840000   3.840000   3.840000
angles: 120.000000  90.000000  60.000000
Sites (2)
  #  SP       a    b     c
---  ----  ----  ---  ----
  0  Si    0     0    0
  1  Si    0.75  0.5  0.75


In [52]:
coords = [[0.000000, 0.000000, 0.000000],
          [0.000000, 0.000000, 1.089000],
          [1.026719, 0.000000, -0.363000],
          [-0.513360, -0.889165, -0.363000],
          [-0.513360, 0.889165, -0.363000]]
methane = Molecule(["C", "H", "H", "H", "Fe2+"], coords)
print(methane)

Full Formula (Fe1 H3 C1)
Reduced Formula: FeH3C
Charge = 0, Spin Mult = 2
Sites (5)
0 C     0.000000     0.000000     0.000000
1 H     0.000000     0.000000     1.089000
2 H     1.026719     0.000000    -0.363000
3 H    -0.513360    -0.889165    -0.363000
4 Fe2+    -0.513360     0.889165    -0.363000


>Note that both elements and species (elements with oxidation states) are supported. So both “Fe” and “Fe2+” are valid specifications.
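
To make the difference between an element string and a species string concrete, here is a small parser sketch. It is not how pymatgen parses species; the function `parse_species` and its regular expression are hypothetical, and the accepted notation (symbol, optional magnitude, then a sign) is an assumption based on the "Fe2+" example.

```python
import re

# Element symbol, optional magnitude, optional sign, e.g. "Fe2+", "Cl-", "Fe".
_SPECIES_RE = re.compile(r"^([A-Z][a-z]?)(\d+(?:\.\d+)?)?([+-])?$")


def parse_species(text):
    """Return (symbol, oxidation_state); the state is None for a bare element."""
    match = _SPECIES_RE.match(text)
    if match is None:
        raise ValueError(f"not a species string: {text!r}")
    symbol, magnitude, sign = match.groups()
    if sign is None:
        if magnitude is not None:
            raise ValueError(f"magnitude without a sign: {text!r}")
        return symbol, None
    value = float(magnitude) if magnitude else 1.0
    return symbol, (value if sign == "+" else -value)


print(parse_species("Fe"))
print(parse_species("Fe2+"))
print(parse_species("O2-"))
```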

### Reading and writing Structures/Molecules

Read a POSCAR and write it as a CIF:  
The format is automatically guessed from the filename.

In [53]:
structure = Structure.from_file("POSCAR")
structure.to(filename="CsCl.cif")

br.showfile('POSCAR',nspec=[1,7,8,10])
br.showfile("CsCl.cif",nspec=[1,7,8,10])

POSCAR:>>                                         
  1: Cs1 Cl1
  7: 1 1
  8: direct
 10: 0.500000 0.500000 0.500000 Cl
CsCl.cif:>>                                       
  1: # generated using pymatgen
  7: _cell_angle_alpha   90.00000000
  8: _cell_angle_beta   90.00000000
 10: _symmetry_Int_Tables_number   1


Read an XYZ file and write it as a Gaussian input file (gjf):

In [54]:
methane = Molecule.from_file("methane.xyz")
methane.to(filename="methane.gjf")

br.showfile('methane.xyz',nspec=(1,3,10))
br.showfile('methane.gjf',nspec=(1,3,10))

methane.xyz:>>                                    
  1: 45
  3:  H    0.632633    0.632633    0.632633
 10:  H    0.632633    4.340895    4.340895
methane.gjf:>>                                    
  1: #P HF/6-31G(d) 
  3: H36 C9
 10: H 3 B4 4 A4 2 D4


For finer control over parsing, use the parser classes directly:

In [55]:
from pymatgen.io.cif import CifParser
parser = CifParser("CsCl.cif")
structure = parser.get_structures()[0]  # get_structures() returns a list

In [56]:
from pymatgen.io.vasp import Poscar
poscar = Poscar.from_file("POSCAR")
structure = poscar.structure

Convert a POSCAR file to CIF:  

In [57]:
from pymatgen.io.vasp import Poscar
from pymatgen.io.cif import CifWriter
p = Poscar.from_file('POSCAR')
w = CifWriter(p.structure)
w.write_file('mystructure.cif')

>**pymatgen.io.vasp.sets** provides a powerful way to generate complete sets of VASP input files from a Structure.   
http://pymatgen.org/pymatgen.io.vasp.sets.html#module-pymatgen.io.vasp.sets;

For molecules, pymatgen has built-in support for the XYZ and Gaussian formats:

In [58]:
from pymatgen.io.xyz import XYZ
from pymatgen.io.gaussian import GaussianInput

xyz = XYZ.from_file('methane.xyz')
gau = GaussianInput(xyz.molecule,
                    route_parameters={'SP': "", "SCF": "Tight"})
gau.write_file('methane.inp')

Through OpenBabel's Python bindings, pymatgen can support more than 100 additional file types.

### Things you can do with Structures

This section is a work in progress. Briefly, you can:  
* modify structures directly or, better, through `pymatgen.transformations` and `pymatgen.alchemy`;
* analyse structures, for example compute the Ewald sum with `pymatgen.analysis.ewald`, or compare two structures for similarity with `pymatgen.analysis.structure_matcher`.

Structure and Molecule are designed to be mutable.  
If you need guarantees of immutability for Structure/Molecule, you should use the IStructure and IMolecule classes instead.

#### Modifying Structures or Molecules

Change the species at site 1:

In [59]:
structure[1] = "F"
# molecule[1] = "F"

Change the species and the coordinates at the same time:

In [60]:
structure[1] = "Cl", [0.51, 0.51, 0.51]
# molecule[1] = "F", [1.34, 2, 3]

Structure and Molecule objects support the usual list-like operations:   
reverse, extend, pop, index, count.

In [61]:
structure.reverse()
# molecule.reverse()

In [62]:
structure.append("F", [0.9, 0.9, 0.9])
# molecule.append("F", [2.1, 3.2, 4.3])

In [63]:
structure

Structure Summary
Lattice
    abc : 4.2 4.2 4.2
 angles : 90.0 90.0 90.0
 volume : 74.08800000000001
      A : 4.2 0.0 0.0
      B : 0.0 4.2 0.0
      C : 0.0 0.0 4.2
PeriodicSite: Cl (2.1420, 2.1420, 2.1420) [0.5100, 0.5100, 0.5100]
PeriodicSite: Cs (0.0000, 0.0000, 0.0000) [0.0000, 0.0000, 0.0000]
PeriodicSite: F (3.7800, 3.7800, 3.7800) [0.9000, 0.9000, 0.9000]

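Some typical transformations

Make a supercell:  
`make_supercell` modifies the structure in place, so each run of the cell enlarges it again; with `[2, 2, 2]` every lattice vector doubles and the number of sites grows eightfold.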

In [64]:
# structure.make_supercell([2, 2, 2])
# structure
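
What a diagonal `[2, 2, 2]` scaling does to the list of sites can be sketched with the standard library. This is an illustration only: `supercell_frac_coords` is a hypothetical helper, it handles diagonal scalings only, and it leaves out the lattice itself, which a real supercell routine also has to rescale.

```python
from itertools import product


def supercell_frac_coords(frac_coords, scaling):
    """Replicate fractional coordinates into an na x nb x nc supercell.

    Each original site is copied into every translated cell, then the
    coordinates are renormalized to the enlarged lattice vectors.
    """
    na, nb, nc = scaling
    new_coords = []
    for i, j, k in product(range(na), range(nb), range(nc)):
        for a, b, c in frac_coords:
            new_coords.append([(a + i) / na, (b + j) / nb, (c + k) / nc])
    return new_coords


cscl = [[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]]
once = supercell_frac_coords(cscl, (2, 2, 2))
twice = supercell_frac_coords(once, (2, 2, 2))
print(len(cscl), len(once), len(twice))
```

Applying the scaling a second time to the already enlarged cell multiplies the site count by eight again, which mirrors what happens when an in-place supercell cell is re-run in a notebook.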

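Get the primitive cell of the structure (`get_primitive_structure()` returns a new Structure rather than modifying the original):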

In [67]:
# structure.get_primitive_structure()
# structure

## Entries - Basic analysis unit

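### Entries are the basic unit of analysis  
* An entry contains a calculated energy and a composition, and may also carry other input or calculated data;    
* The ComputedEntry and ComputedStructureEntry objects are defined in `pymatgen.entries.computed_entries`;  
* A ComputedEntry can be created manually from parsed calculation data or generated with the `pymatgen.apps.borg` package;  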


### Compatibility - Mixing GGA and GGA+U runs

In [69]:
%%pas
from pymatgen.entries.compatibility import MaterialsProjectCompatibility
from pymatgen.analysis.phase_diagram import PhaseDiagram, PDPlotter

# Get unprocessed_entries using pymatgen.apps.borg or other means.

# Process the entries for compatibility
compat = MaterialsProjectCompatibility()
processed_entries = compat.process_entries(unprocessed_entries)

# These few lines generate the phase diagram from the ComputedEntries.
pd = PhaseDiagram(processed_entries)
plotter = PDPlotter(pd)
plotter.show()

## pymatgen.borg - High-throughput data assimilation

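**borg** provides a convenient way to assimilate large amounts of data from a directory tree.  
It can turn an entire tree of VASP runs into usable entries, which can then be used to compute phase diagrams or for other analyses.

The framework has two parts:  
1. **Drone** objects 
    * defined in `pymatgen.apps.borg.hive`;
    * a drone parses one directory into a pymatgen object;
    * **VaspToComputedEntryDrone** defines how a directory containing a vasprun.xml file is converted into a ComputedEntry;
2. The **BorgQueen** object
    * defined in `pymatgen.apps.borg.queen`;
    * uses a drone to assimilate an entire subdirectory tree, in parallel where possible;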
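
The division of labour between a drone and the queen can be mimicked with the standard library: one function decides which directories are worth parsing, another turns a single directory into a record. The sketch below is not the borg implementation; `find_run_dirs` and `toy_drone` are hypothetical, the directory tree is created in a temporary folder, and no VASP output is actually parsed.

```python
import os
import tempfile


def find_run_dirs(rootpath, marker="vasprun.xml"):
    """Return, sorted, every directory under rootpath that contains the marker file."""
    return sorted(
        dirpath for dirpath, _dirs, files in os.walk(rootpath) if marker in files
    )


def toy_drone(path):
    """Hypothetical stand-in for a drone: turn one directory into one record."""
    return {"path": path, "name": os.path.basename(path)}


with tempfile.TemporaryDirectory() as root:
    for name in ("Li", "O2", "Li2O"):
        os.makedirs(os.path.join(root, name))
        # Empty placeholder file; a real run directory would hold VASP output.
        open(os.path.join(root, name, "vasprun.xml"), "w").close()
    os.makedirs(os.path.join(root, "notes"))  # no vasprun.xml, so it is skipped
    records = [toy_drone(p) for p in find_run_dirs(root)]

names = [r["name"] for r in records]
print(names)
```

Keeping the per-directory logic in the drone is what allows the traversal to be parallelized: each directory can be processed independently of the others.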

### Simple example - Making a phase diagram

The `Li-O_runs` directory contains separate calculations for `Li`, `O`, and the `Li-O` compounds;   
the Li-O phase diagram is computed from this directory tree.

In [81]:
%%pas
import pymatgen
from pymatgen.apps.borg.hive import VaspToComputedEntryDrone
from pymatgen.apps.borg.queen import BorgQueen
from pymatgen.analysis.phase_diagram import PhaseDiagram, PDPlotter

# Assimilate the data into ComputedEntries.
drone = VaspToComputedEntryDrone()
queen = BorgQueen(drone, "Li-O_runs", 2)
entries = queen.get_data()

# If assimilating the data takes a long time,
# save the assimilated entries to a JSON file;
# reloading from JSON is much faster than re-parsing the raw output files.
queen.save_data("Li-O_entries.json")

# Build the phase diagram from the ComputedEntries.
pd = PhaseDiagram(entries)
plotter = PDPlotter(pd)
plotter.show()

### Another example - Calculating reaction energies

In [82]:
%%pas
from pymatgen.apps.borg.hive import VaspToComputedEntryDrone
from pymatgen.apps.borg.queen import BorgQueen
from pymatgen.analysis.reaction_calculator import ComputedReaction

# Load the previously saved ComputedEntries.
drone = VaspToComputedEntryDrone()
queen = BorgQueen(drone)
queen.load_data("Li-O_entries.json")
entries = queen.get_data()

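# Select the relevant entries and compute the reaction energy.
# Lists are used because filter() returns a one-shot iterator in Python 3.
rcts = [e for e in entries if e.composition.reduced_formula in ["Li", "O2"]]
prods = [e for e in entries if e.composition.reduced_formula == "Li2O"]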
rxn = ComputedReaction(rcts, prods)
print(rxn)
print(rxn.calculated_reaction_energy)
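
The arithmetic behind a reaction energy is a coefficient-weighted sum of the entry energies. The sketch below works it through for 4 Li + O2 -> 2 Li2O using made-up energies, so the numbers are hypothetical and carry no physical meaning; the normalization that `ComputedReaction` applies to its coefficients may also differ from the one chosen here.

```python
# Hypothetical total energies per formula unit in eV (made-up numbers).
energies = {"Li": -1.9, "O2": -9.9, "Li2O": -14.3}

# Balanced reaction 4 Li + O2 -> 2 Li2O; reactants get negative coefficients.
coefficients = {"Li": -4, "O2": -1, "Li2O": 2}

# Element counts per formula unit, used to verify that the reaction balances.
compositions = {"Li": {"Li": 1}, "O2": {"O": 2}, "Li2O": {"Li": 2, "O": 1}}

balance = {}
for formula, coeff in coefficients.items():
    for element, amount in compositions[formula].items():
        balance[element] = balance.get(element, 0) + coeff * amount

# Reaction energy = sum(products) - sum(reactants), via the signed coefficients.
reaction_energy = sum(coeff * energies[f] for f, coeff in coefficients.items())

print(balance)
print(reaction_energy)
```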

## pymatgen.transformations

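Structure transformations:  
add or remove sites; substitute the species on sites;  
remove only a fraction of one species from a structure, selected by an electrostatic-energy criterion.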


In [94]:
from pymatgen.io.cif import CifParser
from pymatgen.transformations.standard_transformations import RemoveSpeciesTransformation

# Read the LiFePO4 structure from a CIF file.
parser = CifParser('LiFePO4.cif')
struct = parser.get_structures()[0]
# Create a species-removal transformation, then apply it to the structure.
t = RemoveSpeciesTransformation(["Li"])
modified_structure = t.apply_transformation(struct)

## pymatgen.alchemy - High-throughput transformations


Replace Fe with Mn and remove all Li in all structures:

In [None]:
from pymatgen.alchemy.transmuters import CifTransmuter
from pymatgen.transformations.standard_transformations import SubstitutionTransformation, RemoveSpeciesTransformation

trans = []
trans.append(SubstitutionTransformation({"Fe":"Mn"}))
trans.append(RemoveSpeciesTransformation(["Li"]))
transmuter = CifTransmuter.from_filenames(["MultiStructure.cif"], trans)
structures = transmuter.transformed_structures

## pymatgen.ext.matproj - Integration with the Materials Project REST API

The Materials API allows users to efficiently perform structure manipulation and analyses without going through **the web interface**.  
The **pymatgen.ext.matproj** module provides **MPRester**, a user-friendly high-level interface to the **Materials API** that returns pymatgen objects for further analysis. 

Before accessing the Materials Project, register an account and then generate an API key from that account's dashboard.

Get a structure by material id:

In [123]:
with mg.MPRester("USER_API_KEY") as m:

    # Structure for material id
    structure = m.get_structure_by_material_id("mp-1234")

    # Dos for material id
    dos = m.get_dos_by_material_id("mp-1234")

    # Bandstructure for material id
    bandstructure = m.get_bandstructure_by_material_id("mp-1234")

Query by chemical formula, which returns a list of entries:

In [144]:
data = m.get_data("Fe2O3")
print(len(data))  # number of entries
%col 2 list(data[0])  # keys of the dictionary returned for each entry

17
energy                   |energy_per_atom          |
volume                   |formation_energy_per_atom|
nsites                   |unit_cell_formula        |
pretty_formula           |is_hubbard               |
elements                 |nelements                |
e_above_hull             |hubbards                 |
is_compatible            |spacegroup               |
task_ids                 |band_gap                 |
density                  |icsd_id                  |
icsd_ids                 |cif                      |
total_magnetization      |material_id              |
oxide_type               |tags                     |
elasticity               |full_formula             |



The `prop` argument retrieves a single property for every entry:

In [145]:
# Allowed values for prop
%col 2 mg.MPRester.supported_task_properties

energy                   |energy_per_atom          |
volume                   |formation_energy_per_atom|
nsites                   |unit_cell_formula        |
pretty_formula           |is_hubbard               |
elements                 |nelements                |
e_above_hull             |hubbards                 |
is_compatible            |spacegroup               |
band_gap                 |density                  |
icsd_id                  |cif                      |



>https://materialsproject.org/docs/api

In [135]:
# Material id and energy of each entry
energies = m.get_data("Fe2O3",prop="energy")
energies[:2]

[{'material_id': 'mvc-12005', 'energy': -129.40518463},
 {'material_id': 'mp-777192', 'energy': -262.93588696}]

In [143]:
# Material id and space group of each entry
spacegroups = m.get_data("Fe2O3",prop='spacegroup')
spacegroups[:2]

[{'material_id': 'mvc-12005',
  'spacegroup': {'source': 'spglib',
   'symbol': 'Cmcm',
   'number': 63,
   'point_group': 'mmm',
   'crystal_system': 'orthorhombic',
   'hall': '-C 2c 2'}},
 {'material_id': 'mp-777192',
  'spacegroup': {'source': 'spglib',
   'symbol': 'P1',
   'number': 1,
   'point_group': '1',
   'crystal_system': 'triclinic',
   'hall': 'P 1'}}]

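MPRester also provides a method to fetch all entries in a chemical system.  
Combined with the borg framework, this lets you compare your own calculations with the data in the Materials Project.  

Determine the phase stability of a newly calculated material: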

In [152]:
%%pas
from pymatgen.ext.matproj import MPRester
from pymatgen.apps.borg.hive import VaspToComputedEntryDrone
from pymatgen.apps.borg.queen import BorgQueen
from pymatgen.entries.compatibility import MaterialsProjectCompatibility
from pymatgen.analysis.phase_diagram import PhaseDiagram, PDPlotter

# Convert the VASP calculation directories into ComputedEntry objects.
drone = VaspToComputedEntryDrone()
queen = BorgQueen(drone, rootpath=".")
entries = queen.get_data()

# Get all Li-Fe-O phases stored in the Materials Project database.
with MPRester("USER_API_KEY") as m:
    mp_entries = m.get_entries_in_chemsys(["Li", "Fe", "O"])

# Combine the Materials Project entries with the locally computed ones.
entries.extend(mp_entries)

# Process the entries with MaterialsProjectCompatibility.
compat = MaterialsProjectCompatibility()
entries = compat.process_entries(entries)

# Plot the Li-Fe-O phase diagram.
pd = PhaseDiagram(entries)
plotter = PDPlotter(pd)
plotter.show()

### The query method

MPRester's `query()` method offers a more flexible way to read from the Materials Project database;  
https://materialsproject.org/open

In [154]:
%%pas
from pymatgen.ext.matproj import MPRester

with MPRester("USER_API_KEY") as m:

    # Get all energies of materials with formula "*2O".
    results = m.query("*2O", ['energy'])

    # Get the formulas and energies of materials with materials_id mp-1234
    # or with formula FeO.
    results = m.query("FeO mp-1234", ['pretty_formula', 'energy'])

    # Get all compounds of the form ABO3
    results = m.query("**O3", ['pretty_formula', 'energy'])
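
The wildcard patterns in the comments above can be pictured as formula templates. The sketch below assumes that each `*` stands for exactly one element symbol, which is an interpretation of the "ABO3" comment and not a statement about how the Materials Project matches queries; `wildcard_to_regex` is a hypothetical helper that runs locally on formula strings.

```python
import re

# One element symbol: an uppercase letter optionally followed by a lowercase one.
ELEMENT = r"[A-Z][a-z]?"


def wildcard_to_regex(pattern):
    """Translate a pattern such as '**O3' into a regex (assumed: '*' = one element)."""
    body = "".join(ELEMENT if ch == "*" else re.escape(ch) for ch in pattern)
    return re.compile("^" + body + "$")


formulas = ["BaTiO3", "SrTiO3", "Fe2O3", "Li2O", "Na2O"]

abo3 = wildcard_to_regex("**O3")
x2o = wildcard_to_regex("*2O")

abo3_matches = [f for f in formulas if abo3.match(f)]
x2o_matches = [f for f in formulas if x2o.match(f)]
print(abo3_matches)
print(x2o_matches)
```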

### Setting the PMG_MAPI_KEY in the config file

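MPRester can look up the API key automatically in the configuration file, where it is stored as `PMG_MAPI_KEY`; once the key is saved there, MPRester can be called without passing it explicitly.

MAPI stands for Materials API.

The configuration file `C:\Users\tf..\.pmgrc.yaml` is only created after running the following command on the command line:   
$ pmg config --add PMG_MAPI_KEY 'mTWQ....'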