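Predict structures with pymatgen's built-in **structure_prediction** package.  
Collect all candidate structures (the chemical systems) through the MP-API, substitute the original species back in, filter out structures that duplicate those in MP, and output the new structures.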

In [1]:
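# Root-level imports work only on older pymatgen releases; newer ones expose
# MPRester in pymatgen.ext.matproj and Species / Element in pymatgen.core.
from pymatgen import MPRester
from pymatgen import Specie, Element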
from pymatgen.analysis.structure_prediction.substitutor import Substitutor 
from pymatgen.analysis.structure_prediction.substitution_probability import SubstitutionPredictor
from pymatgen.analysis.structure_matcher import StructureMatcher, ElementComparator
from pymatgen.transformations.standard_transformations import AutoOxiStateDecorationTransformation

In [2]:
mpr = MPRester()

In [3]:
threshold = 0.001 # minimum probability for a substitution to be considered
num_subs = 5     # number of substitution candidates to keep

## Find the most likely substitution species

**SubstitutionPredictor()**:  
Uses a [data-mined](https://pubs.acs.org/doi/10.1021/ic102031h) approach to find likely species substitutions.

In [4]:
original_species = [Specie('Y',3), Specie('Mn',3), Specie('O',-2)]
original_species

[Specie Y3+, Specie Mn3+, Specie O2-]

In [5]:
subs = SubstitutionPredictor(threshold=threshold).list_prediction(original_species) #len(subs)
subs[0]

{'probability': 0.0014141089949547632,
 'substitutions': {Specie Tb3+: Specie Y3+,
  Specie Fe3+: Specie Mn3+,
  Specie O2-: Specie O2-}}

> `FileNotFoundError`: No such file or directory: '..pymatgen\\analysis\\structure_prediction\\data\\lambda.json';  
Fix: copy the data files from the [pymatgen git repository](https://github.com/materialsproject/pymatgen/tree/master/pymatgen/analysis/structure_prediction) into the local installation directory.

In the data-mined approach, every candidate substitution carries a probability. Sort the predictions by that value, highest first, and keep the top `num_subs`:

In [6]:
subs.sort(key = lambda x: x['probability'], reverse = True)
subs = subs[0:num_subs]
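
The sort-and-truncate step can be tried without pymatgen. This is a minimal sketch, assuming each prediction is a dictionary with a `probability` key as in the output above. The `mock_subs` entries and their probabilities are made-up placeholders, and `heapq.nlargest` is shown as an alternative to sorting the whole list, not as what the notebook does:

```python
import heapq

# Hypothetical predictions: only the 'probability' key mirrors the real output.
mock_subs = [
    {"probability": 0.0011, "substitutions": "A"},
    {"probability": 0.0040, "substitutions": "B"},
    {"probability": 0.0025, "substitutions": "C"},
    {"probability": 0.0014, "substitutions": "D"},
]
num_subs = 2

# Pick the num_subs most probable entries without sorting the full list.
top = heapq.nlargest(num_subs, mock_subs, key=lambda x: x["probability"])

# The notebook's approach: sort in descending order, then slice.
by_sort = sorted(mock_subs, key=lambda x: x["probability"], reverse=True)[:num_subs]

print([s["substitutions"] for s in top])
```

For a handful of predictions the two approaches are interchangeable; `nlargest` mainly helps when the list is long and only a few entries are needed.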

Build a list that stores each substituted species combination:

In [7]:
trial_subs = [list(sub['substitutions'].keys()) for sub in subs]
trial_subs

[[Specie Y3+, Specie Mn3+, Specie O2-],
 [Specie Na+, Specie Mn3+, Specie O2-],
 [Specie Re5+, Specie Mn3+, Specie O2-],
 [Specie Y3+, Specie Fe3+, Specie O2-],
 [Specie Y3+, Specie Sc3+, Specie O2-]]

Build a list of the elements in each combination, then collect the unique chemical systems, with element symbols separated by dashes:

In [8]:
elem_sys_list = [[specie.element for specie in sub] for sub in trial_subs]
elem_sys_list

[[Element Y, Element Mn, Element O],
 [Element Na, Element Mn, Element O],
 [Element Re, Element Mn, Element O],
 [Element Y, Element Fe, Element O],
 [Element Y, Element Sc, Element O]]

In [9]:
chemsys_set = set()
for sys in elem_sys_list:
    chemsys_set.add("-".join(map(str,sys))) # turns [Element Y, Element Mn, Element O] into 'Y-Mn-O'
chemsys_set

{'Na-Mn-O', 'Re-Mn-O', 'Y-Fe-O', 'Y-Mn-O', 'Y-Sc-O'}
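
The two cells above can also be collapsed into a single set comprehension. This is a minimal sketch that uses plain element symbols instead of pymatgen `Element` objects. The first five rows of `symbol_lists` are copied from the output above, and the last row is a hypothetical duplicate added only to show that the set drops repeats:

```python
symbol_lists = [
    ["Y", "Mn", "O"],
    ["Na", "Mn", "O"],
    ["Re", "Mn", "O"],
    ["Y", "Fe", "O"],
    ["Y", "Sc", "O"],
    ["Y", "Mn", "O"],  # hypothetical duplicate entry
]

# Join each list of symbols with dashes; the set keeps one copy of each system.
chemsys_set = {"-".join(symbols) for symbols in symbol_lists}

print(sorted(chemsys_set))
```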

## Get all structures of the new chemical systems via the MP-API

Create a dictionary: {chemical system: structures};

In [10]:
all_structs = {}
for chemsys in chemsys_set:
    all_structs[chemsys] = mpr.get_structures(chemsys) # fetching five chemical systems should not take long

In [11]:
print(all_structs['Y-Fe-O'][0])

Full Formula (Y2 Fe4 O8)
Reduced Formula: Y(FeO2)2
abc   :   6.321493   6.321493   6.321493
angles:  60.267342  60.267348  60.267346
Sites (14)
  #  SP           a         b         c    magmom
---  ----  --------  --------  --------  --------
  0  Y     0.625014  0.625014  0.625014     0.059
  1  Y     0.374986  0.374986  0.374986     0.059
  2  Fe    0         0.5       0            4.026
  3  Fe    0         0         0            4.342
  4  Fe    0.5       0         0            4.02
  5  Fe    0         0         0.5          3.996
  6  O     0.207125  0.768584  0.768584     0.162
  7  O     0.231416  0.231416  0.792875     0.155
  8  O     0.231416  0.792875  0.231416     0.158
  9  O     0.238322  0.238322  0.238322     0.102
 10  O     0.768584  0.768584  0.207125     0.155
 11  O     0.792875  0.231416  0.231416     0.162
 12  O     0.768584  0.207125  0.768584     0.158
 13  O     0.761678  0.761678  0.761678     0.102


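Create an automatic oxidation-state decoration transformation (`AutoOxiStateDecorationTransformation`), which assigns an oxidation state to each site;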

In [12]:
auto_oxi = AutoOxiStateDecorationTransformation() 

Add oxidation states to the structures of each new chemical system:

In [13]:
all_structs.keys()

dict_keys(['Y-Fe-O', 'Y-Sc-O', 'Re-Mn-O', 'Na-Mn-O', 'Y-Mn-O'])

In [14]:
oxi_structs = {}
for chemsys in all_structs:
    oxi_structs[chemsys] = [] # list of decorated structures for this chemical system
    for num, struct in enumerate(all_structs[chemsys]): # a chemical system may contain several structures
        try:
            oxi_structs[chemsys].append({'structure': auto_oxi.apply_transformation(struct), 
                                         'id': str(chemsys + "_" + str(num))}) # len(oxi_structs) == 5
        except Exception:
            continue # if auto oxidation fails, try next structure            
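
Skipping failures silently makes it hard to tell how many structures were dropped. The sketch below keeps the same loop shape and adds a failure counter. `decorate` is a hypothetical stand-in for `auto_oxi.apply_transformation`, and the string "structures" are placeholders, so this illustrates only the bookkeeping, not pymatgen's behavior:

```python
def decorate(struct):
    """Hypothetical stand-in for auto_oxi.apply_transformation."""
    if struct.startswith("bad"):
        raise ValueError("could not assign oxidation states")
    return struct + "+oxi"

# Placeholder data: strings instead of pymatgen Structure objects.
all_structs = {"Y-Fe-O": ["s0", "bad1", "s2"], "Y-Sc-O": ["bad0"]}

oxi_structs = {}
failures = {}
for chemsys, structs in all_structs.items():
    oxi_structs[chemsys] = []
    failures[chemsys] = 0
    for num, struct in enumerate(structs):
        try:
            decorated = decorate(struct)
        except ValueError:
            failures[chemsys] += 1  # record the failure, then try the next structure
            continue
        oxi_structs[chemsys].append({"structure": decorated, "id": f"{chemsys}_{num}"})

print(failures)
```

Because `num` comes from `enumerate`, the ids of the surviving entries keep their original positions, so a gap in the numbering marks a skipped structure.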

In [15]:
print(oxi_structs['Y-Fe-O'][0])

{'structure': Structure Summary
Lattice
    abc : 6.16736763 6.167367296628219 11.84594428
 angles : 90.0 90.0 120.00000173446008
 volume : 390.2114063259548
      A : 6.16736763 0.0 0.0
      B : -3.08368381 5.34109666 0.0
      C : 0.0 0.0 11.84594428
PeriodicSite: Y3+ (3.0837, 1.7804, 2.7653) [0.6667, 0.3333, 0.2334]
PeriodicSite: Y3+ (-0.0000, 3.5607, 8.6882) [0.3333, 0.6667, 0.7334]
PeriodicSite: Y3+ (3.0837, 1.7804, 8.6882) [0.6667, 0.3333, 0.7334]
PeriodicSite: Y3+ (-0.0000, 3.5607, 2.7653) [0.3333, 0.6667, 0.2334]
PeriodicSite: Y3+ (0.0000, 0.0000, 9.1881) [0.0000, 0.0000, 0.7756]
PeriodicSite: Y3+ (0.0000, 0.0000, 3.2652) [0.0000, 0.0000, 0.2756]
PeriodicSite: Fe3+ (4.1074, 0.0000, 5.9339) [0.6660, 0.0000, 0.5009]
PeriodicSite: Fe3+ (2.0537, 3.5571, 0.0110) [0.6660, 0.6660, 0.0009]
PeriodicSite: Fe3+ (-1.0300, 1.7840, 0.0110) [0.0000, 0.3340, 0.0009]
PeriodicSite: Fe3+ (4.1137, 3.5571, 5.9339) [1.0000, 0.6660, 0.5009]
PeriodicSite: Fe3+ (1.0300, 1.7840, 5.9339) [0.3340, 0.3340

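## Substitute the original species to build new structures

Create a new dictionary `trans_structs`: {chemical system: predicted structures (obtained by substituting species directly into the source structures)};  
each predicted structure is a `TransformedStructure` object.

Create a species substitutor: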

In [17]:
sbr = Substitutor(threshold = threshold) 

In [18]:
trans_structs = {}
for chemsys in oxi_structs:
    trans_structs[chemsys] = sbr.pred_from_structures(original_species,oxi_structs[chemsys])

In [19]:
trans_structs

{'Y-Fe-O': [<pymatgen.alchemy.materials.TransformedStructure at 0x27a5603acf8>,
  <pymatgen.alchemy.materials.TransformedStructure at 0x27a5603ac50>,
  <pymatgen.alchemy.materials.TransformedStructure at 0x27a5604b4e0>,
  <pymatgen.alchemy.materials.TransformedStructure at 0x27a5603ada0>,
  <pymatgen.alchemy.materials.TransformedStructure at 0x27a56041390>,
  <pymatgen.alchemy.materials.TransformedStructure at 0x27a56054550>],
 'Y-Sc-O': [<pymatgen.alchemy.materials.TransformedStructure at 0x27a5609e208>,
  <pymatgen.alchemy.materials.TransformedStructure at 0x27a5609e390>,
  <pymatgen.alchemy.materials.TransformedStructure at 0x27a560a07b8>],
 'Re-Mn-O': [],
 'Na-Mn-O': [],
 'Y-Mn-O': []}

Use **StructureMatcher()** to create a structure matcher for filtering out duplicate structures:

In [20]:
sm = StructureMatcher(comparator=ElementComparator(),primitive_cell=False)

In [21]:
filtered_structs = {} # filtered structures, keyed by chemical system
seen_structs = [] # structures seen so far, regardless of chemical system

In [22]:
print("Number of entries BEFORE filtering: " + str(sum([len(sys) for sys in trans_structs.values()])))

Number of entries BEFORE filtering: 9


In [23]:
for chemsys in trans_structs:
    filtered_structs[chemsys] = []
    for struct in trans_structs[chemsys]:
        found = False
        for struct2 in seen_structs:
            if sm.fit(struct.final_structure, struct2.final_structure):
                found = True
                break
        if not found:
            filtered_structs[chemsys].append(struct)
            seen_structs.append(struct)

In [24]:
print("Number of entries AFTER filtering: " + str(sum([len(sys) for sys in filtered_structs.values()])))

Number of entries AFTER filtering: 6


Note:  
The filtered structures may change when the notebook is rerun.  
Because duplicates are filtered across chemical systems, a structure that appears in two systems is reported under only one of them. Which one depends on the order in which the filter loop visits the systems, and that order comes from iterating over a set (`chemsys_set`), which is not guaranteed to be the same between runs.
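
The order dependence can be reproduced with plain Python. In this sketch the "structures" are strings, and two of them count as duplicates when they share the label before the colon. Both the data and the `same_prototype` rule are hypothetical stand-ins for `sm.fit`. Iterating over `sorted(...)` keys is one possible way to make the result reproducible:

```python
def same_prototype(a, b):
    """Hypothetical equivalence test standing in for sm.fit."""
    return a.split(":")[0] == b.split(":")[0]

def first_seen_filter(structs_by_system, system_order):
    """Keep a structure only the first time an equivalent one is seen."""
    filtered, seen = {}, []
    for chemsys in system_order:
        filtered[chemsys] = []
        for struct in structs_by_system[chemsys]:
            if not any(same_prototype(struct, other) for other in seen):
                filtered[chemsys].append(struct)
                seen.append(struct)
    return filtered

# Placeholder data: the two "perovskite" entries are duplicates of each other.
data = {"Y-Fe-O": ["perovskite:1", "garnet:1"], "Y-Sc-O": ["perovskite:2"]}

forward = first_seen_filter(data, ["Y-Fe-O", "Y-Sc-O"])
backward = first_seen_filter(data, ["Y-Sc-O", "Y-Fe-O"])
stable = first_seen_filter(data, sorted(data))  # fixed visiting order

print(forward, backward, stable, sep="\n")
```

The total number of surviving structures is the same in every case; only the system that gets credited with the shared structure changes.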

Remove every structure that duplicates a known MP structure of the original system:

In [25]:
known_structs = mpr.get_structures("Y-Mn-O") # get all known MP structures for original system

In [26]:
final_filtered_structs = {}
print("Number of entries BEFORE filtering against MP: " + str(sum([len(sys) for sys in filtered_structs.values()])))

Number of entries BEFORE filtering against MP: 6


In [27]:
for chemsys in filtered_structs:
    final_filtered_structs[chemsys] = []
    for struct in filtered_structs[chemsys]:
        found = False
        for struct2 in known_structs:
            if sm.fit(struct.final_structure, struct2):
                found = True
                break 
        if not found:
            final_filtered_structs[chemsys].append(struct)                

In [28]:
print("Number of entries AFTER filtering against MP: " + str(sum([len(sys) for sys in final_filtered_structs.values()])))
print(final_filtered_structs)

Number of entries AFTER filtering against MP: 1
{'Y-Fe-O': [<pymatgen.alchemy.materials.TransformedStructure object at 0x0000027A56054550>], 'Y-Sc-O': [], 'Re-Mn-O': [], 'Na-Mn-O': [], 'Y-Mn-O': []}


Create final structure dictionary with StructureNL objects for each transformed structure (Note: this requires installation of pybtex):

In [29]:
final_structs = {}
for chemsys in final_filtered_structs:
    final_structs[chemsys] = [struct.to_snl([{"name":"Matthew McDermott", "email":"N/A"}]) 
                              for struct in final_filtered_structs[chemsys]]
with open('br_data/Y-Fe_O.txt', 'w') as f:
    print(final_structs['Y-Fe-O'][0].as_dict(), file=f) # Printing one of the StructureNL objects - this is a large dictionary

  warn('Data in TransformedStructure.other_parameters discarded '
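
Note that `print(..., file=f)` writes the Python `repr` of the dictionary, which is generally not valid JSON. If the file is meant to be read back later, `json.dump` is a common alternative. This is a minimal sketch, assuming the dictionary returned by `as_dict()` is JSON-serializable. `snl_dict` is a small hypothetical placeholder, not real StructureNL content, and the file is written to a temporary directory:

```python
import json
import os
import tempfile

# Hypothetical placeholder for the dictionary returned by as_dict().
snl_dict = {
    "about": {"authors": [{"name": "Matthew McDermott", "email": "N/A"}]},
    "lattice": {"a": 6.167, "b": 6.167, "c": 11.846},
}

path = os.path.join(tempfile.mkdtemp(), "Y-Fe-O.json")

# Write the dictionary as JSON so it can be parsed again later.
with open(path, "w") as f:
    json.dump(snl_dict, f, indent=2)

# Read it back and check that nothing was lost.
with open(path) as f:
    reloaded = json.load(f)

print(reloaded == snl_dict)
```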
