<a href="https://colab.research.google.com/github/mipypf/scej-mi/blob/develop/chapter11/scej_mi_chapter11_example_feature_matminer.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Preparing Google Colab

#### Click "Connect" in the upper right to connect to a runtime

In [1]:
# Set to True when running on Google Colab, False otherwise
colab = True

In [2]:
# On Colab: open the Files pane, then drag and drop SrTiO3.cif to upload it
if colab:
  INPUT_FILE_PATH = "./"
  OUTPUT_FILE_PATH = "./"
else:
  INPUT_FILE_PATH = "../input/"
  OUTPUT_FILE_PATH = "../output/"
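
Because the CIF section later reads `SrTiO3.cif` from the input directory, it can help to confirm the upload succeeded before going further. The following is a minimal sketch, not part of the original notebook; `input_file_exists` is a hypothetical helper name, and the `"./"` argument simply mirrors the Colab setting above.

```python
from pathlib import Path


# Hypothetical helper (not part of the notebook): report whether the
# expected input file is present in the given input directory.
def input_file_exists(input_dir: str, filename: str = "SrTiO3.cif") -> bool:
    return (Path(input_dir) / filename).is_file()


# "./" mirrors the Colab setting above; "../input/" would be the local one.
if not input_file_exists("./"):
    print("SrTiO3.cif not found; upload it before running the CIF section.")
```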

#### Installing the library

In [3]:
! pip install matminer==0.9.3

Collecting matminer==0.9.3
  Downloading matminer-0.9.3-py3-none-any.whl.metadata (4.9 kB)
Collecting pymongo~=4.5 (from matminer==0.9.3)
  Downloading pymongo-4.13.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (22 kB)
Collecting monty>=2023 (from matminer==0.9.3)
  Downloading monty-2025.3.3-py3-none-any.whl.metadata (3.6 kB)
Collecting pymatgen>=2023 (from matminer==0.9.3)
  Downloading pymatgen-2025.5.28-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (13 kB)
Collecting ruamel.yaml (from monty>=2023->matminer==0.9.3)
  Downloading ruamel.yaml-0.18.12-py3-none-any.whl.metadata (24 kB)
Collecting bibtexparser>=1.4.0 (from pymatgen>=2023->matminer==0.9.3)
  Downloading bibtexparser-1.4.3.tar.gz (55 kB)
  Preparing metadata (setup.py) ... done
Collecting palettable>=3.3.3 (from pymatgen>=2023->matminer==0.9.

### Featurizing inorganic materials (using matminer)

#### ① Importing libraries

In [4]:
import numpy as np
import pandas as pd

from matminer.featurizers.conversions import StrToComposition
from matminer.featurizers.composition import ElementProperty

import warnings
warnings.simplefilter("ignore")

#### ② Defining the target composition formula

In [5]:
data = {'composition_formula': ['SrTiO3']}
df = pd.DataFrame(data)
df

Unnamed: 0,composition_formula
0,SrTiO3


#### ③ Creating the Composition object

In [6]:
df = StrToComposition().featurize_dataframe(df, col_id="composition_formula", ignore_errors=True)
df

Unnamed: 0,composition_formula,composition
0,SrTiO3,"(Sr, Ti, O)"
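
To make concrete what the conversion step yields, the sketch below parses a flat formula into element amounts and atomic fractions in plain Python. It is an illustration only: `parse_formula` and `atomic_fractions` are hypothetical names, the regular expression handles simple formulas such as `SrTiO3` only, and this is not how pymatgen's `Composition` is implemented.

```python
import re
from collections import Counter

# Illustrative parser only: handles flat formulas such as "SrTiO3"
# (no parentheses, charges, or hydrates), unlike pymatgen's Composition.
TOKEN = re.compile(r"([A-Z][a-z]?)(\d*\.?\d*)")


def parse_formula(formula: str) -> dict:
    """Return {element symbol: amount} for a flat chemical formula."""
    amounts = Counter()
    for symbol, count in TOKEN.findall(formula):
        # A missing count means one atom of that element.
        amounts[symbol] += float(count) if count else 1.0
    return dict(amounts)


def atomic_fractions(amounts: dict) -> dict:
    """Normalize element amounts so that they sum to 1."""
    total = sum(amounts.values())
    return {el: n / total for el, n in amounts.items()}


amounts = parse_formula("SrTiO3")
fractions = atomic_fractions(amounts)
print(amounts)
print(fractions)
```

The atomic fractions are the weights that composition-based featurizers rely on when they aggregate elemental properties.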


#### ④ Creating element-derived features from the Composition object

In [7]:
df = ElementProperty.from_preset(preset_name="magpie").featurize_dataframe(df, col_id="composition", ignore_errors=True)
df

Unnamed: 0,composition_formula,composition,MagpieData minimum Number,MagpieData maximum Number,MagpieData range Number,MagpieData mean Number,MagpieData avg_dev Number,MagpieData mode Number,MagpieData minimum MendeleevNumber,MagpieData maximum MendeleevNumber,...,MagpieData range GSmagmom,MagpieData mean GSmagmom,MagpieData avg_dev GSmagmom,MagpieData mode GSmagmom,MagpieData minimum SpaceGroupNumber,MagpieData maximum SpaceGroupNumber,MagpieData range SpaceGroupNumber,MagpieData mean SpaceGroupNumber,MagpieData avg_dev SpaceGroupNumber,MagpieData mode SpaceGroupNumber
0,SrTiO3,"(Sr, Ti, O)",8.0,38.0,30.0,16.8,10.56,8.0,8.0,87.0,...,2.3e-05,5e-06,7e-06,0.0,12.0,225.0,213.0,91.0,94.8,12.0
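
The six `Number` columns above can be reproduced by hand from the atomic numbers of Sr, Ti, and O. The sketch below assumes that minimum, maximum, and range are taken over the elements present, that mean and avg_dev are weighted by atomic fraction, and that mode is the value of the most abundant element; `weighted_stats` is a hypothetical helper, and ties in abundance are not handled.

```python
# Atomic numbers and atomic fractions for SrTiO3 (Sr=38, Ti=22, O=8).
values = {"Sr": 38.0, "Ti": 22.0, "O": 8.0}
fractions = {"Sr": 0.2, "Ti": 0.2, "O": 0.6}


def weighted_stats(values: dict, fractions: dict) -> dict:
    """Composition statistics of one elemental property (illustrative)."""
    xs = list(values.values())
    # Mean weighted by atomic fraction.
    mean = sum(fractions[el] * values[el] for el in values)
    # Fraction-weighted mean absolute deviation from that mean.
    avg_dev = sum(fractions[el] * abs(values[el] - mean) for el in values)
    # "mode" here: the property value of the most abundant element.
    mode = values[max(fractions, key=fractions.get)]
    return {
        "minimum": min(xs),
        "maximum": max(xs),
        "range": max(xs) - min(xs),
        "mean": mean,
        "avg_dev": avg_dev,
        "mode": mode,
    }


stats = weighted_stats(values, fractions)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the results agree with the first six feature columns of the table (8.0, 38.0, 30.0, 16.8, 10.56, 8.0).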


### Featurizing inorganic materials (from a CIF file)

#### ① Importing libraries

In [8]:
import numpy as np
import pandas as pd

from matminer.featurizers.structure import DensityFeatures
from pymatgen.core.structure import Structure

import warnings
warnings.simplefilter("ignore")

#### ② Defining the DataFrame (the composition formula is used only as a label)


In [9]:
data = {'composition_formula': ['SrTiO3']}
df_ = pd.DataFrame(data)
df_

Unnamed: 0,composition_formula
0,SrTiO3


#### ③ Reading the CIF file

In [10]:
with open(f'{INPUT_FILE_PATH}/SrTiO3.cif', 'r') as f:
    cif_content = f.read()

#### ④ Creating a Structure object and storing it in the DataFrame


In [11]:
crystal_tmp = Structure.from_str(cif_content, fmt="cif")
crystal_tmp

Structure Summary
Lattice
    abc : 3.91270131 3.91270131 3.91270131
 angles : 90.0 90.0 90.0
 volume : 59.90045030664282
      A : np.float64(3.91270131) np.float64(0.0) np.float64(2.395838567655578e-16)
      B : np.float64(6.292103598030447e-16) np.float64(3.91270131) np.float64(2.395838567655578e-16)
      C : np.float64(0.0) np.float64(0.0) np.float64(3.91270131)
    pbc : True True True
PeriodicSite: Sr0 (Sr2+) (0.0, 0.0, 0.0) [0.0, 0.0, 0.0]
PeriodicSite: Ti1 (Ti4+) (1.956, 1.956, 1.956) [0.5, 0.5, 0.5]
PeriodicSite: O2 (O2-) (1.956, 0.0, 1.956) [0.5, 0.0, 0.5]
PeriodicSite: O3 (O2-) (1.956, 1.956, 2.396e-16) [0.5, 0.5, 0.0]
PeriodicSite: O4 (O2-) (3.146e-16, 1.956, 1.956) [0.0, 0.5, 0.5]
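
Each site in the summary lists Cartesian coordinates in parentheses and fractional coordinates in brackets. The relation between the two can be sketched with NumPy as follows; this is an illustration, not pymatgen's code, and it drops the off-diagonal entries of order 1e-16 shown in the lattice printout, treating the cell as exactly cubic.

```python
import numpy as np

# Cubic lattice parameter in angstroms, taken from the summary above.
a = 3.91270131
# Rows are the lattice vectors A, B, C (numerical noise ~1e-16 dropped).
lattice = np.diag([a, a, a])

# Fractional coordinates of Sr, Ti, and the three O sites.
frac_coords = np.array(
    [
        [0.0, 0.0, 0.0],
        [0.5, 0.5, 0.5],
        [0.5, 0.0, 0.5],
        [0.5, 0.5, 0.0],
        [0.0, 0.5, 0.5],
    ]
)

# Cartesian = fractional coordinates times the lattice matrix.
cart_coords = frac_coords @ lattice
# Cell volume is the absolute determinant of the lattice matrix.
volume = abs(np.linalg.det(lattice))

print(cart_coords.round(3))
print(volume)
```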

In [12]:
df_['structure'] = [crystal_tmp]
df_

Unnamed: 0,composition_formula,structure
0,SrTiO3,"[[0. 0. 0.] Sr2+, [1.95635066 1.95635066 1.956..."


#### ⑤ Creating CIF-derived features from the Structure object

In [13]:
df_ = DensityFeatures().featurize_dataframe(df_, col_id="structure", ignore_errors=True)
df_

Unnamed: 0,composition_formula,structure,density,vpa,packing fraction
0,SrTiO3,"[[0. 0. 0.] Sr2+, [1.95635066 1.95635066 1.956...",5.086512,11.98009,0.796633
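
Two of the three columns, `density` and `vpa` (volume per atom), follow directly from the cell volume and the cell contents. The sketch below recomputes them by hand; the atomic masses are assumed standard values in g/mol and may differ slightly from the ones the library uses. `packing fraction` is left out because it depends on per-element radii that are not shown in this notebook.

```python
AVOGADRO = 6.02214076e23  # 1/mol

# Assumed standard atomic masses in g/mol.
ATOMIC_MASS = {"Sr": 87.62, "Ti": 47.867, "O": 15.9994}
# Contents of the unit cell read from the structure summary (5 sites).
cell_contents = {"Sr": 1, "Ti": 1, "O": 3}
# Cell volume in cubic angstroms, from the structure summary.
volume_A3 = 59.90045030664282

n_atoms = sum(cell_contents.values())
molar_mass = sum(ATOMIC_MASS[el] * n for el, n in cell_contents.items())

# Volume per atom in cubic angstroms.
vpa = volume_A3 / n_atoms
# Density in g/cm^3; 1 cubic angstrom = 1e-24 cm^3.
density = molar_mass / (AVOGADRO * volume_A3 * 1e-24)

print(f"vpa = {vpa:.5f} A^3/atom")
print(f"density = {density:.4f} g/cm^3")
```

With these assumed masses the values land close to the `density` and `vpa` entries in the table.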


### Checking and saving the execution environment

In [14]:
!python3 -V

Python 3.11.12


In [15]:
!pip freeze > requirements_feature_matminer.txt

In [16]:
# Download the requirements file (google.colab is only available on Colab)
if colab:
    from google.colab import files

    files.download('requirements_feature_matminer.txt')
