# Summary
- Competition: [Nomad2018 Predicting Transparent Conductors](https://www.kaggle.com/c/nomad2018-predict-transparent-conductors)

# Overview
- Innovative materials design matters in many fields
- Transparent conductors are an important class of compounds that are both electrically conductive and have a low absorption in the visible range, which are typically competing properties.
    - [Transparent conductive film](https://en.wikipedia.org/wiki/Transparent_conducting_film)
- Few materials of this kind have been identified so far
    - Aluminum (Al), gallium (Ga), and indium (In) sesquioxides look promising

# Data
- High-quality data are provided for 3,000 materials that show promise as transparent conductors
    - Space group
    - Total number of Al, Ga, In and O atoms in the unit cell
    - Relative compositions of Al, Ga, and In (x, y, z)
    - Lattice vectors and angles
- Goal: predict the following two properties
    - Formation energy (an important indicator of the stability of a material)
    - Bandgap energy (an important property for optoelectronic applications)
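
The composition fields listed above can be turned into per-element atom counts. The following is a minimal sketch, assuming the compounds follow the sesquioxide formula (Al_x Ga_y In_z)2 O3 with x + y + z = 1, so that 2 of every 5 atoms are cations and 3 are oxygen. The function name `atom_counts` and its arguments are hypothetical and do not come from the competition files.

```python
def atom_counts(total_atoms, x, y, z):
    """Split a unit cell of (Al_x Ga_y In_z)2 O3 into per-element atom counts.

    Assumption: x, y, z are the relative cation compositions and sum to 1.
    """
    if abs(x + y + z - 1.0) > 1e-6:
        raise ValueError("relative compositions should sum to 1")
    cations = total_atoms * 2 / 5  # 2 cations per 5-atom M2O3 formula unit
    oxygens = total_atoms * 3 / 5  # 3 oxygen atoms per formula unit
    return {
        "Al": round(x * cations),
        "Ga": round(y * cations),
        "In": round(z * cations),
        "O": round(oxygens),
    }


# Toy example: an 80-atom cell with x = 0.625, y = 0.375, z = 0.
counts = atom_counts(80, 0.625, 0.375, 0.0)
print(counts)
```

The counts should add up to the total number of atoms in the unit cell, which makes this a cheap consistency check on a row.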

# Discussion

## Machine Learning vs Density Functional Theory Calculation

- https://www.kaggle.com/c/nomad2018-predict-transparent-conductors/discussion/46021
- Not very useful

## Suggestions on features and data
- https://www.kaggle.com/c/nomad2018-predict-transparent-conductors/discussion/47998
- Comments from the organizers
    - Any atomic features (found online such as here) or geometrical features (derived from the provided geometry.xyz files) can be used in addition to the features included in the train.csv file to improve predictions (see for example here).
    - Free software that might be useful includes vesta, atomic simulation environment, pymatgen, and jmol.
    - We discovered that there are 7 duplicate rows in the training data (id = 395/126, 1215/1886, 2075/353, 308/2154, 531/1379, 2319/2337, 2370/2333), so you may choose to ignore these.
    - pymatgen has a class called `EwaldSummation`
        - To overcome this and turn these matrices into somewhat more general features you can take their traces.
        - https://github.com/diwadd/Nomad2018/blob/master/neat_poly.py
        - If you're interested in similar matrices, see
            - https://arxiv.org/pdf/1503.07406.pdf
            - or https://arxiv.org/pdf/1307.1266.pdf
            - Hope this helps (it did not help me ;-)
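
The duplicate pairs reported in the organizers' comment can be filtered out before training. This is a minimal sketch, assuming each training row is a dict with an `id` key. Keeping the first id of each pair and dropping the second is an arbitrary choice made here, not something the organizers specified, and the toy `rows` list is a placeholder for the real data.

```python
# Duplicate id pairs as listed in the organizers' comment.
DUPLICATE_PAIRS = [
    (395, 126), (1215, 1886), (2075, 353), (308, 2154),
    (531, 1379), (2319, 2337), (2370, 2333),
]


def drop_duplicates(rows, pairs=DUPLICATE_PAIRS):
    """Keep the first id of each pair and drop the second (arbitrary choice)."""
    drop_ids = {second for _, second in pairs}
    return [row for row in rows if row["id"] not in drop_ids]


# Placeholder rows standing in for the training table (ids are hypothetical).
rows = [{"id": i} for i in range(1, 2401)]
kept = drop_duplicates(rows)
print(len(rows) - len(kept), "rows dropped")
```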

## Papers/Hints on features and models
- https://www.kaggle.com/c/nomad2018-predict-transparent-conductors/discussion/49469
- Literature/Papers:
    - (1) https://journals.aps.org/prb/pdf/10.1103/PhysRevB.92.085206 A physics paper that gives DFT results for (In_x Ga_{1-x})2 O3 alloys. It covers only In and Ga, but offers enough physical insight into the alloy. Figs. 4 and 5 in particular plot the relations for groups with different compositions. Equation (4) is also insightful.
    - (2) https://arxiv.org/abs/1307.1266 A paper that discusses learning representations. While the formation energy can be calculated from bond lengths, band gaps can be calculated from the eigenvalues of an interaction matrix. I think the idea of a partial radial distribution function might be interesting.
- Hints:
    - (1) Compare/bag results for the whole dataset and for datasets split by space group. As I understand it, space group is such a strong feature that it is reasonable to split the dataset by space group and train a separate model on each split.
    - (3) More features: volume, density, an indicator of whether each composition exceeds 50%, and ln(x), ln(y), ln(z), x^2, y^2, z^2, to fit a potential formula such as Equation 4 in Paper 1.
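
The feature ideas and the per-space-group split above can be sketched as follows. This is an illustration only: the row keys (`a`, `b`, `c`, `alpha`, `beta`, `gamma`, `n_atoms`, `x`, `y`, `z`, `spacegroup`) are hypothetical short names rather than the actual train.csv headers, the small `eps` added before the logarithm is an assumption to avoid ln(0), and a number density (atoms per unit volume) stands in for the mass density. The volume uses the standard formula for a general cell, V = abc * sqrt(1 + 2 cos(alpha) cos(beta) cos(gamma) - cos^2(alpha) - cos^2(beta) - cos^2(gamma)).

```python
import math
from collections import defaultdict


def cell_volume(a, b, c, alpha, beta, gamma):
    """Volume of a general (triclinic) unit cell; angles are in degrees."""
    ca, cb, cg = (math.cos(math.radians(t)) for t in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 + 2 * ca * cb * cg - ca**2 - cb**2 - cg**2)


def engineer_features(row, eps=1e-6):
    """Return a copy of row with volume, number density, and composition terms."""
    feats = dict(row)
    feats["volume"] = cell_volume(
        row["a"], row["b"], row["c"], row["alpha"], row["beta"], row["gamma"]
    )
    feats["number_density"] = row["n_atoms"] / feats["volume"]
    for name in ("x", "y", "z"):
        frac = row[name]
        feats[f"{name}_sq"] = frac**2
        feats[f"ln_{name}"] = math.log(frac + eps)  # eps avoids log(0)
        feats[f"{name}_gt_half"] = int(frac > 0.5)  # composition above 50%?
    return feats


def split_by_spacegroup(rows):
    """Group rows by space group so a separate model can be fit per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["spacegroup"]].append(row)
    return dict(groups)


# Toy rows with made-up values, only to show the call pattern.
rows = [
    {"spacegroup": 33, "n_atoms": 80, "x": 0.625, "y": 0.375, "z": 0.0,
     "a": 10.0, "b": 10.0, "c": 10.0, "alpha": 90.0, "beta": 90.0, "gamma": 90.0},
    {"spacegroup": 12, "n_atoms": 20, "x": 0.25, "y": 0.25, "z": 0.5,
     "a": 12.0, "b": 3.0, "c": 6.0, "alpha": 90.0, "beta": 104.0, "gamma": 90.0},
]
features = [engineer_features(row) for row in rows]
groups = split_by_spacegroup(features)
```

Each value in `groups` can then be passed to its own model, and the per-group predictions compared or bagged with those of a model trained on the whole dataset.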

## 17th place solution - deep learning approach
- https://www.kaggle.com/c/nomad2018-predict-transparent-conductors/discussion/49884

## 9th place solution with geometric features
- https://www.kaggle.com/c/nomad2018-predict-transparent-conductors/discussion/49905

# Notebook

## Resistance is futile - Transparent Conductors EDA
- [Notebook page](https://www.kaggle.com/headsortails/resistance-is-futile-transparent-conductors-eda)
- Appears to be written in R