# Polycrystal classification


Author: Angelo Ziletti (angelo.ziletti@gmail.com; ziletti@fhi-berlin.mpg.de)


### Brief summary
This notebook shows how to classify a polycrystal using the strided pattern matching technique.

1. The function *make_strided_pattern_matching_dataset* calculates the descriptor for each material patch obtained from the strided pattern matching procedure and saves the descriptors to file.
2. The function *get_classification_map* loads the descriptor file and makes predictions using a pre-trained neural network. Uncertainty estimates are also computed. The results are returned as two-dimensional classification maps.
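
Step 2 mentions uncertainty estimates. One common way to obtain them is to run several stochastic forward passes of the network on the same input and look at how much the predicted class probabilities vary. The sketch below illustrates that idea with NumPy only. It is an assumption about the general approach, not the ai4materials implementation: the helper name `summarize_stochastic_passes`, the choice of predictive entropy as the uncertainty measure, and the probability values are all hypothetical.

```python
import numpy as np


def summarize_stochastic_passes(probs):
    """Combine class probabilities from several stochastic forward passes.

    probs has shape (n_passes, n_patches, n_classes).
    Hypothetical helper, not part of ai4materials.
    """
    probs = np.asarray(probs, dtype=float)
    # average the class probabilities over the passes
    mean_probs = probs.mean(axis=0)
    # most likely class for each patch
    predicted_class = mean_probs.argmax(axis=1)
    # predictive entropy of the averaged distribution (one possible uncertainty measure)
    eps = 1e-12
    entropy = -(mean_probs * np.log(mean_probs + eps)).sum(axis=1)
    return mean_probs, predicted_class, entropy


# made-up probabilities: 2 passes, 2 patches, 3 classes
passes = np.array([
    [[0.9, 0.05, 0.05], [0.2, 0.5, 0.3]],
    [[0.8, 0.10, 0.10], [0.6, 0.2, 0.2]],
])
mean_probs, predicted_class, entropy = summarize_stochastic_passes(passes)
```

In this toy input the passes agree on the first patch and disagree on the second, so the second patch gets the larger entropy. A map of such per-patch values is one way to picture what an uncertainty map conveys.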

In [1]:
from ai4materials.models.strided_pattern_matching import make_strided_pattern_matching_dataset
from ai4materials.utils.utils_config import set_configs
from ai4materials.utils.utils_config import setup_logger
from ai4materials.descriptors.diffraction3d import DISH
from ai4materials.models.strided_pattern_matching import get_classification_map
import os

# define folders
# config_file = '/home/ziletti/Documents/calc_nomadml/rot_inv_3d/config_diff3d.yml'
main_folder = '/home/ziletti/Documents/calc_nomadml/rot_inv_3d/'
# prototypes_basedir = '/home/ziletti/Documents/calc_nomadml/rot_inv_3d/prototypes_aflow_new'
# db_files_prototypes_basedir = '/home/ziletti/Documents/calc_nomadml/rot_inv_3d/db_ase_prototypes'


# read config file
configs = set_configs(main_folder=main_folder)
logger = setup_logger(configs, level='INFO', display_configs=False)

# setup folder and files
checkpoint_dir = os.path.abspath(os.path.normpath(os.path.join(main_folder, 'saved_models/enc_dec_drop12.5')))
dataset_folder = os.path.abspath(os.path.normpath(os.path.join(main_folder, 'datasets')))
figure_dir = os.path.abspath(os.path.normpath(os.path.join(main_folder, 'attentive_resp_maps')))
conf_matrix_file = os.path.abspath(os.path.normpath(os.path.join(main_folder, 'confusion_matrix.png')))
results_file = os.path.abspath(os.path.normpath(os.path.join(main_folder, 'results.csv')))
lookup_file = os.path.abspath(os.path.normpath(os.path.join(main_folder, 'lookup.dat')))
control_file = os.path.abspath(os.path.normpath(os.path.join(main_folder, 'control.json')))
filtered_file = os.path.abspath(os.path.normpath(os.path.join(main_folder, 'filtered_file.json')))

configs['io']['dataset_folder'] = dataset_folder

Pymatgen will drop Py2k support from v2019.1.1. Pls consult the documentation
at https://www.pymatgen.org for more details.
Using TensorFlow backend.

## 1. Descriptor calculation

First, specify the file that contains the atomic coordinates of the polycrystal.
The file must be in the *.xyz* format.
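
For readers unfamiliar with the format: a plain *.xyz* file starts with the number of atoms, followed by a free-form comment line, then one line per atom with the element symbol and its Cartesian coordinates. The sketch below writes and reads back a two-atom toy file using only the standard library. The helpers `write_xyz` and `read_xyz`, the file name, and the coordinates are illustrative assumptions and are unrelated to the structure file used in this notebook.

```python
import os
import tempfile


def write_xyz(path, symbols, coords, comment=""):
    """Write a minimal .xyz file: atom count, comment line, then 'symbol x y z' lines."""
    with open(path, "w") as handle:
        handle.write("{}\n{}\n".format(len(symbols), comment))
        for symbol, (x, y, z) in zip(symbols, coords):
            handle.write("{} {:.6f} {:.6f} {:.6f}\n".format(symbol, x, y, z))


def read_xyz(path):
    """Read back the symbols, coordinates and comment from a minimal .xyz file."""
    with open(path) as handle:
        n_atoms = int(handle.readline())
        comment = handle.readline().rstrip("\n")
        symbols, coords = [], []
        for _ in range(n_atoms):
            parts = handle.readline().split()
            symbols.append(parts[0])
            coords.append(tuple(float(value) for value in parts[1:4]))
    return symbols, coords, comment


# toy example with made-up coordinates (in Angstrom)
demo_path = os.path.join(tempfile.mkdtemp(), "two_atoms.xyz")
write_xyz(demo_path, ["Fe", "Fe"], [(0.0, 0.0, 0.0), (1.435, 1.435, 1.435)],
          comment="toy two-atom example")
symbols, coords, comment = read_xyz(demo_path)
```

This minimal reader ignores any extra columns after the coordinates, which some extended variants of the format use for additional per-atom data.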

In [2]:
structure_file = os.path.join(main_folder, 'structures_for_paper/small_edge_dislocation/small_edge_dislocation.xyz')

In [3]:
# define a descriptor to represent crystal structures
# here we use the diffraction intensity in spherical harmonics (DISH)
descriptor = DISH(configs=configs)

INFO: Metadata for descriptor DISH: [u'diffraction_3d_sh_spectrum', u'diffraction_3d_sh_spectrum_image', u'diffraction_3d_real_space', u'diffraction_3d_phase', u'diffraction_3d_coordinates']


In [4]:
# comment out this call if you have already calculated the descriptor for the .xyz file
path_to_x_test, path_to_y_test, path_to_summary_test, path_to_strided_pattern_pos = make_strided_pattern_matching_dataset(
    polycrystal_file=structure_file, descriptor=descriptor, desc_metadata='diffraction_3d_sh_spectrum',
    configs=configs, operations_on_structure=None, stride_size=(10., 10., 20.), box_size=10.,
    init_sliding_volume=(14., 14., 14.), desc_file=None, desc_only=False, show_plot_lengths=True,
    desc_file_suffix_name='', nb_jobs=6, padding_ratio=None)

print(path_to_x_test)
print(path_to_y_test)
print(path_to_summary_test)
print(path_to_strided_pattern_pos)
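
To make the roles of `stride_size` and `box_size` more concrete, the sketch below shows the general idea of sweeping a cubic box across a structure with a fixed stride per axis and collecting the atoms that fall inside each box. It is a simplified illustration under stated assumptions, not the ai4materials implementation: the helpers `strided_box_origins` and `atoms_in_box`, the toy simple-cubic grid, and the cell dimensions are all hypothetical, and the `init_sliding_volume` argument is not modelled here.

```python
import itertools

import numpy as np


def strided_box_origins(lower, upper, box_size, stride):
    """Lower corners of cubic boxes (edge box_size) swept over [lower, upper] per axis."""
    axes = []
    for lo, hi, step in zip(lower, upper, stride):
        # last origin for which the box still fits inside the volume
        last = max(lo, hi - box_size)
        axes.append(np.arange(lo, last + 1e-9, step))
    return np.array(list(itertools.product(*axes)))


def atoms_in_box(positions, origin, box_size):
    """Return the positions that lie inside the box starting at origin."""
    positions = np.asarray(positions, dtype=float)
    inside = np.all((positions >= origin) & (positions < origin + box_size), axis=1)
    return positions[inside]


# toy structure: simple-cubic grid with 2.5 Angstrom spacing in a 30 x 30 x 20 volume
grid = [np.arange(0.0, 30.0, 2.5), np.arange(0.0, 30.0, 2.5), np.arange(0.0, 20.0, 2.5)]
positions = np.array(list(itertools.product(*grid)))

origins = strided_box_origins(lower=(0.0, 0.0, 0.0), upper=(30.0, 30.0, 20.0),
                              box_size=10.0, stride=(10.0, 10.0, 20.0))
first_patch = atoms_in_box(positions, origins[0], box_size=10.0)
```

With these toy numbers the sweep yields 9 box origins (3 along x, 3 along y, 1 along z), and the first box contains 64 atoms. In the notebook, a descriptor is computed for each such patch, so a smaller stride gives a finer classification map at the cost of more descriptor calculations.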

In [5]:
# paths of the files written by make_strided_pattern_matching_dataset in the previous cell
desc_prefix = os.path.join(dataset_folder, 'small_edge_dislocation.xyz_stride_10.0_10.0_20.0_box_size_10.0_.tar.gz')
path_to_x_test = desc_prefix + '_x.pkl'
path_to_y_test = desc_prefix + '_y.pkl'
path_to_summary_test = desc_prefix + '_summary.json'
path_to_strided_pattern_pos = desc_prefix + '_strided_pattern_pos.pkl'

get_classification_map(
    configs, path_to_x_test, path_to_y_test, path_to_summary_test, path_to_strided_pattern_pos,
    checkpoint_dir, checkpoint_filename='model.h5', mc_samples=2, interpolation='none',
    results_file=None, calc_uncertainty=True, conf_matrix_file=conf_matrix_file,
    train_set_name='hcp-sc-fcc-diam-bcc_pristine', cmap_uncertainty='hot',
    interpolation_uncertainty='none')

INFO: Loading test dataset for prediction.
INFO: Loading and formatting of data completed.
INFO: Predicting...
INFO: Using multiple passes to have principles probability and uncertainty estimates
INFO: Calculating classification uncertainty.
INFO: Performing forward pass: 1/2
INFO: Accuracy: 0.0%
INFO: Confusion matrix, without normalization: 
INFO: [[ 0 84]
 [ 0  0]]
INFO: Predictions written to: /home/ziletti/Documents/calc_nomadml/rot_inv_3d/results_folder/results.csv
INFO: Confusion matrix written to /home/ziletti/Documents/calc_nomadml/rot_inv_3d/confusion_matrix.png.
INFO: Creating two-dimensional plot.
INFO: File saved at prob_class0.eps.
INFO: Creating two-dimensional plot.
INFO: File saved at prob_class1.eps.
INFO: Creating two-dimensional plot.
INFO: File saved at prob_class2.eps.
INFO: Creating two-dimensional plot.
INFO: File saved at prob_class3.eps.
INFO: Creating two-dimensional plot.
INFO: File saved at prob_class4.eps.
INFO: Creating two-dimensional plot.
INFO: File sa