# Auto XRD analysis

In this notebook we show how to train an [XRD-AutoAnalyzer](https://github.com/njszym/XRD-AutoAnalyzer) (CNN) model on a chemical space. For that, we will use the nomad-auto-xrd plugin, which provides some handy functions to train the model and to save it in NOMAD.

Then, we will save the trained model(s) as an entry in NOMAD, so we can search for them and reuse them easily.
Once we have done this, we will analyse some of the diffraction patterns that we have already uploaded to NOMAD, to match phases to the diffraction patterns.

First, let's run some imports.

In [1]:
from nomad_auto_xrd.train_xrd_cnn import ModelConfig, run_xrd_model

2024-10-09 17:04:15.849482: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-10-09 17:04:15.850393: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-10-09 17:04:15.853817: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-10-09 17:04:15.862014: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-10-09 17:04:15.876488: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been 

The `ModelConfig` class is defined in `train_xrd_cnn.py` and holds the model architecture and training parameters. The `run_xrd_model` function takes a `ModelConfig` object as an argument, trains the model, and returns it. The trained model can then be used to make predictions on new XRD data.

Let's start by inspecting the default model configuration.

In [2]:
config = ModelConfig() # default config
config

ModelConfig(references_dir='References', all_cifs_dir='All_CIFs', models_dir='Models', xrd_output_file='XRD.npy', pdf_output_file='PDF.npy', max_texture=0.5, min_domain_size=5.0, max_domain_size=30.0, max_strain=0.03, num_spectra=50, min_angle=20.02, max_angle=79.98, max_shift=0.1, separate=True, impur_amt=0.0, skip_filter=True, include_elems=True, inc_pdf=False, save_pdf=False, num_epochs=50, test_fraction=0.2, enable_wandb=False, wandb_project='xrd_model', wandb_entity='your_entity_name', save_nomad_metadata=True)
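
The printed representation above looks like that of a standard Python dataclass, although this notebook does not confirm how `ModelConfig` is implemented. Under that assumption, the sketch below uses a hypothetical stand-in class, `SketchConfig`, that mirrors a few of the printed fields, to illustrate how one might check which settings differ from the defaults before starting a long training run. The helper `changed_fields` is also hypothetical and not part of nomad-auto-xrd.

```python
from dataclasses import asdict, dataclass, replace


@dataclass
class SketchConfig:
    """Hypothetical stand-in mirroring a few fields of the printed ModelConfig."""

    all_cifs_dir: str = "All_CIFs"
    models_dir: str = "Models"
    num_spectra: int = 50
    num_epochs: int = 50
    test_fraction: float = 0.2
    save_nomad_metadata: bool = True


def changed_fields(config: SketchConfig) -> dict:
    """Return only the fields whose values differ from the class defaults."""
    defaults = asdict(SketchConfig())
    return {key: value for key, value in asdict(config).items() if defaults[key] != value}


# Copy the defaults and override a single field.
quick = replace(SketchConfig(), num_epochs=1)
print(changed_fields(quick))  # {'num_epochs': 1}
```

If the real `ModelConfig` is indeed a dataclass, `dataclasses.asdict` and `dataclasses.replace` should work on it in the same way; if it is not, this pattern does not apply as written.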

To train the model, we need to provide a folder with the structure files that we want to train on. We will store them in the `All_CIFs` folder, which is the default one used by [XRD-AutoAnalyzer](https://github.com/njszym/XRD-AutoAnalyzer). One can also change the default parameters by passing new values to `ModelConfig`. Here we reduce the number of epochs to 1 to make the training faster. We also set `save_nomad_metadata` to `True` explicitly (it is already the default, as the output above shows) to create an archive of this model. This will allow us to find the model and reuse it later to run inference on experimental data.
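
Before launching the training, it can help to confirm that the structure folder exists and actually contains CIF files. The following is a minimal sketch using only the standard library; the helper name `list_cif_files` is hypothetical and not part of nomad-auto-xrd, and it assumes the structure files carry a `.cif` extension.

```python
from pathlib import Path


def list_cif_files(cif_dir: str = "All_CIFs") -> list[Path]:
    """Return the sorted .cif files in cif_dir, raising if none can be found."""
    folder = Path(cif_dir)
    if not folder.is_dir():
        raise FileNotFoundError(f"CIF folder not found: {folder}")
    # Match the extension case-insensitively so that .CIF files are included.
    cifs = sorted(p for p in folder.iterdir() if p.is_file() and p.suffix.lower() == ".cif")
    if not cifs:
        raise ValueError(f"No .cif files found in {folder}")
    return cifs
```

Calling `list_cif_files("All_CIFs")` in the notebook's working directory would then either return the list of structure files or fail early with a clear message, rather than partway through training.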

In [3]:
# Redefine some model parameters
config = ModelConfig(
    num_epochs=1,
    save_nomad_metadata=True,
)

# Run the training
run_xrd_model(config)

30/30 ━━━━━━━━━━━━━━━━━━━━ 23s 680ms/step - categorical_accuracy: 0.4504 - loss: 2.2655 - val_categorical_accuracy: 0.3625 - val_loss: 29.6969


  - nomad.commit: 
  - nomad.deployment: oasis
  - nomad.service: unknown nomad service
  - nomad.version: 1.3.8.dev69+g69d5f7908


10/10 ━━━━━━━━━━━━━━━━━━━━ 2s 171ms/step - categorical_accuracy: 0.3331 - loss: 30.9137
Test Accuracy: 33.00%
Model metadata saved to model_metadata.archive.json
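
The last line of the output reports that the metadata was written to `model_metadata.archive.json`. The structure of that file is not shown in this notebook, so the sketch below makes no assumption about its contents beyond it being a JSON object: it loads the file and lists the top-level keys with the type of each value. The helper name `summarize_archive` is hypothetical.

```python
import json
from pathlib import Path


def summarize_archive(path: str = "model_metadata.archive.json") -> dict:
    """Load a JSON file and map each top-level key to the type name of its value."""
    with Path(path).open(encoding="utf-8") as handle:
        archive = json.load(handle)
    if not isinstance(archive, dict):
        raise TypeError("Expected a JSON object at the top level of the archive")
    return {key: type(value).__name__ for key, value in archive.items()}
```

Running `summarize_archive()` in the directory where the training was executed gives a quick overview of what was saved, which is a reasonable first check before uploading the file to NOMAD.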
