# Pre-trained Model Library

**XenonPy.MDL** is a library of pre-trained models obtained by training neural networks and other supervised learning algorithms on diverse materials data describing structure-property relationships.

XenonPy offers a simple-to-use toolchain to perform **transfer learning** with the given **pre-trained models**. This tutorial focuses on querying and retrieving models.

### useful functions

In [1]:
%run tools.ipynb

### access pre-trained models with MDL class

In [2]:
# --- import necessary libraries

from xenonpy.datatools import MDL

In [3]:
# --- init and check

mdl = MDL()
mdl

MDL(api_key='', save_to='.')

The constructor takes a parameter called ``api_key``, but you do not need to worry about it. Our database system is still under development, and the **API key** is not needed at this time.

At this point, the query object ``mdl`` is ready to use. To query models, do something like this:

In [4]:
# --- query data

summary = mdl(modelset_has="Stable inorganic compounds",  # substring of the model set name
              property_has="volume",  # substring of the property name
              save_to=False  # set to False to prevent download
             )

All parameters should be strings containing part of the target name. All available names are listed at https://xenonpy.readthedocs.io/en/latest/features.html#xenonpy-mdl


In this case, we query models whose model set name contains **Stable inorganic compounds** and whose predicted property name contains **volume**.

If successful, a ``pandas.DataFrame`` object containing information about the matching models is returned.

In [5]:
summary.head(5)

mId,descriptor,lang,mae,method,modelSet,property,r,regress,succeed,transferred,url
M23001,xenonpy.composition,python,22.565939,pytorch.nn.neural_network,Stable inorganic compounds in materials projec...,inorganic.crystal.volume,0.996093,True,True,False,http://xenon.ism.ac.jp/mdl/S1/inorganic.crysta...
M23002,xenonpy.composition,python,295.966614,pytorch.nn.neural_network,Stable inorganic compounds in materials projec...,inorganic.crystal.volume,0.991945,True,True,False,http://xenon.ism.ac.jp/mdl/S1/inorganic.crysta...
M23003,xenonpy.composition,python,151.815582,pytorch.nn.neural_network,Stable inorganic compounds in materials projec...,inorganic.crystal.volume,0.994928,True,True,False,http://xenon.ism.ac.jp/mdl/S1/inorganic.crysta...
M23004,xenonpy.composition,python,50.362647,pytorch.nn.neural_network,Stable inorganic compounds in materials projec...,inorganic.crystal.volume,0.995265,True,True,False,http://xenon.ism.ac.jp/mdl/S1/inorganic.crysta...
M23005,xenonpy.composition,python,87.297256,pytorch.nn.neural_network,Stable inorganic compounds in materials projec...,inorganic.crystal.volume,0.993133,True,True,False,http://xenon.ism.ac.jp/mdl/S1/inorganic.crysta...


### download models

To download models, use the **url** column of the query result. This column holds the downloadable HTTP links, so users can download models in whatever way they prefer.
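As one possible way to fetch a file by hand, the sketch below streams a URL to a local file using only the Python standard library. The helper name `download_file` and the destination layout are assumptions made for illustration and are not part of XenonPy. The links shown in the table above are truncated, so a complete link taken from ``summary['url']`` must be supplied.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def download_file(url, dest_dir="."):
    """Stream ``url`` into ``dest_dir`` and return the local path.

    Hypothetical helper for illustration; not part of XenonPy.
    """
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    # Use the last path component of the URL as the local file name.
    filename = Path(urlparse(url).path).name or "model.download"
    target = dest_dir / filename

    # Copy in chunks so large files are never held fully in memory.
    with urllib.request.urlopen(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target
```

A call such as `download_file(url, "models")`, with `url` being one complete link from the **url** column, would then save the file under `models/`.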

We also offer a simple download function to help with downloading. Suppose we want to download the 5 best-performing models based on their **MAE**. The procedure is straightforward, as shown below.

#### 1. sort models by the value of **MAE**

In [6]:
summary = summary.sort_values('mae')
summary.head(5)

mId,descriptor,lang,mae,method,modelSet,property,r,regress,succeed,transferred,url
M23001,xenonpy.composition,python,22.565939,pytorch.nn.neural_network,Stable inorganic compounds in materials projec...,inorganic.crystal.volume,0.996093,True,True,False,http://xenon.ism.ac.jp/mdl/S1/inorganic.crysta...
M23265,xenonpy.composition,python,22.759855,pytorch.nn.neural_network,Stable inorganic compounds in materials projec...,inorganic.crystal.volume,0.99542,True,True,False,http://xenon.ism.ac.jp/mdl/S1/inorganic.crysta...
M24137,xenonpy.composition,python,22.883263,pytorch.nn.neural_network,Stable inorganic compounds in materials projec...,inorganic.crystal.volume,0.995601,True,True,False,http://xenon.ism.ac.jp/mdl/S1/inorganic.crysta...
M23203,xenonpy.composition,python,22.954258,pytorch.nn.neural_network,Stable inorganic compounds in materials projec...,inorganic.crystal.volume,0.995931,True,True,False,http://xenon.ism.ac.jp/mdl/S1/inorganic.crysta...
M25054,xenonpy.composition,python,22.983179,pytorch.nn.neural_network,Stable inorganic compounds in materials projec...,inorganic.crystal.volume,0.996287,True,True,False,http://xenon.ism.ac.jp/mdl/S1/inorganic.crysta...


#### 2. get the first 5 **url**s

In [7]:
urls = summary['url'].iloc[:5]
urls

mId
M23001    http://xenon.ism.ac.jp/mdl/S1/inorganic.crysta...
M23265    http://xenon.ism.ac.jp/mdl/S1/inorganic.crysta...
M24137    http://xenon.ism.ac.jp/mdl/S1/inorganic.crysta...
M23203    http://xenon.ism.ac.jp/mdl/S1/inorganic.crysta...
M25054    http://xenon.ism.ac.jp/mdl/S1/inorganic.crysta...
Name: url, dtype: object

#### 3. download models using the ``mdl.pull`` method

In [8]:
results = mdl.pull(urls)

100%|██████████| 5/5 [00:28<00:00,  5.56s/it]


The ``results`` object is a list containing the local paths of the downloaded models. You can also specify the saving path by passing a path to the ``save_to`` parameter of ``mdl.pull``.

In [9]:
results

['/Users/liuchang/projects/XenonPy/samples/S1/inorganic.crystal.volume/xenonpy.composition/pytorch.nn.neural_network/04cd-290-281-153-75-21@1',
 '/Users/liuchang/projects/XenonPy/samples/S1/inorganic.crystal.volume/xenonpy.composition/pytorch.nn.neural_network/ajc1-290-261-122-66-25-10@1',
 '/Users/liuchang/projects/XenonPy/samples/S1/inorganic.crystal.volume/xenonpy.composition/pytorch.nn.neural_network/rzr8-290-285-177-111-58-27@1',
 '/Users/liuchang/projects/XenonPy/samples/S1/inorganic.crystal.volume/xenonpy.composition/pytorch.nn.neural_network/p0nx-290-285-126-52-22-13@1',
 '/Users/liuchang/projects/XenonPy/samples/S1/inorganic.crystal.volume/xenonpy.composition/pytorch.nn.neural_network/r4t2-290-243-131-67-23@1']
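The returned paths appear to follow a fixed layout: model set ID, property, descriptor, method, and model name. As a small sketch under that assumption (inferred only from the output above, not from a documented XenonPy guarantee), the hypothetical helper `describe_model_path` splits a path into those five fields.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class ModelLocation(NamedTuple):
    modelset_id: str
    property_name: str
    descriptor: str
    method: str
    model_name: str


def describe_model_path(path):
    """Split a downloaded-model path into its last five components.

    Assumes the layout <modelset>/<property>/<descriptor>/<method>/<model>,
    as seen in the ``results`` output; this is an inference, not an API.
    """
    parts = PurePosixPath(path).parts
    if len(parts) < 5:
        raise ValueError(f"expected at least 5 path components, got {len(parts)}")
    return ModelLocation(*parts[-5:])


loc = describe_model_path(
    "S1/inorganic.crystal.volume/xenonpy.composition/"
    "pytorch.nn.neural_network/04cd-290-281-153-75-21@1"
)
print(loc.property_name)  # inorganic.crystal.volume
print(loc.model_name)     # 04cd-290-281-153-75-21@1
```

Grouping the entries of ``results`` by `property_name` or `method` in this way can help keep track of models pulled from several queries.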

### load downloaded models

You can use ``xenonpy.model.nn.Checker`` to load a downloaded model.

In [10]:
# --- import necessary libraries

from xenonpy.model.nn import Checker

checker = Checker.load('S1/inorganic.crystal.volume/xenonpy.composition/pytorch.nn.neural_network/04cd-290-281-153-75-21@1')

If successful, use ``checker.trained_model`` to access the trained model.

In [11]:
checker.trained_model



Sequential(
  (0): Layer1d(
    (layer): Linear(in_features=290, out_features=281, bias=True)
    (batch_nor): BatchNorm1d(281, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (act_func): ReLU()
    (dropout): Dropout(p=0.1)
  )
  (1): Layer1d(
    (layer): Linear(in_features=281, out_features=153, bias=True)
    (batch_nor): BatchNorm1d(153, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (act_func): ReLU()
    (dropout): Dropout(p=0.1)
  )
  (2): Layer1d(
    (layer): Linear(in_features=153, out_features=75, bias=True)
    (batch_nor): BatchNorm1d(75, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (act_func): ReLU()
    (dropout): Dropout(p=0.1)
  )
  (3): Layer1d(
    (layer): Linear(in_features=75, out_features=21, bias=True)
    (batch_nor): BatchNorm1d(21, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (act_func): ReLU()
    (dropout): Dropout(p=0.1)
  )
  (4): Layer1d(
    (layer): Linear(in_featur

You can list all the information stored for a downloaded model by printing the ``checker`` object.

In [12]:
checker

<04cd-290-281-153-75-21@1> under `/Users/liuchang/projects/XenonPy/samples/S1/inorganic.crystal.volume/xenonpy.composition/pytorch.nn.neural_network` includes:
"init_model": 1
"describe": 1
"y_true": 1
"y_pred": 1
"runner": 1
"y_indices": 1
"y_true_fit": 1
"y_scale": 1
"scores": 1
"trained_model": 1
"x_indices": 1
"x_scale": 1
"y_pred_fit": 1

Some models applied preprocessing to their descriptors and/or targets before training. You can load the **scaler** they used from the ``*_scale`` entries (``x_scale`` and ``y_scale`` in the listing above) and apply the same transform to your data. An example is below:

In [13]:
# --- scale transform

X_scaler = checker.last('x_scale')
# replace <your_descriptor> with your own descriptor data
pg_desc_ = X_scaler.transform(<your_descriptor>)
pg_desc_



array([[-2.49565241, -2.73213484, -2.06566775, ..., -0.58267778,
        -1.19611066, -1.09948681],
       [-2.65095953, -3.50616758, -1.92142292, ..., -0.58267778,
        -0.00833578, -1.09948681],
       [-2.89849631, -3.62552256, -2.15598929, ..., -0.58267778,
        -0.00833578, -1.09948681],
       ...,
       [-2.25500815, -2.8593988 , -2.01746408, ..., -0.58267778,
        -1.19611066, -1.09948681],
       [-2.2047618 , -3.04141036, -2.06635113, ..., -0.58267778,
        -1.19611066, -1.09948681],
       [-2.53906384, -3.25755408, -2.09805685, ..., -0.58267778,
        -1.19611066, -1.09948681]])
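The key point of this step is that new data is transformed with the statistics learned from the training data rather than refitted. The sketch below illustrates the idea with a plain standardization written in pure Python. It is an assumption for illustration only: the scaler stored under ``x_scale`` may apply a different transformation, and `ColumnStandardizer` is a hypothetical class, not a XenonPy object.

```python
from statistics import fmean, pstdev


class ColumnStandardizer:
    """Hypothetical stand-in for a stored scaler (not a XenonPy class)."""

    def fit(self, rows):
        # Learn per-column mean and standard deviation from training rows.
        columns = list(zip(*rows))
        self.means = [fmean(col) for col in columns]
        # Fall back to 1.0 for constant columns to avoid division by zero.
        self.stds = [pstdev(col) or 1.0 for col in columns]
        return self

    def transform(self, rows):
        # Reuse the stored statistics; nothing is re-estimated here.
        return [
            [(x - m) / s for x, m, s in zip(row, self.means, self.stds)]
            for row in rows
        ]


train = [[1.0, 10.0], [3.0, 10.0], [5.0, 10.0]]
scaler = ColumnStandardizer().fit(train)

new_data = [[3.0, 12.0]]
print(scaler.transform(new_data))  # [[0.0, 2.0]]
```

Fitting a fresh scaler on `new_data` instead would shift the inputs relative to what the model saw during training, which is why the stored scaler is loaded from the checker.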

### download R model

**XenonPy.MDL** also contains many R models. Downloading them works exactly as shown above; just add ``lang_has='r'`` when querying.

In [14]:
from xenonpy.datatools import MDL

mdl = MDL()

summary = mdl(modelset_has="QM9",  # substring of the model set name
              property_has="hartree",  # substring of the property name
              lang_has='r',  # restrict the query to R models
              save_to=False  # set to False to prevent download
             )

In [15]:
summary.head(3)

mId,descriptor,lang,mae,method,modelSet,property,r,regress,succeed,transferred,url
M18001,rcdk.fp.fingerprint,r,15.8705,mxnet.nn.neural_network,QM9 Dataset from Quantum-Machine project,organic.nonpolymer.g_hartree,0.8282,True,True,False,http://xenon.ism.ac.jp/mdl/S3/organic.nonpolym...
M18002,rcdk.fp.fingerprint,r,19.7939,mxnet.nn.neural_network,QM9 Dataset from Quantum-Machine project,organic.nonpolymer.g_hartree,0.789,True,True,False,http://xenon.ism.ac.jp/mdl/S3/organic.nonpolym...
M18003,rcdk.fp.fingerprint,r,14.9261,mxnet.nn.neural_network,QM9 Dataset from Quantum-Machine project,organic.nonpolymer.g_hartree,0.851,True,True,False,http://xenon.ism.ac.jp/mdl/S3/organic.nonpolym...


In [16]:
urls = summary['url'].iloc[:3]
results = mdl.pull(urls)

100%|██████████| 3/3 [00:39<00:00, 12.63s/it]


In [17]:
results

['/Users/liuchang/projects/XenonPy/samples/S3/organic.nonpolymer.g_hartree/rcdk.fp.fingerprint/mxnet.nn.neural_network/shotgun_G_Hartree_randFP1021_corr-0.8282_mxnet_400-81-10-1_2018-04-20',
 '/Users/liuchang/projects/XenonPy/samples/S3/organic.nonpolymer.g_hartree/rcdk.fp.fingerprint/mxnet.nn.neural_network/shotgun_G_Hartree_randFP1026_corr-0.789_mxnet_266-127-94-10-1_2018-04-20',
 '/Users/liuchang/projects/XenonPy/samples/S3/organic.nonpolymer.g_hartree/rcdk.fp.fingerprint/mxnet.nn.neural_network/shotgun_G_Hartree_randFP1033_corr-0.851_mxnet_150-65-21-1_2018-04-20']

**R tutorials will be released later.**