# Create pickle file for the pre-trained model
In this notebook we train the model for DiagISM and save the trained regressors in a pickle file for easy access later on. First, we import the Python packages.

In [1]:
import pickle
from ast import literal_eval

import numpy as np

from astropy.table import Table

from sklearn import preprocessing
from sklearn.neural_network import MLPRegressor

## Read data
We read the luminosity data from the simulations and the preferred hyperparameters.

In [2]:
dataset = Table.read('../data/raw/complete_dataset.fits', format='fits')
dataset['log(1+z)'] = np.log10(dataset['z']+1)

hyp_tab = Table.read('../data/interim/Hyperparameters_table.csv',
                     format='ascii.csv')
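
The training cell further down passes the `hidden_layer_sizes` entry through `literal_eval`, which suggests the CSV stores the layer sizes as text. A minimal sketch of that conversion, assuming a cell value of the form shown here (the string `raw_value` is a hypothetical stand-in, not read from the actual table):

```python
from ast import literal_eval

# Hypothetical cell value, as a CSV reader would return it (a plain string).
raw_value = "(100, 83, 77)"

# literal_eval parses Python literals only; it does not run arbitrary code.
hidden_layer_sizes = literal_eval(raw_value)

print(type(hidden_layer_sizes).__name__, hidden_layer_sizes)
```

The resulting tuple can be passed directly to `MLPRegressor(hidden_layer_sizes=...)`.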

We select the input columns and define the parameters we are going to use.

In [3]:
col_analt = dataset.colnames[3:-15:5]
col_analt.append('log(1+z)')
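
The slice `[3:-15:5]` starts at index 3, stops before the last 15 columns, and keeps every fifth name. A small sketch of the same selection on a made-up list (the names `col0` ... `col39` are hypothetical and do not reflect the real column layout of the dataset):

```python
# Hypothetical column names standing in for dataset.colnames.
colnames = [f"col{i}" for i in range(40)]

# Start at index 3, stop before the last 15 entries, keep every 5th name.
selected = colnames[3:-15:5]

# Add the derived redshift column as the last input feature.
selected.append("log(1+z)")

print(selected)
```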

In [4]:
list_parameters = ['SFR', 'ISRF', 'ZGal', 'Pressure', 'n$(\\mathrm{H})_{\\mathrm{cloud}}$',
                   'R$_{\\mathrm{cloud}}$', 'M$_{\\mathrm{\\ast}}$', 'M$_{\\mathrm{gas}}$']

## Train models
For this model we train one regressor per parameter and store all of them in a single file.

In [5]:
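all_trained = []
# The input features are the same for every parameter, so convert them once.
x_df = dataset[col_analt].to_pandas()
for par in list_parameters:
    # Target values for this parameter as a column vector.
    y_df = dataset.to_pandas()[par].values.reshape(-1, 1)

    # Scale inputs and target with separate robust scalers.
    scalerx = preprocessing.RobustScaler()
    scalery = preprocessing.RobustScaler()
    x_scale = scalerx.fit_transform(x_df)
    y_scale = scalery.fit_transform(y_df)
    # Rows of the hyperparameter table that belong to this parameter.
    loc_hyp = np.where(hyp_tab['Parameter'] == par)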
    regr_mlp = MLPRegressor(random_state=42, verbose=False,
                            hidden_layer_sizes=literal_eval(
                                hyp_tab[loc_hyp]['hidden_layer_sizes'][0]),
                            activation=hyp_tab[loc_hyp]['activation'][0],
                            solver='adam',
                            alpha=hyp_tab[loc_hyp]['alpha'][0],
                            batch_size=hyp_tab[loc_hyp]['batch_size'][0],
                            learning_rate_init=hyp_tab[loc_hyp]['learning_rate_init'][0],
                            max_iter=hyp_tab[loc_hyp]['max_iter'][0])
    regr_mlp.fit(x_scale, y_scale.ravel())
    all_trained.append(regr_mlp)
    print(par, 'added')

SFR added
ISRF added
ZGal added
Pressure added
n$(\mathrm{H})_{\mathrm{cloud}}$ added
R$_{\mathrm{cloud}}$ added
M$_{\mathrm{\ast}}$ added
M$_{\mathrm{gas}}$ added
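
Because each regressor is fitted on scaled inputs and scaled targets, its predictions come back in scaled units. The sketch below shows, on synthetic data, how a prediction could be mapped back with the target scaler's `inverse_transform`. The data, the small network, and the names `x`, `y` and `regr` are assumptions for illustration only, not the notebook's dataset or hyperparameters.

```python
import numpy as np
from sklearn import preprocessing
from sklearn.neural_network import MLPRegressor

# Synthetic stand-in data: 200 samples, 4 features, one linear target.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 4))
y = (x @ np.array([1.0, -2.0, 0.5, 3.0])).reshape(-1, 1)

# Fit separate robust scalers for the inputs and the target.
scalerx = preprocessing.RobustScaler()
scalery = preprocessing.RobustScaler()
x_scale = scalerx.fit_transform(x)
y_scale = scalery.fit_transform(y)

# Small illustrative network, not the tuned hyperparameters.
regr = MLPRegressor(random_state=42, hidden_layer_sizes=(16,), max_iter=200)
regr.fit(x_scale, y_scale.ravel())

# Predict in scaled units, then map back to the original target units.
y_pred_scaled = regr.predict(scalerx.transform(x[:5]))
y_pred = scalery.inverse_transform(y_pred_scaled.reshape(-1, 1)).ravel()
print(y_pred.shape)
```

Note that the fitted scalers are needed for this step, so anyone reusing the saved regressors would have to refit or store scalers built from the same data.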


We verify the hyperparameters

In [6]:
all_trained

[MLPRegressor(alpha=0.0002, batch_size=195, hidden_layer_sizes=(100, 83, 77),
              learning_rate_init=0.002, max_iter=500, random_state=42),
 MLPRegressor(alpha=0.0003, batch_size=154, hidden_layer_sizes=(91, 48, 56),
              learning_rate_init=0.003, max_iter=319, random_state=42),
 MLPRegressor(alpha=0.0002, batch_size=105, hidden_layer_sizes=(25, 68, 44),
              learning_rate_init=0.006, max_iter=357, random_state=42),
 MLPRegressor(activation='tanh', alpha=0.0005, batch_size=228,
              hidden_layer_sizes=(31, 58, 86), learning_rate_init=0.005,
              max_iter=267, random_state=42),
 MLPRegressor(alpha=0.0006, batch_size=151, hidden_layer_sizes=(38, 54, 51),
              learning_rate_init=0.009, max_iter=245, random_state=42),
 MLPRegressor(alpha=0.0006, batch_size=151, hidden_layer_sizes=(38, 54, 51),
              learning_rate_init=0.009, max_iter=245, random_state=42),
 MLPRegressor(activation='tanh', alpha=0.0016, batch_size=45,
          

## Save file
Finally, we save the list of trained regressors to a pickle file.

In [8]:
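# Use a context manager so the file is flushed and closed after writing.
with open('../data/interim/AllLines_trained', 'wb') as file_out:
    pickle.dump(all_trained, file_out)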
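
For later use, the saved list can be read back with `pickle.load`; its order follows `list_parameters`. The sketch below round-trips a stand-in list through a temporary file. The plain dictionaries and the shortened parameter list are hypothetical placeholders for the fitted `MLPRegressor` objects, used so the example runs without the dataset.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-ins for the parameter names and the fitted regressors.
list_parameters = ["SFR", "ISRF"]
all_trained = [{"model_for": "SFR"}, {"model_for": "ISRF"}]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "AllLines_trained"

    # Write the list, closing the file before reading it again.
    with open(path, "wb") as file_out:
        pickle.dump(all_trained, file_out)

    with open(path, "rb") as file_in:
        restored = pickle.load(file_in)

# Pair each restored object with its parameter name for easier lookup.
by_name = dict(zip(list_parameters, restored))
print(sorted(by_name))
```

Pickled scikit-learn models are generally only safe to load with the scikit-learn version that wrote them, so the versions recorded at the end of this notebook are worth keeping alongside the file.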

##### Notebook information

In [11]:
%load_ext watermark
%watermark -a "Andres Ramos" -d -v -m
print('Specific Python packages')
%watermark -iv -w -p atom,astropy

Author: Andres Ramos

Python implementation: CPython
Python version       : 3.8.3
IPython version      : 7.16.1

Compiler    : GCC 7.3.0
OS          : Linux
Release     : 3.10.0-1160.59.1.el7.x86_64
Machine     : x86_64
Processor   : x86_64
CPU cores   : 8
Architecture: 64bit

Specific Python packages
atom   : 4.10.0
astropy: 5.0

sklearn : 1.0.1
numpy   : 1.22.1
json    : 2.0.9
autopep8: 1.5.7

Watermark: 2.2.0

