# Setup

Before you run this notebook you should make sure that you use the appropriate kernel.

You can create the kernel using the repo's conda environment.

For instructions on how to set up your conda/poetry environment and create your kernel, see the documentation in our [wiki](https://code.roche.com/one-d-ai/early-adopters/user-guide-wiki/-/wikis/How-To/sHPC-setup).
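
If you are unsure which environment the notebook is running in, a quick check like the one below can help. It is only a sketch: `describe_kernel` is a hypothetical helper name, not part of this repo, and it simply reports the interpreter behind the current kernel so you can compare it with the environment you created.

```python
import platform
import sys


def describe_kernel() -> dict:
    """Collect basic facts about the interpreter behind the current kernel."""
    return {
        "executable": sys.executable,  # path of the Python binary in use
        "prefix": sys.prefix,  # root folder of the active environment
        "python_version": platform.python_version(),
    }


for key, value in describe_kernel().items():
    print(f"{key}: {value}")
```

If the printed prefix does not point to the environment you set up for this repo, switch the kernel before running the cells below.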

# Import the library

When you run the notebook inside the repo you import the code like a package.

The only file we need from this package is `functions.py`.

In [1]:
from sklearn_diabetes_example import functions

# Train the model

## Define the input parameters

- the path of the training input data _(you can use any of the files provided inside the `data` folder)_
- the path where the model output will be saved _(this should be a path that does not exist yet, otherwise the code will overwrite whatever is there and replace another model)_

To define the paths, we import the `Path` class from `pathlib` here, but feel free to define the file paths in any other way you prefer.

In [2]:
# We use the Path class from pathlib to define paths
from pathlib import Path
# The repo root is the parent of the current working directory
repo_path = Path().absolute().parent

In [3]:
# Choose the name of your model
my_model_name = "sklearn-diabetes-model-new-version.joblib"

In [4]:
# Set up the paths that the function accepts as input
training_set_path = Path(repo_path, "data")
output_model_path = Path(repo_path, "model", my_model_name)
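
Since an existing model at the output path would be replaced, it can be worth checking the path before training. The sketch below shows one way to do that. `ensure_new_path` is a hypothetical helper, not something `functions.py` provides, and the demo uses a temporary folder rather than the real `model` folder.

```python
import tempfile
from pathlib import Path


def ensure_new_path(path: Path) -> Path:
    """Return `path` unchanged, or raise if something already exists there."""
    path = Path(path)
    if path.exists():
        raise FileExistsError(f"{path} already exists; choose a different model name.")
    return path


# Demo in a throwaway folder so no real model file is touched
with tempfile.TemporaryDirectory() as tmp:
    candidate = Path(tmp, "my-model.joblib")
    ensure_new_path(candidate)  # passes: nothing is stored there yet
    candidate.touch()  # simulate a previously saved model
    try:
        ensure_new_path(candidate)
    except FileExistsError as err:
        print("Refused:", err)
```

In this notebook you could call `ensure_new_path(output_model_path)` right before training.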

## Run the train_model() function

If you want to change how the function works:

- go to the `functions.py` file in the repo
- change the code inside the `train_model()` function
- restart the kernel in this notebook
- run the notebook and the function will run with your changes
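
Restarting the kernel is the most reliable way to pick up edits. As a lighter alternative, Python's `importlib.reload` can re-execute an already imported module. The sketch below demonstrates the mechanism on a throwaway module; `demo_functions` is a hypothetical stand-in written to a temporary folder, not the repo's `functions.py`.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in module, created only for this demonstration
    module_file = Path(tmp, "demo_functions.py")
    module_file.write_text("def train_model():\n    return 'old'\n")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import demo_functions

        first = demo_functions.train_model()

        # Simulate editing the file on disk, then reload without a restart
        module_file.write_text("def train_model():\n    return 'edited version'\n")
        importlib.reload(demo_functions)
        second = demo_functions.train_model()
    finally:
        # Clean up so the demo leaves no trace in the session
        sys.path.remove(tmp)
        sys.modules.pop("demo_functions", None)

print(first, "->", second)
```

In this notebook the equivalent call would be `importlib.reload(functions)`. Note that a reload does not update names that were imported with `from module import name`, so a full kernel restart remains the safer choice when in doubt.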

In [5]:
functions.train_model(training_set_path, output_model_path)

Linear model coefs:  [   1.19997199 -233.50009961  519.89061727  304.47222327 -726.41560392
  415.78324592   82.99787136  203.15998985  667.91424959  105.28299311]
Alpha values for Ridge model:  [0.0001     0.00039811 0.00158489 0.00630957 0.02511886 0.1       ]
Scores:  [0.7201620872332906, 0.7196548049101903, 0.7178453402676406, 0.7128274710534698, 0.7035182294760471, 0.6838578174345143]
Best Alpha:  0.0001
Ridge model coefs:  [   1.24029665 -233.44108731  519.95888412  304.41505873 -717.6212677
  408.82031947   79.05912592  202.03295583  664.58871481  105.34554582]
Alpha values for Lasso model:  [0.0001     0.00039811 0.00158489 0.00630957 0.02511886 0.1       ]
Scores:  [0.7202265744916254, 0.7198911974439465, 0.718488734564149, 0.7138336880141429, 0.7062509822470242, 0.6868281585868137]
Best Alpha:  0.0001
Lasso model coefs:  [   1.18818746 -233.41558978  519.94788291  304.40285114 -719.34770423
  410.35330424   79.60733918  201.89171373  665.37591048  105.2915302 ]
Cross validati
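
To read the output above: the alpha values look like six logarithmically spaced points between 1e-4 and 1e-1, and the reported best alpha is the one with the highest score. The sketch below reproduces that reading in plain Python. How `functions.py` actually builds the grid and picks the winner is an assumption here; the scores are the Ridge scores copied from the output and rounded.

```python
# Six log-spaced alphas from 1e-4 to 1e-1 (assumed to match the printed grid)
alphas = [10 ** (-4 + 0.6 * i) for i in range(6)]

# Ridge scores copied from the output above, rounded to five decimals
scores = [0.72016, 0.71965, 0.71785, 0.71283, 0.70352, 0.68386]

# Pick the alpha whose score is highest (assumption: higher is better)
best_index = max(range(len(scores)), key=scores.__getitem__)
best_alpha = alphas[best_index]

print("Alphas:", [f"{a:.6g}" for a in alphas])
print(f"Best alpha: {best_alpha:.4g}")
```

Because the best score sits at the smallest alpha in the grid, the regularised models end up with coefficients very close to those of the plain linear model, which matches the printed coefficients.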

# Finished!

Your model has now been trained and saved at the path you provided in the function's parameters.

To see how you can use the model, go to the `notebooks/use-local-model.ipynb` notebook.