# MLOps Training

This notebook gives an example of how to use MLOps to train an ML model.

### MLOpsTrainingClient

This is where you manage your training experiments.

In [None]:
from mlops_codex.training import MLOpsTrainingClient

### Initializing the MLOpsTrainingClient
In this cell, we are initializing the `MLOpsTrainingClient` which will be used to manage our training experiments.

In [None]:
client = MLOpsTrainingClient()
client

## MLOpsTrainingExperiment

This is where you create a training experiment to find the best model.

#### Custom training

With Custom training, you write the training function yourself. Data scientists commonly re-run an entire notebook over and over. To avoid creating the same experiment on every run, `force = False` prevents a duplicate from being created. If you want a new experiment with the same attributes, set `force = True`.

If two identical experiments already exist and you pass `force = False`, the first one created is used.
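
The reuse behaviour described above can be pictured with a small in-memory stand-in. This is only a sketch of the semantics: `ExperimentRegistry` and its `create` method are hypothetical names invented for illustration and are not part of `mlops_codex`.

```python
# Hypothetical in-memory stand-in for the experiment registry.
# It only mimics the `force` semantics described above; it is not the mlops_codex API.
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ExperimentRegistry:
    experiments: list = field(default_factory=list)
    _ids: count = field(default_factory=lambda: count(1))

    def create(self, name, model_type, group, force=False):
        key = (name, model_type, group)
        if not force:
            # Reuse the first-created experiment that has the same attributes.
            for exp in self.experiments:
                if exp["key"] == key:
                    return exp
        # No match, or force=True: register a brand-new experiment.
        exp = {"id": next(self._ids), "key": key}
        self.experiments.append(exp)
        return exp


registry = ExperimentRegistry()
first = registry.create("experiment", "Classification", "<group>")
rerun = registry.create("experiment", "Classification", "<group>")  # reuses `first`
forced = registry.create("experiment", "Classification", "<group>", force=True)  # new one
again = registry.create("experiment", "Classification", "<group>")  # first created wins
```

Re-running the cell without `force=True` leaves the registry unchanged, which is why repeated notebook runs do not pile up duplicate experiments.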

In [None]:
# Creating a new training experiment
training = client.create_training_experiment(
    experiment_name='experiment',
    model_type='Classification',
    group='<group>',
)

In [None]:
training

In [None]:
# With the experiment class we can create multiple model runs
PATH = './samples/train/'

run = training.run_training(
    run_name='First test',
    training_type='Custom',
    train_data=PATH + 'dados.csv',
    requirements_file=PATH + 'requirements.txt',
    source_file=PATH + 'app.py',
    python_version='3.9',
    training_reference='train_model',
    wait_complete=True
)

#### AutoML

With AutoML, you only need to upload the data and a configuration file.

In [None]:
PATH = './samples/autoML/'

run2 = training.run_training(
    run_name='First test',
    training_type='AutoML',
    conf_dict=PATH + "conf.json",
    train_data=PATH + 'dados.csv',
    wait_complete=True
)

#### External Training

Besides AutoML and Custom training, you can train a model on your own machine and upload the resulting files.

See the example below.

In [None]:
PATH = './samples/uploadTrainedModel/'

run3 = training.run_training(
    run_name='First test',
    training_type="External",
    features_file=PATH + 'features.parquet',
    target_file=PATH + 'target.parquet',
    output_file=PATH + 'predictions.parquet',
    metrics_file=PATH + 'metrics.json',
    parameters_file=PATH + 'parameters.json',
    requirements_file=PATH + 'requirements.txt',
    wait_complete=True
)
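
The `metrics.json` and `parameters.json` files passed above have to be produced by your local training code. The notebook does not show their schema, so the sketch below assumes a flat name-to-value JSON object for each; the keys and values are placeholders chosen for illustration, not real results, and a temporary directory stands in for the sample folder.

```python
import json
import tempfile
from pathlib import Path

# Assumption: both files are flat JSON objects mapping a name to a value.
# The values below are placeholders for illustration, not real results.
metrics = {"auc": 0.0, "f1_score": 0.0}
parameters = {"n_estimators": 100, "learning_rate": 0.1}

# A temporary directory stands in for './samples/uploadTrainedModel/'.
out_dir = Path(tempfile.mkdtemp())

for filename, payload in [("metrics.json", metrics), ("parameters.json", parameters)]:
    with open(out_dir / filename, "w", encoding="utf-8") as fh:
        json.dump(payload, fh, indent=2)

# Read one file back to confirm it round-trips as valid JSON.
with open(out_dir / "metrics.json", encoding="utf-8") as fh:
    reloaded = json.load(fh)
```

If the platform expects a different layout, only the two dictionaries need to change; the writing logic stays the same.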

---

#### Interactive External Training

If you want something more interactive, take a look at the example below.

In [None]:
from mlops_codex.training import MLOpsTrainingClient
client = MLOpsTrainingClient()
training = client.create_training_experiment(
    experiment_name='Teste',
    model_type='Classification',
    group='<group>'
)

In [None]:
import pandas as pd
from lightgbm import LGBMClassifier
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

In [None]:
base_path = './samples/train/'
df = pd.read_csv(base_path + "dados.csv")  # base_path already ends with '/'
X = df.drop(columns=['target'])
y = df[["target"]]

In [None]:
import matplotlib.pyplot as plt

plt.scatter(df["mean_radius"], df["mean_texture"])

# Set the plot title
plt.title("Relationship between mean_radius and mean_texture")

# Set the axis labels
plt.xlabel("mean_radius")
plt.ylabel("mean_texture")

# Keep a handle to the figure so it can be logged later
fig = plt.gcf()

# Show the plot
plt.show()


In [None]:
pipe = make_pipeline(SimpleImputer(), LGBMClassifier(force_col_wise=True))
pipe.fit(X, y)

In [None]:
with training.log_train(name='Teste 2', X_train=X, y_train=y) as logger:
    logger.save_model(pipe)

    model_output = pd.DataFrame({"pred": pipe.predict(X), "proba": pipe.predict_proba(X)[:,1]})

    logger.save_model_output(model_output)

    logger.save_plot(fig=fig, filename="test-image")

    auc = cross_val_score(pipe, X, y, cv=5, scoring="roc_auc")
    f_score = cross_val_score(pipe, X, y, cv=5, scoring="f1")
    logger.save_metric(name='auc', value=auc.mean())
    logger.save_metric(name='f1_score', value=f_score.mean())
