# Neomaril Training

This notebook gives an example of how to use Neomaril to train an ML model.

## NeomarilTrainingClient

This is where you manage your training experiments.

In [1]:
# Import the client
from neomaril_codex.training import NeomarilTrainingClient

In [2]:
# Start the client. The credentials are read from the NEOMARIL_TOKEN environment variable

client = NeomarilTrainingClient()
client

2023-10-26 09:12:51.870 | INFO     | neomaril_codex.training:__init__:690 - Loading .env
2023-10-26 09:12:51.871 | INFO     | neomaril_codex.base:__init__:90 - Loading .env
2023-10-26 09:12:53.856 | INFO     | neomaril_codex.base:__init__:102 - Successfully connected to Neomaril


NeomarilTrainingClient(url="https://neomaril.staging.datarisk.net/api", version="1.0")
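Because the client reads its credentials from the environment, a missing token only shows up when the constructor runs. The sketch below is a small pre-flight check using only the standard library. `require_env` is a hypothetical helper, not part of `neomaril_codex`. It only inspects the process environment, so it will not see a token that exists solely in a `.env` file (the log above suggests the client loads that file itself).

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a readable error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set or is empty")
    return value


if __name__ == "__main__":
    try:
        require_env("NEOMARIL_TOKEN")
        print("NEOMARIL_TOKEN found")
    except RuntimeError as err:
        print(err)
```

Calling `require_env("NEOMARIL_TOKEN")` right before `NeomarilTrainingClient()` is one way to fail early with a clear message.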

## NeomarilTrainingExperiment

This is where you create a training experiment to find the best model.

#### Custom training

With Custom training, you write the training function yourself and upload it in a source file.

In [3]:
# Creating a new training experiment
training = client.create_training_experiment('Teste notebook Training custom', # Experiment name, this is how you find your model in MLflow
                                            'Classification', # Model type. Can be Classification, Regression or Unsupervised
                                            'Custom', # Training type. Can be Custom or AutoML
                                            group='datarisk' # This is the default group. Create a new one when using for a new project
                                            )

2023-10-26 09:12:56.644 | INFO     | neomaril_codex.training:create_training_experiment:796 - New Training 'Teste notebook Training custom' inserted.
2023-10-26 09:12:56.645 | INFO     | neomaril_codex.training:__init__:359 - Loading .env


In [4]:
training

NeomarilTrainingExperiment(name="Teste notebook Training custom", 
                                                        group="datarisk", 
                                                        training_id="T12c2f68c80540aaa05541b4e1be282916c1089c1d5d416e978b81830ed3410f",
                                                        training_type="Custom",
                                                        model_type=Classification
                                                        )

In [5]:
# With the experiment class we can create multiple model runs
PATH = './samples/train/'

run = training.run_training('First test', # Run name
                            PATH+'dados.csv', # Path to the file with training data
                            source_file=PATH+'app.py', # Path of the source file
                            requirements_file=PATH+'requirements.txt', # Path of the requirements file
#                           env=PATH+'.env',  # File for env variables (this will be encrypted in the server)
#                           extra_files=[PATH+'utils.py'], # List of extra file paths to upload along (they will all be in the same folder)
                            training_reference='train_model', # The name of the entrypoint function that is going to be called inside the source file 
                            python_version='3.9', # Can be 3.7 to 3.10
                            wait_complete=True
)

2023-10-26 09:13:02.375 | INFO     | neomaril_codex.training:__upload_training:481 - {"ExecutionId":1,"Message":"Training files have been uploaded. Use the execution id \u00271\u0027 to execute the train experiment run."}
2023-10-26 09:13:04.117 | INFO     | neomaril_codex.training:__execute_training:505 - Model training starting - Hash: T12c2f68c80540aaa05541b4e1be282916c1089c1d5d416e978b81830ed3410f
2023-10-26 09:13:05.264 | INFO     | neomaril_codex.base:__init__:279 - Loading .env
2023-10-26 09:13:08.420 | INFO     | neomaril_codex.training:__init__:72 - Loading .env


Wating the training run..........
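A run uploads several local files, so a mistyped path is an easy mistake to make. The sketch below checks the paths before calling `run_training`, using only the standard library. `missing_files` is a hypothetical helper, not part of `neomaril_codex`; the file names are the ones used in the cell above.

```python
from pathlib import Path
from typing import List


def missing_files(*paths: str) -> List[str]:
    """Return the subset of the given paths that are not existing files."""
    return [p for p in paths if not Path(p).is_file()]


PATH = "./samples/train/"
missing = missing_files(
    PATH + "dados.csv",         # training data
    PATH + "app.py",            # source file with the entrypoint function
    PATH + "requirements.txt",  # requirements file
)
if missing:
    print("Missing files:", missing)
```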

In [6]:
run.get_status()

{'ExecutionId': '1',
 'Status': 'Succeeded',
 'Message': 'wasbs://mlflow-staging@datariskmlops.blob.core.windows.net/artifacts/5/681f9d128726486196df7ef662a51179/artifacts'}
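The cell above waits for the run with `wait_complete=True`. If a run is started without waiting, one option is to poll `get_status()` until the run finishes. The sketch below assumes the status dictionary has the shape shown above and that `"Failed"` is the other terminal value; only `"Succeeded"` appears in this notebook, so treat the set of terminal states as an assumption. `wait_for_status` is a hypothetical helper, not part of `neomaril_codex`.

```python
import time
from typing import Callable, Dict, Iterable


def wait_for_status(
    get_status: Callable[[], Dict],
    done: Iterable[str] = ("Succeeded", "Failed"),  # "Failed" is an assumed value
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Call get_status() until its 'Status' is terminal or the timeout expires."""
    terminal = set(done)
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status.get("Status") in terminal:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Run did not finish within {timeout} seconds")
        sleep(interval)
```

With a run object like the one above, this would be called as `wait_for_status(run.get_status)`.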

In [7]:
run.execution_data

{'TrainingHash': 'T9b0c79d4e5440b8b0e8f3595ed4a34ef9dce58da76f452db3fd748f66e568e6',
 'ExperimentName': 'Teste notebook Training custom',
 'GroupName': 'datarisk',
 'ModelType': 'Classification',
 'TrainingType': 'Custom',
 'ExecutionId': 2,
 'RunName': 'First test',
 'ExecutionState': 'Succeeded',
 'TimeElapsed': 227514,
 'RunData': {'metrics': [{'key': 'f1_score-5_X_test-10',
    'value': 0.985915492957746,
    'timestamp': 1698247311771,
    'step': 0},
   {'key': 'auc',
    'value': 0.992620971305871,
    'timestamp': 1698247312505,
    'step': 0},
   {'key': 'f1_score',
    'value': 0.976180961216634,
    'timestamp': 1698247312505,
    'step': 0},
   {'key': 'training_precision_score',
    'value': 1.0,
    'timestamp': 1698247311953,
    'step': 0},
   {'key': 'training_recall_score',
    'value': 1.0,
    'timestamp': 1698247311953,
    'step': 0},
   {'key': 'training_f1_score',
    'value': 1.0,
    'timestamp': 1698247311953,
    'step': 0},
   {'key': 'training_accuracy_sco
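The metrics arrive as a list of `key`/`value`/`timestamp`/`step` records, which is awkward to read at a glance. The sketch below flattens that list into a plain dictionary, keeping the latest value per key. It assumes the structure shown in the output above; `metrics_to_dict` is a hypothetical helper, and the sample reuses two of the records printed above.

```python
from typing import Any, Dict


def metrics_to_dict(execution_data: Dict[str, Any]) -> Dict[str, float]:
    """Map each metric key to its latest value (ordered by step, then timestamp)."""
    metrics = execution_data.get("RunData", {}).get("metrics", [])
    latest: Dict[str, float] = {}
    for record in sorted(metrics, key=lambda r: (r["step"], r["timestamp"])):
        latest[record["key"]] = record["value"]
    return latest


sample = {
    "RunData": {
        "metrics": [
            {"key": "auc", "value": 0.992620971305871,
             "timestamp": 1698247312505, "step": 0},
            {"key": "f1_score", "value": 0.976180961216634,
             "timestamp": 1698247312505, "step": 0},
        ]
    }
}

for name, value in metrics_to_dict(sample).items():
    print(f"{name}: {value:.4f}")
```

With a finished run, `metrics_to_dict(run.execution_data)` would give the same view for all logged metrics.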

In [8]:
# When the run is finished you can download the model file
run.download_result()

# or promote it to a deployed model

PATH = './samples/syncModel/'

model = run.promote_model('Teste notebook promoted custom', # model_name
                            'score', # name of the scoring function
                            PATH+'app.py', # Path of the source file
                            PATH+'schema.json', # Path of the schema file, but it could be a dict
#                           env=PATH+'.env',  # File for env variables (this will be encrypted in the server)
#                           extra_files=[PATH+'utils.py'], # List of extra file paths to upload along (they will all be in the same folder)
                            operation="Sync" # Can be Sync or Async
)

2023-10-25 12:22:12.051 | INFO     | neomaril_codex.base:download_result:413 - Output saved in ./output.zip
2023-10-25 12:22:13.362 | INFO     | neomaril_codex.training:__upload_model:177 - Model 'Teste notebook promoted custom' promoted from T9b0c79d4e5440b8b0e8f3595ed4a34ef9dce58da76f452db3fd748f66e568e6 - Hash: "M9174acaa0e54298877a00f6cc765bd17fe29ecb20694c859c4ed88acac9afba"
2023-10-25 12:22:15.299 | INFO     | neomaril_codex.training:__host_model:242 - Model host in process - Hash: M9174acaa0e54298877a00f6cc765bd17fe29ecb20694c859c4ed88acac9afba
2023-10-25 12:22:15.321 | INFO     | neomaril_codex.model:__init__:69 - Loading .env


In [9]:
model

NeomarilModel(name="Teste notebook promoted custom", group="datarisk", 
                                status="Building",
                                model_id="M9174acaa0e54298877a00f6cc765bd17fe29ecb20694c859c4ed88acac9afba",
                                operation="Sync",
                                schema={'mean_radius': 17.99, 'mean_texture': 10.38, 'mean_perimeter': 122.8, 'mean_area': 1001.0, 'mean_smoothness': 0.1184, 'mean_compactness': 0.2776, 'mean_concavity': 0.3001, 'mean_concave_points': 0.1471, 'mean_symmetry': 0.2419, 'mean_fractal_dimension': 0.07871, 'radius_error': 1.095, 'texture_error': 0.9053, 'perimeter_error': 8.589, 'area_error': 153.4, 'smoothness_error': 0.006399, 'compactness_error': 0.04904, 'concavity_error': 0.05373, 'concave_points_error': 0.01587, 'symmetry_error': 0.03003, 'fractal_dimension_error': 0.006193, 'worst_radius': 25.38, 'worst_texture': 17.33, 'worst_perimeter': 184.6, 'worst_area': 2019.0, 'worst_smoothness': 0.1622, 'worst_compa
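The schema shown in the model representation looks like a single sample input, mapping each feature name to an example value. Since `promote_model` accepts either a file path or a dict, a schema file can be produced from such a sample. The sketch below writes one with the standard library. `write_schema` is a hypothetical helper, and the sample holds only the first three features from the output above; a real schema would need every feature the model expects.

```python
import json
import tempfile
from pathlib import Path


def write_schema(sample: dict, path: str) -> Path:
    """Write a sample input payload to a JSON file and return its path."""
    target = Path(path)
    target.write_text(json.dumps(sample, indent=2), encoding="utf-8")
    return target


# First three features from the output above, for illustration only
sample_input = {
    "mean_radius": 17.99,
    "mean_texture": 10.38,
    "mean_perimeter": 122.8,
}

if __name__ == "__main__":
    # Written to a temporary folder so no existing schema.json is overwritten
    with tempfile.TemporaryDirectory() as folder:
        schema_path = write_schema(sample_input, str(Path(folder) / "schema.json"))
        print(schema_path.read_text(encoding="utf-8"))
```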

#### AutoML

With AutoML, you only need to upload the data and a configuration file.

In [10]:
# Creating a new training experiment
training = client.create_training_experiment('Teste notebook Training AutoML', # Experiment name
                                            'Classification', # Model type. Can be Classification, Regression or Unsupervised
                                            'AutoML', # Training type. Can be Custom or AutoML
                                            group='datarisk' # This is the default group. Create a new one when using for a new project
                                            )

PATH = './samples/autoML/'

run = training.run_training('First test', # Run name
                            PATH+'dados.csv', # Path to the file with training data
                            conf_dict=PATH+'conf.json', # Path of the configuration file
                            wait_complete=True
)

2023-10-25 12:22:22.591 | INFO     | neomaril_codex.training:create_training_experiment:796 - New Training 'Teste notebook Training AutoML' inserted.
2023-10-25 12:22:22.592 | INFO     | neomaril_codex.training:__init__:359 - Loading .env
2023-10-25 12:22:30.463 | INFO     | neomaril_codex.training:__upload_training:481 - {"ExecutionId":3,"Message":"Training files have been uploaded. Use the execution id \u00273\u0027 to execute the train experiment run."}
2023-10-25 12:22:33.433 | INFO     | neomaril_codex.training:__execute_training:505 - Model training starting - Hash: T33e9bc4fac6431e836bf637b012a0d41c5ffa6d873c401d917a51d2cb8cbcb8
2023-10-25 12:22:34.533 | INFO     | neomaril_codex.base:__init__:279 - Loading .env
2023-10-25 12:22:37.419 | INFO     | neomaril_codex.training:__init__:72 - Loading .env


Wating the training run...................

In [11]:
run

NeomarilTrainingExecution(name="Teste notebook Training AutoML",
                                        exec_id="3", status="Succeeded")

In [12]:
run.get_status()

{'ExecutionId': '3',
 'Status': 'Succeeded',
 'Message': 'wasbs://mlflow-staging@datariskmlops.blob.core.windows.net/artifacts/6/8ee81051ee3d4015b9677021deb5474b/artifacts'}
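For a successful run, the `Message` field carries a `wasbs://` URI, the Azure Blob Storage scheme of the form `wasbs://<container>@<account>.blob.core.windows.net/<path>`. The sketch below splits such a URI into its parts with the standard library, which can help when locating the artifacts in a storage browser. `parse_wasbs_uri` is a hypothetical helper, not part of `neomaril_codex`.

```python
from typing import Dict
from urllib.parse import urlsplit


def parse_wasbs_uri(uri: str) -> Dict[str, str]:
    """Split a wasbs:// URI into container, storage account and blob path."""
    parts = urlsplit(uri)
    if parts.scheme not in ("wasb", "wasbs"):
        raise ValueError(f"Not a wasb(s) URI: {uri}")
    if not parts.username or not parts.hostname:
        raise ValueError(f"Missing container or account in URI: {uri}")
    return {
        "container": parts.username,              # text before the '@'
        "account": parts.hostname.split(".")[0],  # first label of the host name
        "path": parts.path.lstrip("/"),
    }


info = parse_wasbs_uri(
    "wasbs://mlflow-staging@datariskmlops.blob.core.windows.net"
    "/artifacts/6/8ee81051ee3d4015b9677021deb5474b/artifacts"
)
print(info)
```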

In [13]:
run

NeomarilTrainingExecution(name="Teste notebook Training AutoML",
                                        exec_id="3", status="Succeeded")

In [14]:
# Promoting an AutoML model is a lot easier

PATH = './samples/syncModel/'

model = run.promote_model('Teste notebook promoted autoML', # model_name
                            operation="Async" # Can be Sync or Async
)

2023-10-25 12:33:05.903 | INFO     | neomaril_codex.training:__upload_model:177 - Model 'Teste notebook promoted autoML' promoted from T33e9bc4fac6431e836bf637b012a0d41c5ffa6d873c401d917a51d2cb8cbcb8 - Hash: "M955d9193a0a48a3817c0c512ac91de7d0716214a8334059bf11fe8aa4d80a2f"
2023-10-25 12:33:08.786 | INFO     | neomaril_codex.training:__host_model:242 - Model host in process - Hash: M955d9193a0a48a3817c0c512ac91de7d0716214a8334059bf11fe8aa4d80a2f
2023-10-25 12:33:08.788 | INFO     | neomaril_codex.model:__init__:69 - Loading .env


In [15]:
model

NeomarilModel(name="Teste notebook promoted autoML", group="datarisk", 
                                status="Building",
                                model_id="M955d9193a0a48a3817c0c512ac91de7d0716214a8334059bf11fe8aa4d80a2f",
                                operation="Async",
                                schema={}
                                )