# Neomaril Training

This notebook gives an example of how to use Neomaril to train an ML model.

## NeomarilTrainingClient

It's where you can manage your training experiments.

In [1]:
# Import the client
from neomaril_codex.training import NeomarilTrainingClient

In [2]:
# Start the client. The credentials are read from the NEOMARIL_TOKEN environment variable

client = NeomarilTrainingClient()
client

2023-11-09 12:07:43.720 | INFO     | neomaril_codex.training:__init__:1063 - Loading .env
2023-11-09 12:07:43.722 | INFO     | neomaril_codex.base:__init__:90 - Loading .env
2023-11-09 12:07:45.565 | INFO     | neomaril_codex.base:__init__:102 - Successfully connected to Neomaril


NeomarilTrainingClient(url="http://localhost:7070/api", version="eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCIsImtpZCI6IlFnc0JWQ0I5WFc0V1YtSkVCVkJiZyJ9.eyJodHRwczovL25lb21hcmlsLmRhdGFyaXNrLm5ldC9uZW9tYXJpbC1ncm91cCI6ImRhdGFyaXNrIiwiaXNzIjoiaHR0cHM6Ly9kZXYtbWszbzdsYXp4bGUzMGh3cS51cy5hdXRoMC5jb20vIiwic3ViIjoiYXV0aDB8NjU0OTRlMWFkOTUzN2FlMGFhZDZjNGE5IiwiYXVkIjpbImh0dHBzOi8vZGV2LW1rM283bGF6eGxlMzBod3EudXMuYXV0aDAuY29tL2FwaS92Mi8iLCJodHRwczovL2Rldi1tazNvN2xhenhsZTMwaHdxLnVzLmF1dGgwLmNvbS91c2VyaW5mbyJdLCJpYXQiOjE2OTk1NDI0NjUsImV4cCI6MTY5OTU1MzI2NSwiYXpwIjoia3JCNk1sR3ZkOEdBSUNZd1hPd0labU5DMFk5VGFQQTciLCJzY29wZSI6Im9wZW5pZCBwcm9maWxlIGVtYWlsIGFkZHJlc3MgcGhvbmUgcmVhZDpjdXJyZW50X3VzZXIgdXBkYXRlOmN1cnJlbnRfdXNlcl9tZXRhZGF0YSBkZWxldGU6Y3VycmVudF91c2VyX21ldGFkYXRhIGNyZWF0ZTpjdXJyZW50X3VzZXJfbWV0YWRhdGEgY3JlYXRlOmN1cnJlbnRfdXNlcl9kZXZpY2VfY3JlZGVudGlhbHMgZGVsZXRlOmN1cnJlbnRfdXNlcl9kZXZpY2VfY3JlZGVudGlhbHMgdXBkYXRlOmN1cnJlbnRfdXNlcl9pZGVudGl0aWVzIG9mZmxpbmVfYWNjZXNzIiwiZ3R5IjoicGFzc3dvcmQifQ.A7LBdPwObV6DkP8GjEk
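
Before creating the client, it can help to confirm that a token is actually available. The following is a minimal sketch, assuming the token lives in the `NEOMARIL_TOKEN` environment variable as the comment in the cell above states. The helper name `has_neomaril_token` is hypothetical and not part of `neomaril_codex`. The log output above shows the client also loads a `.env` file, so a missing variable here is a hint rather than a definitive error.

```python
import os


def has_neomaril_token(env=None):
    """Return True when a non-empty NEOMARIL_TOKEN is present in `env`.

    `env` defaults to the process environment; passing a plain dict
    makes the check easy to try out without touching real settings.
    """
    env = os.environ if env is None else env
    return bool(env.get("NEOMARIL_TOKEN", "").strip())


if not has_neomaril_token():
    # Only a hint: the token may still be picked up from a .env file.
    print("NEOMARIL_TOKEN is not set in the environment; check your .env file")
```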

## NeomarilTrainingExperiment

It's where you can create a training experiment to find the best model

#### Custom training

With Custom training you have to create the training function.

In [3]:
# Creating a new training experiment
training = client.create_training_experiment('Teste notebook Training custom', # Experiment name; this is how you find your model in MLflow
                                            'Classification', # Model type. Can be Classification, Regression or Unsupervised
                                            group='datarisk' # This is the default group. Create a new one when starting a new project
                                            )

2023-11-09 12:07:45.604 | INFO     | neomaril_codex.training:create_training_experiment:1164 - New Training 'Teste notebook Training custom' inserted.
2023-11-09 12:07:45.606 | INFO     | neomaril_codex.training:__init__:611 - Loading .env


In [4]:
training

NeomarilTrainingExperiment(name="Teste notebook Training custom", 
                                                        group="datarisk", 
                                                        training_id="T2e179e0f66e4c8b9507da56d706c0079c0d2d910b4648b6a6c5e17d7240d6f5",
                                                        model_type=Classification
                                                        )

In [5]:
# With the experiment class we can create multiple model runs
PATH = './samples/train/'

run = training.run_training('First test', # Run name
                            train_data=PATH+'dados.csv', # Path to the file with training data
                            source_file=PATH+'app.py', # Path of the source file
                            requirements_file=PATH+'requirements.txt', # Path of the requirements file
#                           env=PATH+'.env', # File with env variables (it will be encrypted on the server)
#                           extra_files=[PATH+'utils.py'], # List of extra file paths to upload (they all end up in the same folder)
                            training_reference='train_model', # Name of the entrypoint function called inside the source file
                            training_type='Custom',
                            python_version='3.9', # Can be 3.7 to 3.10
                            wait_complete=True
)

2023-11-09 12:07:45.759 | INFO     | neomaril_codex.training:__execute_training:843 - Model training starting - Hash: T2e179e0f66e4c8b9507da56d706c0079c0d2d910b4648b6a6c5e17d7240d6f5
2023-11-09 12:07:45.775 | INFO     | neomaril_codex.base:__init__:279 - Loading .env
2023-11-09 12:07:45.806 | INFO     | neomaril_codex.training:__init__:329 - Loading .env


logger.info({"ExecutionId":5,"Message":"Training files have been uploaded! Use the id \u00275\u0027 to execute the train experiment."}
Wating the training run.......

In [6]:
run.get_status()

{'ExecutionId': '5',
 'Status': 'Succeeded',
 'Message': 'wasbs://mlflow-dev@datariskmlops.blob.core.windows.net/artifacts/2/9919f2e9a0314ebab75c41e17abd07b9/artifacts'}
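
If a run is started without waiting for completion, the status has to be checked repeatedly. The sketch below polls any callable that returns a dict with a `'Status'` key, which is the shape `run.get_status()` returns above. The helper `wait_for_status` is hypothetical, and only `'Succeeded'` is confirmed by this notebook; `'Failed'` as a terminal state is an assumption.

```python
import time


def wait_for_status(get_status, terminal=("Succeeded", "Failed"),
                    interval=5.0, timeout=600.0):
    """Poll `get_status` until its 'Status' is terminal or `timeout` elapses.

    `get_status` is any zero-argument callable returning a dict with a
    'Status' key. "Failed" is an assumed terminal state, not one shown
    in the notebook output.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status.get("Status") in terminal:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"status is still {status.get('Status')!r} after {timeout} seconds"
            )
        time.sleep(interval)
```

With a run object like the one above, this would be called as `wait_for_status(run.get_status)`.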

In [7]:
run.execution_data

{'TrainingHash': 'T2e179e0f66e4c8b9507da56d706c0079c0d2d910b4648b6a6c5e17d7240d6f5',
 'ExperimentName': 'Teste notebook Training custom',
 'GroupName': 'datarisk',
 'ModelType': 'Classification',
 'TrainingType': 'Custom',
 'ExecutionId': 5,
 'RunName': 'First test',
 'ExecutionState': 'Succeeded',
 'TimeElapsed': 180229,
 'Description': '',
 'Deployable': True,
 'RunData': {'metrics': [{'key': 'training_precision_score',
    'value': 1.0,
    'timestamp': 1699542572563,
    'step': 0},
   {'key': 'training_recall_score',
    'value': 1.0,
    'timestamp': 1699542572563,
    'step': 0},
   {'key': 'training_f1_score',
    'value': 1.0,
    'timestamp': 1699542572563,
    'step': 0},
   {'key': 'training_accuracy_score',
    'value': 1.0,
    'timestamp': 1699542572563,
    'step': 0},
   {'key': 'training_log_loss',
    'value': 0.00047178298806016674,
    'timestamp': 1699542572563,
    'step': 0},
   {'key': 'training_roc_auc',
    'value': 1.0,
    'timestamp': 1699542572563,
    's
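
The metrics in `RunData` come as a list of records, which is awkward to read at a glance. A minimal sketch of flattening them into a `{key: value}` mapping follows, assuming the structure shown in the output above (`key`, `value`, `timestamp`, `step`). The helper `metrics_to_dict` is hypothetical, and the sample data copies two entries from that output.

```python
def metrics_to_dict(execution_data):
    """Map each metric key to its value, keeping the entry with the highest step."""
    latest = {}
    for metric in execution_data.get("RunData", {}).get("metrics", []):
        key = metric["key"]
        # Later steps overwrite earlier ones for the same metric key.
        if key not in latest or metric.get("step", 0) >= latest[key].get("step", 0):
            latest[key] = metric
    return {key: metric["value"] for key, metric in latest.items()}


# Sample shaped like the execution data above (values copied from it)
sample = {"RunData": {"metrics": [
    {"key": "training_precision_score", "value": 1.0,
     "timestamp": 1699542572563, "step": 0},
    {"key": "training_log_loss", "value": 0.00047178298806016674,
     "timestamp": 1699542572563, "step": 0},
]}}
print(metrics_to_dict(sample))
```

With a real run, the call would be `metrics_to_dict(run.execution_data)`.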

In [8]:
# When the run is finished you can download the model file
run.download_result()

# or promote it to a deployed model

PATH = './samples/syncModel/'

model = run.promote_model('Teste notebook promoted custom', # model_name
                            'score', # name of the scoring function
                            PATH+'app.py', # Path of the source file
                            PATH+'schema.json', # Path of the schema file; a dict also works
#                           env=PATH+'.env', # File with env variables (it will be encrypted on the server)
#                           extra_files=[PATH+'utils.py'], # List of extra file paths to upload (they all end up in the same folder)
                            operation="Sync" # Can be Sync or Async
)

2023-11-09 12:10:48.166 | INFO     | neomaril_codex.base:download_result:413 - Output saved in ./output.zip


2023-11-09 12:10:48.208 | INFO     | neomaril_codex.training:__upload_model:429 - Model 'Teste notebook promoted custom' promoted from T2e179e0f66e4c8b9507da56d706c0079c0d2d910b4648b6a6c5e17d7240d6f5 - Hash: "Mbe371ff4bfc4f82897619da84c3c004211614c6677d431ca98a251d5da826d5"
2023-11-09 12:10:50.758 | INFO     | neomaril_codex.training:__host_model:494 - Model host in process - Hash: Mbe371ff4bfc4f82897619da84c3c004211614c6677d431ca98a251d5da826d5
2023-11-09 12:10:50.760 | INFO     | neomaril_codex.model:__init__:69 - Loading .env


In [9]:
model

NeomarilModel(name="Teste notebook promoted custom", group="datarisk", 
                                status="Building",
                                model_id="Mbe371ff4bfc4f82897619da84c3c004211614c6677d431ca98a251d5da826d5",
                                operation="Sync",
                                schema={'mean_radius': 17.99, 'mean_texture': 10.38, 'mean_perimeter': 122.8, 'mean_area': 1001.0, 'mean_smoothness': 0.1184, 'mean_compactness': 0.2776, 'mean_concavity': 0.3001, 'mean_concave_points': 0.1471, 'mean_symmetry': 0.2419, 'mean_fractal_dimension': 0.07871, 'radius_error': 1.095, 'texture_error': 0.9053, 'perimeter_error': 8.589, 'area_error': 153.4, 'smoothness_error': 0.006399, 'compactness_error': 0.04904, 'concavity_error': 0.05373, 'concave_points_error': 0.01587, 'symmetry_error': 0.03003, 'fractal_dimension_error': 0.006193, 'worst_radius': 25.38, 'worst_texture': 17.33, 'worst_perimeter': 184.6, 'worst_area': 2019.0, 'worst_smoothness': 0.1622, 'worst_compa
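
According to the comment in the promotion cell, the schema can be given as a file path or as a dict. The sketch below loads the file yourself, for example to inspect the expected input fields before promoting. The helper `load_schema` is hypothetical and not part of `neomaril_codex`; it assumes the schema file holds a single JSON object mapping field names to example values, as the `schema=` output above suggests.

```python
import json
from pathlib import Path


def load_schema(path):
    """Read a JSON schema file and return its contents as a dict."""
    with Path(path).open(encoding="utf-8") as handle:
        schema = json.load(handle)
    if not isinstance(schema, dict):
        raise ValueError(
            f"expected a JSON object in {path}, got {type(schema).__name__}"
        )
    return schema
```

The resulting dict could then be passed in place of `PATH+'schema.json'`, for instance `load_schema('./samples/syncModel/schema.json')`.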

#### AutoML

With AutoML, you only need to upload the data and a configuration file.

In [10]:
PATH = './samples/autoML/'

run = training.run_training('First test', # Run name
                            training_type='AutoML',
                            train_data=PATH+'dados.csv', # Path to the file with training data
                            conf_dict=PATH+'conf.json', # Path of the configuration file
                            wait_complete=True
)

logger.info({"ExecutionId":6,"Message":"Training files have been uploaded! Use the id \u00276\u0027 to execute the train experiment."}


2023-11-09 12:10:51.104 | INFO     | neomaril_codex.training:__execute_training:843 - Model training starting - Hash: T2e179e0f66e4c8b9507da56d706c0079c0d2d910b4648b6a6c5e17d7240d6f5
2023-11-09 12:10:51.120 | INFO     | neomaril_codex.base:__init__:279 - Loading .env
2023-11-09 12:10:51.147 | INFO     | neomaril_codex.training:__init__:329 - Loading .env


Wating the training run.............

In [11]:
run

NeomarilTrainingExecution(name="Teste notebook Training custom",
                                        exec_id="6", status="Succeeded")

In [12]:
run.get_status()

{'ExecutionId': '6',
 'Status': 'Succeeded',
 'Message': 'wasbs://mlflow-dev@datariskmlops.blob.core.windows.net/artifacts/2/a9d853a19095410eb3cf4aa5c57613f9/artifacts'}

In [13]:
run

NeomarilTrainingExecution(name="Teste notebook Training custom",
                                        exec_id="6", status="Succeeded")

In [14]:
# Promoting an AutoML model is a lot easier

PATH = './samples/syncModel/'

model = run.promote_model('Teste notebook promoted autoML', # model_name
                            operation="Async" # Can be Sync or Async
)

2023-11-09 12:16:57.717 | INFO     | neomaril_codex.training:__upload_model:429 - Model 'Teste notebook promoted autoML' promoted from T2e179e0f66e4c8b9507da56d706c0079c0d2d910b4648b6a6c5e17d7240d6f5 - Hash: "Mba250c10ea9408284db84b02f954708e73528a67e3b49a48a04a3b641b3a247"


2023-11-09 12:16:57.947 | INFO     | neomaril_codex.training:__host_model:494 - Model host in process - Hash: Mba250c10ea9408284db84b02f954708e73528a67e3b49a48a04a3b641b3a247
2023-11-09 12:16:57.949 | INFO     | neomaril_codex.model:__init__:69 - Loading .env


In [15]:
model

NeomarilModel(name="Teste notebook promoted autoML", group="datarisk", 
                                status="Building",
                                model_id="Mba250c10ea9408284db84b02f954708e73528a67e3b49a48a04a3b641b3a247",
                                operation="Async",
                                schema={}
                                )