# Neomaril Training

This notebook gives an example of how to use Neomaril to train an ML model.

## NeomarilTrainingClient

This is where you manage your training experiments.

In [1]:
# Import the client
from neomaril_codex.training import NeomarilTrainingClient

In [2]:
# Start the client
token='123'

client = NeomarilTrainingClient(token)
client

2023-01-31 12:56:40.033 | INFO     | neomaril_codex._base:__init__:74 - You are using the test environment that will have the data cleaned from time to time. If your model is ready to use change the flag test_enviroment to False
2023-01-31 12:56:40.039 | INFO     | neomaril_codex._base:__init__:83 - Successfully connected to Neomaril


NeomarilTrainingClient(environment="Staging", version="1.0")
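
The `token='123'` literal above is a placeholder. One way to keep a real token out of the notebook is to read it from an environment variable. The sketch below assumes a variable named `NEOMARIL_TOKEN` and a helper called `load_token`; both names are hypothetical and not part of `neomaril_codex`.

```python
import os


def load_token(var_name="NEOMARIL_TOKEN"):
    """Return the API token stored in the given environment variable.

    The variable name is a hypothetical choice, not something Neomaril defines.
    """
    token = os.environ.get(var_name, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {var_name} environment variable before starting the client"
        )
    return token
```

The result can then be passed to the client in place of the literal, for example `NeomarilTrainingClient(load_token())`.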

## NeomarilTrainingExperiment

This is where you create a training experiment to find the best model.

#### Custom training

With Custom training, you write the training function yourself.

In [3]:
# Creating a new training experiment
training = client.create_training_experiment('Teste notebook Training custom', # Experiment name, this is how you find your model in MLflow
                                            'Classification', # Model type. Can be Classification, Regression or Unsupervised
                                            'Custom', # Training type. Can be Custom or AutoML
                                            )

2023-01-31 12:56:40.169 | INFO     | neomaril_codex.training:create_training_experiment:467 - New Training inserted with hash Tfb3274827a24dc39d5b78603f348aee8d3dbfe791574dc4a6681a7e2a6622fa.


In [4]:
training

NeomarilTrainingExperiment(name="Teste notebook Training custom", 
                                                        group="datarisk", 
                                                        environment="Staging"
                                                        training_id="Tfb3274827a24dc39d5b78603f348aee8d3dbfe791574dc4a6681a7e2a6622fa",
                                                        training_type="Custom",
                                                        model_type=Classification
                                                        )

In [5]:
# With the experiment class we can create multiple model runs
PATH = './samples/train/'

run = training.run_training('First test', # Run name
                            PATH+'dados.csv', # Path to the file with training data
                            source_file=PATH+'app.py', # Path of the source file
                            requirements_file=PATH+'requirements.txt', # Path of the requirements file
#                           env=PATH+'.env', # File for env variables (this will be encrypted on the server)
#                           extra_files=[PATH+'utils.py'], # List of extra file paths to upload alongside (they will all be in the same folder)
                            training_reference='train_model',
                            python_version='3.9', # Can be 3.7 to 3.10
                            wait_complete=True
)

2023-01-31 12:56:40.284 | INFO     | neomaril_codex.training:__upload_training:289 - {"Message":"Training files have been uploaded. Execution id is '6'"}
2023-01-31 12:56:40.406 | INFO     | neomaril_codex.training:__execute_training:308 - Model training starting - Hash: Tfb3274827a24dc39d5b78603f348aee8d3dbfe791574dc4a6681a7e2a6622fa


Wating the training run......

2023-01-31 12:59:11.430 | INFO     | neomaril_codex._base:get_status:231 - You can check the run info in https://mlflow.staging.datarisk.net/ 
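
With `wait_complete=True` the cell blocks until the run finishes. If you would rather submit the run and check back later, a polling loop along these lines could be used. This is only a sketch: it relies on `get_status()` returning a dict with a `'Status'` key, as the `run.get_status()` output in this notebook shows. The helper name `wait_for_run`, the `'Failed'` terminal state, and the timing defaults are assumptions.

```python
import time


def wait_for_run(get_status, terminal=("Succeeded", "Failed"), interval=5.0, timeout=600.0):
    """Call get_status() until it reports a terminal state or the timeout expires.

    get_status is any zero-argument callable returning a dict with a 'Status'
    key, for example the bound method run.get_status.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status.get("Status") in terminal:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Run still in state {status.get('Status')!r} after {timeout} s"
            )
        time.sleep(interval)
```

A call would look like `final = wait_for_run(run.get_status)`. Taking a callable rather than the run object keeps the helper independent of the client library.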


In [6]:
run.execution_data

{'ModelHash': 'Tfb3274827a24dc39d5b78603f348aee8d3dbfe791574dc4a6681a7e2a6622fa',
 'ExperimentName': 'Teste notebook Training custom',
 'GroupName': 'datarisk',
 'ModelType': 'Classification',
 'TrainingType': 'Custom',
 'ExecutionId': 6,
 'ExecutionState': 'Running',
 'InputPayload': '{\n    "trainingReference": "train_model",\n    "pythonVersion": "Python39",\n    "executionType": "neomaril-custom-training",\n    "experimentName": "Teste notebook Training custom",\n    "runName": "First test",\n    "basePath": "/app/store/datarisk/Tfb3274827a24dc39d5b78603f348aee8d3dbfe791574dc4a6681a7e2a6622fa/6",\n    "neomarilExecutionId": "6"\n}',
 'OutputPayload': '{}',
 'RunAt': '2023-01-31T15:56:40.2797370Z',
 'Status': 'Succeeded'}
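
Note that `InputPayload` and `OutputPayload` are JSON-encoded strings, not nested dictionaries. A small sketch of decoding them with the standard `json` module follows; the `execution_data` literal is a trimmed copy of the output above, used so the snippet runs without a live connection.

```python
import json

# Trimmed copy of the run.execution_data output shown above.
execution_data = {
    "ExecutionId": 6,
    "InputPayload": '{\n    "trainingReference": "train_model",\n    "pythonVersion": "Python39",\n    "runName": "First test"\n}',
    "OutputPayload": "{}",
    "Status": "Succeeded",
}

# Each payload is a JSON document stored as a string, so decode it first.
input_payload = json.loads(execution_data["InputPayload"])
output_payload = json.loads(execution_data["OutputPayload"])

print(input_payload["trainingReference"])  # train_model
print(input_payload["pythonVersion"])      # Python39
```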

In [7]:
run.get_status()

2023-01-31 12:59:11.826 | INFO     | neomaril_codex._base:get_status:231 - You can check the run info in https://mlflow.staging.datarisk.net/ 


{'trainingExecutionId': '6',
 'Status': 'Succeeded',
 'Message': '{\n    "artifacts": "wasbs://mlflow-dev@datariskmlops.blob.core.windows.net/artifacts/1/1d4f69f9ea3f4d27bfd52a19a329f220/artifacts",\n    "mlflow_run_id": "1d4f69f9ea3f4d27bfd52a19a329f220"\n}'}
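
The `Message` field is likewise a JSON string, and it carries the MLflow run id you would look up at the MLflow URL from the log line. A sketch of extracting it is shown below. The helper name `mlflow_run_id` is hypothetical, and the fallback to `None` for a non-JSON message is an assumption about how failed runs might report errors.

```python
import json


def mlflow_run_id(status):
    """Extract the MLflow run id from a get_status() result, or None if absent."""
    raw = status.get("Message") or "{}"
    try:
        message = json.loads(raw)
    except json.JSONDecodeError:
        return None  # Message was not JSON (for example, a plain error string)
    if not isinstance(message, dict):
        return None
    return message.get("mlflow_run_id")


# Trimmed copy of the run.get_status() output shown above.
status = {
    "trainingExecutionId": "6",
    "Status": "Succeeded",
    "Message": '{\n    "mlflow_run_id": "1d4f69f9ea3f4d27bfd52a19a329f220"\n}',
}
print(mlflow_run_id(status))  # 1d4f69f9ea3f4d27bfd52a19a329f220
```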

In [8]:
# When the run is finished you can download the model file
run.download_result()

# or promote it to a deployed model

PATH = './samples/syncModel/'

model = run.promote_model('Teste notebook promoted custom', # model_name
                            'score', # name of the scoring function
                            PATH+'app.py', # Path of the source file
                            PATH+'schema.json', # Path of the schema file, but it could be a dict
#                           env=PATH+'.env', # File for env variables (this will be encrypted on the server)
#                           extra_files=[PATH+'utils.py'], # List of extra file paths to upload alongside (they will all be in the same folder)
                            operation="Sync" # Can be Sync or Async
)

2023-01-31 12:59:11.868 | INFO     | neomaril_codex._base:download_result:261 - Output saved in ./output_6.zip
2023-01-31 12:59:12.159 | INFO     | neomaril_codex.training:__upload_model:111 - Model 'Teste notebook promoted custom' promoted from Tfb3274827a24dc39d5b78603f348aee8d3dbfe791574dc4a6681a7e2a6622fa - Hash: "M069254cf15441fe9af565d4f735840b0d1345424927424bb47611bb24965992"
2023-01-31 12:59:12.196 | INFO     | neomaril_codex.training:__host_model:133 - Model host in process - Hash: M069254cf15441fe9af565d4f735840b0d1345424927424bb47611bb24965992
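
As the comment in the cell notes, the schema can be given as a file path or as a dict. Judging from the model representation in the next cell, it appears to be a flat mapping from feature name to an example value. A sketch of building it as a dict and, if a file is preferred, writing it out as JSON is shown below; the three fields are copied from that representation, and the temporary directory is only for illustration.

```python
import json
import tempfile
from pathlib import Path

# A few of the fields from the promoted model's schema; a real schema
# would list every input feature the scoring function expects.
schema = {
    "mean radius": 17.99,
    "mean texture": 10.38,
    "mean perimeter": 122.8,
}

# Option 1: pass `schema` directly as the schema argument.
# Option 2: write it to a JSON file and pass the path instead.
schema_path = Path(tempfile.mkdtemp()) / "schema.json"
schema_path.write_text(json.dumps(schema, indent=2), encoding="utf-8")

# Reading the file back gives the same mapping.
loaded = json.loads(schema_path.read_text(encoding="utf-8"))
print(loaded == schema)  # True
```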


In [9]:
model

NeomarilModel(name="Teste notebook promoted custom", group="datarisk", 
                                status="Building", environment="Staging"
                                model_id="M069254cf15441fe9af565d4f735840b0d1345424927424bb47611bb24965992",
                                operation="Sync",
                                schema={
  "mean radius": 17.99,
  "mean texture": 10.38,
  "mean perimeter": 122.8,
  "mean area": 1001.0,
  "mean smoothness": 0.1184,
  "mean compactness": 0.2776,
  "mean concavity": 0.3001,
  "mean concave points": 0.1471,
  "mean symmetry": 0.2419,
  "mean fractal dimension": 0.07871,
  "radius error": 1.095,
  "texture error": 0.9053,
  "perimeter error": 8.589,
  "area error": 153.4,
  "smoothness error": 0.006399,
  "compactness error": 0.04904,
  "concavity error": 0.05373,
  "concave points error": 0.01587,
  "symmetry error": 0.03003,
  "fractal dimension error": 0.006193,
  "worst radius": 25.38,
  "worst texture": 17.33,
  "worst perimeter": 

#### AutoML

With AutoML, you only need to upload the data and a configuration file.

In [10]:
# Creating a new training experiment
training = client.create_training_experiment('Teste notebook Training AutoML', # Experiment name
                                            'Classification', # Model type. Can be Classification, Regression or Unsupervised
                                            'AutoML', # Training type. Can be Custom or AutoML
                                            )

PATH = './samples/autoML/'

run = training.run_training('First test', # Run name
                            PATH+'dados.csv', # Path to the file with training data
                            conf_dict=PATH+'conf.json', # Path of the configuration file
                            wait_complete=True
)

2023-01-31 12:59:12.323 | INFO     | neomaril_codex.training:create_training_experiment:467 - New Training inserted with hash Tc53f0a17ba04f9ca3a86b8d6af5fb45e15097d263304571bfb96b6e5b009ae5.
2023-01-31 12:59:12.350 | INFO     | neomaril_codex.training:__upload_training:289 - {"Message":"Training files have been uploaded. Execution id is '7'"}
2023-01-31 12:59:12.474 | INFO     | neomaril_codex.training:__execute_training:308 - Model training starting - Hash: Tc53f0a17ba04f9ca3a86b8d6af5fb45e15097d263304571bfb96b6e5b009ae5


Wating the training run........

2023-01-31 13:02:44.273 | INFO     | neomaril_codex._base:get_status:231 - You can check the run info in https://mlflow.staging.datarisk.net/ 


In [11]:
run

NeomarilTrainingExecution(exec_id="7", status="Succeeded")

In [12]:
run.get_status()

2023-01-31 13:02:44.667 | INFO     | neomaril_codex._base:get_status:231 - You can check the run info in https://mlflow.staging.datarisk.net/ 


{'trainingExecutionId': '7',
 'Status': 'Succeeded',
 'Message': '{\n    "artifacts": "wasbs://mlflow-dev@datariskmlops.blob.core.windows.net/artifacts/2/9c96f54bb26548c2964a076080a79237/artifacts",\n    "mlflow_run_id": "9c96f54bb26548c2964a076080a79237"\n}'}

In [13]:
run

NeomarilTrainingExecution(exec_id="7", status="Succeeded")

In [14]:
# Promoting an AutoML model is a lot easier

PATH = './samples/syncModel/'

model = run.promote_model('Teste notebook promoted autoML', # model_name
                            operation="Async" # Can be Sync or Async
)

2023-01-31 13:02:45.036 | INFO     | neomaril_codex.training:__upload_model:111 - Model 'Teste notebook promoted autoML' promoted from Tc53f0a17ba04f9ca3a86b8d6af5fb45e15097d263304571bfb96b6e5b009ae5 - Hash: "M6d5c0066e984fc1a953423c0a8f3145d9c33fec32ee406e9d7e25948e885384"
2023-01-31 13:02:45.069 | INFO     | neomaril_codex.training:__host_model:133 - Model host in process - Hash: M6d5c0066e984fc1a953423c0a8f3145d9c33fec32ee406e9d7e25948e885384


In [15]:
model

NeomarilModel(name="Teste notebook promoted autoML", group="datarisk", 
                                status="Building", environment="Staging"
                                model_id="M6d5c0066e984fc1a953423c0a8f3145d9c33fec32ee406e9d7e25948e885384",
                                operation="Async",
                                schema={}
                                )