# Multiple data scientists working on multiple ML models
MLflow setup:
- Tracking server: yes, remote server (EC2)
- Backend store: postgresql database
- Artifacts store: s3 bucket

The experiments can be explored through the remote tracking server.\
This example hosts the tracking server on AWS, so you need an AWS account to run it.
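
The setup listed above could be started on the EC2 instance with an `mlflow server` command along the following lines. This is only a sketch, not the command actually used for this example: the database user, password, endpoint, database name, and bucket name are all placeholder assumptions, and `build_server_command` is a hypothetical helper that merely assembles the command line.

```python
import shlex
from urllib.parse import quote_plus


def build_server_command(db_user, db_password, db_endpoint, db_name, bucket, port=5000):
    """Assemble an `mlflow server` command for a PostgreSQL backend store
    and an S3 artifact store. All argument values are supplied by the caller."""
    # URL-encode the password so characters such as '@' do not break the URI.
    backend_uri = (
        f"postgresql://{db_user}:{quote_plus(db_password)}"
        f"@{db_endpoint}:5432/{db_name}"
    )
    return [
        "mlflow", "server",
        "--host", "0.0.0.0",              # listen on all interfaces of the instance
        "--port", str(port),
        "--backend-store-uri", backend_uri,
        "--default-artifact-root", f"s3://{bucket}",
    ]


# Placeholder values (assumptions); replace them with your own resources.
command = build_server_command(
    db_user="mlflow_user",
    db_password="change-me",
    db_endpoint="my-db.example.com",
    db_name="mlflow_db",
    bucket="my-mlflow-artifacts",
)
print(shlex.join(command))
```

The printed command is meant to be run on the EC2 instance itself; the security group of the instance must allow inbound traffic on the chosen port for the notebook to reach it.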

In [2]:
import mlflow
import os

os.environ["AWS_PROFILE"] = ""  # fill in the name of your AWS profile


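# Tracking server (public DNS of the EC2 instance)
TRACKING_SERVER_HOST = "ec2-54-196-146-14.compute-1.amazonaws.com"
# The f prefix is required so that the host name is actually interpolated
mlflow.set_tracking_uri(f"http://{TRACKING_SERVER_HOST}:5000")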

In [3]:
print(f"Tracking URI: '{mlflow.get_tracking_uri()}'")

Tracking URI: 'http://127.0.0.1:5000'


In [5]:
mlflow.search_experiments()
# a fresh tracking server contains only the 'Default' experiment

[<Experiment: artifact_location='/Users/tdafonseca/Desktop/Github/learning/deploy_machine_learning_model/02-experiment-tracking/src_scenario/artifacts_local/0', creation_time=1686595336355, experiment_id='0', last_update_time=1686595336355, lifecycle_stage='active', name='Default', tags={}>]

### Logistic regression model

In [6]:
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score

mlflow.set_experiment("my-experiment-1")

with mlflow.start_run():
    X, y = load_iris(return_X_y=True)

    params = {"C": 0.1, "random_state": 42}
    mlflow.log_params(params)

    lr = LogisticRegression(**params).fit(X,y)
    y_pred = lr.predict(X)  # predict on the training data, for illustration only
    mlflow.log_metric('accuracy', accuracy_score(y, y_pred))

    mlflow.sklearn.log_model(lr, artifact_path='models')
    print(f"Default artifacts URI: '{mlflow.get_artifact_uri()}'")  # where the artifacts are being stored


2023/06/12 20:45:47 INFO mlflow.tracking.fluent: Experiment with name 'my-experiment-1' does not exist. Creating a new experiment.


Default artifacts URI: '/Users/tdafonseca/Desktop/Github/learning/deploy_machine_learning_model/02-experiment-tracking/src_scenario/artifacts_local/1/7705ec160996449f9bee9d131d432703/artifacts'




At this point, the run's metadata (parameters, metrics, and the location of the artifacts) is stored in the backend store.

In [19]:
mlflow.search_experiments()

[<Experiment: artifact_location='/Users/tdafonseca/Desktop/Github/learning/deploy_machine_learning_model/02-experiment-tracking/src_scenario/artifacts_local/1', creation_time=1686595547689, experiment_id='1', last_update_time=1686595547689, lifecycle_stage='active', name='my-experiment-1', tags={}>,
 <Experiment: artifact_location='/Users/tdafonseca/Desktop/Github/learning/deploy_machine_learning_model/02-experiment-tracking/src_scenario/artifacts_local/0', creation_time=1686595336355, experiment_id='0', last_update_time=1686595336355, lifecycle_stage='active', name='Default', tags={}>]

## Interacting with the model registry


In [8]:
from mlflow.tracking import MlflowClient

client = MlflowClient('http://127.0.0.1:5000')  # replace with the URI of your tracking server if it differs

In [9]:
client.search_registered_models()

[]

In [17]:
client.get_experiment_by_name("my-experiment-1").experiment_id

'1'
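
Rather than copying a run ID by hand, the best run of an experiment can be looked up through the client. The following is a minimal sketch, assuming the `accuracy` metric logged earlier; `best_run_id` is a hypothetical helper name, and it accepts any object that exposes `get_experiment_by_name` and `search_runs` the way `MlflowClient` does.

```python
def best_run_id(client, experiment_name, metric="accuracy"):
    """Return the ID of the run with the highest value of `metric`,
    or None if the experiment does not exist or has no runs."""
    experiment = client.get_experiment_by_name(experiment_name)
    if experiment is None:
        return None

    # Sort by the metric in descending order and keep only the top run.
    runs = client.search_runs(
        experiment_ids=[experiment.experiment_id],
        order_by=[f"metrics.{metric} DESC"],
        max_results=1,
    )
    return runs[0].info.run_id if runs else None
```

With the client created above, `best_run_id(client, "my-experiment-1")` would return a run ID that can be used in a `runs:/<run_id>/models` URI.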

In [22]:
run_id = '7705ec160996449f9bee9d131d432703'  # ID of the run logged above (it appears in the artifacts URI)

mlflow.register_model(
    model_uri=f'runs:/{run_id}/models',
    name='iris_classifier'
)



Registered model 'iris_classifier' already exists. Creating a new version of this model...
2023/06/12 21:22:50 INFO mlflow.tracking._model_registry.client: Waiting up to 300 seconds for model version to finish creation. Model name: iris_classifier, version 2
Created version '2' of model 'iris_classifier'.


<ModelVersion: aliases=[], creation_timestamp=1686597770222, current_stage='None', description='', last_updated_timestamp=1686597770222, name='iris_classifier', run_id='7705ec160996449f9bee9d131d432703', run_link='', source='/Users/tdafonseca/Desktop/Github/learning/deploy_machine_learning_model/02-experiment-tracking/src_scenario/artifacts_local/1/7705ec160996449f9bee9d131d432703/artifacts/models', status='READY', status_message='', tags={}, user_id='', version='2'>
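
Once a version is registered, it can be loaded back by name and version through a `models:/` URI. The sketch below assumes version 2 of `iris_classifier`, as created above, and a reachable tracking server; `registered_model_uri` and `load_registered_model` are hypothetical helper names, not part of MLflow.

```python
def registered_model_uri(name, version):
    """Build the URI that addresses a specific version of a registered model."""
    return f"models:/{name}/{version}"


def load_registered_model(name, version):
    """Load a registered model version as a generic pyfunc model."""
    # Imported here so that the URI helper above can be used without MLflow installed.
    import mlflow.pyfunc

    return mlflow.pyfunc.load_model(registered_model_uri(name, version))


print(registered_model_uri("iris_classifier", 2))
```

Calling `load_registered_model("iris_classifier", 2)` should return a model whose `predict` method accepts the same feature matrix `X` that was used for training.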