## Scenario 3: Multiple data scientists working on multiple ML models

`MLflow setup:`

- Tracking server: yes, remote server (EC2).
- Backend store: PostgreSQL database.
- Artifact store: S3 bucket.

The experiments can be explored by accessing the remote server.

This example uses AWS to host the remote tracking server.

Follow the steps described in the file `mlflow_on_aws.md` to create a new AWS account and launch the tracking server.

In [1]:
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score

import os
import mlflow

In [3]:
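# Option 1: use a named profile from the local AWS credentials file.
# os.environ["AWS_PROFILE"] = "mlflow-user"

# Option 2: set the access keys explicitly (masked here; never commit real keys).
os.environ["AWS_ACCESS_KEY_ID"] = "****************"
os.environ["AWS_SECRET_ACCESS_KEY"] = "********************"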
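
Uploading artifacts to S3 fails late, at logging time, if the credentials are missing. The sketch below is one possible way to check the environment up front. The helper name `missing_aws_credentials` is hypothetical, and the check only covers the two mechanisms used in this notebook (a named profile or explicit keys), not every credential source that AWS supports.

```python
import os

# Variables set explicitly in this notebook.
REQUIRED_AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_credentials(env=None):
    """Return the names of the required AWS variables that are unset or empty.

    Hypothetical helper: a named profile (AWS_PROFILE) is treated as a
    valid alternative to explicit keys.
    """
    env = os.environ if env is None else env
    if env.get("AWS_PROFILE"):
        return []
    return [name for name in REQUIRED_AWS_VARS if not env.get(name)]


missing = missing_aws_credentials()
if missing:
    print(f"missing AWS settings: {missing}")
```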

In [6]:
TRACKING_SERVER_HOST = "13.48.25.229"  # address of the EC2 instance running the tracking server
mlflow.set_tracking_uri(f"http://{TRACKING_SERVER_HOST}:5000")

In [7]:
print(f"tracking URI: '{mlflow.get_tracking_uri()}'")

tracking URI: 'http://13.48.25.229:5000'


In [8]:
from mlflow.entities import ViewType

mlflow.search_experiments(view_type=ViewType.ALL)

[<Experiment: artifact_location='s3://mlflow-artifacts-0523/2', creation_time=1685439534719, experiment_id='2', last_update_time=1685439534719, lifecycle_stage='active', name='my-experiment-2', tags={}>,
 <Experiment: artifact_location='s3://mlflow-artifacts-0523/1', creation_time=1685433626461, experiment_id='1', last_update_time=1685433626461, lifecycle_stage='active', name='my-experiment-1', tags={}>,
 <Experiment: artifact_location='s3://mlflow-artifacts-0523/0', creation_time=1685431338884, experiment_id='0', last_update_time=1685431338884, lifecycle_stage='active', name='Default', tags={}>]
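
The raw list above is hard to scan. As a small sketch, the experiments can be reduced to a mapping from name to artifact location. The helper name `artifact_locations` is hypothetical; it relies only on the `name` and `artifact_location` attributes visible in the output above.

```python
def artifact_locations(experiments):
    """Map each experiment name to its artifact location.

    Hypothetical helper: accepts any iterable of objects that expose
    `name` and `artifact_location`, such as MLflow Experiment entities.
    """
    return {exp.name: exp.artifact_location for exp in experiments}
```

With the tracking URI set as above, it could be called as `artifact_locations(mlflow.search_experiments(view_type=ViewType.ALL))`.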

In [9]:
mlflow.set_experiment("my-experiment-2")

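with mlflow.start_run():
    X, y = load_iris(return_X_y=True)

    # Log the hyperparameters before training.
    params = {"C": 0.1, "random_state": 42}
    mlflow.log_params(params)

    # Accuracy is computed on the training data, so it is only a sanity check.
    lr = LogisticRegression(**params).fit(X, y)
    y_pred = lr.predict(X)
    mlflow.log_metric("accuracy", accuracy_score(y, y_pred))

    # The model is uploaded to the S3 artifact store under "models".
    mlflow.sklearn.log_model(lr, artifact_path="models")
    print(f"default artifacts URI: '{mlflow.get_artifact_uri()}'")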



default artifacts URI: 's3://mlflow-artifacts-0523/2/0da074c2505f4593b2787874ca9766d7/artifacts'
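
The artifact URI printed above appears to follow the layout `s3://<bucket>/<experiment_id>/<run_id>/artifacts`. The sketch below splits such a URI into its parts using only the standard library. The helper name `split_artifact_uri` is hypothetical, and the layout is an assumption read off this notebook's output, not a guarantee for every MLflow configuration.

```python
from urllib.parse import urlparse


def split_artifact_uri(uri):
    """Split an artifact URI into bucket, experiment ID, and run ID.

    Hypothetical helper that assumes the layout seen in this notebook:
    s3://<bucket>/<experiment_id>/<run_id>/artifacts
    """
    parsed = urlparse(uri)
    parts = parsed.path.strip("/").split("/")
    if len(parts) < 2:
        raise ValueError(f"unexpected artifact URI layout: {uri!r}")
    return {
        "bucket": parsed.netloc,
        "experiment_id": parts[0],
        "run_id": parts[1],
    }
```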


## Interacting with the Model Registry

In [29]:
from mlflow.tracking import MlflowClient


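client = MlflowClient(tracking_uri=f"http://{TRACKING_SERVER_HOST}:5000")

# Take the first run returned for experiment ID "1" (my-experiment-1).
# Note that this is not the run logged above, which belongs to my-experiment-2.
run_id = client.search_runs(experiment_ids=["1"])[0].info.run_id

# Register the model stored under the run's "models" artifact path.
mlflow.register_model(
    model_uri=f"runs:/{run_id}/models",
    name="iris-classifier",
)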

Successfully registered model 'iris-classifier'.
2023/05/30 14:10:32 INFO mlflow.tracking._model_registry.client: Waiting up to 300 seconds for model version to finish creation. Model name: iris-classifier, version 1
Created version '1' of model 'iris-classifier'.


<ModelVersion: aliases=[], creation_timestamp=1685436032921, current_stage='None', description='', last_updated_timestamp=1685436032921, name='iris-classifier', run_id='12dfddda1cd34563b21042d850a53bfc', run_link='', source='s3://mlflow-artifacts-0523/1/12dfddda1cd34563b21042d850a53bfc/artifacts/models', status='READY', status_message='', tags={}, user_id='', version='1'>

'12dfddda1cd34563b21042d850a53bfc'
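
Once a version is registered, any data scientist with access to the tracking server can load it by name and version instead of by run ID. The following is a minimal sketch using MLflow's `models:/<name>/<version>` URI scheme. The helper names are hypothetical, and loading assumes the tracking URI and AWS credentials are configured as earlier in this notebook.

```python
def registered_model_uri(name, version):
    """Build a Model Registry URI of the form models:/<name>/<version>."""
    return f"models:/{name}/{version}"


def load_registered_model(name, version):
    """Load a registered model as a generic pyfunc model.

    Hypothetical helper: requires a reachable tracking server and
    read access to the artifact store.
    """
    import mlflow.pyfunc  # imported lazily so the URI helper works without MLflow

    return mlflow.pyfunc.load_model(registered_model_uri(name, version))
```

For the model registered above, `load_registered_model("iris-classifier", 1)` would return an object whose `predict` method accepts the same feature matrix used for training.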