## Scenario 3: Multiple data scientists working on multiple ML models

MLflow setup:
* Tracking server: yes, a remote server running on a GCP Compute Engine instance.
* Backend store: PostgreSQL database.
* Artifact store: Google Cloud Storage bucket.

The experiments can be explored by opening the MLflow UI on the remote tracking server.
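For reference, a tracking server with this setup is typically launched on the VM with the `mlflow server` command. The sketch below only assembles that command line in Python; it is not necessarily how this particular server was started. The database user, password, host, and database name are hypothetical placeholders, and the function name is made up for illustration.

```python
def build_mlflow_server_command(db_user, db_password, db_host, db_name, bucket, port=5001):
    """Assemble the argv list for `mlflow server` with a PostgreSQL backend and a GCS artifact root."""
    # Hypothetical connection details; 5432 is the default PostgreSQL port.
    backend_store_uri = f"postgresql://{db_user}:{db_password}@{db_host}:5432/{db_name}"
    return [
        "mlflow", "server",
        "--host", "0.0.0.0",                          # listen on all interfaces so remote clients can connect
        "--port", str(port),
        "--backend-store-uri", backend_store_uri,     # experiments, runs, and registry metadata
        "--default-artifact-root", f"gs://{bucket}",  # models and other artifacts
    ]


command = build_mlflow_server_command(
    db_user="mlflow_user",        # placeholder
    db_password="change-me",      # placeholder
    db_host="10.0.0.5",           # placeholder
    db_name="mlflow_db",          # placeholder
    bucket="first-mlflow-bucket",
)
print(" ".join(command))
```

Keeping the backend store and the artifact root separate is what lets several data scientists share one server: metadata goes to the database, while large files go to the bucket.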

In [3]:
# Uncomment to install the Cloud Storage client that MLflow needs for gs:// artifacts.
# !pip install google-cloud-storage

In [4]:
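import mlflow
import os

# Service-account key that gives this client access to the Cloud Storage artifact bucket.
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'mlops-zoomcamp-387822-807c7e381ae3.json'

# Address of the Compute Engine instance that runs the tracking server.
TRACKING_SERVER_HOST = "34.23.208.150"
mlflow.set_tracking_uri(f"http://{TRACKING_SERVER_HOST}:5001")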

In [6]:
print(f"tracking URI: '{mlflow.get_tracking_uri()}'")

tracking URI: 'http://34.23.208.150:5001'
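When several people share the notebook, hardcoding the host can be avoided by reading the URI from the `MLFLOW_TRACKING_URI` environment variable, which MLflow itself also honors. The following is a minimal sketch; `resolve_tracking_uri` is a hypothetical helper, not part of MLflow.

```python
import os


def resolve_tracking_uri(default_host, port=5001, env=os.environ):
    """Return MLFLOW_TRACKING_URI if it is set, otherwise build the URI from host and port."""
    return env.get("MLFLOW_TRACKING_URI", f"http://{default_host}:{port}")


# Falls back to the host used in this notebook when the variable is not set.
fallback_uri = resolve_tracking_uri("34.23.208.150", env={})
print(fallback_uri)
```

The result can then be passed to `mlflow.set_tracking_uri(...)` in place of the hardcoded string.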


In [7]:
mlflow.search_experiments()

[<Experiment: artifact_location='gs://first-mlflow-bucket/0', creation_time=1685411846046, experiment_id='0', last_update_time=1685411846046, lifecycle_stage='active', name='Default', tags={}>]

In [8]:
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score

# Creates the experiment on the remote server if it does not exist yet.
mlflow.set_experiment("my-experiment-1")

with mlflow.start_run():
    X, y = load_iris(return_X_y=True)

    # Log the hyperparameters before training.
    params = {"C": 0.1, "random_state": 42}
    mlflow.log_params(params)

    # Note: accuracy is computed on the training data itself.
    lr = LogisticRegression(**params).fit(X, y)
    y_pred = lr.predict(X)
    mlflow.log_metric("accuracy", accuracy_score(y, y_pred))

    # The model is uploaded to the artifact store under the "models" path.
    mlflow.sklearn.log_model(lr, artifact_path="models")
    print(f"default artifacts URI: '{mlflow.get_artifact_uri()}'")

2023/05/29 21:51:11 INFO mlflow.tracking.fluent: Experiment with name 'my-experiment-1' does not exist. Creating a new experiment.


default artifacts URI: 'gs://first-mlflow-bucket/1/bde4023e6897489a99a7b9c9529f4a6d/artifacts'
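The artifact URI printed by the run encodes where the files live in the bucket. As a rough illustration, the sketch below splits it into bucket, experiment ID, and run ID. It assumes the artifact root is the bucket root, as in this setup; with a prefixed artifact root the path segments would shift. `parse_artifact_uri` is a hypothetical helper.

```python
from urllib.parse import urlparse


def parse_artifact_uri(artifact_uri):
    """Split gs://<bucket>/<experiment_id>/<run_id>/<subpath> into its parts."""
    parsed = urlparse(artifact_uri)
    experiment_id, run_id, *rest = parsed.path.strip("/").split("/")
    return {
        "scheme": parsed.scheme,
        "bucket": parsed.netloc,
        "experiment_id": experiment_id,
        "run_id": run_id,
        "subpath": "/".join(rest),
    }


parts = parse_artifact_uri(
    "gs://first-mlflow-bucket/1/bde4023e6897489a99a7b9c9529f4a6d/artifacts"
)
print(parts)
```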


In [9]:
mlflow.search_experiments()

[<Experiment: artifact_location='gs://first-mlflow-bucket/1', creation_time=1685415071637, experiment_id='1', last_update_time=1685415071637, lifecycle_stage='active', name='my-experiment-1', tags={}>,
 <Experiment: artifact_location='gs://first-mlflow-bucket/0', creation_time=1685411846046, experiment_id='0', last_update_time=1685411846046, lifecycle_stage='active', name='Default', tags={}>]
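The raw `Experiment` representation is hard to scan, and `creation_time` is a Unix timestamp in milliseconds. A small helper can turn the list into compact rows; this is only a sketch, and `summarize_experiments` is a hypothetical name. The sample object below stands in for one entry of `mlflow.search_experiments()` so that the snippet runs without a server.

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def summarize_experiments(experiments):
    """Return (experiment_id, name, creation time in UTC, artifact_location) tuples."""
    rows = []
    for exp in experiments:
        # creation_time is in milliseconds since the epoch.
        created = datetime.fromtimestamp(exp.creation_time / 1000, tz=timezone.utc)
        rows.append(
            (exp.experiment_id, exp.name, created.strftime("%Y-%m-%d %H:%M:%S UTC"), exp.artifact_location)
        )
    return rows


# Stand-in for a real Experiment object, using the values from the output above.
sample = [
    SimpleNamespace(
        experiment_id="1",
        name="my-experiment-1",
        creation_time=1685415071637,
        artifact_location="gs://first-mlflow-bucket/1",
    )
]
rows = summarize_experiments(sample)
for row in rows:
    print(row)
```

With a live connection, `summarize_experiments(mlflow.search_experiments())` should work the same way, since it only reads attributes that appear in the output.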

### Interacting with the model registry

In [10]:
from mlflow.tracking import MlflowClient


client = MlflowClient(f"http://{TRACKING_SERVER_HOST}:5001")

In [12]:
client.search_registered_models()

[]

In [13]:
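# search_runs expects a list of experiment IDs; take the first run returned for experiment 1.
run_id = client.search_runs(experiment_ids=['1'])[0].info.run_id
model_uri = f"runs:/{run_id}/models"
print(model_uri)
mlflow.register_model(
    model_uri=model_uri,
    name='iris-classifier'
)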

runs:/bde4023e6897489a99a7b9c9529f4a6d/models


Successfully registered model 'iris-classifier'.
2023/05/29 21:52:05 INFO mlflow.tracking._model_registry.client: Waiting up to 300 seconds for model version to finish creation. Model name: iris-classifier, version 1
Created version '1' of model 'iris-classifier'.


<ModelVersion: aliases=[], creation_timestamp=1685415124825, current_stage='None', description='', last_updated_timestamp=1685415124825, name='iris-classifier', run_id='bde4023e6897489a99a7b9c9529f4a6d', run_link='', source='gs://first-mlflow-bucket/1/bde4023e6897489a99a7b9c9529f4a6d/artifacts/models', status='READY', status_message='', tags={}, user_id='', version='1'>

In [14]:
client.search_registered_models()

[<RegisteredModel: aliases={}, creation_timestamp=1685415124499, description='', last_updated_timestamp=1685415124825, latest_versions=[<ModelVersion: aliases=[], creation_timestamp=1685415124825, current_stage='None', description='', last_updated_timestamp=1685415124825, name='iris-classifier', run_id='bde4023e6897489a99a7b9c9529f4a6d', run_link='', source='gs://first-mlflow-bucket/1/bde4023e6897489a99a7b9c9529f4a6d/artifacts/models', status='READY', status_message='', tags={}, user_id='', version='1'>], name='iris-classifier', tags={}>]

In [15]:
# Promote version 1 of the registered model to the Staging stage.
client.transition_model_version_stage(
    name='iris-classifier',
    version=1,
    stage="Staging"
)

<ModelVersion: aliases=[], creation_timestamp=1685415124825, current_stage='Staging', description='', last_updated_timestamp=1685415306816, name='iris-classifier', run_id='bde4023e6897489a99a7b9c9529f4a6d', run_link='', source='gs://first-mlflow-bucket/1/bde4023e6897489a99a7b9c9529f4a6d/artifacts/models', status='READY', status_message='', tags={}, user_id='', version='1'>
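Once a version has a stage, other team members can refer to it through a `models:/` URI instead of a run ID. The sketch below builds such a URI by version or by stage; `registry_model_uri` is a hypothetical helper, and loading the model is left as a comment because it needs access to the tracking server and the bucket.

```python
def registry_model_uri(name, version=None, stage=None):
    """Build a models:/ URI that points at a specific version or at a stage."""
    if (version is None) == (stage is None):
        raise ValueError("pass exactly one of version or stage")
    target = version if version is not None else stage
    return f"models:/{name}/{target}"


staging_uri = registry_model_uri("iris-classifier", stage="Staging")
print(staging_uri)

# With the tracking URI configured as in this notebook, the staged model
# could then be loaded with: mlflow.pyfunc.load_model(staging_uri)
```

Referring to the stage rather than a fixed version means consumers pick up whichever version is currently in Staging, without changing their code.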