## Scenario 2: A cross-functional team with one data scientist working on an ML model


MLflow setup:
- tracking server: yes, a local server
- backend store: SQLite database
- artifact store: local filesystem

The experiments can be explored by opening the local tracking server in a browser at http://127.0.0.1:5000.

To run this example, first launch the MLflow server locally by running the following command in your terminal:

`mlflow server --backend-store-uri sqlite:///backend.db`
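Before pointing the notebook at the server, it can help to confirm that something is listening on the expected address. The sketch below is a minimal check using only the standard library. It assumes the host and port used in this example and assumes the server answers on a `/health` endpoint, which may depend on your MLflow version. The helper name `server_is_up` is hypothetical.

```python
import urllib.error
import urllib.request


def server_is_up(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET to <base_url>/health answers with status 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, malformed URL, ...
        return False


# Assumed address: the one passed to mlflow.set_tracking_uri in this example
print(server_is_up("http://127.0.0.1:5000"))
```

If this prints `False`, check that the `mlflow server` command is still running in its terminal before executing the cells below.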

In [2]:
import mlflow

mlflow.set_tracking_uri("http://127.0.0.1:5000")

In [3]:
print(f"tracking URI: '{mlflow.get_tracking_uri()}'")

tracking URI: 'http://127.0.0.1:5000'


In [4]:
mlflow.list_experiments()

[<Experiment: artifact_location='./mlruns/1', experiment_id='1', lifecycle_stage='active', name='nyc-taxi-experiment', tags={}>,
 <Experiment: artifact_location='./mlruns/2', experiment_id='2', lifecycle_stage='active', name='iris-experiment-1', tags={}>]

In [5]:
IRIS_EXP_NAME = "iris-experiment-1"

In [18]:
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, log_loss

mlflow.set_experiment(IRIS_EXP_NAME)


<Experiment: artifact_location='./mlruns/2', experiment_id='2', lifecycle_stage='active', name='iris-experiment-1', tags={}>

In [21]:
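from hyperopt import STATUS_OK  # needed for the return value below

X, y = load_iris(return_X_y=True)


def objective(params):
    """Train one model with `params`, log it to MLflow, and return the loss for hyperopt."""
    with mlflow.start_run():
        mlflow.set_tag('model', 'LogisticRegression')
        mlflow.log_params(params)

        lr = LogisticRegression(**params).fit(X, y)

        # Note: both metrics are computed on the training data itself
        y_pred = lr.predict(X)
        accuracy = accuracy_score(y, y_pred)
        mlflow.log_metric("accuracy", accuracy)

        y_pred_proba = lr.predict_proba(X)
        loss = log_loss(y, y_pred_proba)
        mlflow.log_metric("log loss", loss)

        # Store the fitted model under the run's "models" artifact path
        mlflow.sklearn.log_model(lr, artifact_path="models")
        # print(f"default artifacts URI: '{mlflow.get_artifact_uri()}'")
    return {'loss': loss, 'status': STATUS_OK}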
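
The value returned under `'loss'` is what hyperopt minimizes. As a reminder of what that number means, here is a small pure-Python sketch of multiclass log loss: the average negative log of the probability assigned to the true class. The probabilities are made up for illustration and `manual_log_loss` is a hypothetical helper. scikit-learn's `log_loss` also guards against probabilities of exactly 0 or 1, which this sketch omits.

```python
import math


def manual_log_loss(y_true, proba):
    """Mean negative log-probability assigned to the true class."""
    total = 0.0
    for label, row in zip(y_true, proba):
        total += -math.log(row[label])
    return total / len(y_true)


# Three samples, three classes; each row sums to 1 (illustrative values)
y_true = [0, 1, 2]
proba = [
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.3, 0.6],
]

loss = manual_log_loss(y_true, proba)
print(round(loss, 4))
```

A confident, correct prediction contributes a value near 0, while a confident wrong one contributes a large penalty, so lower is better.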

In [8]:
from hyperopt import fmin, tpe, hp, STATUS_OK, Trials

In [22]:
search_space = {
    'C': hp.loguniform('C', -3, 0),
    'penalty': hp.choice('penalty', ['l1', 'l2']),
    'solver': 'liblinear',
}

best_result = fmin(
    fn=objective,
    space=search_space,
    algo=tpe.suggest,
    max_evals=20,
    trials=Trials(),
)
    

100%|█████████████████████████████████████| 20/20 [01:42<00:00,  5.10s/trial, best loss: 0.305859112389673]
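
In the search space, `hp.loguniform('C', -3, 0)` draws `exp(u)` with `u` uniform on `[-3, 0]`, so `C` ranges from about 0.05 to 1 and is spread evenly on a log scale. The stand-in below uses only the standard library to illustrate that range. `sample_loguniform` is a hypothetical helper, not part of hyperopt.

```python
import math
import random


def sample_loguniform(low: float, high: float, rng: random.Random) -> float:
    """Draw exp(u) with u ~ Uniform(low, high)."""
    return math.exp(rng.uniform(low, high))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
samples = [sample_loguniform(-3, 0, rng) for _ in range(1000)]

print(f"lower bound: {math.exp(-3):.4f}, upper bound: {math.exp(0):.4f}")
print(f"min sample: {min(samples):.4f}, max sample: {max(samples):.4f}")
```

Sampling on a log scale is a common choice for regularization strengths, since values such as 0.05 and 0.5 then get comparable attention.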


In [16]:
mlflow.list_experiments()

[<Experiment: artifact_location='./mlruns/1', experiment_id='1', lifecycle_stage='active', name='nyc-taxi-experiment', tags={}>,
 <Experiment: artifact_location='./mlruns/2', experiment_id='2', lifecycle_stage='active', name='iris-experiment-1', tags={}>]

### Interacting with the model registry

In [26]:
from mlflow.tracking import MlflowClient
from mlflow.entities import ViewType

client = MlflowClient("http://127.0.0.1:5000")

In [25]:
client.list_registered_models()

[<RegisteredModel: creation_timestamp=1654142891212, description='', last_updated_timestamp=1654142891574, latest_versions=[<ModelVersion: creation_timestamp=1654142891574, current_stage='None', description='', last_updated_timestamp=1654142891574, name='iris-classifier', run_id='05a282bca5b8434780fd913c778b707f', run_link='', source='./mlruns/2/05a282bca5b8434780fd913c778b707f/artifacts/models', status='READY', status_message='', tags={}, user_id='', version='1'>], name='iris-classifier', tags={}>]

In [27]:
def print_run_info(runs):
    """
    From https://www.mlflow.org/docs/latest/python_api/mlflow.tracking.html
    """
    for r in runs:
        print("run_id: {}".format(r.info.run_id))
        print("lifecycle_stage: {}".format(r.info.lifecycle_stage))
        print("metrics: {}".format(r.data.metrics))

        # Exclude mlflow system tags
        tags = {k: v for k, v in r.data.tags.items() if not k.startswith("mlflow.")}
        print("tags: {}".format(tags))

In [30]:
exp = client.get_experiment_by_name(IRIS_EXP_NAME)
runs = client.search_runs(
    experiment_ids=[exp.experiment_id],
    run_view_type=ViewType.ACTIVE_ONLY,
    max_results=5,
    order_by=['metrics.accuracy DESC']  # best accuracy first
)
print_run_info(runs)

run_id: 207577093dbc4a7a80d6d818b8116f87
lifecycle_stage: active
metrics: {'accuracy': 0.9666666666666667, 'log loss': 0.305859112389673}
tags: {'model': 'LogisticRegression'}
run_id: bb1fe444c1dc4beba04670ef575197a8
lifecycle_stage: active
metrics: {'accuracy': 0.9666666666666667, 'log loss': 0.3116084636110016}
tags: {'model': 'LogisticRegression'}
run_id: b47b94363b66484c9781db5e5d52ddeb
lifecycle_stage: active
metrics: {'accuracy': 0.9666666666666667}
tags: {'model': 'LogisticRegression'}
run_id: 6ebfdf9daef946988fb756e99dc873f3
lifecycle_stage: active
metrics: {'accuracy': 0.96, 'log loss': 0.32764854001932037}
tags: {'model': 'LogisticRegression'}
run_id: bc3df90f560f4bf2a594e2af8a0ba6ae
lifecycle_stage: active
metrics: {'accuracy': 0.96, 'log loss': 0.32242512006490126}
tags: {'model': 'LogisticRegression'}
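
The top three runs above tie on accuracy, and one of them has no `log loss` metric. One way to choose among them is to break ties on log loss and treat a missing value as worst. The sketch below does this in plain Python on metrics copied from the output above. `pick_best` is a hypothetical helper, and the tie-breaking rule is an assumption rather than something the notebook prescribes.

```python
import math

# (run_id, metrics) pairs copied from the search_runs output above
runs_metrics = [
    ("207577093dbc4a7a80d6d818b8116f87",
     {"accuracy": 0.9666666666666667, "log loss": 0.305859112389673}),
    ("bb1fe444c1dc4beba04670ef575197a8",
     {"accuracy": 0.9666666666666667, "log loss": 0.3116084636110016}),
    ("b47b94363b66484c9781db5e5d52ddeb",
     {"accuracy": 0.9666666666666667}),
    ("6ebfdf9daef946988fb756e99dc873f3",
     {"accuracy": 0.96, "log loss": 0.32764854001932037}),
    ("bc3df90f560f4bf2a594e2af8a0ba6ae",
     {"accuracy": 0.96, "log loss": 0.32242512006490126}),
]


def pick_best(pairs):
    """Highest accuracy first; ties broken by lowest log loss (missing counts as +inf)."""
    return min(
        pairs,
        key=lambda p: (-p[1]["accuracy"], p[1].get("log loss", math.inf)),
    )


best_run_id, best_metrics = pick_best(runs_metrics)
print(best_run_id, best_metrics)
```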


In [31]:
runs[0].info.run_id

'207577093dbc4a7a80d6d818b8116f87'

In [32]:
model_uris = [f'runs:/{run.info.run_id}/models'
              for run in runs]

for model_uri in model_uris:
    mlflow.register_model(
        model_uri=model_uri,
        name='iris-classifier'
    )

Registered model 'iris-classifier' already exists. Creating a new version of this model...
2022/06/02 08:34:57 INFO mlflow.tracking._model_registry.client: Waiting up to 300 seconds for model version to finish creation.                     Model name: iris-classifier, version 2
Created version '2' of model 'iris-classifier'.
Registered model 'iris-classifier' already exists. Creating a new version of this model...
2022/06/02 08:34:57 INFO mlflow.tracking._model_registry.client: Waiting up to 300 seconds for model version to finish creation.                     Model name: iris-classifier, version 3
Created version '3' of model 'iris-classifier'.
Registered model 'iris-classifier' already exists. Creating a new version of this model...
2022/06/02 08:34:57 INFO mlflow.tracking._model_registry.client: Waiting up to 300 seconds for model version to finish creation.                     Model name: iris-classifier, version 4
Created version '4' of model 'iris-classifier'.
Registered model 'i
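
The `runs:/<run_id>/<artifact_path>` URIs used in the registration loop point at an artifact logged inside a run. The trailing `models` must match the `artifact_path` passed to `log_model` in `objective`. A small sketch of building such a URI follows. The helper `run_model_uri` is hypothetical, and the run ID is the first one from the search output above.

```python
def run_model_uri(run_id: str, artifact_path: str = "models") -> str:
    """Build a runs:/ URI for a model logged under `artifact_path` in run `run_id`."""
    if not run_id:
        raise ValueError("run_id must be a non-empty string")
    return f"runs:/{run_id}/{artifact_path.strip('/')}"


uri = run_model_uri("207577093dbc4a7a80d6d818b8116f87")
print(uri)
```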

In [17]:
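# Look up the experiment ID by name, then register the model from its first listed run.
# (This cell was executed earlier in the session than the loop above, hence version 1 below.)
exp_list = mlflow.list_experiments()
iris_exp_id = [entity.experiment_id
               for entity in exp_list if entity.name == IRIS_EXP_NAME]

run_id = client.list_run_infos(experiment_id=iris_exp_id[0])[0].run_id
mlflow.register_model(
    model_uri=f"runs:/{run_id}/models",
    name='iris-classifier'
)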

Successfully registered model 'iris-classifier'.
2022/06/02 04:08:11 INFO mlflow.tracking._model_registry.client: Waiting up to 300 seconds for model version to finish creation.                     Model name: iris-classifier, version 1
Created version '1' of model 'iris-classifier'.


<ModelVersion: creation_timestamp=1654142891574, current_stage='None', description='', last_updated_timestamp=1654142891574, name='iris-classifier', run_id='05a282bca5b8434780fd913c778b707f', run_link='', source='./mlruns/2/05a282bca5b8434780fd913c778b707f/artifacts/models', status='READY', status_message='', tags={}, user_id='', version='1'>
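
The `creation_timestamp` and `last_updated_timestamp` fields in these objects are milliseconds since the Unix epoch. Converting one to a readable UTC time takes a couple of lines with the standard library. The value below is copied from the output above.

```python
from datetime import datetime, timezone

creation_timestamp_ms = 1654142891574  # copied from the ModelVersion output above

# Divide by 1000 because datetime expects seconds, not milliseconds
created_at = datetime.fromtimestamp(creation_timestamp_ms / 1000, tz=timezone.utc)
print(created_at.isoformat())
```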