## Scenario 2: A cross-functional team with one data scientist working on an ML model


MLflow setup:
- tracking server: yes, local server
- backend store: sqlite database
- artifacts store: local filesystem

The experiments can be explored locally by accessing the local tracking server.

To run this example, launch the MLflow server locally by running the following command in your terminal:

`mlflow server --backend-store-uri sqlite:///backend.db`

When MLflow is started this way, the first thing it does is create the SQLite database file (backend.db).
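To confirm that the database file was created, a minimal sketch using Python's standard `sqlite3` module can list the tables it contains. The helper name `list_tables` is hypothetical, and the sketch assumes backend.db sits in the directory where the server was started. No table names are assumed, since they may differ between MLflow versions.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_tables(db_path):
    """Return the sorted table names found in a SQLite database file."""
    # Open read-only through a file: URI so a missing file raises an error
    # instead of being silently created.
    db_uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(db_uri, uri=True)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]


# Assumption: backend.db is in the current working directory.
if Path("backend.db").exists():
    print(list_tables("backend.db"))
```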

Now MLflow is running at http://127.0.0.1:5000, so we must pass that address to the set_tracking_uri method.

In [1]:
import mlflow

mlflow.set_tracking_uri("http://127.0.0.1:5000")

In [2]:
print(f"tracking URI: '{mlflow.get_tracking_uri()}'")

tracking URI: 'http://127.0.0.1:5000'
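Setting the tracking URI does not by itself confirm that the server is reachable. As a rough check, the sketch below sends a request to the server's `/health` endpoint using only the standard library. The helper names `health_url` and `server_is_up` are hypothetical, and the address is the one used above.

```python
import urllib.request


def health_url(tracking_uri):
    """Build the health-check URL for a tracking server base URI."""
    return tracking_uri.rstrip("/") + "/health"


def server_is_up(tracking_uri, timeout=2.0):
    """Return True if the server answers the health check with HTTP 200."""
    try:
        with urllib.request.urlopen(health_url(tracking_uri), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection refused, timeouts, and HTTP error responses.
        return False


print(server_is_up("http://127.0.0.1:5000"))
```

If this prints `False`, the server is probably not running, or it is listening on a different host or port.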


In [6]:
mlflow.search_experiments()

[<Experiment: artifact_location='mlflow-artifacts:/1', creation_time=1685109192696, experiment_id='1', last_update_time=1685109192696, lifecycle_stage='active', name='my-experiment-1', tags={}>,
 <Experiment: artifact_location='mlflow-artifacts:/0', creation_time=1685108796717, experiment_id='0', last_update_time=1685108796717, lifecycle_stage='active', name='Default', tags={}>]

Running this makes MLflow create two things: backend.db, a SQLite database containing information about the experiments, and the artifact folder. In the video, MLflow created a folder called artifacts_local to store the experiment artifacts locally (in this case, the model info). Here, the artifact_location is reported under mlflow-artifacts:/, but the folder created locally is called mlartifacts. I suspect this information is also dumped in backend.db (where the metadata about the experiment is stored).
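To make the relationship between the two names concrete, here is a small sketch that translates an `mlflow-artifacts:/` URI into the local path where the files would be expected. It assumes the server stores artifacts under a folder named mlartifacts in its working directory, as observed in this run. The helper `local_artifact_path` is hypothetical and only rewrites the string; it does not query MLflow.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def local_artifact_path(artifact_uri, destination="mlartifacts"):
    """Map an mlflow-artifacts:/ URI to a path under the assumed destination folder."""
    parsed = urlparse(artifact_uri)
    if parsed.scheme != "mlflow-artifacts":
        raise ValueError(f"not an mlflow-artifacts URI: {artifact_uri!r}")
    # Drop the leading slash so the URI path is joined below the destination.
    return PurePosixPath(destination) / parsed.path.lstrip("/")


print(local_artifact_path("mlflow-artifacts:/1/b666d120f9ef48a4b11653da446b7f8b/artifacts"))
```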

In [5]:
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score

mlflow.set_experiment("my-experiment-1")

with mlflow.start_run():

    X, y = load_iris(return_X_y=True)

    params = {"C": 0.1, "random_state": 42}
    mlflow.log_params(params)

    lr = LogisticRegression(**params).fit(X, y)
    y_pred = lr.predict(X)
    mlflow.log_metric("accuracy", accuracy_score(y, y_pred))

    mlflow.sklearn.log_model(lr, artifact_path="models")
    print(f"default artifacts URI: '{mlflow.get_artifact_uri()}'")

2023/05/26 10:53:12 INFO mlflow.tracking.fluent: Experiment with name 'my-experiment-1' does not exist. Creating a new experiment.


default artifacts URI: 'mlflow-artifacts:/1/b666d120f9ef48a4b11653da446b7f8b/artifacts'


In [7]:
mlflow.search_experiments()

[<Experiment: artifact_location='mlflow-artifacts:/1', creation_time=1685109192696, experiment_id='1', last_update_time=1685109192696, lifecycle_stage='active', name='my-experiment-1', tags={}>,
 <Experiment: artifact_location='mlflow-artifacts:/0', creation_time=1685108796717, experiment_id='0', last_update_time=1685108796717, lifecycle_stage='active', name='Default', tags={}>]

### Interacting with the model registry

In [8]:
from mlflow.tracking import MlflowClient


client = MlflowClient("http://127.0.0.1:5000")

In [10]:
client.search_registered_models()

[]

We can see that there are no registered models yet. We can register one with the following code:

In [39]:
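# Older alternative (list_run_infos is no longer available in MLflow 2.x):
# run_id = client.list_run_infos(experiment_id='1')[0].run_id
# experiment_ids expects a list of experiment IDs; take the first run returned
run_id = client.search_runs(experiment_ids=["1"])[0].info.run_id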
mlflow.register_model(
    model_uri=f"runs:/{run_id}/models",
    name='iris-classifier'
)

Successfully registered model 'iris-classifier'.
2023/05/26 12:04:15 INFO mlflow.tracking._model_registry.client: Waiting up to 300 seconds for model version to finish creation. Model name: iris-classifier, version 1
Created version '1' of model 'iris-classifier'.


<ModelVersion: aliases=[], creation_timestamp=1685113455627, current_stage='None', description='', last_updated_timestamp=1685113455627, name='iris-classifier', run_id='b666d120f9ef48a4b11653da446b7f8b', run_link='', source='mlflow-artifacts:/1/b666d120f9ef48a4b11653da446b7f8b/artifacts/models', status='READY', status_message='', tags={}, user_id='', version='1'>

Now, when checking the MLflow UI, we can see the model is there.
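Besides checking the UI, the registered version can be loaded back through the registry and used for prediction. The sketch below assumes the server from this scenario is still running and that version 1 of iris-classifier exists. The helper names `model_version_uri` and `predict_with_registered_model` are hypothetical.

```python
def model_version_uri(name, version):
    """Build a models:/ URI that points at one registered model version."""
    return f"models:/{name}/{version}"


def predict_with_registered_model(name, version, rows,
                                  tracking_uri="http://127.0.0.1:5000"):
    """Load a registered model version as a generic pyfunc model and predict."""
    # Imported here so that model_version_uri stays usable without MLflow installed.
    import mlflow.pyfunc

    mlflow.set_tracking_uri(tracking_uri)
    model = mlflow.pyfunc.load_model(model_version_uri(name, version))
    return model.predict(rows)


# Example usage (requires the running server and the model registered above):
# from sklearn.datasets import load_iris
# X, _ = load_iris(return_X_y=True)
# print(predict_with_registered_model("iris-classifier", 1, X[:5]))
```

Loading through `mlflow.pyfunc` keeps the calling code independent of the library that trained the model.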

Again, before moving to the next scenario, we delete the local artifact folder (mlartifacts in this run; mlruns if it was created) and the backend.db file, because a new one will be created.
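The cleanup can also be scripted. The sketch below removes the artifact folders and the SQLite file if they exist. The function name `clean_scenario` and its default folder names are assumptions based on what this run produced. Stop the server before calling it.

```python
import shutil
from pathlib import Path


def clean_scenario(workdir=".", folders=("mlartifacts", "mlruns"), db_name="backend.db"):
    """Delete the artifact folders and the SQLite file under workdir.

    Returns the names of the items that were actually removed.
    """
    base = Path(workdir)
    removed = []
    for folder in folders:
        target = base / folder
        if target.is_dir():
            shutil.rmtree(target)  # removes the folder and everything inside it
            removed.append(folder)
    db_file = base / db_name
    if db_file.is_file():
        db_file.unlink()
        removed.append(db_name)
    return removed


# Example usage, run from the directory where the server was started:
# print(clean_scenario())
```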