# Scenario 3: Multiple data scientists working on multiple ML models

MLflow setup:

| **`Tracking server`**   | yes, remote server (EC2) |
| :---------------------: | :--------------: |
| **`Backend store`**     | PostgreSQL database  |
| **`Artifact store`**    | S3 bucket |

The experiments can be explored in the MLflow UI served by the remote tracking server.

The example uses AWS to host a remote server. To run the example you'll need an AWS account. Follow the steps in the file [`mlflow_on_aws.md`](https://github.com/joweyel/mlops-zoomcamp/blob/main/02-experiment-tracking/mlflow_on_aws.md) to create a new AWS account and launch the tracking server.

In [None]:
import os
import mlflow

# os.environ["AWS_PROFILE"] = "" # fill in with your AWS profile. More info: https://docs.aws.amazon.com/

TRACKING_SERVER_HOST = "<public-ec2-ip>.compute-1.amazonaws.com" # fill in with the public DNS of the EC2 instance
mlflow.set_tracking_uri(f"http://{TRACKING_SERVER_HOST}:5000")
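
Before logging anything, it can help to check that the tracking URI is well formed and that the server answers. The following is a minimal sketch using only the standard library. The helper names `build_tracking_uri` and `server_is_reachable` are hypothetical, the hostname in the usage note is made up, and the check assumes the tracking server exposes a `/health` endpoint on the same port.

```python
import http.client
import urllib.request


def build_tracking_uri(host: str, port: int = 5000, scheme: str = "http") -> str:
    """Compose the tracking URI from the EC2 public DNS name and the server port."""
    host = host.strip()
    # Reject an empty value or an unfilled "<...>" placeholder early.
    if not host or "<" in host or ">" in host:
        raise ValueError("replace the placeholder with the public DNS of your EC2 instance")
    return f"{scheme}://{host}:{port}"


def server_is_reachable(tracking_uri: str, timeout: float = 5.0) -> bool:
    """Return True if GET <tracking_uri>/health answers with HTTP 200.

    Assumption: the tracking server serves a /health endpoint.
    """
    try:
        with urllib.request.urlopen(f"{tracking_uri}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, DNS failures and HTTP errors.
        return False
```

For example, `build_tracking_uri("ec2-1-2-3-4.compute-1.amazonaws.com")` yields a URI that can be passed to `mlflow.set_tracking_uri`. If `server_is_reachable` returns `False`, the EC2 security group (inbound rule for port 5000) is a reasonable first thing to check.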

## Now after everything is set up, the fun begins!

In [None]:
print(f"Tracking-URI: '{mlflow.get_tracking_uri()}'")

In [None]:
# Should only show the default experiment if you have just set up MLflow on AWS
mlflow.search_experiments()

In [None]:
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score

mlflow.set_experiment("my-experiment-1-aws")

with mlflow.start_run():
    X, y = load_iris(return_X_y=True)
    params = { "C": 0.1, "random_state": 42 }
    mlflow.log_params(params)

    lr = LogisticRegression(**params).fit(X, y)
    y_pred = lr.predict(X)
    mlflow.log_metric("accuracy", accuracy_score(y, y_pred))

    mlflow.sklearn.log_model(lr, artifact_path="models")
    print(f"default artifacts URI: '{mlflow.get_artifact_uri()}'")
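
With an S3 artifact store, the artifact URI printed above is expected to start with `s3://`. To look the files up in the S3 console or with an S3 client, the URI can be split into a bucket name and a key prefix. This is a minimal sketch, assuming a URI of the form `s3://<bucket>/<key-prefix>`. The helper name `split_s3_uri` is hypothetical, and the bucket name and IDs in the usage note are made up.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_s3_uri(artifact_uri: str) -> Tuple[str, str]:
    """Split 's3://bucket/some/prefix' into ('bucket', 'some/prefix')."""
    parsed = urlparse(artifact_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {artifact_uri!r}")
    # The bucket is the network location; the key prefix is the path
    # without its leading slash.
    return parsed.netloc, parsed.path.lstrip("/")
```

For example, `split_s3_uri("s3://my-bucket/1/abc123/artifacts")` returns `("my-bucket", "1/abc123/artifacts")`. In the notebook, the input would be the value of `mlflow.get_artifact_uri()` from inside the run.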

In [None]:
# There should now be 2 experiments: the default one and "my-experiment-1-aws"
mlflow.search_experiments()

## Interacting with the Model Registry

This works the same way as in Scenario 2, except that the client now talks to the tracking server on AWS.

In [None]:
from mlflow.tracking import MlflowClient

client = MlflowClient(f"http://{TRACKING_SERVER_HOST}:5000")

In [None]:
# No models are registered yet, so this should return an empty list
client.search_registered_models()

In [None]:
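# Register a model from a run in experiment "1"
# (on a fresh server, the first experiment created after the default one).
# search_runs expects a list of experiment IDs; by default the newest run comes first.
run_id = client.search_runs(experiment_ids=["1"])[0].info.run_id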
mlflow.register_model(
    model_uri=f"runs:/{run_id}/models", 
    name="iris-classifier"
)
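
Once the model is registered, it can be addressed by name and version instead of by run. The sketch below shows the two URI schemes involved: `runs:/<run_id>/<artifact_path>`, as used for registration above, and `models:/<name>/<version>` for a registered model version. The helper names are hypothetical. Loading through `mlflow.pyfunc.load_model` assumes the tracking URI is already set and that AWS credentials with read access to the artifact bucket are available.

```python
def run_model_uri(run_id: str, artifact_path: str = "models") -> str:
    """URI of a model logged under `artifact_path` in a given run."""
    return f"runs:/{run_id}/{artifact_path}"


def registered_model_uri(name: str, version: int) -> str:
    """URI of a specific version of a registered model."""
    return f"models:/{name}/{version}"


def load_registered_model(name: str, version: int):
    """Load a registered model version as a generic pyfunc model."""
    # Imported lazily so the URI helpers above work without mlflow installed.
    import mlflow.pyfunc

    return mlflow.pyfunc.load_model(registered_model_uri(name, version))
```

For example, `registered_model_uri("iris-classifier", 1)` gives `"models:/iris-classifier/1"`. The first registration under a new name creates version 1, so `load_registered_model("iris-classifier", 1)` would fetch the model registered above.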