# Scenario 3: Multiple data scientist working on multiple ML models

Scenerio setup: 
+ Tracking server: yes, remote server (AWS EC2).
+ Backend store: sqlite database.
+ Artifact store: s3 bucket.

The experiment can be explored by accessing the **remote** ocal tracking server.<br>
To run this example, we need to launch the mlflow server locally by this command: 

This example uses AWS to host the remote server.<br>

Follow this guide to create the AWS resoures (EC2, RDS, S3): [mlflow_on_aws](https://github.com/1412010/nda-mlops-zoomcamp-2024/blob/main/week-02/mlflow_on_aws.md)

To start the MLflow server inside EC2: 

`mlflow server -h 0.0.0.0 -p 5000 --backend-store-uri postgresql://DB_USER:DB_PASSWORD#@DB_ENDPOINT:5432/DATABASE --default-artifact-root s3://<S3 bucket name>
`

In [1]:
import os

# os.environ["AWS_PROFILE"] = "default"
# os.environ["AWS_CONFIG_FILE"] = "~/.aws/credentials"

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""

In [2]:
import mlflow

# Public IPv4 DNS server address of the EC2 instance
TRACKING_SERVER_HOST = "ec2-54-169-146-85.ap-southeast-1.compute.amazonaws.com" 

mlflow.set_tracking_uri(f"http://{TRACKING_SERVER_HOST}:5000")

In [3]:
print(f"Mlflow tracking URI: {mlflow.get_tracking_uri()}")

Mlflow tracking URI: http://ec2-54-169-146-85.ap-southeast-1.compute.amazonaws.com:5000


In [4]:
mlflow.search_experiments()

[<Experiment: artifact_location='s3://nda-mlflow-artifacts-remote/3', creation_time=1718126110387, experiment_id='3', last_update_time=1718126110387, lifecycle_stage='active', name='my-experiment-1', tags={}>,
 <Experiment: artifact_location='s3://nda-mlflow-artifacts-remote/2', creation_time=1718125936709, experiment_id='2', last_update_time=1718125936709, lifecycle_stage='active', name='my-experiment-3', tags={}>,
 <Experiment: artifact_location='s3://nda-mlflow-artifacts-remote/1', creation_time=1718124572483, experiment_id='1', last_update_time=1718124572483, lifecycle_stage='active', name='my-experiment-2', tags={}>,
 <Experiment: artifact_location='s3://nda-mlflow-artifacts-remote/0', creation_time=1718124176503, experiment_id='0', last_update_time=1718124176503, lifecycle_stage='active', name='Default', tags={}>]

## Create an experiment and log a new run

In [5]:
mlflow.end_run()

In [6]:
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

mlflow.set_experiment("my-experiment-1")

with mlflow.start_run():
    X, y = load_iris(return_X_y=True)
    
    params = { "C": 0.1, "random_state": 42 }
    mlflow.log_params(params)
    
    lr = LogisticRegression(**params)
    lr = lr.fit(X, y)
    y_pred = lr.predict(X)
    mlflow.log_metric("accuracy", accuracy_score(y, y_pred))
    
    mlflow.sklearn.log_model(lr, artifact_path="models")
    print(f"default artifact URI: {mlflow.get_artifact_uri()}") 

default artifact URI: s3://nda-mlflow-artifacts-remote/3/72e81ef324fa4ba49778aa9022480275/artifacts


In [7]:
mlflow.search_experiments()

[<Experiment: artifact_location='s3://nda-mlflow-artifacts-remote/3', creation_time=1718126110387, experiment_id='3', last_update_time=1718126110387, lifecycle_stage='active', name='my-experiment-1', tags={}>,
 <Experiment: artifact_location='s3://nda-mlflow-artifacts-remote/2', creation_time=1718125936709, experiment_id='2', last_update_time=1718125936709, lifecycle_stage='active', name='my-experiment-3', tags={}>,
 <Experiment: artifact_location='s3://nda-mlflow-artifacts-remote/1', creation_time=1718124572483, experiment_id='1', last_update_time=1718124572483, lifecycle_stage='active', name='my-experiment-2', tags={}>,
 <Experiment: artifact_location='s3://nda-mlflow-artifacts-remote/0', creation_time=1718124176503, experiment_id='0', last_update_time=1718124176503, lifecycle_stage='active', name='Default', tags={}>]

## Interacting with the Model Registry

In [None]:
from mlflow.client import MlflowClient

client = MlflowClient("http://127.0.0.1:5000/")

In [None]:
print(client.search_registered_models())

[]


In [None]:
run_id = client.search_runs(experiment_ids='1')[0].info.run_id

In [None]:
mlflow.register_model(
    model_uri=f"run:/{run_id}/models",
    name="iris_classifier"
)

Successfully registered model 'iris_classifier'.
2024/06/08 18:26:43 INFO mlflow.store.model_registry.abstract_store: Waiting up to 300 seconds for model version to finish creation. Model name: iris_classifier, version 1
Created version '1' of model 'iris_classifier'.


<ModelVersion: aliases=[], creation_timestamp=1717846003303, current_stage='None', description='', last_updated_timestamp=1717846003303, name='iris_classifier', run_id='', run_link='', source='run:/d5c99529b7214afb82191601d37db10d/models', status='READY', status_message='', tags={}, user_id='', version='1'>