# Scenario 3: Multiple data scientists working on multiple ML models

MLflow setup:
- Tracking server: yes, remote server (EC2)
- Backend store: PostgreSQL database
- Artifact store: S3 bucket
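
To make the relationship between the three components concrete, here is a minimal sketch of how the `mlflow server` command on the EC2 instance could be assembled. The database user, password, host, and name are hypothetical placeholders, and the helper `mlflow_server_command` is not part of MLflow; the actual launch steps are the ones in mlflow_on_aws.md.

```python
from urllib.parse import quote_plus


def mlflow_server_command(db_user, db_password, db_host, db_name, bucket,
                          db_port=5432, port=5000):
    """Build the argument list for `mlflow server` (hypothetical helper)."""
    # Percent-encode the credentials so characters such as '@' or ':'
    # do not break the connection URI.
    backend_uri = (
        f"postgresql://{quote_plus(db_user)}:{quote_plus(db_password)}"
        f"@{db_host}:{db_port}/{db_name}"
    )
    return [
        "mlflow", "server",
        "--host", "0.0.0.0",                          # listen on all interfaces
        "--port", str(port),
        "--backend-store-uri", backend_uri,           # runs, params, metrics
        "--default-artifact-root", f"s3://{bucket}",  # models and other files
    ]


# Placeholder values for illustration only (assumptions, not real settings).
cmd = mlflow_server_command(
    db_user="mlflow",
    db_password="p@ss:word",
    db_host="mlflow-db.example.com",
    db_name="mlflow_db",
    bucket="mlflow-artifacts-remote433",
)
print(" ".join(cmd))
```

The backend store holds run metadata (parameters, metrics, tags), while the artifact root is where files such as serialized models end up.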

The experiments can be explored by accessing the remote tracking server.

This example uses AWS to host the remote tracking server, so you'll need an AWS account to run it. Follow the steps described in the file `mlflow_on_aws.md` to create a new AWS account and launch the tracking server.

In [1]:
import mlflow
import os

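# Named AWS credentials profile used by the client to reach the S3 artifact store.
# Background on AWS credentials: https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/get-started-auth.html
os.environ["AWS_PROFILE"] = "mlflow-client"
TRACKING_SERVER_HOST = "ec2-54-151-13-9.us-west-1.compute.amazonaws.com"  # Public DNS of the tracking server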
mlflow.set_tracking_uri(f"http://{TRACKING_SERVER_HOST}:5000")

In [2]:
print(f"Get tracking URI: {mlflow.get_tracking_uri()}")

Get tracking URI: http://ec2-54-151-13-9.us-west-1.compute.amazonaws.com:5000
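
The public DNS name of an EC2 instance usually changes when the instance is stopped and started again (unless an Elastic IP is attached), so hardcoding the host can become stale. One possible alternative, sketched below, is to prefer the `MLFLOW_TRACKING_URI` environment variable, which MLflow itself reads, and fall back to a host name otherwise. The helper `resolve_tracking_uri` is hypothetical and not part of MLflow.

```python
import os


def resolve_tracking_uri(default_host, port=5000):
    """Return MLFLOW_TRACKING_URI if it is set, else build an http URI."""
    return os.environ.get("MLFLOW_TRACKING_URI") or f"http://{default_host}:{port}"


uri = resolve_tracking_uri("ec2-54-151-13-9.us-west-1.compute.amazonaws.com")
print(uri)
```

The result can be passed to `mlflow.set_tracking_uri(uri)` in place of the hardcoded f-string.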


In [3]:
# Note: list_experiments() was removed in MLflow 2.0; use mlflow.search_experiments() there.
mlflow.list_experiments()

[<Experiment: artifact_location='s3://mlflow-artifacts-remote433/0', experiment_id='0', lifecycle_stage='active', name='Default', tags={}>,
 <Experiment: artifact_location='s3://mlflow-artifacts-remote433/1', experiment_id='1', lifecycle_stage='active', name='my-experiment-1', tags={}>]

In [4]:
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score

mlflow.set_experiment("my-experiment-1")

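with mlflow.start_run():
    # Load the Iris dataset as a feature matrix X and a target vector y
    X, y = load_iris(return_X_y=True)

    # Log the hyperparameters before training
    params = {'C': 0.1, 'random_state': 42}
    mlflow.log_params(params)

    # Train the model and log its accuracy on the training data
    lr = LogisticRegression(**params).fit(X, y)
    y_pred = lr.predict(X)
    mlflow.log_metric("accuracy", accuracy_score(y, y_pred))

    # Store the fitted model under "models" in the run's artifact location
    mlflow.sklearn.log_model(lr, artifact_path="models")
    print(f"get default artifacts URI: {mlflow.get_artifact_uri()}")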

get default artifacts URI: s3://mlflow-artifacts-remote433/1/bb7a127dca5c43f2b07b79b84855edb7/artifacts
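
The URI printed above appears to follow the pattern `s3://<bucket>/<experiment_id>/<run_id>/artifacts`. The sketch below splits it into those parts using only the standard library. The helper `parse_artifact_uri` is hypothetical, and the layout is inferred from this output rather than guaranteed by MLflow, so it may differ for other artifact root configurations.

```python
from urllib.parse import urlparse


def parse_artifact_uri(uri):
    """Split an artifact URI of the form s3://bucket/exp_id/run_id/... into parts."""
    parsed = urlparse(uri)
    # The path looks like "/<experiment_id>/<run_id>/artifacts"
    experiment_id, run_id, *rest = parsed.path.strip("/").split("/")
    return {
        "bucket": parsed.netloc,
        "experiment_id": experiment_id,
        "run_id": run_id,
        "subpath": "/".join(rest),
    }


parts = parse_artifact_uri(
    "s3://mlflow-artifacts-remote433/1/bb7a127dca5c43f2b07b79b84855edb7/artifacts"
)
print(parts)
```

Here the experiment ID `1` matches `my-experiment-1` in the experiment listing, and the bucket matches the artifact location reported for both experiments.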
