MLflow concepts:

MLflow runs can be recorded to local files, to a SQLAlchemy-compatible database, or remotely to a tracking server. By default, the MLflow Python API logs runs to local files in an `mlruns` directory inside the directory where you ran your program. You can then run `mlflow ui` to browse the logged runs.
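
The three storage options correspond to different tracking URI schemes, and such a URI is what gets passed to `mlflow.set_tracking_uri`. As a rough illustration, the sketch below classifies a URI by its scheme. Note that `describe_tracking_uri` is a hypothetical helper written for this notebook, not part of MLflow, and the example URIs (`sqlite:///mlflow.db`, `http://localhost:5000`) are placeholders.

```python
from urllib.parse import urlparse

# Hypothetical helper, not part of MLflow: maps a tracking URI to the kind
# of store described above. The scheme list is illustrative, not exhaustive.
DATABASE_SCHEMES = {"sqlite", "postgresql", "mysql", "mssql"}


def describe_tracking_uri(uri: str) -> str:
    # Drop an optional driver suffix, e.g. "mysql+pymysql" -> "mysql".
    scheme = urlparse(uri).scheme.split("+")[0]
    if scheme in ("", "file"):
        return "local files"
    if scheme in DATABASE_SCHEMES:
        return "SQLAlchemy-compatible database"
    if scheme in ("http", "https"):
        return "remote tracking server"
    return "other"


for uri in ["file:///tmp/mlruns", "sqlite:///mlflow.db", "http://localhost:5000"]:
    print(uri, "->", describe_tracking_uri(uri))
```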

MLflow uses two storage components: the backend store and the artifact store.
The **backend store** records run metadata (runs, parameters, metrics, tags, notes, etc.), while the **artifact store** records artifacts (files, models, images, in-memory objects, model summaries, etc.).



### MLflow setup:

Tracking server: no

Backend store: local filesystem

Artifact store: local filesystem


In this example, both the backend store and the artifact store live on the local filesystem.

In [1]:
import mlflow

In [2]:
print(f"tracking URI: '{mlflow.get_tracking_uri()}'")

tracking URI: 'file:///home/shivam/MLflow_examples/Experiment_tracking/mlruns'
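
The tracking URI printed above is a `file://` URI rather than a plain path. To check from Python whether the store directory exists yet, one option is to convert the URI with the standard library. In this sketch `tracking_uri_to_path` is a hypothetical helper name, and only the `file` scheme is handled.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import url2pathname


def tracking_uri_to_path(uri: str) -> Path:
    """Convert a file:// tracking URI into a local filesystem path."""
    parsed = urlparse(uri)
    if parsed.scheme != "file":
        raise ValueError(f"not a local file store URI: {uri}")
    return Path(url2pathname(parsed.path))


store_dir = tracking_uri_to_path(
    "file:///home/shivam/MLflow_examples/Experiment_tracking/mlruns"
)
# exists() is False until MLflow has created the directory.
print(store_dir.name, store_dir.exists())
```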


So far there is no `mlruns` directory. It is created in the current working directory when the next cell runs.

In [3]:
mlflow.list_experiments()
## Running this cell creates the mlruns directory

[<Experiment: artifact_location='file:///home/shivam/MLflow_examples/Experiment_tracking/mlruns/0', experiment_id='0', lifecycle_stage='active', name='Default', tags={}>]

**Creating an example and starting a new run**

MLflow Tracking is organized around the concept of runs, which are executions of some piece of data science code. Each run records the following information:


1. Code version: the Git commit hash used for the run, if it was run from an MLflow Project.
2. Start and end time
3. Parameters: key-value input parameters of your choice. Both keys and values are strings.
4. Metrics
5. Artifacts

In [4]:
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score


In [8]:
mlflow.set_experiment("my-experiment-1")

2023/01/13 16:18:10 INFO mlflow.tracking.fluent: Experiment with name 'my-experiment-1' does not exist. Creating a new experiment.


<Experiment: artifact_location='file:///home/shivam/MLflow_examples/Experiment_tracking/mlruns/1', experiment_id='1', lifecycle_stage='active', name='my-experiment-1', tags={}>

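We can log models in two ways: with a flavor-specific `log_model` function such as **mlflow.sklearn.log_model**, or by saving the model to disk ourselves and logging the files with **mlflow.log_artifacts**. We will see the second approach later.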

In [9]:

with mlflow.start_run():

    X, y = load_iris(return_X_y=True)

    params = {"C": 0.1, "random_state": 42}
    mlflow.log_params(params)

    lr = LogisticRegression(**params).fit(X, y)
    y_pred = lr.predict(X)
    mlflow.log_metric("accuracy", accuracy_score(y, y_pred))

    mlflow.sklearn.log_model(lr, artifact_path="models")
    print(f"default artifacts URI: '{mlflow.get_artifact_uri()}'")
    mlflow.set_tag("Message", "This run logs the model with mlflow.sklearn.log_model")

default artifacts URI: 'file:///home/shivam/MLflow_examples/Experiment_tracking/mlruns/1/bb83e25868804030af92cf41cafb57f7/artifacts'


So far, "my-experiment-1" has one run and, correspondingly, one folder inside the mlruns/1/ folder. Let's now change 'C' to 0.5 and see what happens.



In [10]:
## Now change C to 0.5 and record another run in the same experiment
mlflow.set_experiment("my-experiment-1")

with mlflow.start_run():

    X, y = load_iris(return_X_y=True)

    params = {"C": 0.5, "random_state": 42}
    mlflow.log_params(params)

    lr = LogisticRegression(**params).fit(X, y)
    y_pred = lr.predict(X)
    mlflow.log_metric("accuracy", accuracy_score(y, y_pred))

    mlflow.sklearn.log_model(lr, artifact_path="models")
    print(f"default artifacts URI: '{mlflow.get_artifact_uri()}'")

default artifacts URI: 'file:///home/shivam/MLflow_examples/Experiment_tracking/mlruns/1/fbe993708ddb49f48ea894e9ea7201d2/artifacts'


Look inside the mlruns folder. The folder mlruns/1/ corresponds to the experiment "my-experiment-1". It now contains two folders, one for each run.
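
The same layout can be inspected from Python. The sketch below assumes the local file store layout seen in this notebook, `mlruns/<experiment_id>/<run_id>/`, and treats any 32-character hexadecimal folder name as a run. Here `list_run_folders` is a hypothetical helper, and the layout is an implementation detail of MLflow's file store that may differ between versions.

```python
import re
from pathlib import Path

# Run folders in this notebook are named with 32 lowercase hex characters.
RUN_ID_PATTERN = re.compile(r"^[0-9a-f]{32}$")


def list_run_folders(mlruns_dir):
    """Map each experiment folder name to the sorted run folder names inside it."""
    layout = {}
    root = Path(mlruns_dir)
    if not root.is_dir():
        return layout
    for experiment_dir in sorted(root.iterdir()):
        # Skip hidden helper folders such as ".trash" and any stray files.
        if not experiment_dir.is_dir() or experiment_dir.name.startswith("."):
            continue
        layout[experiment_dir.name] = sorted(
            child.name
            for child in experiment_dir.iterdir()
            if child.is_dir() and RUN_ID_PATTERN.match(child.name)
        )
    return layout


print(list_run_folders("mlruns"))
```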

In [11]:
mlflow.list_experiments()

[<Experiment: artifact_location='file:///home/shivam/MLflow_examples/Experiment_tracking/mlruns/1', experiment_id='1', lifecycle_stage='active', name='my-experiment-1', tags={}>,
 <Experiment: artifact_location='file:///home/shivam/MLflow_examples/Experiment_tracking/mlruns/0', experiment_id='0', lifecycle_stage='active', name='Default', tags={}>]

### Creating another experiment from here onwards
Changes: using different "C" and "random_state" values.
This will create another folder inside the "mlruns" folder.

In [12]:
mlflow.set_experiment("my-experiment-2")

with mlflow.start_run():

    X, y = load_iris(return_X_y=True)

    params = {"C": 0.8, "random_state": 56}
    mlflow.log_params(params)

    lr = LogisticRegression(**params).fit(X, y)
    y_pred = lr.predict(X)
    mlflow.log_metric("accuracy", accuracy_score(y, y_pred))

    mlflow.sklearn.log_model(lr, artifact_path="models")
    print(f"default artifacts URI: '{mlflow.get_artifact_uri()}'")

2023/01/13 16:28:08 INFO mlflow.tracking.fluent: Experiment with name 'my-experiment-2' does not exist. Creating a new experiment.
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(


default artifacts URI: 'file:///home/shivam/MLflow_examples/Experiment_tracking/mlruns/2/c999f40eb7324678a49936adf3e8d185/artifacts'
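
The scikit-learn message in the output above says the solver hit its iteration limit before converging. Following the message's own suggestions, one option is to standardize the features and raise `max_iter`. The sketch below does both with the same `C` and `random_state`. The value `max_iter=1000` is an arbitrary choice for illustration, and the MLflow logging calls are omitted to keep the example short.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Scale features to zero mean and unit variance, then fit the classifier.
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(C=0.8, random_state=56, max_iter=1000),
)
model.fit(X, y)

# n_iter_ holds the number of iterations the solver actually used.
iterations_used = model.named_steps["logisticregression"].n_iter_
print(f"iterations used: {iterations_used}, training accuracy: {model.score(X, y):.3f}")
```

A pipeline like this can be passed to `mlflow.sklearn.log_model` in place of `lr`.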


In [13]:
mlflow.list_experiments()

[<Experiment: artifact_location='file:///home/shivam/MLflow_examples/Experiment_tracking/mlruns/1', experiment_id='1', lifecycle_stage='active', name='my-experiment-1', tags={}>,
 <Experiment: artifact_location='file:///home/shivam/MLflow_examples/Experiment_tracking/mlruns/0', experiment_id='0', lifecycle_stage='active', name='Default', tags={}>,
 <Experiment: artifact_location='file:///home/shivam/MLflow_examples/Experiment_tracking/mlruns/2', experiment_id='2', lifecycle_stage='active', name='my-experiment-2', tags={}>]

In [18]:
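## Outside a `with mlflow.start_run()` block, get_artifact_uri() uses the active run
## or starts a new one, which would explain why the run ID differs from the one above.
print("Artifact URI is {}".format(mlflow.get_artifact_uri()))

Artifact URI is file:///home/shivam/MLflow_examples/Experiment_tracking/mlruns/2/360754677fef4610b3c0cea0f32271d3/artifacts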
