# MLflow Tutorial

## What is it?
MLflow is an open-source platform for managing the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry.

## What to use it for

### MLflow Tracking

MLflow Tracking is an API and UI for logging parameters, code versions, metrics, and output files when running your machine learning code, and for visualizing the results later.

### MLflow Models

MLflow Models is a convention for packaging machine learning models in multiple formats called "flavors". MLflow offers a variety of tools to help you deploy different flavors of models.

In [1]:
## Libraries for tutorial
import logging

import mlflow
import pandas as pd  # used later to tabulate experiments
import sklearn
from dotenv import load_dotenv
from mlflow.tracking import MlflowClient
from mlflow.entities.lifecycle_stage import LifecycleStage
from mlflow.exceptions import MlflowException


## Installation
Recommended: install the same MLflow version that is deployed on your k8s cluster.
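
To check the recommendation above, the locally installed version can be compared with the one running on the cluster. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `matches_deployed` are hypothetical, and the deployed version string is something you would have to look up on your own cluster.

```python
from importlib import metadata


def installed_version(package="mlflow"):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_deployed(deployed_version, package="mlflow"):
    """Return True when the local package version equals the deployed one.

    `deployed_version` is an assumption: take it from your cluster's deployment.
    """
    return installed_version(package) == deployed_version
```

If `matches_deployed(...)` returns `False`, installing the exact version with `pip install mlflow==<deployed version>` is one way to align the two.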

## Setup

We have an MLflow server deployed on the k8s cluster.

There is no need to run a local instance. You will need the following environment variables:
- MLFLOW_TRACKING_URI=http://localhost:5001
- MLFLOW_S3_ENDPOINT_URL=http://localhost:9000
- AWS_ACCESS_KEY_ID=minio
- AWS_SECRET_ACCESS_KEY=minio123
- experiment_id: 0
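
Before talking to the server, it can help to confirm that the variables listed above are actually set, for example after `load_dotenv()` has read a `.env` file. This is a rough sketch; the helper name `missing_env_vars` is hypothetical, and `experiment_id` is left out on the assumption that it is a setting rather than an environment variable.

```python
import os

# The four environment variables named in the Setup section.
REQUIRED_VARS = (
    "MLFLOW_TRACKING_URI",
    "MLFLOW_S3_ENDPOINT_URL",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
)


def missing_env_vars(required=REQUIRED_VARS, environ=None):
    """Return the names of required variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]
```

Calling `missing_env_vars()` with no arguments checks the current process environment; an empty list means everything needed is present.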

In [2]:
## Creating an experiment

In [3]:
load_dotenv()

True

In [23]:
# Experiment fields are attributes, not dictionary keys.
experiments[0].artifact_location


In [20]:
experiments = MlflowClient().search_experiments()
# Fields to read from each Experiment entity.
fields = [
    'artifact_location',
    'creation_time',
    'experiment_id',
    'last_update_time',
    'lifecycle_stage',
    'name',
    'tags',
]
all_experiments = []
for experiment in experiments:
    # Experiment objects are not subscriptable, so read each field with getattr.
    all_experiments.append([getattr(experiment, field) for field in fields])

In [18]:
mlflow.entities.Experiment.__dict__

mappingproxy({'__module__': 'mlflow.entities.experiment',
              '__doc__': '\n    Experiment object.\n    ',
              'DEFAULT_EXPERIMENT_NAME': 'Default',
              '__init__': <function mlflow.entities.experiment.Experiment.__init__(self, experiment_id, name, artifact_location, lifecycle_stage, tags=None, creation_time=None, last_update_time=None)>,
              'experiment_id': <property at 0x112c1e680>,
              'name': <property at 0x112c171d0>,
              '_set_name': <function mlflow.entities.experiment.Experiment._set_name(self, new_name)>,
              'artifact_location': <property at 0x112c17220>,
              'lifecycle_stage': <property at 0x112c17a90>,
              'tags': <property at 0x112af7130>,
              '_add_tag': <function mlflow.entities.experiment.Experiment._add_tag(self, tag)>,
              'creation_time': <property at 0x112d48c20>,
              '_set_creation_time': <function mlflow.entities.experiment.Experiment._set_creat

In [17]:
dir(mlflow.entities.Experiment.__dict__)

['__class__',
 '__class_getitem__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__ior__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__ne__',
 '__new__',
 '__or__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__reversed__',
 '__ror__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'copy',
 'get',
 'items',
 'keys',
 'values']

In [13]:
experiments[0].experiment_id

'0'

In [9]:
pd.DataFrame(experiments)

| | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| 0 | (artifact_location, s3://mlflow/0) | (creation_time, None) | (experiment_id, 0) | (last_update_time, None) | (lifecycle_stage, active) | (name, Default) | (tags, {}) |
| 1 | (artifact_location, s3://mlflow/1) | (creation_time, 1671197047344) | (experiment_id, 1) | (last_update_time, 1671197047344) | (lifecycle_stage, active) | (name, Document-Extractor) | (tags, {}) |
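
The output above holds `(field, value)` tuples in numbered columns, which is awkward to query. One possible alternative is to turn each experiment into a dictionary first, so that a DataFrame built from the result gets one named column per field. The sketch below is only illustrative: `experiments_to_records` is a hypothetical helper, and a `SimpleNamespace` stand-in (filled with the values of the Default experiment shown above) replaces a real `Experiment` so the code runs without a tracking server.

```python
from types import SimpleNamespace

# Attribute names exposed by the Experiment entity, as listed earlier.
EXPERIMENT_FIELDS = (
    "experiment_id",
    "name",
    "artifact_location",
    "lifecycle_stage",
    "creation_time",
    "last_update_time",
    "tags",
)


def experiments_to_records(experiments, fields=EXPERIMENT_FIELDS):
    """Return one dict per experiment, keyed by field name."""
    return [
        {field: getattr(experiment, field) for field in fields}
        for experiment in experiments
    ]


# Stand-in for an Experiment object (assumption: same attribute names).
demo = SimpleNamespace(
    experiment_id="0",
    name="Default",
    artifact_location="s3://mlflow/0",
    lifecycle_stage="active",
    creation_time=None,
    last_update_time=None,
    tags={},
)
records = experiments_to_records([demo])
```

With real data, `pd.DataFrame(experiments_to_records(experiments))` should then produce columns named `experiment_id`, `name`, and so on.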


In [None]:
def create_experiment(experiment_name):
    """Create an experiment, or return the id of the existing one with that name."""
    mlflow_client = MlflowClient()
    try:
        experiment_id = mlflow_client.create_experiment(experiment_name)
        logging.info(
            f"Experiment {experiment_name} created with id {experiment_id}"
        )
    except MlflowException:
        # Creation fails when the name is already taken, so look the experiment up.
        experiment = mlflow_client.get_experiment_by_name(experiment_name)
        if experiment is None:
            # The failure had another cause (e.g. the server is unreachable).
            raise
        if experiment.lifecycle_stage == LifecycleStage.DELETED:
            logging.error(
                f"Experiment {experiment_name} exists but is marked as deleted"
            )
            raise
        experiment_id = experiment.experiment_id
        logging.info(
            f"Experiment {experiment_name} exists with id {experiment_id}"
        )
    return experiment_id