In [None]:
#@title Environment setup
%set_env UV_PROJECT_ENVIRONMENT /usr/local
%set_env MLFLOW_TRACKING_URI https://dagshub.com/m09/landscape-classifier.mlflow
%set_env MPLBACKEND notebook
!rm -rf sample_data .config
!pip install -U uv
!git config --global user.email "jane@doe.eu"
!git config --global user.name "Jane Doe"
!git config --global init.defaultBranch main
!git clone https://github.com/shuuchuu/landscape-classifier.git
%cd /content/landscape-classifier
!uv sync --all-extras

# MLFlow, model registry & deployment

## DVC & MLFlow credentials

Enter your DagsHub credentials. The `token` value is your DagsHub access token, not your account password:

In [None]:
username = ""
token = ""

In [None]:
#@title To run
%set_env MLFLOW_TRACKING_USERNAME {username}
%set_env MLFLOW_TRACKING_PASSWORD {token}

## Data retrieval

In [None]:
!wget https://github.com/shuuchuu/dataset-landscape/archive/refs/heads/main.zip
!unzip main.zip
!mv dataset-landscape-main/seg_train train-data

## Publishing a model to a model registry

Modify the `train.py` file to train a model and track its metrics with [MLFlow Tracking](https://mlflow.org/docs/latest/tracking.html). The [`mlflow.keras`](https://mlflow.org/docs/latest/python_api/mlflow.keras.html) pages of the MLFlow Python API and the [Keras flavor](https://mlflow.org/docs/latest/models.html#keras-keras) section of the MLFlow models documentation will be useful.

Publish the trained model to the model registry at the end of the run. [This page](https://mlflow.org/docs/latest/models.html#keras-keras) can be useful.

In [None]:
!landscape-classifier

## Using the MLFlow model registry

In this section, we are going to study how to manipulate the MLFlow model registry by performing a series of basic tasks.

### Creating an MLFlow client

Create an MLFlow client and display its tracking URI to check that it was correctly read from the `MLFLOW_TRACKING_URI` environment variable defined at the top of this notebook.

In [None]:
# Your code here

#### Solution

In [None]:
import mlflow

client = mlflow.client.MlflowClient()
client.tracking_uri

### Model search

Display all the registered models and for each of them display the most recent version.

In [None]:
# Your code here

#### Solution

In [None]:
def latest_version(latest_versions):
  # Pick the version with the most recent creation timestamp.
  return max(latest_versions, key=lambda v: v.creation_timestamp).version


for model in client.search_registered_models():
  print(model.name, latest_version(model.latest_versions))

### Retrieving a model from the model registry

Now that you know how to retrieve the name and version of a model, download it from the registry and check that you can use it as you expect.

There are different flavors for each model. Why? What is different for each of them?

In [None]:
# Your code here

#### Solution

In [None]:
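import mlflow.keras
import mlflow.pyfunc


def uri_from_name_and_version(name: str, version: str | int) -> str:
  return f"models:/{name}/{version}"


uri = uri_from_name_and_version("lenet-landscape-classifier", 1)

# Native Keras flavor: returns the underlying Keras model.
model_keras = mlflow.keras.load_model(model_uri=uri)
model_keras.summary()

# Generic pyfunc flavor: returns a wrapper exposing a uniform `predict` method.
model = mlflow.pyfunc.load_model(model_uri=uri)
model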

### Adding an alias & tags

Aliases and tags help organize models in an MLFlow model registry. An alias always points to one specific version of a model, whereas a tag can be set either on a single version or on the registered model as a whole.

Add a `champion` alias to the latest version of your model, a `domain` tag with value `cv` to the registered model (so that it applies to all versions), and a `size` tag with value `small` to the latest version.

In [None]:
# Your code here

You can now retrieve your model by using its alias rather than its version.

In [None]:
# Your code here

#### Solution

In [None]:
client.set_registered_model_alias("lenet-landscape-classifier", "champion", "1")
client.set_registered_model_tag("lenet-landscape-classifier", "domain", "cv")
client.set_model_version_tag("lenet-landscape-classifier", "1", "size", "small")

In [None]:
import mlflow.keras


def uri_from_name_and_alias(name: str, alias: str) -> str:
  return f"models:/{name}@{alias}"


uri = uri_from_name_and_alias("lenet-landscape-classifier", "champion")


model_keras = mlflow.keras.load_model(model_uri=uri)
model_keras.summary()

### Changing model phases

To promote a model through your deployment pipeline, the recommended approach is to copy the model version to a registered model with a different name.

Copy your model version to a new registered model whose name is the current name prefixed with `staging.`.

In [None]:
# Your code here

#### Solution

In [None]:
client.copy_model_version(
    src_model_uri="models:/lenet-landscape-classifier@champion",
    dst_name="staging.lenet-landscape-classifier",
)

## API creation with FastAPI

You can code this part in a new `api.py` file.

### Creating a preprocessing function

Create a function that loads an image the same way as during training.


### Creating a model loading function

Create a function that retrieves the trained model from the MLFlow model registry. This function should take the model URI as input.

### Creating a result class

FastAPI uses the amazing [Pydantic](https://docs.pydantic.dev/latest/) library to handle the types of its inputs and outputs.

Create a class to model the return type of your API using Pydantic. The class should contain at least the predicted class (as a string) and a dictionary mapping each class to its predicted probability.
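
As a starting point, here is a minimal sketch of what such a class could look like. The names `Prediction`, `to_prediction` and `CLASS_NAMES` are illustrative, and the label list is an assumption: use the class names and ordering from your own training code.

```python
from pydantic import BaseModel

# Assumed label set and ordering, for illustration only:
# it must match the output order of your trained model.
CLASS_NAMES = ["buildings", "forest", "glacier", "mountain", "sea", "street"]


class Prediction(BaseModel):
    """Return type of the API: best class plus the full distribution."""

    predicted_class: str
    probabilities: dict[str, float]


def to_prediction(scores: list[float]) -> Prediction:
    """Turn one row of model scores into a Prediction (hypothetical helper)."""
    if len(scores) != len(CLASS_NAMES):
        raise ValueError("expected one score per class")
    probabilities = {name: float(s) for name, s in zip(CLASS_NAMES, scores)}
    # The predicted class is the one with the highest probability.
    best = max(probabilities, key=probabilities.get)
    return Prediction(predicted_class=best, probabilities=probabilities)


example = to_prediction([0.05, 0.70, 0.05, 0.10, 0.05, 0.05])
print(example.predicted_class)  # forest
```

Keeping the conversion in a separate helper means the class itself stays a plain description of the response, which FastAPI can then use to validate and document the output.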

### API creation

Implement a FastAPI POST endpoint that accepts a file as input. [This page can help](https://fastapi.tiangolo.com/tutorial/request-files/).

## Dockerfile creation

Follow the steps on the [FastAPI website](https://fastapi.tiangolo.com/deployment/docker/) to create a Dockerfile for your model.

## Solution

Please see the following [DagsHub repository](https://dagshub.com/m09/landscape-classifier/src/solution) for all questions that do not have a `Solution` section.