#### A Jupyter notebook to log a Hugging Face transformer model with MLflow

This notebook is a simplified version of the [MLflow text-generation tutorial](https://mlflow.org/docs/latest/llms/transformers/tutorials/text-generation/text-generation.html).

In [1]:
# Import libraries
import transformers
import mlflow
import os

#### Transformers pipeline
In the following step, we name the task and define a `transformers` pipeline. Set the `model` parameter to the name of the Hugging Face model you want to use. For this demo, we define a `text2text-generation` task using the model [declare-lab/flan-alpaca-large](https://huggingface.co/declare-lab/flan-alpaca-large).

In [2]:
# Define the task that we want to use (required for proper pipeline construction)
task = "text2text-generation"

# Define the pipeline, using the task and a model instance that is applicable for our task.
generation_pipeline = transformers.pipeline(
    task=task,
    model="declare-lab/flan-alpaca-large",
)




In [11]:
# Try a task name that is not a registered pipeline task.
# "time series" is not supported, so this cell raises the KeyError shown below.
task = "time series"

# Attempt to define a second pipeline with the unsupported task and a time-series model.
generation_pipeline2 = transformers.pipeline(
    task=task,
    model="amazon/chronos-t5-small",
)

KeyError: "Unknown task time series, available tasks are ['audio-classification', 'automatic-speech-recognition', 'conversational', 'depth-estimation', 'document-question-answering', 'feature-extraction', 'fill-mask', 'image-classification', 'image-feature-extraction', 'image-segmentation', 'image-to-image', 'image-to-text', 'mask-generation', 'ner', 'object-detection', 'question-answering', 'sentiment-analysis', 'summarization', 'table-question-answering', 'text-classification', 'text-generation', 'text-to-audio', 'text-to-speech', 'text2text-generation', 'token-classification', 'translation', 'video-classification', 'visual-question-answering', 'vqa', 'zero-shot-audio-classification', 'zero-shot-classification', 'zero-shot-image-classification', 'zero-shot-object-detection', 'translation_XX_to_YY']"
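The `KeyError` above arises because `"time series"` is not one of the task names that `transformers.pipeline` accepts. One way to catch this earlier is to check the task name before building the pipeline. The following is a minimal sketch, not part of the original notebook: the helper `validate_task` and the constant `KNOWN_TASKS` are hypothetical, and the task list is an abbreviated copy of the names in the error message, so it may differ in other `transformers` versions.

```python
import difflib

# Abbreviated copy of the task names listed in the KeyError message above.
# Assumption: the full list depends on the installed transformers version.
KNOWN_TASKS = [
    "fill-mask",
    "question-answering",
    "summarization",
    "text-classification",
    "text-generation",
    "text2text-generation",
    "token-classification",
    "translation",
]


def validate_task(task, known_tasks=KNOWN_TASKS):
    """Return `task` if it is a known pipeline task, otherwise raise ValueError."""
    if task in known_tasks:
        return task
    # Offer near-matches (if any) to help spot typos.
    suggestions = difflib.get_close_matches(task, known_tasks, n=3)
    hint = f" Did you mean one of {suggestions}?" if suggestions else ""
    raise ValueError(f"Unknown task {task!r}.{hint}")


print(validate_task("text2text-generation"))
```

With such a check in place, `validate_task("time series")` fails with a `ValueError` before any model files are downloaded.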

#### MLflow set up
In the following cell, we set the experiment name, the run name, the `artifact_path`, and the name under which the model will be registered.

In [3]:
experiment_name = 'HuggingFace'
run_name = 'test_alpacav3'
artifact_path = 'text_generator'
registered_model_name = 'text_generator'
# Remote location of the S3 bucket (on AWS).
# This should be defined as a custom key in your environment.
s3_bucket = os.environ['CUSTOM_KEY']
# Local directory in which to store the ML experiments.
# This is also the directory that is synced with the
# S3 bucket (see below).
tracking_uri = '/tmp/mlflow/db/'
# Sync all contents from the S3 bucket (remote) to the local directory
os.system(f"aws s3 sync {s3_bucket} {tracking_uri} --quiet")
# Tell MLflow where the ML experiments are stored
mlflow.set_tracking_uri(tracking_uri)
# If experiment_name is not already in use, create the experiment
does_experiment_exist = mlflow.get_experiment_by_name(experiment_name)
if not does_experiment_exist:
    mlflow.create_experiment(experiment_name)
else:
    print(f'Experiment with name {experiment_name} exists. Loading it...')
# Make the experiment (new or existing) the active one for tracking
mlflow.set_experiment(experiment_name)


Experiment with name HuggingFace exists. Loading it...


<Experiment: artifact_location='/tmp/mlflow/db/203237740360877547', creation_time=1715163911885, experiment_id='203237740360877547', last_update_time=1715163911885, lifecycle_stage='active', name='HuggingFace', tags={}>
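The setup cell reads the bucket location with `os.environ['CUSTOM_KEY']`, which raises a bare `KeyError` if the variable is not set. A minimal sketch of a more descriptive check follows; the helper name `require_env` and the example bucket value are hypothetical and not part of the original notebook.

```python
import os


def require_env(name):
    """Return the value of environment variable `name`, or raise a descriptive error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Environment variable {name!r} is not set; "
            "define it before running the notebook."
        )
    return value


# Example only: a made-up bucket location, set here so that the sketch runs on its own.
os.environ.setdefault("CUSTOM_KEY", "s3://example-bucket/mlflow")
s3_bucket = require_env("CUSTOM_KEY")
print(s3_bucket)
```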

#### Log and register the model
Log and register the model with the `mlflow.transformers` module provided by MLflow. We use the function `log_model` to log and register the model. Note that we set the parameter `save_pretrained` to `False` so that MLflow stores only a reference to the model on the Hugging Face Hub instead of a copy of the weights. This is especially useful when the pretrained model is large.

In [4]:
with mlflow.start_run(run_name=run_name) as run:
    model_info = mlflow.transformers.log_model(
        transformers_model=generation_pipeline,
        artifact_path=artifact_path,
        registered_model_name=registered_model_name,
        # input_example=input_example,
        # signature=signature,
        # Save the model in 'reference-only' mode:
        save_pretrained=False,
    )
    # Extract the run_id.
    # (run_name and run_id are different things:
    # a run_id is unique to a run, whereas different runs can share the same run_name.)
    run_id = run.info.run_id
print(run_id)

2024/05/08 14:48:08 INFO mlflow.transformers: Skipping saving pretrained model weights to disk as the save_pretrained is set to False. The reference to HuggingFace Hub repository declare-lab/flan-alpaca-large will be logged instead.



449547886a8c410e8848469de18e9474


Registered model 'text_generator' already exists. Creating a new version of this model...
Created version '6' of model 'text_generator'.


#### Log metrics
Log some dummy metrics to the run created above.

In [6]:
# Create dummy metrics
metrics = {"mse": 2500.00, "rmse": 50.00}

# Log a batch of metrics to the run_id above
with mlflow.start_run(run_id=run_id):
    mlflow.log_metrics(metrics)
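The values above are placeholders. In a real experiment the metrics would be computed from model outputs; note that the RMSE is the square root of the MSE, which the dummy values already respect (50² = 2500). A minimal sketch using only the standard library follows; the function `regression_metrics` and the sample data are illustrative assumptions, not part of the original notebook.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MSE and RMSE for two equally long sequences of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    mse = sum(squared_errors) / len(squared_errors)
    return {"mse": mse, "rmse": math.sqrt(mse)}


# Made-up targets and predictions chosen to reproduce the dummy metrics above.
metrics = regression_metrics(y_true=[100.0, 200.0], y_pred=[150.0, 150.0])
print(metrics)
```

The resulting dictionary maps metric names to floats, which is the shape that `mlflow.log_metrics` expects.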

#### MLflow sync (IMPORTANT)
Once your ML experiment has ended, sync your local copy with the S3 bucket.
Failure to do so will lead to the loss of the experiment logs.

In [7]:
# Sync the local contents with the S3 bucket
os.system(f"aws s3 sync {tracking_uri} {s3_bucket} --quiet")

0
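The `0` printed above is the status returned by `os.system`, where `0` indicates success. A non-zero value would mean that the sync failed, yet the notebook would continue silently. Since a failed sync means lost logs, it may be worth raising an error instead. A minimal sketch, assuming the AWS CLI is installed and on the `PATH`; the helper names `build_sync_command` and `sync_to_s3` and the example bucket are hypothetical.

```python
import subprocess


def build_sync_command(source, destination):
    """Build the argument list for `aws s3 sync` (no shell string interpolation)."""
    return ["aws", "s3", "sync", source, destination, "--quiet"]


def sync_to_s3(source, destination):
    """Run the sync and raise CalledProcessError if the AWS CLI reports a failure."""
    # check=True turns a non-zero exit status into an exception.
    subprocess.run(build_sync_command(source, destination), check=True)


# Show the command that would be run for the local directory used in this notebook.
# "s3://example-bucket/mlflow" is a placeholder for the value of CUSTOM_KEY.
print(build_sync_command("/tmp/mlflow/db/", "s3://example-bucket/mlflow"))
```

In the notebook, the corresponding call would be `sync_to_s3(tracking_uri, s3_bucket)`.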

#### Inference
In this step we load the model back from MLflow for inference purposes.

In [6]:
# Load our pipeline as a generic python function
sentence_generator = mlflow.pyfunc.load_model(model_info.model_uri)
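`model_info.model_uri` is only available in the session that logged the model. MLflow also accepts URIs of the form `runs:/<run_id>/<artifact_path>` and `models:/<name>/<version>`, so the model can be loaded later from the run ID or from the registered name. A minimal sketch of building these strings follows; the helper names are hypothetical, and the values are those printed by the logging cell earlier in this notebook.

```python
def run_model_uri(run_id, artifact_path):
    """URI of a model logged under a run: runs:/<run_id>/<artifact_path>."""
    return f"runs:/{run_id}/{artifact_path}"


def registered_model_uri(name, version):
    """URI of a specific version of a registered model: models:/<name>/<version>."""
    return f"models:/{name}/{version}"


# Values taken from the outputs of the logging cell.
print(run_model_uri("449547886a8c410e8848469de18e9474", "text_generator"))
print(registered_model_uri("text_generator", 6))
```

Either string should be usable with `mlflow.pyfunc.load_model` in place of `model_info.model_uri`, provided the tracking URI is set as in the setup cell.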




In the following step, we define the input `data` (a list of text strings) and make predictions.

In [7]:
# Validate that our loaded pipeline, as a generic pyfunc, can produce an output that makes sense
predictions = sentence_generator.predict(
    data=[
        "What is the capital of Germany?",
        "Please tell me the name of the company running German Railways.",
    ]
)
print(predictions)



['The capital of Germany is Berlin.', 'The company running German Railways is Deutsche Bahn.']
