
# LLMOps
In this example, we will walk through some key steps for taking an LLM-based pipeline to production.  The pipeline summarizes news articles using a pre-trained model from Hugging Face.  The focus of this walkthrough is less on the model itself and more on applying LLMOps practices rigorously.


**Develop an LLM pipeline**

Our LLMOps goals during development are (a) to track what we do carefully for later auditing and reproducibility and (b) to package models or pipelines in a format which will make future deployment easier.  Step-by-step, we will:
* Load data.
* Build an LLM pipeline.
* Test applying the pipeline to data, and log queries and results to MLflow Tracking.
* Log the pipeline to the MLflow Tracking server as an MLflow Model.


**Test the LLM pipeline**

Our LLMOps goals during testing (in the staging or QA stage) are (a) to track the LLM's progress through testing and towards production and (b) to do so programmatically to demonstrate the APIs needed for future CI/CD automation.  Step-by-step, we will:
* Register the pipeline to the MLflow Model Registry.
* Test the pipeline on sample data.
* Promote the registered model (pipeline) to production.

**Create a production workflow for batch inference**

Our LLMOps goals during production are (a) to write scale-out code which can meet scaling demands in the future and (b) to simplify deployment by using MLflow to write model-agnostic deployment code.  Step-by-step, we will:
* Load the latest production LLM pipeline from the Model Registry.
* Apply the pipeline to an Apache Spark DataFrame.
* Append the results to a Delta Lake table.


### Notes about this workflow
**This notebook vs. modular scripts**: Since this demo is in a single notebook, we will divide the workflow from development to production via notebook sections.  In a more realistic LLMOps setup, you would likely have the sections split into separate notebooks or scripts.

**Promoting models vs. code**: We track the path from development to production via the MLflow Model Registry.  That is, we are *promoting models* towards production, rather than promoting code.  For more discussion of these two paradigms, see ["The Big Book of MLOps"](https://www.databricks.com/resources/ebook/the-big-book-of-mlops).

**Learning Objectives**
1. Walk through a simple but realistic workflow to take an LLM pipeline from development to production.
1. Make use of MLflow Tracking and the Model Registry to package and manage the pipeline.
1. Scale out batch inference using Apache Spark and Delta Lake.


For this notebook we'll use the <a href="https://huggingface.co/datasets/xsum" target="_blank">Extreme Summarization (XSum) Dataset</a>  with the <a href="https://huggingface.co/t5-small" target="_blank">T5 Text-To-Text Transfer Transformer</a> from Hugging Face.


## Prepare data

In [2]:
!pip install sacremoses==0.0.53
!pip install openai langchain  transformers huggingface_hub accelerate datasets sentencepiece
from huggingface_hub import login
login("")


In [3]:
from datasets import load_dataset
from transformers import pipeline


In [4]:
xsum_dataset = load_dataset("xsum", version="1.2.0", cache_dir="sample_data/")  # Note: We specify cache_dir to use pre-cached data.
xsum_sample = xsum_dataset["train"].select(range(10))
display(xsum_sample.to_pandas())


| | document | summary | id |
|---|---|---|---|
| 0 | The full cost of damage in Newton Stewart, one... | Clean-up operations are continuing across the ... | 35232142 |
| 1 | A fire alarm went off at the Holiday Inn in Ho... | Two tourist buses have been destroyed by fire ... | 40143035 |
| 2 | Ferrari appeared in a position to challenge un... | Lewis Hamilton stormed to pole position at the... | 35951548 |
| 3 | John Edward Bates, formerly of Spalding, Linco... | A former Lincolnshire Police officer carried o... | 36266422 |
| 4 | Patients and staff were evacuated from Cerahpa... | An armed man who locked himself into a room at... | 38826984 |
| 5 | Simone Favaro got the crucial try with the las... | Defending Pro12 champions Glasgow Warriors bag... | 34540833 |
| 6 | Veronica Vanessa Chango-Alverez, 31, was kille... | A man with links to a car that was involved in... | 20836172 |
| 7 | Belgian cyclist Demoitie died after a collisio... | Welsh cyclist Luke Rowe says changes to the sp... | 35932467 |
| 8 | Gundogan, 26, told BBC Sport he "can see the f... | Manchester City midfielder Ilkay Gundogan says... | 40758845 |
| 9 | The crash happened about 07:20 GMT at the junc... | A jogger has been hit by an unmarked police ca... | 30358490 |


## Develop an LLM pipeline
### Create a Hugging Face pipeline

In [5]:
# Later, we plan to log all of these parameters to MLflow.
# Storing them as variables here will help with that.
hf_model_name = "t5-small"
min_length = 20
max_length = 40
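# Note: `truncation` and `do_sample` are not passed to `pipeline()` below, so this
# local pipeline runs with its defaults for those two settings.  They are applied at
# inference time through the inference configuration saved when the pipeline is logged to MLflow.
truncation = True
do_sample = True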

summarizer = pipeline(model=hf_model_name,
                      task="summarization",
                      min_length=min_length,
                      max_length=max_length)



In [6]:
doc0 = xsum_sample["document"][0]
summary = summarizer(doc0)[0]['summary_text']

Token indices sequence length is longer than the specified maximum sequence length for this model (541 > 512). Running this sequence through the model will result in indexing errors
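
The warning above is raised because the first article tokenizes to 541 tokens, more than the 512 the model expects.  As a rough illustration of what truncation does conceptually, here is a minimal sketch.  The helper `truncate_tokens` and the dummy token list are hypothetical stand-ins; a real tokenizer (SentencePiece for T5) also accounts for special tokens, which this sketch ignores.

```python
def truncate_tokens(tokens, max_length=512):
    """Keep at most `max_length` tokens, dropping the tail of the sequence."""
    return tokens[:max_length]


# Stand-in for a tokenized article: 541 dummy tokens, the count reported in the warning.
tokens = [f"tok{i}" for i in range(541)]
truncated = truncate_tokens(tokens)

print(len(tokens), "->", len(truncated))
```

With truncation enabled, the text beyond the limit never reaches the model, so anything stated only at the end of a long article cannot appear in the summary.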


In [7]:
print(summary)

the full cost of damage in Newton Stewart is still being assessed . many roads in peeblesshire remain badly affected by standing water . a flood alert remains in place across the


### Track LLM development with MLflow

[MLflow](https://mlflow.org/) has a Tracking component that helps you to track exactly how models or pipelines are produced during development.  Although we are not fitting (tuning or training) a model here, we can still make use of tracking to:
* Track example queries and responses to the LLM pipeline, for later review or analysis
* Store the model as an [MLflow Model flavor](https://mlflow.org/docs/latest/models.html#built-in-model-flavors), thus packaging it for simpler deployment


In [8]:
import pandas as pd
results = summarizer(xsum_sample["document"])
df = pd.DataFrame(results, columns=["summary_text"])

In [9]:
display(df)

| | summary_text |
|---|---|
| 0 | the full cost of damage in Newton Stewart is s... |
| 1 | a fire alarm went off at the Holiday Inn in Ho... |
| 2 | stewards only handed reprimand after governing... |
| 3 | the 67-year-old is accused of committing the o... |
| 4 | a man receiving psychiatric treatment at the c... |
| 5 | Gregor Townsend gave a debut to powerhouse fij... |
| 6 | Veronica Vanessa Chango-Alverez, 31, was kille... |
| 7 | the 25-year-old was hit by a motorbike during ... |
| 8 | gundogan will not be fit for the start of the ... |
| 9 | the crash happened about 07:20 GMT at the junc... |


[MLflow Tracking](https://mlflow.org/docs/latest/tracking.html) is organized hierarchically as follows:
* **An [experiment](https://mlflow.org/docs/latest/tracking.html#organizing-runs-in-experiments)** generally corresponds to the creation of 1 primary model or pipeline.  In our case, this is our LLM pipeline.  It contains some number of *runs*.
    * **A [run](https://mlflow.org/docs/latest/tracking.html#organizing-runs-in-experiments)** generally corresponds to the creation of 1 sub-model, such as 1 trial during hyperparameter tuning in traditional ML.  In our case, executing this notebook once will only create 1 run, but a second execution of the notebook will create a second run.  This version tracking can be useful during iterative development.  Each run contains some number of logged parameters, metrics, tags, models, artifacts, and other metadata.
       * **A [parameter](https://mlflow.org/docs/latest/tracking.html#concepts)** is an input to the model or pipeline, such as a regularization parameter in traditional ML or `max_length` for our LLM pipeline.
       * **A [metric](https://mlflow.org/docs/latest/tracking.html#concepts)** is an output of evaluation, such as accuracy or loss.
       * **An [artifact](https://mlflow.org/docs/latest/tracking.html#concepts)** is an arbitrary file stored alongside a run's metadata, such as the serialized model itself.
       * **A [flavor](https://mlflow.org/docs/latest/models.html#storage-format)** is an MLflow format for serializing models.  This format uses the underlying ML library's format (such as PyTorch, TensorFlow, Hugging Face, or your custom format), plus metadata.

MLflow has an API for tracking queries and predictions, [`mlflow.llm.log_predictions()`](https://mlflow.org/docs/latest/python_api/mlflow.llm.html), which we will use below.  Note that, as of MLflow 2.3.1 (Apr 28, 2023), this API is Experimental, so it may change in later releases.  See the [LLM Tracking page](https://mlflow.org/docs/latest/llm-tracking.html) for more information.

 ***Tip***: We wrap our model development workflow with a call to `with mlflow.start_run():`.  This context manager syntax starts and ends the MLflow run explicitly, which is a best practice for code which may be moved to production.  See the [API doc](https://mlflow.org/docs/latest/python_api/mlflow.html#mlflow.start_run) for more information.
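
To illustrate why the context manager form is considered safer, here is a toy sketch in plain Python.  It does not use MLflow; `start_run` and `finished_runs` below are hypothetical stand-ins that only mimic the open-on-entry, close-on-exit behavior of a tracking run, including when the block raises.

```python
from contextlib import contextmanager

finished_runs = []  # toy stand-in for a tracking server


@contextmanager
def start_run(name):
    """Toy analogue of a tracking run: opened on entry, always closed on exit."""
    run = {"name": name, "params": {}, "status": "RUNNING"}
    try:
        yield run
        run["status"] = "FINISHED"
    except Exception:
        run["status"] = "FAILED"
        raise
    finally:
        # Runs are recorded whether or not the body succeeded.
        finished_runs.append(run)


with start_run("ok") as run:
    run["params"]["max_length"] = 40

try:
    with start_run("broken") as run:
        raise ValueError("pipeline crashed")
except ValueError:
    pass

print([(r["name"], r["status"]) for r in finished_runs])
```

Only work done inside the `with` block belongs to the run, and the run is closed even if the body fails, so no run is left dangling in a production job.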



In [10]:
!pip install mlflow


In [11]:
import mlflow

In [12]:
mlflow.set_experiment("guru_llm_experiment")

2023/10/14 12:47:49 INFO mlflow.tracking.fluent: Experiment with name 'guru_llm_experiment' does not exist. Creating a new experiment.


<Experiment: artifact_location='file:///content/mlruns/343285995678365066', creation_time=1697287669198, experiment_id='343285995678365066', last_update_time=1697287669198, lifecycle_stage='active', name='guru_llm_experiment', tags={}>

In [13]:
# Wrap all logging calls in a single `with` block so that the parameters,
# predictions, and model are recorded in the same MLflow run.
with mlflow.start_run():
    # ----------
    # LOG PARAMS
    mlflow.log_params(
        {
            "hf_model_name": hf_model_name,
            "min_length": min_length,
            "max_length": max_length,
            "truncation": truncation,
            "do_sample": do_sample,
        }
    )
    # Metrics and extra files could be logged here too,
    # via mlflow.log_metrics and mlflow.log_artifacts.

    # --------------------------------
    # LOG INPUTS (QUERIES) AND OUTPUTS
    # Logged `inputs` are expected to be a list of str, or a list of str->str dicts.
    results_list = [r["summary_text"] for r in results]
    mlflow.llm.log_predictions(
        inputs=xsum_sample["document"],
        outputs=results_list,
        prompts=["" for _ in results_list],
    )

    # ---------
    # LOG MODEL
    # We next log our LLM pipeline as an MLflow model.
    # This packages the model with useful metadata, such as the library versions used to create it.
    # This metadata makes it much easier to deploy the model downstream.
    # Under the hood, the model format is simply the ML library's native format (Hugging Face for us), plus metadata.

    # It is valuable to log a "signature" with the model telling MLflow the input and output schema for the model.
    signature = mlflow.models.infer_signature(
        xsum_sample["document"][0],
        mlflow.transformers.generate_signature_output(
            summarizer, xsum_sample["document"][0]
        ),
    )
    print(f"Signature:\n{signature}\n")

    # For mlflow.transformers, if there are inference-time configurations,
    # those need to be saved specially in the log_model call (below).
    # This ensures that the pipeline will use these same configurations when re-loaded.
    inference_config = {
        "min_length": min_length,
        "max_length": max_length,
        "truncation": truncation,
        "do_sample": do_sample,
    }

    # Logging a model returns a handle `model_info` to the model metadata in the tracking server.
    # This `model_info` will be useful later in the notebook to retrieve the logged model.
    model_info = mlflow.transformers.log_model(
        transformers_model=summarizer,
        artifact_path="summarizer",
        task="summarization",
        inference_config=inference_config,
        signature=signature,
        input_example="This is an example of a long news article which this pipeline can summarize for you.",
    )


2023/10/14 12:47:49 INFO mlflow.tracking.llm_utils: Creating a new llm_predictions.csv for run 7f8a76ed2d844498bb28d17dd6e802c4.


Signature:
inputs: 
  [string]
outputs: 
  [string]
params: 
  None




### Query the MLflow Tracking server

 **MLflow Tracking API**: We briefly show how to query the logged model and metadata in the MLflow Tracking server, by loading the logged model.  See the [MLflow API](https://mlflow.org/docs/latest/python_api/mlflow.html) for more information about programmatic access.

 **MLflow Tracking UI**: You can also use the UI.  On Databricks, click the beaker icon in the right-hand sidebar to access the MLflow experiments run list, and then click through to the Tracking server UI.  There, you can see the logged metadata and model.  Note in particular that our LLM inputs and outputs have been logged as a CSV file under the run's artifacts.
 ![GIF of MLflow UI](https://files.training.databricks.com/images/llm/llmops.gif)
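
Since the logged queries and responses end up in a CSV file, they can also be reviewed programmatically.  The sketch below uses only the standard library and made-up rows.  The column names `inputs`, `outputs`, and `prompts` are an assumption based on the argument names of `log_predictions`, so check the actual artifact before relying on them.

```python
import csv
import io

# Hypothetical file contents; in practice you would open the downloaded artifact instead.
sample_csv = io.StringIO(
    "inputs,outputs,prompts\n"
    '"A long article about flooding, with commas.",flood alert remains in place,\n'
    "Another article.,,\n"
)

rows = list(csv.DictReader(sample_csv))

# Flag rows where the pipeline returned an empty summary.
empty = [i for i, row in enumerate(rows) if not row["outputs"].strip()]

print(f"{len(rows)} rows, empty outputs at indices {empty}")
```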


In [15]:
loaded_summarizer = mlflow.pyfunc.load_model(model_uri=model_info.model_uri)
loaded_summarizer.predict(xsum_sample["document"][0])

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


['the full cost of damage in Newton Stewart is still being assessed . many roads in peeblesshire remain badly affected by standing water . a flood alert remains in place across the']

In [16]:
!pip install pyngrok --quiet


In [17]:
model_info.model_uri

'runs:/7f8a76ed2d844498bb28d17dd6e802c4/summarizer'

In [18]:
results = loaded_summarizer.predict(xsum_sample.to_pandas()["document"])
display(pd.DataFrame(results, columns=["generated_summary"]))

| | generated_summary |
|---|---|
| 0 | the full cost of damage in Newton Stewart is s... |
| 1 | fire alarm went off at the Holiday Inn in Hope... |
| 2 | Mercedes reprimanded stewards for reversing in... |
| 3 | the 67-year-old is accused of committing the o... |
| 4 | a man receiving treatment at the clinic threat... |
| 5 | Gregor Townsend gave a debut to powerhouse win... |
| 6 | Veronica Vanessa Chango-Alverez, 31, was kille... |
| 7 | the 25-year-old was hit by a motorbike during ... |
| 8 | german german says he "can see the finishing l... |
| 9 | the crash happened about 07:20 GMT at the junc... |


We are now ready to move to the staging step of deployment.  To get started, we will register the model in the MLflow Model Registry (more info below).


In [22]:
# Define the name for the model in the Model Registry.
# We filter out some special characters which cannot be used in model names.
model_name = "summarizer-llm-guru"
model_name = model_name.replace("/", "_").replace(".", "_").replace(":", "_")
print(model_name)

summarizer-llm-guru
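
The name used here contains none of the filtered characters, so it comes out unchanged.  As a small sketch of what the filter does to a name that does contain them, the hypothetical helper `sanitize_model_name` below mirrors the chained `replace()` calls with a single regular expression.  The second input is an invented example, not a name used in this notebook.

```python
import re


def sanitize_model_name(name):
    """Replace '/', '.', and ':' with underscores, mirroring the chained replace() calls."""
    return re.sub(r"[/.:]", "_", name)


print(sanitize_model_name("summarizer-llm-guru"))
print(sanitize_model_name("t5-small/summarizer:v1.0"))
```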


In [23]:
# Register a new model under the given name, or a new model version if the name exists already.
mlflow.register_model(model_uri=model_info.model_uri, name=model_name)

Successfully registered model 'summarizer-llm-guru'.
2023/10/14 13:07:04 INFO mlflow.tracking._model_registry.client: Waiting up to 300 seconds for model version to finish creation. Model name: summarizer-llm-guru, version 1
Created version '1' of model 'summarizer-llm-guru'.


<ModelVersion: aliases=[], creation_timestamp=1697288824643, current_stage='None', description=None, last_updated_timestamp=1697288824643, name='summarizer-llm-guru', run_id='7f8a76ed2d844498bb28d17dd6e802c4', run_link=None, source='file:///content/mlruns/343285995678365066/7f8a76ed2d844498bb28d17dd6e802c4/artifacts/summarizer', status='READY', status_message=None, tags={}, user_id=None, version=1>

## Test the LLM pipeline
 During the Staging step of development, our goal is to move code and/or models from Development to Production.  In order to do so, we must test the code and/or models to make sure they are ready for Production.

 We track our progress here using the [MLflow Model Registry](https://mlflow.org/docs/latest/model-registry.html).  This metadata and model store organizes models as follows:
* **A registered model** is a named model in the registry, in our case corresponding to our summarization model.  It may have multiple *versions*.
    * **A model version** is an instance of a given model.  As you update your model, you will create new versions.  Each version is designated as being in a particular *stage* of deployment.
       * **A stage** is a stage of deployment: `None` (development), `Staging`, `Production`, or `Archived`.

 The model we registered above starts with 1 version in stage `None` (development).
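
To make the registered model, version, and stage relationship concrete, here is a toy in-memory sketch.  It is not MLflow's implementation; `ToyRegisteredModel` and its methods are hypothetical and only mirror the structure described above.

```python
from dataclasses import dataclass, field

STAGES = ("None", "Staging", "Production", "Archived")


@dataclass
class ToyRegisteredModel:
    name: str
    stages: dict = field(default_factory=dict)  # version number -> stage

    def register_version(self):
        """Create the next version; new versions start in stage 'None'."""
        version = len(self.stages) + 1
        self.stages[version] = "None"
        return version

    def transition(self, version, stage):
        """Move an existing version to another stage."""
        if stage not in STAGES:
            raise ValueError(f"unknown stage: {stage}")
        self.stages[version] = stage

    def latest_in_stage(self, stage):
        """Return the highest version number currently in `stage`, or None."""
        matches = [v for v, s in self.stages.items() if s == stage]
        return max(matches) if matches else None


model = ToyRegisteredModel("summarizer-llm-guru")
v1 = model.register_version()
v2 = model.register_version()
model.transition(v1, "Production")

print(model.stages)
print(model.latest_in_stage("Production"), model.latest_in_stage("Staging"))
```

The point of the sketch is that a stage is a label on a version, not a copy of the model: promoting a version changes metadata only.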

 In the workflow below, we will programmatically transition the model from development to staging to production.  For more information on the Model Registry API, see the [Model Registry docs](https://mlflow.org/docs/latest/model-registry.html).  Alternatively, you can edit the registry and make model stage transitions via the UI.  To access the UI, click the Experiments menu option in the left-hand sidebar, and search for your model name.


In [24]:
from mlflow import MlflowClient

client = MlflowClient()
client.search_registered_models(filter_string=f"name = '{model_name}'")

[<RegisteredModel: aliases={}, creation_timestamp=1697288824638, description=None, last_updated_timestamp=1697288824643, latest_versions=[<ModelVersion: aliases=[], creation_timestamp=1697288824643, current_stage='None', description=None, last_updated_timestamp=1697288824643, name='summarizer-llm-guru', run_id='7f8a76ed2d844498bb28d17dd6e802c4', run_link=None, source='file:///content/mlruns/343285995678365066/7f8a76ed2d844498bb28d17dd6e802c4/artifacts/summarizer', status='READY', status_message=None, tags={}, user_id=None, version=1>], name='summarizer-llm-guru', tags={}>]

 In the metadata above, you can see that the model is currently in stage `None` (development).  In this workflow, we will run manual tests, but it would be reasonable to run both automated evaluation and human evaluation in practice.  Once tests pass, we will promote the model to stage `Production` to mark it ready for user-facing applications.
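
As one possible shape for an automated check, the sketch below gates promotion on a simple smoke test.  The function `passes_smoke_test`, the sample summaries, and the 40-word threshold are hypothetical; word count is only a crude proxy for the token-based `max_length`, and a real evaluation would use proper summarization metrics and human review.

```python
def passes_smoke_test(summaries, max_words=40):
    """Return True if there is at least one summary and each is non-empty and short enough."""
    return bool(summaries) and all(
        0 < len(summary.split()) <= max_words for summary in summaries
    )


good = [
    "the full cost of damage is still being assessed .",
    "a fire alarm went off at the hotel .",
]
bad = good + [""]  # one empty summary should fail the gate

print(passes_smoke_test(good), passes_smoke_test(bad))
```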

 *Model URIs*: Below, we use model URIs to tell MLflow which model and version we are referring to.  Two common URI patterns for the MLflow Model Registry are:
 * `f"models:/{model_name}/{model_version}"` to refer to a specific model version by number
 * `f"models:/{model_name}/{model_stage}"` to refer to the latest model version in a given stage
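
Both patterns share the same shape, which the small sketch below makes explicit.  The helper `registry_uri` is hypothetical and only builds the string; loading the model is still done by MLflow.

```python
def registry_uri(model_name, version_or_stage):
    """Build a `models:/<name>/<version or stage>` URI for the Model Registry."""
    return f"models:/{model_name}/{version_or_stage}"


print(registry_uri("summarizer-llm-guru", 1))
print(registry_uri("summarizer-llm-guru", "Production"))
```

Referring to a stage rather than a version number lets deployment code stay unchanged when a newer version is promoted.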

In [25]:
model_version = 1
dev_model = mlflow.pyfunc.load_model(model_uri=f"models:/{model_name}/{model_version}")
dev_model

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


mlflow.pyfunc.loaded_model:
  artifact_path: summarizer
  flavor: mlflow.transformers
  run_id: 7f8a76ed2d844498bb28d17dd6e802c4

 *Note about model dependencies*:
 When you load the model via MLflow above, you may see warnings about the Python environment.  It is very important to ensure that the environments for development, staging, and production match.
 * For this demo notebook, everything is done within the same notebook environment, so we do not need to worry about libraries and versions.  However, in the Production section below, we demonstrate how to pass the `env_manager` argument to the method for loading the saved MLflow model, which tells MLflow what tooling to use to recreate the environment.
 * To create a genuine production job, make sure to install the needed libraries.  MLflow saves these libraries and versions alongside the logged model; see the [MLflow docs on model storage](https://mlflow.org/docs/latest/models.html#storage-format) for more information.  While using Databricks for this course, you can also generate an example inference notebook which includes code for setting up the environment; see [the model inference docs](https://docs.databricks.com/machine-learning/manage-model-lifecycle/index.html#use-model-for-inference) for batch or streaming inference for more information.


 ### Transition to Staging
 We will move the model to stage `Staging` to indicate that we are actively testing it.


In [26]:
client.transition_model_version_stage(model_name, model_version, "staging")

<ModelVersion: aliases=[], creation_timestamp=1697288824643, current_stage='Staging', description=None, last_updated_timestamp=1697291380192, name='summarizer-llm-guru', run_id='7f8a76ed2d844498bb28d17dd6e802c4', run_link=None, source='file:///content/mlruns/343285995678365066/7f8a76ed2d844498bb28d17dd6e802c4/artifacts/summarizer', status='READY', status_message=None, tags={}, user_id=None, version=1>

In [27]:
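# For this demo, reuse the version loaded above, which is now in stage `Staging`.
# A CI/CD job could instead load it by stage with the URI f"models:/{model_name}/Staging".
staging_model = dev_model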
results = staging_model.predict(xsum_sample.to_pandas()["document"])
display(pd.DataFrame(results, columns=["generated_summary"]))

| | generated_summary |
|---|---|
| 0 | many businesses and householders were affected... |
| 1 | fire alarm went off at the Holiday Inn in Hope... |
| 2 | stewards only handed Hamilton a reprimand afte... |
| 3 | the 67-year-old is accused of committing the o... |
| 4 | a man receiving psychiatric treatment at the c... |
| 5 | Gregor Townsend gave a debut to powerhouse win... |
| 6 | Veronica Vanessa Chango-Alverez, 31, was kille... |
| 7 | the 25-year-old was hit by a motorbike during ... |
| 8 | german says he can see the finishing line afte... |
| 9 | the crash happened about 07:20 GMT at the junc... |


 ### Transition to Production
 The results look great!  :) Let's transition the model to Production.


In [28]:
client.transition_model_version_stage(model_name, model_version, "production")

<ModelVersion: aliases=[], creation_timestamp=1697288824643, current_stage='Production', description=None, last_updated_timestamp=1697291617936, name='summarizer-llm-guru', run_id='7f8a76ed2d844498bb28d17dd6e802c4', run_link=None, source='file:///content/mlruns/343285995678365066/7f8a76ed2d844498bb28d17dd6e802c4/artifacts/summarizer', status='READY', status_message=None, tags={}, user_id=None, version=1>