## RAFT Finetuning using Azure AI MaaS Serverless

This sample shows how to use `text-classification` components from the `azureml` system registry to fine tune a model to detect emotions using the emotion dataset. [Serverless compute](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-use-serverless-compute?view=azureml-api-2&tabs=python) is used to fine tune the model. We then deploy the fine tuned model to an online endpoint for real time inference. The model is trained on a tiny sample of the dataset with a small number of epochs to illustrate the fine tuning approach.

### Training data
We will use the [emotion](https://huggingface.co/datasets/dair-ai/emotion) dataset.

### Model
This notebook is curated for `Llama` models, which are picked from the `azureml-meta` registry. Here we fine tune the Llama-2-7b model. To fine tune other variants such as 13b or 70b, you can reuse this notebook, probably with different SKUs.

### Outline
* Pick a model to fine tune.
* Pick and explore training data.
* Configure the fine tuning job.
* Run the fine tuning job.
* Review training and evaluation metrics. 
* Register the fine tuned model. 
* Deploy the fine tuned model for real time inference.
* Clean up resources. 

### 1. Setup pre-requisites
* Install dependencies
* Connect to AzureML Workspace. Learn more at [set up SDK authentication](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-setup-authentication?tabs=sdk). Replace  `<WORKSPACE_NAME>`, `<RESOURCE_GROUP>` and `<SUBSCRIPTION_ID>` below.
* Connect to `azureml` system registry
* Set an optional experiment name


Install dependencies by running the cell below. This step is required when running in a new environment.

In [1]:
%pip install azure-ai-ml
%pip install azure-identity
%pip install datasets
%pip install mlflow
%pip install azureml-mlflow

Defaulting to user installation because normal site-packages is not writeable
Note: you may need to restart the kernel to use updated packages.
Defaulting to user installation because normal site-packages is not writeable
Collecting mlflow
  Downloading mlflow-2.15.0-py3-none-any.whl.metadata (29 kB)
Collecting mlflow-skinny==2.15.0 (from mlflow)
  Downloading mlflow_skinny-2.15.0-py3-none-any.whl.metadata (30 kB)
Collecting alembic!=1.10.0,<2 (from mlflow)
  Downloading alembic-1.13.2-py3-none-any.whl.metadata (7.4 kB)
Collecting docker<8,>=4.0.0 (from mlflow)
  Downloading docker-7.1.0-py3-none-any.whl.metadata (3.8 kB)
Collecting graphene<4 (from mlflow)
  Downloading graphene-3.3-py2.py3-none-any.wh

In [1]:
from azure.ai.ml import MLClient
from azure.identity import (
    DefaultAzureCredential,
    InteractiveBrowserCredential,
    ClientSecretCredential,
)
from azure.ai.ml.entities import AmlCompute
import time

try:
    credential = DefaultAzureCredential()
    credential.get_token("https://management.azure.com/.default")
except Exception as ex:
    credential = InteractiveBrowserCredential()

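try:
    # prefer the workspace described by a local config.json
    workspace_ml_client = MLClient.from_config(credential=credential)
except Exception:
    # fall back to explicit workspace details; replace the placeholders below
    workspace_ml_client = MLClient(
        credential,
        subscription_id="<SUBSCRIPTION_ID>",
        resource_group_name="<RESOURCE_GROUP>",
        workspace_name="<WORKSPACE_NAME>",
    )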

# the models, fine tuning pipelines and environments are available in the AzureML system registry, "azureml"
registry_ml_client = MLClient(credential, registry_name="azureml")
registry_ml_client_meta = MLClient(credential, registry_name="azureml-meta")

experiment_name = "raft-llama"

# generating a unique timestamp that can be used for names and versions that need to be unique
timestamp = str(int(time.time()))

Found the config file in: ./config.json
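
`MLClient.from_config` looks for a `config.json` that describes the workspace. As a minimal sketch, assuming the usual three-key layout (`subscription_id`, `resource_group`, `workspace_name`), such a file could be generated as shown here. The helper name `write_workspace_config` is hypothetical and not part of the SDK, and the values are the placeholders from the setup step.

```python
import json
import tempfile
from pathlib import Path


def write_workspace_config(directory, subscription_id, resource_group, workspace_name):
    """Write a config.json in the layout MLClient.from_config is assumed to read."""
    config = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }
    path = Path(directory) / "config.json"
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return path


# Hypothetical usage: written to a temporary folder so nothing is overwritten.
config_path = write_workspace_config(
    tempfile.mkdtemp(), "<SUBSCRIPTION_ID>", "<RESOURCE_GROUP>", "<WORKSPACE_NAME>"
)
print(config_path.read_text(encoding="utf-8"))
```

Placing the generated file next to the notebook (or in a parent folder) is what lets the `from_config` call above succeed without hard-coding workspace details in the code.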


### 2. Pick a foundation model to fine tune

This notebook fine tunes a `Llama` model picked from the `azureml-meta` registry. You can browse the available models in the Model Catalog in AzureML Studio. In this example, we use the `Llama-2-7b` model. If you have opened this notebook for a different model, replace the model name and version accordingly.

Note the model id property of the model. This will be passed as input to the fine tuning job. It is also available as the `Asset ID` field on the model details page in the AzureML Studio Model Catalog.

In [2]:
model_name = "Llama-2-7b"
foundation_model = registry_ml_client_meta.models.get(model_name, label="latest")
print(
    "\n\nUsing model name: {0}, version: {1}, id: {2} for fine tuning".format(
        foundation_model.name, foundation_model.version, foundation_model.id
    )
)



Using model name: Llama-2-7b, version: 22, id: azureml://registries/azureml-meta/models/Llama-2-7b/versions/22 for fine tuning
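
The model id printed above follows a registry URI pattern. The following is a small sketch, assuming ids always take the form `azureml://registries/<registry>/models/<name>/versions/<version>` as in this output; the function `parse_model_asset_id` is a hypothetical helper, not an SDK call. It can be handy for checking that the registry and version are the ones you expect before submitting a job.

```python
import re

# Pattern inferred from the id printed in this notebook (an assumption, not a spec).
ASSET_ID_PATTERN = re.compile(
    r"^azureml://registries/(?P<registry>[^/]+)"
    r"/models/(?P<name>[^/]+)"
    r"/versions/(?P<version>[^/]+)$"
)


def parse_model_asset_id(asset_id):
    """Split a registry model asset id into registry, name and version."""
    match = ASSET_ID_PATTERN.match(asset_id)
    if match is None:
        raise ValueError(f"Unrecognized model asset id: {asset_id!r}")
    return match.groupdict()


parts = parse_model_asset_id(
    "azureml://registries/azureml-meta/models/Llama-2-7b/versions/22"
)
print(parts)
# {'registry': 'azureml-meta', 'name': 'Llama-2-7b', 'version': '22'}
```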


### 3. Pick the dataset for fine-tuning the model

We use the [emotion](https://huggingface.co/datasets/dair-ai/emotion) dataset. The next few cells show basic data preparation for fine tuning:
* Visualize some data rows
* Replace numerical categories in data with the actual string labels. This mapping is available in the [./emotion-dataset/label.json](./emotion-dataset/label.json). This step is needed if you want string labels such as `anger`, `joy`, etc. returned when scoring the model. If you skip this step, the model will return numerical categories such as 0, 1, 2, etc. and you will have to map them to what the category represents yourself. 
* We want this sample to run quickly, so save smaller `train`, `validation` and `test` files containing 10% of the original. This means the fine tuned model will have lower accuracy, hence it should not be put to real-world use. 
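
The label replacement and 10% down-sampling steps listed above can be sketched with the standard library alone. This is an illustration under stated assumptions: `ID2LABEL` is an assumed stand-in for `label.json` (whose contents are not shown here, so verify it against the real file), and `add_label_strings` and `sample_fraction` are hypothetical helper names.

```python
import random

# Assumed stand-in for ./emotion-dataset/label.json; verify against the real file.
ID2LABEL = {0: "sadness", 1: "joy", 2: "love", 3: "anger", 4: "fear", 5: "surprise"}


def add_label_strings(rows, id2label):
    """Return copies of the rows with a human-readable label_string field added."""
    return [{**row, "label_string": id2label[row["label"]]} for row in rows]


def sample_fraction(rows, fraction, seed=0):
    """Return a reproducible random subset holding about `fraction` of the rows."""
    count = max(1, round(len(rows) * fraction))
    return random.Random(seed).sample(rows, count)


# Toy rows built from two emotion-style sentences, repeated to get 20 rows.
rows = [
    {"text": "i feel so blessed to be able to share it with you all", "label": 1},
    {"text": "i feel intimidated nervous and overwhelmed and i shake like a leaf", "label": 4},
] * 10

labeled = add_label_strings(rows, ID2LABEL)
subset = sample_fraction(labeled, 0.1)
print(len(labeled), len(subset))
```

Fixing the seed keeps the small subset stable across runs, which makes metric comparisons between fine tuning attempts easier to interpret.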

##### Here is an example of how the data should look

Single text classification requires the training data to include at least 2 fields, one for ‘Sentence1’ and one for ‘Label’, as in this example. ‘Sentence2’ can be left blank in this case. The examples below are from the Emotion dataset.

| Text (Sentence1) | Label (Label) |
| :- | :- |
| i feel so blessed to be able to share it with you all | joy | 
| i feel intimidated nervous and overwhelmed and i shake like a leaf | fear | 

 

Text pair classification, where two sentences are classified together (e.g., sentence entailment), requires the training data to have 3 fields, for ‘Sentence1’, ‘Sentence2’ and ‘Label’, as in this example. The examples below are from the Microsoft Research Paraphrase Corpus dataset.

| Text1 (Sentence 1) | Text2 (Sentence 2) | Label_text (Label) |
| :- | :- | :- |
| Amrozi accused his brother , whom he called " the witness " , of deliberately distorting his evidence . | Referring to him as only " the witness " , Amrozi accused his brother of deliberately distorting his evidence . | equivalent |
| Yucaipa owned Dominick 's before selling the chain to Safeway in 1998 for $ 2.5 billion . | Yucaipa bought Dominick 's in 1995 for \$ 693 million and sold it to Safeway for \$ 1.8 billion in 1998 . | not equivalent |
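
To make the two layouts concrete, here is a minimal sketch that serializes such rows to JSON Lines and checks that the required fields are present. The key names (`text`, `label`, `text1`, `text2`, `label_text`) are assumptions derived from the table headers above, and `to_jsonl` is a hypothetical helper rather than part of any fine tuning component.

```python
import json


def to_jsonl(records, required_fields):
    """Serialize records to JSON Lines, failing if a required field is missing."""
    lines = []
    for index, record in enumerate(records):
        missing = [field for field in required_fields if field not in record]
        if missing:
            raise ValueError(f"record {index} is missing fields: {missing}")
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines)


# Single text classification: one sentence plus its label.
single_rows = [
    {"text": "i feel so blessed to be able to share it with you all", "label": "joy"},
]

# Text pair classification: two sentences plus a label (sentences abridged here).
pair_rows = [
    {
        "text1": "Amrozi accused his brother of deliberately distorting his evidence .",
        "text2": "Amrozi accused his brother of deliberately distorting his evidence .",
        "label_text": "equivalent",
    },
]

single_jsonl = to_jsonl(single_rows, ["text", "label"])
pair_jsonl = to_jsonl(pair_rows, ["text1", "text2", "label_text"])
print(single_jsonl)
print(pair_jsonl)
```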

 

In [5]:
# load the training split (JSON Lines) into a pandas dataframe and show the first 2 rows
import pandas as pd

ds_name = "wakeboarding-DEMO"
ds_path = f"dataset/{ds_name}"

dataset_path_hf_train = f"{ds_path}-files/{ds_name}-hf.train.jsonl"
dataset_path_hf_valid = f"{ds_path}-files/{ds_name}-hf.valid.jsonl"
dataset_path_hf_eval = f"{ds_path}-files/{ds_name}-hf.eval.jsonl"

pd.set_option(
    "display.max_colwidth", None
)  # None disables column-width truncation so the full text is displayed
df = pd.read_json(dataset_path_hf_train, lines=True)
df.head(2)

Unnamed: 0,id,type,question,context,oracle_context,cot_answer,instruction
0,19afd70c-212c-41ca-8add-62d3ba3e74f2,general,What is a heel side backflip also known as?,"{'sentences': [['The less stretch inEquipment the rope, the better it is for the rider to learn how to ride and do various tricks.', 'This does not necessarily mean the rider is fullyupside down, as evidenced by the Raley, a trick where the rider extends their body parallel to thewater bending at the knees to achieve inversion.[21] The heel side backflip, otherwise known as atantrum, is often regarded as the first and easiest invert to learn as an intermediate level rider. This isbecause the shape of the wake or the “kicker” (a type of floating ramp, used primarily at cable parks,that mimics the shape of a wake) naturally initiates the sequence of motions that are necessary tocomplete the trick.[22]There are many different ways that people are engaged in this sport.', '""Wakeboarding on a ﬂooded road"" (https://www.bbc.co.uk/news/av/uk-25665314). BBC News.Retrieved 2020-11-15.Tricks CommunityRelationship to KitesurﬁngSee alsoReferences', 'Wakeboards can also vary in the amount offins they have which can be up to 5. Fins will make it easier to turn and stay level for beginners butshould be removed for tricks and spins since they can make it easier to catch an edge on a jump orduring a spin.[1] (https://www.wakemakers.com/blogs/resources/wakesurf-fins-explained#:~:text=The%20three%20fin%20setup%20known,to%20make%20big%20bottom%20turns.&text=A%20quad%20fin%20setup%20give%20your%20board%20acceleration%20and%20speed.) Another part ofthe wakeboard itself to consider is the size of the board and the material, which are usually a wood orfoam core with a fiber glass or carbon fiber coating.[14][2] (https://www.saertex.com/en/services/references_overview/referenz/handmade-carbon-wakeboards#:~:text=HANDMADE%20CARBON%20WAKEBOARDS,fiber%20and%20a%20wooden%20core.)At the same time, the length of a wakeboarding rope will vary based on the rider's preferences. 
Alonger rope may allow for more preparation and momentum before performing tricks,[15] whereas ashorter rope may require less speed and height to travel between wakes.[15] In addition to this, thematerial of the rope being used can make a significant difference in ride experience.']], 'title': [['placeholder_title', 'placeholder_title', 'placeholder_title', 'placeholder_title']]}","This does not necessarily mean the rider is fullyupside down, as evidenced by the Raley, a trick where the rider extends their body parallel to thewater bending at the knees to achieve inversion.[21] The heel side backflip, otherwise known as atantrum, is often regarded as the first and easiest invert to learn as an intermediate level rider. This isbecause the shape of the wake or the “kicker” (a type of floating ramp, used primarily at cable parks,that mimics the shape of a wake) naturally initiates the sequence of motions that are necessary tocomplete the trick.[22]There are many different ways that people are engaged in this sport.","To answer the question, we need to identify the alternative name for a heel side backflip.\n\nThe context provided gives us the necessary information to determine this.\n\nFirst, we look for the part of the context that directly mentions the heel side backflip. \n\nWe find it in the sentence: ##begin_quote##The heel side backflip, otherwise known as an tantrum, is often regarded as the first and easiest invert to learn as an intermediate level rider##end_quote##.\n\nFrom this sentence, we understand that the heel side backflip is also known by another name. 
\n\nThe alternative name is explicitly mentioned in the same sentence as ""an tantrum"" (assuming ""an"" is a typo and it should be ""a"").\n\nTherefore, based on the context provided, we can conclude that the heel side backflip is also known as a tantrum.\n\n<ANSWER>: Tantrum","<DOCUMENT>The less stretch inEquipment\nthe rope, the better it is for the rider to learn how to ride and do various tricks.</DOCUMENT>\n<DOCUMENT>This does not necessarily mean the rider is fullyupside down, as evidenced by the Raley, a trick where the rider extends their body parallel to thewater bending at the knees to achieve inversion.[21] The heel side backflip, otherwise known as atantrum, is often regarded as the first and easiest invert to learn as an intermediate level rider. This isbecause the shape of the wake or the “kicker” (a type of floating ramp, used primarily at cable parks,that mimics the shape of a wake) naturally initiates the sequence of motions that are necessary tocomplete the trick.[22]There are many different ways that people are engaged in this sport.</DOCUMENT>\n<DOCUMENT>""Wakeboarding on a ﬂooded road"" (https://www.bbc.co.uk/news/av/uk-25665314). BBC News.Retrieved 2020-11-15.Tricks\nCommunityRelationship to KitesurﬁngSee alsoReferences</DOCUMENT>\n<DOCUMENT>Wakeboards can also vary in the amount offins they have which can be up to 5. Fins will make it easier to turn and stay level for beginners butshould be removed for tricks and spins since they can make it easier to catch an edge on a jump orduring a spin.[1] (https://www.wakemakers.com/blogs/resources/wakesurf-fins-explained#:~:text=The%20three%20fin%20setup%20known,to%20make%20big%20bottom%20turns.&text=A%20quad%20fin%20setup%20give%20your%20board%20acceleration%20and%20speed.) 
Another part ofthe wakeboard itself to consider is the size of the board and the material, which are usually a wood orfoam core with a fiber glass or carbon fiber coating.[14][2] (https://www.saertex.com/en/services/references_overview/referenz/handmade-carbon-wakeboards#:~:text=HANDMADE%20CARBON%20WAKEBOARDS,fiber%20and%20a%20wooden%20core.)At the same time, the length of a wakeboarding rope will vary based on the rider's preferences. Alonger rope may allow for more preparation and momentum before performing tricks,[15] whereas ashorter rope may require less speed and height to travel between wakes.[15] In addition to this, thematerial of the rope being used can make a significant difference in ride experience.</DOCUMENT>\nWhat is a heel side backflip also known as?"
1,0dd9a26b-3653-4a7e-88b2-22ea06edc8fb,general,What is an advanced type of wakeboarding trick?,"{'sentences': [['The latter soon grew in popularity, and a braided rope made out of polypropylene waseventually introduced. A few years later, a rope that was less stretchy was invented that gave the ridera more consistent pull on the rope, though they were heavier and larger in diameter. Modern ropes arecoated with Spectra and Dyneema, which decrease the stretch and drag of the rope.When wakeboarding, a wide variety of safety equipment is used.These include life vests or other buoyancy aids which preventwater-related injuries and deaths, as they keep the wearer buoyantuntil they are picked up. These vests especially help in situationswhere the rider is knocked unconscious or cannot tread water.[12]Furthermore, wakeboarders use water-resistant helmets that areable to ventilate water so that the helmets do not fill up withwater.[12] Even the length of a wakeboarder's rope is important inproviding safety; if the rope is too long, a wakeboarder will land onthe flat part of the wake instead of the downside, and their knees will buckle.Wakeboards themselves are also important to consider when preparing to ride. 
Four of the mostcommon wakeboard styles include the continuous rocker, the three-stage rocker, the five-stage rocker,and the hybrid rocker.[13] Rocker refers to the curve of the board from head to tail and each type ofboard has a different placement and amount of curve that works best for different types of rides.', 'Popular brands like Byerly,Hyperlight, O'Brien and Ronix provide ever evolving technology and equipment as well as qualityadvice for new and seasoned riders alike.', 'Most of these stem from the wakeboarderadjusting their posture, edge, and distribution of weight so as to fly into the air upon hitting thewake.[17] Various tricks include the toeside edge,[17] the heelside edge,[18] the ride switch,[19] and the180° spin.[20]More advanced tricks call for what is known as inversion or “inverts”.', 'A person wakeboarding in Zug, Switzerland WakeboardingWakeboarding is a water sport in which therider, standing on a wakeboard (a board with footbindings), is towed behind a motorboat across itswake and especially up off the crest in order toperform aerial maneuvers.[1] A hallmark ofwakeboarding is the attempted performance ofmidair tricks. Wakeboarding was developed from acombination of water skiing, snowboarding andsurfing techniques.The rider is usually towed by a rope behind a boat,but can also be towed by cable systems andwinches, and be pulled by other motorized vehicles like personal watercraft, cars, trucks, and all-terrain vehicles. 
The gear and wakeboard boat used are often personalized to each rider's liking.Though natural watercourses such as rivers, lakes and areas of open water are generally used inwakeboarding, it is possible to wakeboard in unconventional locations, such as flooded roads and carparks, using a car as the towing vehicle.[2][3][4]Wakeboarding is done for pleasure and competition, ranging from freestyle wakeboarding andwakeboard parks to wakeboard competitions at the X Games, WWA Wakeboard WorldChampionships, or similar events.[5] Though it had an obscure start, it is now the world's fastest-growing water sport.[6]The sport that would later become wake boarding began as water-skiing with mono water skis (singlewater skis) in the USA around 1950. The main difference between mono water skis and contemporarywake boards is the shape of the board and the stance.Wake surfing - using surfboards as mono water skis - began picking up traction around 1964, where itwas seen as an ""exciting new sport that's soon going to sweep the waterways.""[7] Although surfboardswere originally used, boards without straps or bindings were first seen in New Zealand with boardscalled ""skurfboards"".']], 'title': [['placeholder_title', 'placeholder_title', 'placeholder_title', 'placeholder_title']]}","Most of these stem from the wakeboarderadjusting their posture, edge, and distribution of weight so as to fly into the air upon hitting thewake.[17] Various tricks include the toeside edge,[17] the heelside edge,[18] the ride switch,[19] and the180° spin.[20]More advanced tricks call for what is known as inversion or “inverts”.","To answer the question, we need to identify an advanced type of wakeboarding trick. The context provided gives us the necessary information to determine this.\n\nFirst, we look for the part of the context that mentions advanced wakeboarding tricks. 
We find it in the sentence: ##begin_quote##More advanced tricks call for what is known as inversion or “inverts”##end_quote##.\n\nFrom this sentence, we understand that advanced wakeboarding tricks involve ""inversion"" or ""inverts."" This suggests that an advanced type of wakeboarding trick is one that involves inversion.\n\nTherefore, based on the context provided, we can conclude that an advanced type of wakeboarding trick is an invert.\n\n<ANSWER>: Invert","<DOCUMENT>The latter soon grew in popularity, and a braided rope made out of polypropylene waseventually introduced. A few years later, a rope that was less stretchy was invented that gave the ridera more consistent pull on the rope, though they were heavier and larger in diameter. Modern ropes arecoated with Spectra and Dyneema, which decrease the stretch and drag of the rope.When wakeboarding, a wide variety of safety equipment is used.These include life vests or other buoyancy aids which preventwater-related injuries and deaths, as they keep the wearer buoyantuntil they are picked up. These vests especially help in situationswhere the rider is knocked unconscious or cannot tread water.[12]Furthermore, wakeboarders use water-resistant helmets that areable to ventilate water so that the helmets do not fill up withwater.[12] Even the length of a wakeboarder's rope is important inproviding safety; if the rope is too long, a wakeboarder will land onthe flat part of the wake instead of the downside, and their knees will buckle.Wakeboards themselves are also important to consider when preparing to ride. 
Four of the mostcommon wakeboard styles include the continuous rocker, the three-stage rocker, the five-stage rocker,and the hybrid rocker.[13] Rocker refers to the curve of the board from head to tail and each type ofboard has a different placement and amount of curve that works best for different types of rides.</DOCUMENT>\n<DOCUMENT>Popular brands like Byerly,Hyperlight, O'Brien and Ronix provide ever evolving technology and equipment as well as qualityadvice for new and seasoned riders alike.</DOCUMENT>\n<DOCUMENT>Most of these stem from the wakeboarderadjusting their posture, edge, and distribution of weight so as to fly into the air upon hitting thewake.[17] Various tricks include the toeside edge,[17] the heelside edge,[18] the ride switch,[19] and the180° spin.[20]More advanced tricks call for what is known as inversion or “inverts”.</DOCUMENT>\n<DOCUMENT>A person wakeboarding in Zug, Switzerland\nWakeboardingWakeboarding is a water sport in which therider, standing on a wakeboard (a board with footbindings), is towed behind a motorboat across itswake and especially up off the crest in order toperform aerial maneuvers.[1] A hallmark ofwakeboarding is the attempted performance ofmidair tricks. Wakeboarding was developed from acombination of water skiing, snowboarding andsurfing techniques.The rider is usually towed by a rope behind a boat,but can also be towed by cable systems andwinches, and be pulled by other motorized vehicles like personal watercraft, cars, trucks, and all-terrain vehicles. 
The gear and wakeboard boat used are often personalized to each rider's liking.Though natural watercourses such as rivers, lakes and areas of open water are generally used inwakeboarding, it is possible to wakeboard in unconventional locations, such as flooded roads and carparks, using a car as the towing vehicle.[2][3][4]Wakeboarding is done for pleasure and competition, ranging from freestyle wakeboarding andwakeboard parks to wakeboard competitions at the X Games, WWA Wakeboard WorldChampionships, or similar events.[5] Though it had an obscure start, it is now the world's fastest-growing water sport.[6]The sport that would later become wake boarding began as water-skiing with mono water skis (singlewater skis) in the USA around 1950. The main difference between mono water skis and contemporarywake boards is the shape of the board and the stance.Wake surfing - using surfboards as mono water skis - began picking up traction around 1964, where itwas seen as an ""exciting new sport that's soon going to sweep the waterways.""[7] Although surfboardswere originally used, boards without straps or bindings were first seen in New Zealand with boardscalled ""skurfboards"".</DOCUMENT>\nWhat is an advanced type of wakeboarding trick?"
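
The rows above suggest how each record is laid out: the `instruction` column wraps every context chunk in `<DOCUMENT>` tags and ends with the question, while `cot_answer` cites the context between `##begin_quote##` and `##end_quote##` and closes with an `<ANSWER>:` marker. The sketch below mirrors that layout as inferred from these two rows only; the helper names are hypothetical and the strings are abridged toy versions of the sample data.

```python
import re


def build_instruction(documents, question):
    """Wrap each context chunk in <DOCUMENT> tags and append the question."""
    wrapped = "\n".join(f"<DOCUMENT>{doc}</DOCUMENT>" for doc in documents)
    return f"{wrapped}\n{question}"


def extract_quotes(cot_answer):
    """Return every passage cited between ##begin_quote## and ##end_quote##."""
    return re.findall(r"##begin_quote##(.*?)##end_quote##", cot_answer, flags=re.DOTALL)


def extract_final_answer(cot_answer):
    """Return the text after the last <ANSWER>: marker, or None if it is absent."""
    _, marker, tail = cot_answer.rpartition("<ANSWER>:")
    return tail.strip() if marker else None


# Abridged toy strings modeled on the second sample row.
documents = [
    "Popular brands provide equipment and advice for riders.",
    "More advanced tricks call for what is known as inversion or inverts.",
]
question = "What is an advanced type of wakeboarding trick?"
cot_answer = (
    "We find it in the sentence: ##begin_quote##More advanced tricks call for "
    "what is known as inversion or inverts##end_quote##.\n\n<ANSWER>: Invert"
)

instruction = build_instruction(documents, question)
print(instruction)
print(extract_quotes(cot_answer))
print(extract_final_answer(cot_answer))
```

Separating the final answer from the reasoning in this way is useful when scoring model outputs against `cot_answer`, since only the text after `<ANSWER>:` needs to match.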


In [6]:
# load the eval, train and validation splits (JSON Lines) into pandas dataframes
test_df = pd.read_json(dataset_path_hf_eval, lines=True)
train_df = pd.read_json(dataset_path_hf_train, lines=True)
validation_df = pd.read_json(dataset_path_hf_valid, lines=True)
# show the first 2 rows of the train dataframe
train_df.head(2)

Unnamed: 0,id,type,question,context,oracle_context,cot_answer,instruction
0,19afd70c-212c-41ca-8add-62d3ba3e74f2,general,What is a heel side backflip also known as?,"{'sentences': [['The less stretch inEquipment the rope, the better it is for the rider to learn how to ride and do various tricks.', 'This does not necessarily mean the rider is fullyupside down, as evidenced by the Raley, a trick where the rider extends their body parallel to thewater bending at the knees to achieve inversion.[21] The heel side backflip, otherwise known as atantrum, is often regarded as the first and easiest invert to learn as an intermediate level rider. This isbecause the shape of the wake or the “kicker” (a type of floating ramp, used primarily at cable parks,that mimics the shape of a wake) naturally initiates the sequence of motions that are necessary tocomplete the trick.[22]There are many different ways that people are engaged in this sport.', '""Wakeboarding on a ﬂooded road"" (https://www.bbc.co.uk/news/av/uk-25665314). BBC News.Retrieved 2020-11-15.Tricks CommunityRelationship to KitesurﬁngSee alsoReferences', 'Wakeboards can also vary in the amount offins they have which can be up to 5. Fins will make it easier to turn and stay level for beginners butshould be removed for tricks and spins since they can make it easier to catch an edge on a jump orduring a spin.[1] (https://www.wakemakers.com/blogs/resources/wakesurf-fins-explained#:~:text=The%20three%20fin%20setup%20known,to%20make%20big%20bottom%20turns.&text=A%20quad%20fin%20setup%20give%20your%20board%20acceleration%20and%20speed.) Another part ofthe wakeboard itself to consider is the size of the board and the material, which are usually a wood orfoam core with a fiber glass or carbon fiber coating.[14][2] (https://www.saertex.com/en/services/references_overview/referenz/handmade-carbon-wakeboards#:~:text=HANDMADE%20CARBON%20WAKEBOARDS,fiber%20and%20a%20wooden%20core.)At the same time, the length of a wakeboarding rope will vary based on the rider's preferences. 
Alonger rope may allow for more preparation and momentum before performing tricks,[15] whereas ashorter rope may require less speed and height to travel between wakes.[15] In addition to this, thematerial of the rope being used can make a significant difference in ride experience.']], 'title': [['placeholder_title', 'placeholder_title', 'placeholder_title', 'placeholder_title']]}","This does not necessarily mean the rider is fullyupside down, as evidenced by the Raley, a trick where the rider extends their body parallel to thewater bending at the knees to achieve inversion.[21] The heel side backflip, otherwise known as atantrum, is often regarded as the first and easiest invert to learn as an intermediate level rider. This isbecause the shape of the wake or the “kicker” (a type of floating ramp, used primarily at cable parks,that mimics the shape of a wake) naturally initiates the sequence of motions that are necessary tocomplete the trick.[22]There are many different ways that people are engaged in this sport.","To answer the question, we need to identify the alternative name for a heel side backflip.\n\nThe context provided gives us the necessary information to determine this.\n\nFirst, we look for the part of the context that directly mentions the heel side backflip. \n\nWe find it in the sentence: ##begin_quote##The heel side backflip, otherwise known as an tantrum, is often regarded as the first and easiest invert to learn as an intermediate level rider##end_quote##.\n\nFrom this sentence, we understand that the heel side backflip is also known by another name. 
\n\nThe alternative name is explicitly mentioned in the same sentence as ""an tantrum"" (assuming ""an"" is a typo and it should be ""a"").\n\nTherefore, based on the context provided, we can conclude that the heel side backflip is also known as a tantrum.\n\n<ANSWER>: Tantrum","<DOCUMENT>The less stretch inEquipment\nthe rope, the better it is for the rider to learn how to ride and do various tricks.</DOCUMENT>\n<DOCUMENT>This does not necessarily mean the rider is fullyupside down, as evidenced by the Raley, a trick where the rider extends their body parallel to thewater bending at the knees to achieve inversion.[21] The heel side backflip, otherwise known as atantrum, is often regarded as the first and easiest invert to learn as an intermediate level rider. This isbecause the shape of the wake or the “kicker” (a type of floating ramp, used primarily at cable parks,that mimics the shape of a wake) naturally initiates the sequence of motions that are necessary tocomplete the trick.[22]There are many different ways that people are engaged in this sport.</DOCUMENT>\n<DOCUMENT>""Wakeboarding on a ﬂooded road"" (https://www.bbc.co.uk/news/av/uk-25665314). BBC News.Retrieved 2020-11-15.Tricks\nCommunityRelationship to KitesurﬁngSee alsoReferences</DOCUMENT>\n<DOCUMENT>Wakeboards can also vary in the amount offins they have which can be up to 5. Fins will make it easier to turn and stay level for beginners butshould be removed for tricks and spins since they can make it easier to catch an edge on a jump orduring a spin.[1] (https://www.wakemakers.com/blogs/resources/wakesurf-fins-explained#:~:text=The%20three%20fin%20setup%20known,to%20make%20big%20bottom%20turns.&text=A%20quad%20fin%20setup%20give%20your%20board%20acceleration%20and%20speed.) 
Another part ofthe wakeboard itself to consider is the size of the board and the material, which are usually a wood orfoam core with a fiber glass or carbon fiber coating.[14][2] (https://www.saertex.com/en/services/references_overview/referenz/handmade-carbon-wakeboards#:~:text=HANDMADE%20CARBON%20WAKEBOARDS,fiber%20and%20a%20wooden%20core.)At the same time, the length of a wakeboarding rope will vary based on the rider's preferences. Alonger rope may allow for more preparation and momentum before performing tricks,[15] whereas ashorter rope may require less speed and height to travel between wakes.[15] In addition to this, thematerial of the rope being used can make a significant difference in ride experience.</DOCUMENT>\nWhat is a heel side backflip also known as?"
1,0dd9a26b-3653-4a7e-88b2-22ea06edc8fb,general,What is an advanced type of wakeboarding trick?,"{'sentences': [['The latter soon grew in popularity, and a braided rope made out of polypropylene waseventually introduced. A few years later, a rope that was less stretchy was invented that gave the ridera more consistent pull on the rope, though they were heavier and larger in diameter. Modern ropes arecoated with Spectra and Dyneema, which decrease the stretch and drag of the rope.When wakeboarding, a wide variety of safety equipment is used.These include life vests or other buoyancy aids which preventwater-related injuries and deaths, as they keep the wearer buoyantuntil they are picked up. These vests especially help in situationswhere the rider is knocked unconscious or cannot tread water.[12]Furthermore, wakeboarders use water-resistant helmets that areable to ventilate water so that the helmets do not fill up withwater.[12] Even the length of a wakeboarder's rope is important inproviding safety; if the rope is too long, a wakeboarder will land onthe flat part of the wake instead of the downside, and their knees will buckle.Wakeboards themselves are also important to consider when preparing to ride. 
Four of the mostcommon wakeboard styles include the continuous rocker, the three-stage rocker, the five-stage rocker,and the hybrid rocker.[13] Rocker refers to the curve of the board from head to tail and each type ofboard has a different placement and amount of curve that works best for different types of rides.', 'Popular brands like Byerly,Hyperlight, O'Brien and Ronix provide ever evolving technology and equipment as well as qualityadvice for new and seasoned riders alike.', 'Most of these stem from the wakeboarderadjusting their posture, edge, and distribution of weight so as to fly into the air upon hitting thewake.[17] Various tricks include the toeside edge,[17] the heelside edge,[18] the ride switch,[19] and the180° spin.[20]More advanced tricks call for what is known as inversion or “inverts”.', 'A person wakeboarding in Zug, Switzerland WakeboardingWakeboarding is a water sport in which therider, standing on a wakeboard (a board with footbindings), is towed behind a motorboat across itswake and especially up off the crest in order toperform aerial maneuvers.[1] A hallmark ofwakeboarding is the attempted performance ofmidair tricks. Wakeboarding was developed from acombination of water skiing, snowboarding andsurfing techniques.The rider is usually towed by a rope behind a boat,but can also be towed by cable systems andwinches, and be pulled by other motorized vehicles like personal watercraft, cars, trucks, and all-terrain vehicles. 
The gear and wakeboard boat used are often personalized to each rider's liking.Though natural watercourses such as rivers, lakes and areas of open water are generally used inwakeboarding, it is possible to wakeboard in unconventional locations, such as flooded roads and carparks, using a car as the towing vehicle.[2][3][4]Wakeboarding is done for pleasure and competition, ranging from freestyle wakeboarding andwakeboard parks to wakeboard competitions at the X Games, WWA Wakeboard WorldChampionships, or similar events.[5] Though it had an obscure start, it is now the world's fastest-growing water sport.[6]The sport that would later become wake boarding began as water-skiing with mono water skis (singlewater skis) in the USA around 1950. The main difference between mono water skis and contemporarywake boards is the shape of the board and the stance.Wake surfing - using surfboards as mono water skis - began picking up traction around 1964, where itwas seen as an ""exciting new sport that's soon going to sweep the waterways.""[7] Although surfboardswere originally used, boards without straps or bindings were first seen in New Zealand with boardscalled ""skurfboards"".']], 'title': [['placeholder_title', 'placeholder_title', 'placeholder_title', 'placeholder_title']]}","Most of these stem from the wakeboarderadjusting their posture, edge, and distribution of weight so as to fly into the air upon hitting thewake.[17] Various tricks include the toeside edge,[17] the heelside edge,[18] the ride switch,[19] and the180° spin.[20]More advanced tricks call for what is known as inversion or “inverts”.","To answer the question, we need to identify an advanced type of wakeboarding trick. The context provided gives us the necessary information to determine this.\n\nFirst, we look for the part of the context that mentions advanced wakeboarding tricks. 
We find it in the sentence: ##begin_quote##More advanced tricks call for what is known as inversion or “inverts”##end_quote##.\n\nFrom this sentence, we understand that advanced wakeboarding tricks involve ""inversion"" or ""inverts."" This suggests that an advanced type of wakeboarding trick is one that involves inversion.\n\nTherefore, based on the context provided, we can conclude that an advanced type of wakeboarding trick is an invert.\n\n<ANSWER>: Invert","<DOCUMENT>The latter soon grew in popularity, and a braided rope made out of polypropylene waseventually introduced. A few years later, a rope that was less stretchy was invented that gave the ridera more consistent pull on the rope, though they were heavier and larger in diameter. Modern ropes arecoated with Spectra and Dyneema, which decrease the stretch and drag of the rope.When wakeboarding, a wide variety of safety equipment is used.These include life vests or other buoyancy aids which preventwater-related injuries and deaths, as they keep the wearer buoyantuntil they are picked up. These vests especially help in situationswhere the rider is knocked unconscious or cannot tread water.[12]Furthermore, wakeboarders use water-resistant helmets that areable to ventilate water so that the helmets do not fill up withwater.[12] Even the length of a wakeboarder's rope is important inproviding safety; if the rope is too long, a wakeboarder will land onthe flat part of the wake instead of the downside, and their knees will buckle.Wakeboards themselves are also important to consider when preparing to ride. 
Four of the mostcommon wakeboard styles include the continuous rocker, the three-stage rocker, the five-stage rocker,and the hybrid rocker.[13] Rocker refers to the curve of the board from head to tail and each type ofboard has a different placement and amount of curve that works best for different types of rides.</DOCUMENT>\n<DOCUMENT>Popular brands like Byerly,Hyperlight, O'Brien and Ronix provide ever evolving technology and equipment as well as qualityadvice for new and seasoned riders alike.</DOCUMENT>\n<DOCUMENT>Most of these stem from the wakeboarderadjusting their posture, edge, and distribution of weight so as to fly into the air upon hitting thewake.[17] Various tricks include the toeside edge,[17] the heelside edge,[18] the ride switch,[19] and the180° spin.[20]More advanced tricks call for what is known as inversion or “inverts”.</DOCUMENT>\n<DOCUMENT>A person wakeboarding in Zug, Switzerland\nWakeboardingWakeboarding is a water sport in which therider, standing on a wakeboard (a board with footbindings), is towed behind a motorboat across itswake and especially up off the crest in order toperform aerial maneuvers.[1] A hallmark ofwakeboarding is the attempted performance ofmidair tricks. Wakeboarding was developed from acombination of water skiing, snowboarding andsurfing techniques.The rider is usually towed by a rope behind a boat,but can also be towed by cable systems andwinches, and be pulled by other motorized vehicles like personal watercraft, cars, trucks, and all-terrain vehicles. 
The gear and wakeboard boat used are often personalized to each rider's liking.Though natural watercourses such as rivers, lakes and areas of open water are generally used inwakeboarding, it is possible to wakeboard in unconventional locations, such as flooded roads and carparks, using a car as the towing vehicle.[2][3][4]Wakeboarding is done for pleasure and competition, ranging from freestyle wakeboarding andwakeboard parks to wakeboard competitions at the X Games, WWA Wakeboard WorldChampionships, or similar events.[5] Though it had an obscure start, it is now the world's fastest-growing water sport.[6]The sport that would later become wake boarding began as water-skiing with mono water skis (singlewater skis) in the USA around 1950. The main difference between mono water skis and contemporarywake boards is the shape of the board and the stance.Wake surfing - using surfboards as mono water skis - began picking up traction around 1964, where itwas seen as an ""exciting new sport that's soon going to sweep the waterways.""[7] Although surfboardswere originally used, boards without straps or bindings were first seen in New Zealand with boardscalled ""skurfboards"".</DOCUMENT>\nWhat is an advanced type of wakeboarding trick?"


### 4. Submit the fine tuning job using the model and data as inputs
 
Create the job that uses the `text-classification` pipeline component. [Learn more](https://github.com/Azure/azureml-assets/blob/main/training/finetune_acft_hf_nlp/components/pipeline_components/text_classification/README.md) about all the parameters supported for fine tuning.

Define finetune parameters

Finetune parameters can be grouped into two categories: training parameters and optimization parameters.

Training parameters define aspects of the training run, such as:
1. the optimizer and scheduler to use
2. the metric to optimize during finetuning
3. the number of training steps and the batch size

Optimization parameters help optimize GPU memory usage and make effective use of the compute resources. Below are a few of the parameters that belong to this category. _The optimization parameters differ for each model and are packaged with the model to handle these variations._
1. enable DeepSpeed, ORT and LoRA
2. enable mixed precision training
3. enable multi-node training
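
As a minimal sketch of how packaged defaults can be read, the snippet below parses a string-valued tag into a dict and falls back to a default set when the tag is absent. The tag value shown, the `precision` key, and the helper name `resolve_optimization_parameters` are illustrative assumptions, not values taken from any real model.

```python
import ast

# Hypothetical fallback used when the model carries no packaged defaults.
FALLBACK_OPTIMIZATIONS = {
    "apply_lora": "true",
    "apply_deepspeed": "true",
    "apply_ort": "true",
}


def resolve_optimization_parameters(tags):
    """Return the settings stored in a model's tags, or a copy of the fallback."""
    raw = tags.get("model_specific_defaults")
    if raw is None:
        return dict(FALLBACK_OPTIMIZATIONS)
    # literal_eval only accepts Python literals, so it does not execute code
    parsed = ast.literal_eval(raw)
    if not isinstance(parsed, dict):
        raise ValueError("model_specific_defaults must describe a dict")
    return parsed


# Illustrative tag value; real models may package different keys.
example_tags = {"model_specific_defaults": "{'apply_lora': 'true', 'precision': '16'}"}
packaged = resolve_optimization_parameters(example_tags)
defaults = resolve_optimization_parameters({})
print(packaged)
print(defaults)
```

Using `ast.literal_eval` rather than `eval` keeps the parsing restricted to plain literals, which is the safer choice for text read from metadata.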

In [None]:
import ast

# Training parameters
training_parameters = dict(
    num_train_epochs=3,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    learning_rate=2e-5,
    metric_for_best_model="f1_macro",
)
print(f"The following training parameters are enabled - {training_parameters}")

# Optimization parameters - as these parameters are packaged with the model itself, let's retrieve them from the model tags
if "model_specific_defaults" in foundation_model.tags:
    optimization_parameters = ast.literal_eval(
        foundation_model.tags["model_specific_defaults"]
    )  # convert string to python dict
else:
    optimization_parameters = dict(
        apply_lora="true", apply_deepspeed="true", apply_ort="true"
    )
print(f"The following optimizations are enabled - {optimization_parameters}")

In [None]:
from azure.ai.ml.dsl import pipeline
from azure.ai.ml import Input

# fetch the pipeline component
pipeline_component_func = registry_ml_client.components.get(
    name="text_classification_pipeline", label="latest"
)


# define the pipeline job
@pipeline()
def create_pipeline():
    text_classification_pipeline = pipeline_component_func(
        # specify the foundation model from the azureml system registry, identified in step #3
        mlflow_model_path=foundation_model.id,
        # huggingface_id = 'bert-base-uncased', # if you want to use a huggingface model, uncomment this line and comment the above line
        instance_type_finetune="Standard_NC24s_v3",
        instance_type_model_import="Standard_E4s_v3",
        instance_type_preprocess="Standard_E4s_v3",
        instance_type_model_evaluation="Standard_NC24s_v3",
        # map the dataset splits to parameters
        train_file_path=Input(
            type="uri_file", path="./emotion-dataset/small_train.jsonl"
        ),
        validation_file_path=Input(
            type="uri_file", path="./emotion-dataset/small_validation.jsonl"
        ),
        test_file_path=Input(
            type="uri_file", path="./emotion-dataset/small_test.jsonl"
        ),
        evaluation_config=Input(
            type="uri_file", path="./text-classification-config.json"
        ),
        # The following parameters map to the dataset fields
        sentence1_key="text",
        label_key="label_string",
        # Training settings
        number_of_gpu_to_use_finetuning=4,  # set to the number of GPUs available in the compute
        **training_parameters,
        **optimization_parameters
    )
    return {
        # map the output of the fine tuning job to the output of pipeline job so that we can easily register the fine tuned model
        # registering the model is required to deploy the model to an online or batch endpoint
        "trained_model": text_classification_pipeline.outputs.mlflow_model_folder
    }


pipeline_object = create_pipeline()

# don't use cached results from previous jobs
pipeline_object.settings.force_rerun = True

# set continue on step failure to False
pipeline_object.settings.continue_on_step_failure = False
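
The pipeline definition above unpacks two dictionaries into one call. In Python, unpacking two dicts that share a key into the same call raises a `TypeError`, so it can help to check for overlap first. The sketch below is one way to do that; `merge_parameters` is a hypothetical helper and the parameter values are only examples.

```python
def merge_parameters(training, optimization):
    """Merge two keyword dicts, refusing keys that appear in both."""
    overlap = sorted(set(training) & set(optimization))
    if overlap:
        raise ValueError(f"parameters defined in both groups: {overlap}")
    return {**training, **optimization}


training = {"num_train_epochs": 3, "learning_rate": 2e-5}
optimization = {"apply_lora": "true", "apply_deepspeed": "true"}
merged = merge_parameters(training, optimization)
print(merged)

# A key present in both groups is reported instead of failing inside the call.
try:
    merge_parameters(training, {"learning_rate": 1e-4})
    conflict_message = None
except ValueError as err:
    conflict_message = str(err)
print(conflict_message)
```

The merged dict can then be passed as a single `**merged` argument, which gives one clear error message if a model's packaged defaults ever repeat a training parameter.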

Submit the job

In [None]:
# submit the pipeline job
pipeline_job = workspace_ml_client.jobs.create_or_update(
    pipeline_object, experiment_name=experiment_name
)
# wait for the pipeline job to complete
workspace_ml_client.jobs.stream(pipeline_job.name)

### 5. Review training and evaluation metrics
Viewing the job in AzureML studio is the best way to analyze logs, metrics and outputs of jobs. You can create custom charts and compare metrics across different jobs. See https://learn.microsoft.com/en-us/azure/machine-learning/how-to-log-view-metrics?tabs=interactive#view-jobsruns-information-in-the-studio to learn more.

However, we may need to access and review metrics programmatically. For this we will use MLflow, which is the recommended client for logging and querying metrics.

In [None]:
import mlflow, json

mlflow_tracking_uri = workspace_ml_client.workspaces.get(
    workspace_ml_client.workspace_name
).mlflow_tracking_uri
mlflow.set_tracking_uri(mlflow_tracking_uri)
# build the filter "tags.mlflow.rootRunId='<pipeline job name>'" to select the runs of this pipeline job
# (named run_filter to avoid shadowing the built-in `filter`)
run_filter = "tags.mlflow.rootRunId='" + pipeline_job.name + "'"
runs = mlflow.search_runs(
    experiment_names=[experiment_name], filter_string=run_filter, output_format="list"
)
training_run = None
evaluation_run = None
# get the training and evaluation runs.
# using a hacky way till 'Bug 2320997: not able to show eval metrics in FT notebooks - mlflow client now showing display names' is fixed
for run in runs:
    # check if run.data.metrics.epoch exists
    if "epoch" in run.data.metrics:
        training_run = run
    # else, check if run.data.metrics.accuracy exists
    elif "accuracy" in run.data.metrics:
        evaluation_run = run

In [None]:
if training_run:
    print("Training metrics:\n\n")
    print(json.dumps(training_run.data.metrics, indent=2))
else:
    print("No Training job found")

In [None]:
if evaluation_run:
    print("Evaluation metrics:\n\n")
    print(json.dumps(evaluation_run.data.metrics, indent=2))
else:
    print("No Evaluation job found")
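
The run selection used in this section relies on which metric keys each run logged. The sketch below isolates that logic so it can be tried without a workspace. `FakeRun` is a hypothetical stand-in that models only a name and a metrics dict, and the job name and metric values are made up.

```python
from dataclasses import dataclass, field


@dataclass
class FakeRun:
    """Stand-in for an MLflow run; only the fields used below are modeled."""

    name: str
    metrics: dict = field(default_factory=dict)


def build_root_run_filter(root_run_id):
    """Build the filter string that selects child runs of one pipeline job."""
    return f"tags.mlflow.rootRunId='{root_run_id}'"


def split_runs(runs):
    """Pick the training run (logs 'epoch') and the evaluation run (logs 'accuracy')."""
    training_run = None
    evaluation_run = None
    for run in runs:
        if "epoch" in run.metrics:
            training_run = run
        elif "accuracy" in run.metrics:
            evaluation_run = run
    return training_run, evaluation_run


runs = [
    FakeRun("preprocess"),
    FakeRun("finetune", {"epoch": 3.0, "loss": 0.42}),
    FakeRun("evaluate", {"accuracy": 0.8}),
]
run_filter = build_root_run_filter("example_pipeline_job")
training_run, evaluation_run = split_runs(runs)
print(run_filter)
print(training_run.name, evaluation_run.name)
```

Note that a run logging both `epoch` and `accuracy` is treated as the training run, because the `epoch` check comes first.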

### 6. Register the fine tuned model with the workspace

We will register the model from the output of the fine tuning job. This tracks lineage between the fine tuned model and the fine tuning job. The fine tuning job, in turn, tracks lineage to the foundation model, data and training code.

In [None]:
from azure.ai.ml.entities import Model
from azure.ai.ml.constants import AssetTypes

# check if the `trained_model` output is available
print("pipeline job outputs: ", workspace_ml_client.jobs.get(pipeline_job.name).outputs)

# build the path to the `trained_model` output of the pipeline job
model_path_from_job = "azureml://jobs/{0}/outputs/{1}".format(
    pipeline_job.name, "trained_model"
)

finetuned_model_name = model_name + "-emotion-detection"
finetuned_model_name = finetuned_model_name.replace("/", "-")
print("path to register model: ", model_path_from_job)
prepare_to_register_model = Model(
    path=model_path_from_job,
    type=AssetTypes.MLFLOW_MODEL,
    name=finetuned_model_name,
    version=timestamp,  # use timestamp as version to avoid version conflict
    description=model_name + " fine tuned model for emotion detection",
)
print("prepare to register model: \n", prepare_to_register_model)
# register the model from pipeline job output
registered_model = workspace_ml_client.models.create_or_update(
    prepare_to_register_model
)
print("registered model: \n", registered_model)
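
The registration cell derives two strings: the URI of the job output and a registry-safe model name. A small sketch of those two steps, runnable on its own, is shown here. The job name and model name are hypothetical, and the helper names are not part of the SDK.

```python
def job_output_uri(job_name, output_name):
    """Format the azureml:// URI that points at a named output of a job."""
    return "azureml://jobs/{0}/outputs/{1}".format(job_name, output_name)


def finetuned_name(model_name, suffix="-emotion-detection"):
    """Append the task suffix and replace '/' so the name has no path separators."""
    return (model_name + suffix).replace("/", "-")


# Hypothetical identifiers, used only to show the resulting strings.
uri = job_output_uri("example_pipeline_job", "trained_model")
name = finetuned_name("meta/Llama-2-7b")
print(uri)
print(name)
```

Keeping the output name (`trained_model`) identical to the key returned by the pipeline function is what lets the URI resolve to the fine tuned model folder.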

### 7. Deploy the fine tuned model to an online endpoint
Online endpoints give a durable REST API that can be used to integrate with applications that need to use the model.

In [None]:
import time, sys
from azure.ai.ml.entities import (
    ManagedOnlineEndpoint,
    ManagedOnlineDeployment,
    ProbeSettings,
)

# Create online endpoint - endpoint names need to be unique in a region, hence using timestamp to create unique endpoint name

online_endpoint_name = "emotion-" + timestamp
# create an online endpoint
endpoint = ManagedOnlineEndpoint(
    name=online_endpoint_name,
    description="Online endpoint for "
    + registered_model.name
    + ", fine tuned model for emotion detection",
    auth_mode="key",
)
workspace_ml_client.begin_create_or_update(endpoint).wait()

You can find the list of SKUs supported for deployment here: [Managed online endpoints SKU list](https://learn.microsoft.com/en-us/azure/machine-learning/reference-managed-online-endpoints-vm-sku-list)

In [None]:
# create a deployment
demo_deployment = ManagedOnlineDeployment(
    name="demo",
    endpoint_name=online_endpoint_name,
    model=registered_model.id,
    instance_type="Standard_E64s_v3",
    instance_count=1,
    liveness_probe=ProbeSettings(initial_delay=600),
)
workspace_ml_client.online_deployments.begin_create_or_update(demo_deployment).wait()
endpoint.traffic = {"demo": 100}
workspace_ml_client.begin_create_or_update(endpoint).result()

### 8. Test the endpoint with sample data

We will fetch some sample data from the test dataset and submit it to the online endpoint for inference. We will then display the scored labels alongside the ground truth labels.

In [None]:
import pandas as pd

# read ./emotion-dataset/small_test.jsonl into a pandas dataframe
test_df = pd.read_json("./emotion-dataset/small_test.jsonl", lines=True)
# take 5 random samples
test_df = test_df.sample(n=5)
# rebuild index
test_df.reset_index(drop=True, inplace=True)
# rename the label_string column to ground_truth_label
test_df = test_df.rename(columns={"label_string": "ground_truth_label"})
test_df.head(5)

In [None]:
# create a json object with the key "input_data" whose value is the text column of the test dataframe in "split" orientation
test_df_copy = test_df[["text"]]
test_df_copy = test_df_copy.rename(columns={"text": "input_string"})
test_json = {"input_data": test_df_copy.to_dict("split")}
# save the json object to a file named sample_score.json in the ./emotion-dataset folder
with open("./emotion-dataset/sample_score.json", "w") as f:
    json.dump(test_json, f)

In [None]:
# score the sample_score.json file using the online endpoint with the azureml endpoint invoke method
response = workspace_ml_client.online_endpoints.invoke(
    endpoint_name=online_endpoint_name,
    deployment_name="demo",
    request_file="./emotion-dataset/sample_score.json",
)
print("raw response: \n", response, "\n")
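# convert the response to a pandas dataframe and rename the label column as scored_label
# (wrapping the string in StringIO avoids the deprecation warning newer pandas versions emit for literal JSON input)
from io import StringIO

response_df = pd.read_json(StringIO(response))
response_df = response_df.rename(columns={0: "scored_label"})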
response_df.head(5)

In [None]:
# merge the test dataframe and the response dataframe on the index
merged_df = pd.merge(test_df, response_df, left_index=True, right_index=True)
merged_df.head(5)
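
To make the request shape concrete, the sketch below builds the same kind of payload in plain Python, mirroring the `index` / `columns` / `data` layout that pandas produces for the `"split"` orientation, and then counts how many scored labels match the ground truth. The texts and labels are invented for illustration, and `build_split_payload` and `count_matches` are hypothetical helpers.

```python
import json


def build_split_payload(texts, column="input_string"):
    """Build a request body laid out like DataFrame.to_dict('split')."""
    return {
        "input_data": {
            "index": list(range(len(texts))),
            "columns": [column],
            "data": [[text] for text in texts],
        }
    }


def count_matches(ground_truth, scored):
    """Count positions where the scored label equals the ground truth label."""
    if len(ground_truth) != len(scored):
        raise ValueError("label lists must have the same length")
    return sum(1 for truth, label in zip(ground_truth, scored) if truth == label)


# Invented samples; real rows come from small_test.jsonl.
texts = ["i feel great today", "i miss my old friends", "this makes me furious"]
payload = build_split_payload(texts)
body = json.dumps(payload)

ground_truth = ["joy", "sadness", "anger"]
scored = ["joy", "sadness", "fear"]  # made-up endpoint output
matches = count_matches(ground_truth, scored)
print(body)
print(f"{matches} of {len(ground_truth)} labels match")
```

With only five random samples, a match count like this is a sanity check that the endpoint responds sensibly, not a measure of model quality.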

### 9. Delete the online endpoint
Don't forget to delete the online endpoint; otherwise the billing meter keeps running for the compute used by the endpoint.

In [None]:
workspace_ml_client.online_endpoints.begin_delete(name=online_endpoint_name).wait()