## Fine-Tuning Phi-3 Models - A Python SDK Experience

Learn how to fine-tune the <code>phi-3-mini-4k-instruct</code> model using the Python SDK (a code-first experience). This notebook is based on the Azure Examples GitHub notebook [here](https://github.com/Azure/azureml-examples/blob/main/sdk/python/foundation-models/system/finetune/chat-completion/chat-completion.ipynb), with important modifications for compatibility.

The last successful run is on an AML CPU Compute <code>Standard_D13_v2</code> with Kernel type <code>Python 3.10 - SDK v2</code>.

He Zhang, July. 2024

## Chat Completion - Ultrachat-200K 

This sample shows how to use `chat-completion` components from the `azureml` system registry to fine tune a model to complete a conversation between 2 people using the `ultrachat_200k` dataset. We then deploy the fine tuned model to an online endpoint for real time inference.

### Training data
We will use the [ultrachat_200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k) dataset. This dataset is intended to mimic multi-turn conversation dialogues between 2 people.

### Model
We will use the `phi-3-mini-4k-instruct` model to show how a user can fine tune a model for the chat-completion task. If you opened this notebook from a specific model card, remember to replace the specific model name.

### Outline
* Setup pre-requisites such as compute.
* Pick a model to fine tune.
* Pick and explore training data.
* Configure the fine tuning job.
* Run the fine tuning job.
* Review training and evaluation metrics. 
* Register the fine tuned model. 
* Deploy the fine tuned model for real time inference.
* Clean up resources. 

### 1. Setup pre-requisites
* Install dependencies
* Connect to AzureML Workspace. Learn more at [set up SDK authentication](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-setup-authentication?tabs=sdk). Replace  `<WORKSPACE_NAME>`, `<RESOURCE_GROUP>` and `<SUBSCRIPTION_ID>` below.
* Connect to `azureml` system registry
* Set an optional experiment name
* Check or create compute. A single GPU node can have multiple GPU cards. For example, in one node of `Standard_NC24rs_v3` there are 4 NVIDIA V100 GPUs while in `Standard_NC12s_v3`, there are 2 NVIDIA V100 GPUs. Refer to the [docs](https://learn.microsoft.com/en-us/azure/virtual-machines/sizes-gpu) for this information. The number of GPU cards per node is set in the param `gpus_per_node` below. Setting this value correctly will ensure utilization of all GPUs in the node. The recommended GPU compute SKUs can be found [here](https://learn.microsoft.com/en-us/azure/virtual-machines/ncv3-series) and [here](https://learn.microsoft.com/en-us/azure/virtual-machines/ndv2-series).

Install dependencies by running the cell below. This step is required when running in a new environment.

In [1]:
%pip install azure-ai-ml
%pip install azure-identity
%pip install datasets==2.9.0
%pip install mlflow
%pip install azureml-mlflow

Collecting azure-storage-blob<13.0.0,>=12.10.0 (from azure-ai-ml)
  Downloading azure_storage_blob-12.21.0b1-py3-none-any.whl (393 kB)
Installing collected packages: azure-storage-blob
  Attempting uninstall: azure-storage-blob
    Found existing installation: azure-storage-blob 12.13.0
    Uninstalling azure-storage-blob-12.13.0:
      Successfully uninstalled azure-storage-blob-12.13.0
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
azureml-mlflow 1.51.0 requires azure-storage-blob<=12.13.0,>=12.5.0, but you have azure-storage-blob 12.21.0b1 which is incompatible.
Successfully installed azure-storage-blob-12.21.0b1
Note: you may need to restart the kernel to use updated packages.
Collecting azure-storage-blob<=12.13.0,>=12.5.0 (from azureml-mlflow)
  Using cached azure_storage_blob-12.13.0-py3-none-any.whl (377 kB)
Installing collected packages: azure-storage-blob
  Attempting uninstall: azure-storage-blob
    Found existing installation: azure-storage-blob 12.21.0b1
    Uninstalling azure-storage-blob-12.21.0b1:
      Successfully uninstalled azure-storage-blob-12.21.0b1
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
azure-storage-file-datalake 12.11.0 requires azure-storage-blob<13.0.0,>=12.16.0b1, but you have azure-storage-blob 12.13.0 which is incompatible.
Successfully installed azure-storage-blob-12.13.0
Note: you may need to restart the kernel to use updated packages.

In [2]:
from azure.ai.ml import MLClient
from azure.identity import (
    DefaultAzureCredential,
    InteractiveBrowserCredential,
)
from azure.ai.ml.entities import AmlCompute
import time

try:
    credential = DefaultAzureCredential()
    credential.get_token("https://management.azure.com/.default")
except Exception as ex:
    credential = InteractiveBrowserCredential()

try:
    workspace_ml_client = MLClient.from_config(credential=credential)
except Exception:
    # No usable config file was found, so fall back to explicit workspace details.
    workspace_ml_client = MLClient(
        credential,
        subscription_id="<SUBSCRIPTION_ID>",
        resource_group_name="<RESOURCE_GROUP>",
        workspace_name="<WORKSPACE_NAME>",
    )

# the models, fine tuning pipelines and environments are available in the AzureML registry, "azureml"
registry_ml_client = MLClient(credential, registry_name="azureml")
experiment_name = "chat_completion_Phi-3-mini-4k-instruct"

# generating a unique timestamp that can be used for names and versions that need to be unique
timestamp = str(int(time.time()))

Found the config file in: /config.json
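
`MLClient.from_config` reads the workspace details from a `config.json` file, as the output above shows. A minimal sketch of writing such a file, assuming the three-key layout (`subscription_id`, `resource_group`, `workspace_name`) used by workspace config files downloaded from the Azure portal. The helper name `write_workspace_config` and the placeholder values are illustrative only.

```python
import json
from pathlib import Path


def write_workspace_config(path, subscription_id, resource_group, workspace_name):
    """Write a workspace config file and return the dictionary that was saved."""
    config = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }
    Path(path).write_text(json.dumps(config, indent=4))
    return config


# Example call (placeholders must be replaced with real values):
# write_workspace_config("config.json", "<SUBSCRIPTION_ID>", "<RESOURCE_GROUP>", "<WORKSPACE_NAME>")
```

With a file like this in the notebook's working directory, the `from_config` call should succeed and the explicit `MLClient(...)` fallback would not be needed.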


### 2. Pick a foundation model to fine tune

`Phi-3-mini-4k-instruct` is a lightweight, state-of-the-art open source model with 3.8B parameters, built upon the datasets used for Phi-2. It belongs to the Phi-3 model family; the Mini version comes in two variants, 4K and 128K, named for the context length (in tokens) each can support. To use the model for our specific purpose, we need to fine tune it. You can browse these models in the Model Catalog in the AzureML Studio, filtering by the `chat-completion` task. In this example, we use the `Phi-3-mini-4k-instruct` model. If you have opened this notebook for a different model, replace the model name and version accordingly.

Note the model id property of the model. This will be passed as input to the fine tuning job. It is also available as the `Asset ID` field on the model details page in the AzureML Studio Model Catalog.

In [3]:
model_name = "Phi-3-mini-4k-instruct"
foundation_model = registry_ml_client.models.get(model_name, label="latest")
print(
    "\n\nUsing model name: {0}, version: {1}, id: {2} for fine tuning".format(
        foundation_model.name, foundation_model.version, foundation_model.id
    )
)



Using model name: Phi-3-mini-4k-instruct, version: 8, id: azureml://registries/azureml/models/Phi-3-mini-4k-instruct/versions/8 for fine tuning
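
The model id printed above encodes the registry, the model name, and the version. A small sketch of splitting it into those parts. The regular expression is inferred from the single id shown in the output, not from a published specification, and the helper name `parse_model_asset_id` is hypothetical.

```python
import re

# Pattern inferred from the id printed above (an assumption, not a spec).
ASSET_ID_PATTERN = re.compile(
    r"^azureml://registries/(?P<registry>[^/]+)"
    r"/models/(?P<name>[^/]+)"
    r"/versions/(?P<version>[^/]+)$"
)


def parse_model_asset_id(asset_id):
    """Split a registry model asset id into registry, model name, and version."""
    match = ASSET_ID_PATTERN.match(asset_id)
    if match is None:
        raise ValueError(f"Unrecognized model asset id: {asset_id}")
    return match.groupdict()


parts = parse_model_asset_id(
    "azureml://registries/azureml/models/Phi-3-mini-4k-instruct/versions/8"
)
print(parts)
# {'registry': 'azureml', 'name': 'Phi-3-mini-4k-instruct', 'version': '8'}
```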


### 3. Create a compute to be used with the job

The finetune job works `ONLY` with `GPU` compute. The size of the compute depends on how big the model is and in most cases it becomes tricky to identify the right compute for the job. In this cell, we guide the user to select the right compute for the job.

`NOTE1` The computes listed below work with the most optimized configuration. Any changes to the configuration might lead to a CUDA out-of-memory error. In such cases, try upgrading to a larger compute size.

`NOTE2` While selecting the compute_cluster_size below, make sure the compute is available in your resource group. If a particular compute is not available you can make a request to get access to the compute resources.

In [4]:
import ast

if "finetune_compute_allow_list" in foundation_model.tags: 
    computes_allow_list = ast.literal_eval(
        foundation_model.tags["finetune_compute_allow_list"]
    )  # convert string to python list
    print(f"Please create a compute from the above list - {computes_allow_list}")
else:
    computes_allow_list = None
    print("Computes allow list is not part of model tags")

Please create a compute from the above list - ['Standard_NC24ads_A100_v4', 'Standard_NC48ads_A100_v4', 'Standard_NC96ads_A100_v4', 'Standard_ND96asr_v4', 'Standard_ND96amsr_A100_v4']
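
The `finetune_compute_allow_list` tag is stored as a string, which is why the cell above passes it through `ast.literal_eval`. A standalone sketch of that conversion, with the names lowercased for the case-insensitive comparison used later. The sample tag string is a shortened copy of the output above, and `parse_allow_list` is a hypothetical helper.

```python
import ast


def parse_allow_list(tag_value):
    """Parse a tag string holding a Python list literal into lowercase SKU names."""
    parsed = ast.literal_eval(tag_value)  # evaluates literals only, never arbitrary code
    if not isinstance(parsed, list):
        raise ValueError("Expected the tag to contain a list literal")
    return [str(item).lower() for item in parsed]


tag = "['Standard_NC24ads_A100_v4', 'Standard_NC48ads_A100_v4']"
allowed = parse_allow_list(tag)
print("standard_nc24ads_a100_v4" in allowed)  # True
```

`ast.literal_eval` is preferred over `eval` here because it only accepts Python literals, so a malformed or malicious tag value cannot execute code.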


In [5]:
# If you have a specific compute size to work with change it here. By default we use the one compute from the above list
compute_cluster_size = "Standard_NC96ads_A100_v4"

# If you already have a gpu cluster, mention it here. Else will create a new one with the name 'gpu-cluster-xxx'
compute_cluster = "gpu-cluster-nc96ads-a100-v4"

try:
    compute = workspace_ml_client.compute.get(compute_cluster)
    print("The compute cluster already exists! Reusing it for the current run")
except Exception as ex:
    print(
        f"Looks like the compute cluster doesn't exist. Creating a new one with compute size {compute_cluster_size}!"
    )
    try:
        print("Attempt #1 - Trying to create a dedicated compute")
        compute = AmlCompute(
            name=compute_cluster,
            size=compute_cluster_size,
            tier="Dedicated",
            max_instances=2,  # For multi node training set this to an integer value more than 1
        )
        workspace_ml_client.compute.begin_create_or_update(compute).wait()
    except Exception as e:
        try:
            print(
                "Attempt #2 - Trying to create a low priority compute. Since this is a low priority compute, the job could get pre-empted before completion."
            )
            compute = AmlCompute(
                name=compute_cluster,
                size=compute_cluster_size,
                tier="LowPriority",
                max_instances=2,  # For multi node training set this to an integer value more than 1
            )
            workspace_ml_client.compute.begin_create_or_update(compute).wait()
        except Exception as e:
            print(e)
            raise ValueError(
                f"WARNING! Compute size {compute_cluster_size} not available in workspace"
            )


# Sanity check on the created compute
compute = workspace_ml_client.compute.get(compute_cluster)
if compute.provisioning_state.lower() == "failed":
    raise ValueError(
        f"Provisioning failed, Compute '{compute_cluster}' is in failed state. "
        f"please try creating a different compute"
    )

if computes_allow_list is not None:
    computes_allow_list_lower_case = [x.lower() for x in computes_allow_list]
    if compute.size.lower() not in computes_allow_list_lower_case:
        raise ValueError(
            f"VM size {compute.size} is not in the allow-listed computes for finetuning"
        )
else:
    # Computes with K80 GPUs are not supported
    unsupported_gpu_vm_list = [
        "standard_nc6",
        "standard_nc12",
        "standard_nc24",
        "standard_nc24r",
    ]
    if compute.size.lower() in unsupported_gpu_vm_list:
        raise ValueError(
            f"VM size {compute.size} is currently not supported for finetuning"
        )


# This is the number of GPUs in a single node of the selected 'vm_size' compute.
# Setting this to less than the number of GPUs will result in underutilized GPUs, taking longer to train.
# Setting this to more than the number of GPUs will result in an error.
gpu_count_found = False
workspace_compute_sku_list = workspace_ml_client.compute.list_sizes()
available_sku_sizes = []
for compute_sku in workspace_compute_sku_list:
    available_sku_sizes.append(compute_sku.name)
    if compute_sku.name.lower() == compute.size.lower():
        gpus_per_node = compute_sku.gpus
        gpu_count_found = True
        
# If the GPU count was not found, raise an error
if gpu_count_found:
    print(f"Number of GPU's in compute {compute.size}: {gpus_per_node}")
else:
    raise ValueError(
        f"Number of GPU's in compute {compute.size} not found. Available skus are: {available_sku_sizes}. "
        f"This should not happen. Please check the selected compute cluster: {compute_cluster} and try again."
    )

The compute cluster already exists! Reusing it for the current run
Number of GPU's in compute Standard_NC96ads_A100_v4: 4
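
The lookup above depends on a live call to the workspace. For a quick offline sanity check of `gpus_per_node`, a small table can stand in, as sketched here. The three entries are only the GPU counts stated in this notebook's text and output. The table and the helper name `lookup_gpus_per_node` are illustrative and not part of the SDK.

```python
# GPU counts taken from the text and output of this notebook; extend as needed.
KNOWN_GPUS_PER_NODE = {
    "standard_nc24rs_v3": 4,
    "standard_nc12s_v3": 2,
    "standard_nc96ads_a100_v4": 4,
}


def lookup_gpus_per_node(vm_size, table=KNOWN_GPUS_PER_NODE):
    """Return the GPU count for a VM size, matching names case-insensitively."""
    try:
        return table[vm_size.lower()]
    except KeyError:
        raise ValueError(f"GPU count for {vm_size} is not in the local table") from None


print(lookup_gpus_per_node("Standard_NC96ads_A100_v4"))  # 4
```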


### 4. Pick the dataset for fine-tuning the model

We use the [ultrachat_200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k) dataset. The dataset has four splits, suitable for:
* Supervised fine-tuning (sft).
* Generation ranking (gen).

The number of examples per split is shown as follows:

| train_sft | test_sft | train_gen | test_gen |
| :- | :- | :- | :- |
| 207865 | 23110 | 256032 | 28304 |

The next few cells show basic data preparation for fine tuning:
* Visualize some data rows
* We want this sample to run quickly, so save `train_sft`, `test_sft` files containing 5% of the already trimmed rows. This means the fine tuned model will have lower accuracy, hence it should not be put to real-world use.

> The [download-dataset.py](./download-dataset.py) script downloads the ultrachat_200k dataset and transforms it into a format that the fine-tuning pipeline component can consume. Because the dataset is large, we use only part of it here.

> Running the script below downloads only 5% of the data. This can be increased by changing the `dataset_split_pc` parameter to the desired percentage.

> **Note** : Some language models have different language codes and hence the column names in the dataset should reflect the same.
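
To get a feel for how much data the 5% subsample leaves, the row count can be estimated from the split sizes in the table above. This sketch assumes the fraction is rounded down; the helper script may round differently, and `rows_for_percentage` is a hypothetical name.

```python
# Split sizes copied from the table above.
SPLIT_SIZES = {"train_sft": 207865, "test_sft": 23110}


def rows_for_percentage(total_rows, percent):
    """Number of rows kept when taking `percent` of a split, rounding down."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return total_rows * percent // 100


print(rows_for_percentage(SPLIT_SIZES["train_sft"], 5))  # 10393
```

The result for `train_sft` agrees with the 10393 entries that `df.info()` reports for the downloaded file later in this notebook.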

##### Here is an example of what the data should look like

The chat-completion dataset is stored in parquet format with each entry using the following schema:
``` json
{
    "prompt": "Create a fully-developed protagonist who is challenged to survive within a dystopian society under the rule of a tyrant. ...",
    "messages":[
        {
            "content": "Create a fully-developed protagonist who is challenged to survive within a dystopian society under the rule of a tyrant. ...",
            "role": "user"
        },
        {
            "content": "Name: Ava\n\n Ava was just 16 years old when the world as she knew it came crashing down. The government had collapsed, leaving behind a chaotic and lawless society. ...",
            "role": "assistant"
        },
        {
            "content": "Wow, Ava's story is so intense and inspiring! Can you provide me with more details.  ...",
            "role": "user"
        }, 
        {
            "content": "Certainly! ....",
            "role": "assistant"
        }
    ],
    "prompt_id": "d938b65dfe31f05f80eb8572964c6673eddbd68eff3db6bd234d7f1e3b86c2af"
}
```
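
A quick structural check can catch malformed records before a job is submitted. The sketch below tests one record against the schema shown above. The strict user/assistant alternation is an assumption inferred from the example, not a documented requirement of the fine-tuning component, and `validate_chat_record` is a hypothetical helper.

```python
def validate_chat_record(record):
    """Return a list of problems found in one chat-completion record (empty if none)."""
    problems = []
    for key in ("prompt", "prompt_id", "messages"):
        if key not in record:
            problems.append(f"missing key: {key}")
    messages = record.get("messages") or []
    if "messages" in record and not messages:
        problems.append("messages is empty")
    for index, message in enumerate(messages):
        # Assumed convention: turns alternate, starting with the user.
        expected_role = "user" if index % 2 == 0 else "assistant"
        if message.get("role") != expected_role:
            problems.append(f"message {index}: expected role {expected_role}")
        if not isinstance(message.get("content"), str):
            problems.append(f"message {index}: content must be a string")
    return problems


sample = {
    "prompt": "Hello",
    "prompt_id": "abc123",
    "messages": [
        {"content": "Hello", "role": "user"},
        {"content": "Hi, how can I help?", "role": "assistant"},
    ],
}
print(validate_chat_record(sample))  # []
```

Returning a list of problems instead of raising on the first one makes it easier to report every issue in a record at once.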

In [6]:
# download the dataset using the helper script. This needs datasets library: https://pypi.org/project/datasets/
import os

exit_status = os.system(
    "python ./download-dataset.py --dataset HuggingFaceH4/ultrachat_200k --download_dir ultrachat_200k_dataset --dataset_split_pc 5"
)
if exit_status != 0:
    raise Exception("Error downloading dataset")

Using custom data configuration HuggingFaceH4--ultrachat_200k-f7bf8cac407bd4eb
Using custom data configuration HuggingFaceH4--ultrachat_200k-f7bf8cac407bd4eb
Reusing dataset parquet (/home/azureuser/.cache/huggingface/datasets/HuggingFaceH4___parquet/HuggingFaceH4--ultrachat_200k-f7bf8cac407bd4eb/0.0.0/7328ef7ee03eaf3f86ae40594d46a1cec86161704e02dd19f232d81eee72ade8)
Creating json from Arrow format: 100%|██████████| 2/2 [00:05<00:00,  2.51s/ba]
Using custom data configuration HuggingFaceH4--ultrachat_200k-f7bf8cac407bd4eb
Reusing dataset parquet (/home/azureuser/.cache/huggingface/datasets/HuggingFaceH4___parquet/HuggingFaceH4--ultrachat_200k-f7bf8cac407bd4eb/0.0.0/7328ef7ee03eaf3f86ae40594d46a1cec86161704e02dd19f232d81eee72ade8)
Creating json from Arrow format: 100%|██████████| 1/1 [00:02<00:00,  2.51s/ba]
Using custom data configuration HuggingFaceH4--ultrachat_200k-f7bf8cac407bd4eb
Reusing dataset parquet (/home/azureuser/.cache/huggingface/datasets/HuggingFaceH4___parquet/HuggingFa

In [8]:
# load the ./ultrachat_200k_dataset/train_sft.jsonl file into a pandas dataframe and show the first 5 rows
import pandas as pd

pd.set_option(
    "display.max_colwidth", 0
)  # set the max column width to 0 to display the full text
df = pd.read_json("./ultrachat_200k_dataset/train_sft.jsonl", lines=True)

df.info()
df.head()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10393 entries, 0 to 10392
Data columns (total 3 columns):
 #   Column     Non-Null Count  Dtype 
---  ------     --------------  ----- 
 0   prompt     10393 non-null  object
 1   prompt_id  10393 non-null  object
 2   messages   10393 non-null  object
dtypes: object(3)
memory usage: 243.7+ KB


Unnamed: 0,prompt,prompt_id,messages
0,"These instructions apply to section-based themes (Responsive 6.0+, Retina 4.0+, Parallax 3.0+ Turbo 2.0+, Mobilia 5.0+). What theme version am I using?\nOn your Collections pages & Featured Collections sections, you can easily show the secondary image of a product on hover by enabling one of the theme's built-in settings!\nYour Collection pages & Featured Collections sections will now display the secondary product image just by hovering over that product image thumbnail.\nDoes this feature apply to all sections of the theme or just specific ones as listed in the text material?",f0e37e9f7800261167ce91143f98f511f768847236f133f2d0aed60b444ebe57,"[{'content': 'These instructions apply to section-based themes (Responsive 6.0+, Retina 4.0+, Parallax 3.0+ Turbo 2.0+, Mobilia 5.0+). What theme version am I using? On your Collections pages & Featured Collections sections, you can easily show the secondary image of a product on hover by enabling one of the theme's built-in settings! Your Collection pages & Featured Collections sections will now display the secondary product image just by hovering over that product image thumbnail. Does this feature apply to all sections of the theme or just specific ones as listed in the text material?', 'role': 'user'}, {'content': 'This feature only applies to Collection pages and Featured Collections sections of the section-based themes listed in the text material.', 'role': 'assistant'}, {'content': 'Can you guide me through the process of enabling the secondary image hover feature on my Collection pages and Featured Collections sections?', 'role': 'user'}, {'content': 'Sure, here are the steps to enable the secondary image hover feature on your Collection pages and Featured Collections sections: 1. Log in to your Shopify account and go to your Online Store. 2. Click on Customize theme for the section-based theme you are using. 3. 
Navigate to the Collection pages or Featured Collections section where you want to enable the secondary image hover feature. 4. With the section open, click on the section's setting (gear) icon in the top-left corner. 5. In the settings panel that appears, look for an option labeled 'Image display' or 'Image hover'. 6. If available, select 'Show secondary image on hover'. 7. Save the changes and preview the Collection/Featured Collection page to see the effect. If you're having trouble finding the setting, the best thing to do is refer to your theme's documentation, since the location and labeling of settings can vary between themes.', 'role': 'assistant'}, {'content': 'Can you provide me with a link to the documentation for my theme?', 'role': 'user'}, {'content': 'I don't have access to your store's theme information. However, you can usually find the documentation for your theme by going to the shopify theme store, finding your theme and clicking on the 'support' link located in the bottom right corner of the page. Alternatively, you can do a google search for the name of your theme followed by 'documentation' or 'user guide'.', 'role': 'assistant'}, {'content': 'Can you confirm if this feature also works for the Quick Shop section of my theme?', 'role': 'user'}, {'content': 'The secondary image hover feature may or may not work for your Quick Shop section, depending on the configuration of your theme. Some themes include this feature in the Quick Shop section by default, while others may require additional customization. To check if this feature is available for the Quick Shop section of your theme, follow these steps: 1. Go to the Quick Shop section where you would like to enable the feature. 2. Click on the Quick Shop settings icon (gear icon) and look for 'Image display' or 'Image hover'. 3. If available, select 'Show secondary image on hover'. 4. Save the changes. 
If this option is not available in your Quick Shop section settings, you may need to reach out to your theme developer for assistance with customizing your Quick Shop section to include this feature.', 'role': 'assistant'}]"
1,"Which famous landmarks should I visit in London, beyond the usual ones?",f5025bdcae61bb77fd98a4d6cd6ba8e0199a098cfebcf6830f4a85e0d13a9e21,"[{'content': 'Which famous landmarks should I visit in London, beyond the usual ones?', 'role': 'user'}, {'content': '1. Leadenhall Market - a beautiful indoor market with stunning Victorian architecture, also used as a filming location in the Harry Potter films. 2. St. Dunstan in the East - a ruined church in the middle of the city that has been turned into a beautiful public garden. 3. The Monument - a 202-foot-tall column commemorating the Great Fire of London, with a staircase leading to a viewing platform offering great views of the city. 4. The Camden Town Markets - an eclectic collection of markets offering food, fashion, and vintage items, plus live music and street performers. 5. Novelist's House - the former home of Charles Dickens, now a museum dedicated to his life and works. 6. The Old Operating Theatre - a museum housed in the oldest surviving operating theatre in Europe, with exhibits on the history of surgery and medical practices. 7. The Churchill War Rooms - an underground complex used as a British government command centre during World War II, now a museum dedicated to Winston Churchill and his leadership during the war. 8. The Brunel Museum - a museum dedicated to the life and works of engineer Isambard Kingdom Brunel, with exhibitions on his revolutionary designs for bridges, ships, and tunnels. 9. The Clink Prison Museum - a museum in the former site of the notorious medieval prison, with exhibits on the history of crime and punishment in London. 10. The Royal Observatory Greenwich - an observatory and museum dedicated to astronomy and navigation, with exhibits on the history of timekeeping and space exploration.', 'role': 'assistant'}, {'content': 'Hmm, those are interesting suggestions, but I've already seen so many touristy landmarks in London. Is there something more offbeat that you can recommend? 
Something that locals might know about?', 'role': 'user'}, {'content': 'Absolutely! Here are some offbeat and lesser-known things to see in London that locals might recommend: 1. God's Own Junkyard - a neon wonderland filled with vintage and new neon signs. 2. Hornsey Town Hall Arts Centre - a grade II listed building with a rich history and home to many art exhibitions and events. 3. Clapham Grand - a historic Victorian theatre that has been transformed into a music venue, cinema and a comedy club. 4. Gordon Museum of Pathology - the pathology museum of King's College in London that holds a collection of over 8,000 medical specimens. 5. John Soane's Museum - the former home of wealthy architect John Soane is a museum with eclectic collections of paintings, sculptures, and curiosities. 6. The Viktor Wynd Museum of Curiosities - a small museum filled with strange and unusual objects, taxidermy, and rare books. 7. The Cross Bones Graveyard - a medieval graveyard that became known as an outcast cemetery with a history connected to the sex industry, now a memorial garden. 8. Highgate Cemetery - a beautiful Victorian graveyard known for its Gothic architecture and where many famous people, including Karl Marx, are buried. 9. Dalston Eastern Curve Garden - a beautiful community garden near the Dalston Junction station that is run by volunteers and has regular events. 10. The Geffrye Museum of the Home - a museum that explores the history of domestic interiors in England, with a beautiful garden and regular market events. There are many other hidden gems in London, and a quick Google search for ‘offbeat things in London’ will bring up many blogs and resources with more options.', 'role': 'assistant'}, {'content': 'Wow, these offbeat suggestions are really interesting. I had no idea there were so many unique things to see in London. Thanks for the recommendations, you're much more helpful than I expected. Can you tell me more about the Viktor Wynd Museum of Curiosities? 
It sounds like my kind of place.', 'role': 'user'}, {'content': 'Sure! The Viktor Wynd Museum of Curiosities, Fine Art & Natural History is a small museum located in Hackney, East London. The museum, founded by collector Viktor Wynd is a place where you can find taxidermy, books, anatomical models, and other curiosities. The museum is famous for its collection of shrunken heads, dodo bones, and the famous mermaid. It is also notable for its dimly lit, atmospheric decor, and its bar which serves cocktails and allows visitors to take drinks into the museum. The museum is open daily except for Mondays and Tuesdays, and it also frequently hosts events and exhibitions, so be sure to check the museum's website to see what's on.', 'role': 'assistant'}, {'content': 'Woah, the Viktor Wynd Museum sounds like a wacky and fun place to visit. I don't think I've ever seen a shrunken head before. I wonder what kind of cocktails they serve there... Maybe I can convince some friends to check it out with me!', 'role': 'user'}, {'content': 'Yes, the museum is definitely one of the more unusual places to visit in London! The cocktail menu at Viktor Wynd's bar is just as eclectic as the museum's collection, featuring brightly colored cocktails inspired by the exhibits, such as the ""Shrunken Head"" or ""Mermaid Tail"" cocktails. The museum and bar are on the smaller side, so it would make for a unique and intimate outing with friends. If you're looking for something different and memorable to do in London, the Viktor Wynd Museum is definitely worth checking out!', 'role': 'assistant'}, {'content': 'Whoa, a bar in a museum?! That's wild. I'm definitely going to have to try one of those cocktails. Do you think they have anything with a shrunken head as a garnish? That would be a trip.', 'role': 'user'}, {'content': 'I am not sure if the Viktor Wynd Museum of Curiosities serves cocktails with an actual shrunken head as a garnish. 
However, they might have some creative cocktails that incorporate a shrunken head replica or something similar as an element in the drink. Nevertheless, the bar offers a unique setting and a cocktail menu that includes some unusual ingredients so it's definitely worth checking out! It's likely to be an eccentric and memorable experience, and I'm sure you'll have a great time there with your friends.', 'role': 'assistant'}, {'content': 'I can't wait to see that mermaid at the Viktor Wynd Museum. I wonder if it's real or just a replica. Either way, it's bound to be a fascinating item to behold. And I'll make sure to order that ""Mermaid Tail"" cocktail while I'm there!', 'role': 'user'}, {'content': 'Yes, the mermaid at the Viktor Wynd Museum of Curiosities is definitely one of its most famous exhibits. It's actually a recreation made from the skeleton of a monkey and the tail of a fish, so it's not a real mermaid of course, but it's still an intriguing piece. The museum has many other curiosities to discover as well, and the atmosphere there is unlike anything you'll find in other museums. And don't forget to try that ""Mermaid Tail"" cocktail - it's the perfect accompaniment to the museum's quirky vibe!', 'role': 'assistant'}]"
2,"Write a comprehensive blog post of at least 1000 words about the top 10 most eco-friendly cities in the world and their renewable energy initiatives. Use a formal and informative tone, and include statistics, case studies, and expert opinions to support your claims. Make sure to cover various aspects of sustainability, such as public transportation, waste management, green buildings, and carbon emissions reduction. Additionally, provide specific examples of innovative renewable energy projects implemented in each city and assess their impact on the environment and the local community. Provide credible sources and links to relevant articles, reports, and websites to enhance the article's credibility and appeal to the readers.",6db663a4d2671b41e0038c43c39f79cf909b10987dc5954286228079c2c78207,"[{'content': 'Write a comprehensive blog post of at least 1000 words about the top 10 most eco-friendly cities in the world and their renewable energy initiatives. Use a formal and informative tone, and include statistics, case studies, and expert opinions to support your claims. Make sure to cover various aspects of sustainability, such as public transportation, waste management, green buildings, and carbon emissions reduction. Additionally, provide specific examples of innovative renewable energy projects implemented in each city and assess their impact on the environment and the local community. Provide credible sources and links to relevant articles, reports, and websites to enhance the article's credibility and appeal to the readers.', 'role': 'user'}, {'content': 'In the recent years, there has been a growing awareness of the necessity of developing sustainable cities to combat the detrimental effects of climate change. Cities are responsible for 75% of the world's carbon emissions, but they also have the potential to be leaders in renewable energy and green initiatives. 
Many cities are taking action to reduce their carbon footprint, and ten cities have gone above and beyond in terms of developing eco-friendly initiatives. In this post, we will take a look at the top ten most eco-friendly cities in the world and their renewable energy initiatives. 1. Reykjavik, Iceland Reykjavik is the capital of Iceland, and it is considered one of the most environmentally friendly cities in the world. The city is powered almost entirely by renewable energy, thanks to Iceland's abundant geothermal and hydroelectric power sources. About 90% of Reykjavik's population is connected to district heating, which is generated through geothermal power. The city also has an extensive public transportation system that relies on electric buses and taxis. One of the most innovative renewable energy projects in Reykjavik is the CarbFix project, which involves capturing carbon dioxide emissions and storing them in underground basaltic rock formations. This initiative not only reduces carbon emissions, but it also creates a new type of renewable geothermal energy source. This project has reduced Reykjavik’s carbon footprint significantly and is gaining recognition worldwide. 2. Copenhagen, Denmark Copenhagen is another city that has earned its reputation as a sustainable city. The city's ambitious goal is to be carbon-neutral by 2025, and it has already made significant progress in reducing its carbon emissions. Copenhagen has an extensive bicycle infrastructure, and biking is the most popular mode of transportation in the city. The city also has a robust public transportation system, with most buses running on biogas or electricity. Copenhagen is also home to several innovative energy-generating projects. The city’s Amager Bakke waste-to-energy plant is an excellent example of how to tackle waste management while generating clean energy. The plant incinerates waste, and the heat generated is converted into electricity, which is then supplied to the grid. 
The plant also features a recreational space with a ski slope, hiking trails, and even a climbing wall, making it more than just an energy-generating facility. 3. Stockholm, Sweden Stockholm is another city that has made significant progress in becoming more environmentally friendly. The city has a 100% renewable energy target by 2040, and it is well on its way towards achieving that goal. Like Copenhagen, Stockholm has an extensive bicycle infrastructure, with bikes being the most popular mode of transportation. The city also has an efficient public transportation system, which runs on renewable energy. Stockholm has several innovative projects in the renewable energy sector. One such project is the Värtahamnen district cooling plant, which uses seawater to cool buildings. The cold water is pumped from deep-sea depth and is circulated through the city's cooling system. This system is significantly more energy-efficient than traditional air conditioning methods, and it has reduced Stockholm's carbon emissions significantly. 4. Vancouver, Canada Vancouver is a city known for its natural beauty, and it is also known for its commitment to sustainability. The city aims to be the greenest city in the world by 2020, and it has made significant progress towards that goal. The city has an impressive public transportation system, which includes buses, trains, and ferries, all of which run on renewable energy. Vancouver has been investing in renewable energy initiatives and the green economy, creating new jobs while reducing carbon emissions. It is home to Canada’s largest biomass energy plant, which converts wood waste and sawdust into electricity, reducing the use of fossil fuels. Additionally, the city has launched an ambitious program called Renewable City Strategy, which aims to see the city run on 100% renewable energy before 2050. 5. 
Portland, USA Portland is a city that has long been known for its eco-friendliness, and it has made significant strides in reducing its carbon emissions. The city aims to reduce its carbon emissions by 80% by 2050, and it has implemented several sustainable initiatives. Portland's public transportation system is extensive, and the city has invested heavily in bike lanes and bike-sharing programs, making cycling a popular option for getting around.  Portland is also home to several innovative energy projects. One such project is the Covanta Portland waste-to-energy facility, which converts waste into clean energy. The project generates enough electricity to power more than 27,000 households and reduces the amount of waste sent to landfills significantly. 6. San Francisco, USA San Francisco is a city that has always been at the forefront of sustainability initiatives, and it aims to be a zero-waste city by 2020. The city has an impressive public transportation system, which includes buses, trains, and cable cars, all of which are powered by renewable energy sources. Additionally, San Francisco is home to the largest municipal solar installation in', 'role': 'assistant'}, {'content': 'This is a really informative post about the top eco-friendly cities and their renewable energy initiatives. I would like to know more about their waste management strategies since it is also an important aspect of sustainability. Can you add some more information on that?', 'role': 'user'}, {'content': 'Absolutely! Waste management is an important aspect of sustainability as it helps cities reduce their carbon footprint and ensure that resources are not wasted. Here is some information on the waste management strategies that these cities have implemented: 7. Oslo, Norway Oslo is the capital of Norway and is known for its impressive waste management program. The city is well on its way to becoming a zero-waste city, with over 60% of its waste being recycled or reused. 
Oslo's waste management program involves separating organic waste into biowaste, which is then converted into biogas and biofertilizer. Additionally, the city has implemented a ""pay as you throw"" policy, which has encouraged residents to reduce their waste output. 8. Zurich, Switzerland Zurich has a strong reputation for being environmentally friendly, and its waste management program is no exception. The city has a Zero Waste Strategy, which aims to have no waste sent to landfills by 2025. Zurich has an extensive recycling program, and residents are encouraged to separate their waste into different categories, including paper, glass, metal, and plastic. Organic waste is also collected separately and composted, which is then used to fertilize public parks and gardens. 9. Adelaide, Australia Adelaide is a city that has made significant progress in reducing its carbon emissions, and its waste management program is contributing to that goal. The city has implemented a system where household waste is sorted into three different bins: one for organic waste, one for recyclable materials, and one for general waste. The organic waste is processed into compost, which is then used to fertilize public parks and gardens, while recyclable materials are sent to recycling centers. 10. Singapore Singapore is a city-state that has made waves with its innovative waste management program. The city is known for its high population density, which makes waste management a significant challenge. Singapore has implemented a ""3R"" approach - reduce, reuse, and recycle - to reduce its waste output. The city also has a waste-to-energy plant, which converts waste into electricity that is then supplied to the grid. In conclusion, all the top ten most eco-friendly cities in the world have implemented various strategies for waste management. 
By focusing on sustainable waste management strategies, these cities have been able to reduce their carbon emissions, protect the environment, and promote sustainability.', 'role': 'assistant'}, {'content': 'Thank you for the additional information about waste management strategies in these eco-friendly cities. I was also wondering if you could provide more details on their green building initiatives, such as the use of sustainable materials and designs to reduce energy consumption and carbon emissions.', 'role': 'user'}, {'content': 'Certainly! Green buildings are an important aspect of sustainability, as they can significantly reduce energy consumption and help lower carbon emissions. Here are some examples of green building initiatives in these eco-friendly cities: 1. Reykjavik, Iceland Reykjavik has implemented several programs to promote sustainable building practices. The city has a ""Green Steps"" program, which encourages residents and businesses to adopt sustainable practices. Reykjavik has also mandated that all new buildings must meet strict energy efficiency standards. Buildings are designed to maximize natural light and use sustainable building materials wherever possible. 2. Copenhagen, Denmark Copenhagen has implemented several green building initiatives, including the use of sustainable materials and designs to reduce energy consumption. The city has a program called ""Green Roofs and Walls,"" which promotes the use of vegetation on buildings to improve energy efficiency and reduce the urban heat island effect. Additionally, Copenhagen has implemented a district heating program, which uses waste heat from power plants to heat buildings, reducing energy consumption. 3. Stockholm, Sweden Stockholm has implemented several programs to promote green building practices, including the use of sustainable materials and designs. 
The city has adopted strict energy efficiency standards for new buildings, and it promotes the use of green roofs and walls to improve energy efficiency and reduce the urban heat island effect. Stockholm also encourages the use of solar panels and other renewable energy sources in buildings. 4. Vancouver, Canada Vancouver has implemented several programs to promote sustainable building practices, including the use of sustainable materials and designs. The city has adopted strict green building codes, and it encourages the use of energy-efficient windows, insulation, and lighting. Vancouver also has a program that encourages the use of green roofs and walls to improve energy efficiency and reduce the urban heat island effect. 5. Portland, USA Portland is known for its sustainable building practices, and the city has implemented several green building initiatives. The city has implemented a green building policy, which encourages the use of sustainable materials and designs to reduce energy consumption. Additionally, Portland has a program called ""Ecoroofs,"" which promotes the use of vegetation on roofs to improve energy efficiency and reduce the urban heat island effect. 6. San Francisco, USA San Francisco has implemented several programs to promote green building practices, including the use of sustainable materials and designs. The city has adopted strict energy efficiency standards for new buildings, and it promotes the use of green roofs and walls to improve energy efficiency and reduce the urban heat island effect. Additionally, San Francisco has a program called Zero Net Energy Buildings, which encourages the construction of buildings that generate as much energy as they use. In conclusion, all the top ten most eco-friendly cities in the world have implemented programs to promote sustainable building practices. 
By using sustainable materials and designs, these cities have significantly reduced energy consumption and lowered their carbon emissions.', 'role': 'assistant'}, {'content': 'I'm impressed by the efforts these eco-friendly cities have made to promote sustainability. However, I'm curious about the challenges these cities might have faced during the implementation of their renewable energy and sustainability initiatives. Can you provide me with more details on those challenges and how the cities overcame them?', 'role': 'user'}, {'content': 'Certainly! While these ecologically friendly cities have been successful in implementing renewable energy and sustainability initiatives, they have also faced their fair share of challenges. Here are some examples of challenges faced by these cities and how they overcame them: 1. Reykjavik, Iceland Despite its extensive use of renewable energy, Reykjavik faced the challenge of balancing its energy-intensive industries with sustainability goals. The city overcame this challenge by implementing a carbon capture project, keeping pace with the emission-intensive industries, while still reducing the carbon footprint. The CarbFix project, which I described earlier, proved to be a game-changer, making Reykjavik a city that effectively captured and utilized carbon dioxide. 2. Copenhagen, Denmark Copenhagen faced the challenge of encouraging car drivers to adopt more eco-friendly modes of transportation. The city overcame this challenge by incentivizing bike commuting, providing public transportations, and by raising awareness on the positive impacts of sustainable travel. The city was able to increase the number of bike riders to account for approximately 50% of commuters, contributing to the significant reduction of carbon emissions. 3. 
Stockholm, Sweden Stockholm faced the challenge of finding a way to sustain green infrastructure projects despite the cost, particularly in a city where both land and property values are relatively high. The city overcame this challenge by cooperating with businesses and local community organizations to help defray the costs of infrastructure projects. By doing so, Stockholm effectively secured long-term funding for infrastructure projects that are vital for sustainability. 4. Vancouver, Canada Vancouver faced the challenge of reducing carbon emissions while still maintaining economic growth. The city overcame this challenge by promoting the development of sustainable industries, while at the same time fostering an eco-friendly environment for businesses. By doing so, local businesses were encouraged to adopt more sustainable practices and were recognized for their efforts in promoting sustainability. 5. Portland, USA Portland faced the challenge of dealing with a lack of federal support for sustainability initiatives, particularly during the early stages of their implementation. The city overcame this challenge by forging partnerships with local businesses and organizations to promote sustainability initiatives. By embracing public-private partnerships and working closely with the city's commercial sector, Portland was able to overcome a lack of federal support while still promoting sustainable practices. 6. San Francisco, USA San Francisco faced the challenge of competing for resources with other nearby communities, given its geographically small size. The city overcame this challenge by adopting a collaborative approach with other nearby communities, encouraging them to adopt eco-friendly practices. By doing this, San Francisco was able to leverage the resources of neighboring communities to promote sustainability initiatives throughout the region. 
In conclusion, while these eco-friendly cities have faced their share of challenges in promoting sustainability, they have successfully overcome them. Through collaborating with businesses and community organizations, promoting sustainable practices, and embracing partnerships, these cities have successfully implemented different renewable energy and green initiatives, promoting a more sustainable future.', 'role': 'assistant'}]"
3,"De León, previewing the speech he will give today, said he will highlight his Senate Bill 535, which directs a quarter of the proceeds from the Greenhouse Gas Reduction Fund to projects that benefit disadvantaged communities.\nOn Thursday, de León nodded approvingly as a string of leading scientists and religious leaders gathered for hours of weedy policy discussions on the impacts of climate change, including gloomy predictions on mortality attributable to air pollution.\nSEIU HEADS TO THE BAR: Employees of the State Bar of California represented by SEIU are planning a picket line today at the bar building in Los Angeles to protest the latest contract offer. What is the reason for SEIU employees planning a picket line at the State Bar of California building in Los Angeles?",dd1afba7d2151b0695edea838378c8fd086d538e62a6643f67b24b7afeaf7f19,"[{'content': 'De León, previewing the speech he will give today, said he will highlight his Senate Bill 535, which directs a quarter of the proceeds from the Greenhouse Gas Reduction Fund to projects that benefit disadvantaged communities. On Thursday, de León nodded approvingly as a string of leading scientists and religious leaders gathered for hours of weedy policy discussions on the impacts of climate change, including gloomy predictions on mortality attributable to air pollution. SEIU HEADS TO THE BAR: Employees of the State Bar of California represented by SEIU are planning a picket line today at the bar building in Los Angeles to protest the latest contract offer. 
What is the reason for SEIU employees planning a picket line at the State Bar of California building in Los Angeles?', 'role': 'user'}, {'content': 'The reason for SEIU employees planning a picket line at the State Bar of California building in Los Angeles is to protest the latest contract offer.', 'role': 'assistant'}, {'content': 'Can you provide more details about Senate Bill 535 and how it will benefit disadvantaged communities?', 'role': 'user'}, {'content': 'Sure, Senate Bill 535 is a California law that directs at least 25% of the money generated through the state's Greenhouse Gas Reduction Fund towards projects that benefit disadvantaged communities. The bill is designed to ensure that the benefits of California's climate policies, which include cap-and-trade, are felt most heavily by communities that have been disproportionately impacted by pollution and environmental degradation. The funds generated through the bill are invested in projects that promote clean energy, reduce pollution, and support sustainable communities. This includes initiatives like public transportation, energy efficiency retrofits for low-income homes, and urban forestry projects. Proponents of the bill argue that it helps to level the playing field for low-income communities and communities of color, which have historically faced greater exposure to environmental hazards and pollution. By directing funds towards these communities, the bill aims to create more equitable and sustainable outcomes for all Californians.', 'role': 'assistant'}]"
4,"Write an essay that evaluates the positive and negative influence of social media on personal relationships, citing specific examples and research to support your claims. Analyze the ways in which social media usage affects communication patterns, trust levels, and emotional connections in romantic partnerships, friendships, and family relationships. Consider the role of social comparison, cyberbullying, and privacy concerns in shaping the impact of social media use on personal relationships. Finally, provide recommendations for how individuals can cultivate healthy and meaningful relationships in the age of social media.",cbf683405d8fe0221a42560cec50307d5fa9efa160c49d6d9b2522278f335f56,"[{'content': 'Write an essay that evaluates the positive and negative influence of social media on personal relationships, citing specific examples and research to support your claims. Analyze the ways in which social media usage affects communication patterns, trust levels, and emotional connections in romantic partnerships, friendships, and family relationships. Consider the role of social comparison, cyberbullying, and privacy concerns in shaping the impact of social media use on personal relationships. Finally, provide recommendations for how individuals can cultivate healthy and meaningful relationships in the age of social media.', 'role': 'user'}, {'content': 'Social media is a ubiquitous presence in modern society and has revolutionized the way people communicate with one another. While there are undoubtedly positive aspects of social media usage, there are also significant negative consequences that impact personal relationships. In this essay, I will evaluate the positive and negative influence of social media on personal relationships, citing specific examples and research to support my claims. 
I will analyze the ways in which social media usage affects communication patterns, trust levels, and emotional connections in romantic partnerships, friendships, and family relationships. I will also consider the role that social comparison, cyberbullying, and privacy concerns play in shaping the impact of social media use on personal relationships. Finally, I will provide recommendations for how individuals can cultivate healthy and meaningful relationships in the age of social media. One of the positive aspects of social media is that it allows people to communicate with one another more easily and frequently than ever before. Individuals can maintain long-distance relationships with friends and family members, interact with people who share similar interests or backgrounds, and cultivate new friendships with people from all over the world. Social media platforms like Facebook and Instagram provide an easy way for users to share their lives with others, which can enhance emotional connections and feelings of intimacy in relationships. For example, a recent study found that social media use was positively associated with self-disclosure, which is a key factor in building and maintaining close relationships (Chen & Lee, 2013). However, social media also has negative consequences that can impact personal relationships. One of the most significant concerns is the impact of social comparison, which can lead to feelings of envy, insecurity, and dissatisfaction. Studies have shown that frequent social media use is associated with higher levels of depressive symptoms, and that exposure to idealized and curated images of others can negatively impact self-esteem (Vogel et al., 2014). Additionally, social media can provide a platform for cyberbullying, which can cause emotional distress, anxiety, and even suicide. Cyberbullying can come from strangers, but it can also come from people who are known to the victim, such as peers or romantic partners. 
A recent survey found that one in four teenagers has experienced cyberbullying, and that it can lead to decreased self-esteem and social isolation (Hinduja & Patchin, 2018). Moreover, the overreliance on social media for communication can also have negative effects on personal relationships. In romantic partnerships, excessive social media use has been associated with lower levels of trust and relationship satisfaction (Muise et al., 2009). Social media usage can lead to misinterpretation of messages, and the lack of nonverbal cues can make it difficult to understand the true meaning of a conversation. In families, the overuse of social media can hinder face-to-face communication and physical interaction, which are essential components of healthy relationships. Based on these findings, individuals should take steps to limit the negative impact of social media on their personal relationships. One way to do so is to cultivate meaningful offline interactions with friends and family members. Taking a break from social media can also be beneficial, particularly if it leads to more in-person communication and interaction. Additionally, individuals can be mindful of the content they post on social media, and strive to create realistic and authentic portrayals of their lives. Finally, it is essential to educate oneself on the dangers of cyberbullying and to take active steps to prevent it. In conclusion, social media has both positive and negative influences on personal relationships. While social media can enhance communication and provide a platform for building new relationships, it can also result in social comparison, cyberbullying, and a lack of trust in personal relationships. 
By taking steps to limit the negative effects of social media exposure and cultivating healthy, in-person relationships, individuals can create meaningful and fulfilling connections with others while still enjoying the benefits of social media use.', 'role': 'assistant'}, {'content': 'Can you add more information on the impact of social media on family relationships? Also, can you suggest some practical tips on how to balance social media use and face-to-face communication with our loved ones?', 'role': 'user'}, {'content': 'Sure, here are some additional details on the impact of social media on family relationships. Social media has transformed the way we interact with our families, with both positive and negative implications. While it can provide a platform to stay in touch with family members living far away or separated by distance, it can also hinder close relationships by replacing face-to-face communication with online interactions. Social media has been found to decrease the amount of quality time families spend together physically, which can affect the emotional connections between family members. For example, studies have found that children who spend more time on social media have lower relationship satisfaction with their parents, while parents who spend more time on social media have lower relationship satisfaction with their children (Kross et al., 2020). Moreover, the content shared on social media can sometimes lead to misunderstandings and conflicts in family relationships. Parents may share pictures or posts related to their child without their consent, which can result in embarrassment or frustration. Similarly, siblings may share information about their siblings without their permission, which can strain relationships between siblings. That being said, there are practical tips to balance social media use and face-to-face communication with our loved ones. 
The first step is to set boundaries and establish times when social media will be avoided or limited. For example, putting aside an hour or two a day to disconnect from social media and spend time with family can make a big difference in fostering healthy relationships. Family members can also establish a ""no-phone"" policy during meal times or during important family events such as holidays or vacations. Seeking out opportunities to engage in shared activities can also strengthen family bonds. Activities such as cooking, hiking, or game nights can create opportunities for meaningful interactions and laughter, which can foster stronger emotional connections between family members. Finally, finding ways to use social media to enhance family relationships is important. Tools such as video chat or group messaging platforms can allow family members to stay in touch and share their lives with one another while also maintaining meaningful connections. Overall, balancing social media use with face-to-face communication and shared activities can help individuals cultivate healthy and meaningful family relationships while still enjoying the benefits of social media use.', 'role': 'assistant'}, {'content': 'Can you give some examples of ""shared activities"" that families can do together to strengthen their relationships?', 'role': 'user'}, {'content': 'Sure! Here are a few examples of shared activities that families can do together to strengthen their relationships: 1. Cooking and baking: Preparing meals together is not only a practical activity that can help families save time and money, but it also provides an opportunity for sharing stories, tips, and traditions. Cooking and baking can be a fun way for family members to connect and create memories while enjoying tasty treats. 2. Outdoor activities: Hiking, cycling, or camping are all great ways to get outdoors and enjoy nature. 
Family members can take turns planning and organizing trips or activities, and encourage each other to try new things. Outdoor activities also provide an opportunity to disconnect from technology and spend quality time together. 3. Game nights: Board games, card games, or video games can be a fun way for family members to engage in friendly competition and create shared experiences. Game nights can be an opportunity to learn new games or revisit old favorites, and can be enjoyed by family members of all ages. 4. Movie nights: Watching movies together allows family members to bond over shared interests and experiences. Family members can take turns selecting movies, and share their thoughts and opinions during or after the screening. Movie nights can also be an opportunity to introduce younger family members to classic or new movies that they may not have seen before. 5. Volunteer work: Participating in volunteer work as a family can provide a sense of purpose and fulfillment, while also strengthening relationships. Family members can choose a cause or organization that they feel passionate about, and work together towards a common goal. Volunteering is also a great way to instill values of empathy and compassion in younger family members. Overall, shared activities should be fun, engaging, and flexible enough to accommodate everyone's interests and schedules. By finding ways to spend time together as a family, and creating meaningful connections, individuals can create a positive and healthy family environment.', 'role': 'assistant'}, {'content': 'Can you give some more examples of outdoor activities that families can enjoy together?', 'role': 'user'}, {'content': 'Sure, here are some more examples of outdoor activities that families can enjoy together: 1. Picnics: Going on a picnic is a great way to spend time together outdoors while enjoying a meal or snack. 
Families can pack a basket of their favorite foods and drinks, and find a scenic location in nature, such as a park, beach, or forest. 2. Geocaching: Geocaching is an outdoor treasure hunt that uses GPS coordinates to locate hidden ""caches"" containing small treasures or messages. Families can work together to discover new caches and share their finds with one another. 3. Nature walks: Taking a leisurely stroll in nature can be a great way to relax and bond as a family. Families can choose a trail or park to explore, and spend time observing and discussing the plants, animals, and natural features they encounter along the way. 4. Water activities: If a family lives near a lake, river, or ocean, they can enjoy activities such as swimming, kayaking, or paddleboarding. These activities provide a fun and exciting challenge that the whole family can enjoy together. 5. Gardening: Gardening is a great way to connect with nature and engage in a productive, creative activity. Families can work together to plan, plant, and care for a garden, and enjoy the fruits of their labor as a family. Overall, outdoor activities provide an excellent opportunity for families to spend quality time together while enjoying the benefits of fresh air, exercise, and nature. Families can choose activities that suit their interests, skills, and comfort levels, and use them as a way to build strong and lasting relationships.', 'role': 'assistant'}]"


In [19]:
# Check one example conversation from the training data
import json

json_str = json.dumps(df.iloc[0]["messages"], indent=4)
print(json_str)

[
    {
        "content": "These instructions apply to section-based themes (Responsive 6.0+, Retina 4.0+, Parallax 3.0+ Turbo 2.0+, Mobilia 5.0+). What theme version am I using?\nOn your Collections pages & Featured Collections sections, you can easily show the secondary image of a product on hover by enabling one of the theme's built-in settings!\nYour Collection pages & Featured Collections sections will now display the secondary product image just by hovering over that product image thumbnail.\nDoes this feature apply to all sections of the theme or just specific ones as listed in the text material?",
        "role": "user"
    },
    {
        "content": "This feature only applies to Collection pages and Featured Collections sections of the section-based themes listed in the text material.",
        "role": "assistant"
    },
    {
        "content": "Can you guide me through the process of enabling the secondary image hover feature on my Collection pages and Featured Collections
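Before submitting a job, it can help to confirm that every conversation has the shape shown above: a list of turns, each with a `role` and a non-empty `content` string. The following is a minimal sketch of such a check. The helper name `validate_messages` and the set of allowed roles are assumptions for illustration; they are not part of the notebook or of the fine tuning component.

```python
def validate_messages(messages):
    """Return a list of problems found in one conversation (empty if none)."""
    # Assumption: only these roles are expected; the sample data shows
    # "user" and "assistant".
    allowed_roles = {"system", "user", "assistant"}
    problems = []
    for i, turn in enumerate(messages):
        if not isinstance(turn, dict):
            problems.append(f"turn {i}: expected a dict, got {type(turn).__name__}")
            continue
        if turn.get("role") not in allowed_roles:
            problems.append(f"turn {i}: unexpected role {turn.get('role')!r}")
        content = turn.get("content")
        if not isinstance(content, str) or not content.strip():
            problems.append(f"turn {i}: missing or empty content")
    return problems


# A shortened, hand-written conversation in the same shape as the output above.
sample = [
    {"content": "What theme version am I using?", "role": "user"},
    {"content": "This feature only applies to Collection pages.", "role": "assistant"},
]
print(validate_messages(sample))
```

With the notebook's dataframe, the same helper could be applied row by row, for example `df["messages"].apply(validate_messages)`, assuming the `messages` column holds lists of dicts as printed above.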

### 5. Submit the fine tuning job using the model and data as inputs
 
Create the job that uses the `text-generation` pipeline component. [Learn more](https://github.com/Azure/azureml-assets/blob/main/assets/training/finetune_acft_hf_nlp/components/pipeline_components/text_generation/README.md) about all the parameters supported for fine tuning.

Define finetune parameters

Finetune parameters fall into two categories: training parameters and optimization parameters.

Training parameters define the training aspects, such as:
1. the optimizer and scheduler to use
2. the metric to optimize during finetuning
3. the number of training steps and the batch size

Optimization parameters help reduce GPU memory usage and make effective use of the compute resources. Below are a few of the parameters in this category. _The optimization parameters differ for each model and are packaged with the model to handle these variations._
1. enable DeepSpeed, ORT and LoRA
2. enable mixed precision training
3. enable multi-node training

In [21]:
foundation_model.tags

{'Featured': '',
 'Preview': '',
 'maas-inference': 'True',
 'huggingface_model_id': '',
 'SharedComputeCapacityEnabled': '',
 'license': 'mit',
 'disable-batch': 'true',
 'task': 'chat-completion',
 'author': 'microsoft',
 'hiddenlayerscanned': '',
 'inference_compute_allow_list': "['Standard_NC6s_v3', 'Standard_NC12s_v3', 'Standard_NC24s_v3', 'Standard_ND40rs_v2', 'Standard_NC24ads_A100_v4', 'Standard_NC48ads_A100_v4', 'Standard_ND96asr_v4', 'Standard_ND96amsr_A100_v4', 'Standard_NC96ads_A100_v4']",
 'finetune_compute_allow_list': "['Standard_NC24ads_A100_v4', 'Standard_NC48ads_A100_v4', 'Standard_NC96ads_A100_v4', 'Standard_ND96asr_v4', 'Standard_ND96amsr_A100_v4']",
 'inference_supported_envs': "['vllm']",
 'model_specific_defaults': "{'apply_deepspeed': 'true', 'deepspeed_stage': 2, 'apply_lora': 'true', 'apply_ort': 'false', 'precision': 16, 'ignore_mismatched_sizes': 'false', 'num_train_epochs': 1, 'per_device_train_batch_size': 1, 'per_device_eval_batch_size': 1, 'gradient_accu

In [23]:
# Default training parameters
training_parameters = dict(
    num_train_epochs=3,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    learning_rate=5e-6,
    lr_scheduler_type="cosine",
)

# Default optimization parameters
optimization_parameters = dict(
    apply_lora="true",
    apply_deepspeed="true",
    deepspeed_stage=2,
)
# Let's construct finetuning parameters using training and optimization parameters.
finetune_parameters = {**training_parameters, **optimization_parameters}
print(finetune_parameters)

{'num_train_epochs': 3, 'per_device_train_batch_size': 1, 'per_device_eval_batch_size': 1, 'learning_rate': 5e-06, 'lr_scheduler_type': 'cosine', 'apply_lora': 'true', 'apply_deepspeed': 'true', 'deepspeed_stage': 2}


In [24]:
# Each model finetuning works best with certain finetuning parameters which are packed with model as `model_specific_defaults`.
# Let's override the finetune_parameters in case the model has some custom defaults.
if "model_specific_defaults" in foundation_model.tags:
    print("Warning! Model specific defaults exist. The defaults could be overridden.")
    finetune_parameters.update(
        ast.literal_eval(  # convert string to python dict
            foundation_model.tags["model_specific_defaults"]
        )
    )
print(
    f"The following finetune parameters are going to be set for the run: {finetune_parameters}"
)

The following finetune parameters are going to be set for the run: {'num_train_epochs': 1, 'per_device_train_batch_size': 1, 'per_device_eval_batch_size': 1, 'learning_rate': '5e-6', 'lr_scheduler_type': 'cosine', 'apply_lora': 'true', 'apply_deepspeed': 'true', 'deepspeed_stage': 2, 'apply_ort': 'false', 'precision': 16, 'ignore_mismatched_sizes': 'false', 'gradient_accumulation_steps': 1, 'logging_strategy': 'steps', 'logging_steps': 10, 'save_total_limit': 1}
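
The override step relies on two standard-library behaviours: `ast.literal_eval` turns the tag string into a Python dict, and `dict.update` lets the model defaults win over the notebook defaults for any shared key. The sketch below reproduces this in isolation. The `tag_value` string is an abridged, hypothetical stand-in for the real `model_specific_defaults` tag, which holds more keys.

```python
import ast

# Notebook-style defaults (a subset of the keys used in the cells above).
params = {"num_train_epochs": 3, "learning_rate": 5e-6, "apply_lora": "true"}

# Abridged, hypothetical tag value; the real tag contains more entries.
tag_value = "{'num_train_epochs': 1, 'precision': 16, 'apply_ort': 'false'}"

# literal_eval only accepts Python literals, so it cannot run arbitrary code
# the way eval() could.
model_defaults = ast.literal_eval(tag_value)

# update() overwrites keys present in both dicts and adds the new ones.
params.update(model_defaults)
print(params)
```

This is why `num_train_epochs` changes from 3 to 1 in the printed parameters, while keys that the model defaults do not mention keep their notebook values.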


In [37]:
# Set the pipeline display name for distinguishing different runs from the name
def get_pipeline_display_name():
    batch_size = (
        int(finetune_parameters.get("per_device_train_batch_size", 1))
        * int(finetune_parameters.get("gradient_accumulation_steps", 1))
        * int(gpus_per_node)
        * int(finetune_parameters.get("num_nodes_finetune", 1))
    )
    scheduler = finetune_parameters.get("lr_scheduler_type", "linear")
    deepspeed = finetune_parameters.get("apply_deepspeed", "false")
    ds_stage = finetune_parameters.get("deepspeed_stage", "2")
    if deepspeed == "true":
        ds_string = f"ds{ds_stage}"
    else:
        ds_string = "nods"
    lora = finetune_parameters.get("apply_lora", "false")
    if lora == "true":
        lora_string = "lora"
    else:
        lora_string = "nolora"
    save_limit = finetune_parameters.get("save_total_limit", -1)
    seq_len = finetune_parameters.get("max_seq_length", -1)
    return (
        model_name
        + "-"
        + "ultrachat"
        + "-"
        + f"bs{batch_size}"
        + "-"
        + f"{scheduler}"
        + "-"
        + ds_string
        + "-"
        + lora_string
        + f"-save_limit{save_limit}"
        + f"-seqlen{seq_len}"
    )


pipeline_display_name = get_pipeline_display_name()
print(f"Display name used for the run: {pipeline_display_name}")

Display name used for the run: Phi-3-mini-4k-instruct-ultrachat-bs4-cosine-ds2-lora-save_limit1-seqlen-1
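
The `bs4` part of the display name is the effective (global) batch size, that is, the number of samples contributing to each optimizer step. A minimal sketch of the same arithmetic follows. The value `gpus_per_node=4` is an assumption inferred from the `bs4` in the printed name, since the variable is set in an earlier cell, and the helper name is hypothetical.

```python
def effective_batch_size(
    per_device: int, grad_accum: int, gpus_per_node: int, num_nodes: int = 1
) -> int:
    """Samples consumed per optimizer step across all devices."""
    return per_device * grad_accum * gpus_per_node * num_nodes


# Values from the finetune parameters above; gpus_per_node=4 is assumed.
bs = effective_batch_size(per_device=1, grad_accum=1, gpus_per_node=4)
print(f"bs{bs}")
```

Raising `gradient_accumulation_steps` is the usual way to grow this number without increasing per-GPU memory use.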


In [40]:
from azure.ai.ml.dsl import pipeline
from azure.ai.ml import Input

# fetch the pipeline component
pipeline_component_func = registry_ml_client.components.get(
    name="chat_completion_pipeline", label="latest"
)

In [46]:
# define the pipeline job
@pipeline(name=pipeline_display_name)
def create_pipeline():
    chat_completion_pipeline = pipeline_component_func(
        mlflow_model_path=foundation_model.id,
        compute_model_import=compute_cluster,
        compute_preprocess=compute_cluster,
        compute_finetune=compute_cluster,
        compute_model_evaluation=compute_cluster,
        # map the dataset splits to parameters
        train_file_path=Input(
            type="uri_file", path="./ultrachat_200k_dataset/train_sft.jsonl"
        ),
        test_file_path=Input(
            type="uri_file", path="./ultrachat_200k_dataset/test_sft.jsonl"
        ),
        # Training settings
        number_of_gpu_to_use_finetuning=gpus_per_node,  # set to the number of GPUs available in the compute
        **finetune_parameters
    )
    return {
        # map the output of the fine tuning job to the output of pipeline job so that we can easily register the fine tuned model
        # registering the model is required to deploy the model to an online or batch endpoint
        "trained_model": chat_completion_pipeline.outputs.mlflow_model_folder
    }


pipeline_object = create_pipeline()

# don't use cached results from previous jobs
pipeline_object.settings.force_rerun = True

# set continue on step failure to False
pipeline_object.settings.continue_on_step_failure = False

In [57]:
experiment_name

'chat_completion_Phi-3-mini-4k-instruct'

Submit the job

In [None]:
# submit the pipeline job
pipeline_job = workspace_ml_client.jobs.create_or_update(
    pipeline_object, experiment_name=experiment_name
)

# wait for the pipeline job to complete
workspace_ml_client.jobs.stream(pipeline_job.name)

Class AutoDeleteSettingSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class AutoDeleteConditionSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class BaseAutoDeleteSettingSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class IntellectualPropertySchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class ProtectionLevelSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class BaseIntellectualPropertySchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Uploading train_sft.jsonl (<

RunId: green_ball_73djrjlhsp
Web View: https://ml.azure.com/runs/green_ball_73djrjlhsp?wsid=/subscriptions/02243ba5-b777-47c6-9ecf-830b204b7593/resourcegroups/llm_fine_tuning/workspaces/llm-ft-aml

Streaming logs/azureml/executionlogs.txt

[2024-06-30 08:34:23Z] Submitting 1 runs, first five are: f450f698:965cb73d-a724-408d-be6b-98d037486287


#### Review training and evaluation metrics
Viewing the job in AzureML studio is the best way to analyze logs, metrics and outputs of jobs. You can create custom charts and compare metrics across different jobs. See https://learn.microsoft.com/en-us/azure/machine-learning/how-to-log-view-metrics?tabs=interactive#view-jobsruns-information-in-the-studio to learn more.

However, we may need to access and review metrics programmatically. For this we will use MLflow, the recommended client for logging and querying metrics.

In [74]:
import mlflow, json

mlflow_tracking_uri = workspace_ml_client.workspaces.get(
    workspace_ml_client.workspace_name
).mlflow_tracking_uri

mlflow.set_tracking_uri(mlflow_tracking_uri)

# build the MLflow filter that selects the runs belonging to this pipeline job
# (named run_filter to avoid shadowing the built-in filter())
run_filter = f"tags.mlflow.rootRunId='{pipeline_job.name}'"
runs = mlflow.search_runs(
    experiment_names=[experiment_name], filter_string=run_filter, output_format="list"
)
training_and_evaluation_run = None

# get the training and evaluation runs.
# using a hacky way till 'Bug 2320997: not able to show eval metrics in FT notebooks - mlflow client now showing display names' is fixed
for run in runs:
    # check if run.data.metrics.epoch exists
    if "epoch" in run.data.metrics:
        training_and_evaluation_run = run

In [76]:
if training_and_evaluation_run:
    print("Training & Evaluation metrics:\n\n")
    print(json.dumps(training_and_evaluation_run.data.metrics, indent=2))
else:
    print("No Training / Evaluation job found")

Training & Evaluation metrics:


{
  "loss": 0.7856,
  "grad_norm": 0.576697051525116,
  "learning_rate": 2.3119535520421675e-10,
  "epoch": 1.0,
  "eval_loss": 0.7564878463745117,
  "eval_runtime": 65.0173,
  "eval_samples_per_second": 31.976,
  "eval_steps_per_second": 7.998,
  "checkpoint_save_step": 2079.0,
  "train_runtime": 1380.8044,
  "train_samples_per_second": 6.021,
  "train_steps_per_second": 1.506,
  "total_flos": 2.5514210184973517e+17,
  "train_loss": 0.7663773359617057
}
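
Two of the logged values can be sanity-checked by hand. The sketch below converts `eval_loss` to a perplexity and recomputes the final logged `learning_rate` from a cosine schedule. It rests on several assumptions: that `eval_loss` is a mean per-token cross-entropy in nats, that the schedule has no warmup, that the run has 2079 optimizer steps (taken from `checkpoint_save_step`), and that the last logged step is 2070 (the last multiple of `logging_steps=10`). The helper names are hypothetical.

```python
import math


def perplexity(mean_cross_entropy: float) -> float:
    """Perplexity for a mean per-token cross-entropy given in nats."""
    return math.exp(mean_cross_entropy)


def cosine_lr(step: int, total_steps: int, base_lr: float) -> float:
    """Half-cosine decay from base_lr to 0, assuming no warmup."""
    progress = step / total_steps
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


eval_ppl = perplexity(0.7564878463745117)  # eval_loss from the output above
lr_at_2070 = cosine_lr(step=2070, total_steps=2079, base_lr=5e-6)

print(f"eval perplexity: {eval_ppl:.3f}")
print(f"cosine LR at step 2070: {lr_at_2070:.4e}")
```

If these assumptions hold, the recomputed learning rate should agree closely with the logged `learning_rate`, which would explain why that value is so small: it is a snapshot from the very end of the decay, not the configured `5e-6`.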


### 6. Register the fine tuned model with the workspace

We will register the model from the output of the fine tuning job. This will track lineage between the fine tuned model and the fine tuning job. The fine tuning job, in turn, tracks lineage to the foundation model, data and training code.

In [78]:
pipeline_job

Experiment,Name,Type,Status,Details Page
chat_completion_Phi-3-mini-4k-instruct,green_ball_73djrjlhsp,pipeline,Preparing,Link to Azure Machine Learning studio


In [79]:
from azure.ai.ml.entities import Model
from azure.ai.ml.constants import AssetTypes

# check if the `trained_model` output is available
print("pipeline job outputs: ", workspace_ml_client.jobs.get(pipeline_job.name).outputs)

pipeline job outputs:  {'trained_model': <azure.ai.ml.entities._job.pipeline._io.base.PipelineOutput object at 0x7f1d729c1d20>}


In [82]:
# build the asset URI of the `trained_model` output of the pipeline job
model_path_from_job = "azureml://jobs/{0}/outputs/{1}".format(
    pipeline_job.name, "trained_model"
)

finetuned_model_name = model_name + "-ultrachat-200k"
finetuned_model_name = finetuned_model_name.replace("/", "-")
print("path to register model: ", model_path_from_job)
print("user defined fine tuned model name: ", finetuned_model_name)

path to register model:  azureml://jobs/green_ball_73djrjlhsp/outputs/trained_model
user defined fine tuned model name:  Phi-3-mini-4k-instruct-ultrachat-200k


In [85]:
# register the model
prepare_to_register_model = Model(
    path=model_path_from_job,
    type=AssetTypes.MLFLOW_MODEL,
    name=finetuned_model_name,
    version=timestamp,  # use timestamp as version to avoid version conflict
    description=model_name + " fine tuned model for ultrachat 200k chat-completion",
)
print("prepare to register model: \n", prepare_to_register_model)

# register the model from pipeline job output
registered_model = workspace_ml_client.models.create_or_update(
    prepare_to_register_model
)
print("registered model: \n", registered_model)

prepare to register model: 
 description: Phi-3-mini-4k-instruct fine tuned model for ultrachat 200k chat-completion
name: Phi-3-mini-4k-instruct-ultrachat-200k
path: azureml://jobs/green_ball_73djrjlhsp/outputs/trained_model
properties: {}
tags: {}
type: mlflow_model
version: '1719729591'

registered model: 
 creation_context:
  created_at: '2024-07-03T04:41:47.701519+00:00'
  created_by: He Zhang
  created_by_type: User
  last_modified_at: '2024-07-03T04:41:47.701519+00:00'
  last_modified_by: He Zhang
  last_modified_by_type: User
description: Phi-3-mini-4k-instruct fine tuned model for ultrachat 200k chat-completion
flavors:
  hftransformersv2:
    code: code
    hf_config_class: AutoConfig
    hf_predict_module: predict_phi
    hf_pretrained_class: AutoModelForCausalLM
    hf_tokenizer_class: AutoTokenizer
    model_data: data
    model_hf_load_kwargs: "{\n  \"trust_remote_code\": \"true\"\n}"
    pytorch_version: 2.2.2
    task_type: chat-completion
    tokenizer_config: "{\n  \"

### 7. Deploy the fine tuned model to an online endpoint
Online endpoints give a durable REST API that can be used to integrate with applications that need to use the model.

In [86]:
from azure.ai.ml.entities import (
    ManagedOnlineEndpoint,
    ManagedOnlineDeployment,
    ProbeSettings,
    OnlineRequestSettings,
)

# Create online endpoint - endpoint names need to be unique in a region, hence using timestamp to create unique endpoint name
online_endpoint_name = "ultrachat-completion-" + timestamp

# create an online endpoint
endpoint = ManagedOnlineEndpoint(
    name=online_endpoint_name,
    description="Online endpoint for "
    + registered_model.name
    + ", fine tuned model for ultrachat-200k-chat-completion",
    auth_mode="key",
)
workspace_ml_client.begin_create_or_update(endpoint).wait()

You can find the list of SKUs supported for deployment here: [Managed online endpoints SKU list](https://learn.microsoft.com/en-us/azure/machine-learning/reference-managed-online-endpoints-vm-sku-list)

In [87]:
%%time
# create a deployment
demo_deployment = ManagedOnlineDeployment(
    name="demo",
    endpoint_name=online_endpoint_name,
    model=registered_model.id,
    instance_type="Standard_NC6s_v3",
    instance_count=1,
    liveness_probe=ProbeSettings(initial_delay=600),
    request_settings=OnlineRequestSettings(request_timeout_ms=90000),
)
workspace_ml_client.online_deployments.begin_create_or_update(demo_deployment).wait()
endpoint.traffic = {"demo": 100}
workspace_ml_client.begin_create_or_update(endpoint).result()

Check: endpoint ultrachat-completion-1719729591 exists


...........................................................................................................................................................CPU times: user 1.69 s, sys: 53.8 ms, total: 1.75 s
Wall time: 13min 49s


ManagedOnlineEndpoint({'public_network_access': 'Enabled', 'provisioning_state': 'Succeeded', 'scoring_uri': 'https://ultrachat-completion-1719729591.westeurope.inference.ml.azure.com/score', 'openapi_uri': 'https://ultrachat-completion-1719729591.westeurope.inference.ml.azure.com/swagger.json', 'name': 'ultrachat-completion-1719729591', 'description': 'Online endpoint for Phi-3-mini-4k-instruct-ultrachat-200k, fine tuned model for ultrachat-200k-chat-completion', 'tags': {}, 'properties': {'azureml.onlineendpointid': '/subscriptions/02243ba5-b777-47c6-9ecf-830b204b7593/resourcegroups/llm_fine_tuning/providers/microsoft.machinelearningservices/workspaces/llm-ft-aml/onlineendpoints/ultrachat-completion-1719729591', 'AzureAsyncOperationUri': 'https://management.azure.com/subscriptions/02243ba5-b777-47c6-9ecf-830b204b7593/providers/Microsoft.MachineLearningServices/locations/westeurope/mfeOperationsStatus/oeidp:19b3ccb1-2f95-4654-998d-43fb2a60e9ce:838627e7-c833-4cec-ae00-964f8c2326bd?api-
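
Because the endpoint uses `auth_mode="key"`, applications can also call the `scoring_uri` over plain HTTPS instead of going through the SDK. The sketch below only builds such a request with the standard library and does not send it. The URI and key are placeholders, `build_scoring_request` is a hypothetical helper, and the payload shape mirrors the one used in the test section of this notebook. The key itself should be retrievable with `workspace_ml_client.online_endpoints.get_keys(...)`, which is worth confirming against the SDK reference.

```python
import json
import urllib.request


def build_scoring_request(
    scoring_uri: str, api_key: str, payload: dict
) -> urllib.request.Request:
    """Build (but do not send) a POST request for a key-authenticated endpoint."""
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    return urllib.request.Request(
        scoring_uri, data=body, headers=headers, method="POST"
    )


# Placeholder conversation, URI and key: replace them with real values.
messages = [{"role": "user", "content": "Hello"}]
payload = {
    "input_data": {
        "input_string": [messages],
        "parameters": {"max_new_tokens": 50},
    }
}
request = build_scoring_request(
    "https://example-endpoint.westeurope.inference.ml.azure.com/score",
    "<API_KEY>",
    payload,
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request).read()
```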

### 8. Test the endpoint with sample data

We will fetch some sample data from the test dataset and submit it to the online endpoint for inference. We will then display the response returned by the endpoint.

In [88]:
# read ./ultrachat_200k_dataset/test_gen.jsonl into a pandas dataframe
test_df = pd.read_json("./ultrachat_200k_dataset/test_gen.jsonl", lines=True)
# take one random sample
test_df = test_df.sample(n=1)
# rebuild index
test_df.reset_index(drop=True, inplace=True)
test_df.head(2)

Unnamed: 0,prompt,prompt_id,messages
0,"Given the text: Social Media Marketing Conference is going to be organised at Shilo Inn Suites Hotel, Portland, USA on 09 Apr 2018. This expo is going to be a 1 day event. This event forays into categories like Science & Research.\nCould you suggest a recommended itinerary for attendees of the one-day Social Media Marketing Conference in Portland on April 9th, 2018?",3eb5e1c75c197b224b9ec29fce6221666b675518d7330a78fb53ee2b7907ddb7,"[{'content': 'Given the text: Social Media Marketing Conference is going to be organised at Shilo Inn Suites Hotel, Portland, USA on 09 Apr 2018. This expo is going to be a 1 day event. This event forays into categories like Science & Research. Could you suggest a recommended itinerary for attendees of the one-day Social Media Marketing Conference in Portland on April 9th, 2018?', 'role': 'user'}, {'content': 'Here is a possible itinerary for attendees of the Social Media Marketing Conference on April 9th, 2018: 8:00 AM - 9:00 AM: Registration and networking Check-in at the Shilo Inn Suites Hotel, collect your conference badge and mingle with other attendees over coffee and pastries. 9:00 AM - 10:00 AM: Opening keynote Listen to a prominent speaker who could be giving insights on the latest social media trends, changes to algorithms, and how to stay on top of developments. 10:15 AM - 11:15 AM: Breakout sessions Attend one of the concurrent sessions in your chosen category of interest, such as marketing strategy, content creation, social analytics or customer engagement. 11:30 AM - 12:30 PM: Panel discussion Hear from a panel of experts who are apt to impart their advice on navigating complex social media issues and tactics. 12:30 PM - 2:00 PM: Networking lunch Take the opportunity to get to know fellow attendees, as well as the vendors and exhibitors who could be showing off their social media products and services at the expo. 
2:15 PM - 3:15 PM: Hands-on workshop Participate in a practical activity that could guide you in honing your social media skills or put your ideas into practice. 3:30 PM - 4:30 PM: Final keynote Attend the final keynote speech with another notable speaker and listen to their concluding notes and highlights of the conference. 4:30 PM - 5:00 PM: Closing remarks and giveaways Stay for the wrap-up and enter into the prize drawings for some social-media-worthy swag items.', 'role': 'assistant'}, {'content': 'Can you give me the details of the breakout sessions that will be held at the Social Media Marketing Conference on April 9th?', 'role': 'user'}]"


In [89]:
import json

# build the request payload: "input_data" holds the chat messages of the first sampled row and the generation parameters
parameters = {
    "temperature": 0.6,
    "top_p": 0.9,
    "do_sample": True,
    "max_new_tokens": 200,
}
test_json = {
    "input_data": {
        "input_string": [test_df["messages"][0]],
        "parameters": parameters,
    },
    "params": {},
}

# save the json object to a file named sample_score.json in the ./ultrachat_200k_dataset folder
with open("./ultrachat_200k_dataset/sample_score.json", "w") as f:
    json.dump(test_json, f)
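
Before sending the payload, it can help to check that the conversation has the shape seen in the dataset samples: a list of `role`/`content` dicts whose last turn comes from the user. The validator below is a minimal sketch. The set of allowed roles and the rule that the last message must be a user turn are assumptions based on the samples shown in this notebook, not documented requirements of the endpoint.

```python
from typing import Any

# Assumption: roles seen in the samples, plus the common "system" role.
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_messages(messages: list[dict[str, Any]]) -> None:
    """Raise ValueError if the conversation does not look ready to score."""
    if not messages:
        raise ValueError("conversation is empty")
    for i, message in enumerate(messages):
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"message {i}: unexpected role {message.get('role')!r}")
        if not isinstance(message.get("content"), str):
            raise ValueError(f"message {i}: content must be a string")
    if messages[-1]["role"] != "user":
        raise ValueError("last message should come from the user")


sample = [
    {"role": "user", "content": "Could you suggest an itinerary?"},
    {"role": "assistant", "content": "Here is a possible itinerary."},
    {"role": "user", "content": "Can you give me the details of the breakout sessions?"},
]
validate_messages(sample)  # returns None when the conversation passes the checks
```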

In [91]:
# score the sample_score.json file using the online endpoint with the azureml endpoint invoke method
response = workspace_ml_client.online_endpoints.invoke(
    endpoint_name=online_endpoint_name,
    deployment_name="demo",
    request_file="./ultrachat_200k_dataset/sample_score.json",
)
print("raw response: \n", response, "\n")

raw response: 
 {"output":" I don't have the specific details of the breakout sessions for the social media marketing conference on april 9th, 2018. However, the conference is likely to have multiple sessions in various categories such as social media strategy, content creation, social analytics, customer engagement, and more. Attendees can check the conference website or program brochure for the detailed schedule and session titles."} 
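
The raw response is a JSON string, so the generated text can be pulled out with the standard `json` module. The sketch below uses an abridged copy of the response above as sample input. `extract_output` is a hypothetical helper, and the `"output"` key is taken from the response shown here, so other deployments may return a different shape.

```python
import json


def extract_output(raw_response: str) -> str:
    """Return the generated text from a raw JSON response string."""
    parsed = json.loads(raw_response)
    if "output" not in parsed:
        raise KeyError("response has no 'output' field")
    # The sample response starts with a space, so trim surrounding whitespace.
    return parsed["output"].strip()


# Abridged sample of the response printed above.
raw = '{"output":" I don\'t have the specific details of the breakout sessions."}'
answer = extract_output(raw)
print(answer)
```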



### 9. Delete the online endpoint
Don't forget to delete the online endpoint; otherwise the billing meter keeps running for the compute used by the endpoint.

In [93]:
workspace_ml_client.online_endpoints.begin_delete(name=online_endpoint_name).wait()

........................................................................