# Distillation Example

This notebook shows you how to distill a small LLM to do a narrow set of tasks based on the responses of a larger LLM.

Based on https://cloud.google.com/vertex-ai/generative-ai/docs/models/distill-text-models

## Set up.

Install the necessary packages, set up the API keys etc.

In [1]:
#%pip install --quiet -r requirements.txt

In [4]:
from dotenv import load_dotenv
load_dotenv("../keys.env");

PROVIDER = "Google"
#PROVIDER = "OpenAI"

if PROVIDER == "Google":
    from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings
    model = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0.1)
else:
    from langchain_openai import ChatOpenAI, OpenAIEmbeddings
    model = ChatOpenAI(model_name="gpt-4o-mini", temperature=0.1)

## Generate reviews

Normally, you'd get the reviews from your application logs or data warehouse, but here we'll generate a synthetic dataset using an LLM.

In [14]:
persona = ['man', 'woman', 'student', 'retired person', 'veteran', 'restaurant critic', 'traveler']
meal = ['breakfast', 'lunch', 'dinner']
cuisine = ['Mexican', 'Indian', 'Chinese', 'Thai', 'Italian', 'French', 'Greek']
service = ['fast', 'slow', 'personal', 'efficient', 'friendly', 'surly']
food = ['terrible', 'overpriced', 'good', 'great', 'amazing']
length = range(2, 10)

In [17]:
from langchain_core.prompts import PromptTemplate
import random

prompt_template = PromptTemplate.from_template("""
You are a {persona} who visited a restaurant serving {cuisine} food for {meal}. Write a {length}-line review,
assuming that the food was {food} and the service was {service}. Add details to make it realistic.
""".strip())
prompt_val = prompt_template.format(
    persona=random.choice(persona),
    cuisine=random.choice(cuisine),
    meal=random.choice(meal),
    length=random.choice(length),
    food=random.choice(food),
    service=random.choice(service)
)
prompt_val

'You are a veteran who visited a restaurant serving French food for breakfast. Write a 6-line review,\nassuming that the food was amazing and the service was efficient. Add details to make it realistic.'

In [19]:
response = model.invoke(prompt_val)
response

AIMessage(content='After years of MREs, this was a welcome change. The buttery croissants melted in my mouth, the coffee was rich and strong, and the omelet was fluffy perfection. The service was quick and friendly, even with the place packed.  A taste of Paris in the heart of America, and a much-needed respite from the usual greasy spoon.  Definitely coming back! \n', response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'safety_ratings': [{'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HATE_SPEECH', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HARASSMENT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability': 'NEGLIGIBLE', 'blocked': False}]}, id='run-805f1835-8242-4870-ba15-78f8539242ff-0', usage_metadata={'input_tokens': 43, 'output_tokens': 77, 'total_tokens': 120})

In [20]:
response.content

'After years of MREs, this was a welcome change. The buttery croissants melted in my mouth, the coffee was rich and strong, and the omelet was fluffy perfection. The service was quick and friendly, even with the place packed.  A taste of Paris in the heart of America, and a much-needed respite from the usual greasy spoon.  Definitely coming back! \n'

In [31]:
N = 1000
batch_size = 10

reviews = []
for x in range(0, N, batch_size):
    prompt_vals = [
        prompt_template.format(
            persona=random.choice(persona),
            cuisine=random.choice(cuisine),
            meal=random.choice(meal),
            length=random.choice(length),
            food=random.choice(food),
            service=random.choice(service)
        ) for x in range(batch_size)
    ]
    batch_responses = model.batch(prompt_vals)
    reviews.extend(batch_responses)
    print(f'{int(100*len(reviews)/N)}%', end='...', flush=True)

1%...2%...3%...4%...5%...6%...7%...8%...9%...10%...11%...12%...13%...14%...15%...16%...17%...18%...19%...20%...21%...22%...23%...24%...25%...26%...27%...28%...29%...30%...31%...32%...33%...34%...35%...36%...37%...38%...39%...40%...41%...42%...43%...44%...45%...46%...47%...48%...49%...50%...51%...52%...53%...54%...55%...56%...57%...58%...59%...60%...61%...62%...63%...64%...65%...66%...67%...68%...69%...70%...71%...72%...73%...74%...75%...76%...77%...78%...79%...80%...81%...82%...83%...84%...85%...86%...87%...88%...89%...90%...91%...92%...93%...94%...95%...96%...97%...98%...99%...100%...

In [32]:
len(reviews)

1000

In [34]:
reviews[0].content

"The moussaka was dry, the souvlaki bland, and the prices made my wallet weep.  The service was brisk, almost too brisk, like they were trying to get us out the door before we realized how much we'd been overcharged.  I've had better Greek food at a taverna on the beach in Mykonos, and for half the price.  Not sure I'll be back. \n"

In [35]:
reviews = [review.content for review in reviews]
reviews[0]

"The moussaka was dry, the souvlaki bland, and the prices made my wallet weep.  The service was brisk, almost too brisk, like they were trying to get us out the door before we realized how much we'd been overcharged.  I've had better Greek food at a taverna on the beach in Mykonos, and for half the price.  Not sure I'll be back. \n"

In [38]:
import json
with open('reviews.json', 'w') as ofp:
    json.dump(reviews, ofp, indent=1)

In [39]:
!head reviews.json

[
 "The moussaka was dry, the souvlaki bland, and the prices made my wallet weep.  The service was brisk, almost too brisk, like they were trying to get us out the door before we realized how much we'd been overcharged.  I've had better Greek food at a taverna on the beach in Mykonos, and for half the price.  Not sure I'll be back. \n",
 "The escargots were divine, the steak cooked to perfection, and the cr\u00e8me br\u00fbl\u00e9e was a delightful ending to a truly memorable meal. The ambiance was charming, and the service was warm and attentive.  I felt like I was transported to a Parisian bistro, and I can't wait to return for another delightful evening. \n",
 "The feta omelet was decent, but at 15 euros, it felt like a tourist trap. The coffee was lukewarm and the baklava, while tasty, was a measly two bites for 8 euros. The service was as warm as the coffee, with a grumpy waitress who seemed annoyed by our presence.  I'd recommend skipping this place and finding a more authentic, 

## Create training dataset for Vertex AI

Format is
<pre>
{"input_text": "question: How many people live in Beijing? context: With over 21 million residents, Beijing is the world's most populous national capital city and is China's second largest city after Shanghai. It is located in Northern China, and is governed as a municipality under the direct administration of the State Council with 16 urban, suburban, and rural districts.[14] Beijing is mostly surrounded by Hebei Province with the exception of neighboring Tianjin to the southeast; together, the three divisions form the Jingjinji megalopolis and the national capital region of China.", "output_text": "over 21 million people"}
</pre>
We want to omit the output because we want the teacher model to generate the output and rationale.

In [40]:
import json
with open('reviews.json') as ifp:
    reviews = json.load(ifp)
reviews[0]

"The moussaka was dry, the souvlaki bland, and the prices made my wallet weep.  The service was brisk, almost too brisk, like they were trying to get us out the door before we realized how much we'd been overcharged.  I've had better Greek food at a taverna on the beach in Mykonos, and for half the price.  Not sure I'll be back. \n"

In [55]:
def create_input(review):
    return f"""
    Read this review and fill out the JSON structure about the restaurant.
    The cuisine refers to the type of food is referenced in the review? Choose one of: {cuisine}
    The rating is a number between 1 and 5. 1 is unhappy. 5 is very happy.
    The summary is a one-line summary of the review.
    
    ***REVIEW***
    {review}
    
    ***OUTPUT JSON***
    {{
       "cuisine": __, 
       "rating": __, 
       "summary": __
    }}
    """.strip()

d = {"input_text": create_input(reviews[0])}
d

{'input_text': 'Read this review and fill out the JSON structure about the restaurant.\n    The cuisine refers to the type of food is referenced in the review? Choose one of: [\'Mexican\', \'Indian\', \'Chinese\', \'Thai\', \'Italian\', \'French\', \'Greek\']\n    The rating is a number between 1 and 5. 1 is unhappy. 5 is very happy.\n    The summary is a one-line summary of the review.\n    \n    ***REVIEW***\n    The moussaka was dry, the souvlaki bland, and the prices made my wallet weep.  The service was brisk, almost too brisk, like they were trying to get us out the door before we realized how much we\'d been overcharged.  I\'ve had better Greek food at a taverna on the beach in Mykonos, and for half the price.  Not sure I\'ll be back. \n\n    \n    ***OUTPUT JSON***\n    {\n       "cuisine": __, \n       "rating": __, \n       "summary": __\n    }'}

In [57]:
import json
with open('distill_train.jsonl', 'w') as ofp:
    for review in reviews:
        d = {"input_text": create_input(review)}
        json.dump(d, ofp)
        ofp.write('\n')
!head -3 distill_train.jsonl

{"input_text": "Read this review and fill out the JSON structure about the restaurant.\n    The cuisine refers to the type of food is referenced in the review? Choose one of: ['Mexican', 'Indian', 'Chinese', 'Thai', 'Italian', 'French', 'Greek']\n    The rating is a number between 1 and 5. 1 is unhappy. 5 is very happy.\n    The summary is a one-line summary of the review.\n    \n    ***REVIEW***\n    The moussaka was dry, the souvlaki bland, and the prices made my wallet weep.  The service was brisk, almost too brisk, like they were trying to get us out the door before we realized how much we'd been overcharged.  I've had better Greek food at a taverna on the beach in Mykonos, and for half the price.  Not sure I'll be back. \n\n    \n    ***OUTPUT JSON***\n    {\n       \"cuisine\": __, \n       \"rating\": __, \n       \"summary\": __\n    }"}
{"input_text": "Read this review and fill out the JSON structure about the restaurant.\n    The cuisine refers to the type of food is referenc

In [58]:
BUCKET="viz_genai_nonsensitive"  # CHANGE THIS to be your own bucket
REGION="us-central1"
!gsutil cp distill_train.jsonl gs://$BUCKET/

Copying file://distill_train.jsonl [Content-Type=application/octet-stream]...
/ [1 files][826.9 KiB/826.9 KiB]                                                
Operation completed over 1 objects/826.9 KiB.                                    


## Distill

In [61]:
import vertexai
from vertexai.preview.language_models import TextGenerationModel, TuningEvaluationSpec

vertexai.init(location=REGION)
student_model = TextGenerationModel.from_pretrained("text-bison@002")
teacher_model = TextGenerationModel.from_pretrained("text-unicorn@001")
distillation_job = student_model.distill_from(
        teacher_model=teacher_model,
        dataset=f"gs://{BUCKET}/distill_train.jsonl",
)

Creating PipelineJob
PipelineJob created. Resource name: projects/82379820716/locations/us-central1/pipelineJobs/distillation-20240816005042
To use this PipelineJob in another session:
pipeline_job = aiplatform.PipelineJob.get('projects/82379820716/locations/us-central1/pipelineJobs/distillation-20240816005042')
View Pipeline Job:
https://console.cloud.google.com/vertex-ai/locations/us-central1/pipelines/runs/distillation-20240816005042?project=82379820716
