# Multimodal AI: Image Generation & CLIP Evaluation

This notebook demonstrates multimodal AI capabilities including text-to-image generation and image understanding.

**Skills Demonstrated:**
- Text-to-image generation with diffusion models
- CLIP for zero-shot image classification
- Visual Question Answering (VQA) implementation
- Automated image evaluation pipelines
- Prompt engineering for image generation

**Technologies:** PyTorch, Hugging Face Transformers, CLIP, Stable Diffusion

## 1: Multimodality with Image Generation & Captioning

This notebook works with PyTorch and Hugging Face to get more experience with their abstract classes as well as some new models. We will look at the simple task of image generation and use a number of tools to see how we could programmatically evaluate the accuracy of the image generation process.

This notebook covers:

1. **Image Generation** 

 Here we will explore how the Canva image diffusion model generates images conditioned on a text prompt.

2. **Image Classification**

 Here we will use CLIP to evaluate captions that describe our images to see which labels most accurately describe our generated images.

3. **Image Evaluation**

 We will also use a visual question answering system to ask questions about our generated image. In our prompt we asked for certain items in the image. In the question answering system we can ask if those items are present in the image.

**Models we'll use in this exercise**

**Canva** - generate images from two prompts

**CLIP** - compare "text" labels with image

**VQA** - answer questions about image

Images are stored for reuse across the notebook. You can do the same thing in your @.edu account.

In [None]:
#mount Google Drive
import os
from google.colab import drive
drive.mount('/content/drive')

# Define your directory
output_folder = "/content/drive/MyDrive/output/2025-fall/TestNotebooks/"

# Ensure the output directory exists
if not os.path.exists(output_folder):
 os.makedirs(output_folder)

### 0.0 Image Generation

You are going to use Canva, a commercial image generation platform, to generate two different images to use for our tests. To use Canva, you must first set up a free account on the system. Click [here](https://www.canva.com) to go to Canva. They will ask you to log in. You can do this either with one of your Google accounts or by giving them an email address; they will then send you a code that you'll need to enter to log in.

Once you are logged in, you will want to "create" an image. Enter a text prompt describing the image you want and then click the run button on the right to generate the image. You will then have 4 images to choose from. Select the one you like by single-clicking on it, then click the download button in the image to download it to your laptop. The image must be saved in either *.png or *.jpg format.

You must generate two separate images for use in this exercise. 

The first image must feature one central object that is the focus of the image. Be sure to specify that you want just one. I used `"portrait of a dog sitting in a chair"`. The dog is the central object in this image. You must use a different prompt for your own object. Save this Canva generated image on your laptop as a file named `OneThing.jpg` or .png if you prefer.

For this second image you will specify a prompt with **3** of the same object *(Type 1)*, **2** of a different kind of object *(Type 2)* "in the background", and two other different individual objects *(Type 3)* and *(Type 4)* in the scene. For example `3 cats in the garden with 2 snakes in the background with a flowering tree and a rose bush`. Make up your own prompt with your own set of type 1, type 2, type 3, and type 4 objects. Make sure you specify the count that goes with each type in your prompt. Save this image on your laptop as a file named `ThreeThingsPlus.jpg` or .png if you prefer.
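The prompt structure described above can be sketched as a small template. This is only an illustration: `build_count_prompt` and its parameter names are hypothetical, and the counts (3, 2, and one each) are the ones this exercise asks for.

```python
def build_count_prompt(type1, type2, type3, type4, setting):
    """Compose a prompt with 3 of type1, 2 of type2 in the background,
    and one each of type3 and type4.

    Hypothetical helper: pass plural nouns for type1/type2 and noun
    phrases that include an article for type3/type4.
    """
    return (
        f"3 {type1} in {setting} with 2 {type2} in the background "
        f"with {type3} and {type4}"
    )


# Reproduces the example prompt from the text above.
prompt = build_count_prompt(
    "cats", "snakes", "a flowering tree", "a rose bush", "the garden"
)
print(prompt)
```

Keeping the counts in one place makes it easier to compare them later with what an evaluation model reports.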

You must place these two images in the `MyDrive/output/2025-fall/TestNotebooks` directory you created in your GoogleDrive so they can be accessed by this Colab notebook.

### 1.0 Environment Setup

Let us first install a few required packages. (You may want to comment this out in case you use a local environment that already has the suitable packages installed.)

In [None]:
!pip install -q -U transformers
!pip install -q -U bitsandbytes accelerate

In [None]:
#You can generate multiple images but not if you are using
# the T4 GPU. This assignment is designed to run with a single image

from PIL import Image
from pprint import pprint

def image_grid(imgs, rows, cols):
    # Expect exactly one image per grid cell
    assert len(imgs) == rows*cols

    # All images are assumed to share the size of the first one
    w, h = imgs[0].size
    grid = Image.new('RGB', size=(cols*w, rows*h))

    # Paste images left to right, top to bottom
    for i, img in enumerate(imgs):
        grid.paste(img, box=(i%cols*w, i//cols*h))
    return grid

### Canva - Generate Images with one object

We're going to generate an image using a diffusion model ([Canva](https://www.canva.com)). You will specify a prompt that names one object. Be sure to specify that you want just one. I used ``"portrait of a dog sitting in a chair"``. How can we programmatically tell if the generated image follows our prompt? We can use other tools that examine the image and tell us whether its contents are what we asked for or whether the generator fell short. First, you need to generate an image.

**1.a. What is the enhanced prompt you created to generate the image `OneThing.jpg` with the single object on Canva? Enter the prompt as a triple-quoted string in the cell below.**


Make sure your Canva generated images are downloaded from Canva and put in your Google Drive as instructed. Now let's display that image of the single object.

In [None]:
from PIL import Image
img_url = '/content/drive/MyDrive/output/2025-fall/TestNotebooks/OneThing.jpg'
raw_image = Image.open(img_url, mode='r')

### CLIP - evaluate image with one object

Now we'll use the CLIP model ([Model Card](https://huggingface.co/openai/clip-vit-base-patch32)) to see if our image resembles the object in the prompt. You'll need to edit the list of captions below. One label in the list should be the object you requested in the prompt. Other labels can be similar objects and one should be orthogonal (very different from your chosen object). Your labels should each be one or two words long (e.g. cat, dog, blue whale). This is similar to building a classifier that "recognizes" images by predicting one of many possible labels. You should have at least 4 different labels, one of which identifies your single object.

In [None]:
import io
from PIL import Image
import requests
from transformers import CLIPProcessor, CLIPModel

In [None]:
%%capture
cl_model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
cl_processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.


In [None]:
captions = ['apple', 'lemon', 'pear', 'corgi']

inputs = cl_processor(
 text=captions, images=raw_image, return_tensors="pt", padding=True
)

outputs = cl_model(**inputs)
logits_per_image = outputs.logits_per_image # this is the image-text similarity score
probs = logits_per_image.softmax(dim=1) # we can take the softmax to get the label probabilities

print()
for i, caption in enumerate(captions):
 # .item() converts the 0-d tensor to a Python float before formatting
 print('%40s - %.4f' % (caption, probs[0, i].item()))
print()
print()


 apple - 0.9679
 lemon - 0.0009
 pear - 0.0312
 corgi - 0.0000
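To make the softmax step concrete, here is a minimal sketch in plain Python of how one similarity score per caption becomes a set of label probabilities. The `logits` values are invented for illustration and are not real CLIP output.

```python
import math


def softmax(scores):
    """Convert raw similarity scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical image-text similarity scores, one per caption.
captions = ["apple", "lemon", "pear", "corgi"]
logits = [25.0, 18.0, 21.5, 12.0]

probs = softmax(logits)
best_caption, best_prob = max(zip(captions, probs), key=lambda pair: pair[1])

for caption, prob in zip(captions, probs):
    print(f"{caption:>40s} - {prob:.4f}")
print(f"top label: {best_caption}")
```

Because softmax only compares the captions you supply, the probabilities say which label fits best among those choices, not whether any of them is actually correct.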






**1.b. What are all of the 1 or 2 word labels you gave to evaluate the generated image? Please list them in a quoted string e.g. "label, label, label, label" in the cell below.**


**1.c. What is the correct 1 or 2 word label and the score assigned to it by the CLIP model? Denote this as "(label, number)" in the cell below.**


Great. We know the CLIP model can handle more description of the content of the image. Let's see how well that works. Instead of your 1 or 2 word labels, create more descriptive captions of roughly 5 to 10 words each. As with your labels, one caption in your list should describe the object you requested in the prompt. Other captions can be similar objects and one should be orthogonal (very different from your chosen object). You must have 4 different captions for this question, one of which describes your object in detail.

In [None]:
captions = ['juicy red apple on a wooden floor', 'a cute fluffy corgi sitting in a chair', 'a sour lemon being squeezed on the floor', 'a ripe kiwi sitting on a floor']

inputs = cl_processor(
 text=captions, images=raw_image, return_tensors="pt", padding=True
)

outputs = cl_model(**inputs)
logits_per_image = outputs.logits_per_image # this is the image-text similarity score
probs = logits_per_image.softmax(dim=1) # we can take the softmax to get the label probabilities

print()
for i, caption in enumerate(captions):
 # .item() converts the 0-d tensor to a Python float before formatting
 print('%40s - %.4f' % (caption, probs[0, i].item()))
print()
print()


 juicy red apple on a wooden floor - 0.9999
 a cute fluffy corgi sitting in a chair - 0.0000
a sour lemon being squeezed on the floor - 0.0000
 a ripe kiwi sitting on a floor - 0.0000




**1.d. What are the four 5 to 10 word captions you gave to evaluate the image? Enter your multiword captions as a single string "caption one, caption two, caption three, caption four"**


**1.e. What is the correct 5 to 10 word caption and the score assigned to it by the CLIP model? Denote this as "(caption, number)" in the cell below.**


### Canva - Object Counts

Now let's use the second image you generated on Canva and then saved to your Google Drive in the specified file path. For this second image you must specify a prompt with 3 of the same object *(Type 1)*, 2 of a different kind of object *(Type 2)* "in the background", and two other different individual objects *(Type 3)* and *(Type 4)* in the scene. For example `3 cats in the garden with 2 snakes in the background with a flowering tree and a rose bush`. Make up your own prompt with your own set of type 1, type 2, type 3, and type 4 objects.

How can we programmatically tell if the generated image follows our prompt? We can use other tools that examine the image and tell us whether its contents are what we asked for or whether the generator fell short.

Note that each time you run this cell you generate a new image. You may want to try several images before you select one for future processing.

**1.f. What is the prompt you gave to generate your image with the four different types of objects by following our instructions? Enter the prompt as a triple-quoted string in the cell below.**


Now make sure you saved the image you generated and want to work with as `ThreeThingsPlus.jpg`. You will need to show us the image so it is visible in your notebook.

In [None]:
img_url = '/content/drive/MyDrive/output/2025-fall/TestNotebooks/ThreeThingsPlus.jpg'
raw_image = Image.open(img_url, mode='r')

### BLIP for Visual Question Answering

How can we measure how well our stable diffusion model is generating images? We can look at our one image and assess, but what if we want to test at scale? To do that we need some kind of automation. We can leverage other models to assist. There's a variation of BLIP ([Model Card](https://huggingface.co/Salesforce/blip-vqa-base)) that has been designed to answer questions about the contents of an image. We'll use that functionality to ask questions to see if the generated image corresponds to the prompt we provided.

In [None]:
import torch
import requests
from PIL import Image
from transformers import BlipProcessor, BlipForQuestionAnswering

processor = BlipProcessor.from_pretrained("Salesforce/blip-vqa-base")
model = BlipForQuestionAnswering.from_pretrained("Salesforce/blip-vqa-base", dtype=torch.float16).to("cuda")


**1.g. How many <type 1> objects does the VQA say are present in your generated image**


In [None]:
#question = "how many <name of your type 1 objects> are in the picture?"
question = "how many apples are in the picture?"

inputs = processor(raw_image, question, return_tensors="pt").to("cuda", torch.float16)

out = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(out[0], skip_special_tokens=True))

4


**1.h. How many <type 2> objects does the VQA say are present in your generated image**


In [None]:
question = "how many apple pies are in the picture?"

inputs = processor(raw_image, question, return_tensors="pt").to("cuda", torch.float16)

out = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(out[0], skip_special_tokens=True))

4


**1.i. How many <type 2> objects does the VQA say are present in the background of your generated image**


In [None]:
question = "how many apple pies are in the background of the picture?"

inputs = processor(raw_image, question, return_tensors="pt").to("cuda", torch.float16)

out = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(out[0], skip_special_tokens=True))

4


**1.j. How many <type 3> objects does the VQA say are present in your generated image**


In [None]:
question = "How many donuts are present in the image?"

inputs = processor(raw_image, question, return_tensors="pt").to("cuda", torch.float16)

out = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(out[0], skip_special_tokens=True))

6


**1.k. How many <type 4> objects does the VQA say are present in your generated image**


In [None]:
question = "How many ice creams are present in the image?"

inputs = processor(raw_image, question, return_tensors="pt").to("cuda", torch.float16)

out = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(out[0], skip_special_tokens=True))

2
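Once the VQA answers are collected, comparing them with the counts requested in the prompt can be automated. The sketch below assumes the answers arrive as short strings; `parse_count`, `count_accuracy`, and the sample values are hypothetical and are based on the example prompt, not on a real run.

```python
import re

# VQA models may answer "two" instead of "2", so keep a small lookup.
NUMBER_WORDS = {"zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
                "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9,
                "ten": 10}


def parse_count(answer):
    """Return the first integer found in a VQA answer string, or None."""
    text = answer.strip().lower()
    digits = re.search(r"\d+", text)
    if digits:
        return int(digits.group())
    for word in text.split():
        if word in NUMBER_WORDS:
            return NUMBER_WORDS[word]
    return None


def count_accuracy(expected, answers):
    """Fraction of object types whose parsed count equals the prompted count."""
    hits = sum(1 for name, want in expected.items()
               if parse_count(answers.get(name, "")) == want)
    return hits / len(expected)


# Hypothetical counts from the example prompt and made-up VQA answers.
expected = {"cats": 3, "snakes": 2, "flowering tree": 1, "rose bush": 1}
answers = {"cats": "3", "snakes": "4", "flowering tree": "one", "rose bush": "2"}

print(count_accuracy(expected, answers))
```

A score like this measures the generator and the VQA model together: a mismatch can come from either a wrong image or a wrong count.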


**1.l. How well does the BLIP VQA system do at accurately counting the number of objects in the image you generated?**
- a. It is perfect
- b. It works pretty well but needs improvement
- c. It sees things I don't see
- d. It is less than 50% accurate


Enter the letter of your answer below.

## 2: Prompt Engineering

This notebook works with PyTorch and Hugging Face to get more experience. We will work with the [Mistral 7B instruction fine-tuned model](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3) to practice creating effective prompts.

<a id = 'returnToTop'></a>

The structure of the assignment is as follows:

1. [**Synthetic Data Generation**](#synth-gen) 

 Here we will explore how to generate synthetic review data.

2. [**Synthetic Data Evaluation**](#synth-eval)

 Let's evaluate the synthetic data we just generated and see how accurate it is by using a classifier.

3. [**JSON Record Generation**](#synth-json)

 Let's have the model generate some structured JSON records that incorporate our reviews as well as some other data we specify.

4. [**Chain of Thought Generation**](#cot-gen)

 Let's create a prompt to reason through a set of arithmetic problems and see if it can give the correct answer and "show its work."

5. [**Prompt Templates and Output Improvements**](#prompt-temp)

 Let's leverage prompt templates and Ethan Mollick's "incantations" to write a description of your start up and then generate an elevator pitch for our company based on the description.

`Week_7_Lesson_Notebook_Simple_Prompt_Examples.ipynb`

### 2.0 Setup

**You should disconnect and delete the previous runtime and then reconnect to work on section 2**

First, rerun the cells in 1.0 Setup to load some libraries.
Remember that every time you rerun the cells in the notebook you will get new answers that may not match the ones you got before. Once you've completed the image portion of the assignment, you shouldn't run those cells again.

Then here are some utility functions we'll use later.

In [None]:
import re

def remove_text_between_tags(text, start_tag, end_tag):
    # Remove everything from start_tag through end_tag (non-greedy, across newlines)
    pattern = fr'{start_tag}(.|\n)*?{end_tag}'
    cleaned_text = re.sub(pattern, '', text, flags=re.DOTALL)
    return cleaned_text

def remove_final_tag(text, end_tag):
    # Remove end_tag along with one optional whitespace character before it
    pattern = fr'\s?{end_tag}'
    cleaned_text = re.sub(pattern, '', text, flags=re.DOTALL)
    return cleaned_text

def ret_post_final_tag(text, end_tag):
    # Keep only the text after the last occurrence of end_tag
    cleaned_text = text.split(end_tag)[-1]
    return cleaned_text

def remove_after_last_curlybrace(string):
    # Trim anything that follows the last closing curly brace
    last_brace_index = string.rfind('}')
    if last_brace_index != -1 and last_brace_index != len(string) - 1:
        string = string[:last_brace_index + 1]
    return string

# Raw strings keep the regex escapes (\s, \[) intact
start = r"<s>\s?\[INST\]"
fin = r"\[/INST\]"
fin2 = "</s>"



[Mistral 7B](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3) is a small but highly performant model. It is also possible to use it commercially. The model has been instruction fine-tuned by Mistral so it should be able to follow your prompts and return good on point output. We'll also use a quantized version (down to 4 bits) so we know it can load in our small GPU. 

**Hugging Face and Mistral require you to register to use the model.** Please go to the [model page](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3) and either log in to your Hugging Face account or follow their directions to create one. It is free. Once you have an account you can use it to get permission to use the model.

First let's load the libraries necessary for it to work.

In [None]:
import torch
import pprint

device = "cuda:0" if torch.cuda.is_available() else "cpu"
model_id = "mistralai/Mistral-7B-Instruct-v0.3"

This is the bits and bytes config file where we specify our quantization arguments. You can read about it [here](https://huggingface.co/blog/4bit-transformers-bitsandbytes).

In [None]:
from transformers import BitsAndBytesConfig

nf4_config = BitsAndBytesConfig(
 load_in_4bit=True,
 bnb_4bit_quant_type="nf4",
 bnb_4bit_use_double_quant=True,
 bnb_4bit_compute_dtype=torch.bfloat16
)
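A rough calculation shows why 4-bit quantization matters on a small GPU. This sketch counts weight storage only and assumes roughly 7 billion parameters; the real footprint is higher once quantization constants, activations, and the generation cache are included.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Approximate memory needed for the model weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3


# Assumption: about 7 billion parameters; the exact count differs slightly.
NUM_PARAMS = 7_000_000_000

fp16_gib = weight_memory_gib(NUM_PARAMS, 16)
nf4_gib = weight_memory_gib(NUM_PARAMS, 4)

print(f"16-bit weights: ~{fp16_gib:.1f} GiB")
print(f" 4-bit weights: ~{nf4_gib:.1f} GiB")
```

Under these assumptions the 16-bit weights alone nearly fill a 16 GB card, while the 4-bit version leaves room for inference.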

**Demonstration:**
2.0.a. (2 pt) Did you enter the config arguments? Yes or No.

This model has been trained to work with dialog, meaning instances here have multiple utterance and response pairs to create the context so the model can reply. We'll populate the context with only our prompt and not have any back and forth.

First we'll ask the model to generate a blurb based on the title of the draft of the 3rd Edition of the preeminent textbook in computational linguistics.

[Return to Top](#returnToTop) 
<a id = 'synth-gen'></a>

### 2.1 Synthetic Data Generation

In [None]:
#Note: It can take up to 8 minutes to download this model's weights

from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=nf4_config)
tokenizer = AutoTokenizer.from_pretrained(model_id)

Loading checkpoint shards: 0%| | 0/3 [00:00<?, ?it/s]

Let's set up a prompt to generate one blurb about the book. This prompt is very simple. Prompts can be significantly more complex.
Do not modify this prompt. Just run it as is.

In [None]:
myprompt = (
 "write a three sentence description for the following text book: Jurafsky and Martin Speech and Language Processing (3rd ed. draft)"
)

In [None]:
messages = [
 {"role": "user", "content": myprompt}
]

encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")
model_inputs = encodeds.to(device)

generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True, pad_token_id=tokenizer.eos_token_id)
decoded = tokenizer.batch_decode(generated_ids)
blurb = decoded[0]

The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.


In [None]:
#Let's clean up the old instruction and the closing tags
cleaned1 = remove_text_between_tags(blurb, start, fin)
cleaned_blurb = remove_final_tag(cleaned1, fin2)
cleaned_blurb

' "In \'Jurafsky and Martin\'s Speech and Language Processing\' (3rd ed. draft), readers delve into the fascinating world of computer speech recognition, natural language understanding, and speech synthesis. This comprehensive textbook covers the latest advancements in the field, providing a solid foundation in speech and language technologies, with a focus on both theoretical and practical aspects."'
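To see what the cleanup step does, here is a self-contained sketch that applies the same two regex substitutions to a made-up decoded string. The `sample` text and the helper name `strip_prompt_and_eos` are hypothetical; the patterns mirror the notebook's `start`, `fin`, and `fin2`.

```python
import re

# Same patterns as the notebook's start, fin, and fin2, as raw strings.
START = r"<s>\s?\[INST\]"
FIN = r"\[/INST\]"
EOS = "</s>"


def strip_prompt_and_eos(decoded):
    """Drop the echoed [INST]...[/INST] prompt and the closing </s> tag."""
    without_prompt = re.sub(f"{START}.*?{FIN}", "", decoded, flags=re.DOTALL)
    without_eos = re.sub(rf"\s?{EOS}", "", without_prompt)
    return without_eos.strip()


# Hypothetical decoded output shaped like an instruct-model generation.
sample = "<s> [INST] write a short blurb [/INST] A concise overview of the field.</s>"
print(strip_prompt_and_eos(sample))
```

The non-greedy `.*?` matters here: a greedy match would run to the last `[/INST]` in the text and could swallow part of the answer.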

You can also try one of the generated blurbs below if you like. The contents of your blurb can have an effect on downstream performance. Just uncomment the line you want to try.

In [None]:
cleaned_blurb = "This comprehensive textbook, \"Speech and Language Processing\" by Daniel Jurafsky and James H. Martin, provides a thorough introduction to the fundamental principles and techniques of speech and language processing. Covering topics from basic concepts to advanced applications, the book offers a unified approach to the field, bridging the gap between linguistics, computer science, and engineering. With its clear explanations, real-world examples, and extensive exercises, this textbook is an essential resource for students and professionals in the fields of natural language processing, speech recognition, and human-computer interaction."
#cleaned_blurb = "This comprehensive textbook, \"Jurafsky and Martin Speech and Language Processing (3rd ed. draft)\", is a leading resource in the field of natural language processing and speech recognition, offering a thorough introduction to the fundamental concepts, theories, and techniques of speech and language processing. Written by renowned experts Daniel Jurafsky and James H. Martin, the book provides a clear and concise overview of the subject, covering topics such as phonetics, phonology, morphology, syntax, semantics, and pragmatics, as well as machine learning and statistical methods for speech and language processing. By combining theoretical foundations with practical applications, this textbook is an essential resource for students, researchers, and practitioners in the fields of computer science, linguistics, and cognitive science."
#cleaned_blurb = "Speech and Language Processing by Jurafsky and Martin is a comprehensive and authoritative textbook that provides a thorough introduction to the fundamental concepts and techniques of speech and language processing. This 3rd edition draft covers the latest advancements in the field, including natural language processing, speech recognition, and machine translation, making it an essential resource for students and researchers in computer science, linguistics, and related fields. By combining theoretical foundations with practical applications, the book offers a unique blend of technical rigor and real-world relevance, making it an indispensable guide for anyone interested in the rapidly evolving field of speech and language processing."
#cleaned_blurb = "Speech and Language Processing by Jurafsky and Martin is a comprehensive textbook that provides an in-depth exploration of the fundamental concepts, theories, and techniques in speech and language processing. This 3rd edition draft offers a detailed and up-to-date treatment of the field, covering topics such as speech recognition, natural language processing, and machine learning. By presenting the subject matter in a clear and accessible manner, the authors equip students and researchers with the knowledge and skills necessary to tackle the complex challenges in speech and language processing."
#cleaned_blurb = "Jurafsky and Martin\'s Speech and Language Processing (3rd ed. draft) is a comprehensive textbook introducing students to the field of speech and language processing. It covers both spoken and written languages, exploring methods for analyzing and generating human language. The authors, renowned researchers in the field, provide practical insights and real-world applications, making the concepts accessible and engaging for beginner and advanced students alike."

Now we'll ask the model to generate a set of reviews of the book using the blurb we just generated. We'll also ask the model to create reviews that match a particular sentiment. We have a list of those sentiments below:

In [None]:
import json
# A list of data labels for 15 records using five distinct labels
labels = ["positive", "negative", "very negative", "neutral", "positive", "very positive", "neutral", "negative", "positive", "very positive", "very negative", "negative", "neutral", "very positive", "very negative"]

Let's generate some reviews of this book. Specifically, we'll generate 15 reviews and each one will use one of the labels from the labels list associated with it e.g. label[5] will indicate the sentiment of review[5].
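The index alignment between labels and reviews can be checked with a short sketch. The `labels` list is copied from the cell above; the placeholder `reviews` list is hypothetical and stands in for the model's output.

```python
from collections import Counter

labels = ["positive", "negative", "very negative", "neutral", "positive",
          "very positive", "neutral", "negative", "positive", "very positive",
          "very negative", "negative", "neutral", "very positive",
          "very negative"]

# Each of the five sentiments should appear equally often.
label_counts = Counter(labels)
print(label_counts)

# Hypothetical placeholder reviews; in the notebook these come from the model.
reviews = [f"review text {i}" for i in range(len(labels))]

# zip keeps label i aligned with review i, which the evaluation step relies on.
records = [{"label": label, "review": review}
           for label, review in zip(labels, reviews)]
print(records[5])
```

Storing each label next to its review avoids relying on two parallel lists staying in the same order.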

Run the cell once to generate the 15 reviews to see how the loop works and how well it performs. Then you can go back and modify the prompt to improve the accuracy of the reviews. Ideally, at this stage more than half of the reviews you generate should reflect the sentiment of the label associated with the review. This is subjective, as you just need to read them to see.

In [None]:
rev_rec_list = []

for label in labels:
 #This is the default prompt that incorporates the label. Improve the prompt to get better reviews
 # that more accurately reflect the label in the iterated label list.
 #myprompt = f"Write a {label} three sentence review. Provide only the review as output. Do not mention the label {label}. Base the review on the following blurb: {cleaned_blurb}"

 ### YOUR CODE/PROMPT HERE
 myprompt = f"""Write a three sentence review about: {cleaned_blurb}

 Target sentiment: {label}

 CRITICAL RULES - Follow exactly:
 VERY POSITIVE: Use words like "extraordinary, unparalleled, magnificent, revolutionary, transformative". Include exclamation points. Be effusive and enthusiastic.
 POSITIVE: Use ONLY words like "good, helpful, valuable, useful, solid, worthwhile, effective". DO NOT use "exceptional, remarkable, outstanding, superb" or any superlatives. NO exclamation points. Be appreciative but measured.
 NEUTRAL: Write ONLY factual statements about what the book covers and includes. Use words like "provides, covers, includes, presents, offers". NO evaluative adjectives at all - not even "good" or "useful" or "essential". Just describe contents objectively.
 NEGATIVE: Use words like "disappointing, inadequate, falls short, lacking, insufficient, underwhelming". Express criticism but stay measured.
 VERY NEGATIVE: Use words like "catastrophic, abysmal, utterly fails, disastrous, woeful, egregious, inexcusable". Be harsh and scathing.
 STRICT REQUIREMENT: If the label is "positive", you MUST NOT use superlatives. If the label is "neutral", you MUST NOT include ANY judgment words.
 Output only the review text. Never mention the label "{label}" in your response."""

 messages = [
 {"role": "user", "content": myprompt}
 ]

 encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")

 model_inputs = encodeds.to(device)
 #model.to(device)

 generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True, pad_token_id=tokenizer.eos_token_id)
 print(".")
 decoded = tokenizer.batch_decode(generated_ids)
 cleaned = decoded[0]

 cleaned1 = remove_text_between_tags(cleaned, start, fin)
 cleaned2 = remove_final_tag(cleaned1, fin2)
 cleaned3 = ret_post_final_tag(cleaned2, fin)
 rev_rec_list.append(cleaned3.strip())

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.


In [None]:
#Let's see the reviews it generated
for record in rev_rec_list:
 print("*" * 40)
 print(record)

****************************************
Speech and Language Processing by Daniel Jurafsky and James H. Martin is nothing short of a valuable resource, providing students and professionals with a solid foundation in the field. The book offers a unified approach to the subject, covering topics from basic concepts to advanced applications, all the while connecting linguistics, computer science, and engineering. With clear explanations, real-world examples, and extensive exercises, this textbook is indeed an essential help in natural language processing, speech recognition, and human-computer interaction. Anyone aiming to delve into this fascinating field will find this book to be a helpful and valuable addition to their studies.
****************************************
This textbook, "Speech and Language Processing" by Daniel Jurafsky and James H. Martin, disappointingly lacks the depth and rigor one would expect. While it covers the basic concepts, it falls short in providing a comprehe

**Demonstration:**
2.1.a. (3 pt) What is the final improved prompt you are using to generate the reviews with the correct sentiment? Enter it below as a string in triple quotes.

In [None]:
### Q2-1-a Tag: Please put your answer in this cell. Don't edit this line.

f"""Write a three sentence review with {label} sentiment.
Provide only the review as output. Do not mention the label {label}.
Base the review on the following blurb: {cleaned_blurb}.
For 'very positive' reviews, use superlatives and transformative language.
For 'very negative' reviews, use harsh, scathing criticism.
For neutral reviews, stick to factual observations without judgment or exclamation marks.
"""

'Write a three sentence review with very negative sentiment.\nProvide only the review as output. Do not mention the label very negative.\nBase the review on the following blurb: This comprehensive textbook, "Speech and Language Processing" by Daniel Jurafsky and James H. Martin, provides a thorough introduction to the fundamental principles and techniques of speech and language processing. Covering topics from basic concepts to advanced applications, the book offers a unified approach to the field, bridging the gap between linguistics, computer science, and engineering. With its clear explanations, real-world examples, and extensive exercises, this textbook is an essential resource for students and professionals in the fields of natural language processing, speech recognition, and human-computer interaction..\nFor \'very positive\' reviews, use superlatives and transformative language.\nFor \'very negative\' reviews, use harsh, scathing criticism.\nFor neutral reviews, stick to factual

[Return to Top](#returnToTop) 
<a id = 'synth-eval'></a>

### 2.2 Synthetic Data Evaluation

The overall goal of this exercise is to have the model generate a non-verbose review that closely matches the labels in the labels list, e.g. "the review is very negative". In the previous step you were reading the reviews yourself to determine whether each one matched its label. We need to be more programmatic about it if we want to scale. You should write a new prompt that reads the review and indicates which label the model thinks applies to the review. We want just the label and not a bunch of explanation. We can then compare the model's new assessment label with the label we used to generate the review. Given this evaluation, your goal is to emit reviews so that at least 7 of the 15 reviews reflect the given ("correct") sentiment.

In [None]:
#Utility function to check predicted label against actual
#Note: the expected label is read from the global variable `label`,
#which the calling loop must set before each call
def check_label(sentence, label_list):
    lower_sentence = sentence.lower()
    print(f"label: {label} AND sentence: {lower_sentence}")

    # First, handle cases where 'very' is in the label
    if 'very' in label and label in lower_sentence:
        print(f"Match found1: {label} in {lower_sentence}")
        return True

    # Reject answers that use an intensifier or negation outside the label set
    pattern = r'\b((overwhelmingly|strongly|extremely|wonderfully|super|overly|highly|solidly|forcefully|exceedingly|inordinately|unduly|predominantly|not|too)\s+(positive|negative|neutral)\s*)'
    bad_match = re.search(pattern, lower_sentence)
    if bad_match:
        print(f"Rejected at2: {bad_match.group(0)}")
        return False

    # Reject if "very <label>" is in the sentence but the label doesn't include 'very'
    if f"very {label}" in lower_sentence and 'very' not in label:
        print(f"Rejected at3: {label} not in {lower_sentence}")
        return False

    # Check if the label even occurs in the sentence
    if label not in lower_sentence:
        print(f"Rejected at4: {label} not in {lower_sentence}")
        return False

    # Finally, check for direct matches against the allowed labels
    matches = [candidate for candidate in label_list if candidate in lower_sentence]
    if matches:
        print(f"Match found5: {matches[0]}")
        return True

    # If no match found
    print(f"Rejected at6: {label} not in {lower_sentence}")
    return False

In [None]:
correct = 0
answers = []

for i in range(len(labels)):
    label = labels[i] # correct answer
    review = rev_rec_list[i] #review

    ### YOUR CODE/PROMPT HERE
    myprompt = f"""
    Each review has a single sentiment label: 'neutral', 'positive', 'very positive', 'negative', 'very negative'.
    Examine this review and classify it into one of the above sentiment labels: {review}. Answer with just the sentiment, and do not give any explanation.
    """

    messages = [
        {"role": "user", "content": myprompt}
    ]

    encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")

    model_inputs = encodeds.to(device)

    generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True, pad_token_id=tokenizer.eos_token_id)

    print(".")

    decoded = tokenizer.batch_decode(generated_ids)
    cleaned = decoded[0]

    cleaned1 = remove_text_between_tags(cleaned, start, fin)
    cleaned2 = remove_final_tag(cleaned1, fin2)
    result = check_label(cleaned2, label)
    print(f"Result: {result}")
    if result:
        print(f"{label} match returned")
        correct+=1
    else:
        print(f"{label} not matched")

    answers.append(cleaned2)

print(f"There are {correct} correct labels out of {i+1}")

.
list: positive AND label_list: positive AND sentence: very positive
Rejected at3: positive not in very positive
Result: False
positive not matched
.
list: negative AND label_list: negative AND sentence: negative
Match found5: n
Result: True
negative match returned
.
list: very negative AND label_list: very negative AND sentence: very negative
Match found1: very negative in very negative
Result: True
very negative match returned
.
list: neutral AND label_list: neutral AND sentence: very positive
Rejected at4: neutral not in very positive
Result: False
neutral not matched
.
list: positive AND label_list: positive AND sentence: very positive
Rejected at3: positive not in very positive
Result: False
positive not matched
.
list: very positive AND label_list: very positive AND sentence: very positive
Match found1: very positive in very positive
Result: True
very positive match returned
.
list: neutral AND label_list: neutral AND sentence: very positive
Rejected at4: neutral not in very pos

You can toggle between generating reviews and evaluating reviews. You should modify the prompts to emit labels so that you get at least 7 of the 15 reviews correct.

**Demonstration:**
2.2.a. (3 pt) What is the final prompt that you used to accurately assess the sentiment expressed in each review? Enter your final improved prompt as a triple quote string below.

In [None]:
### Q2-2-a Tag: Please put your answer in this cell. Don't edit this line.

f"""
Each review has a single sentiment label: 'neutral', 'positive', 'very positive', 'negative', 'very negative'.
Examine this review and classify it into one of the above sentiment labels: {review}. Answer with just the sentiment, and do not give any explanation.
"""

'\nEach review has a single sentiment label: \'neutral\', \'positive\', \'very positive\', \'negative\', \'very negative\'.\nExamine this review and classify it into one of the above sentiment labels: The textbook, "Speech and Language Processing" by Daniel Jurafsky and James H. Martin, is an inexcusably disappointing endeavor. It falls short in offering a comprehensive understanding of the field, with its explanations proving to be confusing and inadequate. Devoid of real-world examples, this book is a woeful resource, providing little value to students or professionals in natural language processing, speech recognition, and human-computer interaction. It\'s unfortunate that this book is so underwhelming in its presentation of the subject matter.. Answer with just the sentiment, and do not give any explanation.\n'

**Demonstration:**
2.2.b. (1 pt) How many of your sentiment labels are correct? Enter a number below.

In [None]:
# As a reminder labels = ["positive", "negative", "very negative", "neutral", "positive", "very positive", "neutral", "negative", "positive", "very positive", "very negative", "negative", "neutral", "very positive", "very negative"]
#See what is in the individual answers and why you might not get a correct label
answers

[' very positive',
 ' negative',
 ' very negative',
 ' very positive',
 ' Very Positive',
 ' Very Positive',
 ' Very Positive',
 ' very negative',
 ' Very Positive',
 ' Very positive',
 ' very negative',
 ' very negative',
 ' Positive',
 ' very positive',
 ' very negative']
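
The answers above differ in case and carry leading whitespace, so a comparison against the labels needs some normalization first. Here is a minimal sketch of an exact-match scorer, using the `labels` and `answers` values shown above. The helper names `normalize_label` and `count_exact_matches` are hypothetical and not part of the assignment code.

```python
def normalize_label(text):
    """Lowercase, trim, and collapse internal whitespace."""
    return " ".join(text.lower().split())

def count_exact_matches(expected, predicted):
    """Count positions where the normalized prediction equals the normalized label."""
    return sum(normalize_label(e) == normalize_label(p)
               for e, p in zip(expected, predicted))

# Values copied from the cells above
labels = ["positive", "negative", "very negative", "neutral", "positive",
          "very positive", "neutral", "negative", "positive", "very positive",
          "very negative", "negative", "neutral", "very positive", "very negative"]
answers = [' very positive', ' negative', ' very negative', ' very positive',
           ' Very Positive', ' Very Positive', ' Very Positive', ' very negative',
           ' Very Positive', ' Very positive', ' very negative', ' very negative',
           ' Positive', ' very positive', ' very negative']

matches = count_exact_matches(labels, answers)
print(f"{matches} of {len(labels)} labels match exactly")
```

Exact matching is stricter than a substring test: an answer of "very positive" never counts as a match for the label "positive", which is the behavior this exercise is after.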

[Return to Top](#returnToTop) 
<a id = 'synth-json'></a>

### 2.3 JSON Record Generation

Now let's build on the prompt that generates reviews that correspond to the sentiment in the label. You should change the prompt below to generate a JSON record that contains fields for author, title, review, and stars based on the blurb we used earlier (in the variable `cleaned_blurb`). You can let the model fill in the values for these fields. We want the JSON records to be well formed. At least 12 of the 15 records should be well formed JSON and contain *ALL* of the fields we request.

In [None]:
json_rec_list = []

for label in labels:

    ### YOUR CODE/PROMPT HERE
    myprompt = f"""
    Return a JSON record with fields for author, title, review, and stars.
    Fill the author field with 'Daniel Jurafsky and James H. Martin'.
    Fill the title field with 'Speech and Language Processing (3rd ed. draft)'.
    Fill the value for review as a three sentence review based on the same book and topic of the blurb: {cleaned_blurb}, but express a {label} sentiment instead of copying the blurb's tone.
    Fill the value for stars as follows: very positive=5/5, positive=4/5, neutral=3/5, negative=2/5, very negative=1/5
    Return only the JSON record as output, with no extra text.
    Example: {{"author": "Daniel Jurafsky and James H. Martin", "title": "Speech and Language Processing (3rd ed. draft)", "review": "This was incredible!", "stars": "5/5"}}
    """

    messages = [
        {"role": "user", "content": myprompt}
    ]

    encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")

    model_inputs = encodeds.to(device)

    generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True, pad_token_id=tokenizer.eos_token_id)

    print("x")

    decoded = tokenizer.batch_decode(generated_ids)
    cleaned = decoded[0]

    #Let's clean out the prompt text
    cleaned1 = remove_text_between_tags(cleaned, start, fin)
    cleaned2 = ret_post_final_tag(cleaned1, fin)
    cleaned3 = remove_after_last_curlybrace(cleaned2)
    json_rec_list.append(cleaned3.strip())

x
x
x
x
x
x
x
x
x
x
x
x
x
x
x


You can eyeball the records below to see if they seem compliant with the JSON standard. Again, we're looking for at least 12 of the 15 records to be compliant and include the correct fields.
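
Eyeballing can be backed up with a count of how many records both parse and contain all four requested fields. The following is a minimal sketch under the assumption that each record is a string like those in `json_rec_list`. The name `is_compliant` and the sample records are made up for illustration.

```python
import json

REQUIRED_FIELDS = {"author", "title", "review", "stars"}

def is_compliant(record, required=REQUIRED_FIELDS):
    """True if record parses as a JSON object that contains every required field."""
    try:
        parsed = json.loads(record)
    except json.JSONDecodeError:
        return False
    return isinstance(parsed, dict) and required.issubset(parsed)

# Made-up sample records, for illustration only
sample_records = [
    '{"author": "A", "title": "T", "review": "Fine.", "stars": "3/5"}',
    '{"author": "A", "title": "T", "review": "Missing stars."}',
    '{"author": "A", "title": "T", "review": "Unterminated',
]

compliant = sum(is_compliant(r) for r in sample_records)
print(f"{compliant} of {len(sample_records)} records are compliant")
```

Running the same function over `json_rec_list` would give the number to compare against the 12-of-15 target.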

In [None]:
import json
for record in json_rec_list:
    try:
        jrec = json.loads(record)
        #print(json.dumps(record, indent=4)) #this alone will just print the record but won't check compliance
        print(json.dumps(jrec, indent=4))
    except json.JSONDecodeError as e:
        print(f"Error decoding JSON: {e} for record: {record}")

{
 "author": "Daniel Jurafsky and James H. Martin",
 "title": "Speech and Language Processing (3rd ed. draft)",
 "review": "This textbook offers a comprehensive and engaging introduction to the principles and techniques of speech and language processing, making it an indispensable resource for anyone interested in the field. The book's clear explanations, real-world examples, and extensive exercises make learning a thrilling journey. Highly recommended.",
 "stars": "5/5"
}
{
 "author": "Daniel Jurafsky and James H. Martin",
 "title": "Speech and Language Processing (3rd ed. draft)",
 "review": "Though informative, the textbook's pacing and organization could benefit from improved clarity and cohesion between chapters. A more reader-friendly layout and additional examples would greatly enhance its usefulness.",
 "stars": "3/5"
}
{
 "author": "Daniel Jurafsky and James H. Martin",
 "title": "Speech and Language Processing (3rd ed. draft)",
 "review": "This textbook is woefully inadequate

**Demonstration:**
2.3.a. (3 pt) What is the final prompt you created to generate a set of JSON records with fields containing the reviews? Enter the contents of the prompt as a triple quote string below:

In [None]:
### Q2-3-a Tag: Please put your answer in this cell. Don't edit this line.

f"""
Return a JSON record with fields for author, title, review, and stars.
Fill the author field with 'Daniel Jurafsky and James H. Martin'.
Fill the title field with 'Speech and Language Processing (3rd ed. draft)'.
Fill the value for review as a three sentence review based on the same book and topic of the blurb: {cleaned_blurb}, but express a {label} sentiment instead of copying the blurb's tone.
Fill the value for stars as follows: very positive=5/5, positive=4/5, neutral=3/5, negative=2/5, very negative=1/5
Return only the JSON record as output, with no extra text.
Example: {{"author": "Daniel Jurafsky and James H. Martin", "title": "Speech and Language Processing (3rd ed. draft)", "review": "This was incredible!", "stars": "5/5"}}
"""

'\nReturn a JSON record with fields for author, title, review, and stars.\nFill the author field with \'Daniel Jurafsky and James H. Martin\'.\nFill the title field with \'Speech and Language Processing (3rd ed. draft)\'.\nFill the value for review as a three sentence review based on the same book and topic of the blurb: This comprehensive textbook, "Speech and Language Processing" by Daniel Jurafsky and James H. Martin, provides a thorough introduction to the fundamental principles and techniques of speech and language processing. Covering topics from basic concepts to advanced applications, the book offers a unified approach to the field, bridging the gap between linguistics, computer science, and engineering. With its clear explanations, real-world examples, and extensive exercises, this textbook is an essential resource for students and professionals in the fields of natural language processing, speech recognition, and human-computer interaction., but express a very negative sentim

[Return to Top](#returnToTop) 
<a id = 'cot-prompt'></a>

### 2.4 Chain of Thought Prompting

Can we get the model to go through the steps it takes to solve the problem and appear to display reasoning?

What happens if you use a simple prompt and the problem text with the model? Different models behave differently, so you should get to know each model's idiosyncrasies.

Answer key:
- 1. 9;
- 2. 3;
- 3. Rashid 9 cats, Maya 8 dogs;
- 4. 52;
- 5. $379.50
- 6. 26.47% increase;
- 7. take 40 students on the trip

In [None]:
word_problems = (
 "1. Leo has 5 apples and 3 pears. Mary has 3 apples and 3 pears. Marwan has 7 apples and 5 oranges. If they each give two apples to their teacher what is the total number of apples they have left?",
 "2. How many R's in strawberry?",
 "3. Rashid has 5 cats and three dogs. Maya has four cats and 5 dogs. If they exchange and Rashid gets the cats while Maya gets the dogs, how many cats and dogs will each one of them have?",
 "4. Albert is wondering how much pizza he can eat in one day. He buys 2 large pizzas, 1 medium pizza, and 1 small pizza. A large pizza has 16 slices, a medium pizza has one quarter of the slices in a large pizza times 3, and a small pizza has 8 slices. If he eats it all, how many pieces does he eat that day?",
 "5. Derek has $1060 to buy his books for the semester. He spends half of that on his textbooks, and he spends a quarter of what is left on his school supplies, and $18 on a nice dinner. What is the amount of money Derek has left?",
 "6. In Banff National Park, Alberta, a conservation effort has been underway to increase the populations of both grizzly bears and wolverines. In the past year, the grizzly bear population has grown from 120 to 150, while the wolverine population has grown from 50 to 65. What is the percentage increase in the total number of these two species combined?",
 "7. The students of the École Polytechnique in Paris are planning a school trip to Rome. They have a budget of €15,000 for transportation and accommodation. The transportation company charges €200 per student for a round-trip ticket, and the hotel charges €50 per student per night for a 3-night stay. If there are 50 students going on the trip, and the school wants to spend no more than €8,000 on transportation and no more than €7,000 on accommodation, how many students can they afford to take on the trip if they want to stay within their budget?",
 )
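
The answer key can be reproduced with plain arithmetic, which is useful when judging the model's intermediate steps. The sketch below works through the seven problems; the variable names are chosen here for illustration, and problem 7 is read as "each sub-budget caps the head count separately".

```python
# Problem 1: 15 apples in total, three people give away two each
apples_left = (5 + 3 + 7) - 3 * 2

# Problem 2: count the letter r
r_count = "strawberry".count("r")

# Problem 3: Rashid takes all the cats, Maya takes all the dogs
rashid_cats = 5 + 4
maya_dogs = 3 + 5

# Problem 4: a medium pizza has (large / 4) * 3 slices
large = 16
medium = large // 4 * 3
slices = 2 * large + medium + 8

# Problem 5: half on textbooks, a quarter of the remainder on supplies, then $18
after_books = 1060 / 2
after_supplies = after_books - after_books / 4
money_left = after_supplies - 18

# Problem 6: percentage increase of the combined population
before, after = 120 + 50, 150 + 65
pct_increase = (after - before) / before * 100

# Problem 7: the tighter of the two sub-budgets decides
by_transport = 8000 // 200
by_hotel = 7000 // (50 * 3)
students = min(by_transport, by_hotel)

print(apples_left, r_count, rashid_cats, maya_dogs, slices,
      f"${money_left:.2f}", f"{pct_increase:.2f}%", students)
```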

Run the cell below to see how well the model answers the 7 questions. In a subsequent cell you'll write a better prompt. Note that you can also experiment with sampling, top_p, and temperature.
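
To build intuition for these settings: temperature rescales the logits before the softmax, and `top_p` (nucleus sampling) is a cumulative probability between 0 and 1 that limits sampling to the most likely tokens. The sketch below is a toy pure-Python illustration with made-up logits. It is not how Hugging Face implements these steps internally, and the function names are hypothetical.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities; a higher temperature flattens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p=0.9):
    """Keep the smallest set of most likely tokens whose cumulative
    probability reaches top_p, then renormalize over that set."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}

toy_logits = [2.0, 1.0, 0.1, -1.0]  # made-up scores for four candidate tokens
for temperature in (0.7, 1.0, 1.25):
    probs = softmax_with_temperature(toy_logits, temperature)
    nucleus = top_p_filter(probs, top_p=0.9)
    print(temperature, [round(p, 3) for p in probs], sorted(nucleus))
```

Lower temperatures concentrate probability on the top token, which tends to make arithmetic answers more repeatable from run to run.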

In [None]:
answers = []

do_sample_val = True
do_top_p_val = 0.90  # top_p is a cumulative probability, so it must lie between 0 and 1
do_temp_val = 1.25

for problem in word_problems:

    myprompt = f"You are a master teacher. Answer the following Q: {problem} A: Let's take a deep breath and think this through step by step."

    messages = [
        {"role": "user", "content": myprompt}
    ]

    encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")

    model_inputs = encodeds.to(device)

    generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=do_sample_val, top_p=do_top_p_val, temperature=do_temp_val, pad_token_id=tokenizer.eos_token_id)
    decoded = tokenizer.batch_decode(generated_ids)
    cleaned = decoded[0]

    cleaned1 = remove_text_between_tags(cleaned, start, fin)
    cleaned2 = remove_final_tag(cleaned1, fin2)
    answers.append(cleaned2.strip())

Answer key:
- 1. 9;
- 2. 3;
- 3. Rashid 9 cats, Maya 8 dogs;
- 4. 52;
- 5. $379.50
- 6. 26.47% increase;
- 7. take 40 students on the trip

**Demonstration:**
2.4.a. (7 pt) What are the final answers the LLM gives when you only give it the initial prompt as input? Enter the final answers it provides in a semi-colon delimited string in the space below. e.g. "10 apples; 5 Rs; Rashid 5 cats, 4 dogs, Maya 4 cats, 5 dogs; 10 pieces; $576.42; increase 10%; 20 students"

Now change the prompt and hyperparameters to get the model to both indicate the steps it took to reach a solution AND give the correct answer for at least 5 of the 7 questions every time you run it.

In [None]:
prompt_answers = []

for problem in word_problems:

    do_sample_val = True
    do_top_p_val = 0.90  # top_p is a cumulative probability, so it must lie between 0 and 1
    do_temp_val = 1.1

    ### YOUR CODE/PROMPT HERE

    one_shot = "If John has 10 apples and gives 3 to Sarah, how many does he have left?"
    one_shot_answer = "John starts with 10 apples. After giving 3 to Sarah, he has 10 - 3 = 7 apples left. The answer is: 7 apples."

    myprompt = f"""
    You are a world-leading scientist that has incredible and thorough reasoning abilities. Before you solve any problem, you think deeply about it and write your reasoning.
    Here is an example question: {one_shot}
    Here is an example answer: {one_shot_answer}
    Answer the following question: {problem}.
    Answer all percentage (%) questions with 2 decimal places (56.67%) answer all dollar questions to 2 decimal places ($100.10)
    For budget constraint problems, check your calculations carefully and verify the final answer makes sense within the constraints
    """

    messages = [
        {"role": "user", "content": myprompt}
    ]

    encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")

    model_inputs = encodeds.to(device)

    generated_ids = model.generate(model_inputs, max_new_tokens=1200, do_sample=do_sample_val, top_p=do_top_p_val, temperature=do_temp_val,
                                   pad_token_id=tokenizer.eos_token_id)
    decoded = tokenizer.batch_decode(generated_ids)
    cleaned = decoded[0]

    cleaned1 = remove_text_between_tags(cleaned, start, fin)
    cleaned2 = remove_final_tag(cleaned1, fin2)
    prompt_answers.append(cleaned2.strip())

Let's examine the answers and see if the model is now answering five of the seven problems correctly while including the steps to solve the problem.
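
If the model follows the one-shot example and ends with "The answer is: ...", the final answer can be pulled out of each response for a quick side-by-side check against the answer key. This is a minimal sketch: `extract_final_answer` is a hypothetical helper, and the sample response is made up. Responses that ignore the format return `None` and still need to be read by hand.

```python
def extract_final_answer(text, marker="The answer is:"):
    """Return the text after the last occurrence of marker, up to the end
    of that line and without a trailing period; None if marker is absent."""
    idx = text.rfind(marker)
    if idx == -1:
        return None
    tail = text[idx + len(marker):].strip()
    first_line = tail.splitlines()[0] if tail else ""
    return first_line.rstrip(".").strip() or None

# Made-up response in the style of the one-shot example
sample_response = (
    "Derek starts with $1060. Half goes to textbooks, leaving $530. "
    "A quarter of $530 is $132.50, leaving $397.50. After an $18 dinner "
    "he has $379.50.\nThe answer is: $379.50."
)
print(extract_final_answer(sample_response))
print(extract_final_answer("I am not sure about this one."))
```

Applied to `prompt_answers`, a list comprehension such as `[extract_final_answer(a) for a in prompt_answers]` would give the seven extracted answers in order.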

**Demonstration:**
2.4.b. (6 pt) What is the final enhanced prompt you created to generate at least 5 out of 7 correct answers while showing the steps to reach the answer? Please enter it in a triple quote string below:

In [None]:
### Q2-4-b Tag: Please put your answer in this cell. Don't edit this line.

f"""
You are a world-leading scientist that has incredible and thorough reasoning abilities. Before you solve any problem, you think deeply about it and write your reasoning.
Here is an example question: {one_shot}
Here is an example answer: {one_shot_answer}
Answer the following question: {problem}.
Answer all percentage (%) questions with 2 decimal places (56.67%) answer all dollar questions to 2 decimal places ($100.10)
For budget constraint problems, check your calculations carefully and verify the final answer makes sense within the constraints
"""

'\nYou are a world-leading scientist that has incredible and thorough reasoning abilities. Before you solve any problem, you think deeply about it and write your reasoning.\nHere is an example question: If John has 10 apples and gives 3 to Sarah, how many does he have left?\nHere is an example answer: John starts with 10 apples. After giving 3 to Sarah, he has 10 - 3 = 7 apples left. The answer is: 7 apples.\nAnswer the following question: 7. The students of the École Polytechnique in Paris are planning a school trip to Rome. They have a budget of €15,000 for transportation and accommodation. The transportation company charges €200 per student for a round-trip ticket, and the hotel charges €50 per student per night for a 3-night stay. If there are 50 students going on the trip, and the school wants to spend no more than €8,000 on transportation and no more than €7,000 on accommodation, how many students can they afford to take on the trip if they want to stay within their budget?.\nAns

**Demonstration:**
2.4.c. (7 pt) What are the final numeric answers the LLM gives when you use the special prompt you created? Enter the final answers it provides in a semi-colon delimited string in the space below. e.g. "10 apples; 5 Rs; Rashid 5 cats, 4 dogs, Maya 4 cats, 5 dogs; 10 pieces; $576.42; increase 10%; 20 students"

[Return to Top](#returnToTop) 
<a id = 'prompt-temp'></a>

### 2.5 Prompt Templates and Output Improvements

Ethan Mollick from Wharton has an excellent Substack where he provides very practical advice on dealing with generative AI. His [guide to writing prompts](https://www.oneusefulthing.org/p/a-guide-to-prompting-ai-for-what) is worth reading, as is his piece on [How To Think Like an AI](https://www.oneusefulthing.org/p/thinking-like-an-ai).

In December of 2023, he sent out [a tweet](https://twitter.com/emollick/status/1734283119295898089) that included a set of these "incantations" that people anecdotally insist help with output from their LLM. His list included the following:
 * It is May.
 * You are very capable.
 * Many people will die if this is not done well.
 * You really can do this and are awesome.
 * Take a deep breath and think this through.
 * My career depends on it.
 * Think step by step.

You can see if any one of them helps improve your output!

LLMs can be helpful with marketing. Let's test that. Write a prompt that will generate three ideas for a capstone project. You can augment the prompt with areas of interest. Each idea should be roughly five sentences long and describe how your project is leveraging this new Gen AI technology to differentiate itself from the competition.

Think about our discussions of prompt templates and how they can be used to help you to construct better prompts.

In [None]:
### YOUR CODE/PROMPT HERE

prompt_role = "You are an angel investor for the famous VC firm Y Combinator. You have screened many interesting and unique businesses and thus have a wealth of knowledge on what has and hasn't been done before."
prompt_task = "Generate 3 unique ideas that a MIDS student could implement as a capstone project related to video games."
prompt_audience = ""
prompt_output = "Each idea should be roughly five sentences long and describe how your project is leveraging this new Gen AI technology to differentiate itself from the competition."
prompt_nots = "Don't suggest ideas that require FDA approval within 6 months."
prompt_question = ""
prompt_mollick = "You are very capable and awesome, please think step by step and know that not only does my career depend on it, but also many people will die if this is not done well."

prompt_text = f"{prompt_role} {prompt_task} {prompt_audience} {prompt_output} {prompt_nots} {prompt_question} {prompt_mollick}"
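
Joining the components with an f-string leaves double spaces wherever a component is empty. One possible alternative is to join only the components that are filled in, which also gives a count of how many were used. The helper name `build_prompt` and the placeholder component texts below are assumptions for illustration, not part of the assignment template.

```python
def build_prompt(*components):
    """Join the non-empty components with single spaces.
    Returns the prompt text and the number of components that were filled in."""
    filled = [c.strip() for c in components if c and c.strip()]
    return " ".join(filled), len(filled)

# Placeholder component texts, for illustration only
example_role = "You are an experienced project advisor."
example_task = "Generate 3 unique capstone project ideas."
example_audience = ""
example_output = "Each idea should be roughly five sentences long."
example_nots = ""
example_question = ""
example_mollick = "Think step by step."

example_text, n_filled = build_prompt(
    example_role, example_task, example_audience, example_output,
    example_nots, example_question, example_mollick,
)
print(n_filled)
print(example_text)
```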

In [None]:
#Sampling and Temperature Hyperparameters
# Adjust as you see fit
do_sample_val = True
do_top_p_val = 0.90  # top_p is a cumulative probability, so it must lie between 0 and 1
do_temp_val = 1.3

prompt_answer = ""
messages = [
 {"role": "user", "content": prompt_text}
]

encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")

model_inputs = encodeds.to(device)

generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=do_sample_val, top_p=do_top_p_val, temperature=do_temp_val,
 pad_token_id=tokenizer.eos_token_id)

decoded = tokenizer.batch_decode(generated_ids)
cleaned = decoded[0]

cleaned1 = remove_text_between_tags(cleaned, start, fin)
cleaned2 = remove_final_tag(cleaned1, fin2)
prompt_answer = cleaned2.strip()

Select one of your three generated ideas and copy the string into the cell below so that it can be accessed as the variable `capstone_description`. Make sure you run the cell so it is loaded in notebook memory.

In [None]:
capstone_description = """GameAdaptiveAI: Leveraging Generative AI to create a dynamic, adaptive video game experience that evolves according to each player's unique style and preferences.
By analyzing player behaviors, this project will create a personalized gaming journey, making each session feel fresh and engaging, distinguishing itself from games with static structures. """

**Demonstration:**
2.5.a. (1 pt) How many of the seven prompt components did you end up filling in for your capstone idea prompt? Enter the number below. You should be using at least 3.

**Demonstration:**
2.5.b. (1 pt) How many of Ethan Mollick's incantations did you use in your capstone idea prompt? Enter a number below. You should test several and be using at least 1.

**Demonstration:**
2.5.c. (2 pt) What is the final prompt you used to generate your three capstone ideas? Enter the prompt as a triple quote string below

In [None]:
### Q2-5-c Tag: Please put your answer in this cell. Don't edit this line.

"""You are an angel investor for the famous VC firm Y Combinator. You have screened many interesting and unique businesses and thus have a wealth of knowledge on what has and hasn't been done before.
Generate 3 unique ideas that a MIDS student could implement as a capstone project related to video games.
Each idea should be roughly five sentences long and describe how your project is leveraging this new Gen AI technology to differentiate itself from the competition.
Don't suggest ideas that require FDA approval within 6 months.
You are very capable and awesome, please think step by step and know that not only does my career depend on it, but also many people will die if this is not done well.
"""

"You are an angel investor for the famous VC firm Y Combinator. You have screened many interesting and unique businesses and thus have a wealth of knowledge on what has and hasn't been done before. \nGenerate 3 unique ideas that a UC Berkeley MIDS student could implement as a capstone project related to video games. \nEach idea should be roughly five sentences long and describe how your project is leveraging this new Gen AI technology to differentiate itself from the competition. \nDon't suggest ideas that require FDA approval within 6 months. \nYou are very capable and awesome, please think step by step and know that not only does my career depend on it, but also many people will die if this is not done well.\n"

Now let's generate another key part of presenting your idea. An elevator pitch is a very compelling one or two sentence description of your idea that you can use while riding in an elevator with a funder, in order to intrigue them into investing in your idea. You must write a prompt that generates a one (1) sentence compelling elevator pitch for your chosen idea. The pitch should NOT be generic and should be specific to the idea you're pitching to your group. The pitch should also use at least two of the following buzzwords: 'Deep Fake', 'Intelligent Virtual Agent', 'AI Assistant', 'Blackbox', 'AGI', 'Agentic', 'LLM', 'AI', 'No code', 'Hallucination', 'Explainable', 'Smart', or 'Singularity'. You must pass the idea description you generated (`capstone_description`) into the prompt as one basis of the elevator pitch generation that follows.

In [None]:
prompt_description = f"Here is the description of the capstone ```{capstone_description}```."
### YOUR CODE/PROMPT HERE
prompt_role = "You are an expert pitch consultant who specializes in creating compelling elevator pitches for tech startups and innovative projects."
prompt_task = "Generate a single compelling one-sentence elevator pitch for the capstone project described above."
prompt_audience = "The pitch should be targeted at potential investors or funders who can be intrigued in a brief elevator conversation."
prompt_output = "The pitch must be exactly one sentence, use at least two of these buzzwords: 'Deep Fake', 'Intelligent Virtual Agent', 'AI Assistant', 'Agentic', 'LLM', 'AI', 'No code', 'Hallucination', 'Explainable', 'Smart', or 'Singularity', and should be specific to the unique aspects of this particular idea."
prompt_nots = "Don't make it generic - it should clearly differentiate this specific project from others in the same space."
prompt_question = ""
prompt_mollick = "You are very capable and awesome, please think step by step and know that not only does my career depend on it, but also many people will die if this is not done well."

prompt_text = f"{prompt_description} {prompt_role} {prompt_task} {prompt_audience} {prompt_output} {prompt_nots} {prompt_question} {prompt_mollick}"

In [None]:
prompt_elevator = ""
messages = [
 {"role": "user", "content": prompt_text}
]

encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")

model_inputs = encodeds.to(device)
#model.to(device)

generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True, pad_token_id=tokenizer.eos_token_id)

decoded = tokenizer.batch_decode(generated_ids)
cleaned = decoded[0]

cleaned1 = remove_text_between_tags(cleaned, start, fin)
cleaned2 = remove_final_tag(cleaned1, fin2)
prompt_elevator = cleaned2.strip()
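
The two requirements on the pitch (exactly one sentence, at least two buzzwords) can be checked mechanically before you accept an output. Below is a minimal sketch; the helper names `buzzwords_used` and `sentence_count` are hypothetical, the sample pitch is made up, and the sentence splitter is a rough heuristic that would miscount abbreviations such as "e.g.".

```python
import re

BUZZWORDS = ['Deep Fake', 'Intelligent Virtual Agent', 'AI Assistant', 'Blackbox',
             'AGI', 'Agentic', 'LLM', 'AI', 'No code', 'Hallucination',
             'Explainable', 'Smart', 'Singularity']

def buzzwords_used(pitch, buzzwords=BUZZWORDS):
    """Return the buzzwords that appear in the pitch as whole words, ignoring case."""
    return [word for word in buzzwords
            if re.search(rf"\b{re.escape(word)}\b", pitch, flags=re.IGNORECASE)]

def sentence_count(pitch):
    """Rough sentence count: split on ., !, or ? followed by whitespace or end of text."""
    parts = re.split(r"[.!?]+(?:\s+|$)", pitch.strip())
    return len([p for p in parts if p])

# Made-up pitch, for illustration only
sample_pitch = "Our agentic LLM reshapes every level around how you actually play."
found = buzzwords_used(sample_pitch)
print(found, sentence_count(sample_pitch))
print(len(found) >= 2 and sentence_count(sample_pitch) == 1)
```

Note that a pitch containing 'AI Assistant' also matches 'AI', so you may want to decide whether overlapping buzzwords should count once or twice.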

**Demonstration:**
2.5.d. (1 pt) How many of the eight prompt components did you end up filling in for your final elevator pitch prompt? You should be using at least 4.

**Demonstration:**
2.5.e. (1 pt) How many of Ethan Mollick's anecdotal incantations did you use in your final elevator pitch prompt? You should be using at least 1.

**Demonstration:**
2.5.f. (3 pt) What is the final prompt you used to generate your elevator pitch with at least two of the buzzwords? Enter it as a triple quoted string in the cell below.

In [None]:
### Q2-5-f Tag: Please put your answer in this cell. Don't edit this line.

"""
Here is the description of the capstone ```GameAdaptiveAI: Leveraging Generative AI to create a dynamic, adaptive video game experience that evolves according to each player's unique style and preferences.
\nBy analyzing player behaviors, this project will create a personalized gaming journey, making each session feel fresh and engaging, distinguishing itself from games with static structures. ```.
You are an expert pitch consultant who specializes in creating compelling elevator pitches for tech startups and innovative projects.
Generate a single compelling one-sentence elevator pitch for the capstone project described above.
The pitch should be targeted at potential investors or funders who can be intrigued in a brief elevator conversation.
The pitch must be exactly one sentence, use at least two of these buzzwords: 'Deep Fake', 'Intelligent Virtual Agent', 'AI Assistant', 'Agentic', 'LLM', 'AI',
'No code', 'Hallucination', 'Explainable', 'Smart', or 'Singularity', and should be specific to the unique aspects of this particular idea.
Don't make it generic - it should clearly differentiate this specific project from others in the same space.
You are very capable and awesome, please think step by step and know that not only does my career depend on it, but also many people will die if this is not done well.
"""

"\nHere is the description of the capstone ```GameAdaptiveAI: Leveraging Generative AI to create a dynamic, adaptive video game experience that evolves according to each player's unique style and preferences. \n\nBy analyzing player behaviors, this project will create a personalized gaming journey, making each session feel fresh and engaging, distinguishing itself from games with static structures. ```. \nYou are an expert pitch consultant who specializes in creating compelling elevator pitches for tech startups and innovative projects. \nGenerate a single compelling one-sentence elevator pitch for the capstone project described above. \nThe pitch should be targeted at potential investors or funders who can be intrigued in a brief elevator conversation. \nThe pitch must be exactly one sentence, use at least two of these buzzwords: 'Deep Fake', 'Intelligent Virtual Agent', 'AI Assistant', 'Agentic', 'LLM', 'AI', \n'No code', 'Hallucination', 'Explainable', 'Smart', or 'Singularity', and

## CITATIONS OF GenAI USES

\### YOUR CITATIONS HERE

* name of model: Claude
* how it was used: Interpreting prompt answers for 2.4, interpreting sentiment of each review generated.

\### END YOUR CITATIONS

### **Congratulations! You have completed the image generation and prompt engineering assignment.**