# Day 1 - Examples

In this notebook we cover a few common example scenarios in which we can use an LLM.

In [1]:
# Load environment variables
from dotenv import load_dotenv

load_dotenv("../../.env")

True

In [2]:
from tools import llm_call

## Example 1: Chatbot

via GPT-4

First we take a look at making a simple chatbot similar to ChatGPT. By retaining the chat history in memory and sending it to the model at each turn, we can maintain an entire conversation!
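To make the idea of retained chat history concrete, here is a minimal sketch of one way the bookkeeping could work. It is not the implementation of `llm_call`; the names `chat_turn` and `stub_model` and the role/content message layout are assumptions made for illustration, and the stub stands in for a real model call.

```python
from typing import Callable, Dict, List, Optional, Tuple

Message = Dict[str, str]


def chat_turn(
    model_fn: Callable[[List[Message]], str],
    prompt: str,
    system_prompt: str,
    chat_history: Optional[List[Message]] = None,
) -> Tuple[str, List[Message]]:
    """Run one chat turn and return the reply plus the updated history."""
    if chat_history:
        history = list(chat_history)  # copy so the caller's list is not mutated
    else:
        history = [{"role": "system", "content": system_prompt}]
    history.append({"role": "user", "content": prompt})
    reply = model_fn(history)  # the model sees every earlier turn
    history.append({"role": "assistant", "content": reply})
    return reply, history


def stub_model(messages: List[Message]) -> str:
    """Hypothetical stand-in for a real model call."""
    user_turns = sum(1 for message in messages if message["role"] == "user")
    return f"(reply to user turn {user_turns})"


system_prompt = "You are a helpful chat assistant."
reply, history = chat_turn(
    stub_model, "Suggest some places to visit for a vacation.", system_prompt
)
reply, history = chat_turn(
    stub_model,
    "Tokyo sounds great! What are the best places to visit there?",
    system_prompt,
    chat_history=history,
)
print(len(history))
```

Because the whole history is resent on every turn, the prompt grows with the conversation, so long chats eventually need truncation or summarization of older turns.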

In [3]:
system_prompt = "You are a helpful chat assistant."
prompt = "Suggest some places to visit for a vacation."

print("------SYSTEM PROMPT-------")
print(system_prompt)
print("----------PROMPT----------")
print(prompt)
print("---------RESPONSE---------")
result, chat_history = llm_call(model="gpt-4", prompt=prompt, system_prompt=system_prompt, return_chat_history=True)
print(result)

------SYSTEM PROMPT-------
You are a helpful chat assistant.
----------PROMPT----------
Suggest some places to visit for a vacation.
---------RESPONSE---------
Sure, here are some wonderful places you might consider for your next vacation based on different preferences:

1. Nature & Wildlife: 
   - The Serengeti, Tanzania
   - The Amazon Rainforest, Brazil
   - Yellowstone National Park, USA
   - Great Barrier Reef, Australia
   - Galapagos Islands, Ecuador

2. Adventure: 
   - Mount Everest, Nepal
   - The Inca Trail, Peru
   - Queenstown, New Zealand
   - Interlaken, Switzerland
   - Moab, USA

3. History & Culture: 
   - Rome, Italy
   - Cairo, Egypt
   - Athens, Greece
   - Kyoto, Japan
   - Istanbul, Turkey

4. Relaxation: 
   - Maldives
   - Maui, Hawaii
   - Phuket, Thailand
   - Amalfi Coast, Italy
   - Santorini, Greece
   
5. Modern Architecture & Shopping:
   - New York, USA
   - Dubai, UAE
   - Paris, France
   - Tokyo, Japan
   - London, UK
   
It's hard to give a comprehe

In [4]:
system_prompt = "You are a helpful chat assistant."
prompt = "Tokyo sounds great! What are the best places to visit there?"

print("------SYSTEM PROMPT-------")
print(system_prompt)
print("----------PROMPT----------")
print(prompt)
print("---------RESPONSE---------")
result, chat_history = llm_call(model="gpt-4", prompt=prompt, system_prompt=system_prompt, return_chat_history=True, chat_history=chat_history)
print(result)

------SYSTEM PROMPT-------
You are a helpful chat assistant.
----------PROMPT----------
Tokyo sounds great! What are the best places to visit there?
---------RESPONSE---------
Absolutely, Tokyo is a fantastic city! Here are some of the top places to visit:

1. **Tokyo Disneyland and DisneySea**: Both parks are uniquely Japanese, DisneySea being a one of a kind themed park.

2. **Senso-ji Temple**: This is one of Tokyo's most colorful and popular temples, located in Asakusa.

3. **Meiji Shrine**: Surrounded by beautiful forest, it's dedicated to the deified spirits of Emperor Meiji and his wife.

4. **Shinjuku Gyoen National Garden**: Provides a beautiful escape from the city. Particularly appealing during cherry blossom season.

5. **Tokyo Sky Tree**: The tallest structure in Japan, providing stunning views of the city.

6. **Tsukiji Fish Market**: Must visit for sushi lovers. The outer market has numerous food stalls.

7. **Akihabara**: A district famous for its many electronics shops

In [5]:
system_prompt = "You are a helpful chat assistant."
prompt = "Tell me more about the history of the 8th option."

print("------SYSTEM PROMPT-------")
print(system_prompt)
print("----------PROMPT----------")
print(prompt)
print("---------RESPONSE---------")
result, chat_history = llm_call(model="gpt-4", prompt=prompt, system_prompt=system_prompt, return_chat_history=True, chat_history=chat_history)
print(result)

------SYSTEM PROMPT-------
You are a helpful chat assistant.
----------PROMPT----------
Tell me more about the history of the 8th option.
---------RESPONSE---------
Sure, the 8th option refers to the Imperial Palace, which is a large park-like area located in the Chiyoda ward of Tokyo and contains several buildings including the main palace, the private residences of the Imperial Family, an archive, museums, and administrative offices.

The Imperial Palace, also known as the "Kyūjō" in Japanese, was completed in 1888 and was formerly known as Edo Castle (Edo-jō). It was the seat of the Tokugawa shogun who ruled Japan from 1603 until 1867. Edo Castle used to be the site of the shogun’s palace. 

In 1868, the shogunate was overthrown, and the country's capital and Imperial Residence were moved from Kyoto to Tokyo, previously known as Edo. Edo Castle became the Imperial Castle (皇城, Kōjō). The palace was destroyed during World War II and was rebuilt in the same style, afterwards.

The cast

## Example 2: Classification

Here we consider two problems: a multi-class news classification problem and a three-class sentiment classification problem.

### News Classification

via PaLM 2 (text-bison)

We set up a template here telling the model to classify news headlines into one of five categories. Notice that we have provided the model with a number of examples of what the output should look like. This is an example of the few-shot prompting we discussed earlier!

In [22]:
# Text Bison does not support system prompts, so the instructions and examples go into a single prompt
prompt = """
Classify the given news headlines into one of the following categories: [business, entertainment, health, sports, technology]

Text: Pixel 7 Pro Expert Hands On Review. 
Category: technology 

Text: Quit smoking? 
Category: health 

Text: Birdies or bogeys? Top 5 tips to hit under par 
Category: sports 

Text: Relief from local minimum-wage hike looking more remote 
Category: business 

Text: Introducing Apple Vision Pro: Apple’s first spatial computer
Category: 
"""

In [23]:
print("------SYSTEM PROMPT-------")
print(system_prompt)
print("----------PROMPT----------")
print(prompt)
print("---------RESPONSE---------")
print(llm_call(model="text-bison", prompt=prompt, system_prompt=system_prompt))

------SYSTEM PROMPT-------

You are a helpful chat assistant. Always follow these instructions:
1. Given a document, you must extract the following information:
    a. Name of the company.
    b. Founders of the company.
    c. Founding date.
    d. Date of Series A funding.
2. If the information is not in the document, output "Not Specified"
3. Return the output in a JSON format.

----------PROMPT----------

Classify the given news headlines into one of the following categories: [business, entertainment, health, sports, technology]

Text: Pixel 7 Pro Expert Hands On Review. 
Category: technology 

Text: Quit smoking? 
Category: health 

Text: Birdies or bogeys? Top 5 tips to hit under par 
Category: sports 

Text: Relief from local minimum-wage hike looking more remote 
Category: business 

Text: Introducing Apple Vision Pro: Apple’s first spatial computer
Category: 

---------RESPONSE---------
technology
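
The few-shot prompt above was written by hand. When the examples live in a list, the same layout can be assembled programmatically. The sketch below is only an illustration: `build_few_shot_prompt` is a hypothetical helper, not part of `tools`.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble a few-shot prompt in the Text/Category layout used above."""
    lines = [instruction, ""]
    for text, category in examples:
        lines.append(f"Text: {text}")
        lines.append(f"Category: {category}")
        lines.append("")
    # Leave the final category empty so the model fills it in.
    lines.append(f"Text: {query}")
    lines.append("Category:")
    return "\n".join(lines)


instruction = (
    "Classify the given news headlines into one of the following categories: "
    "[business, entertainment, health, sports, technology]"
)
examples = [
    ("Pixel 7 Pro Expert Hands On Review.", "technology"),
    ("Quit smoking?", "health"),
    ("Birdies or bogeys? Top 5 tips to hit under par", "sports"),
    ("Relief from local minimum-wage hike looking more remote", "business"),
]
few_shot_prompt = build_few_shot_prompt(
    instruction,
    examples,
    "Introducing Apple Vision Pro: Apple's first spatial computer",
)
print(few_shot_prompt)
```

Keeping the examples in a list makes it easy to swap them or to check how the number of shots affects accuracy.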


### Sentiment Classification

via Claude 2

In [30]:
# Claude 2 does not rely on a separate system prompt, so the instruction goes in the Human turn.
# It also requires the "\n\nHuman: ... \n\nAssistant:" turn formatting.
prompt = """\n\nHuman: Classify the sentiment of any given reviews as "positive", "neutral" or "negative" with a single word.
Sentence: I loved the new Spider-Man movie!! The animation was really fluid.
\n\nAssistant:"""

In [31]:
print("----------PROMPT----------")
print(prompt)
print("---------RESPONSE---------")
print(llm_call(model="claude-v2", prompt=prompt, system_prompt=system_prompt, parameters={"max_tokens_to_sample": 256}))

----------PROMPT----------


Human: Classify the sentiment of any given reviews as "positive", "neutral" or "negative" with a single word.
Sentence: I loved the new Spider-Man movie!! The animation was really fluid.


Assistant:
---------RESPONSE---------
 positive


## Example 3: Summarization

via LLaMa 2 70B

As the name suggests, we simply ask the model to summarize a document.

In [11]:
system_prompt = """
You are a helpful chat assistant that summarizes given documents into a maximum of 3 sentences.
"""

In [12]:
prompt = """
Overview:
Tech Solutions Inc. is a leading technology consulting firm specializing in providing innovative solutions to businesses across various industries. We offer a comprehensive range of services including software development, IT consulting, project management, and cybersecurity solutions. 
With a strong focus on delivering exceptional quality and customer satisfaction, we have established ourselves as a trusted partner for organizations seeking digital transformation.
They were founded on April 12, 2005 and raise their first seed in 2007 and series A on May 25, 2007.

Founders:

Background: John Smith is a visionary entrepreneur with over 20 years of experience in the technology industry. He has a deep understanding of market trends and has successfully led several software development projects for multinational corporations.
Role in the Company: As a co-founder of Tech Solutions Inc., John Smith plays a pivotal role in shaping the company's strategic direction. His expertise in software development and leadership skills have been instrumental in driving the company's growth.
Sarah Johnson:

Background: Sarah Johnson is a highly accomplished technologist with a strong background in software engineering. She has extensive experience in managing complex IT projects and has a proven track record of delivering innovative solutions.
Role in the Company: As a co-founder of Tech Solutions Inc., Sarah Johnson leads the company's technical operations. Her deep knowledge of software engineering principles and commitment to excellence have been crucial in establishing the company as a leader in the industry.
Together, John Smith and Sarah Johnson founded Tech Solutions Inc. with the aim of providing cutting-edge technology solutions to help businesses thrive in the digital age. Their combined expertise and passion for innovation have been instrumental in the company's success.
"""

In [13]:
print("------SYSTEM PROMPT-------")
print(system_prompt)
print("----------PROMPT----------")
print(prompt)
print("---------RESPONSE---------")
print(llm_call(model="Llama-2-70b-chat-hf", prompt=prompt, system_prompt=system_prompt))

------SYSTEM PROMPT-------

You are a helpful chat assistant that summarizes given documents into a maximum of 3 sentences.

----------PROMPT----------

Overview:
Tech Solutions Inc. is a leading technology consulting firm specializing in providing innovative solutions to businesses across various industries. We offer a comprehensive range of services including software development, IT consulting, project management, and cybersecurity solutions. 
With a strong focus on delivering exceptional quality and customer satisfaction, we have established ourselves as a trusted partner for organizations seeking digital transformation.
They were founded on April 12, 2005 and raise their first seed in 2007 and series A on May 25, 2007.

Founders:

Background: John Smith is a visionary entrepreneur with over 20 years of experience in the technology industry. He has a deep understanding of market trends and has successfully led several software development projects for multinational corporations.
Ro

## Example 4: Entity Extraction

via LLaMa 2 13B

In the following example, we illustrate how to prompt the model to extract information from documents. We also ask for the output in JSON format so that it is usable in downstream applications.

In [32]:
system_prompt = """
You are a helpful chat assistant. Always follow these instructions:
1. Given a document, you must extract the following information:
    a. Name of the company.
    b. Founders of the company.
    c. Founding date.
    d. Date of Series A funding.
2. If the information is not in the document, output "Not Specified"
3. Return the output in a JSON format.
"""

In [33]:
prompt = """
Overview:
Tech Solutions Inc. is a leading technology consulting firm specializing in providing innovative solutions to businesses across various industries. We offer a comprehensive range of services including software development, IT consulting, project management, and cybersecurity solutions. 
With a strong focus on delivering exceptional quality and customer satisfaction, we have established ourselves as a trusted partner for organizations seeking digital transformation.
They were founded on April 12, 2005 and raise their first seed in 2007 and series A on May 25, 2007.

Founders:

Background: John Smith is a visionary entrepreneur with over 20 years of experience in the technology industry. He has a deep understanding of market trends and has successfully led several software development projects for multinational corporations.
Role in the Company: As a co-founder of Tech Solutions Inc., John Smith plays a pivotal role in shaping the company's strategic direction. His expertise in software development and leadership skills have been instrumental in driving the company's growth.
Sarah Johnson:

Background: Sarah Johnson is a highly accomplished technologist with a strong background in software engineering. She has extensive experience in managing complex IT projects and has a proven track record of delivering innovative solutions.
Role in the Company: As a co-founder of Tech Solutions Inc., Sarah Johnson leads the company's technical operations. Her deep knowledge of software engineering principles and commitment to excellence have been crucial in establishing the company as a leader in the industry.
Together, John Smith and Sarah Johnson founded Tech Solutions Inc. with the aim of providing cutting-edge technology solutions to help businesses thrive in the digital age. Their combined expertise and passion for innovation have been instrumental in the company's success.
"""

In [34]:
print("------SYSTEM PROMPT-------")
print(system_prompt)
print("----------PROMPT----------")
print(prompt)
print("---------RESPONSE---------")
print(llm_call(model="Llama-2-13b-chat-hf", prompt=prompt, system_prompt=system_prompt))

------SYSTEM PROMPT-------

You are a helpful chat assistant. Always follow these instructions:
1. Given a document, you must extract the following information:
    a. Name of the company.
    b. Founders of the company.
    c. Founding date.
    d. Date of Series A funding.
2. If the information is not in the document, output "Not Specified"
3. Return the output in a JSON format.

----------PROMPT----------

Overview:
Tech Solutions Inc. is a leading technology consulting firm specializing in providing innovative solutions to businesses across various industries. We offer a comprehensive range of services including software development, IT consulting, project management, and cybersecurity solutions. 
With a strong focus on delivering exceptional quality and customer satisfaction, we have established ourselves as a trusted partner for organizations seeking digital transformation.
They were founded on April 12, 2005 and raise their first seed in 2007 and series A on May 25, 2007.

Found

In practice, we can add rules and conditions to the system prompt to structure the output the way we want. For instance, we could offer possible values for each field we extract, say `founding_date: <2000-2010, 2010-2020, or 2020-2030>`. We can even provide an entire JSON template for the LLM to return. In the next notebook we take this a little further and demonstrate the function calling capabilities of the GPT line of models.
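
Before JSON output reaches a downstream application, it usually helps to parse it defensively. The sketch below assumes the model may wrap the JSON in extra text and may omit fields. The field names in `REQUIRED_FIELDS` and the `raw_output` string are made up for illustration; they are not the response recorded above.

```python
import json

# Hypothetical field names; the prompt above does not fix the JSON keys.
REQUIRED_FIELDS = ("company_name", "founders", "founding_date", "series_a_date")


def parse_extraction(raw_output):
    """Pull the first JSON object out of the model output and fill in gaps."""
    start = raw_output.find("{")
    end = raw_output.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    record = json.loads(raw_output[start : end + 1])
    # Mirror rule 2 of the system prompt: missing fields become "Not Specified".
    return {field: record.get(field, "Not Specified") for field in REQUIRED_FIELDS}


# Made-up model output with surrounding chatter and one missing field.
raw_output = """Here is the extracted information:
{"company_name": "Tech Solutions Inc.", "founders": ["John Smith", "Sarah Johnson"], "founding_date": "April 12, 2005"}
"""
parsed = parse_extraction(raw_output)
print(parsed)
```

If `json.loads` raises, a common fallback is to retry the call or to send the error message back to the model and ask for corrected JSON.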

## Example 5: Code Generation

We consider two examples in this area. The first is SQL generation and the second is Python generation.

### SQL Generation

via PaLM 2 (chat-bison)

In [35]:
system_prompt = """
You are a database analyst. Always follow these instructions:
1. Given SQL database schemas, return queries to answer the following questions:
    a. How many employees are in the sales department?
    b. Which employee earns the most?
    c. Which is the biggest department?
"""

In [36]:
prompt = """
Employee(id, name, department_id)
Department(id, name, address)
Salary(employee_id, amount)
"""

In [37]:
print("------SYSTEM PROMPT-------")
print(system_prompt)
print("----------PROMPT----------")
print(prompt)
print("---------RESPONSE---------")
print(llm_call(model="chat-bison", prompt=prompt, system_prompt=system_prompt))

------SYSTEM PROMPT-------

You are a database analyst. Always follow these instructions:
1. Given SQL database schemas, return queries to answer the following questions:
    a. How many employees are in the sales department?
    b. Which employee earns the most?
    c. Which is the biggest department?

----------PROMPT----------

Employee(id, name, department_id)
Department(id, name, address)
Salary(employee_id, amount)

---------RESPONSE---------
a. ```
SELECT COUNT(*)
FROM Employee
WHERE department_id = (
  SELECT id
  FROM Department
  WHERE name = 'Sales'
);
```

b. ```
SELECT name, amount
FROM Employee
JOIN Salary
ON Employee.id = Salary.employee_id
ORDER BY amount DESC
LIMIT 1;
```

c. ```
SELECT name, COUNT(*)
FROM Employee
JOIN Department
ON Employee.department_id = Department.id
GROUP BY Department.name
ORDER BY COUNT(*) DESC
LIMIT 1;
```


While this seems right at first glance, it may not be a good idea to use this SQL as-is. Whenever we generate code, it is important to add validation and security checks. For instance, we should ensure that the queries cannot modify the database. We also need some way to check whether each query is actually valid SQL. In tomorrow's session we go over Agents and Tools, which can help with this.
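
As a rough illustration of such checks, the sketch below uses Python's built-in `sqlite3` module. It accepts only a single `SELECT` statement and asks SQLite to compile it with `EXPLAIN` against an empty in-memory copy of the schema, so nothing runs against real data. The column types in `SCHEMA` and the helper name `check_query` are assumptions, and the string checks are deliberately conservative (for example, a semicolon inside a string literal is rejected too).

```python
import sqlite3

# Assumed column types; the prompt above only lists column names.
SCHEMA = """
CREATE TABLE Department(id INTEGER PRIMARY KEY, name TEXT, address TEXT);
CREATE TABLE Employee(id INTEGER PRIMARY KEY, name TEXT, department_id INTEGER);
CREATE TABLE Salary(employee_id INTEGER, amount REAL);
"""


def check_query(sql):
    """Return (ok, reason) for a generated query."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        return False, "multiple statements are not allowed"
    if not statement.lower().startswith("select"):
        return False, "only SELECT statements are allowed"
    connection = sqlite3.connect(":memory:")
    try:
        connection.executescript(SCHEMA)
        # EXPLAIN compiles the statement without executing it.
        connection.execute("EXPLAIN " + statement)
    except sqlite3.Error as error:
        return False, f"invalid SQL: {error}"
    finally:
        connection.close()
    return True, "ok"


query_a = """
SELECT COUNT(*)
FROM Employee
WHERE department_id = (
  SELECT id
  FROM Department
  WHERE name = 'Sales'
);
"""
query_c = """
SELECT name, COUNT(*)
FROM Employee
JOIN Department
ON Employee.department_id = Department.id
GROUP BY Department.name
ORDER BY COUNT(*) DESC
LIMIT 1;
"""
print(check_query(query_a))
print(check_query(query_c))
print(check_query("DROP TABLE Employee;"))
```

Under SQLite, query (c) from the response above is rejected: the unqualified `name` column exists in both `Employee` and `Department`, so it is ambiguous. Other database engines may differ in dialect, so the check is best run against the same engine that will execute the query, using a read-only connection or role.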

### Python Generation

via GPT-3.5-turbo

In [38]:
system_prompt = """
You are a coding assistant. Always follow these instructions:
1. Given a prompt, return code in Python to complete the task.
2. Ensure that the code if properly formatted with illustrative variable names.
"""

In [39]:
prompt = """How do I find all prime numbers?"""

In [40]:
print("------SYSTEM PROMPT-------")
print(system_prompt)
print("----------PROMPT----------")
print(prompt)
print("---------RESPONSE---------")
print(llm_call(model="gpt-3.5-turbo", prompt=prompt, system_prompt=system_prompt))

------SYSTEM PROMPT-------

You are a coding assistant. Always follow these instructions:
1. Given a prompt, return code in Python to complete the task.
2. Ensure that the code if properly formatted with illustrative variable names.

----------PROMPT----------
How do I find all prime numbers?
---------RESPONSE---------
To find all prime numbers, you can use the Sieve of Eratosthenes algorithm. This algorithm works by iteratively marking the multiples of each prime number, starting from 2, as composite (not prime). Here's an example code that implements the Sieve of Eratosthenes algorithm:

```python
def find_all_primes(n):
    # Create a boolean array "is_prime[0..n]" and initialize
    # all entries it as true. A value in is_prime[i] will
    # finally be false if i is Not a prime, else true.
    is_prime = [True] * (n+1)
    is_prime[0] = is_prime[1] = False

    # Iterate through all numbers from 2 to sqrt(n)
    for i in range(2, int(n**0.5) + 1):
        if is_prime[i]:
          

As before, it's important to validate this code and ensure it is safe and doesn't have any bugs. In tomorrow's session we go over Agents and Tools which can help with this.
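
One cheap first check, sketched below under the assumption that the generated code arrives as a string, is to parse it with the standard `ast` module before running anything. This catches syntax errors and lets us reject imports we did not expect. The helper name `check_generated_python` is hypothetical, and a static check like this says nothing about whether the code is correct, so tests are still needed.

```python
import ast


def check_generated_python(source, allowed_imports=frozenset()):
    """Return (ok, reason) after a static look at generated Python code."""
    try:
        tree = ast.parse(source)  # parses only; nothing is executed
    except SyntaxError as error:
        return False, f"syntax error: {error.msg}"
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [(node.module or "").split(".")[0]]
        else:
            continue
        for name in names:
            if name not in allowed_imports:
                return False, f"import of '{name}' is not allowed"
    return True, "ok"


safe_code = "def double(value):\n    return 2 * value\n"
risky_code = "import os\nos.remove('data.csv')\n"
broken_code = "def broken(:\n    pass\n"

print(check_generated_python(safe_code))
print(check_generated_python(risky_code))
print(check_generated_python(broken_code)[0])
```

Code that passes this check should still be executed only in a sandbox and compared against known inputs and outputs.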

## Evaluation & Answer Spaces

Something to keep in mind while working on these problems is evaluation. Broadly, we can think of problems as falling into one of two categories:

**Natural Language Generation (NLG)**: 

Natural Language Generation implies an unconstrained answer space where the model can output pretty much anything. Tasks in this category include summarization, code generation and chatbots. In these scenarios, it can be difficult to determine whether an answer is correct. There are solutions to this kind of problem, though, and we will go into the details in future sessions.
  
**Natural Language Understanding (NLU)**: 

With NLU, the set of possible answers is constrained. Tasks in this category include classification and entity extraction. In these scenarios, there's a finite set of answers that we care about. For example, in sentiment analysis we're looking for one of positive, negative and neutral. Models in this class are easy to evaluate because we can use quantitative metrics such as Precision and Recall.
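
As a small worked example of these metrics, the sketch below computes precision and recall for one label from lists of true and predicted labels. The labels are invented for illustration.

```python
def precision_recall(true_labels, predicted_labels, label):
    """Compute precision and recall for a single label."""
    pairs = list(zip(true_labels, predicted_labels))
    true_positives = sum(1 for t, p in pairs if t == label and p == label)
    false_positives = sum(1 for t, p in pairs if t != label and p == label)
    false_negatives = sum(1 for t, p in pairs if t == label and p != label)
    predicted_total = true_positives + false_positives
    actual_total = true_positives + false_negatives
    precision = true_positives / predicted_total if predicted_total else 0.0
    recall = true_positives / actual_total if actual_total else 0.0
    return precision, recall


# Invented labels for five reviews.
true_labels = ["positive", "negative", "neutral", "positive", "negative"]
predicted_labels = ["positive", "positive", "neutral", "positive", "neutral"]

precision, recall = precision_recall(true_labels, predicted_labels, "positive")
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here the model labels three reviews as positive and two of them are correct, so precision is 2/3, while both truly positive reviews are found, so recall is 1.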

With open source models, it is fairly easy to restrict the prediction to the answers we want by calculating the conditional probability of each possible answer. That is, for each possible answer X, we calculate `P(X|input)`. The answer with the highest probability is then the prediction.
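
The selection step can be sketched as follows. The scoring function is a stub that returns fixed log-probabilities, since a real one would query a model for `log P(X|input)`. The names `pick_answer` and `toy_log_prob` are hypothetical, and the renormalization over the candidate set is an optional extra that does not change which answer wins.

```python
import math


def pick_answer(score_fn, model_input, candidates):
    """Score every candidate answer and return the most likely one."""
    log_probs = {answer: score_fn(model_input, answer) for answer in candidates}
    # Renormalize over the candidates only (optional; argmax is unchanged).
    highest = max(log_probs.values())
    weights = {answer: math.exp(lp - highest) for answer, lp in log_probs.items()}
    total = sum(weights.values())
    probabilities = {answer: weight / total for answer, weight in weights.items()}
    best_answer = max(probabilities, key=probabilities.get)
    return best_answer, probabilities


def toy_log_prob(model_input, answer):
    """Stub scorer with made-up values standing in for log P(answer | input)."""
    made_up_scores = {"positive": -0.2, "neutral": -2.5, "negative": -3.0}
    return made_up_scores[answer]


best_answer, probabilities = pick_answer(
    toy_log_prob,
    "I loved the new Spider-Man movie!!",
    ["positive", "neutral", "negative"],
)
print(best_answer)
```

Besides the prediction, the normalized probabilities give a confidence score that can be thresholded or used for calibration.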

However, this approach is not possible for closed source models where we only receive direct predictions. While we unfortunately don't have access to the actual confidence of a prediction, we can still get the prediction itself and evaluate the model. This does, however, require us to ensure that the model gives us outputs in a consistent format that we can work with, such as JSON or a single-word output, as we've seen in the examples above.
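
Even single-word outputs tend to arrive with stray whitespace, capitalization or punctuation, so a small normalization step is useful before scoring. The helper below is a hypothetical sketch: anything that does not map onto an allowed label is returned as `None` and can be counted as an error or retried.

```python
def normalize_label(raw_output, allowed_labels):
    """Map raw model output onto one of the allowed labels, or None."""
    cleaned = raw_output.strip().strip(".!\"'").lower()
    return cleaned if cleaned in allowed_labels else None


allowed_labels = {"positive", "neutral", "negative"}
print(normalize_label(" positive", allowed_labels))
print(normalize_label("Positive.\n", allowed_labels))
print(normalize_label("I think the review is great", allowed_labels))
```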