# Prompt Engineering with LLMs using Gemini

**Learning Objectives**

1. Learn how to use the Google Gen AI SDK to call Gemini
1. Learn how to set up the Gemini parameters
1. Learn prompt engineering for text generation
1. Learn prompt engineering for chat applications


The Google Gen AI SDK lets you test, customize, and deploy instances of Google's Gemini large language models (LLMs) so that you can use Gemini's capabilities in your applications. The Gemini family of models supports text completion, multi-turn chat, and text embedding generation.

This notebook provides examples of accessing pre-trained Gemini models with the SDK for use cases like text classification, summarization, extraction, and chat.

## Setup

In [None]:
from google import genai
from google.genai.types import Content, Part
from IPython.display import Markdown

## Text generation

The cells below create a Gen AI SDK client and define the helper function `generate`, which streams responses from Gemini using `client.models.generate_content_stream`. <br>
If you don't need streaming output, use `client.models.generate_content` instead.

In [None]:
client = genai.Client(vertexai=True, location="us-central1")

In [None]:
def generate(
    prompt,
    model_name="gemini-2.0-flash-001",
):
    """Stream a Gemini response for `prompt`; returns an iterator of chunks."""
    responses = client.models.generate_content_stream(
        model=model_name, contents=prompt
    )
    return responses

In [None]:
responses = generate(
    "What are five important things to understand about large language models?"
)
for response in responses:
    print(response.text, end="")
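If streaming is not required, a non-streaming variant of the helper might look like the sketch below, which also shows one way to pass generation parameters. The names `build_generation_config` and `generate_once` are hypothetical, the parameter values are arbitrary, and the sketch assumes the SDK accepts a plain dictionary for `config` (a `google.genai.types.GenerateContentConfig` object can be used instead).

```python
def build_generation_config(temperature=0.2, top_p=0.95, max_output_tokens=256):
    """Collect common sampling parameters in a plain dict (illustrative values)."""
    return {
        "temperature": temperature,  # lower values make output more deterministic
        "top_p": top_p,  # nucleus sampling threshold
        "max_output_tokens": max_output_tokens,  # upper bound on response length
    }


def generate_once(client, prompt, model_name="gemini-2.0-flash-001", **params):
    """Return the complete response text from a single, non-streaming call."""
    response = client.models.generate_content(
        model=model_name,
        contents=prompt,
        config=build_generation_config(**params),  # assumption: dict config accepted
    )
    return response.text
```

With the `client` created above, `generate_once(client, "Name three uses of LLMs.", temperature=0.0)` would return the whole answer as a single string.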

### Text Classification 

Now that we've tested our wrapper function, let us explore prompting for classification. Text classification is a common machine learning use case, frequently applied to tasks like spam detection, sentiment analysis, and topic classification.

Both **zero-shot** and **few-shot** prompting are common in text classification. In zero-shot prompting, the prompt contains no labeled examples, only the task description and the text to classify. In few-shot prompting, the prompt also includes a few labeled examples.
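The difference between the two styles can be sketched with a pair of prompt builders. This is a minimal illustration on a sentiment task (deliberately not one of the exercises below); the function names, the prompt wording, and the sample reviews are all assumptions made for the example.

```python
def zero_shot_prompt(text, labels):
    """Build a prompt that names the labels but gives no labeled examples."""
    label_list = ", ".join(labels)
    return (
        f"Classify the text into one of these categories: {label_list}.\n"
        f"Text: {text}\n"
        "Category:"
    )


def few_shot_prompt(text, labels, examples):
    """Build a prompt that prepends labeled (text, label) examples."""
    label_list = ", ".join(labels)
    lines = [f"Classify the text into one of these categories: {label_list}.", ""]
    for example_text, example_label in examples:
        lines += [f"Text: {example_text}", f"Category: {example_label}", ""]
    # End with the unlabeled text so the model completes the final category.
    lines += [f"Text: {text}", "Category:"]
    return "\n".join(lines)


labels = ["positive", "negative"]
examples = [("I loved this film.", "positive"), ("The plot was dull.", "negative")]
print(zero_shot_prompt("The acting was superb.", labels))
print(few_shot_prompt("The acting was superb.", labels, examples))
```

Either string can be passed to `generate`. Because the few-shot examples always answer with a bare label, they also nudge the model toward returning only the label.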

Let us start with zero-shot classification:

**Exercise**

Write a zero-shot prompt that categorizes a text into one of the following categories: "technology", "politics", and "sport":

In [None]:
prompt = """
TODO - Fill in your prompt here
"""

responses = generate(prompt)
for response in responses:
    print(response.text, end="")

**Exercise**

Write a few-shot prompt for classification. Besides improving the accuracy of the model, few-shot prompting gives you some control over the output format. The prompt should classify a text into the categories "dogs" and "cats", and return only these categories.

In [None]:
prompt = """
What is the topic for a given text? 
- cats 
- dogs 

Text: #TODO
The answer is: #TODO

Text: #TODO
The answer is: #TODO

#TODO - add more examples
"""

responses = generate(prompt)
for response in responses:
    print(response.text, end="")

### Text Summarization

**Exercise**

Gemini can also be used for text summarization use cases. Text summarization produces a concise and fluent summary of a longer text document. 
In the cell below, write a prompt that can summarize the following text:

```
A transformer is a deep learning model. It is distinguished by its adoption of self-attention, 
differentially weighting the significance of each part of the input (which includes the 
recursive output) data. Like recurrent neural networks (RNNs), transformers are designed to 
process sequential input data, such as natural language, with applications towards tasks such 
as translation and text summarization. However, unlike RNNs, transformers process the 
entire input all at once. The attention mechanism provides context for any position in the 
input sequence. For example, if the input data is a natural language sentence, the transformer 
does not have to process one word at a time. This allows for more parallelization than RNNs 
and therefore reduces training times.
```

In [None]:
prompt = """
TODO
"""
responses = generate(prompt)
for response in responses:
    print(response.text, end="")

**Exercise**

Modify the prompt in the cell above so that it outputs a four-bullet-point summary of the text.

In [None]:
prompt = """
#TODO

Summary:

"""
responses = generate(prompt)
for response in responses:
    print(response.text, end="")

**Exercise**

Consider the following dialog between a customer and service representative:

```
Customer: Hi! I'm reaching out to customer service because I am having issues.

Service Rep: What seems to be the problem? 

Customer: I am trying to use Gemini but I keep getting an error. 

Service Rep: Can you share the error with me? 

Customer: Sure. The error says: "ResourceExhausted: 429 Quota exceeded for 
      aiplatform.googleapis.com/online_prediction_requests_per_base_model 
      with base model: gemini-2.0-flash-001"
      
Service Rep: It looks like you have exceeded the quota for usage. Please refer to 
             https://cloud.google.com/vertex-ai/docs/quotas for information about quotas
             and limits. 
             
Customer: Can you increase my quota?

Service Rep: I cannot, but let me follow up with somebody who will be able to help.

```

Write a prompt that produces a short summary of the conversation along with to-do items for 
the support representative:

In [None]:
prompt = """
#TODO
"""

responses = generate(prompt)
for response in responses:
    print(response.text, end="")

### Text Extraction 

Gemini can be used to extract and structure text. One common purpose is to convert documents into a machine-readable format, which is useful for storing them in a database or processing them with software. Another is to pull specific pieces of information out of a document, for example to look up a particular fact or to summarize its content.
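When the extracted structure is requested as JSON, the response still arrives as text and has to be parsed before software can use it. The sketch below shows one possible way to do that; `parse_json_response` is a hypothetical helper, the sample output is hard-coded for illustration, and the sketch assumes the model may or may not wrap its JSON in a Markdown code fence.

```python
import json
import re

FENCE = "`" * 3  # Markdown code-fence marker, built here to keep this snippet readable
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_json_response(raw_text):
    """Parse JSON from model output, tolerating an optional Markdown code fence."""
    match = FENCED_BLOCK.search(raw_text)
    payload = match.group(1) if match else raw_text
    return json.loads(payload.strip())


# Hypothetical model output, hard-coded for illustration only.
raw = FENCE + 'json\n{"city": "Paris", "country": "France"}\n' + FENCE
record = parse_json_response(raw)
print(record["city"])
```

With the streaming `generate` helper, the chunks would first be joined, for example `"".join(r.text or "" for r in responses)`, and the result passed to the parser. `json.loads` raises `json.JSONDecodeError` if the model returns malformed JSON, so production code would likely want to handle that case.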

**Exercise**

Consider the following recipe:

```
Ingredients:
* 1 tablespoon olive oil
* 1 onion, chopped
* 2 carrots, chopped
* 2 celery stalks, chopped
* 1 teaspoon ground cumin
* 1/2 teaspoon ground coriander
* 1/4 teaspoon turmeric powder
* 1/4 teaspoon cayenne pepper (optional)
* Salt and pepper to taste
* 1 (15 ounce) can black beans, rinsed and drained
* 1 (15 ounce) can kidney beans, rinsed and drained
* 1 (14.5 ounce) can diced tomatoes, undrained
* 1 (10 ounce) can diced tomatoes with green chilies, undrained
* 4 cups vegetable broth
* 1 cup chopped fresh cilantro
```

Write a zero-shot prompt that can return the ingredients in JSON format with keys:
"ingredient", "quantity", and "type".

In [None]:
prompt = """
TODO
"""
responses = generate(prompt)
for response in responses:
    print(response.text, end="")

**Exercise**

Consider a product description such as:

```
 Google Nest WiFi, network speed up to 1200Mbps, 2.4GHz and 5GHz frequencies, WPA3 protocol
```

Write a few-shot prompt that outputs the product characteristics in JSON format, for example:

```python
JSON: {
  "product":"Google Nest WiFi",
  "speed":"1200Mbps",
  "frequencies": ["2.4GHz", "5GHz"],
  "protocol":"WPA3"
}
```

In [None]:
prompt = """
#TODO
"""

responses = generate(prompt)
for response in responses:
    print(response.text, end="")

## Prompt engineering for chat

The Gen AI SDK's chat interface is designed for multi-turn chat, in which the model tracks the history of a conversation and uses that history as context for its responses.

Gemini enables you to have freeform conversations across multiple turns. The `Chat` class simplifies the process by managing the state of the conversation, so unlike with `generate_content`, you do not have to store the conversation history as a list.
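For comparison, managing the conversation history by hand with `generate_content` might look roughly like the sketch below. The helper names are hypothetical, and the sketch assumes the SDK accepts plain dictionaries with `role` and `parts` keys in `contents`.

```python
def add_turn(history, role, text):
    """Append one message to a manually managed history list, in place."""
    history.append({"role": role, "parts": [{"text": text}]})
    return history


def ask_with_manual_history(
    client, history, question, model_name="gemini-2.0-flash-001"
):
    """Send the whole history plus a new question, then record both turns."""
    add_turn(history, "user", question)
    response = client.models.generate_content(model=model_name, contents=history)
    # Store the model's answer so the next call sees the full conversation.
    add_turn(history, "model", response.text)
    return response.text
```

Every call has to resend the full list and remember to append the reply, which is the bookkeeping a chat session takes care of.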

Let's initialize the chat:

In [None]:
chat = client.chats.create(model="gemini-2.0-flash-001")
chat

The `Chat.send_message` method returns the same `GenerateContentResponse` type as `client.models.generate_content`. It also appends your message and the response to the chat history:

In [None]:
response = chat.send_message(
    "In one sentence, explain how a computer works to a young child."
)
response.text

Recall that within a chat session, history is preserved. This enables the model to remember things within a given chat session for context. You can get this history with the `get_history()` method of the chat session object. Notice that the history is simply a list of `Content` objects that alternate between the `user` and `model` roles.

In [None]:
chat.get_history()
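The raw history objects are verbose when displayed directly. A small formatter along the following lines can make the transcript easier to read; `format_history` is a hypothetical helper, and it assumes each history entry exposes a `role` and a list of `parts` with an optional `text` attribute, as the `Content` and `Part` types imported above do.

```python
def format_history(history):
    """Render a list of Content-like objects as 'role: text' lines."""
    lines = []
    for content in history:
        # Join the text parts of each message; skip parts that carry no text.
        text = " ".join(
            part.text for part in (content.parts or []) if getattr(part, "text", None)
        )
        lines.append(f"{content.role}: {text}")
    return "\n".join(lines)
```

For the session above, this would be called as `print(format_history(chat.get_history()))`.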

**Exercise**

Ask a follow-up question to our previous question using the `send_message()` method.

In [None]:
response = #TODO
response.text

### Adding chat history

You can seed a chat with an existing conversation by passing a `history` list whose messages alternate between the `user` and `model` roles. System-style instructions can be placed at the start of the first user message.

In [None]:
chat2 = client.chats.create(
    model="gemini-2.0-flash-001",
    history=[
        Content(
            role="user",
            parts=[
                Part.from_text(
                    text="""My name is Ned. You are my personal assistant. My favorite movies are Lord of the Rings and Hobbit. Who do you work for?"""
                )
            ],
        ),
        Content(role="model", parts=[Part.from_text(text="I work for Ned.")]),
        Content(role="user", parts=[Part.from_text(text="What do I like?")]),
        Content(
            role="model",
            parts=[Part.from_text(text="Ned likes watching movies.")],
        ),
    ],
)

response = chat2.send_message("Are my favorite movies based on a book series?")
Markdown(response.text)

In [None]:
response = chat2.send_message("When were these books published?")
Markdown(response.text)
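As an alternative to placing system-style instructions in the first user message, the arguments for a seeded chat could be assembled as in the sketch below. This is an assumption-laden illustration: `build_chat_kwargs` is a hypothetical helper, the role check is a convenience added here rather than an SDK requirement, and the sketch assumes `client.chats.create` accepts a dictionary `config` with a `system_instruction` key and dictionary-style history entries.

```python
def build_chat_kwargs(system_instruction, turns, model_name="gemini-2.0-flash-001"):
    """Assemble keyword arguments for client.chats.create from (role, text) pairs."""
    roles = [role for role, _ in turns]
    expected = ["user", "model"] * (len(turns) // 2 + 1)
    # Check that the turns alternate user, model, user, ... before building.
    if roles != expected[: len(roles)]:
        raise ValueError("history must alternate user, model, user, ...")
    return {
        "model": model_name,
        "config": {"system_instruction": system_instruction},
        "history": [
            {"role": role, "parts": [{"text": text}]} for role, text in turns
        ],
    }
```

A chat could then be created with `client.chats.create(**build_chat_kwargs("You are Ned's personal assistant.", [("user", "My favorite movies are Lord of the Rings and Hobbit."), ("model", "Noted.")]))`.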

Copyright 2024 Google LLC

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

     https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.