In [1]:
# Copyright 2024 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

## Overview

This notebook covers the essentials of prompt engineering, including some best practices.

Learn more about prompt design in the [official documentation](https://cloud.google.com/vertex-ai/docs/generative-ai/text/text-overview).


## Getting Started

### Install Vertex AI SDK and other required packages


In [2]:
%pip install --upgrade --user --quiet google-cloud-aiplatform

Note: you may need to restart the kernel to use updated packages.


### Restart runtime

To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which will restart the current kernel.

In [3]:
import IPython

app = IPython.Application.instance()
app.kernel.do_shutdown(True)

{'status': 'ok', 'restart': True}

<div class="alert alert-block alert-warning">
<b>⚠️ The kernel is going to restart. Please wait until it is finished before continuing to the next step. ⚠️</b>
</div>


### Authenticate your notebook environment (Colab only)

Authenticate your environment on Google Colab.


In [1]:
import sys

if "google.colab" in sys.modules:
    from google.colab import auth

    auth.authenticate_user()

### Set Google Cloud project information and initialize Vertex AI SDK

To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).

Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment).

In [None]:
PROJECT_ID = "<YOUR_PROJECT_ID>"  # @param {type:"string"}
LOCATION = "<YOUR_LOCATION>"  # @param {type:"string"}

import vertexai

vertexai.init(project=PROJECT_ID, location=LOCATION)

In [22]:
from vertexai.generative_models import GenerationConfig, GenerativeModel
import time

### Load model

In [4]:
model = GenerativeModel("gemini-1.5-flash")

In [None]:
import time


def generate_gemini_response(user_prompt, config=GenerationConfig(temperature=1.0)):
    """Generate content using the Gemini model, retrying with exponential backoff."""
    retry_interval = 1  # Seconds to wait before the first retry.
    while True:
        try:
            return model.generate_content(user_prompt, generation_config=config).text
        except Exception as error:
            print(f"Error occurred during content generation: {error}. Retrying...")
            time.sleep(retry_interval)
            retry_interval *= 2  # Double the wait after each failure.


def send_gemini_message(chat_model, user_message):
    """Send a message to a Gemini chat session, retrying with exponential backoff."""
    retry_interval = 1  # Seconds to wait before the first retry.
    while True:
        try:
            return chat_model.send_message(user_message).text
        except Exception as error:
            print(f"Error occurred while sending message: {error}. Retrying...")
            time.sleep(retry_interval)
            retry_interval *= 2  # Double the wait after each failure.
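
The helpers above retry indefinitely, which can hang a notebook when the error is permanent (for example, a wrong project ID). A minimal sketch of a bounded variant follows. The function name `retry_with_backoff` and its parameters `max_attempts`, `initial_delay`, and `sleep` are illustrative assumptions, not part of the Vertex AI SDK.

```python
import time


def retry_with_backoff(call, max_attempts=5, initial_delay=1.0, sleep=time.sleep):
    """Run `call()` until it succeeds or `max_attempts` attempts have failed.

    All parameters are illustrative; `sleep` is injectable so the wait can be
    replaced in tests.
    """
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception as error:
            if attempt == max_attempts:
                raise  # Give up and surface the last error to the caller.
            print(f"Attempt {attempt} failed: {error}. Retrying in {delay}s...")
            sleep(delay)
            delay *= 2  # Double the wait after each failure.
```

With the notebook's model loaded, this could wrap a call such as `retry_with_backoff(lambda: model.generate_content(prompt).text)`.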


## Prompt engineering best practices

Prompt engineering is the practice of designing prompts so that the model's response matches what you intended.

The idea of using "unfancy" prompts is to minimize the noise in your prompt to reduce the possibility of the LLM misinterpreting the intent of the prompt. Below are a few guidelines on how to engineer "unfancy" prompts.

In this section, you'll learn the following best practices for engineering prompts:

* Be concise
* Be specific and well-defined
* Ask one task at a time
* Watch out for hallucinations
* Turn generative tasks into classification tasks to reduce output variability
* Improve response quality by including examples

### Be concise

🛑 Not recommended. The prompt below is unnecessarily verbose.

In [None]:
startup_naming_prompt = "What do you think could be a good name for a startup that specializes in AI-driven healthcare solutions?"


startup_name_suggestion = generate_gemini_response(startup_naming_prompt)
print("Suggested startup name based on the prompt:")
print(startup_name_suggestion)



✅ Recommended. The prompt below is to the point and concise.

In [None]:
prompt = "Suggest a name for a startup that specializes in AI-driven healthcare solutions."

print(generate_gemini_response(prompt))


### Be specific and well-defined

Suppose that you want to find out what sets the model apart from other AI models.

🛑 The prompt below might be a bit too generic (which is certainly OK if you'd like to ask a generic question!)

In [None]:
a_prompt = "Tell me about yourself"

print(generate_gemini_response(a_prompt))


✅ Recommended. The prompt below is specific and well-defined.

In [None]:
b_prompt = "Generate a list of ways that make you unique compared to other AI models."

print(generate_gemini_response(b_prompt))


### Ask one task at a time

🛑 Not recommended. The prompt below has two parts to the question that could be asked separately.

In [None]:
whats_a_prompt = "What's the best way to cook food and why are the clouds white?"

print(generate_gemini_response(whats_a_prompt))


✅ Recommended. The prompts below ask one task at a time.

In [None]:
boiling_prompt = "What's the best method of boiling water?"

print(generate_gemini_response(boiling_prompt))


In [None]:
sky_prompt = "Why is the sky blue?"

print(generate_gemini_response(sky_prompt))

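
When several single-task prompts belong together, they can be sent one by one and the answers collected. The sketch below is a hypothetical helper: `ask_separately` is not part of the notebook, and `generate` stands for any function that maps a prompt string to a response string, such as `generate_gemini_response`.

```python
def ask_separately(prompts, generate):
    """Send each single-task prompt on its own and map it to its response."""
    return {prompt: generate(prompt) for prompt in prompts}


single_task_prompts = [
    "What's the best method of boiling water?",
    "Why is the sky blue?",
]
# With the notebook's helper defined:
# answers = ask_separately(single_task_prompts, generate_gemini_response)
```

Keeping each question in its own request also makes it easier to retry or rephrase one question without touching the other.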

### Watch out for hallucinations

Although LLMs have been trained on a large amount of data, they can generate text containing statements not grounded in truth or reality. These responses are often referred to as "hallucinations", and they arise in part from the models' limited memorization capabilities. Note that simply prompting the LLM to provide a citation does not fix the problem, as LLMs are known to provide false or inaccurate citations. Dealing with hallucinations is a fundamental challenge of LLMs and an ongoing research area, so keep in mind that LLMs may give you confident, correct-sounding statements that are in fact incorrect.

Note that if you intend to use LLMs for creative use cases, hallucinations can actually be quite useful.

Try a prompt like the one below repeatedly. The temperature is set to 1.0 so that the model takes more risks in its choices. It may give an inaccurate but confident answer.

In [None]:
gen_conf = GenerationConfig(temperature=1.0)

date_prompt = "What day is it today?"

print(generate_gemini_response(date_prompt, gen_conf))


Since LLMs do not have access to real-time information without further integrations, you may have noticed it hallucinates what day it is today in some of the outputs.
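
One simple way to reduce this particular hallucination is to put the missing fact into the prompt yourself. The sketch below is an assumption about how that could look, not a cell from the original notebook; `build_dated_prompt` is a hypothetical helper name.

```python
import datetime


def build_dated_prompt(question, today=None):
    """Prefix a question with the current date so the model need not guess it."""
    today = today or datetime.date.today()
    return f"Today's date is {today.isoformat()}.\n\n{question}"


dated_prompt = build_dated_prompt("What day is it today?")
print(dated_prompt)
# With the notebook's helper defined:
# print(generate_gemini_response(dated_prompt))
```

The same pattern applies to any fact the model cannot know on its own: supply it as context instead of relying on the model's memory.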

Now let us pretend to be a user who asks the chatbot a question that is unrelated to travel.

You can see that a guardrail in the prompt prevents the chatbot from veering off course.
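
The cells that set up the travel chatbot are not shown here, so the following is only a hypothetical sketch of what a guardrail in the prompt could look like. The guardrail wording, the `TRAVEL_GUARDRAIL` constant, and the `build_guarded_prompt` helper are all assumptions introduced for illustration.

```python
# Hypothetical guardrail text; the wording is an assumption, not taken from the notebook.
TRAVEL_GUARDRAIL = (
    "You are an AI chatbot for a travel website. "
    "Before answering, check whether the question is about travel. "
    "If it is not, reply exactly: Sorry, I can't answer that question."
)


def build_guarded_prompt(user_question, guardrail=TRAVEL_GUARDRAIL):
    """Prepend the guardrail instructions to a user question."""
    return f"{guardrail}\n\nUser question: {user_question}"


off_topic_prompt = build_guarded_prompt("How do I make pizza dough at home?")
print(off_topic_prompt)
# With the notebook's helper defined:
# print(generate_gemini_response(off_topic_prompt))
```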

### Turn generative tasks into classification tasks to reduce output variability

#### Generative tasks lead to higher output variability

The prompt below results in an open-ended response, which is useful for brainstorming, but the response is highly variable.

In [None]:
generative_prompt = "I'm a college student. Recommend me a programming activity to improve upon myself."

print(generate_gemini_response(generative_prompt))


#### Classification tasks reduce output variability

The prompt below results in a choice and may be useful if you want the output to be easier to control.

In [None]:
classification_prompt = """I'm a college student. Which of these activities would you suggest and why:
a) learn C++
b) learn JavaScript
c) learn Assembly
"""

print(generate_gemini_response(classification_prompt))

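
A further benefit of the classification form is that the answer can be checked programmatically. The sketch below builds a lettered prompt and maps a response back to one of the allowed options. Both helper names are hypothetical, and the parser assumes the response begins with a letter followed by `)`, which the model is asked for but not guaranteed to produce.

```python
import re
import string


def build_choice_prompt(question, options):
    """Format a question followed by lettered options: a), b), c), ..."""
    lines = [question]
    for letter, option in zip(string.ascii_lowercase, options):
        lines.append(f"{letter}) {option}")
    lines.append("Answer with the letter of one option, then a short reason.")
    return "\n".join(lines)


def parse_choice(response, options):
    """Return the option whose letter starts the response, or None if there is none."""
    match = re.match(r"\s*\(?([a-z])\)", response.strip().lower())
    if not match:
        return None
    index = string.ascii_lowercase.index(match.group(1))
    return options[index] if index < len(options) else None


activities = ["learn C++", "learn JavaScript", "learn Assembly"]
choice_prompt = build_choice_prompt(
    "I'm a college student. Which of these activities would you suggest and why:",
    activities,
)
print(choice_prompt)
```

A `None` result signals that the response did not follow the expected format, so the caller can retry or fall back to manual review.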

### Improve response quality by including examples

Another way to improve response quality is to add examples to your prompt. The LLM learns in context from the examples how to respond. Typically, one to five examples (shots) are enough to improve the quality of responses. Including too many examples can cause the model to overfit to the examples and reduce the quality of responses.

Similar to classical model training, the quality and distribution of the examples are very important. Pick examples that are representative of the scenarios that you need the model to learn, and keep the distribution of the examples (e.g. the number of examples per class in the case of classification) aligned with your actual distribution.
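
A quick way to check the distribution of your examples is to count labels before building the prompt. This is a minimal sketch; `label_shares` is a hypothetical helper and the sample captions are made up for illustration.

```python
from collections import Counter


def label_shares(examples):
    """Return each label's share of a list of (text, label) examples."""
    counts = Counter(label for _, label in examples)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Illustrative examples only; replace with examples drawn from your own data.
sample_examples = [
    ("Great tutorial, thanks!", "positive"),
    ("Loved it.", "positive"),
    ("The audio was terrible.", "negative"),
    ("Uploaded on Tuesday.", "neutral"),
]
print(label_shares(sample_examples))
```

Comparing these shares with the label shares you expect in production shows whether the examples over-represent one class.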

#### Zero-shot prompt

Below is an example of zero-shot prompting, where you don't provide any examples to the LLM within the prompt itself.

In [None]:
lets_prompt = """Decide whether a caption's sentiment is positive, neutral, or negative.

Caption: I loved the new gork video you made!
Sentiment:
"""

print(generate_gemini_response(lets_prompt))


#### One-shot prompt

Below is an example of one-shot prompting, where you provide one example to the LLM within the prompt to give some guidance on what type of response you want.

In [None]:
prompt = """Decide whether a caption's sentiment is positive, neutral, or negative.

Caption: I loved the new gork video you made!
Sentiment: positive

Caption: That was awful. Super boring 😠
Sentiment:
"""

print(generate_gemini_response(prompt))


#### Choosing between zero-shot, one-shot, few-shot prompting methods

Which prompting technique to use depends solely on your goal. Zero-shot prompts are more open-ended and can give you creative answers, while one-shot and few-shot prompts teach the model how to behave, so you get more predictable answers that are consistent with the examples provided.
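
A few-shot prompt follows the same pattern as the one-shot prompt, with more labeled examples. The sketch below assembles one from a list of examples; `build_few_shot_prompt` is a hypothetical helper, and the example captions are made up for illustration.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Build a prompt from an instruction, labeled examples, and an unlabeled query."""
    parts = [instruction, ""]
    for caption, sentiment in examples:
        parts.append(f"Caption: {caption}")
        parts.append(f"Sentiment: {sentiment}")
        parts.append("")  # Blank line between examples.
    parts.append(f"Caption: {query}")
    parts.append("Sentiment:")
    return "\n".join(parts)


# Illustrative examples only; one per class to keep the distribution balanced.
few_shot_prompt = build_few_shot_prompt(
    "Decide whether a caption's sentiment is positive, neutral, or negative.",
    [
        ("I loved the new video you made!", "positive"),
        ("That was awful. Super boring.", "negative"),
        ("The video is twelve minutes long.", "neutral"),
    ],
    "The ending surprised me in the best way.",
)
print(few_shot_prompt)
# With the notebook's helper defined:
# print(generate_gemini_response(few_shot_prompt))
```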