## **Chapter 1. Basic Prompt Structure** 

Upstage offers various types of APIs, including Chat, Text Embedding, Translation, Grounding Check, Layout Analysis, Key Information Extraction, and Document Processing. In this book, we will exclusively focus on using the `Chat API`.

For more information about the APIs, please refer to the following [link](https://github.com/UpstageAI/cookbook?tab=readme-ov-file#api-list).

- [ 1.1 Chat API ](#section1)
- [ 1.2 Understanding Parameters ](#section2)
- [ 1.3 Understanding Structure ](#section3)
- [ 1.4 Understanding System Prompt ](#section4)

---

<a id="section1"></a>
### **1.1 `Chat API`**

The following is a standard API call format for generating chat completions with Upstage’s API.

In [6]:
from openai import OpenAI

# Retrieve the UPSTAGE_API_KEY variable from the IPython store
%store -r UPSTAGE_API_KEY

client = OpenAI(
    api_key=UPSTAGE_API_KEY,
    base_url="https://api.upstage.ai/v1/solar"
)
 
response = client.chat.completions.create(
    model="solar-pro",
    messages=[
        {
            "role": "user",
            "content": "Describe how we plan to leverage Upstage products to achieve your mission of AGI for Work."
        }
    ],
)

print(response.choices[0].message.content)

OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable

This error indicates that the client did not receive an API key. Assign your key to the variable, store it, and then rerun the cell above.

In [9]:
# Store your API key (run this once, before creating the client)
UPSTAGE_API_KEY = "your-api-key-here"  # replace with your own key
%store UPSTAGE_API_KEY
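Outside a notebook, the `%store` magic is not available. As a minimal sketch, the key could instead be read from an environment variable. The variable name `UPSTAGE_API_KEY` and the helper `build_client_kwargs` are assumptions made for illustration and are not part of the Upstage API.

```python
import os

# Base URL copied from the example above.
UPSTAGE_BASE_URL = "https://api.upstage.ai/v1/solar"


def build_client_kwargs(env=None):
    """Return keyword arguments for OpenAI(...), or raise if no key is set.

    `env` defaults to os.environ; passing a plain dict makes the helper
    easy to test. The variable name UPSTAGE_API_KEY is an assumption.
    """
    env = os.environ if env is None else env
    api_key = env.get("UPSTAGE_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("Set the UPSTAGE_API_KEY environment variable first.")
    return {"api_key": api_key, "base_url": UPSTAGE_BASE_URL}
```

With the `openai` package installed, `client = OpenAI(**build_client_kwargs())` would then create the same client as in the first cell.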


<a id="section2"></a>
### **1.2 Understanding Parameters**

When doing prompt engineering, parameters are key to controlling how the model behaves and the type of output you receive. <br>
Here’s a detailed explanation of these parameters and their role in the completion generation process.

- [model](#model)
- [max_tokens](#maxtoken)
- [temperature](#temp)
- [Top_P](#topp)

[summary](#summary)

---

<a id="model"></a>
**`Model`**:  `solar-pro`

The specific model you intend to interact with.

<a id="maxtoken"></a>
**`Max_Tokens`**: 

- This parameter limits the total number of tokens (words or parts of words) in the output. Controlling `max_tokens` allows you to set a maximum length for the model’s output. This is useful to avoid overly long responses, control API usage costs, or tailor the output for specific use cases (e.g., short answers, summaries, etc.).

- **Hard stop:**
    - Prevents the model from generating tokens beyond the specific limit.
    - The generation may stop mid-word or mid-sentence when the token limit is reached.

- **Prompt tokens**: The number of tokens in the input prompt.

- If `max_tokens` is set, the sum of the input tokens and `max_tokens` must be less than or equal to the model’s context length (≤ 4096).
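The context-length constraint above is simple arithmetic. The following sketch shows one way to cap a requested `max_tokens` so that it fits; the helper name `clamp_max_tokens` is hypothetical, and the 4096 default only mirrors the limit quoted above, so adjust it for the model you actually call.

```python
def clamp_max_tokens(prompt_tokens, requested_max_tokens, context_length=4096):
    """Return the largest max_tokens value that still fits the context window."""
    if prompt_tokens < 0 or requested_max_tokens < 1:
        raise ValueError("prompt_tokens must be >= 0 and requested_max_tokens >= 1")
    available = context_length - prompt_tokens
    if available < 1:
        raise ValueError("the prompt alone fills the context window")
    return min(requested_max_tokens, available)


print(clamp_max_tokens(prompt_tokens=4000, requested_max_tokens=400))  # 96
print(clamp_max_tokens(prompt_tokens=120, requested_max_tokens=400))   # 400
```

In the first call only 96 tokens remain after a 4000-token prompt, so the request is reduced; in the second call the requested 400 tokens already fit.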

<a id="temp"></a>
**`Temperature`**

This parameter controls the randomness or creativity of the model’s responses. 

- A higher value allows for more flexibility, resulting in more diverse text generation.
- A lower value makes the model more deterministic, typically generating more accurate and consistent output.

The valid range is between **0** and **2.0** (`0 ≤ Temperature ≤ 2.0`).

- **`0.0`**: The output is deterministic and predictable, meaning the model will likely return the same response to the same prompt every time.
- **`0.7`**: This is a balanced level, where the model is creative but still focused. The responses may vary, but they tend to stay on topic.
- **`2.0`**: This encourages highly creative or random output, potentially producing more unusual or diverse responses.
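To see why lower values make the output more predictable, it can help to look at the textbook formulation, in which token scores (logits) are divided by the temperature before a softmax. The sketch below uses that standard formulation with made-up scores; it is not a description of Solar's internal implementation, and treating temperature 0 as greedy selection is an assumption of this example.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, scaled by temperature."""
    if temperature <= 0:
        # Temperature 0 is treated as greedy decoding: all mass on the top score.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # hypothetical scores for three candidate tokens
for t in (0.0, 0.7, 2.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```

As the temperature rises, the probabilities move closer together, so less likely tokens are sampled more often.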

<a id="topp"></a>
**`Top_P`** 

 This is an alternative way to control the randomness of the model's output by considering the cumulative probability of token choices. `Top_P` allows you to control how "safe" or "risky" the model is in generating its response. Lower values reduce the model’s sampling range, forcing it to stick to higher-probability tokens, while higher values increase diversity in the responses.

- **Top_P = 0.9** means the model will sample tokens from the smallest set whose cumulative probability is 90%.

**! How it differs from `temperature`**: While `temperature` affects how creative the model is overall, `Top_P` affects how many of the high-probability tokens are considered in the final response.
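The "smallest set whose cumulative probability is 90%" rule can be sketched in a few lines. The distribution below is invented for illustration and the function name `top_p_filter` is hypothetical; real models apply the same idea over their full vocabulary.

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p.

    `probs` maps token -> probability. Returns the kept tokens with their
    probabilities renormalized to sum to 1.
    """
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


# Hypothetical next-token distribution.
probs = {"care": 0.5, "cost": 0.3, "data": 0.15, "moon": 0.05}
print(top_p_filter(probs, 0.9))  # drops "moon"
print(top_p_filter(probs, 0.5))  # keeps only "care"
```

A lower `top_p` shrinks the candidate set, which is why it makes the output "safer".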

---

<a id="summary"></a>
**Summary** 

- **model**: Defines the specific AI model being used.
- **max_tokens**: Limits the length of the response.
- **temperature**: Controls the creativity or randomness of the response.
- **Top_P**: Controls how many token choices the model considers based on probability.
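Before sending a request, the parameters summarized above can be checked locally. This is a minimal sketch with a hypothetical helper, `validate_config`. The temperature range comes from the text above; the `top_p` range of (0, 1] and the minimum of 1 for `max_tokens` are assumptions, so confirm them against the API reference.

```python
def validate_config(config):
    """Raise ValueError if a sampling parameter is outside its expected range."""
    if not config.get("model"):
        raise ValueError("model is required")
    temperature = config.get("temperature")
    if temperature is not None and not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0 and 2.0")
    top_p = config.get("top_p")
    if top_p is not None and not 0.0 < top_p <= 1.0:  # assumed range
        raise ValueError("top_p must be greater than 0 and at most 1")
    max_tokens = config.get("max_tokens")
    if max_tokens is not None and max_tokens < 1:  # assumed minimum
        raise ValueError("max_tokens must be at least 1")
    return config


validate_config({"model": "solar-pro", "max_tokens": 150, "temperature": 0.7, "top_p": 0.9})
```

The function returns the config unchanged when every check passes, so it can wrap a config dictionary right before the API call.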

#### **Content Examples**

**Example #1: Configuration**

In [2]:
config_model = {
    "model": "solar-pro",
    "max_tokens": 150,
    "temperature": 0.7,
    "top_p": 0.9,
}

**Example #2: Temperature and Top_P Adjustment**

Objective:  Compare how creativity and randomness affect responses. 

In [3]:
config_robust = {
    "model": "solar-pro",
    "messages": [
        {
            "role": "user",
            "content": "What are the potential benefits of AI in healthcare?"
        }
    ],
    "max_tokens": 100,
    "temperature": 0.0,
    "top_p": 1.0
}

response = client.chat.completions.create(**config_robust)
print(response.choices[0].message.content)

AI in healthcare can bring numerous benefits, such as improved diagnostics, personalized treatment plans, and enhanced patient care. It can also help in drug discovery, reducing medical errors, and managing healthcare costs. Additionally, AI can assist in remote patient monitoring and provide valuable insights through data analysis.


In [6]:
config_creative = {
    "model": "solar-pro",
    "messages": [
        {
            "role": "user",
            "content": "What are the potential benefits of AI in healthcare?"
        }
    ],
    "max_tokens": 100,
    "temperature": 2.0,
    "top_p": 0.8
}

response = client.chat.completions.create(**config_creative)
print(response.choices[0].message.content)

Here are some key benefits of AI in healthcare: accurate diagnosis through imaging, predictive clinical supports, virtual assistants for hospital staff and streaming consulting for patients around the world, mentioning just a few. These technologies can optimize care resource access, precision, and personalization.


**Example #3: Limiting Output with max_tokens**

Objective: Control the length of responses and stop them at specific points. 

In [7]:
config_output_400 = {
    "model": "solar-pro",
    "messages": [
        {
            "role": "user",
            "content": "Explain how Upstage AI models handle natural language processing. Explain it in a way that non-developers can easily understand."
        }
    ],
    "max_tokens": 400,
    "temperature": 0.5,
}

response = client.chat.completions.create(**config_output_400)
print(response.choices[0].message.content)

Sure! Upstage AI models handle natural language processing (NLP) by using advanced machine learning techniques. Think of it like teaching a robot to understand and respond to human language.

First, we feed our models with a large amount of text data, like books, articles, and conversations. This helps the models learn the patterns, structures, and meanings of different languages.

Next, when you ask a question or give a command, the model analyzes your words, identifies the key elements, and interprets their meaning. It then generates a response that's relevant and accurate, based on what it learned during training.

In simple terms, Upstage AI models learn from lots of text and use that knowledge to understand and respond to human language in a smart and helpful way.


In [8]:
config_output_40 = {
    "model": "solar-pro",
    "messages": [
        {
            "role": "user",
            "content": "Explain how Upstage AI models handle natural language processing. Explain it in a way that non-developers can easily understand."
        }
    ],
    "max_tokens": 40,
    "temperature": 0.5,
}

response = client.chat.completions.create(**config_output_40)
print(response.choices[0].message.content)

Upstage AI models handle natural language processing by first breaking down sentences into smaller pieces, like words and phrases. Then, they analyze these pieces to understand their meanings and relationships with each other
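The 40-token response above ends mid-sentence, which is the hard stop described earlier. In OpenAI-compatible responses, each choice carries a `finish_reason` field that is `"length"` when generation stopped at the token limit. The sketch below assumes the Upstage endpoint reports this field the same way, and it uses a stand-in object so that it runs without an API call.

```python
from types import SimpleNamespace


def was_truncated(response):
    """Return True when the first choice stopped because it hit max_tokens."""
    return response.choices[0].finish_reason == "length"


# Stand-in for a real response object, so the sketch runs without an API call.
fake_response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            finish_reason="length",
            message=SimpleNamespace(content="Upstage AI models handle"),
        )
    ]
)
print(was_truncated(fake_response))  # True
```

With a real response, a `True` result suggests raising `max_tokens` or asking for a shorter answer in the prompt.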


---

<a id="section3"></a>
### **1.3 Understanding Structure**

- [**messages: system, user, assistant**](#message)

- [**content example**](#contentex)

<a id="message"></a>
**`messages`** : 

It is an array containing the conversation context. It includes the messages exchanged between the user and the model. Each message contains:

- “role”:
    
    The role can be `"user"`, `"assistant"`, or `"system"`, indicating the source of the message.
    
    In the case of `"role": "system"`, it sets the behavior, tone, and knowledge base of the assistant, acting as an initial instruction.
    
    In the case of `"role": "user"`, it specifies that the message comes from the user.
    
    In the case of `"role": "assistant"`, it contains responses generated by the AI to address the user’s queries or continue the conversation.

- “content”:
    
    The text of the message itself, as shown in the examples below.

<a id="contentex"></a>
#### **Content Example**

In [None]:
{
  "role": "system",
  "content": "You are my Assistant. Your role is to answer my questions faithfully and in detail. "
}

In [None]:
{
  "role": "user",
  "content": "Hello, Solar. Can you help me plan a weekend trip to New York City?"
}

In [None]:
{
  "role": "assistant",
  "content": "Hello! I'd be happy to help you plan your weekend trip to New York City. Let's start by discussing your interests and preferences. Are you looking for sightseeing, shopping, dining, or perhaps a mix of all?"
}
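The three messages above can be combined into a single `messages` array, which is the form the API expects for a multi-turn conversation. The helper `add_message` and the final follow-up turn are hypothetical and serve only to illustrate the ordering.

```python
def add_message(messages, role, content):
    """Append one message after checking that the role is valid."""
    if role not in ("system", "user", "assistant"):
        raise ValueError(f"unknown role: {role!r}")
    messages.append({"role": role, "content": content})
    return messages


messages = []
add_message(messages, "system", "You are my Assistant. Your role is to answer my questions faithfully and in detail.")
add_message(messages, "user", "Hello, Solar. Can you help me plan a weekend trip to New York City?")
add_message(messages, "assistant", "Hello! I'd be happy to help you plan your weekend trip to New York City.")
# A follow-up turn (hypothetical) is appended after the assistant's reply.
add_message(messages, "user", "Mostly sightseeing and dining.")

print([m["role"] for m in messages])  # ['system', 'user', 'assistant', 'user']
```

Passing this list as `messages=` gives the model the full conversation history, and a misspelled role is rejected before the request is sent.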

---

<a id="section4"></a>
### **1.4 Understanding System Prompt**

**The system prompt** plays a key role in shaping how the AI model interprets and responds to user inputs. In the context of prompt engineering, understanding and using the system prompt effectively can help guide the model’s behavior and ensure that its responses are aligned with user expectations. In this book, we show three different types of system prompts.

> **Tips**:
>
> If the system prompt is short, the responses tend to be short as well, and if the system prompt is long, the responses tend to be longer. The Solar Pro Preview model also shows a linear increase in response length as the number of tokens in the system prompt grows.

**Definition**:
If you’re curious about the conceptual difference between a *system prompt* and a *user prompt*, read the following definition.
- System instructions provide additional high-level guidelines that an LLM must follow when responding to individual user prompts. These instructions are distinguished by a "system" role flag within a ChatML dialogue interface.
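To make the "system" role flag concrete, the sketch below renders a message list in ChatML-style text, where each message is wrapped in delimiter tokens and prefixed by its role. The `<|im_start|>` and `<|im_end|>` delimiters follow the commonly published ChatML convention; whether Solar uses exactly these tokens internally is an assumption, and the helper `to_chatml` is hypothetical.

```python
def to_chatml(messages):
    """Render messages as ChatML-style text (delimiter tokens are assumed)."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    return "\n".join(parts)


example = [
    {"role": "system", "content": "Be brief."},
    {"role": "user", "content": "Explain about Blockchain in detail."},
]
print(to_chatml(example))
```

The system message differs from the user message only by its role label, which is how the model can tell standing instructions apart from an individual request.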


#### **Content Examples**

**Example #1: Short Version**

In [13]:
config_model = {
    "model": "solar-pro",
    "max_tokens": 800,
    "temperature": 0.2,
    "top_p": 0.9,
}

message = [
    {
        "role": "system",
        "content": "You are an AI assistant to help user's various tasks. Please provide me with an accurate information."
    },
    {
        "role": "user",
        "content": "Explain about Blockchain in detail."
    }
]

config = {**config_model, "messages": message}

response = client.chat.completions.create(**config)
print(response.choices[0].message.content)

Blockchain is a decentralized, distributed digital ledger that records transactions across many computers so that any involved record cannot be altered retroactively, without the alteration of all subsequent blocks. This technology allows for secure, transparent, and tamper-proof record-keeping. It's most commonly associated with cryptocurrencies like Bitcoin, but it has many other potential applications, such as in supply chain management, voting systems, and digital identity verification.


**Example #2: Long Version**

In [None]:
config_model = {
    "model": "solar-pro",
    "max_tokens": 800,
    "temperature": 0.2,
    "top_p": 0.9,
}

message = [
    {
        "role": "system",
        "content": "Your name is Solar. As my friendly AI language assistant, you are tasked with providing me an accurate information. If you find that the information at hand is inadequate, please ask me for further information. [Strong Rule] If you don't have any real-time information about the user’s query, please be honesty."
    },
    {
        "role": "user",
        "content": "Explain about Blockchain in detail."
    }
]

config = {**config_model, "messages": message}

response = client.chat.completions.create(**config)
print(response.choices[0].message.content)

Blockchain is a decentralized, distributed digital ledger that is used to record transactions across many computers so that any involved record cannot be altered retroactively, without the alteration of all subsequent blocks. This technology was first introduced in 2008 with the creation of Bitcoin, a cryptocurrency.

A blockchain is composed of a series of blocks, each containing a list of transactions. Each block is linked to the previous one through a cryptographic hash, creating a chain of blocks. This structure ensures the integrity and security of the data, as any change in a block would require changing all subsequent blocks, which is practically impossible due to the distributed nature of the network.

Blockchain technology has many potential applications beyond cryptocurrencies, such as supply chain management, voting systems, and digital identity verification. Its key features include decentralization, transparency, immutability, and security.

If you need more specific infor

#### **Practice Exercises**

Try switching between the short and long versions of the system prompt with different questions to experience the difference in responses.

**Short System Prompt**

In [None]:
config_model = {
    "model": "solar-pro",
    "max_tokens": 800,
    "temperature": 0.0,
    "top_p": 0.9,
}

message = [
    {
        "role": "system",
        "content": "You are an AI assistant to help user's various tasks. Please provide me with an accurate information."
    },
    {
        "role": "user",
        "content": "<<[ Replace this text ]>>"
    }
]

config = {**config_model, "messages": message}

response = client.chat.completions.create(**config)
print(response.choices[0].message.content)

**Long System Prompt**

In [None]:
config_model = {
    "model": "solar-pro",
    "max_tokens": 800,
    "temperature": 0.0,
    "top_p": 0.9,
}

message = [
    {
        "role": "system",
        "content": "Your name is Solar. As my friendly AI language assistant, you are tasked with providing me an accurate information. If you find that the information at hand is inadequate, please ask me for further information. [Strong Rule] If you don't have any real-time information about the user’s query, please be honesty."
    },
    {
        "role": "user",
        "content": "<<[ Replace this text ]>>"
    }
]

config = {**config_model, "messages": message}

response = client.chat.completions.create(**config)
print(response.choices[0].message.content)

**Customize Your System Prompt**

In [None]:
config_model = {
    "model": "solar-pro",
    "max_tokens": 800,
    "temperature": 0.0,
    "top_p": 0.9,
}

message = [
    {
        "role": "system",
        "content": "<<[ Replace this text ]>>"
    },
    {
        "role": "user",
        "content": "<<[ Replace this text ]>>"
    }
]

config = {**config_model, "messages": message}

response = client.chat.completions.create(**config)
print(response.choices[0].message.content)
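The three exercise cells differ only in their message contents, so a small helper can build each request from a system prompt and a user prompt. This is a sketch: `build_config` and the constant names are hypothetical, the prompts are copied from the cells above, and the API call is left as a comment so the block runs without a key.

```python
SHORT_SYSTEM_PROMPT = (
    "You are an AI assistant to help user's various tasks. "
    "Please provide me with an accurate information."
)
LONG_SYSTEM_PROMPT = (
    "Your name is Solar. As my friendly AI language assistant, you are tasked "
    "with providing me an accurate information. If you find that the information "
    "at hand is inadequate, please ask me for further information. [Strong Rule] "
    "If you don't have any real-time information about the user’s query, "
    "please be honesty."
)
BASE_CONFIG = {"model": "solar-pro", "max_tokens": 800, "temperature": 0.0, "top_p": 0.9}


def build_config(system_prompt, user_prompt, base_config=BASE_CONFIG):
    """Return request arguments for one system/user prompt pair."""
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]
    return {**base_config, "messages": messages}


question = "Explain about Blockchain in detail."
configs = {
    "short": build_config(SHORT_SYSTEM_PROMPT, question),
    "long": build_config(LONG_SYSTEM_PROMPT, question),
}
for name, config in configs.items():
    # With a real client: response = client.chat.completions.create(**config)
    print(name, len(config["messages"][0]["content"]), "characters in the system prompt")
```

Keeping every other parameter fixed makes it easier to attribute any difference in the responses to the system prompt alone.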

**Well Done!**

Having completed all the exercises, you're now ready to proceed to the next chapter.