# 2.3 Optimizing prompts to improve the quality of responses from Q&A bots



## 🚄 Preface  



The Q&A bot from the previous section can already answer questions about company rules and regulations, but this is only the first step in building an excellent user experience. Users often expect more personalized and precise interactions. This section will introduce techniques in prompt engineering, teaching you how to improve or control the output of the Q&A bot by optimizing prompts—for example, adjusting tone, standardizing formats, or even enabling it to handle tasks like text summarization, inference, and transformation.


## 🍁 Course Objectives

After completing this section, you will understand:

* How prompt frameworks and templates work
* Prompt techniques and their best practices
* How to apply LLMs in engineering to handle various tasks

## 1. Previous Content Recap

In the previous section, you used RAG to give the LLM access to the company's private knowledge. For ease of invocation, that logic was encapsulated in several functions and saved in chatbot/rag.py. You can now invoke it with the following code:

In [2]:
from chatbot import rag
import os
from config.load_key import load_key

In [3]:
os.environ["BASE_URL"] = "https://dashscope.aliyuncs.com/compatible-mode/v1"

# Load API key
load_key()
print(f'Your configured API Key is: {os.environ["DASHSCOPE_API_KEY"][:5]+"*"*5}')

Your configured API Key is: sk-4b*****


In [4]:
# Load index
# The previous section has already built the index, so the index can be loaded directly here. If you need to rebuild the index, you can add a line of code: rag.indexing()
index = rag.load_index(persist_path="knowledge_base/test")
query_engine = rag.create_query_engine(index=index)

In [5]:
# Define the question-answering function
def ask_llm(question, query_engine):
  streaming_response = query_engine.query(question)
  streaming_response.print_response_stream()

## 2. Prompt optimization to improve response quality 

In the previous section, you used RAG to retrieve the management tools the company uses internally. However, one of your colleagues wants the Q&A bot to provide not only the tool name but also a link to the tool. You could ask all your colleagues to add a note like "provide the download address" to their questions, but this approach is inefficient. Instead, you decide to enhance the program so that it automatically supplements each user question with additional requirements for the response.

In [6]:
question = "What tools should our company use for project management?"
instruction = "If it's a tool consultation question, be sure to provide the download link."
new_question = question + " " + instruction  # the space keeps the two sentences from running together
ask_llm(new_question, query_engine)

For project management, the company can use Asana for task tracking and Jira for technical team coordination. These tools will help streamline workflows and enhance collaboration across teams.

- **Asana**: Ideal for coordinating tasks and projects among instructional designers and other teams. You can sign up or learn more at [Asana’s official website](https://asana.com/).

- **Jira**: Perfect for technical teams to manage daily standups and project sprints. More information and downloads are available at [Jira’s official website](https://www.atlassian.com/software/jira).

After the prompt was adjusted, the Q&A bot's responses matched your colleagues' requirements more closely. The choice of prompt therefore has a significant effect on the quality of the LLM's responses. Next, you can refer to some prompt frameworks to construct your prompts.
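
The idea of appending a standing instruction to every question can be wrapped in a small helper. The following is a minimal sketch: `add_instruction` and `DEFAULT_INSTRUCTION` are hypothetical names that are not part of `chatbot/rag.py`, and the resulting string would be passed to `ask_llm` in the same way as `new_question` above.

```python
# Hypothetical helper: append a fixed instruction to every user question.
DEFAULT_INSTRUCTION = (
    "If it's a tool consultation question, be sure to provide the download link."
)


def add_instruction(question: str, instruction: str = DEFAULT_INSTRUCTION) -> str:
    """Return the question followed by the instruction on its own line."""
    # A newline keeps the question and the instruction from running together.
    return f"{question.strip()}\n{instruction.strip()}"


new_question = add_instruction(
    "What tools should our company use for project management?"
)
print(new_question)
```

Keeping the instruction in one place means it can be changed for every user at once, without asking colleagues to rephrase their questions.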

## 3. Prompt Framework

### 3.1 Basic Elements

When communicating with an LLM, you can think of it as a person who has undergone "socialization training." The way you interact should mirror how humans exchange information, and your requirements must be clear and unambiguous. The clearer and more precise your question (or prompt), the better the LLM will understand the core of the query, and the more likely its response will meet your expectations. Generally speaking, a prompt should clarify the following elements, which together form a prompt framework that helps you construct a complete and effective prompt:

|Element|Meaning|
|----|----|
|Objective|Clearly specify what task the LLM is required to complete, allowing it to focus on specific goals.|
|Context|Background information about the task, such as operational processes or task scenarios, clarifying the scope of the discussion for the LLM.|
|Role|The role the LLM plays, or the voice or writing style that the model should use, clarifying the expected emotional tone of the response.|
|Audience|Clarify the specific audience the LLM is targeting, constraining its response style.|
|Sample|Provide specific examples for the LLM to refer to; it will abstract solutions and focus on specific formats from these samples.|
|Output Format|Specify the format of the output, output type, and range of enumerated values. You can also state content and information that should not be included in the output, which can be further clarified with examples.|
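
The elements in the table can be assembled into a prompt programmatically. The sketch below is one possible way to do it, not a prescribed format: `build_prompt`, `ELEMENT_ORDER`, the `### ... ###` section labels, and the sample values are all illustrative assumptions.

```python
# Hypothetical helper that assembles a prompt from the framework elements.
# Elements that are not supplied are skipped, so any subset can be used.
ELEMENT_ORDER = ["Role", "Audience", "Objective", "Context", "Sample", "Output Format"]


def build_prompt(**elements: str) -> str:
    """Join the supplied elements into labelled sections, in a fixed order."""
    sections = []
    for name in ELEMENT_ORDER:
        key = name.lower().replace(" ", "_")  # "Output Format" -> "output_format"
        value = elements.get(key, "").strip()
        if value:
            sections.append(f"### {name} ###\n{value}")
    return "\n\n".join(sections)


# Made-up values, used only to show the shape of the result.
prompt = build_prompt(
    role="You are the company's HR assistant.",
    audience="New employees in their first week.",
    objective="Explain how to request annual leave.",
    output_format="A numbered list of at most five steps.",
)
print(prompt)
```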



In addition to the prompt framework discussed above, many problem analysis paradigms can help you articulate clear and specific requirements. For example, SWOT analysis and 5W2H analysis offer structured approaches to thinking through your needs and expressing them more effectively. You can also write a meta-prompt, a prompt designed to optimize or enhance another prompt, to guide the LLM in refining your initial input.

By using such a meta-prompt, you can ask the model to:
* Clarify ambiguous statements
* Expand vague instructions into detailed guidelines
* Add context, tone, format, and examples where necessary
* Ensure the final prompt aligns with best practices for interacting with AI models

This self-reflection approach improves the quality of your prompts, while deepening your understanding of how to communicate effectively with AI systems.

In [None]:
prompt_expand = """
    You are a Prompt Optimization and Expansion Assistant. Your primary role is to receive an arbitrary user input prompt — which may be vague, incomplete, or lacking in detail — and **refine and expand** it into a more structured, precise, and actionable instruction that can be clearly understood and executed by large language models.

    Given the input prompt, you should generate an optimized version according to the following instructions:

    Your task is to enhance the clarity of the original prompt while preserving its core intent. The resulting optimized prompt should:
    - Be semantically richer and more descriptive.
    - Include relevant context, constraints, or formatting instructions as needed.
    - Specify desired tone, style, or audience (if applicable).
    - Suggest example inputs and outputs where appropriate to guide model behavior.

    ---

    ### Output Format:

    Please respond with the following structure:

    #### 1. Optimized Prompt:
    [Here goes the enhanced and expanded version of the original prompt.]

    #### 2. Example Input (if applicable):
    [Provide one or more example(s) of what a user might input based on the prompt.]

    #### 3. Example Output (based on the example input):
    [Show what kind of output would be expected from the model when using this prompt with the given example input.]

    #### 4. Optimization Notes:
    [Briefly explain how you improved the original prompt — e.g., added clarity, specified format, included context, etc.]

    ---

    ### Example:

    #### Original Prompt:
    "Write a story about animals."

    #### Optimized Prompt:
    "Compose a short narrative (approximately 300–500 words) about a group of forest animals who must work together to solve a problem. The story should include at least three distinct animal characters, a central conflict or challenge, and a resolution that highlights themes such as cooperation, courage, or environmental awareness. Use descriptive language to set the scene and convey character emotions. Aim for a tone suitable for children aged 6–10."

    #### Example Input:
    None required — the model generates content autonomously.

    #### Example Output:
    Once upon a time in Greenleaf Forest, a wise old owl named Oliver noticed that the river had stopped flowing. Concerned, he called a meeting with his neighbors: Mina the mouse, Benny the bear, and Tilly the turtle. Together, they discovered a fallen tree blocking the stream. With teamwork and determination, they cleared the path and restored the water supply. From then on, the animals made sure to keep their home safe and healthy...

    #### Optimization Notes:
    - Added specific narrative elements (characters, setting, conflict, theme).  
    - Defined word count and target audience.  
    - Clarified tone and style expectations.  
    - Introduced emotional and moral dimensions to enrich storytelling.

    ---

    Please apply this process to any user-submitted prompt and return your response using the above template.

    The original prompt is:

"""

def input_enhanced_llm_output(questions):
    expanded_input = prompt_expand + questions
    return ask_llm(expanded_input, query_engine)

question = "Please help me write a promotional copy for our company's new book. The general content of the book is xxx(will be given later)"
input_enhanced_llm_output(question)

#### 1. Optimized Prompt:
"Craft a compelling promotional copy (approximately 150-200 words) for our company’s new book titled [Book Title]. The book focuses on [brief description of the book's content, e.g., innovative strategies for educational content development, practical tips for teacher training, or data-driven approaches to improving learning outcomes]. The promotional copy should highlight the key benefits and unique selling points of the book, such as its practical insights, real-world applications, and expert authorship. Emphasize how the book can help educators, trainers, and learners achieve better results. Use a persuasive and engaging tone that resonates with educational professionals and enthusiasts. Ensure the copy includes a call-to-action encouraging readers to purchase the book or learn more about it."

#### 2. Example Input:
"The book is titled 'Mastering Educational Content Development.' It provides comprehensive guidance on creating high-quality educational mater

### 3.2 Prompt Template

When developing large language model applications, it's not ideal to let users write prompts based on a framework. Instead, you can use elements from various prompt frameworks to build a structured prompt template. A prompt template allows you to predefine key information, such as the role of the LLM and any important guidelines, which helps guide the model’s behavior. Developers only need to adjust input parameters within the template to create consistent and standardized applications for LLMs.


The RAG chatbot created with LlamaIndex uses a default prompt template. You can view it in code by following [the example on the LlamaIndex official website](https://docs.llamaindex.ai/en/stable/examples/prompts/prompts_rag/). The original LlamaIndex prompt template is:



```text
Context information is below.
---------------------
{context_str}
---------------------
Given the context information and not prior knowledge, answer the query.
Query: {query_str}
Answer:
```

Both `context_str` and `query_str` are placeholders that get replaced during the retrieval and querying process. The `context_str` placeholder is filled with relevant information retrieved from the vector database, while `query_str` is replaced with the user's actual question.
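
To make the substitution concrete, the sketch below fills the two placeholders with Python's built-in `str.format`. LlamaIndex performs this step internally, so this only imitates the effect; the retrieved chunks are made-up stand-ins for real retrieval results.

```python
# Template text copied from the default shown above.
template = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, answer the query.\n"
    "Query: {query_str}\n"
    "Answer: "
)

# Made-up chunks standing in for text retrieved from the vector database.
retrieved_chunks = [
    "Chunk A: text retrieved from the knowledge base.",
    "Chunk B: more retrieved text.",
]

final_prompt = template.format(
    context_str="\n\n".join(retrieved_chunks),
    query_str="What tools should our company use for project management?",
)
print(final_prompt)
```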



Because the default template is general-purpose, it may not suit every scenario. You can customize it, for example by adjusting the tone, adding specific instructions, or defining the expected format of the response, so that the model's output aligns more closely with your requirements.
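
A custom template still needs both placeholders, otherwise the retrieved context or the user's question never reaches the model. A quick check such as the following sketch can catch a missing one before the template is used; `missing_placeholders` and `REQUIRED_FIELDS` are hypothetical names, and the check relies only on the standard library's `string.Formatter`.

```python
from string import Formatter

REQUIRED_FIELDS = {"context_str", "query_str"}


def missing_placeholders(template: str) -> set:
    """Return the required placeholder names that the template does not contain."""
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion).
    found = {field for _, field, _, _ in Formatter().parse(template) if field}
    return REQUIRED_FIELDS - found


good = "Reference:\n{context_str}\nQuestion: {query_str}\nAnswer: "
bad = "Question: {query_str}\nAnswer: "
print(missing_placeholders(good), missing_placeholders(bad))
```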

You can use the following sample code, where `prompt_template_string` represents the new prompt template:

In [9]:
# Build prompt template
prompt_template_string = (
    "You are the company's customer service assistant, you need to answer users' questions concisely."
    "\n[Notes]:\n"
    "1. Answer user questions based on contextual information.\n"
    "2. Only answer the user's question, do not output other information.\n"
    "The following is reference information.\n"
    "---------------------\n"
    # {context_str} and {query_str} are filled in by the query engine at query time
    "{context_str}\n"
    "---------------------\n"
    "Question: {query_str}\n"
    "Answer: "
)

# Update prompt template
rag.update_prompt_template(query_engine, prompt_template_string)

<llama_index.core.query_engine.retriever_query_engine.RetrieverQueryEngine at 0x3415c6e10>

## 4. Techniques for Building Effective Prompts

In Section 3.1, some key elements of prompt design were introduced. This section will explain specific prompt techniques in detail, focusing on practical scenarios and how to apply the elements of a prompt effectively.



### 4.1 Clearly Express Requirements and Use Delimiters

Clearly stating your requirements ensures that the content generated by the LLM is highly relevant to the task at hand. These requirements typically include the **task objective**, **context**, and **background** information.

Using **delimiters** helps LLMs focus on specific goals, avoid ambiguous interpretations, and reduce unnecessary processing. Delimiters also help structure your prompt in a way that improves readability and model understanding.

Here are some commonly used delimiters in English prompt engineering:

| Delimiter Type      | Example                         | Use Case / Description |
|---------------------|----------------------------------|--------------------------|
| Triple Quotes         | `"""..."""`                      | For multi-line strings or input blocks |
| Triple Hashes         | `### Section ###`                | To mark headings or instruction sections |
| Double Angle Brackets | `<<example>>...<</example>>`     | To define custom tags for variables or examples |
| XML-like Tags         | `<input>...</input>`             | For structured formatting and content separation |
| Dashes                | `--- Section ---`                | To separate paragraphs or logical blocks |
| Equal Signs           | `=== Output Format ===`          | Similar to headings; often used for formatting clarity |

> ⚠️ Note: Avoid using symbols that appear frequently in your content as delimiters to prevent confusion. For example, if square brackets `[]` are commonly used in your input text, they may not be suitable as delimiters.

---

### ✅ Example Usage in an English Prompt

```text
### Instruction ###
You are a helpful assistant. Please summarize the following article.

=== Article Content ===
""" 
Artificial intelligence (AI) is a branch of computer science that aims to create systems capable of performing tasks that typically require human intelligence.
"""

### Output Format ###
Please provide a summary in no more than 50 words.
```

In [10]:
question = """
### Instruction ###
Expand and polish the text enclosed in XML tags below to make the copy vivid, creative, and appealing to new employees. 
The tone should be professional yet welcoming, suitable for onboarding materials.

=== Original Text ===
<text>
New Employee Training Camp Activity
</text>

=== Output Requirements ===
- Enhance the content with engaging language.
- Add context about the purpose and benefits of the training camp.
- Include key elements such as team-building, mentorship, and orientation.
- Keep the final output concise and motivational.
"""

ask_llm(question, query_engine)

Welcome to the New Employee Training Camp!

Embark on an exciting journey designed to not only equip you with the essential skills and knowledge but also to foster a sense of community and collaboration. This training camp is your gateway to becoming a valued member of our dynamic team. Here, you will:

- **Dive Deep into Our Culture:** Understand the values, mission, and vision that drive us forward.
- **Build Strong Foundations:** Gain hands-on experience and in-depth knowledge of our products and services.
- **Forge Lasting Connections:** Engage in team-building activities that promote camaraderie and teamwork.
- **Learn from the Best:** Receive mentorship from experienced professionals who are eager to share their insights and expertise.
- **Navigate Your Path:** Participate in orientation sessions that guide you through your role and responsibilities, setting you up for success.

Join us in this enriching experience and take the first step towards a rewarding career with us!

From the above execution results, it is evident that the prompt effectively communicates the task requirements: to expand and enhance the provided text.

The core theme of the task is "**New Employee Training Camp Activity**," and the intended audience is "**new employees**." In the prompt, the delimiter `<text>...</text>` clearly encapsulates and isolates the original content to be processed.

By structuring the prompt with clear instructions, appropriate context, and standardized delimiters, the model is better guided to generate output that not only preserves the original meaning but also becomes more vivid, engaging, and tailored to the target audience.
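
When the wrapped content comes from users, it helps to confirm that the chosen delimiter does not already occur in it, in line with the note above about frequently used symbols. The following is a minimal sketch with a hypothetical helper, `wrap_in_tag`; raising an error on a clash is one possible policy, not the only one.

```python
def wrap_in_tag(content: str, tag: str = "text") -> str:
    """Wrap content in <tag>...</tag>, refusing tags that already occur in it."""
    opening, closing = f"<{tag}>", f"</{tag}>"
    # If the delimiter already appears in the content, the model cannot tell
    # where the wrapped block really ends.
    if opening in content or closing in content:
        raise ValueError(f"Delimiter {opening} already appears in the content")
    return f"{opening}\n{content.strip()}\n{closing}"


wrapped = wrap_in_tag("New Employee Training Camp Activity")
print(wrapped)
```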

### 4.2 Define Role and Audience

The **role** refers to the identity the LLM should assume in a specific scenario, such as an expert, advisor, or assistant. The **audience** is the target user group for which the model's output is intended, such as general consumers or students. Defining these elements helps developers tailor the tone, depth, and structure of the content accordingly.
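
One way to vary the role while keeping the rest of the template fixed is sketched below. `make_role_template` and `TEMPLATE_BODY` are hypothetical names; the wording mirrors the two templates compared in this section, and the result could be passed to `rag.update_prompt_template` as in Section 3.2.

```python
# Shared body of the role templates; only the opening sentence changes.
TEMPLATE_BODY = (
    "Please answer the user's question based on the provided information\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Question: {query_str}\n"
    "Answer: "
)


def make_role_template(role: str) -> str:
    """Return a prompt template whose first sentence assigns the given role."""
    # The placeholders are left unformatted so the query engine can fill them.
    return f"You are {role}. " + TEMPLATE_BODY


engineer_template = make_role_template(
    "a senior large language model algorithm engineer"
)
teacher_template = make_role_template("an elementary school teacher")
print(teacher_template)
```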

The example below demonstrates how different system roles affect the style and content of the generated output. 

> Pass the prompt template into `prompt_template_string`, and the user question into `question`, then call the `ask_llm` function to execute it.

<table>
  <thead>
    <tr>
      <th width = '80px'>Role</th>
      <th width = '500px'>LLM Algorithm Engineer</th>
      <th width = '500px'>Elementary School Teacher</th>
    </tr>
  </thead>
  <tbody valign = 'top'>
    <tr>
      <td>Prompt Template</td>
      <td>"You are a senior large language model algorithm engineer. Please answer the user's question based on the provided information\n"
"---------------------\n"
"{context_str}\n"
"---------------------\n"
"Question: {query_str}\n."
"Answer: "</td>
      <td>"You are an elementary school teacher. Please answer the user's question based on the provided information\n"
"---------------------\n"
"{context_str}\n"
"---------------------\n"
"Question: {query_str}\n."
"Answer: "</td>
    </tr>
    <tr>
      <td>User Question</td>
      <td>Qwen-VL</td>
      <td>Qwen-VL</td>
    </tr>
    <tr>
      <td>Output Answer</td>
      <td>Qwen-VL is a multimodal pre-trained model introduced by Alibaba Cloud. It combines text and image processing capabilities, enabling it to understand and generate text descriptions related to images. Simply put, Qwen-VL can be regarded as an artificial intelligence assistant that can "describe images in words." It not only understands the content of images but also generates corresponding textual descriptions or answers questions related to the images. This technology has broad application prospects in various fields, such as automatic image tagging, visual question-answering systems, and content-based image retrieval. For new media operations, tools like Qwen-VL can help improve content creation efficiency, enhance user experience, and provide users with richer and more diverse ways of presenting information.</td>
      <td>Kids, today we’re going to meet a new friend named Qwen-VL!\nQwen-VL is like a very smart robot that has "eyes" 👀 and a "brain" 🧠. It can "see" pictures and videos and understand their meaning, just like us!\nFor example, if you show it a picture of a kitten playing with a ball of yarn, it can not only recognize that it’s a kitten 🐱 but also tell you that it’s playing with a ball of yarn 🧶. Even better, you can ask it, "What color is the kitten?", and it will tell you the color of the kitten.\nIt can also understand the content in videos. For instance, if you show it a video of kids playing soccer ⚽️, it can tell you who is kicking the ball, where the ball goes, and even guess what might happen next!\nQwen-VL is like a child who is still learning. It keeps getting smarter and improving, and in the future, it will become even more intelligent, capable of doing more things and helping us better understand the world!</td>
    </tr>
  </tbody>
</table>  



### 4.3 Specify Output Format

When developers design applications with LLMs, they often need the model to produce structured data so that downstream systems can consume it and complete the application workflow. LLMs generate free-form natural language by default, but they can usually produce structured output such as JSON, XML, or CSV, as long as the desired format and requirements are clearly specified in the prompt.

In the following example, building on the prompt template from Section 3.2, the Q&A bot reviews the quality of company-developed documentation and returns the results in **JSON format**:

In [11]:
question_task = """
[Task Requirements]
You will be given a sentence or paragraph. Your task is to review it for any spelling or grammatical errors. 
If there are errors, you should identify them, explain the issue, and provide a corrected version. 

---
[Output Requirements]
Please output only in JSON format. Do not include markdown or code blocks.
The JSON must contain:
- "label": 0 if an error is found, 1 if no error.
- "reason": A brief explanation of the error (if any).
- "correct": The corrected version of the text (or the original if no error).

---
[User Input]
The following is the user input. Please review it:
"""

question_doc = "Delimiters are special symbols that help large langauge models (LLMs) identify which parts of the prompt should be considered a complete unit of meaning."

question = question_task + question_doc

ask_llm(question, query_engine)

{
  "label": 0,
  "reason": "There is a typo in 'langauge', which should be 'language'.",
  "correct": "Delimiters are special symbols that help large language models (LLMs) identify which parts of the prompt should be considered a complete unit of meaning."
}

As the results show, the prompt `question_task` specified JSON as the output format and defined the fields of the output, and the LLM generated content in the required format. Stable, formatted output of this kind makes it feasible to integrate the LLM into existing systems.
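
Before a downstream system consumes such output, it is usually worth parsing and validating it, because a model can occasionally return malformed or incomplete JSON. The sketch below assumes the reply has been captured as a string (the `ask_llm` function above only prints it); `parse_review` is a hypothetical helper and the `correct` value is shortened for brevity.

```python
import json

REQUIRED_KEYS = {"label", "reason", "correct"}


def parse_review(raw: str) -> dict:
    """Parse the model's reply and check it has the fields the prompt asked for."""
    review = json.loads(raw)  # raises json.JSONDecodeError on malformed output
    if not isinstance(review, dict):
        raise ValueError("Expected a JSON object")
    missing = REQUIRED_KEYS - set(review)
    if missing:
        raise ValueError(f"Missing keys: {sorted(missing)}")
    if review["label"] not in (0, 1):
        raise ValueError("label must be 0 or 1")
    return review


raw_reply = """
{
  "label": 0,
  "reason": "There is a typo in 'langauge', which should be 'language'.",
  "correct": "Delimiters are special symbols."
}
"""
review = parse_review(raw_reply)
print(review["label"], review["reason"])
```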

On news websites, blog platforms, or internal enterprise knowledge-sharing platforms, articles edited or published by users may contain typos, grammatical errors, or even sensitive information, and traditional manual review is prone to oversights. In this case, an LLM can be integrated to review the content. If an article is flagged for serious grammatical errors or high-risk sensitive words, its revision priority is set to "high"; for articles with minor issues, the priority can be set to "low." This approach can save labor costs and improve the efficiency and accuracy of the review process.
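
The priority rule described above could be sketched as follows. The fields `serious_grammar_error` and `high_risk_sensitive_words` are hypothetical: the prompt in this section returns only `label`, `reason`, and `correct`, so a real review prompt would need to be extended to produce them. The `"none"` value for error-free articles is also an assumption.

```python
def revision_priority(review: dict) -> str:
    """Map a review result to 'high', 'low', or 'none'."""
    if review.get("label") == 1:
        return "none"  # no error found, nothing to revise
    # Hypothetical fields; treated as False when the model does not supply them.
    serious = review.get("serious_grammar_error", False)
    sensitive = review.get("high_risk_sensitive_words", False)
    return "high" if (serious or sensitive) else "low"


print(revision_priority({"label": 0, "high_risk_sensitive_words": True}))
print(revision_priority({"label": 0}))
```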

Of course, there are many applications similar to the above scenario. Developers can analyze bottlenecks in system processes or focus on data-intensive tasks to explore more use cases for LLMs.



### 4.4 Providing Few-Shot Examples

In the example in Section 4.3, the prompt specified the output format, and the LLM generated correctly formatted content. However, if you want the output to be not only correctly formatted but also consistent in style and structure, you can provide a few examples as references. This is like giving the LLM a "reference book." In the following code example, first observe the output of the LLM without any examples.



In [13]:
question_task = """
### 📝 Task Requirements
You are to generate content based on the user's topic, following the **style and structure** of the example provided below. Do not add any extra text or explanations — return only the final Markdown-formatted output.

---
### 🧩 Output Format Requirements
Please ensure your response strictly follows this format:
- Use Markdown syntax (headings, lists, bold, links, image placeholders, etc.)
- Keep the tone informative and easy to follow
- Structure content in a step-by-step or tutorial format
- Avoid markdown code blocks around the final output itself

---
### 👤 User Input Topic
The following is the user's requested topic:
""" 

question_doc = "Handmade Keychain Making Tutorial"

question = question_task + question_doc

ask_llm(question, query_engine)

# Handmade Keychain Making Tutorial

## Introduction
Making handmade keychains is a fun and creative activity that allows you to personalize gifts or accessories. This tutorial will guide you through the steps to create a simple yet stylish keychain using basic materials and tools.

## Materials Needed
- **Beads**: Assorted sizes and colors
- **Jump rings**: Small metal rings for attaching components
- **Key ring**: To attach the keychain to keys
- **Wire**: Thin, flexible wire for stringing beads
- **Pliers**: Round-nose and flat-nose pliers for manipulating wire and jump rings
- **Scissors**: For cutting wire
- **Clasp** (optional): For adding a secure closure

## Step-by-Step Instructions

### Step 1: Gather Your Materials
Before you start, lay out all your materials on a clean, flat surface. Ensure you have everything listed above within reach.

### Step 2: Cut the Wire
Using your scissors, cut a piece of wire that is about 6 inches long. This length should be sufficient for most k

The LLM in the example above produced a tutorial for making handmade keychains, but the content was not concise enough. If you want the LLM to output content in a specific style and structure, such as only the theme, the list of materials, and the steps, you can add a few examples for it to imitate.



In [15]:
question_task = """
### 📝 Task Requirements
You are to generate content based on the user's topic, following the **style and structure** of the example provided below. Do not add any extra text or explanations — return only the final Markdown-formatted output.

---
### 🧩 Output Format Requirements
Please ensure your response strictly follows this format:
- Use Markdown syntax (headings, lists, bold, links, image placeholders, etc.)
- Keep the tone informative and easy to follow
- Structure content in a step-by-step or tutorial format
- Avoid markdown code blocks around the final output itself

---
### 📚 Example Content
<example>
### Example 1: Making a Simple Bookmark
# Simple Bookmark Making Tutorial

## Steps
1. Choose a piece of colored cardstock.
2. Use a pencil to draw a rectangle on the cardstock, with dimensions approximately 2 inches x 6 inches.
3. Cut out the rectangle along the pencil lines.
4. Personalize the bookmark by decorating it with stickers.
5. Done! You now have a unique bookmark.
</example>

---
### 👤 User Input Topic
The following is the topic requested by the user:
"""

question_doc = "Making Handmade Greeting Cards"

question = question_task + question_doc

ask_llm(question, query_engine)

### Making Handmade Greeting Cards

# Handmade Greeting Card Tutorial

## Materials Needed
- Colored cardstock
- Scissors
- Glue stick
- Markers or colored pencils
- Decorative elements (stickers, glitter, ribbons)

## Steps
1. **Choose Your Cardstock**
   - Select a piece of colored cardstock that matches the occasion (e.g., birthday, holiday).

2. **Cut the Cardstock**
   - Fold the cardstock in half to create a card shape. Ensure the edges are aligned neatly.

3. **Design the Front**
   - Use markers or colored pencils to draw a design on the front of the card. You can draw flowers, animals, or any other pattern you like.

4. **Add Decorations**
   - Enhance your design by adding stickers, glitter, or ribbons. Place them carefully to complement your drawing.

5. **Write a Message**
   - Open the card and write a personal message inside. Use a pen or marker to make it stand out.

6. **Final Touches**
   - Review your card to ensure everything is secure and looks as intended. Make any

The results show that the LLM largely followed the structure and style of the example. When specifying the output format in the prompt, it is recommended to also provide a few examples for the LLM to reference, which can make its output more stable and consistent.
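
When the examples are stored separately from the instructions, the few-shot prompt can be assembled in code. The following is a minimal sketch under that assumption; `build_few_shot_prompt` is a hypothetical helper, the section labels follow the prompt used above, and the bookmark example is abbreviated.

```python
def build_few_shot_prompt(task: str, examples: list[str], topic: str) -> str:
    """Combine task instructions, tagged examples, and the user's topic."""
    # Each example gets its own <example> block so the model can tell them apart.
    example_blocks = "\n\n".join(
        f"<example>\n{example.strip()}\n</example>" for example in examples
    )
    return (
        f"### Task Requirements\n{task.strip()}\n\n"
        f"### Example Content\n{example_blocks}\n\n"
        f"### User Input Topic\n{topic.strip()}"
    )


bookmark_example = """# Simple Bookmark Making Tutorial

## Steps
1. Choose a piece of colored cardstock.
2. Cut out a rectangle of about 2 inches x 6 inches.
3. Decorate it with stickers."""

few_shot_prompt = build_few_shot_prompt(
    task="Generate a tutorial that follows the style and structure of the examples.",
    examples=[bookmark_example],
    topic="Making Handmade Greeting Cards",
)
print(few_shot_prompt)
```

Adding a second or third example then only requires extending the `examples` list.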



### 4.5 Giving the Model "Thinking" Time

For some complex tasks, the prompt techniques described above may still not be enough for an LLM to complete the task. However, you can guide the LLM to output the intermediate steps of the task by letting it "think" step by step, so that it lays out more evidence before reaching a conclusion, which improves its performance on complex tasks. The chain-of-thought (CoT) method is one way to make the model think: it breaks a complex problem into subproblems and works through the intermediate steps before deriving the final answer.

Suppose you ask the LLM to solve the following math problem. For reference, the correct answer is 10,500 yuan. First, use a simple prompt:



In [16]:
question = """
[Background Information]
An educational training institution (hereinafter referred to as the "company") incurred the following main expenses in the 2023 fiscal year:
To provide classes for students in different cities, the company's teachers traveled on business trips 5 times during the year, each trip lasting one week. The specific expenses are as follows:
   - Transportation and accommodation fees: average 1600 yuan/trip
   - Teaching materials procurement costs: At the beginning of the year, the company purchased a batch of teaching materials for a total price of 10,000 yuan, which is expected to last for 4 years.

[Problem Description]
Based on the above background information, complete the following tasks:
Calculate the total travel expenses for the year due to teacher business trips, including the amortized cost of teaching materials.

[Output Requirements]
Provide only the total travel expenses directly, without any other information. """
ask_llm(question, query_engine)

9000 yuan

As the result above shows, the LLM's calculation is incorrect. Next, use the chain-of-thought (CoT) method to let the LLM think step by step.



In [17]:
question = """An educational training institution (hereinafter referred to as the "company") incurred the following major expenses in the 2023 fiscal year:
To provide classes for students in different cities, the company's teachers traveled on business trips 5 times throughout the year, with each trip lasting one week. The specific expenses are as follows:
   - Transportation and accommodation fees: an average of 1600 yuan per trip
   - Teaching materials procurement costs: At the beginning of the year, the company purchased a batch of teaching materials for a total price of 10,000 yuan, which is expected to last for 4 years.
   
### Problem Description
Based on the above background information, complete the following tasks:
Calculate the total travel expenses for the year caused by teacher business trips, including the amortized cost of teaching materials.

### Output Requirements
Please derive step by step to calculate the total travel expenses."""

ask_llm(question, query_engine)

To calculate the total travel expenses for the year, including the amortized cost of teaching materials, follow these steps:

1. **Calculate the total transportation and accommodation fees:**
   - Number of trips: 5
   - Average cost per trip: 1600 yuan
   - Total transportation and accommodation fees = 5 trips * 1600 yuan/trip = 8000 yuan

2. **Calculate the annual amortized cost of teaching materials:**
   - Total cost of teaching materials: 10,000 yuan
   - Expected lifespan: 4 years
   - Annual amortized cost = 10,000 yuan / 4 years = 2500 yuan

3. **Calculate the total travel expenses for the year:**
   - Total transportation and accommodation fees: 8000 yuan
   - Annual amortized cost of teaching materials: 2500 yuan
   - Total travel expenses = 8000 yuan + 2500 yuan = 10500 yuan

Therefore, the total travel expenses for the year, including the amortized cost of teaching materials, is 10,500 yuan.

After the prompt is optimized, the LLM computes the correct result. When developing LLM applications, adding a chain-of-thought (CoT) instruction to the prompt can therefore help reasoning tasks of this kind be carried out correctly.
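If you want to apply this pattern to many questions, the step-by-step instruction can be appended by a small helper. The sketch below is only an illustration: `with_chain_of_thought` is a hypothetical function, and the instruction wording is an assumption modeled on the prompt used in this section.

```python
def with_chain_of_thought(question: str) -> str:
    """Append a step-by-step instruction to a question (hypothetical helper)."""
    instruction = "Please derive step by step before giving the final answer."
    return f"{question.rstrip()}\n\n### Output Requirements\n{instruction}"

print(with_chain_of_thought("Calculate the total travel expenses for the year."))
```

The returned string can then be passed to `ask_llm` in the same way as the hand-written prompt.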

There are also other methods for making LLMs "think," such as tree of thoughts (ToT) and graph of thoughts (GoT). However, given the current state of LLMs, guiding a model to "think" is still not enough on its own to complete more complex tasks. LLM applications are gradually evolving from CoT prompting towards multi-agent systems (MAS). You can read more details in <2_6_Extend_FAQ_Bot_Capabilities_with_Plugins>.

## 5. Using Large Language Models (LLMs) for Intent Recognition

After learning about prompt engineering techniques, your colleagues have realized that the Q&A bot can be used not only to look up information but also to help them review and translate documents. In most cases your Q&A bot can help identify issues, but it still has some limitations, as the following example shows:



In [18]:
rag.ask('Please help me review this sentence: Does a technical content engineer need to design and develop high-quality educational materials and courses?', query_engine=query_engine)

Yes, a Content Development Engineer is responsible for designing and developing high-quality educational materials and courses, including writing syllabi, creating courseware, and designing assessment tools.

The result above shows that the Q&A bot answered the user's sentence directly, without understanding that the intended task was to review the sentence for errors. To understand why this happened, let's examine what content the Q&A bot retrieved from the knowledge base before generating its answer.

The following only shows part of the retrieved content:

```
Core Responsibilities
Integrating educational theory with technical practice, support learners' growth and development through high-quality content creation.
```

You can refer to <2_4_Automate_Evaluation_of_FAQ_Bot_Performance> to learn how to view the retrieved fragments from the knowledge base.  

Most of the retrieved fragments relate to the responsibilities of a technical content engineer. It is reasonable to assume that this context influenced the Q&A bot, causing it to answer a question about the role's responsibilities rather than perform the requested review of the sentence.

Because contextual information significantly affects the responses generated by LLMs, not every query should go through the RAG pipeline. You can use an LLM to perform **intent recognition** first, that is, to classify the type of user question.

For example:
- If the task involves **document review**, **grammar checking**, or **content translation**, the input should bypass the RAG pipeline and be handled directly by the LLM.
- Only for **internal knowledge queries** — such as "What are the guidelines for writing API documentation?" — should the RAG pipeline be used to generate answers based on the knowledge base.

<img src="https://img.alicdn.com/imgextra/i3/O1CN01nWw7IO2AGCuuujq8G_!!6000000008175-2-tps-3329-1398.png" width="1000">  



There are two ways to use LLMs for intent recognition:

- Using prompts: Design specific prompts that guide the LLM to generate the expected responses. This method does not modify the model's parameters; it relies on constructed inputs to activate knowledge already present in the model.
- Fine-tuning the model: Starting from a pre-trained foundation model, further train it on annotated data so that it becomes better at classifying intents. Fine-tuning adjusts some or all of the model's parameters.

In this section, we will help the LLM perform intent recognition by crafting effective prompts. You can learn about fine-tuning in the subsection <2_7_Improve_Model_Accuracy_and_Efficiency_via_Fine_Tuning>.



### 5.1 Intent Recognition

Next, we will construct a prompt that enables the LLM to classify questions. Because the classification result is used afterwards to route the question to document review, translation, or the RAG application, it must be in a predictable format. The following prompting techniques help ensure that user questions are classified accurately:
- Specify output format: Define the output format to make classification results standardized and easy to parse.
- Few-shot examples: Provide examples to help LLMs understand the characteristics and classification rules of each category.  
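Few-shot examples are easier to maintain when they are kept as data and rendered into the prompt. The following is a minimal sketch of that idea; `build_few_shot_block` is a hypothetical helper and not part of the `chatbot` package, and the "Example N" layout is simply one possible format.

```python
def build_few_shot_block(examples):
    """Render (user_input, category) pairs as numbered few-shot examples."""
    lines = []
    for i, (user_input, category) in enumerate(examples, start=1):
        lines.append(f'Example {i}: User input: "{user_input}"')
        lines.append(f"Category: {category}")
        lines.append("")  # blank line between examples
    return "\n".join(lines).rstrip()

examples = [
    ("What are the commonly used project management tools within our company?",
     "Internal Company Document Query"),
    ("Please translate the following sentence into French: How can we finish the assignment on time?",
     "Content Translation"),
]
print(build_few_shot_block(examples))
```

Adding or removing an example then only requires editing the list, not the prompt text.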



In [23]:
from chatbot import llm

# Build the prompt
prompt = '''
[Role Background]
You are a question classification router, responsible for determining the type of user questions and categorizing them into one of the following three categories:
1. Internal Company Document Query
2. Content Translation
3. Document Review

[Task Requirements]
Your task is to judge the intent based on the user's input content and select only the most appropriate category. Output only the category name without additional explanation. The judgment criteria are as follows:

- If the question involves company policies, processes, internal tools, or job descriptions and responsibilities, choose "Internal Company Document Query".
- If the input contains any foreign language or words like "translation", choose "Content Translation".
- If the question involves checking or summarizing external documents or link content, choose "Document Review".
- The user's previous inputs have no relation to the question classification; consider each dialogue independently for classification.

[Few-shot Examples]
Example 1: User input: "What are the commonly used project management tools within our company?"
Category: Internal Company Document Query

Example 2: User input: "Please translate the following sentence into French: How can we finish the assignment on time?"
Category: Content Translation

Example 3: User input: "Please review the document at this link: https://help.aliyun.com/zh/model-studio/user-guide/long-context-qwen-long"
Category: Document Review

Example 4: User input: "Please review the following content: Does a technical content engineer need to design and develop high-quality educational materials and courses?"
Category: Document Review

Example 5: User input: "What are the core responsibilities of a technical content engineer?"
Category: Internal Company Document Query

[User Input]
The following is the user's input, please determine the classification:
'''

# Get the type of question
def get_question_type(question):
    return llm.invoke(prompt + question)

print(get_question_type('https://www.promptingguide.ai/zh/techniques/fewshot'), '\n')
print(get_question_type('Translate the following sentence into German: That is a big one I dont know why'), '\n')
print(get_question_type('As a technical content engineer, what should I pay attention to?'), '\n')

Document Review 

Content Translation 

Internal Company Document Query 



By providing clear output formats and few-shot examples, the Q&A bot can more accurately identify question types and produce outputs in the expected format. This optimization makes classification tasks more standardized and lays the groundwork for incorporating intent recognition into the Q&A bot.  
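Even with a specified output format, a model may occasionally add whitespace, quotes, or a trailing period. When code later compares the output against exact category names, such differences cause a mismatch. The sketch below shows one way to normalize the output first; `CATEGORIES` and `normalize_category` are hypothetical names introduced for illustration, and the function assumes the classifier returns plain text.

```python
CATEGORIES = (
    "Internal Company Document Query",
    "Content Translation",
    "Document Review",
)

def normalize_category(raw):
    """Map raw model output to a known category, or None if nothing matches."""
    if raw is None:
        return None
    # Remove surrounding whitespace, quotes, and a trailing period
    cleaned = str(raw).strip().strip("\"'.").strip().lower()
    for category in CATEGORIES:
        if cleaned == category.lower():
            return category
    return None

print(normalize_category(' "Document Review".\n'))  # Document Review
```

Returning `None` for unknown output lets the caller decide how to handle a failed classification.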



### 5.2 Applying Intent Recognition to Q&A Bots

Once the intent of a user's question can be recognized, the Q&A bot can first identify the question type and then use a different prompt and workflow to respond to each type.



In [28]:
def ask_llm_route(question):
    question_type = get_question_type(question)
    print(f'Question: {question}\nType: {question_type}')
  
    reviewer_prompt = """
    [Role Background]
    You are a document error-checking expert responsible for identifying obvious errors in documents or web content.
    [Task Requirements]
    - Your response should be concise.
    - If there are no obvious issues, reply directly with 'No issues'.\n
    [Input as follows]\n"""
  
    translator_prompt = """
    [Task Requirements]
    You are a translation expert who identifies text in different languages and translates it into the target language. If not specified, the target language is French.
    [Input as follows]\n"""

    if question_type == 'Document Review':
        return llm.invoke(reviewer_prompt + question)
    elif question_type == 'Internal Company Document Query':
        return rag.ask(question, query_engine=query_engine)
    elif question_type == 'Content Translation':
        return llm.invoke(translator_prompt + question)
    else:
        return "Unable to recognize the question type. Please re-enter."

query_engine = rag.create_query_engine(index=rag.load_index())

In [29]:
# Question 1
print(ask_llm_route('https://www.promptingguide.ai/zh/techniques/fewshot'), '\n')

# Question 2
print(ask_llm_route('Please help me check this document: Do technicel content engineers need to optimize and update content and collaborate across departments?'), '\n')

# Question 3
print(ask_llm_route('Do technical content engineers need to optimize and update content and collaborate across departments?'), '\n')

# Question 4:
print(ask_llm_route('Translate the following sentence into French: A true master always carries the heart of a student.'), '\n')

Question: https://www.promptingguide.ai/zh/techniques/fewshot
Type: Document Review
No issues 

Question: Please help me check this document: Do technicel content engineers need to optimize and update content and collaborate across departments?
Type: Document Review
The word "technicel" is misspelled. It should be "technical." 

Question: Do technical content engineers need to optimize and update content and collaborate across departments?
Type: Internal Company Document Query
Yes, technical content engineers are responsible for optimizing and updating content to ensure it remains relevant and effective. They also collaborate across departments, working closely with instructional designers, educational psychologists, technical teams, and marketing personnel to ensure the smooth implementation of technical processes and effective communication to the target audience.None 

Question: Translate the following sentence into French: A true master always carries the heart of a student.
Type: 

As the experiments above show, the optimized Q&A bot no longer sends every question through the RAG pipeline. This saves resources and also prevents the knowledge base from interfering with the LLM's reasoning, which could otherwise lower response quality.
- Resource savings: For questions about checking document errors, the LLM can reply directly without retrieving reference materials. The previous implementation retrieved them anyway, which wasted resources.
- Avoiding misunderstandings: The previous implementation retrieved reference materials every time, and the recalled text segments could interfere with the LLM's understanding of the question, leading to irrelevant responses.
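As the number of question types grows, an `if`/`elif` chain becomes harder to maintain. One alternative is a dispatch table that maps each type to a handler. The sketch below illustrates the idea with stand-in functions so that it runs without any model calls; `route`, `fake_classify`, and the handlers are hypothetical and only mimic the roles of `get_question_type`, `llm.invoke`, and `rag.ask`.

```python
def route(question, classify, handlers, fallback):
    """Classify the question, then call the handler registered for that type."""
    question_type = (classify(question) or "").strip()
    handler = handlers.get(question_type, fallback)
    return handler(question)

# Stand-in classifier: a real one would call the LLM with the classification prompt.
def fake_classify(question):
    return "Content Translation\n" if "translate" in question.lower() else "Document Review"

# Stand-in handlers: real ones would call the LLM directly or the RAG pipeline.
handlers = {
    "Document Review": lambda q: f"[review] {q}",
    "Content Translation": lambda q: f"[translate] {q}",
}

def fallback(question):
    return "Unable to recognize the question type. Please re-enter."

print(route("Translate this into French: hello", fake_classify, handlers, fallback))
```

Stripping the classifier output before the lookup also guards against a trailing newline or space in the model's reply.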



## 6. Reasoning LLMs

The prompt techniques and prompt frameworks discussed earlier can be widely applied to general large language models (such as Qwen2.5-max, GPT-4, DeepSeek-V3). These models are designed for a wide range of scenarios including general conversation, knowledge-based Q&A, text generation, and more. In addition to general-purpose LLMs, there is currently another category of LLMs specifically designed for "reasoning"—`Reasoning LLMs`.

### 6.1 What Are Reasoning LLMs?

You may have already learned about Alibaba Cloud's reasoning LLM [QwQ](https://help.aliyun.com/zh/model-studio/user-guide/qwq) through the extended reading in Section 2.1. To facilitate subsequent calls in this section, the code provided in Section 2.1 has been rewritten, and you can try running the following code:  



In [30]:
from openai import OpenAI
import os

def reasoning_model_response(user_prompt, system_prompt="You are a programming assistant.", model="qwq-32b"):
    """
    user_prompt: The prompt input by the user
    system_prompt: The system prompt that sets the model's role
    model: Here we use qwq-32b as an example. You can replace it with other reasoning model names as needed, such as: deepseek-r1
    """
    # Initialize client
    client = OpenAI(
        api_key=os.getenv("DASHSCOPE_API_KEY"),
        base_url="https://dashscope.aliyuncs.com/compatible-mode/v1"
    )

    # Initialize status variables
    is_answering = False

    # Initiate streaming request
    completion = client.chat.completions.create(
        model=model,
        # messages=[{"role": "user", "content": prompt}],
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}
        ],
        stream=True,
    )

    # Print thinking process title
    print("\n" + "=" * 20 + "Thinking Process" + "=" * 20 + "\n")

    # Handle streaming response
    for chunk in completion:
        if chunk.choices:
            delta = chunk.choices[0].delta
            if hasattr(delta, 'reasoning_content') and delta.reasoning_content is not None:
                # Process thinking process content
                print(delta.reasoning_content, end='', flush=True)
            else:
                # Switch to answer output mode
                if delta.content != "" and not is_answering:
                    print("\n" + "=" * 20 + "Complete Response" + "=" * 20 + "\n")
                    is_answering = True
                # Process answer content
                if delta.content:
                    print(delta.content, end='', flush=True)

In [31]:
reasoning_model_response(user_prompt="Who are you?")



Okay, the user asked, "Who are you?" Let me think about how to respond.

First, I need to remember my role. I'm a programming assistant, right? But the user might not know that yet. So I should start by introducing myself as Qwen, a large language model developed by Alibaba Cloud. That's the basic info.

Wait, but the user specifically mentioned being a programming assistant. Maybe I should highlight that aspect more. They might be looking for help with coding. So I should mention my capabilities in programming, like writing code, explaining concepts, debugging, etc.

Hmm, also, I should keep it friendly and open-ended. Let them know I can help with various tasks beyond just programming. Maybe list some examples to give them an idea of what I can do. But not too long, keep it concise.

Oh, and I should make sure to invite them to ask for help. Something like "Feel free to ask..." to encourage interaction. Alright, putting it all together now.

Hello, I’m Qwen, a large language model 

As the example shows, a reasoning model produces an additional "`thinking process`" compared with a general LLM. This is similar to how some people solve math problems by first deriving the answer step by step on scratch paper instead of stating it directly, which reduces the model's "off-the-cuff" errors. During step-by-step thinking, if the model finds a contradiction at some step, it can go back, check, and adjust its reasoning. Showing the reasoning steps also makes it easier for people to follow and verify the model's logic.<br>
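If you need the thinking process and the final answer as two separate strings rather than printed output, the streamed chunks can be collected into two buffers. The following is a minimal sketch under the assumption that each chunk exposes `choices[0].delta` with `reasoning_content` and `content` attributes, as in the code above; `split_stream` and `fake_chunk` are hypothetical helpers, and the fake chunks only imitate that attribute layout so the example runs offline.

```python
from types import SimpleNamespace

def split_stream(chunks):
    """Collect reasoning text and answer text from streamed chunks separately."""
    reasoning_parts, answer_parts = [], []
    for chunk in chunks:
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta
        reasoning = getattr(delta, "reasoning_content", None)
        if reasoning:
            reasoning_parts.append(reasoning)
        content = getattr(delta, "content", None)
        if content:
            answer_parts.append(content)
    return "".join(reasoning_parts), "".join(answer_parts)

def fake_chunk(reasoning=None, content=None):
    """Build a stand-in chunk with the assumed attribute layout (illustration only)."""
    delta = SimpleNamespace(reasoning_content=reasoning, content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

chunks = [
    fake_chunk(reasoning="Let me think. "),
    fake_chunk(reasoning="2 + 2 = 4."),
    fake_chunk(content="4"),
]
reasoning, answer = split_stream(chunks)
print(answer)  # 4
```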
Compared with general LLMs, reasoning models are usually more reliable on complex problems that require rigorous reasoning, such as mathematical problem-solving, code writing, and legal case analysis. This does not mean that reasoning models are necessarily better; each type of model has its own application scenarios. The following table compares the two types along some typical dimensions:<br>

| Dimension            | Reasoning Model  | General Model  |
|-------------------|------------------|------------|
| Design Goal     | Focuses on tasks requiring deep analysis such as **logical reasoning, multi-step problem solving, and mathematical calculations** | Targets broad scenarios such as **general conversation, knowledge-based Q&A, text generation** |
| Training Data Emphasis  | Enhanced reasoning capabilities through large datasets of **math problem solutions, code logic, scientific reasoning**, etc. | Covers massive data across multiple domains such as **encyclopedias, literature, conversations**, etc. |
| Typical Output Characteristics | Outputs include **complete derivations**, focusing on the integrity of logical chains | Outputs are **concise and direct**, emphasizing natural language expression of results |
| Response Speed | Complex reasoning tasks result in **slower responses** (requires multi-step calculations) | Routine tasks have **faster responses** (mainly single-step generation) |

Should you choose a reasoning model or a general model? Here are some recommendations:
- **Clearly defined general tasks**: For clearly defined problems, **general models** generally handle them well.
- **Complex tasks**: For very complex tasks that require relatively **more precise and reliable** answers, it is recommended to use **reasoning models**. These tasks may include:
    - Ambiguous tasks: Very little task-related information is available, and you cannot provide the model with relatively clear guidance.
    - Finding a needle in a haystack: Passing a large amount of unstructured data, extracting the most relevant information, or finding connections/differences.
    - Debugging and improving code: Requires reviewing and further debugging and improving large amounts of code.
- **Speed and cost**: Generally speaking, reasoning models have longer reasoning times. If you are sensitive to time and cost and the task complexity is not high, **general models** may be a better choice.<br>

Of course, you can also combine the two types of models in your application: using the reasoning model to complete the Agent's planning and decision-making, and using the general model to execute tasks.

### 6.2 Prompt Techniques Suitable for Reasoning Large Language Models

Reasoning models can provide detailed and well-structured responses even when faced with relatively ambiguous tasks. Even so, you can still use **prompt techniques** to secure the baseline quality of their reasoning:<br>
#### Tip One: Keep task prompts concise and clear, and provide sufficient background information<br>
The **clear expression of requirements** introduced in Section 4.1 also applies to reasoning models. Although reasoning models are highly capable, they cannot "read minds." You need to keep your prompts **concise and clear**, allowing the reasoning model to focus on the core task.<br>  



In [32]:
def example(a):
  b = []
  for i in range(len(a)):
    b.append(a[i]*2)
  return sum(b)

The example above shows that even if you give a reasoning model only a piece of code, it can still generate a rich answer through a series of reasoning steps. However, the response may contain a lot of information that you do not care about. You can clarify the **task objective** to obtain more targeted suggestions:



In [33]:
prompt_A = """
What's wrong with the following Python code? How to optimize it?
def example(a):
    b = []
    for i in range(len(a)):
        b.append(a[i]*2)
    return sum(b)
"""

reasoning_model_response(user_prompt=prompt_A)



Okay, let me try to figure out what's wrong with this Python code and how to optimize it. The function example(a) takes a list 'a', creates a new list 'b' by doubling each element, then returns the sum of 'b'. 

Hmm, first, I should check if there are any obvious issues. The code seems to work, but maybe there's a more efficient way. Let me think step by step.

First, the code uses a for loop with range(len(a)), which is common but not the most Pythonic way. Maybe using a list comprehension would be better here. That would make the code shorter and possibly faster since list comprehensions are optimized in Python.

Wait, but even better, maybe we can avoid creating the list 'b' altogether. Since the final result is the sum of all elements multiplied by 2, instead of creating a list and then summing, we can just iterate through each element, multiply by 2, and add to a running total. That way, we don't need to store all the doubled elements in memory. That would save space, especially

Similarly, you can further narrow down the scope by combining techniques from this section, such as **4.2 Limiting Roles and Audience** and **4.3 Specifying Output Formats**, to ensure the results meet your expectations.<br>
At the same time, if the prompt is relatively complex, you can use **delimiters** to help the model better understand your intent.<br>  
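Delimited prompts can also be assembled from named sections, which keeps each part easy to change on its own. The sketch below is only an illustration: `build_tagged_prompt` is a hypothetical helper, and the XML-style tags are one possible delimiter choice.

```python
def build_tagged_prompt(sections):
    """Wrap each section in <name>...</name> delimiters, in insertion order."""
    blocks = [f"<{name}>\n{text.strip()}\n</{name}>" for name, text in sections.items()]
    return "\n\n".join(blocks)

prompt = build_tagged_prompt({
    "audience": "Beginner Python developers",
    "task": "Optimize the performance of the function in the code section.",
    "code": "def example(a):\n    return sum([x * 2 for x in a])",
})
print(prompt)
```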



In [34]:
prompt_B = """
<audience>Beginner Python developers</audience>

<task>Function performance optimization, optimize the code in the code.</task>

<format>
If there are multiple optimization solutions, please output them in the following format:
[Optimization Solution X]
Problem Description: [Description]
Optimization Solution: [Description]
Example Code: [Code Block]
</format>

<code>
def example(a):
  b = []
  for i in range(len(a)):
    b.append(a[i]*2)
  return sum(b)
</code>
"""

reasoning_model_response(user_prompt=prompt_B)



Okay, let's see. The user wants me to optimize this Python function. The function takes a list 'a', creates a new list 'b' where each element is doubled, then returns the sum of 'b'. Hmm, right now it's using a for loop with append and then summing. I need to think of ways to make this faster or more efficient.

First, maybe using a list comprehension instead of a for loop. List comprehensions are generally faster in Python. So instead of initializing an empty list and appending each time, just do [x*2 for x in a]. That should be quicker.

Wait, but then you still have to sum the list. Oh right, but even better, maybe combine the doubling and summing in a generator expression to avoid creating the list at all. Because sum can take a generator, so you don't need to store all elements in a list first. That would save memory and time, especially for large lists. So instead of building the list 'b', just compute each element on the fly and sum them directly. So like sum(x*2 for x in a). 
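The optimization that the model arrives at in its reasoning can be checked quickly. The sketch below compares the original function with the generator-expression version on sample data; `example_optimized` is a name introduced here for illustration.

```python
def example(a):
    # Original version from the prompt
    b = []
    for i in range(len(a)):
        b.append(a[i] * 2)
    return sum(b)

def example_optimized(a):
    # Generator expression: no intermediate list is built
    return sum(x * 2 for x in a)

data = list(range(1000))
assert example(data) == example_optimized(data)
print(example_optimized([1, 2, 3]))  # 12
```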

#### Tip Two: Avoid Chain-of-Thought Prompts<br>
In section 4.5, you learned about improving response quality by using the chain-of-thought technique with large language models (LLMs).<br>
Generally, there is no need to prompt reasoning models with phrases like "think step by step" or "explain your reasoning," because they already perform deep thinking on their own. Such prompts might actually restrict the model's performance, so use them only in the rare case where you need the LLM to strictly follow a fixed line of reasoning.<br>


#### Tip Three: Adjust Prompts Based on Model Responses<br>
Because the response of a reasoning model includes its **reasoning steps**, you can see how it interpreted your prompt, which makes it easier to refine the prompt.<br>
Thus, there’s no need to worry about whether your prompts are perfect. Simply keep engaging with the reasoning model, providing additional information and refining prompts during the conversation.<br>
For instance, when your descriptions are **too abstract** or **imprecise**, you can use the technique of **adding examples** discussed in section 4.4 to clarify these details. Such examples can often be selected from past conversations with the model.<br>
This process can be repeated multiple times—continuously adjusting prompts and allowing the model to iteratively reason until it meets your requirements.<br>

#### Tip Four: Collaborative Task Completion Between Reasoning Models and General Models<br>
Reasoning models are akin to "smarter employees," effectively assisting you with tasks like reasoning and planning. However, for execution tasks involving repetitive actions, reasoning models may overthink, which could be unnecessary.<br>
A better approach is to let the models collaborate by focusing on their respective strengths: reasoning models handle "slow thinking" for planning or reasoning, while general models focus on "fast thinking" or executing specific actions using tools.<br>
You can try running and understanding the following project:  



In [None]:
from openai import OpenAI
import os
import json
from typing import Generator

class TaskPlanningSystem:
    def __init__(self):
        # Initialize client connection
        self.client = OpenAI(
            api_key=os.getenv("DASHSCOPE_API_KEY"),
            base_url=os.environ["BASE_URL"]
        )
        
        # System toolset (can be extended as needed)
        self.tools = {
            "create_task": {
                "desc": "Create a new task item",
                "params": {"description": "str", "priority": "int"}
            },
            "schedule_event": {
                "desc": "Schedule an event", 
                "params": {"title": "str", "time": "datetime"}
            }
        }

    def generate_plan(self, user_request: str) -> Generator[str, None, None]:
        """Stream-generated task planning"""
        # Build planning prompt
        system_prompt = f"""You are a senior task planner, please convert the user's requirements into executable steps:
        
        Available tools (format: [[tool name]]):
        {json.dumps(self.tools, indent=2)}
        
        Output requirements:
        1. Use Markdown list format
        2. Each step should indicate the tool name
        3. Include necessary parameter examples
        """
        model="qwq-32b"
        # Initiate streaming request
        completion = self.client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_request}
            ],
            stream=True,
            temperature=0.3
        )

        # Print thinking process title
        print(f"\n\n=== [Planning Thinking Phase] Using reasoning model: {model}===")
        is_answering = False
        # Process streaming response
        for chunk in completion:
            if chunk.choices:
                delta = chunk.choices[0].delta
                if hasattr(delta, 'reasoning_content') and delta.reasoning_content is not None:
                    # Process thinking process content
                    reasoning_content = delta.reasoning_content
                    yield reasoning_content
                else:
                    # Switch to answer output mode
                    if delta.content != "" and not is_answering:
                        print(f"\n\n=== [Planning Generation Phase] Using reasoning model: {model}===")
                        is_answering = True
                    # Process answer content
                    if delta.content:
                        content = delta.content
                        yield content

    def execute_plan(self, plan: str) -> dict:
        """Execute the generated task plan"""
        # Call general model to parse the plan
        analysis_prompt = f"""Please parse the following task plan and generate executable instructions:
        
        Plan content:
        {plan}
        
        Output requirements:
        - The returned content must be in JSON format only, do not include other information, do not output code blocks.
        - Include tool names and parameters
        - Example:
            {{"steps": [
                {{"tool": "create_task", "params": {{"description": "..."}}}}
            ]}}
        """
        model="qwen-plus-0919"
        response = self.client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": analysis_prompt}],
            temperature=0
        )

        print(f"\n\nUsing general model {model} to format output:\n{response.choices[0].message.content}")
        # Parse execution instructions
        try:
            instructions = json.loads(response.choices[0].message.content)
            return self._run_instructions(instructions)
        except json.JSONDecodeError:
            return {"error": "Instruction parsing failed"}

    def _run_instructions(self, instructions: dict) -> dict:
        """Actually execute tool calls"""
        results = []
        for step in instructions.get("steps", []):
            tool = step.get("tool")
            params = step.get("params", {})
            
            # Execute tool call (this is an example implementation)
            if tool == "create_task":
                results.append({
                    "tool": tool,
                    "result": f"Task created: {params.get('description')}"
                })
            elif tool == "schedule_event":
                results.append({
                    "tool": tool,
                    "result": f"Event scheduled: {params.get('title')}"
                })
            else:
                results.append({"error": f"Unknown tool: {tool}"})
        
        return {"status": "completed", "results": results}

# Usage example
if __name__ == "__main__":
    planning_system = TaskPlanningSystem()
    
    # Example user request
    user_request = "Please schedule next week's product launch and create related preparation tasks"
    
    # Planning generation phase
    plan_stream = planning_system.generate_plan(user_request)
    generated_plan = []
    for chunk in plan_stream:
        print(chunk, end="", flush=True)
        generated_plan.append(chunk)
    
    # Note: the joined text contains both the reasoning and the final plan
    full_plan = "".join(generated_plan)
    
    print("\n\n=== Plan Execution Phase ===")
    result = planning_system.execute_plan(full_plan)
    print("\nCall tool execution, execution result:")
    print(json.dumps(result, indent=2, ensure_ascii=False))



=== [Planning Thinking Phase] Using reasoning model: qwq-32b===
Okay, the user wants me to schedule next week's product launch and create related preparation tasks. Let me break this down.

First, I need to use the available tools. The tools are create_task and schedule_event. The schedule_event requires a title and a datetime. The create_task needs a description and priority.

Starting with scheduling the event. The product launch is next week. I should pick a specific day and time. Let's say next Friday at 2 PM. So the title could be "Product Launch Event" and the time would be something like "2023-11-17T14:00:00" assuming next Friday is the 17th. Wait, but the exact date might depend on the current date. Since the user didn't specify, I'll use a placeholder or example date.

Next, the preparation tasks. What tasks are usually needed before a product launch? Maybe finalizing the product features, preparing marketing materials, setting up the event venue, and training the team. Each

The code above implements the core interaction process shown below. It uses the qwq-32b reasoning model to generate the plan as a stream, then uses qwen-plus-0919 to parse the plan into instructions, and finally dispatches the tool calls through the `_run_instructions` method. Keeping tool invocation in its own method decouples it from planning and makes it easier to add tools later:

<a href="https://img.alicdn.com/imgextra/i4/O1CN01OOAmtb1flZx0MxbUn_!!6000000004047-2-tps-1623-603.png" target="_blank">
<img src="https://img.alicdn.com/imgextra/i4/O1CN01OOAmtb1flZx0MxbUn_!!6000000004047-2-tps-1623-603.png" width="1000">
</a>
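
To illustrate why decoupling tool invocation makes the system easier to extend, here is a minimal sketch of a registry-based dispatcher. It is not the course's `_run_instructions` implementation. The names `TOOL_REGISTRY`, `register_tool`, and `run_instructions`, as well as the instruction format (a list of dictionaries with `tool` and `params` keys), are assumptions made for this example.

```python
from typing import Any, Callable, Dict, List

# Hypothetical registry: maps a tool name to the function that handles it.
TOOL_REGISTRY: Dict[str, Callable[[Dict[str, Any]], str]] = {}


def register_tool(name: str):
    """Decorator that adds a handler to the registry under the given name."""
    def decorator(func: Callable[[Dict[str, Any]], str]):
        TOOL_REGISTRY[name] = func
        return func
    return decorator


@register_tool("create_task")
def create_task(params: Dict[str, Any]) -> str:
    return f"Task created: {params.get('description')}"


@register_tool("schedule_event")
def schedule_event(params: Dict[str, Any]) -> str:
    return f"Event scheduled: {params.get('title')}"


def run_instructions(instructions: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Look up each instruction's tool in the registry and call it.

    Assumed instruction shape: {"tool": "<name>", "params": {...}}.
    """
    results = []
    for instruction in instructions:
        tool = instruction.get("tool")
        params = instruction.get("params", {})
        handler = TOOL_REGISTRY.get(tool)
        if handler is None:
            results.append({"error": f"Unknown tool: {tool}"})
        else:
            results.append({"tool": tool, "result": handler(params)})
    return {"status": "completed", "results": results}


if __name__ == "__main__":
    print(run_instructions([
        {"tool": "schedule_event", "params": {"title": "Product Launch Event"}},
        {"tool": "create_task", "params": {"description": "Prepare marketing materials"}},
    ]))
```

With this layout, adding a new tool only requires writing one decorated function. The dispatch loop itself does not change.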



## ✅ Summary of this section

By working through this section, you have learned prompt techniques suitable for both general-purpose large language models and reasoning models. Applied flexibly, these techniques help secure a baseline for AI response quality.<br>
However, as AI technology continues to evolve, these techniques may also change (for example, reasoning models may no longer need to improve performance through chain-of-thought prompting). What you need to understand is that the core purpose of these prompt techniques is:<br>
<mark>To clearly express your core needs and provide rich and useful background information (context)</mark> — this is crucial for making good use of large language models.

In real LLM application projects, domain experts often take part in designing the prompts. Prompts that are hardcoded in your engineering code should therefore be made configurable, and ideally the entire application flow should be configurable as well, so that domain experts can more easily participate in designing both the prompts and the overall process. Alibaba Cloud's Model Studio provides a visual [application building](https://help.aliyun.com/zh/model-studio/user-guide/application-introduction#7c79befb2djg9) capability that lets users write prompts and visually build a complex application flow on the page, which suits LLM application development projects that involve non-technical domain experts.
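
As a minimal sketch of what moving a prompt out of the code could look like, the example below keeps the prompt text in a JSON configuration and fills it in at runtime. The configuration layout, the `qa_bot` entry, and the helper names `load_prompts` and `build_prompt` are hypothetical and chosen only for illustration. In a real project the JSON would typically live in a separate file that domain experts can edit.

```python
import json
from string import Template

# Hypothetical configuration. Inlined here so the example is self-contained;
# in practice it would be stored in an editable file outside the code.
PROMPT_CONFIG_JSON = """
{
  "qa_bot": {
    "role": "You are a helpful assistant for company policy questions.",
    "template": "$role\\n\\nContext:\\n$context\\n\\nQuestion: $question"
  }
}
"""


def load_prompts(raw_json: str) -> dict:
    """Parse the prompt configuration into a dictionary."""
    return json.loads(raw_json)


def build_prompt(prompts: dict, name: str, **variables: str) -> str:
    """Fill the named template with its configured role and the given variables."""
    entry = prompts[name]
    return Template(entry["template"]).substitute(role=entry["role"], **variables)


if __name__ == "__main__":
    prompts = load_prompts(PROMPT_CONFIG_JSON)
    print(build_prompt(
        prompts,
        "qa_bot",
        context="Employees receive 10 days of annual leave.",
        question="How many days of annual leave do I get?",
    ))
```

Because the role and template are data rather than code, a domain expert can adjust the wording without touching the application logic.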

In the next section, you will learn automated evaluation methods for testing the performance of Q&A bots, so that you can use quantitative metrics to evaluate the effectiveness of your prompts.

## 🔥 Post-class Quiz

### 🔍 Single-choice Question
<details>
<summary style="cursor: pointer; padding: 12px; border: 1px solid #dee2e6; border-radius: 6px;">
<b>Which of the following prompt elements is used to explicitly define the task that the large language model needs to complete❓</b>

- A. Role
- B. Audience
- C. Objective
- D. Context

**[Click to view answer]**
</summary>

<div style="margin-top: 10px; padding: 15px; border: 1px solid #dee2e6; border-radius: 0 0 6px 6px;">

✅ **Reference Answer: C**
📝 **Explanation**:
- The Objective explicitly defines the operation or outcome that the large language model (LLM) is required to perform. The other options do not directly define the task itself.
- Role defines the identity that the LLM should assume, Audience specifies the target group, and Context provides background information.

</div>
</details>

---


### 🔍 Multiple-choice Question
<details>
<summary style="cursor: pointer; padding: 12px; border: 1px solid #dee2e6; border-radius: 6px;">
<b>Suppose you want to use a large language model to generate a description of the solar system suitable for third-grade students. Which of the following user_query designs are more reasonable❓</b>

- A. user_query="Write about the solar system."
- B. user_query="Tell me about the solar system."
- C. user_query="Explain the solar system as if you were talking to a third-grade student."
- D. user_query="Write a short and engaging description of the solar system for a third-grade audience, focusing on key planets and their characteristics."
- E. user_query="What is the solar system?"

**[Click to view answer]**
</summary>

<div style="margin-top: 10px; padding: 15px; border: 1px solid #dee2e6; border-radius: 0 0 6px 6px;">

✅ **Reference Answer: CD**
📝 **Explanation**:
- Options C and D clearly specify the target audience and provide requirements for content and style.

</div>
</details>

---


### 🔍 Multiple-choice Question
<details>
<summary style="cursor: pointer; padding: 12px; border: 1px solid #dee2e6; border-radius: 6px;">
<b>When using a reasoning-based large language model (such as qwq-32b) to perform reasoning tasks, which of the following prompting techniques are recommended❓</b>

- A. Provide simple and clear task instructions
- B. Supplement with sufficient background information
- C. If there are specific user groups or task requirements, set roles or audiences
- D. Use chain-of-thought prompts to make the reasoning process more rigorous and reliable
- E. When descriptions are too abstract or cannot be accurately described, clarify them by adding examples

**[Click to view answer]**
</summary>

<div style="margin-top: 10px; padding: 15px; border: 1px solid #dee2e6; border-radius: 0 0 6px 6px;">

✅ **Reference Answer: ABCE**
📝 **Explanation**:
- When using reasoning models, it is advisable to avoid chain-of-thought prompting, as it may reduce the effectiveness of the model's reasoning.

</div>
</details>
