# Few-Shot Prompting

Few-shot prompting is a technique that enables in-context learning: we include demonstrations in the prompt to steer the model toward better performance. The demonstrations condition the model for the final input, for which we want it to generate a response.
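As a minimal sketch of the idea, the snippet below assembles a few-shot prompt from (input, output) demonstrations and leaves the last output empty for the model to complete. The helper name `build_few_shot_prompt`, the `INPUT:`/`OUTPUT:` layout, the `---` separator, and the sentiment examples are illustrative assumptions, not part of `_pipeline`.

```python
from typing import List, Tuple


def build_few_shot_prompt(demos: List[Tuple[str, str]], query: str) -> str:
    """Format (input, output) demonstrations, then append the new input.

    Hypothetical helper for illustration only.
    """
    blocks = [f"INPUT:\n{inp}\n\nOUTPUT:\n{out}" for inp, out in demos]
    # The final block leaves OUTPUT empty so the model fills it in.
    blocks.append(f"INPUT:\n{query}\n\nOUTPUT:")
    return "\n\n---\n\n".join(blocks)


# Illustrative demonstrations (assumed, not taken from this notebook).
demos = [
    ("This is awesome!", "Positive"),
    ("This is bad!", "Negative"),
]
prompt = build_few_shot_prompt(demos, "What a horrible show!")
print(prompt)
```

The resulting string can be passed as the `prompt` argument wherever a single-string prompt is expected.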

## References:
* [Touvron et al. 2023](https://arxiv.org/pdf/2302.13971.pdf): few-shot properties appear when models are scaled to a sufficient size
* [Kaplan et al., 2020](https://arxiv.org/abs/2001.08361)
* [Brown et al. 2020](https://arxiv.org/abs/2005.14165)


## Running this code on MyBinder.org

Note: remember that you will need to **adjust CONFIG** with **proper URL and API_KEY**!

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/GenILab-FAU/prompt-eng/HEAD?urlpath=%2Fdoc%2Ftree%2Fprompt-eng%2Ffew_shots.ipynb)



############################################################
## FEW-SHOT PROMPTING: PROJECT OVERVIEW
############################################################

import os
import csv
from datetime import datetime
from _pipeline import create_payload, model_req

# 1) Prepare enhanced few-shot examples
FEW_SHOT_EXAMPLES = """
Example 1:
INPUT:
"You are an AI that generates project reports. 
Please create the 'Project Overview' section covering:
- Title
- Goal
- Problem Statement
- Key Objectives
- Scope
Format it in clear paragraphs, and use Markdown headings."

OUTPUT:
"# Project Overview
## Title
AI-Driven Data Analysis
## Goal
To automate data processing and generate insights by leveraging advanced machine learning algorithms and data visualization techniques.
## Problem Statement
Many organizations struggle with processing large volumes of data efficiently, leading to missed opportunities for actionable insights.
## Key Objectives
1. Develop a scalable data processing pipeline.
2. Implement machine learning models for predictive analytics.
3. Create interactive dashboards for data visualization.
## Scope
This project focuses on data from e-commerce platforms, including sales, customer behavior, and inventory management."

-------------------------------------------------------

Example 2:
INPUT:
"You are an AI that generates project reports.
Please create the 'Project Overview' section covering:
- Title
- Goal
- Problem Statement
- Key Objectives
- Scope
Format it in clear paragraphs, and use Markdown headings."

OUTPUT:
"# Project Overview
## Title
Natural Language Processing Toolkit
## Goal
To provide an end-to-end solution for text analytics, enabling users to extract meaningful insights from unstructured text data.
## Problem Statement
Text data is abundant but difficult to process and analyze due to its unstructured nature, posing challenges for organizations seeking to leverage this data for decision-making.
## Key Objectives
1. Develop text preprocessing and cleaning modules.
2. Implement sentiment analysis and topic modeling algorithms.
3. Create a user-friendly interface for data exploration and visualization.
## Scope
We will address the text ingestion pipeline, focusing on social media data, customer reviews, and support tickets."
"""

# 2) Read the zero-shot output from the file
with open("data/zero_shot_output.txt", "r", encoding="utf-8") as f:
    zero_shot_output = f.read()

# 3) Define the professional refinement prompt
PROFESSIONAL_REFINEMENT_PROMPT = f"""

You are an advanced report generation system tasked with transforming a preliminary report skeleton (generated through a Zero-Shot approach) into a fully polished, structured, and professional project report in Markdown format. Your objective is to enhance clarity, coherence, and visual appeal while ensuring completeness and adherence to high standards of technical documentation.

## Refinement Guidelines
When refining, ensure the following key aspects are met:
1. **Comprehensive Section Inclusion:** Validate the presence of essential sections, including:
   - Project Overview
   - Methodology
   - Results & Evaluation
   - Discussion & Conclusion
   - Appendices
2. **Enhanced Readability & Structure:** Improve content flow, incorporate appropriate subheadings, and use bullet points or tables when applicable.
3. **Refined Technical Presentation:** Utilize professional language, improve technical explanations, and ensure consistency in formatting.
4. **Code Block Optimization:** Properly structure code snippets for clarity and functionality, ensuring correctness and readability.

## Sample Input (Zero-Shot Output):

### Project Overview
- **Project Title:** Automated Social Media Analyzer
- **Problem Statement:** The project aims to automate the analysis of social media trends.
- **Goal:** To provide insights into trending topics and sentiment analysis.
- **Key Objectives:**
  - Extract data from various social media platforms.
  - Perform real-time sentiment analysis.
- **Scope:** The analysis is limited to public posts and trending hashtags.

### Methodology
- **Data & Methods:** Outline data collection methods and analytical techniques.
- **Notebook Code Blocks:**
```python
# Data loading and preprocessing code goes here
```

### Results & Evaluation
- **Results Overview:** Summary of analytical findings.
- **Evaluation Metrics:** Accuracy, precision, recall.

### Discussion & Conclusion
- **Discussion:** Insights on the analysis and potential impact.
- **Conclusion:** Summary of the project outcomes.

### Appendix/Additional Resources
- **Supporting Code and Data:**
```python
# Additional visualization code
```

---

## Sample Output (Refined Report):

# **Automated Social Media Analyzer: Comprehensive Report**

## 1. **Project Overview**
### 1.1 **Project Title**
**Automated Social Media Analyzer**

### 1.2 **Problem Statement**
This project focuses on automating the extraction and analysis of social media data to identify trends and gauge public sentiment. The primary challenge is dealing with unstructured and voluminous data from various platforms.

### 1.3 **Goal**
Develop a robust system to extract, process, and analyze social media data in real time, providing actionable insights into trending topics and overall sentiment.

### 1.4 **Key Objectives**
- **Data Extraction:** Gather data from multiple social media APIs.
- **Data Processing:** Clean and preprocess large datasets.
- **Sentiment Analysis:** Apply machine learning models to evaluate sentiment.
- **Visualization:** Create dynamic visualizations for trend analysis.

### 1.5 **Scope**
The analysis focuses exclusively on public posts and trending hashtags, providing a snapshot of real-time social dynamics without delving into private data.

## 2. **Methodology**
### 2.1 **Data Collection & Preprocessing**
Outline the processes for API integration, data scraping, and preprocessing routines.
```python
# Example: Load and preprocess social media data
import pandas as pd
data = pd.read_csv('social_media_data.csv')
# Data cleaning steps here...
```

### 2.2 **Analytical Techniques**
Describe the statistical or machine learning methods used for sentiment analysis and trend detection.

## 3. **Results & Evaluation**
### 3.1 **Results Overview**
Summarize key findings from the data analysis, including visual charts or graphs if applicable.

### 3.2 **Evaluation Metrics**
Detail the metrics used to assess the accuracy and effectiveness of the analysis (e.g., accuracy, precision, recall).

## 4. **Discussion & Conclusion**
### 4.1 **Discussion**
Discuss the implications of the results, challenges faced during the analysis, and recommendations for future improvements.

### 4.2 **Conclusion**
Provide a concise summary of the project outcomes and insights gained from the analysis.

## 5. **Additional Sections (Optional)**
### 5.1 **Future Work**
Outline potential enhancements or additional analyses for future iterations of the project.

### 5.2 **Appendix & Supporting Materials**
Include any supplementary code, data sources, or further reading materials.
```python
# Additional analysis or visualization code
```

---

## Task:

Using the sample inputs/outputs above as a reference, refine the provided Zero-Shot output into a complete, visually appealing, and detailed project report in Markdown. Ensure that:
- All key sections are clearly defined.
- The extracted project title and problem statement are prominently highlighted.
- Code blocks are used appropriately for notebook sections.
- The report flows logically from introduction to conclusion, with optional sections added if they enhance the report.

Generate the refined report as your final output.

Zero-Shot Output:
{zero_shot_output}
"""

# 4) Combine the examples + professional refinement prompt into one prompt
FEW_SHOT_PROMPT = f"{FEW_SHOT_EXAMPLES}\n\nNow, follow the format shown in the examples:\n\n{PROFESSIONAL_REFINEMENT_PROMPT}"

# 5) Create the payload for your model (adjust model name/params as needed)
payload = create_payload(
    target="open-webui",
    model="qwen2",          # Example model; adjust as desired
    prompt=FEW_SHOT_PROMPT,
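    temperature=1.0,        # Fairly high; lower it for more deterministic reports
    num_ctx=8192,           # Must fit the examples, the refinement prompt, and the zero-shot output (200 would truncate the prompt); adjust to your model's limit
    num_predict=2048        # Enough output tokens for a full refined report; adjust as needed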
)

# 6) Make the request
time_taken, few_shot_response = model_req(payload=payload)

# 7) Display the output and timing
print("===== FEW-SHOT OUTPUT =====")
print(few_shot_response)
if time_taken:
    print(f"\nTime taken: {time_taken}s")

# Save the few-shot response to a text file for use in the chain-of-thought notebook, then append a log entry to a CSV file
with open("data/few_shot_output.txt", "w", encoding="utf-8") as f:
    f.write(few_shot_response)

log_entry = [
    datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
    "few_shot",
    "qwen2",  # model name
    1.0,
    time_taken,
    few_shot_response.replace("\n", "\\n")  # escape newlines for CSV
]

with open("data/few_shot_logs.csv", "a", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(log_entry)


Payload: {'model': 'qwen2', 'messages': [{'role': 'user', 'content': '\nExample 1:\nINPUT:\n"You are an AI that generates project reports. \nPlease create the \'Project Overview\' section covering:\n- Title\n- Goal\n- Problem Statement\n- Key Objectives\n- Scope\nFormat it in clear paragraphs, and use Markdown headings."\n\nOUTPUT:\n"# Project Overview\n## Title\nAI-Driven Data Analysis\n## Goal\nTo automate data processing and generate insights by leveraging advanced machine learning algorithms and data visualization techniques.\n## Problem Statement\nMany organizations struggle with processing large volumes of data efficiently, leading to missed opportunities for actionable insights.\n## Key Objectives\n1. Develop a scalable data processing pipeline.\n2. Implement machine learning models for predictive analytics.\n3. Create interactive dashboards for data visualization.\n## Scope\nThis project focuses on data from e-commerce platforms, including sales, customer behavior, and inventor

## How to improve it?

Based on the findings of [Min et al. (2022)](https://arxiv.org/abs/2202.12837), here are a few more tips on choosing demonstrations (exemplars) for few-shot prompting:

* "the label space and the distribution of the input text specified by the demonstrations are both important (regardless of whether the labels are correct for individual inputs)"
* The format you use also plays a key role in performance: even demonstrations with random labels work much better than no labels at all.
* Additional results show that sampling random labels from the true distribution of labels (instead of a uniform distribution) also helps.
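The tips above can be explored with a small experiment. The sketch below builds three variants of the same demonstration set: gold labels, labels sampled uniformly from the label space, and labels sampled from the empirical (true) label distribution. The helpers `resample_labels` and `format_demos`, the `Text:`/`Label:` layout, and the example reviews are hypothetical and chosen for illustration; this is not the setup used by Min et al.

```python
import random
from collections import Counter
from typing import List, Tuple

Demo = Tuple[str, str]  # (input text, label)


def resample_labels(demos: List[Demo], mode: str = "true_dist", seed: int = 0) -> List[Demo]:
    """Return demos with the same inputs and labels chosen according to `mode`."""
    rng = random.Random(seed)
    labels = [label for _, label in demos]
    label_space = sorted(set(labels))
    if mode == "gold":
        new_labels = list(labels)
    elif mode == "uniform":
        # Every label in the label space is equally likely.
        new_labels = [rng.choice(label_space) for _ in demos]
    elif mode == "true_dist":
        # Labels are drawn in proportion to how often they occur in the demos.
        counts = Counter(labels)
        weights = [counts[label] for label in label_space]
        new_labels = rng.choices(label_space, weights=weights, k=len(demos))
    else:
        raise ValueError(f"unknown mode: {mode}")
    return [(text, label) for (text, _), label in zip(demos, new_labels)]


def format_demos(demos: List[Demo], query: str) -> str:
    """Keep one consistent input/label format across all demonstrations."""
    blocks = [f"Text: {text}\nLabel: {label}" for text, label in demos]
    blocks.append(f"Text: {query}\nLabel:")
    return "\n\n".join(blocks)


# Illustrative data (assumed): three positive reviews and one negative.
gold = [
    ("Great battery life.", "positive"),
    ("Love the screen.", "positive"),
    ("Fast shipping.", "positive"),
    ("Stopped working in a week.", "negative"),
]
for mode in ("gold", "uniform", "true_dist"):
    print(f"--- {mode} ---")
    print(format_demos(resample_labels(gold, mode), "The case feels cheap."))
```

Sending each variant to the same model and comparing the answers gives a rough sense of how much the label space and format matter relative to label correctness.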