# Chain-of-Thought Prompting

Chain-of-Thought (CoT) prompting enhances complex reasoning by encouraging the model to break down problems into intermediate reasoning steps. When combined with few-shot prompting, it can significantly improve performance on tasks that require multi-step reasoning before arriving at a response.
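As a rough illustration of combining CoT with few-shot prompting, the sketch below formats worked examples (question, reasoning, answer) ahead of a new question. The function name `build_few_shot_cot_prompt`, the `Q:`/`A:` layout, and the sample arithmetic problems are assumptions made for this example, not part of any particular library.

```python
def build_few_shot_cot_prompt(demonstrations, question):
    """Format worked examples (question, reasoning, answer) ahead of a new question."""
    blocks = []
    for demo_question, reasoning, answer in demonstrations:
        # Each demonstration shows the intermediate steps before the final answer.
        blocks.append(f"Q: {demo_question}\nA: {reasoning} The answer is {answer}.")
    # The new question ends with an open "A:" so the model continues with its own reasoning.
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


# Illustrative demonstration (hypothetical data)
demonstrations = [
    (
        "A shop has 12 apples and sells 5. It then receives 8 more. How many apples does it have?",
        "The shop starts with 12 apples. After selling 5 it has 12 - 5 = 7. "
        "Receiving 8 more gives 7 + 8 = 15.",
        "15",
    ),
]

prompt = build_few_shot_cot_prompt(
    demonstrations,
    "A train has 40 seats and 28 are taken. 6 passengers get off. How many seats are free?",
)
print(prompt)
```

The resulting string can be sent as the prompt to any chat or completion endpoint; the demonstration nudges the model to write out its steps before stating an answer.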

## Automatic Chain-of-Thought (Auto-CoT)

Using CoT prompting with demonstrations traditionally means hand-crafting diverse, effective examples. This manual effort is time-consuming and can still produce suboptimal demonstrations. To address this, Zhang et al. (2022) introduced Auto-CoT, which prompts the model with “Let’s think step by step” to generate the reasoning chains for demonstrations automatically. The generated chains can still contain mistakes, so the approach relies on diversity among the demonstrations to limit the effect of any single faulty chain.

Auto-CoT operates in two main stages:

1. **Question Clustering:** Questions from the dataset are partitioned into clusters based on similarity.
2. **Demonstration Sampling:** A representative question is selected from each cluster, and its reasoning chain is generated with Zero-Shot-CoT, using simple heuristics (such as limits on question length and number of reasoning steps) to filter candidates.
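The two stages above can be sketched roughly as follows. This is a minimal, standard-library illustration under stated assumptions: the bag-of-words `embed` function and the small `kmeans` routine are stand-ins for a real sentence encoder and clustering library, `max_words` is an illustrative length heuristic, and `generate` is a hypothetical callable that sends a prompt to a model and returns its text.

```python
import math
from collections import Counter

TRIGGER = "Let's think step by step."


def embed(question, vocab):
    """Bag-of-words count vector (a stand-in for a sentence encoder)."""
    counts = Counter(question.lower().split())
    return [float(counts[word]) for word in vocab]


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(vectors, k, iterations=10):
    """Plain k-means with deterministic farthest-point initialization."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(vectors, key=lambda v: min(distance(v, c) for c in centroids))
        centroids.append(list(farthest))
    assignments = [0] * len(vectors)
    for _ in range(iterations):
        assignments = [
            min(range(k), key=lambda c: distance(v, centroids[c])) for v in vectors
        ]
        for c in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids


def build_demonstrations(questions, k, generate, max_words=60):
    """Stage 1: cluster questions. Stage 2: one Zero-Shot-CoT demo per cluster."""
    vocab = sorted({word for q in questions for word in q.lower().split()})
    vectors = [embed(q, vocab) for q in questions]
    assignments, centroids = kmeans(vectors, k)
    demos = []
    for c in range(k):
        # Rank cluster members by closeness to the cluster centroid.
        members = sorted(
            (i for i, a in enumerate(assignments) if a == c),
            key=lambda i: distance(vectors[i], centroids[c]),
        )
        for i in members:
            if len(questions[i].split()) <= max_words:  # simple length heuristic
                prompt = f"Q: {questions[i]}\nA: {TRIGGER}"
                demos.append(f"{prompt} {generate(prompt)}")
                break
    return demos


def fake_generate(prompt):
    """Hypothetical stub; replace with a real model call."""
    return "(reasoning chain returned by the model)"


questions = [
    "What is 2 plus 3?",
    "What is 4 plus 5?",
    "Which city is the capital of France?",
    "Which city is the capital of Spain?",
]
demos = build_demonstrations(questions, k=2, generate=fake_generate)
print("\n\n".join(demos))
```

Picking one question per cluster is what gives the demonstrations their diversity: a faulty chain from one cluster is less likely to be repeated across the whole prompt.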


## References

* [Wei et al. (2022)](https://arxiv.org/abs/2201.11903)
* [OpenAI Documentation for Prompt Engineering](https://platform.openai.com/docs/guides/prompt-engineering)

## Running this code on MyBinder.org

Note: before running, **adjust CONFIG** with the **proper URL and API_KEY** for your model server.

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/GenILab-FAU/prompt-eng/HEAD?urlpath=%2Fdoc%2Ftree%2Fprompt-eng%2Fchain_of_thought.ipynb)


In [25]:
############################################################
## CHAIN-OF-THOUGHT PROMPTING: PROJECT OVERVIEW + DATA & METHODOLOGIES
############################################################

import os
import csv
import time
from datetime import datetime
from _pipeline import create_payload, model_req

# 1) Define a more detailed Chain-of-Thought prompt
COT_PROMPT = """
You are an AI that generates a comprehensive project report using chain-of-thought reasoning.
First, reason internally about the best way to describe:
1) Project Overview (Title, Goal, Problem Statement, Key Objectives, Scope)
2) Data & Methodologies (Data Sources, Data Processing, Methodology, Approach)

Then, provide a detailed final answer in Markdown:
- Use multiple paragraphs and headings.
- Expand on each point with descriptive explanations.
- Use subheadings if necessary to organize information.

[START OF INSTRUCTIONS]
Project Details:
- Title: "Advanced Image Recognition"
- Goal: "Achieve 95% accuracy on a custom image dataset"
- Problem Statement: "Manual image labeling is slow and error-prone"
- Key Objectives:
  1) Develop a robust CNN model
  2) Implement automated data augmentation
- Scope: "Focus on large-scale image datasets"

Data & Methodologies:
- Data Sources: "Proprietary image dataset + open-source augmentation libraries"
- Data Processing: "Image normalization, labeling, and augmentation pipeline"
- Methodology: "Convolutional Neural Networks, transfer learning"
- Approach: "Iterative training with real-time validation"

Instructions:
1) Think step-by-step internally about how to structure and expand these sections.
2) Present only the final output in Markdown, using multiple paragraphs, bullet points, or subheadings as needed.
3) Aim for a thorough, descriptive style.
[END OF INSTRUCTIONS]
"""

# 2) Create the request payload
payload_cot = create_payload(
    target="open-webui",
    model="Llama-3.2-3B-Instruct",
    prompt=COT_PROMPT,
    temperature=0.8,                # Slightly higher to encourage more detailed text
    num_ctx=300,                    # Context window in tokens (assuming Ollama-style semantics); likely too small for this prompt plus the reply, so raise it if output is cut off
    num_predict=400                 # Upper bound on generated tokens
)

def request_with_retry(payload, max_retries=2, delay=3):
    """
    Calls model_req up to `max_retries` times, waiting `delay` seconds
    between tries when the error message looks like a gateway error
    (502 Bad Gateway or 504 Gateway Timeout). Other errors are re-raised.
    """
    gateway_markers = ("502", "Bad Gateway", "504", "Gateway Timeout")
    for attempt in range(1, max_retries + 1):
        try:
            # model_req returns a (time_taken, output) tuple
            return model_req(payload=payload)
        except Exception as e:
            if not any(marker in str(e) for marker in gateway_markers):
                # Different error; re-raise with the original traceback
                raise
            print(f"Gateway error on attempt {attempt}/{max_retries}: {e}")
            if attempt < max_retries:
                print(f"Retrying in {delay} seconds...")
                time.sleep(delay)
    raise RuntimeError("Max retries reached. Gateway error persists.")

# 3) Execute the chain-of-thought request with retries
time_cot, cot_output = request_with_retry(payload_cot)

# 4) Print the final Markdown output
print("===== CHAIN-OF-THOUGHT OUTPUT (PROJECT OVERVIEW + DATA & METHODOLOGIES) =====")
print(cot_output)
if time_cot:
    print(f"\nTime taken: {time_cot}s")

# 5) (Optional) Log the result for future reference
os.makedirs("data", exist_ok=True)
log_entry = [
    datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
    "chain_of_thought_data_methods",
    "Llama-3.2-3B-Instruct",
    0.8,
    time_cot,
    cot_output.replace("\n", "\\n")
]

with open("data/chain_of_thought_data_methods_logs.csv", "a", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(log_entry)


Payload: {'model': 'Llama-3.2-3B-Instruct', 'messages': [{'role': 'user', 'content': '\nYou are an AI that generates a comprehensive project report using chain-of-thought reasoning.\nFirst, reason internally about the best way to describe:\n1) Project Overview (Title, Goal, Problem Statement, Key Objectives, Scope)\n2) Data & Methodologies (Data Sources, Data Processing, Methodology, Approach)\n\nThen, provide a detailed final answer in Markdown:\n- Use multiple paragraphs and headings.\n- Expand on each point with descriptive explanations.\n- Use subheadings if necessary to organize information.\n\n[START OF INSTRUCTIONS]\nProject Details:\n- Title: "Advanced Image Recognition"\n- Goal: "Achieve 95% accuracy on a custom image dataset"\n- Problem Statement: "Manual image labeling is slow and error-prone"\n- Key Objectives:\n  1) Develop a robust CNN model\n  2) Implement automated data augmentation\n- Scope: "Focus on large-scale image datasets"\n\nData & Methodologies:\n- Data Sources