# Chain-of-Thought Prompting

Chain-of-Thought (CoT) prompting enhances complex reasoning by encouraging the model to break down problems into intermediate reasoning steps. When combined with few-shot prompting, it can significantly improve performance on tasks that require multi-step reasoning before arriving at a response.
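As a rough illustration of how few-shot CoT demonstrations can be assembled into a prompt, here is a minimal sketch. The function name `build_cot_prompt`, the `Q:`/`A:` layout, and the sample arithmetic questions are assumptions made for this example; they are not part of the notebook's `_pipeline` module.

```python
def build_cot_prompt(demonstrations, question):
    """Format (question, reasoning, answer) triples, then append the new question.

    Each demonstration shows its intermediate reasoning before the answer,
    which is what nudges the model to reason step by step as well.
    """
    parts = []
    for demo_question, reasoning, answer in demonstrations:
        parts.append(f"Q: {demo_question}\nA: {reasoning} The answer is {answer}.")
    # Leave the final answer open so the model continues with its own reasoning.
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


# Hypothetical demonstration written for this sketch.
demonstrations = [
    (
        "A shelf holds 3 boxes with 4 books each. 5 books are removed. How many books remain?",
        "3 boxes of 4 books is 12 books. 12 - 5 = 7.",
        "7",
    ),
]

prompt = build_cot_prompt(
    demonstrations,
    "A bus has 10 passengers. 4 get off and 6 get on. How many passengers are on the bus?",
)
print(prompt)
```

The resulting string could then be passed as the `prompt` argument wherever a model request is built.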

## Automatic Chain-of-Thought (Auto-CoT)

Using CoT prompting with demonstrations traditionally means hand-crafting diverse, effective examples, which is time-consuming and can still yield suboptimal results. To address this, Zhang et al. (2022) introduced Auto-CoT, an automated approach that minimizes manual effort. It prompts the model with “Let’s think step by step” to generate a reasoning chain for each demonstration. Because these generated chains can contain mistakes, the approach relies on diverse demonstrations to limit the effect of any single faulty chain.

Auto-CoT operates in two main stages:

1. **Question Clustering:** Questions from the dataset are grouped into clusters based on similarity or relevance.
2. **Demonstration Sampling:** A representative question from each cluster is selected, and its reasoning chain is generated using Zero-Shot-CoT guided by simple heuristics.
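The two stages above can be sketched in plain Python. This is a toy version under several assumptions: word-count vectors stand in for learned sentence embeddings, a tiny k-means with deterministic initialization does the clustering, `generate_rationale` is a hypothetical callable that would wrap a real model call, and the `max_words` and `max_steps` limits are illustrative values for the "simple heuristics", not figures taken from this notebook.

```python
import math
from collections import Counter

ZERO_SHOT_COT_TRIGGER = "Let's think step by step."


def tokens(question):
    return question.lower().replace("?", "").split()


def embed(question, vocab):
    """Word-count vector; a stand-in for a learned sentence embedding."""
    counts = Counter(tokens(question))
    return [float(counts[word]) for word in vocab]


def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(vectors, k, iterations=10):
    """Tiny k-means; the first k vectors serve as initial centroids."""
    centroids = [list(v) for v in vectors[:k]]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        assignment = [
            min(range(k), key=lambda c: distance(v, centroids[c])) for v in vectors
        ]
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


def build_demonstrations(questions, k, generate_rationale, max_words=60, max_steps=5):
    # Stage 1: question clustering
    vocab = sorted({w for q in questions for w in tokens(q)})
    vectors = [embed(q, vocab) for q in questions]
    assignment, centroids = kmeans(vectors, k)

    demos = []
    for c in range(k):
        # Stage 2: try the cluster members closest to the centroid first
        members = sorted(
            (i for i, a in enumerate(assignment) if a == c),
            key=lambda i: distance(vectors[i], centroids[c]),
        )
        for i in members:
            rationale = generate_rationale(f"Q: {questions[i]}\nA: {ZERO_SHOT_COT_TRIGGER}")
            steps = [line for line in rationale.splitlines() if line.strip()]
            # Illustrative heuristics: skip long questions and long rationales
            if len(tokens(questions[i])) <= max_words and len(steps) <= max_steps:
                demos.append((questions[i], rationale))
                break
    return demos


def fake_generate(prompt):
    """Placeholder for a real Zero-Shot-CoT model call."""
    return "Step 1: restate the problem.\nStep 2: compute the result."


questions = [
    "How many apples are left after eating two apples?",
    "What is the speed of a train covering ten miles in one hour?",
    "How many apples are in three baskets of four apples?",
    "What is the speed of a car covering fifty miles in two hours?",
]
demos = build_demonstrations(questions, k=2, generate_rationale=fake_generate)
for question, rationale in demos:
    print(question, "->", rationale.splitlines()[0])
```

Picking one demonstration per cluster is what provides the diversity: if the generated chain for one kind of question is faulty, the demonstrations drawn from the other clusters are less likely to share the same mistake.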


## References

* [Wei et al. (2022)](https://arxiv.org/abs/2201.11903)
* [OpenAI Documentation for Prompt Engineering](https://platform.openai.com/docs/guides/prompt-engineering)

## Running this code on MyBinder.org

Note: before running, you will need to **adjust CONFIG** with the **proper URL and API_KEY**.

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/GenILab-FAU/prompt-eng/HEAD?urlpath=%2Fdoc%2Ftree%2Fprompt-eng%2Fchain_of_thought.ipynb)


In [4]:
############################################################
## CHAIN-OF-THOUGHT PROMPTING: PROJECT OVERVIEW + DATA & METHODOLOGIES
############################################################

import os
import csv
import time
from datetime import datetime
from _pipeline import create_payload, model_req

# 1) Read the last response from the few-shot output CSV file
with open("data/few_shot_logs.csv", "r", encoding="utf-8") as f:
    reader = csv.reader(f)
    few_shot_output = list(reader)[-1][5]  # Assuming the response is in the sixth column

# 2) Define a more detailed Chain-of-Thought prompt
COT_PROMPT = f"""
# Instructions for Advanced Hybrid Chain-of-Thought Report Generation

You are a top-tier technical report architect and advanced prompt engineer. Your task is to generate a complete, highly professional project report in Markdown that integrates all essential sections—Project Overview, Data & Methodologies, Results & Evaluation, Discussion & Conclusion, and Appendix—while meeting the highest technical and formatting standards. For this task, you must employ an internal chain-of-thought process to deeply analyze and synthesize input data, ensuring every section is detailed, logically structured, and technically rigorous. Do not include any internal chain-of-thought details or reasoning in the final output.

**Internal Chain-of-Thought Process (Do Not Output):**
{few_shot_output}
- Review the last response from the Few-Shot prompt to understand the project context and requirements.
- Analyze the key details and requirements for each section of the final report.
- Identify all data sources mentioned and evaluate their relevance.
- Detail every data acquisition technique (e.g., API integration, web scraping, database queries) and justify each choice with references to tools/libraries.
- List and rationalize all data preprocessing steps (handling missing values, normalization, feature engineering).
- Specify the analytical techniques (statistical methods or ML models) with justifications regarding key parameters and suitability for the project goals.
- Decide on code examples that clearly illustrate these processes.
- Outline a coherent structure for the final report.

1. **Input Synthesis & Section Structuring:**  
   - Review all raw inputs and sample outputs from previous techniques (Zero-Shot, Few-Shot, Chain-of-Thought, Self-Consistency).  
   - Identify key details and requirements for each section.  
   - Outline a coherent structure for the final report.

2. **Deep Technical Analysis for Data & Methodologies:**  
   - Identify all data sources mentioned and evaluate their relevance.  
   - Detail every data acquisition technique (e.g., API integration, web scraping, database queries) and justify each choice with references to tools/libraries.  
   - List and rationalize all data preprocessing steps (handling missing values, normalization, feature engineering).  
   - Specify the analytical techniques (statistical methods or ML models) with justifications regarding key parameters and suitability for the project goals.  
   - Decide on code examples that clearly illustrate these processes.

3. **Section-by-Section Report Generation:**  
   - For **Project Overview:** Generate a clear and concise summary including title, problem statement, goal, objectives, and scope.  
   - For **Data & Methodologies:** Organize the content into sub-sections (Data Sources, Data Collection Methods, Data Preprocessing, Analytical Techniques) and integrate well-formatted code blocks.  
   - For **Results & Evaluation:** Summarize key findings, detail evaluation metrics, and include code/visualization examples.  
   - For **Discussion & Conclusion:** Provide thoughtful analysis, discuss challenges and future directions, and summarize overall findings.  
   - For **Appendix:** Include additional code, data descriptions, and supplementary materials.

4. **Formatting and Integration:**  
   - Ensure professional Markdown formatting with clear headings, subheadings, bullet points, and code blocks.  
   - Guarantee a smooth narrative flow and logical transitions between sections.

5. **Final Synthesis:**  
   - Combine all sections into one cohesive report.  
   - Verify that the output is comprehensive, technically detailed, and maintains high professional standards.

**Final Report Requirements (Output Must Include):**

1. **Project Overview:**
   - **Project Title:** Clearly state the title.
   - **Problem Statement:** Concisely describe the core problem.
   - **Goal:** Summarize the primary aim and expected outcomes.
   - **Key Objectives:** List the major objectives.
   - **Scope:** Define the project boundaries and focus areas.

2. **Data & Methodologies:**
   - **Overview of Data Sources:**  
     - Identify and describe all data sources.
     - Explain the relevance and importance of each source.
   - **Data Collection Methods:**  
     - Detail the techniques (e.g., API integration, web scraping, database queries).
     - Justify the selection of each method with references to specific tools or libraries.
   - **Data Preprocessing:**  
     - Enumerate and explain all cleaning and transformation steps (e.g., handling missing values, normalization, feature engineering).
     - Describe the rationale behind each step.
   - **Analytical Techniques:**  
     - Provide a comprehensive explanation of the statistical methods or machine learning models used.
     - Justify the choice of techniques, discussing key parameters and alignment with project goals.
   - **Code Block Illustrations:**  
     - Include clearly formatted code blocks or pseudocode examples to illustrate data loading, preprocessing, and analytical processes.

3. **Results & Evaluation:**
   - **Results Summary:** Present a clear, concise summary of key findings.
   - **Evaluation Metrics:**  
     - Detail relevant metrics (e.g., accuracy, precision, recall, F1-score).
     - Justify the significance of each metric.
   - **Visual Aids & Code Examples:**  
     - Embed placeholders for figures, charts, or tables.
     - Provide code examples where applicable to demonstrate visualization or evaluation techniques.

4. **Discussion & Conclusion:**
   - **Discussion:** Analyze the implications of the results, discuss challenges, and highlight potential improvements.
   - **Conclusion:** Summarize overall findings, project impact, and suggest future directions.

5. **Appendix / Additional Resources:**
   - **Supporting Code and Data:**  
     - Include supplementary code snippets, data descriptions, and additional documentation as needed.

**Additional Requirements:**
- Use professional, consistent Markdown formatting with appropriate headings, bullet points, and code blocks.
- Ensure seamless narrative flow and logical transitions between all sections.
- Emphasize a deep technical focus, particularly in the "Data & Methodologies" section.
- Do not reveal any internal chain-of-thought or reasoning in the final output.

Generate your complete, polished final project report in Markdown as your output.
"""

# 3) Create a payload with moderate token usage
payload_cot = create_payload(
    target="open-webui",
    model="Llama-3.2-3B-Instruct",  # Updated model
    prompt=COT_PROMPT,
    temperature=0.8,                # Slightly higher to encourage more detailed text
    # NOTE: 300 tokens is smaller than the prompt itself; raise num_ctx if the backend honors it.
    num_ctx=300,                    # Context window size in tokens
    num_predict=400                 # Enough tokens for detailed paragraphs
)

def request_with_retry(payload, max_retries=2, delay=3):
    """
    Calls model_req up to `max_retries` times, waiting `delay` seconds
    between attempts when a gateway error (502 or 504) occurs.
    """
    for attempt in range(1, max_retries + 1):
        try:
            time_cot, cot_output = model_req(payload=payload)
            return time_cot, cot_output
        except Exception as e:
            error_str = str(e)
            # 502 is "Bad Gateway"; 504 is "Gateway Timeout"
            is_gateway_error = any(
                marker in error_str
                for marker in ("502", "504", "Bad Gateway", "Gateway Timeout")
            )
            if not is_gateway_error:
                # Different error; re-raise with the original traceback
                raise
            if attempt < max_retries:
                # Only wait when another attempt will follow
                print(f"Gateway error on attempt {attempt}. Retrying in {delay} seconds...")
                time.sleep(delay)
    raise RuntimeError("Max retries reached. Gateway error persists.")

# 4) Execute the chain-of-thought request with retries
time_cot, cot_output = request_with_retry(payload_cot)

# 5) Print the final Markdown output
print("===== CHAIN-OF-THOUGHT OUTPUT (PROJECT OVERVIEW + DATA & METHODOLOGIES) =====")
print(cot_output)
if time_cot:
    print(f"\nTime taken: {time_cot}s")

# 6) Save the chain-of-thought response to a text file and append a log entry to a CSV file
with open("data/chain_of_thought_output.txt", "w", encoding="utf-8") as f:
    f.write(cot_output)

log_entry = [
    datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
    "chain_of_thought",
    "Llama-3.2-3B-Instruct",
    0.8,
    time_cot,
    cot_output.replace("\n", "\\n")
]

with open("data/chain_of_thought_data_methods_logs.csv", "a", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(log_entry)


Payload: {'model': 'Llama-3.2-3B-Instruct', 'messages': [{'role': 'user', 'content': '\n# Instructions for Advanced Hybrid Chain-of-Thought Report Generation\n\nYou are a top-tier technical report architect and advanced prompt engineer. Your task is to generate a complete, highly professional project report in Markdown that integrates all essential sections—Project Overview, Data & Methodologies, Results & Evaluation, Discussion & Conclusion, and Appendix—while meeting the highest technical and formatting standards. For this task, you must employ an internal chain-of-thought process to deeply analyze and synthesize input data, ensuring every section is detailed, logically structured, and technically rigorous. Do not include any internal chain-of-thought details or reasoning in the final output.\n\n**Internal Chain-of-Thought Process (Do Not Output):**\n# Health Tracking Chatbot: A Project Report\\n\\n\\n## 1. Introduction\\n\\n**1.1 Project Title:** Health Tracking Chatbot\\n\\n**1.2 P