# o1 Style Thinking

In this recipe, we will show how to achieve Chain-of-Thought Reasoning.
This technique prompts an LLM to break a task down into multiple steps, allowing it to solve complex tasks in logical steps and generate a coherent output.

<div class="admonition tip">
<p class="admonition-title">Mirascope Concepts Used</p>
<ul>
<li><a href="../../../learn/prompts/">Prompts</a></li>
<li><a href="../../../learn/calls/">Calls</a></li>
<li><a href="../../../learn/response_models/">Response Models</a></li>
</ul>
</div>

<div class="admonition note">
<p class="admonition-title">Background</p>
<p>
    Large Language Models (LLMs) are known to generate text that is coherent and fluent. However, they often struggle with tasks that require multi-step reasoning or logical thinking. In this recipe, we will show how to use Mirascope to guide the LLM to break down the task into multiple steps and generate a coherent output.

</p>
</div>


## Setup

To set up our environment, first let's install all of the packages we will use:

In [None]:
!pip install "mirascope[groq]"

In [None]:
# Set the appropriate API key for the provider you're using
# Here we are using GROQ_API_KEY
import os

os.environ["GROQ_API_KEY"] = "Your API Key"

In [None]:
from mirascope.core import groq
from datetime import datetime

## Without Chain-of-Thought Reasoning

We will begin by showing how a typical LLM performs on a task that requires multi-step reasoning. In this example, we will ask the model to count the number of `s`s in the word `Mississssippi` (yes, it has 6 `s`s). We will use the `llama-3.1-8b-instant` model for this example.

In [1]:
history: list[dict] = []


@groq.call("llama-3.1-8b-instant")
def generate_answer(question: str) -> str:
    return f"Generate an answer to this question: {question}"


def run():
    question = "how many s's in the word mississssippi"
    response = generate_answer(question)
    print(f"(User): {question}")
    print(f"(Assistant): {response}")
    history.append({"role": "user", "content": question})
    # Store the response text rather than the response object
    history.append({"role": "assistant", "content": response.content})


run()

(User): how many s's in the word mississssippi
(Assistant): The word "mississippi" contains 4 's' characters.


In this example, the model answers zero-shot: it receives only the input prompt, with no additional information or context to help it generate the output.

This approach is not very effective when the task requires logical reasoning.
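To judge the model's answers, it helps to have a ground truth to compare against. The snippet below is a minimal sketch in plain Python; the helper name `count_letter` is hypothetical and is not part of Mirascope.

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())


# The word used in the question throughout this recipe
question_word = "mississssippi"
expected = count_letter(question_word, "s")
print(f"{question_word!r} contains {expected} occurrences of 's'")
```

Comparing this value with the model's reply makes it easy to see whether a given prompting strategy actually improved the answer.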

Now let's see how the model performs on this task when it can reason using Chain-of-Thought Reasoning.

## With Chain-of-Thought Reasoning

In [5]:
from mirascope.core import groq

history: list[dict] = []


@groq.call("llama-3.1-8b-instant")
def identify_task(question: str) -> str:
    return f"Identify the task in this question. Wrap your thoughts in <think></think> tags: {question}"


@groq.call("llama-3.1-8b-instant")
def break_down_task(task: str) -> str:
    return f"Break down this task into steps. Wrap your thoughts in <think></think> tags: {task}"


@groq.call("llama-3.1-8b-instant")
def execute_task(steps: str) -> str:
    return f"Execute these steps to solve the task. Wrap your thoughts in <think></think> tags: {steps}"


@groq.call("llama-3.1-8b-instant")
def final_answer(execution: str) -> str:
    return f"Based on this execution, provide a final answer. Do not use <think></think> tags for the final answer: {execution}"


def generate_cot_response(user_query):
    steps = []
    total_thinking_time = 0.0

    # Step 1: Identify the task
    start_time = datetime.now()
    identify_result = identify_task(user_query)
    end_time = datetime.now()
    thinking_time = (end_time - start_time).total_seconds()
    steps.append(("Step 1: Identify the task", identify_result.content, thinking_time))
    total_thinking_time += thinking_time

    # Step 2: Break down the task
    start_time = datetime.now()
    breakdown_result = break_down_task(identify_result.content)
    end_time = datetime.now()
    thinking_time = (end_time - start_time).total_seconds()
    steps.append(
        ("Step 2: Break down the task", breakdown_result.content, thinking_time)
    )
    total_thinking_time += thinking_time

    # Step 3: Execute the task
    start_time = datetime.now()
    execute_result = execute_task(breakdown_result.content)
    end_time = datetime.now()
    thinking_time = (end_time - start_time).total_seconds()
    steps.append(("Step 3: Execute the task", execute_result.content, thinking_time))
    total_thinking_time += thinking_time

    # Final answer
    start_time = datetime.now()
    final_result = final_answer(execute_result.content)
    end_time = datetime.now()
    thinking_time = (end_time - start_time).total_seconds()
    steps.append(("Final Answer", final_result.content, thinking_time))
    total_thinking_time += thinking_time

    return steps, total_thinking_time


def display_cot_response(steps, total_thinking_time):
    for title, content, thinking_time in steps:
        print(f"{title}:")
        if "<think>" in content and "</think>" in content:
            think_parts = content.split("<think>")
            for part in think_parts[1:]:  # Skip the text before the first <think> tag
                thought, remaining = part.split("</think>", 1)
                print(f"<think>{thought.strip()}</think>")
                if remaining.strip():
                    print(remaining.strip())
        else:
            print(content.strip())
        print(f"**Thinking time: {thinking_time:.2f} seconds**\n")

    print(f"**Total thinking time: {total_thinking_time:.2f} seconds**")


def run():
    question = "how many s's in the word mississssippi"
    print("(User):", question)
    # Generate COT response
    steps, total_thinking_time = generate_cot_response(question)
    display_cot_response(steps, total_thinking_time)

    # Add the interaction to the history
    history.append({"role": "user", "content": question})
    history.append(
        {"role": "assistant", "content": steps[-1][1]}
    )  # Add only the final answer to the history


# Run the function
run()

(User): how many s's in the word mississssippi
Step 1: Identify the task:
<think>To solve this task, we need to count the number of 's's in the word "mississssippi". This involves:

1. Breaking down the word "mississssippi" into individual letters.
2. Counting the number of 's's in the word.

The word "mississssippi" can be broken down as follows: 

m - i - s - s - i - s - s - s - s - s - i - p - p - i

Now, counting the 's's: 1. one 's', 2. 's', 3. 's', 4. 's', 5. 's', 6. 's'. Therefore, the word "mississssippi" contains 6 's's.

The task is a simple linguistic exercise that involves character counting and string manipulation.</think>
**Thinking time: 1.34 seconds**

Step 2: Break down the task:
<think>We will start by decomposing the word into its constituent parts. This will help us examine each letter individually and accurately count the number of 's's.</think>
1.1 Decompose the word into individual letters:
   m - i - s - s - i - s - s - s - s - s - i - p - p - i

**Step 2: Ident
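The `<think>` handling in `display_cot_response` relies on string splitting, which assumes every opening tag has a matching closing tag. As an alternative, the sketch below uses a regular expression to separate the thoughts from the visible text. The name `split_thoughts` is hypothetical, and the sample string is invented for illustration rather than taken from a real model response.

```python
import re

# Non-greedy match so that each <think>...</think> pair is captured separately
THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thoughts(content: str) -> tuple[list[str], str]:
    """Return (thoughts, visible_text) for a response that may contain <think> tags."""
    thoughts = [thought.strip() for thought in THINK_PATTERN.findall(content)]
    visible = THINK_PATTERN.sub("", content).strip()
    return thoughts, visible


# Invented sample input, for illustration only
sample = "<think>Spell the word out letter by letter.</think>\nThe steps are listed below."
thoughts, visible = split_thoughts(sample)
print(thoughts)
print(visible)
```

An unclosed `<think>` tag simply does not match the pattern, so the text is returned unchanged instead of raising an error.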

As demonstrated in the Chain-of-Thought Reasoning example, we can guide the model to break down the task into multiple steps and generate a coherent output. This allows the model to solve complex tasks in logical steps.
However, this requires multiple calls to the model, which can be expensive in terms of cost and time.
Also, the model may not always identify the correct steps to solve the task, so the results are not deterministic.
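Because each step is a separate model call, it is useful to measure what the chain costs in time. The sketch below shows one way to factor the repeated timing logic into a helper. The names `timed` and `run_pipeline` are hypothetical, and `fake_step` is a stand-in that returns a plain string. With the Mirascope calls above, you would pass `response.content` from one step to the next.

```python
import time
from typing import Any, Callable


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> tuple[Any, float]:
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def run_pipeline(
    steps: list[tuple[str, Callable[[str], str]]], initial: str
) -> tuple[list[tuple[str, str, float]], float]:
    """Feed each step's output into the next, recording per-step timing."""
    records = []
    current = initial
    for title, fn in steps:
        current, elapsed = timed(fn, current)
        records.append((title, current, elapsed))
    total = sum(elapsed for _, _, elapsed in records)
    return records, total


# Hypothetical stand-in for a model call, used only to keep the sketch runnable
def fake_step(text: str) -> str:
    return f"processed({text})"


records, total = run_pipeline(
    [("Step 1", fake_step), ("Step 2", fake_step)], "question"
)
for title, content, elapsed in records:
    print(f"{title}: {content} ({elapsed:.4f} seconds)")
print(f"Total: {total:.4f} seconds")
```

`time.perf_counter` is a monotonic clock intended for measuring elapsed time, so it is unaffected by system clock adjustments.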

## Conclusion
Chain-of-Thought Reasoning is a powerful technique that allows LLMs to solve complex tasks in logical steps. However, it requires multiple calls to the model, and the model may not always identify the correct steps to solve the task. This technique is most useful when the task requires multi-step reasoning or logical thinking.

Care should be taken to ensure that the model is guided correctly and that the output is coherent and accurate.