# Chain-of-thought Reasoning in Granite

This notebook demonstrates using the chain-of-thought reasoning capabilities of the Granite Instruct models.

The reasoning capability is baked into the Instruct models rather than being a separate model. The model’s internal reasoning process can be easily toggled on and off, so you spend the extra compute only on tasks that need it. See [Reasoning when you need it](https://www.ibm.com/new/announcements/ibm-granite-3-2-open-source-reasoning-and-vision#Granite+3.2+Instruct%3A+Reasoning+when+you+need+it) for additional information.

NOTE: In Granite 3.2, the chain-of-thought reasoning capabilities are currently considered experimental.

## Install Dependencies

We first need to install some Python package dependencies for this notebook. Granite utils provides some helpful functions for recipes.

In [None]:
! pip install git+https://github.com/ibm-granite-community/utils.git \
    langchain_community \
    transformers \
    replicate

## Select your model

Select a Granite model from the [`ibm-granite`](https://replicate.com/ibm-granite) org on Replicate. Here we use the Replicate LangChain client to connect to the model.

To get set up with Replicate, see [Getting Started with Replicate](https://github.com/ibm-granite-community/granite-kitchen/blob/main/recipes/Getting_Started/Getting_Started_with_Replicate.ipynb).

To connect to a model on a provider other than Replicate, substitute this code cell with one from the [LLM component recipe](https://github.com/ibm-granite-community/granite-kitchen/blob/main/recipes/Components/Langchain_LLMs.ipynb).

In [None]:
from ibm_granite_community.notebook_utils import get_env_var
from langchain_community.llms import Replicate
from transformers import AutoTokenizer

model_path = "ibm-granite/granite-3.2-8b-instruct"
model = Replicate(
    model=model_path,
    replicate_api_token=get_env_var("REPLICATE_API_TOKEN"),
    model_kwargs={
        "max_tokens": 4000, # Set the maximum number of tokens to generate as output.
        "min_tokens": 200, # Set the minimum number of tokens to generate as output.
        "temperature": 0.0, # Lower the temperature
    },
)
tokenizer = AutoTokenizer.from_pretrained(model_path)

## Set up two prompts

We will create two prompt chains. The first chain uses the normal response mode (no chain-of-thought reasoning), which is the default prompt mode for Granite. The second chain uses the chain-of-thought reasoning response mode by passing `thinking=True` to the chat template. This adds instructions to the system prompt that activate the model's internal reasoning process, so the response contains the chain of thought.
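To make the idea concrete, here is a minimal sketch of a template that adds reasoning instructions only when a flag is set. It is a simplified stand-in, not the Granite chat template: the `render_prompt` function, the bracketed role markers, and the instruction text are all hypothetical.

```python
# Simplified stand-in for a chat template; NOT the Granite template.
# The role markers and the reasoning instruction text are hypothetical.
BASE_SYSTEM = "You are a helpful assistant."
REASONING_INSTRUCTIONS = " Write out your reasoning before giving the final answer."


def render_prompt(user_content: str, thinking: bool = False) -> str:
    """Render a single-turn prompt, optionally adding reasoning instructions."""
    system = BASE_SYSTEM + (REASONING_INSTRUCTIONS if thinking else "")
    return (
        f"[system]{system}[/system]\n"
        f"[user]{user_content}[/user]\n"
        "[assistant]"
    )


plain = render_prompt("{input}")
reasoning = render_prompt("{input}", thinking=True)

# Only the system section differs; the user turn is identical in both prompts.
print(plain)
print(reasoning)
```

The point of the sketch is that the model is the same in both cases; only the prompt text changes.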

In [None]:
from langchain_core.prompts import PromptTemplate  # langchain_core is installed as a dependency of langchain_community

# Create a Granite prompt without chain-of-thought reasoning
prompt = tokenizer.apply_chat_template(
    conversation=[{
        "role": "user",
        "content": "{input}",
    }],
    add_generation_prompt=True,
    tokenize=False,
)
prompt_template = PromptTemplate.from_template(template=prompt)
chain = prompt_template | model

# Create a Granite prompt using chain-of-thought reasoning
reasoning_prompt = tokenizer.apply_chat_template(
    conversation=[{
        "role": "user",
        "content": "{input}",
    }],
    thinking=True, # Use chain-of-thought reasoning
    add_generation_prompt=True,
    tokenize=False,
)
reasoning_prompt_template = PromptTemplate.from_template(template=reasoning_prompt)
reasoning_chain = reasoning_prompt_template | model
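Both prompts keep a literal `{input}` placeholder, which the prompt template fills in when the chain is invoked. By default `PromptTemplate` uses Python format-string syntax, so plain `str.format` is a reasonable stand-in for illustration. The sketch below uses a hypothetical prompt string and a hypothetical `fill_prompt` helper, not real chat template output.

```python
# Stand-in for what the prompt template does when the chain is invoked:
# the "{input}" placeholder in the rendered prompt is replaced by the question.
# The prompt text is a hypothetical example, not real Granite template output.
rendered_prompt = "[user]{input}[/user]\n[assistant]"


def fill_prompt(template: str, **values: str) -> str:
    """Substitute named placeholders using Python format-string rules."""
    return template.format(**values)


filled = fill_prompt(rendered_prompt, input="Which one is greater, 9.11 or 9.9?")
print(filled)

# Literal braces in a template are read as placeholders unless they are doubled.
escaped = fill_prompt('Reply as JSON like {{"answer": ...}} to: {input}', input="2 + 2?")
print(escaped)
```

One consequence worth keeping in mind: if a chat template ever emitted literal braces of its own, they would need to be doubled before the text is used as a format-style template.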

Now that we have the two prompts, let's look at the difference between them. The additional text in the reasoning prompt is what activates the Granite model's internal reasoning process.

NOTE: This additional prompt text is specific to the chat template for the version of Granite used and may change in future releases of Granite.

In [None]:
import difflib
from ibm_granite_community.notebook_utils import wrap_text

print("==== System prompt instructions for chain-of-thought reasoning ====")
diff = difflib.ndiff(prompt, reasoning_prompt)
print(wrap_text("".join(d[-1] for d in diff if d[0] == "+"), indent="  "))
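The cell above passes two strings to `difflib.ndiff`, so the comparison runs character by character and the added characters are then joined back into text. The following self-contained sketch shows the same technique on two short, made-up prompts (hypothetical stand-ins for the real template output).

```python
import difflib

# Hypothetical stand-in prompts; the real prompts come from the chat template.
plain = "system: You are helpful.\nuser: {input}\n"
reasoning = "system: You are helpful. Think step by step.\nuser: {input}\n"

# Passing strings to ndiff compares them character by character.
# Each result is a 3-character item such as "+ T": a tag, a space, and the character.
diff = list(difflib.ndiff(plain, reasoning))

# Keep only the characters that appear in the second string but not the first.
added = "".join(d[-1] for d in diff if d[0] == "+")
print(repr(added))
```

This works cleanly when the extra text is one contiguous insertion. If the two prompts differed in several scattered places, the joined characters could read as fragments.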

## Compare the responses of the two prompts

First we define a helper function to take a question and use both prompts to respond to the question. The function will display the question and then the response from the normal prompt followed by the response from the chain-of-thought reasoning prompt.

In [None]:
def question(question: str) -> None:
    print("==== Question ====")
    print(wrap_text(question, indent="  "))

    print("==== Normal prompt response ====")
    output = chain.invoke({"input": question})
    print(wrap_text(output, indent="  "))

    print("\n==== Reasoning prompt response ====")
    reasoning_output = reasoning_chain.invoke({"input": question})
    print(wrap_text(reasoning_output, indent="  "))
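If you want to show the reasoning and the final answer separately, a small parser can split the reasoning response. This is a minimal sketch that assumes the response labels its sections with the phrases `Here is my thought process:` and `Here is my response:`. Those marker strings are an assumption about the Granite 3.2 chat template; check them against the system prompt instructions printed earlier. The `split_reasoning` helper and the sample text are hypothetical, and the sample is hand-written rather than real model output.

```python
# Assumed section markers; verify them against the printed system prompt instructions.
THOUGHT_MARKER = "Here is my thought process:"
RESPONSE_MARKER = "Here is my response:"


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (thought, response); thought is empty when the response marker is absent."""
    before, sep, after = text.partition(RESPONSE_MARKER)
    if not sep:
        # No marker found: treat the whole text as the response.
        return "", text.strip()
    thought = before.replace(THOUGHT_MARKER, "", 1).strip()
    return thought, after.strip()


# Hand-written sample in the assumed format, not real model output.
sample = (
    "Here is my thought process:\n"
    "Each brother has 2 sisters, one of whom is Sally.\n\n"
    "Here is my response:\n"
    "Sally has 1 sister."
)
thought, response = split_reasoning(sample)
print("Thought:", thought)
print("Response:", response)
```

Falling back to the whole text when no marker is found means the same helper can be applied to the normal prompt response as well.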

### Questions

The following cells each contain an interesting question for the model to answer. Sometimes the normal response mode will not properly answer the question, while the chain-of-thought reasoning response mode will reason out the correct answer.

Try the following questions. You can edit or add a cell to try your own question!
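When comparing the two response modes, it helps to have reference answers for the questions that can be computed directly. The sketch below checks the decimal comparison, the dotted version comparison, the acid mixture, and the isosceles triangle. The helper names are hypothetical, and `version_key` handles purely numeric versions only, so it does not cover qualifiers such as `-rc1`.

```python
from decimal import Decimal
from fractions import Fraction


def version_key(version: str) -> tuple[int, ...]:
    """Split a purely numeric dotted version into integer components."""
    return tuple(int(part) for part in version.split("."))


def liters_to_add(volume: Fraction, start: Fraction, added: Fraction, target: Fraction) -> Fraction:
    """Solve volume*start + x*added == target*(volume + x) for x."""
    return volume * (target - start) / (added - target)


def base_angle(vertex_degrees: Fraction) -> Fraction:
    """Each base angle of an isosceles triangle; the three angles sum to 180."""
    return (180 - vertex_degrees) / 2


# Decimal numbers: 9.9 equals 9.90, which is greater than 9.11.
print(Decimal("9.9") > Decimal("9.11"))
# Dotted versions: components compare as integers, and 11 > 9.
print(version_key("9.11") > version_key("9.9"))
# Acid mixture: liters of 70% solution to add to 10 liters of 30% solution.
print(liters_to_add(Fraction(10), Fraction(3, 10), Fraction(7, 10), Fraction(1, 2)))
# Isosceles triangle with a 40 degree vertex angle.
print(base_angle(Fraction(40)))
```

Note that 9.11 and 9.9 order differently depending on whether they are read as decimal numbers or as version numbers, which is why the wording of the question matters.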

In [None]:
question("""\
Sally is a girl and has 3 brothers.
Each brother has 2 sisters.
How many sisters does Sally have?\
""")

In [None]:
question("""\
Which of the following items weigh more: a pound of water, two pounds of bricks, a pound of feathers, or three pounds of air?\
""")

In [None]:
question("""\
Which one is greater, 9.11 or 9.9?\
""")

In [None]:
question("""\
Which version number is greater, 9.11 or 9.9?\
""")

In [None]:
question("""\
Which Maven version number is greater, 9.9-rc1 or 9.9?\
""")

In [None]:
question("""\
You have 10 liters of a 30% acid solution.
How many liters of a 70% acid solution must be added to achieve a 50% acid mixture?\
""")

In [None]:
question("""\
In an isosceles triangle, the vertex angle measures 40 degrees.
What is the measure of each base angle?\
""")

You have now seen how to use the chain-of-thought reasoning capability in the Granite Instruct models.