In [1]:
import os
from getpass import getpass

os.environ['LANGSMITH_TRACING'] = 'true'
os.environ['LANGSMITH_ENDPOINT'] = "https://eu.api.smith.langchain.com"
os.environ['LANGSMITH_API_KEY'] =  os.getenv('LANGSMITH_API_KEY') or getpass('Enter your LangSmith API Key: ')
os.environ['LANGSMITH_PROJECT'] = 'LangChain-LangSmith-Demo'

In [2]:
os.environ["GOOGLE_API_KEY"] = os.getenv("GOOGLE_API_KEY") or getpass(
    "Enter GOOGLE API Key: "
)

In [4]:
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash",
    temperature=0.0)

The `temperature` parameter controls the randomness of the LLM's output. A temperature of `0.0` makes an LLM's output more deterministic, which _in theory_ should lead to a lower likelihood of hallucination.

Now, the question here may be, _why would we ever not use `temperature=0.0`?_ The answer is that sometimes a little bit of randomness can be useful. Randomness tends to translate to text that feels more human and creative, so if we'd like an LLM to help us write an article or even a poem, that lack of determinism becomes a feature rather than a bug.
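
To make the effect of temperature concrete, here is a minimal sketch of temperature-scaled softmax sampling in plain Python. It is an illustration of the general idea only, not how Gemini or LangChain implement it: the `logits` values are made up, `softmax_with_temperature` is a hypothetical helper, and treating a temperature of `0` as "always pick the top token" is an assumption about how greedy decoding is commonly handled.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, scaled by temperature."""
    if temperature <= 0:
        # Assumption: temperature 0 means greedy decoding (all mass on the top score).
        top = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == top else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.5]

greedy = softmax_with_temperature(logits, 0.0)  # deterministic choice
sharp = softmax_with_temperature(logits, 0.5)   # low temperature: peaked distribution
flat = softmax_with_temperature(logits, 2.0)    # high temperature: flatter distribution

for name, probs in [("T=0.0", greedy), ("T=0.5", sharp), ("T=2.0", flat)]:
    print(name, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the highest-scoring token, while higher temperatures spread it across the alternatives, which is where the extra variety in the output comes from.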

## Basic Prompting

In [6]:
prompt = '''
Answer the user's query based on the context below.
If you cannot answer the question using the
provided information answer with "I don't know".

Context: {context}
'''

LangChain uses a `ChatPromptTemplate` object to format the various prompt types into a single list which will be passed to our LLM:

In [8]:
from langchain_core.prompts import ChatPromptTemplate

prompt_template = ChatPromptTemplate.from_messages([
    ("system", prompt),
    ("user", "{query}")
])

In [10]:
prompt_template

ChatPromptTemplate(input_variables=['context', 'query'], input_types={}, partial_variables={}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], input_types={}, partial_variables={}, template='\nAnswer the user\'s query based on the context below.\nIf you cannot answer the question using the\nprovided information answer with "I don\'t know".\n\nContext: {context}\n'), additional_kwargs={}), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['query'], input_types={}, partial_variables={}, template='{query}'), additional_kwargs={})])

When we call the template it will expect us to provide two variables, the `context` and the `query`. Both of these variables are pulled from the strings we wrote, as LangChain interprets curly-bracket syntax (i.e. `{context}` and `{query}`) as indicating a dynamic variable that we expect to be inserted at query time. We can see that these variables have been picked up by our template object by viewing its `input_variables` attribute:

In [9]:
prompt_template.input_variables

['context', 'query']
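
As a rough illustration of how curly-bracket placeholders can be discovered in a string, the sketch below uses Python's standard `string.Formatter`. This is a stand-in for the idea and is not presented as LangChain's own code; `find_template_variables` and the shortened `system_text` are hypothetical names introduced for this example.

```python
from string import Formatter


def find_template_variables(template):
    """Return the sorted, de-duplicated placeholder names in a format string."""
    names = {
        field_name
        for _, field_name, _, _ in Formatter().parse(template)
        if field_name  # skip literal-only segments, where field_name is None
    }
    return sorted(names)


# Shortened, hypothetical versions of the system and user templates.
system_text = "Answer using the context below.\n\nContext: {context}\n"
user_text = "{query}"

variables = find_template_variables(system_text + user_text)
print(variables)
```

The names it finds are the same two variables that the template object reports.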

We can also view the structure of the messages (currently _prompt templates_) that the `ChatPromptTemplate` will construct by viewing the `messages` attribute:

In [11]:
prompt_template.messages

[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], input_types={}, partial_variables={}, template='\nAnswer the user\'s query based on the context below.\nIf you cannot answer the question using the\nprovided information answer with "I don\'t know".\n\nContext: {context}\n'), additional_kwargs={}),
 HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['query'], input_types={}, partial_variables={}, template='{query}'), additional_kwargs={})]

From this, we can see that each tuple provided when using `ChatPromptTemplate.from_messages` becomes an individual prompt template itself. Within each of these tuples, the first value defines the _role_ of the message, which is typically `system`, `human`, or `ai` (the `user` role we used above is treated as `human`, as the `HumanMessagePromptTemplate` in the output shows). Using these tuples is shorthand for the following, more explicit code:

In [12]:
from langchain_core.prompts import SystemMessagePromptTemplate, HumanMessagePromptTemplate

prompt_template = ChatPromptTemplate.from_messages([
    SystemMessagePromptTemplate.from_template(prompt),
    HumanMessagePromptTemplate.from_template("{query}"),
])

We can see that the structure of this new chat prompt template is identical to the previous one:

In [14]:
prompt_template.input_variables

['context', 'query']

In [13]:
prompt_template.messages

[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], input_types={}, partial_variables={}, template='\nAnswer the user\'s query based on the context below.\nIf you cannot answer the question using the\nprovided information answer with "I don\'t know".\n\nContext: {context}\n'), additional_kwargs={}),
 HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['query'], input_types={}, partial_variables={}, template='{query}'), additional_kwargs={})]

`prompt_template.format_messages()` produces a structured list of message objects (such as system and human), containing the formatted content and type. This allows us to preview exactly what the LLM will receive as input.

In [22]:
formatted = prompt_template.format_messages(query="This the user's query.", context = "This is a test context.")

for msg in formatted:
    print(f"{msg.type} : {msg.content}")

system : 
Answer the user's query based on the context below.
If you cannot answer the question using the
provided information answer with "I don't know".

Context: This is a test context.

human : This the user's query.


We'll set up the pipeline to consume two variables, `query` and `context`, when it is called. We'll feed them into our chat prompt template and then invoke our LLM with the formatted messages.

Although that sounds complicated, all we're doing is connecting our `prompt_template` and `llm`. We do this with **L**ang**C**hain **E**xpression **L**anguage (LCEL), which uses the `|` operator to connect each component.

In [30]:
chain = (
    {
        "query": lambda x: x["query"],
        "context": lambda x: x["context"]
    }
    | prompt_template
    | llm
)
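
To show what chaining with `|` amounts to, here is a toy sketch in plain Python that overloads `__or__` so that `a | b` means "run `a`, then pass its output to `b`". It is only an analogy for the pattern: the `Step` class, `fill_prompt`, and `fake_llm` are hypothetical stand-ins and do not reflect LangChain's real classes or a real model call.

```python
class Step:
    """Wraps a function so steps can be chained with the | operator."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # `a | b` builds a new step that runs a, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Hypothetical stand-ins for a prompt template and an LLM.
fill_prompt = Step(lambda d: f"Context: {d['context']}\nQuestion: {d['query']}")
fake_llm = Step(lambda text: f"(model reply to {len(text.splitlines())} prompt lines)")

toy_chain = fill_prompt | fake_llm
reply = toy_chain.invoke({"query": "What is LangChain?", "context": "A framework."})
print(reply)  # (model reply to 2 prompt lines)
```

Calling `invoke` on the combined object runs each step left to right, which mirrors how the input dictionary flows through the prompt template and then into the LLM.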

In [23]:
context = """LangChain is a framework for developing applications powered by language models. 
It can be used for chatbots, Generative Question-Answering (GQA), summarization, and much more."""

query = "What is LangChain used for?"

In [31]:
chain.invoke({"query": query, "context": context})

AIMessage(content='LangChain is used for developing applications powered by language models, including chatbots, Generative Question-Answering (GQA), and summarization.', additional_kwargs={}, response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'model_name': 'gemini-2.5-flash', 'safety_ratings': [], 'grounding_metadata': {}, 'model_provider': 'google_genai'}, id='lc_run--ca33ac44-657b-4956-a706-6f8148e0bfdb-0', usage_metadata={'input_tokens': 85, 'output_tokens': 90, 'total_tokens': 175, 'input_token_details': {'cache_read': 0}, 'output_token_details': {'reasoning': 61}})

In [32]:
chain.invoke({"query": "What is LangSmith?", "context": context})

AIMessage(content="I don't know.", additional_kwargs={}, response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'model_name': 'gemini-2.5-flash', 'safety_ratings': [], 'grounding_metadata': {}, 'model_provider': 'google_genai'}, id='lc_run--e5e79dc5-502a-4a25-8445-1a5940896c49-0', usage_metadata={'input_tokens': 83, 'output_tokens': 105, 'total_tokens': 188, 'input_token_details': {'cache_read': 0}, 'output_token_details': {'reasoning': 99}})

## Few Shot Prompting

Many **S**tate-**o**f-**t**he-**A**rt (SotA) LLMs are incredible at instruction following, meaning that it takes much less effort to get the intended output or behavior from these models than it does from older or smaller LLMs.

Before creating an example, let's first see how to use LangChain's few-shot prompting objects. We will provide multiple examples and feed them in as sequential human and AI messages, so we set up the template like this:

In [49]:
example_prompt = ChatPromptTemplate.from_messages([
    ("human", "{input}"),
    ("ai", "{output}")
])

Then we define a list of examples with dictionaries containing the correct `input` and `output` keys.

In [50]:
examples = [
    {"input": "Here is query #1", "output": "Here is the answer #1"},
    {"input": "Here is query #2", "output": "Here is the answer #2"},
    {"input": "Here is query #3", "output": "Here is the answer #3"}
]

We then feed both of these into our `FewShotChatMessagePromptTemplate` object:

In [51]:
from langchain_core.prompts import FewShotChatMessagePromptTemplate

few_shot_prompt = FewShotChatMessagePromptTemplate(
    examples=examples,
    example_prompt=example_prompt
)


In [52]:
print(few_shot_prompt.format()) # here is the formatted prompt

Human: Here is query #1
AI: Here is the answer #1
Human: Here is query #2
AI: Here is the answer #2
Human: Here is query #3
AI: Here is the answer #3


Using this, we can provide different sets of examples, or even different individual `example_prompt` templates, to the `FewShotChatMessagePromptTemplate` object to build our prompt structure.
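
Conceptually, the few-shot template renders every example through each message in the example prompt and concatenates the results in order. The plain-Python sketch below mimics that expansion under stated assumptions: `expand_examples`, `example_template`, and `toy_examples` are hypothetical names, and the code is not LangChain's implementation.

```python
def expand_examples(examples, example_template):
    """Render each example through every (role, text) pair in the template."""
    messages = []
    for example in examples:
        for role, text in example_template:
            messages.append((role, text.format(**example)))
    return messages


# One human/ai pair per example, mirroring the example prompt structure.
example_template = [("human", "{input}"), ("ai", "{output}")]
toy_examples = [
    {"input": "Here is query #1", "output": "Here is the answer #1"},
    {"input": "Here is query #2", "output": "Here is the answer #2"},
]

few_shot_messages = expand_examples(toy_examples, example_template)
for role, text in few_shot_messages:
    print(f"{role}: {text}")
```

Swapping in a different list of examples, or a different list of role and text pairs, changes the rendered conversation without touching the expansion logic.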

### A Few-Shot Example

In [57]:
new_system_prompt = """
Answer the user's query based on the context below.
If you cannot answer the question using the
provided information answer with "I don't know".

Always answer in markdown format. When doing so please
provide headers, short summaries, follow with bullet
points, then conclude.

Context: {context}
"""
prompt_template.messages[0].prompt.template = new_system_prompt
prompt_template.messages


[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], input_types={}, partial_variables={}, template='\nAnswer the user\'s query based on the context below.\nIf you cannot answer the question using the\nprovided information answer with "I don\'t know".\n\nAlways answer in markdown format. When doing so please\nprovide headers, short summaries, follow with bullet\npoints, then conclude.\n\nContext: {context}\n'), additional_kwargs={}),
 HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['query'], input_types={}, partial_variables={}, template='{query}'), additional_kwargs={})]

In [61]:
query

'What is LangChain used for?'

In [62]:
response = chain.invoke({'query': query, 'context': context}).content
print(response)

### LangChain Applications

LangChain is a versatile framework designed for building applications that leverage language models. It offers a wide range of uses, enabling developers to create sophisticated AI-powered tools.

**Summary:**
LangChain is primarily used for developing various applications powered by language models.

**Key Uses:**
*   **Chatbots:** Creating interactive conversational agents.
*   **Generative Question-Answering (GQA):** Developing systems that can generate answers to questions based on provided information.
*   **Summarization:** Building tools that can condense longer texts into shorter, coherent summaries.
*   **Much more:** The framework's flexibility allows for a broad spectrum of other language model-powered applications.

**Conclusion:**
In essence, LangChain serves as a foundational tool for anyone looking to develop applications that harness the power of language models, from simple chatbots to complex generative AI systems.


We can display our markdown nicely with IPython like so:

In [63]:
from IPython.display import display, Markdown

display(Markdown(response))

### LangChain Applications

LangChain is a versatile framework designed for building applications that leverage language models. It offers a wide range of uses, enabling developers to create sophisticated AI-powered tools.

**Summary:**
LangChain is primarily used for developing various applications powered by language models.

**Key Uses:**
*   **Chatbots:** Creating interactive conversational agents.
*   **Generative Question-Answering (GQA):** Developing systems that can generate answers to questions based on provided information.
*   **Summarization:** Building tools that can condense longer texts into shorter, coherent summaries.
*   **Much more:** The framework's flexibility allows for a broad spectrum of other language model-powered applications.

**Conclusion:**
In essence, LangChain serves as a foundational tool for anyone looking to develop applications that harness the power of language models, from simple chatbots to complex generative AI systems.

This is not bad, but also not quite the format we wanted. We could try improving our initial prompting instructions, but when this doesn't work we can move on to our few-shot prompting. We want to build something like this:

We have already defined our `example_prompt` so now we just change our `examples` to use some examples of a user asking a question and the LLM answering in the exact markdown format we need.

In [64]:
examples = [
    {
        "input": "Can you explain gravity?",
        "output": (
            "## Gravity\n\n"
            "Gravity is one of the fundamental forces in the universe.\n\n"
            "### Discovery\n\n"
            "* Gravity was first discovered by Sir Isaac Newton in the late 17th century.\n"
            "* It was said that Newton theorized about gravity after seeing an apple fall from a tree.\n\n"
            "### In General Relativity\n\n"
            "* Gravity is described as the curvature of spacetime.\n"
            "* The more massive an object is, the more it curves spacetime.\n"
            "* This curvature is what causes objects to fall towards each other.\n\n"
            "### Gravitons\n\n"
            "* Gravitons are hypothetical particles that mediate the force of gravity.\n"
            "* They have not yet been detected.\n\n"
            "**To conclude**, Gravity is a fascinating topic and has been studied extensively since the time of Newton.\n\n"
        )
    },
    {
        "input": "What is the capital of France?",
        "output": (
            "## France\n\n"
            "The capital of France is Paris.\n\n"
            "### Origins\n\n"
            "* The name Paris comes from the Latin word \"Parisini\" which referred to a Celtic people living in the area.\n"
            "* The Romans named the city Lutetia, which means \"the place where the river turns\".\n"
            "* The city was renamed Paris in the 3rd century BC by the Celtic-speaking Parisii tribe.\n\n"
            "**To conclude**, Paris is highly regarded as one of the most beautiful cities in the world and is one of the world's greatest cultural and economic centres.\n\n"
        )
    }
]

We feed these into our `FewShotChatMessagePromptTemplate` object:

In [65]:
few_shot_prompt = FewShotChatMessagePromptTemplate(
    examples=examples,
    example_prompt=example_prompt
)

In [66]:
display(Markdown(few_shot_prompt.format()))

Human: Can you explain gravity?
AI: ## Gravity

Gravity is one of the fundamental forces in the universe.

### Discovery

* Gravity was first discovered by Sir Isaac Newton in the late 17th century.
* It was said that Newton theorized about gravity after seeing an apple fall from a tree.

### In General Relativity

* Gravity is described as the curvature of spacetime.
* The more massive an object is, the more it curves spacetime.
* This curvature is what causes objects to fall towards each other.

### Gravitons

* Gravitons are hypothetical particles that mediate the force of gravity.
* They have not yet been detected.

**To conclude**, Gravity is a fascinating topic and has been studied extensively since the time of Newton.


Human: What is the capital of France?
AI: ## France

The capital of France is Paris.

### Origins

* The name Paris comes from the Latin word "Parisini" which referred to a Celtic people living in the area.
* The Romans named the city Lutetia, which means "the place where the river turns".
* The city was renamed Paris in the 3rd century BC by the Celtic-speaking Parisii tribe.

**To conclude**, Paris is highly regarded as one of the most beautiful cities in the world and is one of the world's greatest cultural and economic centres.



We then pull all of this together with our system prompt and final user query to create our final prompt and feed it into our LLM.

In [68]:
few_shot_prompt

FewShotChatMessagePromptTemplate(examples=[{'input': 'Can you explain gravity?', 'output': '## Gravity\n\nGravity is one of the fundamental forces in the universe.\n\n### Discovery\n\n* Gravity was first discovered by Sir Isaac Newton in the late 17th century.\n* It was said that Newton theorized about gravity after seeing an apple fall from a tree.\n\n### In General Relativity\n\n* Gravity is described as the curvature of spacetime.\n* The more massive an object is, the more it curves spacetime.\n* This curvature is what causes objects to fall towards each other.\n\n### Gravitons\n\n* Gravitons are hypothetical particles that mediate the force of gravity.\n* They have not yet been detected.\n\n**To conclude**, Gravity is a fascinating topic and has been studied extensively since the time of Newton.\n\n'}, {'input': 'What is the capital of France?', 'output': '## France\n\nThe capital of France is Paris.\n\n### Origins\n\n* The name Paris comes from the Latin word "Parisini" which refe

In [69]:
prompt_template = ChatPromptTemplate.from_messages([
    ("system", new_system_prompt),
    few_shot_prompt,
    ("user", "{query}")
])

In [72]:
prompt_template

ChatPromptTemplate(input_variables=['context', 'query'], input_types={}, partial_variables={}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], input_types={}, partial_variables={}, template='\nAnswer the user\'s query based on the context below.\nIf you cannot answer the question using the\nprovided information answer with "I don\'t know".\n\nAlways answer in markdown format. When doing so please\nprovide headers, short summaries, follow with bullet\npoints, then conclude.\n\nContext: {context}\n'), additional_kwargs={}), FewShotChatMessagePromptTemplate(examples=[{'input': 'Can you explain gravity?', 'output': '## Gravity\n\nGravity is one of the fundamental forces in the universe.\n\n### Discovery\n\n* Gravity was first discovered by Sir Isaac Newton in the late 17th century.\n* It was said that Newton theorized about gravity after seeing an apple fall from a tree.\n\n### In General Relativity\n\n* Gravity is described as the curvature of spac

In [71]:
prompt_template.messages

[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], input_types={}, partial_variables={}, template='\nAnswer the user\'s query based on the context below.\nIf you cannot answer the question using the\nprovided information answer with "I don\'t know".\n\nAlways answer in markdown format. When doing so please\nprovide headers, short summaries, follow with bullet\npoints, then conclude.\n\nContext: {context}\n'), additional_kwargs={}),
 FewShotChatMessagePromptTemplate(examples=[{'input': 'Can you explain gravity?', 'output': '## Gravity\n\nGravity is one of the fundamental forces in the universe.\n\n### Discovery\n\n* Gravity was first discovered by Sir Isaac Newton in the late 17th century.\n* It was said that Newton theorized about gravity after seeing an apple fall from a tree.\n\n### In General Relativity\n\n* Gravity is described as the curvature of spacetime.\n* The more massive an object is, the more it curves spacetime.\n* This curvature is what causes

Now we feed this back into our pipeline (chain):

In [74]:
chain = (
    {
        "query": lambda x: x["query"],
        "context": lambda x: x["context"]
    }
    | prompt_template
    | llm
)

In [76]:
result = chain.invoke({"query": query, "context": context})
display(Markdown(result.content))

## LangChain Uses

LangChain is a framework designed for building applications that leverage the power of language models.

### Applications

*   **Chatbots**: Developing interactive conversational agents.
*   **Generative Question-Answering (GQA)**: Creating systems that can generate answers to questions based on provided information.
*   **Summarization**: Tools for condensing longer texts into shorter, coherent summaries.
*   **Much more**: Its capabilities extend beyond these specific examples to various other language model-powered applications.

**To conclude**, LangChain provides a versatile toolkit for developers to create a wide range of applications utilizing language models, from conversational AI to information extraction and content generation.

## Chain of Thought Prompting

We'll take a look at one more commonly used prompting technique called _chain of thought_ (CoT). CoT is a technique that encourages the LLM to think through the problem step by step before providing an answer. The idea is that by breaking the problem down into smaller steps, the LLM is more likely to arrive at the correct answer and we are less likely to see hallucinations.

To implement CoT we don't need any specific LangChain objects; instead, we simply modify how we instruct our LLM within the system prompt. We will ask the LLM to list the problems that need to be solved, to solve each problem individually, and then to arrive at the final answer.

Let's first test our LLM _without_ CoT prompting.

In [91]:
no_cot_system_prompt = """
Be a helpful assistant and answer the user's question.

You MUST answer the question directly without using chain of thoughts and any other
text or explanation.
"""

no_cot_prompt_template = ChatPromptTemplate.from_messages([
    ("system", no_cot_system_prompt),
    ("user", "{query}"),
])

Nowadays most LLMs are trained to use CoT by default, so for this example we actually need to instruct the model not to do so, which is why we added `"You MUST answer the question directly without using chain of thoughts and any other text or explanation."` to our system prompt.

In [92]:
query = (
    "How many keystrokes are needed to type the numbers from 1 to 1000?"
)

In [93]:
no_cot_chain = no_cot_prompt_template | llm
no_cot_result = no_cot_chain.invoke({"query": query}).content
print(no_cot_result)

2893


The answer `2893` is correct, but an LLM answering _without_ CoT might sometimes hallucinate and give us a guess instead. Now, we can add explicit CoT prompting to our system prompt to see if we can get a better result.
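
The expected value can be checked independently of any LLM with a few lines of Python. This sketch assumes that only digit keys are counted (no spaces, commas, or Enter presses between numbers); `count_keystrokes` is a hypothetical helper written for this check.

```python
def count_keystrokes(first, last):
    """Count digit keystrokes needed to type every integer from first to last."""
    return sum(len(str(n)) for n in range(first, last + 1))


total = count_keystrokes(1, 1000)

# Break the total down by how many digits each number has.
by_length = {
    digits: count_keystrokes(10 ** (digits - 1), min(10 ** digits - 1, 1000))
    for digits in range(1, 5)
}

print(total)      # 2893
print(by_length)  # {1: 9, 2: 180, 3: 2700, 4: 4}
```

The breakdown by digit count (9 + 180 + 2700 + 4) gives us a reference to compare against any step-by-step reasoning the model produces.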

In [94]:
# Define the chain-of-thought prompt template
cot_system_prompt = """
Be a helpful assistant and answer the user's question.

To answer the question, you must:

- List systematically and in precise detail all
  subproblems that need to be solved to answer the
  question.
- Solve each sub problem INDIVIDUALLY and in sequence.
- Finally, use everything you have worked through to
  provide the final answer.
"""

cot_prompt_template = ChatPromptTemplate.from_messages([
    ("system", cot_system_prompt),
    ("user", "{query}"),
])

cot_chain = cot_prompt_template | llm

In [95]:
cot_result = cot_chain.invoke({"query": query}).content
display(Markdown(cot_result))

To determine the total number of keystrokes needed to type the numbers from 1 to 1000, we need to break down the problem based on the number of digits in each number.

### Subproblems to Solve:

1.  **Calculate keystrokes for 1-digit numbers:** Numbers from 1 to 9.
2.  **Calculate keystrokes for 2-digit numbers:** Numbers from 10 to 99.
3.  **Calculate keystrokes for 3-digit numbers:** Numbers from 100 to 999.
4.  **Calculate keystrokes for 4-digit numbers:** The number 1000.
5.  **Sum all the calculated keystrokes** from the above subproblems.

---

### Solving Each Subproblem Individually:

#### Subproblem 1: Keystrokes for 1-digit numbers (1 to 9)

*   **Numbers:** 1, 2, 3, 4, 5, 6, 7, 8, 9
*   **Number of digits per number:** 1
*   **Count of 1-digit numbers:** There are 9 numbers (from 1 to 9).
*   **Total keystrokes for 1-digit numbers:** 9 numbers * 1 keystroke/number = **9 keystrokes**.

#### Subproblem 2: Keystrokes for 2-digit numbers (10 to 99)

*   **Numbers:** 10, 11, ..., 99
*   **Number of digits per number:** 2
*   **Count of 2-digit numbers:** (Last number - First number + 1) = (99 - 10 + 1) = 90 numbers.
*   **Total keystrokes for 2-digit numbers:** 90 numbers * 2 keystrokes/number = **180 keystrokes**.

#### Subproblem 3: Keystrokes for 3-digit numbers (100 to 999)

*   **Numbers:** 100, 101, ..., 999
*   **Number of digits per number:** 3
*   **Count of 3-digit numbers:** (Last number - First number + 1) = (999 - 100 + 1) = 900 numbers.
*   **Total keystrokes for 3-digit numbers:** 900 numbers * 3 keystrokes/number = **2700 keystrokes**.

#### Subproblem 4: Keystrokes for 4-digit numbers (1000)

*   **Numbers:** 1000
*   **Number of digits per number:** 4
*   **Count of 4-digit numbers:** There is only 1 number (1000).
*   **Total keystrokes for 4-digit numbers:** 1 number * 4 keystrokes/number = **4 keystrokes**.

---

### Final Answer:

Now, we sum the keystrokes from all the subproblems:

*   Keystrokes for 1-digit numbers: 9
*   Keystrokes for 2-digit numbers: 180
*   Keystrokes for 3-digit numbers: 2700
*   Keystrokes for 4-digit numbers: 4

**Total Keystrokes = 9 + 180 + 2700 + 4 = 2893**

Therefore, **2893** keystrokes are needed to type the numbers from 1 to 1000.

Our LLM provides us with a final answer of `2893`, which is correct. Finally, as mentioned, most LLMs are now trained to use CoT by default, so let's see what happens if we don't explicitly tell the LLM to use CoT.

In [96]:
system_prompt = """
Be a helpful assistant and answer the user's question.
"""

prompt_template = ChatPromptTemplate.from_messages([
    ("system", system_prompt),
    ("user", "{query}"),
])

chain = prompt_template | llm

In [97]:
result = chain.invoke({"query": query}).content
display(Markdown(result))

Let's break this down by the number of digits in each number:

1.  **1-digit numbers (1-9):**
    *   There are 9 numbers (1, 2, 3, 4, 5, 6, 7, 8, 9).
    *   Each requires 1 keystroke.
    *   Total: 9 * 1 = **9 keystrokes**

2.  **2-digit numbers (10-99):**
    *   There are 90 numbers (99 - 10 + 1 = 90).
    *   Each requires 2 keystrokes.
    *   Total: 90 * 2 = **180 keystrokes**

3.  **3-digit numbers (100-999):**
    *   There are 900 numbers (999 - 100 + 1 = 900).
    *   Each requires 3 keystrokes.
    *   Total: 900 * 3 = **2700 keystrokes**

4.  **4-digit numbers (1000):**
    *   There is 1 number (1000).
    *   It requires 4 keystrokes.
    *   Total: 1 * 4 = **4 keystrokes**

Now, add them all up:
9 + 180 + 2700 + 4 = **2893 keystrokes**

So, 2893 keystrokes are needed to type the numbers from 1 to 1000.

We get the _exact_ same result. The formatting isn't quite as nice but the CoT behavior is clearly there, and the LLM produces the correct final answer!

CoT is useful not only for simple question-answering like this; it is also a fundamental component of many agentic systems, which often pair CoT steps with tool use to solve very complex problems. This is the behavior we see in `gemini-2.5-flash`, whose responses above report `reasoning` tokens in their `usage_metadata`.