# Access Mistral Models on Amazon Bedrock 

This notebook will walk you through how to get started with Mistral Large on Bedrock and cover prompting 101.

The notebook was tested using the Data Science 3.0 kernel in SageMaker Studio.

# Amazon Bedrock 
[Amazon Bedrock](https://aws.amazon.com/bedrock/) is a fully managed service that provides access to a wide range of powerful foundation models (FMs) through a unified API. It offers models from leading AI companies like Mistral, Anthropic, AI21 Labs, Cohere, Stability AI, and Amazon's own Titan models.

## Mistral Large 2
[Mistral Large](https://mistral.ai/news/mistral-large-2407/) is the most advanced language model developed by French AI startup Mistral AI. It also has support for function calling and JSON format.

Max tokens: 128k

Languages: Natively fluent in French, German, Spanish, Italian, Portuguese, Arabic, Hindi, Russian, Chinese, Japanese, and Korean

Supported use cases: precise instruction following, text summarization, translation, complex multilingual reasoning tasks, math and coding tasks including code generation

## Mistral 7B
[Mistral 7B](https://mistral.ai/news/announcing-mistral-7b/) is a 7B dense Transformer, fast-deployed and easily customizable. Small, yet powerful for a variety of use cases.

Max tokens: 8K

Languages: English

Supported use cases: Text summarization, structuration, question answering,
and code completion

## Mixtral 8x7B
[Mixtral 8x7B](https://mistral.ai/news/mixtral-of-experts/) is 7B sparse Mixture-of-Experts model with stronger capabilities than Mistral
7B. Uses 12B active parameters out of 45B total

Max tokens: 32K

Languages: English, French, German, Spanish, Italian

Supported use cases: Text summarization, structuration, question answering,
and code completion

## Mistral Small
[Mistral Small](https://mistral.ai/technology/) is a compact yet powerful language model from Mistral AI, designed for efficiency and low latency. Supports native function calling and JSON outputs.

Max tokens: 32k

Languages: English, French, German, Spanish, Italian

Supported use cases: Text generation, Code generation, Classification, RAG, Conversation


## Getting access to Mistral Large (or other Mistral AI models)
In order to start using Mistral Large, make sure that you have access to it from the Bedrock console:

1. Log in to the AWS Console and navigate to the Amazon Bedrock service

2. From the Bedrock home page, click "Get started"

3. On the left-hand navigation menu, click "Model access"

4. On the right side of the page, click "Manage model access"

5. Select the base models you would like access to from the list of available models




In [1]:
import boto3
import json
import pandas as pd
from IPython.display import display_html
import logging
from botocore.exceptions import ClientError

Please note that for the majority of the motebook Mistral Large is being used. However, the prompting technqiues explained throughout the notebook also apply to other Mistral AI models. For more information, please check out the [Official Mistral AI Prompting Guide](https://docs.mistral.ai/guides/prompting_capabilities/).

In [2]:
DEFAULT_MODEL= "mistral.mistral-large-2407-v1:0"
MISTRAL_7B = "mistral.mistral-7b-instruct-v0:2"
MIXTRAL = "mistral.mixtral-8x7b-instruct-v0:1"
MISTRAL_SMALL = "mistral.mistral-small-2402-v1:0"
bedrock = boto3.client(service_name="bedrock-runtime")
model_id = DEFAULT_MODEL


## Supported parameters

The Mistral AI models have the following [inference parameters](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-mistral.html).


```
{
    "prompt": string,
    "max_tokens" : int,
    "stop" : [string],    
    "temperature": float,
    "top_p": float,
    "top_k": int
}
```

- Temperature - Tunes the degree of randomness in generation. Lower temperatures mean less random generations.
- Top P - If set to float less than 1, only the smallest set of most probable tokens with probabilities that add up to top_p or higher are kept for generation.
- Top K - Can be used to reduce repetitiveness of generated tokens. The higher the value, the stronger a penalty is applied to previously present tokens, proportional to how many times they have already appeared in the prompt or prior generation.
- Maximum Length - Maximum number of tokens to generate. Responses are not guaranteed to fill up to the maximum desired length.
- Stop sequences - Up to four sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

In [3]:

prompt = "<s>[INST]What is the history behind the Napoleonic Wars? [/INST]"
body = json.dumps({
            "temperature": 0.0,
            "max_tokens": 128,
            "prompt": prompt,
            "top_p": 0.1,
            "top_k": 2,
            "stop":["</s>"]

        })
response = bedrock.invoke_model(
    body = body,
    modelId = model_id
)

response_body = json.loads(response.get("body").read())
df = pd.DataFrame(
    {"Response": [response_body['outputs'][0]['text'].replace("\n", " ")]}
    )
display_html(df.to_html(index=False), raw=True)

Response
"The Napoleonic Wars (1803-1815) were a series of global conflicts fought during the reign of Napoleon Bonaparte, Emperor of the French. Here's a brief history behind them: 1. **Rise of Napoleon**: The Napoleonic Wars are rooted in the aftermath of the French Revolution (1789-1799). Napoleon, a young and brilliant military strategist, rose to prominence during this period. He seized power in a coup d'état in 1799, becoming First Consul and later crowning himself Emperor in"


# How to create a prompt

A prompt is a natural language instruction that guides a large language model (LLM) to perform a specific task effectively. Just like providing clear guidance to a skilled assistant, well-crafted prompts are crucial for leveraging the full potential of LLMs.

An effective prompt should include clear instructions, context, examples, constraints, desired output format, and unbiased language. It may need iterative refinement based on the model's initial response. Appropriate length is also important for coherent and focused outputs.


In [4]:

# Define the prompts
query = "What is the value of the cloud?"
prompt_1 = f"<s>[INST]{query} [/INST]"
prompt_2 = f"""<s>[INST]You are an Amazon Web Services (AWS) Expert. 
Your role is to answer every technical question about AWS as accurately as possible.
If you do not know the answer to a question, say I don't know. Give statistics 
around usage in the Enterprise world.{query} [/INST]"""
prompts = [prompt_1, prompt_2]


# Iterate through the prompts and get the responses
responses = []
for prompt in prompts:
    body = json.dumps({
        "temperature": 0.5,
        "max_tokens": 128,
        "prompt": prompt,
        "stop": ["</s>"]
    })
    response = bedrock.invoke_model(
    body = body,
    modelId = model_id)
    response_body = json.loads(response.get("body").read())
    responses.append(response_body["outputs"][0]["text"].replace("\n", " "))

# Create a pandas DataFrame to display the responses side-by-side
df = pd.DataFrame({
    "Prompt w/o clear instructions and context": [responses[0]],
    "Prompt w/ clear instructions and context": [responses[1]]
})

# Display the DataFrame as HTML
display_html(df.to_html(index=False), raw=True)

Prompt w/o clear instructions and context,Prompt w/ clear instructions and context
"The value of the cloud can be assessed through various benefits it offers to both businesses and individual users. Here are some of the key values: 1. **Cost Savings**: - **CapEx to OpEx**: Cloud services allow businesses to shift from a capital expenditure model (buying and maintaining your own servers) to an operational expenditure model (paying for computing resources as you use them). - **Scalability**: You can easily scale resources up or down based on demand, ensuring you only pay for what you use. 2. **Accessibility and Flexibility**:","The value of the cloud, particularly AWS, can be summarized through several key benefits that it brings to enterprises: ### Cost Efficiency - **Pay-as-You-Go Model**: AWS allows businesses to pay only for the resources they use, eliminating the need for upfront capital expenditures. - **Economies of Scale**: AWS's massive scale allows it to achieve higher economies of scale, which translates to lower pay-as-you-go prices. ### Scalability and Elasticity - **Auto-Scaling**: AWS services like EC2 Auto"


## Prompt Engineering 
Prompt engineering techniques like zero-shot, few-shot, and chain-of-thought prompting can further enhance the quality and behavior of model outputs. Zero-shot relies on the model's general knowledge, few-shot provides examples, and chain-of-thought encourages step-by-step reasoning.

Let's take a look at the different techniques in more detail.

### Zero-Shot Prompting:
- Zero-shot prompting involves providing a pre-trained language model with a prompt it hasn't seen during training and expecting it to generate a relevant output.
- The model relies on its general language understanding and patterns learned during pre-training to produce an output, without any additional examples or fine-tuning.
- Zero-shot prompting leverages the model's ability to perform tasks it hasn't been explicitly trained on, using only the information learned from a diverse range of sources.
- It is useful in scenarios where the model does not have specific training examples for the given task.
- However, zero-shot prompting offers less control over the output compared to few-shot prompting which will be discussed next.

In [5]:
prompt = "<s>[INST]Classify the sentiment of the following text: The movie was terrible, I hated every minute of it!?[/INST]"
body = json.dumps({
            "temperature": 0.5,
            "max_tokens": 128,
            "prompt": prompt,
            "stop":["</s>"]

        })
response = bedrock.invoke_model(
    body = body,
    modelId = model_id
)

response_body = json.loads(response.get("body").read())
df = pd.DataFrame(
    {"Response": [response_body['outputs'][0]['text'].replace("\n", " ")]}
    )
display_html(df.to_html(index=False), raw=True)

Response
"The sentiment of the text ""The movie was terrible, I hated every minute of it!"" is negative. The words ""terrible"" and ""hated"" convey a strong dislike and dissatisfaction, indicating a negative sentiment."


### Few-Shot Prompting:
- Few-shot prompting involves providing a language model with a small number of examples (usually 2-5) demonstrating the desired task, along with the input prompt.
- The examples serve as a conditioning context for the model, guiding it to generate a response similar to the provided examples.
- Few-shot prompting enables the model to quickly adapt to new tasks by leveraging the patterns and structures provided in the examples.
- It is more effective than zero-shot prompting for complex tasks and offers better control over the model's output.
- The performance of few-shot prompting generally improves with larger model sizes.

In [6]:
prompt = """
    <s>[INST]follow these examples to extract the keywords from the text.

    Text: The hotel room was spacious and clean.
    Keywords: hotel room, spacious, clean
    Text: The actor's performance was unconvincing and dull. 
    Keywords: actor's performance, unconvincing, dull
    Text: The new iPhone has an excellent camera and long battery life.
    keywords:
    [/INST]
    """
body = json.dumps({
            "temperature": 0.0,
            "max_tokens": 128,
            "prompt": prompt,
            "stop":["</s>"]

        })
response = bedrock.invoke_model(
    body = body,
    modelId = model_id
)

response_body = json.loads(response.get("body").read())
df = pd.DataFrame(
    {"Response": [response_body['outputs'][0]['text'].replace("\n", " ")]}
    )
display_html(df.to_html(index=False), raw=True)

Response
1. new iPhone 2. excellent camera 3. long battery life


### Chain-of-Thought (CoT) Prompting:
- CoT prompting is a technique that encourages language models to explain their reasoning process when solving complex problems.
- It involves providing the model with a few examples that include step-by-step reasoning, demonstrating how to break down a problem into intermediate steps.
- The model is then expected to follow a similar "chain of thought" when answering the prompt, explaining its reasoning process.
- CoT prompting is particularly effective for tasks that require arithmetic reasoning, commonsense reasoning, and symbolic reasoning.
- It has been shown to improve the performance of large language models on complex reasoning tasks compared to standard prompting.
- CoT prompting is an emergent property of model scale, meaning its benefits are more pronounced in larger models

In [8]:
prompt = """ <s>[INST]Alice is twice as old as Betty was when Alice was as old as Betty is now.
If the sum of their current ages is 42, how old is Alice now? Let's solve this step-by-step. [/INST]"""
body = json.dumps({
            "temperature": 0.0,
            "max_tokens": 1024,
            "prompt": prompt,
            "stop":["</s>"]

        })
response = bedrock.invoke_model(
    body = body,
    modelId = model_id
)

response_body = json.loads(response.get("body").read())
df = pd.DataFrame(
    {"Response": [response_body['outputs'][0]['text'].replace("\n", " ")]}
    )
display_html(df.to_html(index=False), raw=True)

Response
"Let's denote Alice's current age as \( A \) and Betty's current age as \( B \). According to the problem, the sum of their current ages is 42: \[ A + B = 42 \] The problem also states that Alice is twice as old as Betty was when Alice was as old as Betty is now. Let's break this down: 1. Let \( x \) be the number of years ago when Alice was as old as Betty is now. 2. At that time, Alice's age was \( A - x \) and Betty's age was \( B - x \). 3. According to the problem, Alice's age at that time was equal to Betty's current age: \[ A - x = B \] 4. The problem also states that Alice is currently twice as old as Betty was at that time: \[ A = 2(B - x) \] Now we have two equations: 1. \( A + B = 42 \) 2. \( A = 2(B - x) \) From the first equation, we can express \( A \) in terms of \( B \): \[ A = 42 - B \] Substitute \( A = 42 - B \) into the second equation: \[ 42 - B = 2(B - x) \] We also know from the third equation that: \[ A - x = B \] Substitute \( A = 42 - B \) into this equation: \[ 42 - B - x = B \] \[ 42 - 2B = x \] Now substitute \( x = 42 - 2B \) back into the equation \( 42 - B = 2(B - x) \): \[ 42 - B = 2(B - (42 - 2B)) \] \[ 42 - B = 2(B - 42 + 2B) \] \[ 42 - B = 2(3B - 42) \] \[ 42 - B = 6B - 84 \] \[ 42 + 84 = 6B + B \] \[ 126 = 7B \] \[ B = 18 \] Now that we have \( B \), we can find \( A \): \[ A = 42 - B \] \[ A = 42 - 18 \] \[ A = 24 \] So, Alice is currently 24 years old."


### Delimiters
- Delimiters are a useful technique in prompt engineering for large language models (LLMs) that can improve response quality and accuracy.

- Mistral AI models use delimiters like `###`, `<<<>>>` to specify boundaries between different sections of text.

- Delimiters help clearly separate the instructions/task description from the actual input data.

- Delimiters prevent prompt injection attacks by treating anything inside the delimiters as input to be processed according to the original instructions, rather than new directives.

- By strictly delimiting user input, the model will not interpret the input as new instructions, enhancing security.

In [9]:
### User question
question = "I am messaging to find out what this random in flight WIFI charge is. I did not purchase anything else beside my ticket with my credit card. Could you please help understand this charge?"

### Prompt template
prompt = f"""
        <s>[INST]You are an airline customer service bot. Your task is to assess customer intent 
        and categorize customer question after <<<>>> into one of the following predefined categories:
        
        flight delay
        lost luggage
        seat upgrade
        modify reservation
        cancel reservation
        charge dispute
        
        If the text doesn't fit into any of the above categories, classify it as:
        customer service
        
        You will only respond with the predefined category. Do not include the word "Category". Do not provide explanations or notes. 
        
        ####
        Here are some examples:
        
        Question:How can I track the status of my delayed or lost luggage to ensure it will be delivered to me? I'm concerned about the process and would like to know what options are available to monitor the location of my bags. Are there any tracking systems or apps I can use to see if my luggage is on its way or has been lost? Please provide details on how airlines typically handle delayed baggage delivery and the expected timeframes for receiving my suitcase.
        Category: lost luggage
        Question: I have a reservation for a flight and would like to know if there is any additional information I can provide to make my reservation more convenient.
        Category: modify reservation
        Question: When is my flight arriving?I'm trying to catch my connection which is not runing behind. Could you please provide an update on my flight?
        Category: flight delay 
        Question: I need help with my seat upgrade. I have a seat upgrade request but I'm having trouble finding the right seat.
        Category: seat upgrade
        Question: Can I get help starting my computer? I am having difficulty starting my computer, and would appreciate your expertise in helping me troubleshoot the issue. 
        Category: customer service
        ###
    
        <<<
        Question: {question}
        >>>
        [/INST]
        """

body = json.dumps({
            "temperature": 0.0,
            "max_tokens": 512,
            "prompt": prompt,
            "stop":["</s>"]

        })
response = bedrock.invoke_model(
    body = body,
    modelId = model_id
)

response_body = json.loads(response.get("body").read())

df = pd.DataFrame(
    {"Classification": [response_body['outputs'][0]['text'].replace("\n", " ")]}
    )
display_html(df.to_html(index=False), raw=True)

Classification
charge dispute


### Instruction Templates and Stop Sequences
 You can also add an instruction template that further guides the model to generate the response that you're looking for. Instruction templates are extra helpful when dealing with chatbot systems. For the  Mistral models, you can use the following instruction template: 

`<s>[INST] Instruction [/INST] Model answer</s>[INST] Follow-up instruction [/INST]`

The model uses the `[INST]` and `[/INST]` 'tags' to identify the instructions. It's important to note that there should be a space after the closing [/INST] tag. If you don't include this space, the model will likely generate a space at the beginning of its response.

If you want to deep dive into the special tokes, please check out the [Mistral documentation](https://docs.mistral.ai/guides/tokenization/#what-is-tokenization). 


In [10]:
query = "The sky is blue."
prompt_1 = f""" {query} """
prompt_2 = f""" <s>[INST] {query} [/INST] """
prompt_3 = f""" <s>[INST] {query}[/INST] """
prompt_4 = f"""<s>[INST]{query} [/INST] No, it is red.</s> [INST] What color is the sky? [/INST] """
prompts = [prompt_1, prompt_2,prompt_3,prompt_4]

# Iterate through the prompts and get the responses
responses = []
for prompt in prompts:
    body = json.dumps({
        "temperature": 0.5,
        "max_tokens": 512,
        "prompt": prompt,
        "stop": ["</s>"]
    })
    response = bedrock.invoke_model(
    body = body,
    modelId = model_id)
    response_body = json.loads(response.get("body").read())
    responses.append(response_body["outputs"][0]["text"].replace("\n", " "))

# Create a pandas DataFrame to display the responses side-by-side
df = pd.DataFrame({
    "Response w/o instruction template (1)": [responses[0]],
    "Response w/ instruction template and w/ space (2)": [responses[1]],
    "Response w/ instruction template and w/o space (3)": [responses[2]],
    "Response w/ instruction template, w/ space, and w/ follow up instructions (4)": [responses[3]]
})

# Display the DataFrame as HTML
display_html(df.to_html(index=False), raw=True)

Response w/o instruction template (1),Response w/ instruction template and w/ space (2),Response w/ instruction template and w/o space (3),"Response w/ instruction template, w/ space, and w/ follow up instructions (4)"
"2. The sky is not blue. Comment: @SandeepGusain, I'm not sure what you're trying to say. Comment: @SandeepGusain I think you are trying to say that the first statement is not necessarily true, so the second statement is not necessarily false. I would say that the first statement is necessarily true, and the second statement is necessarily false. Comment: @SandeepGusain, I'm not sure what you are trying to say. ## Answer (1) The first statement is necessarily true, and the second statement is necessarily false. Comment: I don't know why you were downvoted. This is the correct answer. Comment: @JoeZeng, I don't know either. Comment: @JoeZeng, I was downvoted because of the comment that I made. Comment: @JoeZeng, I was downvoted because of the comment that I made. Comment: @JoeZeng, I was downvoted because of the comment that I made. Comment: @JoeZeng, I was downvoted because of the comment that I made. Comment: @JoeZeng, I was downvoted because of the comment that I made. Comment: @JoeZeng, I was downvoted because of the comment that I made. Comment: @JoeZeng, I was downvoted because of the comment that I made. Comment: @JoeZeng, I was downvoted because of the comment that I made. Comment: @JoeZeng, I was downvoted because of the comment that I made. Comment: @JoeZeng, I was downvoted because of the comment that I made. Comment: @JoeZeng, I was downvoted because of the comment that I made. Comment: @JoeZeng, I was downvoted because of the comment that I made. Comment: @JoeZeng, I was downvoted because of the comment that I made. Comment: @JoeZeng, I was downvoted because of the comment that I made. Comment: @JoeZeng, I was downvoted","🌟 That's correct! The sky appears blue due to a process called Rayleigh scattering, where the Earth's atmosphere scatters more blue sunlight towards our eyes than any other color. Isn't nature fascinating?","1. The problem states: ""The sky is blue."" 2. This is a declarative statement that asserts a fact. 3. The statement does not contain any mathematical content to solve or prove. 4. Therefore, there is no mathematical solution to provide. Conclusion: The statement ""The sky is blue"" is a factual statement about the color of the sky.","🌈 The color of the sky can vary: * Daytime: Blue (due to a process called Rayleigh scattering) * Sunrise/Sunset: Shades of red, orange, or pink * Nighttime: Black or deep blue (with stars, planets, and other celestial bodies visible)"


The responses in column #1 suggest that the model may have struggled to understand the intended task or instruction, resulting in a response that, while potentially honest and harmless, lacked helpfulness or relevance. This could be due to the model's limitations in recognizing when to stop generating output.

In contrast, the response in column #2 appears to be helpful, harmless, and honest. The model seems to have correctly interpreted the statement as an instruction, prompting it to provide a justification for the given statement.

The difference in column #3 is primarily in the formatting, as the response includes an empty space at the beginning, which is an artifact of the model's output.

Lastly, in column #4, the model appears to have struggled with the instruction and response, along with the follow-up. As a result, it grounded its knowledge by referencing a source labeled as "Common Knowledge," potentially to provide additional context or support for its response.

These examples illustrate how the use of instruction templates and formatting can guide and customize the language model's responses. Explicit instructions can steer the model towards specific tasks, such as question-answering, while the inclusion of spaces and follow-ups can enable more natural conversational flows.

# Common Prompting Issues and How to Overcome them 
When writing prompts for Large Language Models, there are common pitfalls one may encounter that can impact the quality and accuracy of the model's outputs. Effective prompt engineering is crucial to mitigate these issues and unlock the full potential of LLMs. Some of the most prevalent challenges include:

Hallucination and factual inaccuracies: LLMs may generate outputs that contain false or made-up information, particularly when asked about topics outside their training data.

Lack of coherence and logical reasoning: The generated text may suffer from logical inconsistencies, contradictions, or lack a coherent structure, especially in complex, multi-step tasks.

Difficulty with complex, multi-step tasks: LLMs may struggle to maintain context and generate consistent outputs when faced with intricate, multi-step prompts that require reasoning and problem-solving abilities.

Misunderstanding user intent: The model may misinterpret the user's intended meaning or goal, leading to irrelevant or off-topic responses.



### 1. Hallucination and factual inaccuracies

Hallucination and factual inaccuracies are an issue that can arise when LLM produce text outputs. Since these models are trained on vast amounts of data from the internet and other sources, their knowledge can sometimes be incomplete, biased, or simply incorrect.

Solution: Provide clear instructions in the prompt and add any additional context.  In the example below, the context variable acts as the factual information to ground the model's response.

In [11]:

# Define the prompts
query = "What is project Cobra?"
context = """
Project Cobra aims to develop a method of interstellar travel by creating and manipulating closed timelike curves (CTCs) - paths through spacetime that loop back on themselves, allowing travel to the past. The theoretical basis comes from solutions to Einstein's equations of general relativity that permit CTCs in unusual spacetime geometries like wormholes.

To achieve this, Project Cobra plans to generate traversable wormholes stabilized with exotic matter, then accelerate one wormhole mouth to near lightspeed to induce CTCs inside. This would allow a spacecraft to travel through the wormhole and exit in its own past, appearing to an outside observer to have moved faster than light.

Major challenges include creating and stabilizing the wormholes, accelerating them to form the time machine, navigating through them precisely, and avoiding paradoxes from changing the past.

If successful, it could revolutionize interstellar travel by providing shortcuts through spacetime. However, it also carries huge risks of misuse or unintended consequences that would require stringent protocols and safety measures.

In essence, it is an extremely ambitious endeavor aiming to translate predictions of general relativity into functional time travel technology for reaching distant stars, albeit with immense theoretical and engineering hurdles to overcome."""

prompt_1 = f"<s>[INST]{query} [/INST]"
prompt_2 = f"""<s>[INST]Using the conext given below, answer the question: {query}
                Provided context: {context} [/INST]"""
prompts = [prompt_1, prompt_2]

# Iterate through the prompts and get the responses
responses = []
for prompt in prompts:
    body = json.dumps({
        "temperature": 0.5,
        "max_tokens": 128,
        "prompt": prompt,
        "stop": ["</s>"]
    })
    response = bedrock.invoke_model(
    body = body,
    modelId = model_id)
    response_body = json.loads(response.get("body").read())
    responses.append(response_body["outputs"][0]["text"].replace("\n", " "))

# Create a pandas DataFrame to display the responses side-by-side
df = pd.DataFrame({
    "Response w/o context": [responses[0]],
    "Response with context": [responses[1]]
})

# Display the DataFrame as HTML
display_html(df.to_html(index=False), raw=True)

Response w/o context,Response with context
"""Project Cobra"" can refer to several different initiatives depending on the context. Here are a few possibilities: 1. **Military Operations**: In military contexts, ""Project Cobra"" might refer to specific operations or initiatives. For example, Operation Cobra was a World War II offensive by the United States during the Normandy campaign. 2. **Technology Projects**: In the tech industry, ""Project Cobra"" could be a codename for a new product, software, or research initiative. Companies often use codenames to keep projects confidential until they are ready for public announcement.","Project Cobra is an ambitious endeavor focused on developing a method of interstellar travel by creating and manipulating closed timelike curves (CTCs). These CTCs are paths through spacetime that loop back on themselves, enabling travel to the past. The project aims to achieve this by generating traversable wormholes stabilized with exotic matter and then accelerating one mouth of the wormhole to near the speed of light to induce CTCs inside. This would allow a spacecraft to travel through the wormhole and emerge in its own past, effectively appearing to an outside observer to have moved faster than light"


### 2. How to improve coherence in responses 
Sometimes the response the model gives doesn't make sense or follow a clear logic. This happens because the model is not told to explain its thinking step-by-step. When asked a simple question, the model tries to give a direct answer without showing how it got there. This can lead to confusing or illogical responses.

Solution: Use chain-of-thought prompting, where the model is prompted to break down its reasoning into a series of logical steps before providing a final answer. This can significantly improve performance on tasks requiring multi-step reasoning.

In [12]:


# Define the prompts
query = """A classroom has two blue chairs for every three red chairs.
 If there are a total of 30 chairs in the classroom, how many blue chairs are there?"""

prompt_1 = f"<s>[INST]{query} [/INST]"
prompt_2 = f"""<s>[INST]Think-step-by-step and answer the following question: {query} [/INST]"""
prompts = [prompt_1, prompt_2]

# Iterate through the prompts and get the responses
responses = []
for prompt in prompts:
    body = json.dumps({
        "temperature": 0.0,
        "max_tokens": 512,
        "prompt": prompt,
        "stop": ["</s>"]
    })
    response = bedrock.invoke_model(
    body = body,
    modelId = model_id)
    response_body = json.loads(response.get("body").read())
    responses.append(response_body["outputs"][0]["text"].replace("\n", " "))

# Create a pandas DataFrame to display the responses side-by-side
df = pd.DataFrame({
    "Response w/o CoT": [responses[0]],
    "Response w/ CoT": [responses[1]]
})

# Display the DataFrame as HTML
display_html(df.to_html(index=False), raw=True)

Response w/o CoT,Response w/ CoT
"Let's denote the number of blue chairs as B and the number of red chairs as R. According to the problem, the ratio of blue chairs to red chairs is 2:3. This means that for every 2 blue chairs, there are 3 red chairs. We can express this relationship as: B/R = 2/3 We also know that the total number of chairs is 30: B + R = 30 Now, we can solve these equations to find the number of blue chairs. From the ratio B/R = 2/3, we can express B in terms of R: B = (2/3) * R Substitute this expression for B into the total number of chairs equation: (2/3) * R + R = 30 To combine the terms, we need a common denominator, which is 3: (2/3) * R + (3/3) * R = 30 (2R + 3R) / 3 = 30 5R / 3 = 30 Now, multiply both sides by 3 to get rid of the denominator: 5R = 90 Divide both sides by 5 to solve for R: R = 90 / 5 R = 18 Now that we know there are 18 red chairs, we can find the number of blue chairs using the ratio B/R = 2/3: B = (2/3) * R B = (2/3) * 18 B = 12 So, there are 12 blue chairs in the classroom.","Let's think through this step-by-step: 1. We know that for every group of 5 chairs (2 blue + 3 red), the ratio of blue to red chairs is maintained. 2. We have a total of 30 chairs in the classroom. 3. To find out how many groups of 5 chairs there are, we divide the total number of chairs by the size of one group: 30 chairs / 5 chairs/group = 6 groups. 4. Since there are 2 blue chairs in each group, we can find the total number of blue chairs by multiplying the number of groups by the number of blue chairs per group: 6 groups * 2 blue chairs/group = 12 blue chairs. So, there are 12 blue chairs in the classroom."


### 3. How to tackle complex, multi-step tasks
When faced with complex, multi-step tasks, prompting can be a significant challenge. The issue lies in the model's ability to comprehend and tackle tasks that require multiple steps, conditional logic, and critical thinking. Without proper guidance, the model may struggle to identify the necessary steps, prioritize tasks, or even understand the context of the problem. This can lead to incomplete, inaccurate, or irrelevant responses. Furthermore, complex tasks often require a deep understanding of the problem domain, making it difficult for the model to generate a coherent and logical solution.

Solution: Decompose the complex task into a sequence of simpler sub-tasks in the prompt. Guide the model to solve each sub-task step-by-step. Least-to-most prompting, where sub-problems are solved in order of increasing difficulty, can help.

In [13]:
# Define the prompts
query = "The task is to plan a 3-course meal for 6 people."
prompt = f"""<s>[INST]{query} Do not add any preamble.Using the following subtasks:
            Step 1: Choose recipes for appetizer, main course, and dessert
            Step 2: Make a grocery list of all required ingredients 
            Step 3: Determine cooking utensils, pots/pans, bakeware needed
            Step 4: Come up with a timeline to prep each course [/INST]"""
prompts = [prompt]


# Iterate through the prompts and get the responses
responses = []
for prompt in prompts:
    body = json.dumps({
        "temperature": 0.0,
        "max_tokens": 1024,
        "prompt": prompt,
        "stop": ["</s>"]
    })
    response = bedrock.invoke_model(
    body = body,
    modelId = model_id)
    response_body = json.loads(response.get("body").read())
    responses.append(response_body["outputs"][0]["text"].replace("\n", " "))

# Create a pandas DataFrame to display the responses side-by-side
df = pd.DataFrame({
    "Response": [responses[0]]
})

# Display the DataFrame as HTML
display_html(df.to_html(index=False), raw=True)

Response
"### Step 1: Choose Recipes **Appetizer:** Caprese Skewers - Ingredients: Cherry tomatoes, fresh mozzarella balls, fresh basil leaves, balsamic glaze, olive oil, salt, pepper **Main Course:** Chicken Parmesan - Ingredients: 6 boneless, skinless chicken breasts, 2 cups breadcrumbs, 1 cup grated Parmesan cheese, 2 eggs, 1 cup all-purpose flour, 1 cup marinara sauce, 2 cups shredded mozzarella cheese, 1/2 cup olive oil, salt, pepper, garlic powder **Dessert:** Tiramisu - Ingredients: 2 cups mascarpone cheese, 1/2 cup granulated sugar, 1 tsp vanilla extract, 1 cup heavy cream, 2 cups strong brewed coffee, 1/4 cup rum, 24 ladyfingers, cocoa powder for dusting ### Step 2: Grocery List **Produce:** - Cherry tomatoes - Fresh basil leaves - Garlic (for garlic powder) **Dairy:** - Fresh mozzarella balls - Mascarpone cheese - Heavy cream - Eggs **Pantry:** - Balsamic glaze - Olive oil - Salt - Pepper - Breadcrumbs - Grated Parmesan cheese - All-purpose flour - Marinara sauce - Shredded mozzarella cheese - Granulated sugar - Vanilla extract - Strong brewed coffee - Rum - Ladyfingers - Cocoa powder **Meat:** - 6 boneless, skinless chicken breasts ### Step 3: Cooking Utensils, Pots/Pans, Bakeware Needed **Appetizer:** - Skewers - Small bowl for balsamic glaze - Cutting board - Knife **Main Course:** - Large skillet - 3 shallow bowls (for breading station) - Baking dish - Measuring cups and spoons - Whisk **Dessert:** - Mixing bowls - Electric mixer - 9x13-inch baking dish - Spatula - Fine-mesh sieve (for dusting cocoa powder) ### Step 4: Timeline to Prep Each Course **1 hour before serving:** - Prepare Caprese Skewers: Assemble skewers with cherry tomatoes, mozzarella, and basil. Drizzle with balsamic glaze and olive oil. Season with salt and pepper. Refrigerate until serving. **1.5 hours before serving:** - Prepare Chicken Parmesan: - Preheat oven to 450°F (230°C). - Set up breading station with flour, beaten eggs, and breadcrumbs mixed with Parmesan cheese. - Coat chicken breasts in flour, dip in eggs, and coat with breadcrumb mixture. - Heat olive oil in a large skillet and cook chicken until golden brown on both sides. - Transfer chicken to a baking dish, top with marinara sauce and shredded mozzarella. - Bake for 20-25 minutes until chicken is cooked through and cheese is melted. **2 hours before serving:** - Prepare Tiramisu: - Brew strong coffee and let it cool. - Mix mascarpone cheese, sugar, and vanilla extract until smooth. - Whip heavy cream until stiff peaks form and fold into mascarpone mixture. - Dip ladyfingers in coffee and rum mixture and arrange in a baking dish. - Spread half of the mascarpone mixture over the ladyfingers. - Repeat with another layer of ladyfingers and mascarpone mixture. - Dust with cocoa powder and refrigerate until serving. **Serving:** - Serve Caprese Skewers as an appetizer. - Serve Chicken Parmesan as the main course. - Serve Tiramisu for dessert."


### 4. Misunderstanding user intent

Misunderstanding user intent is a common issue in prompting, where the model fails to grasp the user's underlying goals, needs, or requirements. This can lead to responses that are off-topic, irrelevant, or unhelpful. The root cause of this issue lies in the model's inability to understand the context and nuances of human communication, such as implied meaning, tone, and intent. Without clear guidance, the model may misinterpret the user's query, leading to a mismatch between the user's expectations and the model's response.

Solution: Provide clear context and instructions in the prompt to guide the model. Use role prompting to define the model's persona and expertise to better address the user's needs.

In [14]:
# Define the prompts
prompt_1 = f"""<s>[INST]Do not give a preamble.You are a nutritonist. The audience is adults looking to improve their eating habits.
 Provide a 3-4 sentence paragraph clearly explaining 2-3 key principles of healthy eating that are backed by current nutritional research. {query} [/INST]"""
prompts = [prompt_1]

# Iterate through the prompts and get the responses
responses = []
for prompt in prompts:
    body = json.dumps({
        "temperature": 0.5,
        "max_tokens": 512,
        "prompt": prompt,
        "stop": ["</s>"]
    })
    response = bedrock.invoke_model(
    body = body,
    modelId = model_id)
    response_body = json.loads(response.get("body").read())
    responses.append(response_body["outputs"][0]["text"].replace("\n", " "))

# Create a pandas DataFrame to display the responses side-by-side
df = pd.DataFrame({
    "Response": [responses[0]]
})

# Display the DataFrame as HTML
display_html(df.to_html(index=False), raw=True)

Response
"To plan a healthy 3-course meal for six people, remember these key principles: balance your macronutrients (carbohydrates, proteins, and fats) in each course, emphasize a variety of colorful fruits and vegetables for essential micronutrients, and opt for whole, unprocessed foods to maximize nutrient density. Current research supports the Mediterranean diet, which includes plenty of plant-based foods, lean proteins, and healthy fats, like olive oil and avocados. Additionally, portion control is crucial, so consider the recommended serving sizes to avoid overeating."




# Mistral Large 2 vs Mixtral 8x7B vs Mistral Small

## Mistral Large 2
Mistral Large 2 is Mistral AI's latest large language model, designed for complex tasks requiring advanced reasoning, natural language understanding, and knowledge retrieval across multiple languages and domains.

**Capabilities:**
- Proficiency in 80+ coding languages, including Python, Java, C, C++, JavaScript, and Bash
- Enhanced reasoning and problem-solving skills with reduced hallucination tendencies
- Improved instruction-following and conversational abilities
- Advanced function calling and retrieval skills for complex business applications

**Performance:**
- Achieves 84.0% accuracy on MMLU benchmark (pretrained version)
- Outperforms previous Mistral Large model in code generation and reasoning tasks
- Excels in multilingual performance, particularly in English, French, German, Spanish, Italian, Portuguese, Dutch,  Russian, Chinese, Japanese, Korean, Arabic, and Hindi

**Applications:**
- Code generation and software development
- Complex reasoning and problem-solving tasks
- Multilingual document processing and analysis

## Mixtral 8x7B

Mixtral 8x7B is a mixture-of-experts (MoE) model, designed for efficient and fast processing of specific tasks and domains.

**Capabilities:**
- Efficient and fast processing for real-time applications
- Specialized for specific tasks and domains (e.g., customer support, language translation)
- Handling simple to moderately complex prompts
- Basic NLU and contextual understanding

**Performance:**
- Fast and efficient processing for real-time applications
- Good performance in specific tasks and domains
- Moderate accuracy in basic NLU and contextual understanding

**Applications:**
- Customer support and chatbots
- Language translation and localization
- Data analysis and processing

## Mistral Small

Mistral Small is an optimized model designed for low-latency workloads, striking a balance between performance and efficiency.

**Capabilities:**
- Efficient processing for real-time applications
- Handling simple to moderately complex prompts
- Basic NLU and contextual understanding
- Suitable for high-volume, low-complexity tasks

**Performance:**
- Fast and efficient processing for real-time applications
- Good performance in simple to moderate tasks
- Lower accuracy compared to Mistral Large and Mixtral 8x7B

**Applications:**
- Customer support and chatbots
- Text classification and sentiment analysis
- Data extraction and information retrieval

### Key Differences

**Inference Speed:** Mixtral 8x7B and Mistral Small are significantly faster than Mistral Large during inference, making them suitable for real-time applications.

**Prompting:** Mistral Large can handle more abstract and open-ended prompts, while Mixtral 8x7B and Mistral Small require more specific and guided prompting.

**Error Handling:** Mistral Large is more robust to errors and can recover from mistakes more effectively.

To summarize, Mistral Large is the most powerful and capable model, suitable for complex tasks and advanced applications, while Mixtral 8x7B and Mistral Small are optimized for efficient processing of specific tasks and real-time applications, respectively.



## Example
The following prompts demonstrate the contrasting capabilities of the Mixtral 8x7b and Mistral Large models in terms of critical thinking and multilingual proficiency. While the analytical responses from Mixtral 8x7b are not inaccurate, they lack the depth and thoroughness exhibited by Mistral Large. Regarding the multilingual examples, Mistral Large's translations consider the contextual nuances rather than adhering to a literal word-for-word approach. This contextual awareness allows Mistral Large's translations to convey the intended meaning more precisely and coherently

In [15]:
# Define the prompts
prompt_1 = """<s>[INST]Translate the following text to Spanish, French and German:
             "The quick brown fox jumps over the lazy dog. [/INST]"""
prompts = [prompt_1]
model_ids = [MIXTRAL,
              DEFAULT_MODEL,
              # MISTRAL_SMALL
             ]

# Iterate through the prompts and get the responses
responses = []
for prompt in prompts:
    for model_id in model_ids:
        body = json.dumps({
            "temperature": 0.5,
            "max_tokens": 1024,
            "prompt": prompt,
            "stop": ["</s>"]
        })
        response = bedrock.invoke_model(
        body = body,
        modelId = model_id)
        response_body = json.loads(response.get("body").read())
        responses.append(response_body["outputs"][0]["text"].replace("\n", " "))

# Create a pandas DataFrame to display the responses side-by-side
df = pd.DataFrame({
    "Mixtral 8x7b Multilingual Prompt": [responses[0]],
    "Mistral Large Multilinigual Prompt": [responses[1]],
    #"Mistral Small Multilinigual Prompt": [responses[1]],

})

# Display the DataFrame as HTML
display_html(df.to_html(index=False), raw=True)

Mixtral 8x7b Multilingual Prompt,Mistral Large Multilinigual Prompt
"Spanish: ""El zorro pardo ágil salta sobre el perro perezoso."" French: ""Le renard brun rapide saute par-dessus le chien paresseux."" German: ""Der schnelle braune Fuchs springt über den faulen Hund.""","Certainly! Here are the translations: **Spanish:** ""El veloz zorro marrón salta sobre el perro perezoso."" **French:** ""Le rapide renard brun saute par-dessus le chien paresseux."" **German:** ""Der schnelle braune Fuchs springt über den faulen Hund."""


In [10]:
# Define the prompts

prompt_1 = """<s>[INST]Evaluate this Python function for correctness and efficiency:
            def fibonacci(n):
            if n <= 0:
            return 0
            elif n == 1:
            return 1
            else:
            return fibonacci(n-1) + fibonacci(n-2) [/INST]"""
prompts = [prompt_1]
model_ids = [MIXTRAL,
            DEFAULT_MODEL,
            # MISTRAL_SMALL
                ]

# Iterate through the prompts and get the responses
responses = []
for prompt in prompts:
    for model_id in model_ids:
        body = json.dumps({
            "temperature": 0.5,
            "max_tokens": 514,
            "prompt": prompt,
            "stop": ["</s>"]
        })
        response = bedrock.invoke_model(
        body = body,
        modelId = model_id)
        response_body = json.loads(response.get("body").read())
        responses.append(response_body["outputs"][0]["text"].replace("\n", " "))

# Create a pandas DataFrame to display the responses side-by-side
df = pd.DataFrame({
    "Mixtral 8x7b Analytical Prompt": [responses[0]],
    "Mistral Large Analytical Prompt": [responses[1]],
    # "Mistral Small Analytical Prompt": [responses[2]]
    

})

# Display the DataFrame as HTML
display_html(df.to_html(index=False), raw=True)

Mixtral 8x7b Analytical Prompt,Mistral Large Analytical Prompt
"Correctness: The function is not correct because it does not handle the case when n is 2, returning 0 instead of 1. Also, for large values of n, the function will exceed the maximum recursion depth due to multiple nested calls. Efficiency: The function's efficiency is very poor due to the repeated computation of Fibonacci numbers. For example, when calculating fibonacci(5), the function calls fibonacci(4) and fibonacci(3), which in turn call fibonacci(3) and fibonacci(2), and so on. This leads to an exponential time complexity. To improve the efficiency, we can use dynamic programming to store and reuse the calculated Fibonacci numbers. Here's an optimized version of the function: def fibonacci(n, memo={}): if n <= 0: return 0 elif n == 1: return 1 elif n in memo: return memo[n] else: result = fibonacci(n-1, memo) + fibonacci(n-2, memo) memo[n] = result return result This version has a time complexity of O(n) and a space complexity of O(n), making it much more efficient than the original implementation.","The function is correct in that it calculates the Fibonacci sequence. However, it's not efficient because it uses recursion to calculate each number in the sequence, which results in a lot of redundant calculations. For example, to calculate the 5th number in the sequence, it first calculates the 4th and 3rd numbers, but to calculate the 4th number, it already calculated the 3rd number. This redundancy leads to an exponential time complexity of O(2^n). A more efficient approach would be to use dynamic programming or iteration to calculate each number in the sequence only once."


# Function Calling with Mistral AI Models and the Converse API

In this section, we will explore how to leverage the powerful function calling capability of Mistral AI models using Amazon Bedrock's Converse API. Function calling allows you to integrate external tools and APIs with the AI models, enabling them to perform a wide variety of tasks and access real-time information.

By combining the advanced natural language understanding of Mistral AI models with the ability to invoke custom functions, you can build intelligent applications that can handle complex queries and provide accurate responses. The [Converse API](https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference.html) provides a consistent and simplified interface to interact with Mistral models, making it easier to manage conversations and function calls.
We will dive into practical code examples that demonstrate how to set up function calling with Mistral models using the Converse API. You will learn how to define custom functions, register them with the AI models, and handle the function invocation flow seamlessly.


In [16]:

class StudioNotFoundError(Exception):
    """Raised when a movie studio isn't found."""
    pass
logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)
# Creating the actual tool that returns mock data
def get_top_movie(studio):
    """Returns the highest grossing movie for the requested studio.
    Args:
        studio (str): The name of the movie studio for which you want
        the highest grossing movie.

    Returns:
        response (json): The highest grossing movie and its gross.
    """

    movie = ""
    gross = ""
    if studio == 'Paramount':
        movie = "Top Gun: Maverick"
        gross = "$718,732,821"

    else:
        return "ERROR: Studio not found."

    return movie, gross

In this case, we created a mock function that simulates the behavior of an external tool or API called `get_top_movies`. This function can be invoked by the model when it determines that it needs to retrieve information about the top movies. Note that in a real-world scenario, this tool could be implemented as a Lambda function or any other external service.

In [17]:
# Create a function to facilitate the Converse API Call
def generate_text(bedrock_client, model_id, tool_config, input_text):
    """Generates text using the supplied Amazon Bedrock model. If necessary,
    the function handles tool use requests and sends the result to the model.
    Args:
        bedrock_client: The Boto3 Bedrock runtime client.
        model_id (str): The Amazon Bedrock model ID.
        tool_config (dict): The tool configuration.
        input_text (str): The input text.
    Returns:
        Nothing.
    """

    logger.info("Generating text with model %s", model_id)

   # Create the initial message from the user input.
    messages = [{
        "role": "user",
        "content": [{"text": input_text}]
    }]

    response = bedrock_client.converse(
        modelId=model_id,
        messages=messages,
        toolConfig=tool_config
    )

    output_message = response['output']['message']
    messages.append(output_message)
    stop_reason = response['stopReason']

    if stop_reason == 'tool_use':
        # Tool use requested. Call the tool and send the result to the model.
        tool_requests = response['output']['message']['content']
        for tool_request in tool_requests:
            if 'toolUse' in tool_request:
                tool = tool_request['toolUse']
                logger.info("Requesting tool %s. Request: %s",
                            tool['name'], tool['toolUseId'])

                if tool['name'] == 'top_movie':
                    tool_result = {}
                    try:
                        movie, gross = get_top_movie(tool['input']['studio'])
                        tool_result = {
                            "toolUseId": tool['toolUseId'],
                            "content": [{"json": {"movie": movie, "gross": gross}}]
                        }
                    except StudioNotFoundError as err:
                        tool_result = {
                            "toolUseId": tool['toolUseId'],
                            "content": [{"text":  err.args[0]}],
                            "status": 'error'
                        }

                    tool_result_message = {
                        "role": "user",
                        "content": [
                            {
                                "toolResult": tool_result

                            }
                        ]
                    }
                    messages.append(tool_result_message)

                    # Send the tool result to the model.
                    response = bedrock_client.converse(
                        modelId=model_id,
                        messages=messages,
                        toolConfig=tool_config
                    )
                    output_message = response['output']['message']

    # print the final response from the model.
    for content in output_message['content']:
        print(f"{content['text']}")

Below, you'll find a section where you need to provide a detailed description of your tool. This description should include information that the model can use to determine when to invoke or call your tool. It's essential to provide clear guidelines and required inputs to help the model understand the appropriate scenarios for utilizing your tool effectively.

To ensure you provide a comprehensive and accurate description, you can refer to the official documentation located at https://docs.aws.amazon.com/bedrock/latest/userguide/tool-use.html. This documentation offers detailed guidance on how to describe your tool, including best practices, formatting guidelines, and examples.

In [18]:
# Using Mistral Large
model_id = DEFAULT_MODEL
input_text = "What is the highest grossing movie from Paramount?"

tool_config = {
    "tools": [
        {
            "toolSpec": {
                "name": "top_movie",
                "description": "Get the highest grossing movie for a given movie studio.",
                "inputSchema": {
                    "json": {
                        "type": "object",
                        "properties": {
                            "studio": {
                                "type": "string",
                                "description": "The name of the movie studio for which you want the highest grossing movie. Example studios are Paramount, Warner Bros., and Universal."
                            }
                        },
                        "required": [
                            "studio"
                        ]
                    }
                }
            }
        }
    ]
}

print(f"Question: {input_text}")
generate_text(bedrock, model_id, tool_config, input_text)

INFO:__main__:Generating text with model mistral.mistral-large-2407-v1:0


Question: What is the highest grossing movie from Paramount?


INFO:__main__:Requesting tool top_movie. Request: tooluse_pqvT0rvxR06nD9sPldIueA


The highest grossing movie from Paramount is Top Gun: Maverick, which grossed $718,732,821.


When the Mistral model is prompted with the question "What is the highest grossing movie from Paramount?", it recognizes that it needs to use the top_movie tool to answer this query.

The model then invokes the top_movie tool, which queries the right function tofind the highest-grossing Paramount movie. The tool returns the result "Top Gun: Maverick with a gross of $718,732,821", which the Mistral model incorporates into its final response.

This output demonstrates the ability of Bedrocke and the Mistral Large 2 model to leverage external tools and data sources to enhance their capabilities and provide more accurate and up-to-date information in their responses.

## Conclusion
In this notebook, we discussed the common pitfalls that can arise when dealing with prompts and large language models (LLMs). These pitfalls include hallucination and factual inaccuracies, lack of coherence and logical reasoning in generated text, difficulty with complex multi-step tasks, and misunderstanding user intent.

We then explored how to create strong prompts and the importance of prompt engineering techniques to mitigate these issues. The key prompt engineering techniques covered were zero-shot prompting, few-shot prompting, and chain-of-thought (CoT) prompting.

Zero-shot prompting relies solely on the model's pre-existing knowledge without any additional examples. Few-shot prompting provides a small number of examples to guide the model's outputs. CoT prompting encourages the model to explain its reasoning process step-by-step, which is particularly useful for complex reasoning tasks.

We discussed the differences between the different Mistral AI models and their respective use cases. While Mixtral 8x7b can provide analytical responses, Mistral Large 2 offers more thorough and contextually aware outputs, especially for multilingual tasks. We also introduced the Function Calling capabaility with the Response API. 
