# Mistral Large 2 (24.07) Deeper Dive

Mistral Large 2 is an advanced large language model that, according to Mistral, offers state-of-the-art reasoning, knowledge, and coding capabilities. It is designed for single-node inference with long-context applications in mind: at 123 billion parameters, it can run at high throughput on a single node.

Mistral also devoted significant effort to the model's reasoning capabilities. A key focus during training was to minimize the model's tendency to "hallucinate", that is, to generate plausible-sounding but factually incorrect or irrelevant information. To this end, the model was fine-tuned to be more cautious and discerning in its responses, with the aim of producing more reliable and accurate outputs. It is also trained to acknowledge when it cannot find a solution or lacks sufficient information to give a confident answer.

According to Mistral, Mistral Large 2 sets a new frontier in performance relative to cost of serving on evaluation metrics. In particular, the pretrained version achieves an accuracy of 84.0% on MMLU, setting a new point on the performance/cost Pareto front of open models.

Mistral Large 2 is generally available on Amazon Bedrock; see the blog post for more information.

In this notebook we demonstrate Python code generation, language translation and assessment, and tool use.

## Model Card

Available regions: us-west-2

Model ID: mistral.mistral-large-2407-v1:0

Context window: 128k tokens

In [2]:
import boto3
import json

In [3]:
bedrock_client = boto3.client(service_name='bedrock-runtime', region_name="us-west-2")

In [4]:
mistral_large_2 = 'mistral.mistral-large-2407-v1:0'
mistral_large_1 = 'mistral.mistral-large-2402-v1:0'

## Python Generation 

In [5]:
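def converse(prompt, model_id=mistral_large_2, max_tokens=3000, temperature=0.1, top_p=0.9):
    """Send a single text prompt to a Mistral model and return the generated text.

    Despite its name, this helper calls `invoke_model`, not the Converse API.
    """
    # Request body in the Mistral text-completion format
    body = json.dumps({
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p
    })

    response = bedrock_client.invoke_model(
        modelId=model_id,
        body=body
    )

    # The body is a stream; the completion is in outputs[0]["text"]
    response_body = json.loads(response.get('body').read())
    outputs = response_body.get('outputs') or [{}]  # guard against a missing or empty list
    generated_text = outputs[0].get('text', '')
    return generated_text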
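
The helper above sends the prompt as-is and returns only the text. As a minimal sketch, not part of the original notebook, the two functions below show how the request body and response could be handled more defensively. The names `build_mistral_body` and `parse_mistral_response` are hypothetical, and the `<s>[INST] ... [/INST]` wrapper and the `stop_reason` value `"length"` are assumptions based on the prompt and response format documented for Mistral text models on Bedrock; check them against the current documentation before relying on them.

```python
import json


def build_mistral_body(prompt, max_tokens=3000, temperature=0.1, top_p=0.9):
    """Return a JSON request body with the prompt wrapped in instruction tags."""
    # Assumption: instruction-tuned Mistral models expect [INST] ... [/INST] tags
    wrapped = f"<s>[INST] {prompt.strip()} [/INST]"
    return json.dumps({
        "prompt": wrapped,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p,
    })


def parse_mistral_response(response_body):
    """Return (text, truncated) from an already-decoded response body."""
    outputs = response_body.get("outputs") or [{}]
    first = outputs[0]
    text = first.get("text", "")
    # Assumption: stop_reason == "length" means max_tokens cut the completion short
    truncated = first.get("stop_reason") == "length"
    return text, truncated


# Offline example using a hand-written response, so no AWS call is made
example_body = json.loads(build_mistral_body("  Say hello.  ", max_tokens=16))
example_text, example_truncated = parse_mistral_response(
    {"outputs": [{"text": "Hello!", "stop_reason": "stop"}]}
)
```

Checking the truncation flag is useful for long generations such as the script requested in the next cell, where a low `max_tokens` value could end the completion partway through.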

In [6]:
prompt = """

Create a Python script for a Jupyter notebook that fetches stock market data for NVIDIA (NVDA) and Apple (AAPL) and performs analysis with interactive visualizations. The script should:

Use the yfinance library to fetch stock data for NVDA and AAPL for the last 6 months.
Implement data analysis functions to:

Calculate daily returns
Compute 20-day moving average


Create interactive visualizations using Plotly, including:

A line chart comparing both stocks' performance and their 20-day moving averages
A bar chart showing daily returns for the selected stock


Use ipywidgets to create:

A dropdown menu to select which stock to display in the bar chart
A button to refresh the data


Include error handling for API calls and data processing.
Add brief comments explaining key parts of the code.

The final output should be a single Python script that can be run in a Jupyter notebook cell, creating an interactive dashboard for stock analysis of NVDA and AAPL.
Ensure all necessary libraries are imported at the beginning of the script. The code should be well-structured and follow Python best practices.
Note: Real-time updating is not required; fetching data once when the script runs and on button click is sufficient.
"""

generated_text = converse(prompt)
print(generated_text)


```python
# Import necessary libraries
import yfinance as yf
import pandas as pd
import plotly.graph_objs as go
from plotly.subplots import make_subplots
import ipywidgets as widgets
from IPython.display import display

# Function to fetch stock data
def fetch_stock_data(ticker, period='6mo'):
    try:
        stock_data = yf.download(ticker, period=period)
        return stock_data
    except Exception as e:
        print(f"Error fetching data for {ticker}: {e}")
        return pd.DataFrame()

# Function to calculate daily returns
def calculate_daily_returns(stock_data):
    stock_data['Daily Return'] = stock_data['Adj Close'].pct_change()
    return stock_data

# Function to compute 20-day moving average
def compute_moving_average(stock_data, window=20):
    stock_data[f'{window}-Day MA'] = stock_data['Adj Close'].rolling(window=window).mean()
    return stock_data

# Fetch data for NVDA and AAPL
nvda_data = fetch_stock_data('NVDA')
aapl_data = fetch_stock_data('AAPL')

# Calculate
```

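The full Python script generated by Large 2 is reproduced below; alternatively, run the cell above to get the same or a similar completion. The script expects an `Adj Close` column from `yf.download`; with recent yfinance releases, which adjust prices by default, you may need to pass `auto_adjust=False` or make similar adjustments for that column to be present.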

In [None]:
# Large 2 generated 
# Import necessary libraries
import yfinance as yf
import pandas as pd
import plotly.graph_objs as go
from plotly.subplots import make_subplots
import ipywidgets as widgets
from IPython.display import display

# Function to fetch stock data
def fetch_stock_data(ticker, period='6mo'):
    try:
        stock_data = yf.download(ticker, period=period)
        return stock_data
    except Exception as e:
        print(f"Error fetching data for {ticker}: {e}")
        return pd.DataFrame()

# Function to calculate daily returns
def calculate_daily_returns(stock_data):
    stock_data['Daily Return'] = stock_data['Adj Close'].pct_change()
    return stock_data

# Function to compute 20-day moving average
def compute_moving_average(stock_data, window=20):
    stock_data[f'{window}-Day MA'] = stock_data['Adj Close'].rolling(window=window).mean()
    return stock_data

# Fetch data for NVDA and AAPL
nvda_data = fetch_stock_data('NVDA')
aapl_data = fetch_stock_data('AAPL')

# Calculate daily returns and moving averages
nvda_data = calculate_daily_returns(nvda_data)
aapl_data = calculate_daily_returns(aapl_data)
nvda_data = compute_moving_average(nvda_data)
aapl_data = compute_moving_average(aapl_data)

# Function to create line chart
def create_line_chart(nvda_data, aapl_data):
    fig = make_subplots(rows=1, cols=1, shared_xaxes=True)

    fig.add_trace(go.Scatter(x=nvda_data.index, y=nvda_data['Adj Close'], mode='lines', name='NVDA'))
    fig.add_trace(go.Scatter(x=nvda_data.index, y=nvda_data['20-Day MA'], mode='lines', name='NVDA 20-Day MA'))
    fig.add_trace(go.Scatter(x=aapl_data.index, y=aapl_data['Adj Close'], mode='lines', name='AAPL'))
    fig.add_trace(go.Scatter(x=aapl_data.index, y=aapl_data['20-Day MA'], mode='lines', name='AAPL 20-Day MA'))

    fig.update_layout(title='NVDA and AAPL Stock Performance', xaxis_title='Date', yaxis_title='Price')
    return fig

# Function to create bar chart
def create_bar_chart(stock_data, ticker):
    fig = go.Figure(data=[go.Bar(x=stock_data.index, y=stock_data['Daily Return'], name=ticker)])
    fig.update_layout(title=f'{ticker} Daily Returns', xaxis_title='Date', yaxis_title='Daily Return')
    return fig

# Dropdown menu to select stock
dropdown = widgets.Dropdown(
    options=[('NVDA', 'NVDA'), ('AAPL', 'AAPL')],
    value='NVDA',
    description='Stock:'
)

# Button to refresh data
button = widgets.Button(description="Refresh Data")

# Output widgets for charts
line_chart_output = widgets.Output()
bar_chart_output = widgets.Output()

# Function to update charts
def update_charts(change):
    with line_chart_output:
        line_chart_output.clear_output()
        fig = create_line_chart(nvda_data, aapl_data)
        fig.show()

    with bar_chart_output:
        bar_chart_output.clear_output()
        selected_stock = dropdown.value
        if selected_stock == 'NVDA':
            fig = create_bar_chart(nvda_data, 'NVDA')
        else:
            fig = create_bar_chart(aapl_data, 'AAPL')
        fig.show()

# Function to refresh data
def refresh_data(button):
    global nvda_data, aapl_data
    nvda_data = fetch_stock_data('NVDA')
    aapl_data = fetch_stock_data('AAPL')
    nvda_data = calculate_daily_returns(nvda_data)
    aapl_data = calculate_daily_returns(aapl_data)
    nvda_data = compute_moving_average(nvda_data)
    aapl_data = compute_moving_average(aapl_data)
    update_charts(None)

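# Register button click event
button.on_click(refresh_data)

# Editorial addition (not in the generated output): redraw when the dropdown
# selection changes, otherwise the bar chart only updates on refresh
dropdown.observe(update_charts, names='value')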

# Initial chart display
update_charts(None)

# Display widgets
display(dropdown, button, line_chart_output, bar_chart_output)

## Language Translation Performance

Large 1 & Large 2 Japanese translation comparison

In [None]:
def compare_models(prompt, model_1=mistral_large_1, model_2=mistral_large_2, max_tokens=1000, temperature=0.1, top_p=0.9):
    result_1 = converse(prompt, model_id=model_1, max_tokens=max_tokens, temperature=temperature, top_p=top_p)
    result_2 = converse(prompt, model_id=model_2, max_tokens=max_tokens, temperature=temperature, top_p=top_p)
    
    return {
        f"{model_1}": result_1,
        f"{model_2}": result_2
    }

In [None]:
language_input = """
Quantum Entanglement: Spooky Action at a Distance

Quantum entanglement is a phenomenon where two particles become connected in such a way that the quantum state of each particle cannot be described independently, even when separated by a large distance. Einstein famously referred to this as 'spooky action at a distance.'
Key points:

Entangled particles react instantaneously to their partner's state changes.
This seems to violate the speed of light limit for information transfer.
It's fundamental to quantum computing and cryptography.
Entanglement has been experimentally verified up to distances of 1,200 kilometers.

Despite decades of research, the mechanism behind quantum entanglement remains one of the most perplexing aspects of quantum mechanics.
"""

In [None]:
translation_prompt = f"""

<<Text to translate>>

{language_input}

<<Instructions>>

Translate the text above about quantum entanglement into Japanese only. Ensure the translation accurately conveys the scientific concepts and maintains the educational tone of the original text.

Format the response as follows:

Japanese:
"""

comparison_results = compare_models(translation_prompt)

print("Mistral Large 1 Output:")
print(comparison_results[mistral_large_1])
print("\n" + "="*50 + "\n")
print("Mistral Large 2 Output:")
print(comparison_results[mistral_large_2])

Compare Japanese translations using Large 2

In [None]:
comparison_prompt = f"""
Compare the following two Japanese translations of the English text about quantum entanglement. Analyze them in terms of accuracy, naturalness, preservation of scientific terminology, and overall suitability for a scientific publication. Discuss the strengths and weaknesses of each translation together.

Original English text:
{language_input}

Translation 1:
{comparison_results[mistral_large_1]}

Translation 2:
{comparison_results[mistral_large_2]}

Please structure your analysis as follows:
1. Accuracy: How well does each translation convey the original meaning?
2. Naturalness: How natural and fluent does the Japanese sound in each translation?
3. Scientific Terminology: How well are scientific terms preserved and translated?
4. Overall Suitability: Which translation would be more suitable for a scientific publication and why?
5. Specific Differences: Highlight any notable differences between the two translations and discuss their impact.
6. Conclusion: Summarize your findings and give your overall assessment of which translation is superior.

Provide your analysis in English in an easy to read paragraph.
"""

comparison_analysis = converse(comparison_prompt, model_id=mistral_large_2, max_tokens=2000, temperature=0.1, top_p=0.9)
print(comparison_analysis)

## Tool Use

In [None]:
tool_config = {
    "tools": [
        {
            "toolSpec": {
                "name": "shinkansen_schedule",
                "description": "Fetches Shinkansen train schedule departure times for a specified station and time.",
                "inputSchema": {
                    "json": {
                        "type": "object",
                        "properties": {
                            "station": {"type": "string", "description": "The station name."},
                            "departure_time": {"type": "string", "description": "The departure time in HH:MM format."}
                        },
                        "required": ["station", "departure_time"]
                    }
                }
            }
        },
        {
            "toolSpec": {
                "name": "weather_forecast",
                "description": "Fetches the weather forecast for a specified city and date.",
                "inputSchema": {
                    "json": {
                        "type": "object",
                        "properties": {
                            "city": {"type": "string", "description": "The city name."},
                            "date": {"type": "string", "description": "The date in YYYY-MM-DD format."}
                        },
                        "required": ["city", "date"]
                    }
                }
            }
        }
    ]
}
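
Each `toolSpec` above declares its required inputs in a JSON schema, but nothing in the notebook checks the model's tool call against that schema before running the tool. The following is a minimal sketch of such a check, assuming only the `tools` / `toolSpec` / `inputSchema.json` layout shown above. The helper names `find_tool_spec` and `missing_required_fields` are hypothetical, and the check covers required keys only, not types or formats.

```python
def find_tool_spec(tool_config, tool_name):
    """Return the toolSpec dict for tool_name, or None if it is not declared."""
    for tool in tool_config.get("tools", []):
        spec = tool.get("toolSpec", {})
        if spec.get("name") == tool_name:
            return spec
    return None


def missing_required_fields(tool_config, tool_name, tool_input):
    """List required schema fields that are absent from tool_input."""
    spec = find_tool_spec(tool_config, tool_name)
    if spec is None:
        raise KeyError(f"Unknown tool: {tool_name}")
    schema = spec["inputSchema"]["json"]
    return [field for field in schema.get("required", []) if field not in tool_input]


# Trimmed copy of the weather tool declared above, so this block runs on its own
example_tool_config = {
    "tools": [{
        "toolSpec": {
            "name": "weather_forecast",
            "description": "Fetches the weather forecast for a specified city and date.",
            "inputSchema": {"json": {
                "type": "object",
                "properties": {"city": {"type": "string"}, "date": {"type": "string"}},
                "required": ["city", "date"],
            }},
        }
    }]
}

# The model supplied a city but no date, so "date" is reported as missing
missing = missing_required_fields(example_tool_config, "weather_forecast", {"city": "Tokyo"})
```

A non-empty result could be returned to the model as the tool result text so that it can retry with complete arguments.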


In [None]:
def shinkansen_schedule(station, departure_time):
    schedule = {
        "Tokyo": {"09:00": "Hikari", "12:00": "Nozomi", "15:00": "Kodama"},
        "Osaka": {"10:00": "Nozomi", "13:00": "Hikari", "16:00": "Kodama"}
    }
    return schedule.get(station, {}).get(departure_time, "No train found")

In [None]:
def weather_forecast(city, date):
    forecast = {
        "Tokyo": {"2023-06-12": "Sunny", "2023-06-13": "Rainy"},
        "Osaka": {"2023-06-12": "Cloudy", "2023-06-13": "Sunny"}
    }
    return forecast.get(city, {}).get(date, "No forecast found")

In [None]:
def process_tool_call(tool_name, tool_inputs):
    if tool_name == "shinkansen_schedule":
        return shinkansen_schedule(tool_inputs["station"], tool_inputs["departure_time"])
    elif tool_name == "weather_forecast":
        return weather_forecast(tool_inputs["city"], tool_inputs["date"])
    else:
        return f"Unknown tool: {tool_name}"
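
The `if`/`elif` chain above works for two tools but raises `KeyError` if the model omits an argument. One possible alternative, sketched here under the assumption that tool input keys match the handler parameter names, is a lookup table plus keyword unpacking. `TOOL_HANDLERS` and `dispatch_tool_call` are hypothetical names, and the two tool functions are trimmed copies of the mocks above so the block runs on its own.

```python
def shinkansen_schedule(station, departure_time):
    schedule = {"Osaka": {"10:00": "Nozomi"}}  # trimmed mock data
    return schedule.get(station, {}).get(departure_time, "No train found")


def weather_forecast(city, date):
    forecast = {"Tokyo": {"2023-06-12": "Sunny"}}  # trimmed mock data
    return forecast.get(city, {}).get(date, "No forecast found")


# Map tool names to handlers; adding a tool means adding one entry here
TOOL_HANDLERS = {
    "shinkansen_schedule": shinkansen_schedule,
    "weather_forecast": weather_forecast,
}


def dispatch_tool_call(tool_name, tool_inputs):
    """Run the named tool, returning an error string instead of raising."""
    handler = TOOL_HANDLERS.get(tool_name)
    if handler is None:
        return f"Unknown tool: {tool_name}"
    try:
        return handler(**tool_inputs)
    except TypeError as exc:
        # Missing or unexpected arguments surface here as TypeError
        return f"Invalid input for {tool_name}: {exc}"
```

Returning an error string keeps the conversation loop running, so the model sees the problem in the tool result. Note that a `TypeError` raised inside a handler for other reasons would be reported the same way.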

In [None]:
MODEL_ID = 'mistral.mistral-large-2407-v1:0'

In [None]:
def chatbot_interaction(user_message, bedrock_client):
    print(f"\n{'='*50}\nUser Message: {user_message}\n{'='*50}")

    messages = [{
        "role": "user",
        "content": [{"text": user_message}]
    }]

    while True:
        converse_response = bedrock_client.converse(
            modelId=MODEL_ID,
            messages=messages,
            inferenceConfig={"maxTokens": 4096},
            toolConfig=tool_config
        )
        message = converse_response['output']['message']

        print(f"\nResponse:")
        print(f"Stop Reason: {converse_response['stopReason']}")
        print(f"Content: {message['content']}")

        if converse_response['stopReason'] != "tool_use":
            break

        tool_results = []
        for block in message['content']:
            if 'toolUse' in block:
                tool_use = block['toolUse']
                tool_name = tool_use["name"]
                tool_input = tool_use["input"]

                print(f"\nTool Used: {tool_name}")
                print(f"Tool Input:")
                print(json.dumps(tool_input, indent=2))

                tool_result = process_tool_call(tool_name, tool_input)

                print(f"\nTool Result:")
                print(json.dumps(tool_result, indent=2))

                tool_results.append({
                    "toolResult": {
                        "toolUseId": tool_use['toolUseId'],
                        "content": [{"text": str(tool_result)}],
                    }
                })

        messages.extend([
            {"role": "assistant", "content": message['content']},
            {"role": "user", "content": tool_results},
        ])

    final_response = next(
        (block['text'] for block in message['content'] if block.get('text')),
        None,
    )

    print(f"\nFinal Response: {final_response}")
    return final_response
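
To follow the message flow of the loop above without calling Bedrock, the sketch below replays canned responses through a stand-in client. `ScriptedClient` and `run_tool_loop` are hypothetical names, the canned responses are hand-written to mirror the response fields used above (`stopReason`, `output.message.content`, `toolUse`, `toolResult`), and the `max_turns` guard is an addition that the original loop does not have.

```python
class ScriptedClient:
    """Stand-in for the Bedrock runtime client that replays canned responses."""

    def __init__(self, responses):
        self._responses = list(responses)
        self.calls = []

    def converse(self, **kwargs):
        # Snapshot the message list, since the caller extends it between turns
        self.calls.append(list(kwargs["messages"]))
        return self._responses.pop(0)


def run_tool_loop(client, model_id, user_message, tool_config, handle_tool, max_turns=5):
    """Call the model until it stops requesting tools; return the final text."""
    messages = [{"role": "user", "content": [{"text": user_message}]}]
    for _ in range(max_turns):
        response = client.converse(modelId=model_id, messages=messages, toolConfig=tool_config)
        message = response["output"]["message"]
        if response["stopReason"] != "tool_use":
            return next((b["text"] for b in message["content"] if b.get("text")), None)
        # Answer every toolUse block with a toolResult carrying the same toolUseId
        results = [
            {"toolResult": {
                "toolUseId": b["toolUse"]["toolUseId"],
                "content": [{"text": str(handle_tool(b["toolUse"]["name"], b["toolUse"]["input"]))}],
            }}
            for b in message["content"] if "toolUse" in b
        ]
        messages.append({"role": "assistant", "content": message["content"]})
        messages.append({"role": "user", "content": results})
    raise RuntimeError("Tool loop did not finish within max_turns")


scripted = ScriptedClient([
    {"stopReason": "tool_use", "output": {"message": {"role": "assistant", "content": [
        {"toolUse": {"toolUseId": "call-1", "name": "shinkansen_schedule",
                     "input": {"station": "Osaka", "departure_time": "10:00"}}}]}}},
    {"stopReason": "end_turn", "output": {"message": {"role": "assistant", "content": [
        {"text": "The Nozomi departs Osaka at 10:00."}]}}},
])
final_text = run_tool_loop(
    scripted, "example-model-id", "What train departs Osaka at 10:00?",
    {"tools": []}, lambda name, args: "Nozomi",
)
```

The second call recorded in `scripted.calls` contains three messages: the user question, the assistant's tool request, and a user message holding the tool result. This is the same sequence the loop above builds for the real client.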

In [None]:
# Pin the region explicitly, since the model card lists us-west-2
bedrock_client = boto3.client('bedrock-runtime', region_name="us-west-2")
result = chatbot_interaction("What train departs Osaka at 10:00 and what is the weather in Tokyo on 2023-06-12?", bedrock_client)
print(result)