In [1]:
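import pandas as pd
from langchain.callbacks.tracers import LangChainTracer
from langchain.chat_models import init_chat_model

from main import process_sample
from utils import load_environment

# Load environment variables (API keys, tracing settings) before any model is created.
load_environment()

# Tracer passed to the invoke() calls further down.
# Assumption: with no project_name, LangChainTracer falls back to the
# project configured in the environment.
tracer_project = LangChainTracer()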

In [2]:
START_SAMPLE_ID = 0
END_SAMPLE_ID = 1

df_examples = pd.read_csv("../data/examples.csv")[START_SAMPLE_ID:END_SAMPLE_ID]
df_evaluation_prompts = pd.read_csv("../data/evaluation_prompts.csv")

records = df_examples.to_dict(orient="records")

# Anthropic

In [3]:
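# Alternative judge: "anthropic:claude-3-5-sonnet-latest"

# The test model needs a provider prefix: the bare name "llama3.2" produces
# the "Unable to infer model provider" error logged below.
claude_sonnet_3_7 = process_sample(
    sample_id=0,
    record=records[0],
    test_models=["ollama:llama3.2"],
    judge_models=["anthropic:claude-3-7-sonnet-20250219"],
    evaluation_prompts_file_path="../data/evaluation_prompts.csv",
)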

ERROR:LM:Failed to call : llama3.2 | Intention : test | Error : Unable to infer model provider for model='llama3.2', please specify model_provider directly.
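
The error above arises because the bare name `llama3.2` carries no provider, whereas the other model strings in this notebook use a `provider:model` form. A minimal sketch of validating model strings before an evaluation run is shown here. `split_model_spec` is a hypothetical helper written for illustration; it is not part of LangChain or of `main.py`.

```python
def split_model_spec(spec: str) -> tuple[str, str]:
    """Split a "provider:model" string into (provider, model).

    Hypothetical helper: raises ValueError when the provider prefix
    is missing, so a bad spec fails before any model call is made.
    """
    provider, sep, model = spec.partition(":")
    if not sep or not provider or not model:
        raise ValueError(f"missing provider prefix in model spec: {spec!r}")
    return provider, model


specs = ["ollama:llama3.2", "anthropic:claude-3-7-sonnet-20250219"]
parsed = [split_model_spec(s) for s in specs]
print(parsed)
```

Calling `split_model_spec("llama3.2")` raises `ValueError`, which surfaces the problem at configuration time rather than inside the model call.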


In [7]:
# process_sample returns the test-model results and the judge-model results.
test, judge = claude_sonnet_3_7
print(test[0]['test model response'])

Based on the context document, your risk factors for dementia include:

1. High blood pressure: Having consistent high blood pressure in mid-life (ages 45 to 65) increases your risk of developing dementia.
2. Diabetes: Having type 2 diabetes in mid-life (ages 45 to 65) increases your risk of developing dementia.
3. Obesity: Being obese in mid-life (ages 45 to 65) increases your risk of developing dementia.
4. Lack of physical activity: Physical inactivity in later life (ages 65 and up) increases your risk of developing dementia.
5. Poor diet: Eating an unhealthy diet, high in saturated fat, sugar, and salt, can increase your risk of developing dementia.

Regarding cognitive engagement, it is thought to support the development of a "cognitive reserve". This means that actively using your brain throughout your life may help protect against brain cell damage caused by dementia.


In [8]:
print(judge[0][0]['judge model response'])

Sentence 1: Based on the context document, your risk factors for dementia include:
Sentence 1 label: Accurate

Sentence 2: 1. High blood pressure: Having consistent high blood pressure in mid-life (ages 45 to 65) increases your risk of developing dementia.
Sentence 2 label: Accurate

Sentence 3: 2. Diabetes: Having type 2 diabetes in mid-life (ages 45 to 65) increases your risk of developing dementia.
Sentence 3 label: Inaccurate

Sentence 4: 3. Obesity: Being obese in mid-life (ages 45 to 65) increases your risk of developing dementia.
Sentence 4 label: Inaccurate

Sentence 5: 4. Lack of physical activity: Physical inactivity in later life (ages 65 and up) increases your risk of developing dementia.
Sentence 5 label: Inaccurate

Sentence 6: 5. Poor diet: Eating an unhealthy diet, high in saturated fat, sugar, and salt, can increase your risk of developing dementia.
Sentence 6 label: Accurate

Sentence 7: Regarding cognitive engagement, it is thought to support the development of a "co
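
The judge output follows a regular `Sentence N label: <label>` pattern, so it can be summarized programmatically. The sketch below tallies the labels from the six sentences visible above. `tally_labels` and the regular expression are assumptions made for illustration, and the judge's format may vary between prompts or models.

```python
import re
from collections import Counter

# Matches lines such as "Sentence 3 label: Inaccurate".
LABEL_RE = re.compile(r"^Sentence (\d+) label:\s*(\w+)\s*$", re.MULTILINE)


def tally_labels(judge_text: str) -> Counter:
    """Count how often each label appears in a judge response."""
    return Counter(label for _, label in LABEL_RE.findall(judge_text))


# Label lines copied from the judge output above (sentence text mostly omitted).
judge_text = """Sentence 1: Based on the context document, your risk factors for dementia include:
Sentence 1 label: Accurate
Sentence 2 label: Accurate
Sentence 3 label: Inaccurate
Sentence 4 label: Inaccurate
Sentence 5 label: Inaccurate
Sentence 6 label: Accurate
"""

counts = tally_labels(judge_text)
print(counts["Accurate"], counts["Inaccurate"])  # prints: 3 3
```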

# Google

## Vertex AI


In [3]:
# Initialize the Gemini 2.5 Pro (experimental) model

# "google_vertexai:gemini-1.5-pro"
gemini_model = init_chat_model(model="gemini-2.5-pro-exp-03-25")
response_gemini = gemini_model.invoke("why the sky is blue?")
response_gemini

AIMessage(content="Okay, let's break down why the sky appears blue. It's all about sunlight, Earth's atmosphere, and how light interacts with tiny particles.\n\nHere's the step-by-step:\n\n1.  **Sunlight Isn't Just White:** The light coming from the sun might look white, but it's actually made up of all the colors of the rainbow (Red, Orange, Yellow, Green, Blue, Indigo, Violet - ROYGBIV). Each color has a different wavelength; red has the longest wavelength, and violet has the shortest.\n\n2.  **Earth's Atmosphere:** As sunlight travels towards Earth, it enters our atmosphere, which is made up mostly of tiny gas molecules (like nitrogen and oxygen).\n\n3.  **Scattering Light:** When sunlight hits these tiny gas molecules, it gets scattered in different directions. This is called **Rayleigh scattering**.\n\n4.  **Shorter Wavelengths Scatter More:** Here's the key part: Rayleigh scattering affects shorter wavelengths of light (blue and violet) *much more* strongly than longer wavelength

In [4]:
print(response_gemini.content)

The sky appears blue due to a phenomenon called **Rayleigh scattering**. Here's a breakdown:

**1. Sunlight Enters the Atmosphere:**
   Sunlight, appearing white to us, is actually a mixture of all colors of the rainbow. When this light enters the Earth's atmosphere, it encounters various particles and gases.

**2. Scattering of Light:**
   The molecules of nitrogen and oxygen, which make up most of the atmosphere, are much smaller than the wavelengths of visible light. This size difference causes the sunlight to scatter in all directions. This scattering is more effective for shorter wavelengths, like blue and violet.

**3. Rayleigh Scattering:**
   This type of scattering, named after Lord Rayleigh, is inversely proportional to the fourth power of the wavelength. This means blue light (with a shorter wavelength) is scattered about 10 times more effectively than red light (with a longer wavelength).

**4. Our Perception:**
   As we look up at the sky, we see the scattered blue light c
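
The "about 10 times" figure in the response above can be checked against the inverse fourth-power law it cites. The wavelengths below (400 to 450 nm for violet/blue, 700 nm for red) are assumed representative values, not numbers taken from the Gemini response.

```python
def rayleigh_ratio(short_nm: float, long_nm: float) -> float:
    """Relative scattering of the shorter wavelength, assuming I ∝ 1/λ^4."""
    return (long_nm / short_nm) ** 4


# Assumed wavelengths in nanometres: 400 (violet end), 450 (blue), 700 (red).
print(round(rayleigh_ratio(400, 700), 2))  # 9.38
print(round(rayleigh_ratio(450, 700), 2))  # 5.86
```

Under these assumptions the ratio is close to 10 only at the violet end of the spectrum; for blue light near 450 nm it is closer to 6.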

# DeepSeek


In [5]:
# Alternative: "deepseek-chat"
deepseek_model = init_chat_model(model="deepseek-reasoner")
response_deepseek = deepseek_model.invoke(
    "why the sky is blue?",
    # Attach the tracer only when one is configured, as in the Llama cell.
    config={"callbacks": [tracer_project]} if tracer_project else {},
)
print(response_deepseek.content)



The sky appears blue due to a phenomenon called **Rayleigh scattering**, which occurs when sunlight interacts with molecules and small particles in Earth's atmosphere. Here's a concise breakdown:

1. **Sunlight Composition**: Sunlight is white light composed of various colors (wavelengths), with violet and blue having shorter wavelengths (~400-450 nm) and red/orange having longer wavelengths (~620-750 nm).

2. **Scattering Mechanism**: 
   - Shorter wavelengths (blue/violet) are scattered more efficiently by atmospheric molecules (like nitrogen and oxygen) than longer wavelengths. This wavelength-dependent scattering is described by Rayleigh's law (scattering intensity ∝ 1/λ⁴).
   - As sunlight passes through the atmosphere, blue light is scattered in all directions, creating a diffuse glow that we perceive as the blue sky.

3. **Human Perception**:
   - Although violet light is scattered even more than blue, human eyes are less sensitive to violet, and the sun emits more blue light. T

# Llama

In [10]:
llama_model = init_chat_model(model="ollama:llama3.2")
response_llama = llama_model.invoke(
    "Explain gravity",
    config={"callbacks": [tracer_project]} if tracer_project else {},
)
print(response_llama.content)

Gravity is a fundamental force of nature that causes objects with mass to attract each other. It is one of the four fundamental forces in physics, along with electromagnetism and the strong and weak nuclear forces.

**What is Gravity?**

Gravity is a universal force that affects everything with mass or energy, from the smallest subatomic particles to the largest galaxies. It is a curvature of spacetime caused by the presence of mass and energy. According to Albert Einstein's theory of general relativity, gravity is not a force that acts between objects, but rather a consequence of their motion in curved spacetime.

**How does Gravity Work?**

Gravity works as follows:

1. **Mass warps spacetime**: Any object with mass or energy creates a curvature in the fabric of spacetime around it.
2. **Gravitational field**: The curvature of spacetime creates a gravitational field, which is a region around an object where the curvature is significant.
3. **Objects move along curved paths**: When an
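
The Llama cell attaches the tracer only when one is available. A minimal sketch of factoring that condition into a reusable function follows. `build_invoke_config` is a hypothetical name, and the only assumption about `invoke` is that it accepts a `config` dict with a `callbacks` list, as the cells above already do.

```python
from typing import Any, Optional


def build_invoke_config(tracer: Optional[Any] = None) -> dict:
    """Return {"callbacks": [tracer]} when a tracer is given, else {}."""
    return {"callbacks": [tracer]} if tracer is not None else {}


# A placeholder object stands in for a real tracer in this sketch.
dummy_tracer = object()
print(build_invoke_config())
print(list(build_invoke_config(dummy_tracer)))
```

A call such as `llama_model.invoke("Explain gravity", config=build_invoke_config(tracer_project))` would then behave the same whether or not tracing is enabled.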