<a href="https://colab.research.google.com/github/bisheralwan/ChatBot/blob/main/W2025/Assignments/A3/SYSC4415_W25_A3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Welcome to Assignment 3

**TA: [Igor Bogdanov](mailto:igorbogdanov@cmail.carleton.ca)**

## General Instructions:

This Assignment can be done **in a group of two or individually**.

YOU HAVE TO JOIN A GROUP ON BRIGHTSPACE TO SUBMIT.

Please state explicitly at the beginning of the assignment whether you are working in a group or individually.

You need only one submission if it's group work.

Please print out values when asked using Python's print() function with f-strings where possible.

Submit your **saved notebook with all the outputs** to Brightspace, but first make sure it still produces correct outputs after you restart the runtime and click "Runtime" → "Run all" from a clean state. Ensure your notebook displays all answers correctly.

## Your Submission MUST contain your signature at the bottom.

### Objective:
In this assignment, we build a reasoning AI agent that facilitates ML operations and model evaluation. This assignment is heavily based on Tutorial 9.

**Submission:** Submit your Notebook as a *.ipynb* file that adopts this naming convention: ***SYSC4415_W25_A3_NameLastname.ipynb*** on *Brightspace*. No other submission (e.g., through email) will be accepted. (Example file name: SYSC4415_W25_A3_IgorBogdanov.ipynb or SYSC4415_W25_A3_Student1_Student2.ipynb) The notebook MUST contain saved outputs.

**Runtime tips:**
Agentic programming and API calling can be easily done locally and moved to Colab in the final stages, depending on the implementation of your tools and ML tasks you want to run.

# Imports

Some basic libraries you need are imported here. Make sure you include whatever library you need in this entire notebook in the code block below.

If you are using any library that requires installation, please paste the installation command here.
Leave the code block below if you are not installing any libraries.

In [1]:
# Name: Bisher Abou-Alwan
# Student Number: 101211242

# Name: Mohammad Abusalem
# Student Number: 101204665

In [2]:
# Libraries to install - leave this code block blank if this does not apply to you
# Please add a brief comment on why you need the library and what it does


In [3]:
!pip install groq

# Libraries you might need
# General
import os
import zipfile
import librosa
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns


# For pre-processing
import torch
from torch.utils.data import Dataset, DataLoader, random_split
import torchvision.transforms as transforms
from torchvision.datasets import ImageFolder

# For modeling
import torch.nn as nn
import torch.optim as optim
from tqdm import tqdm
import torchsummary

# For metrics
from sklearn.metrics import accuracy_score
from sklearn.metrics import precision_score
from sklearn.metrics import recall_score
from sklearn.metrics import f1_score
from sklearn.metrics import classification_report
from sklearn.metrics import precision_recall_curve
from sklearn.metrics import roc_auc_score
from sklearn.metrics import confusion_matrix

# Agent
from groq import Groq
from dataclasses import dataclass
import re
from typing import Dict, List, Optional


Collecting groq
  Downloading groq-0.22.0-py3-none-any.whl.metadata (15 kB)
Downloading groq-0.22.0-py3-none-any.whl (126 kB)
Installing collected packages: groq
Successfully installed groq-0.22.0


# Task 1: Registration and API Activation (5 marks)

For this particular assignment, we will be using GroqCloud for LLM inference. This task aims to determine how to use the Groq API with LLMs.  

Create a free account on https://groq.com/ and generate an API Key. Don't remove your key until you get your grade. Feel free to delete your API key after the term is completed.

In conversational AI, prompting involves three key roles: the system role (which sets the agent's behavior and capabilities), the user role (which represents human inputs and queries), and the assistant role (which contains the agent's responses). The system role provides the foundational instructions and constraints, the user role delivers the actual queries or commands, and the assistant role generates contextual, step-by-step responses following the system's guidelines. This structured approach ensures consistent, controlled interactions where the agent maintains its defined behavior while responding to user needs, with each role serving a specific purpose in the conversation flow.
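
As a minimal sketch of how these three roles appear in practice, the snippet below builds the kind of `messages` list that chat-completion style APIs accept. The helper `make_message` and the sample contents are hypothetical and used only for illustration; no API call is made.

```python
from typing import Dict, List

VALID_ROLES = {"system", "user", "assistant"}

def make_message(role: str, content: str) -> Dict[str, str]:
    """Build one chat message; hypothetical helper for illustration only."""
    if role not in VALID_ROLES:
        raise ValueError(f"Unknown role: {role}")
    return {"role": role, "content": content}

# The system message comes first and sets the behavior for the whole conversation.
messages: List[Dict[str, str]] = [
    make_message("system", "You are a concise ML tutor."),
    make_message("user", "What is overfitting?"),
]

# After the model replies, its answer is appended under the assistant role,
# so the next request carries the full conversation history.
messages.append(make_message("assistant", "Overfitting is when a model memorizes training data."))
messages.append(make_message("user", "How can I reduce it?"))

for m in messages:
    print(f"{m['role']:>9}: {m['content']}")
```

Because the whole list is resent on every request, the order of entries is what gives the model its sense of conversation flow.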


In [4]:
# Q1a (2 marks)
# Create a client using your API key.

# Read the key from the GROQ_API_KEY environment variable instead of hardcoding a secret in the notebook
client = Groq(api_key=os.environ.get("GROQ_API_KEY"))



In [5]:
# Q1b (3 marks)

# instantiate chat_completion object using model of your choice (llama-3.3-70b-versatile - recommended)
# Hint: Use Tutorial 9 and Groq Documentation
# Explain each parameter and how each value change influences the LLM's output.
# Prompt the model using the user role about anything different from the tutorial.

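chat_completion = client.chat.completions.create(
    model="llama-3.3-70b-versatile",  # which LLM serves the request; larger models usually reason better but respond more slowly
    messages=[
        {"role": "system", "content": "You are a helpful assistant that provides facts and reasoning."},
        {"role": "user", "content": "Why do penguins not fly even though they are birds?"}
    ],
    temperature=0.2,  # sampling randomness: values near 0 give more deterministic answers, higher values give more varied ones
    top_p=0.7,  # nucleus sampling: only the most likely tokens covering 70% of the probability mass are considered; lower values narrow the choices
    max_tokens=512  # upper limit on generated tokens; a lower limit can cut the answer off mid-sentence
)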

# Print the assistant’s response
print(chat_completion.choices[0].message.content)

Penguins are indeed birds, but they have evolved to become flightless over time. There are several reasons for this:

1. **Environmental pressures**: Penguins live in the Southern Hemisphere, where the climate is cold and the oceans are rich in food. As a result, they didn't need to fly to find food or escape predators. In fact, flying would have been energetically expensive and potentially hazardous in the strong winds and icy conditions.
2. **Body shape and size**: Penguins have a unique body shape, with a streamlined torso, flippers, and a heavy skeleton. This shape is ideal for swimming and diving, but not for flying. Their wings are modified to be more suited for propulsion through water than through air.
3. **Wing structure**: Penguins' wings are shorter and more rigid than those of flying birds. They have a solid, flipper-like shape that allows them to use their wings to "fly" through the water, but they lack the flexibility and lift needed to generate lift in the air.
4. **Ener

# Task 2: Agent Implementation (5 marks)

This task contains an implementation of the agent from Tutorial 9. The goal of this task is to make sure you understand how a basic LLM agent works.


In [6]:
# Q2a: (5 marks) Explain how agent implementation works, providing comments line by line.
# This paper might be helpful: https://react-lm.github.io/

#This data class keeps track of what the agent has said and been told
@dataclass
class Agent_State:
    messages: List[Dict[str, str]] #stores conversation messages between user and agent
    system_prompt: str #sets the overall behavior and tone of the agent

#The ML_Agent class is the core of our reasoning agent
class ML_Agent:
    def __init__(self, system_prompt: str):
        #This sets up the client that talks to the LLM
        self.client = client

        #The agent’s memory starts with just the system prompt
        self.state = Agent_State(
            messages=[{"role": "system", "content": system_prompt}],
            system_prompt=system_prompt,
        )

    def add_message(self, role: str, content: str) -> None:
        #Add a new message to the conversation (user or assistant)
        self.state.messages.append({"role": role, "content": content})

    def execute(self) -> str:
        #This sends all past messages to the LLM and gets the latest reply
        completion = self.client.chat.completions.create(
            model="llama-3.3-70b-versatile", #the chosen LLM model
            temperature=0.2, #low randomness (more predictable)
            top_p=0.7, #controls how many token options are considered
            max_tokens=1024, #maximum length of the response
            messages=self.state.messages, #full conversation history
        )
        return completion.choices[0].message.content #return just the reply text

    def __call__(self, message: str) -> str:
        self.add_message("user", message) #add the user's input
        result = self.execute() #get the agent's response
        self.add_message("assistant", result) #save the agent's reply
        return result #return the final answer


# Task 3: Tools (20 marks)

Tools are specialized functions that enable AI agents to perform specific actions beyond their inherent capabilities, such as retrieving information, performing calculations, or manipulating data. Agents use tools to decompose complex reasoning into observable steps, extend their knowledge beyond training data, maintain state across interactions, and provide transparency in their decision-making process, ultimately allowing them to solve problems they couldn't tackle through reasoning alone.

Essentially, tools are just callback functions invoked by the agent at the appropriate time during the execution loop.
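
The callback idea can be sketched as a name-to-function registry. The two tools below (`word_count`, `shout`) and the `run_tool` helper are hypothetical stand-ins chosen for illustration; they are not the tools required by this assignment.

```python
from typing import Callable, Dict

def word_count(text: str) -> str:
    """Hypothetical tool: counts whitespace-separated words."""
    return f"{len(text.split())} words"

def shout(text: str) -> str:
    """Hypothetical tool: upper-cases its input."""
    return text.upper()

# Registry mapping the tool name (as the LLM would write it) to the Python callable.
TOOLS: Dict[str, Callable[[str], str]] = {
    "word_count": word_count,
    "shout": shout,
}

def run_tool(name: str, tool_input: str) -> str:
    """Look up a tool by name and call it; return an error string if unknown."""
    tool = TOOLS.get(name)
    if tool is None:
        return f"Unknown tool '{name}'. Available: {', '.join(sorted(TOOLS))}"
    return tool(tool_input)

print(run_tool("word_count", "evaluate the model on iris"))
print(run_tool("plot", "anything"))
```

Every tool here takes one string and returns one string, which keeps the interface uniform: the agent only ever has to produce a tool name and a text input, and the result can be fed straight back to it as text.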

You need to plan your tools for each particular task your agent is expected to solve.
The Model Evaluation Agent we are building should be able to evaluate the model from the model pool on the specific dataset.

Datasets to use: Penguins, Iris, CIFAR-10

You should be able to tell the agent what to do and watch it display the output of the tools' execution, similar to that in Tutorial 9.

User Prompt examples you should be able to give to your agent and expect it to fulfill the task:
- **Evaluate Linear Regression Model on Iris Dataset**
- **Train a logistic regression model on the Iris dataset**
- **Load the Penguins dataset and preprocess it.**
- **Train a decision tree model on the Penguins dataset and evaluate it.**
- **Load the CIFAR-10 dataset and train Mini-ResNet CNN, visualize results**

Classifier Models for Iris and Penguins (use A1 and early tutorials):
  * Logistic Regression (solver='lbfgs')
  * Decision Tree (max_depth=3)
  * KNN (n_neighbors=5)

Any 2 CNN models of your choice for CIFAR-10 dataset (do some research, don't create anything from scratch unless you want to, use the ones provided by libraries and frameworks)

HINT: It is highly recommended that any code from previous assignments and tutorials be reused for tool implementation.

**Use Pytorch where possible**

## DON'T FORGET TO IMPORT MISSING LIBRARIES

In [7]:
# Q3a (3 marks): Implement model_memory tool.
# This tool should provide the agent with details about models or datasets
# Example: when asked about Penguin dataset, the agent can use memory to look up
# the source to obtain the dataset.


def model_memory(query: str) -> str:
    query = query.lower().strip()

    #Predefined knowledge base
    memory = {
        "iris dataset": "The Iris dataset contains 3 classes of 50 instances each, with 4 features (sepal/petal length/width).",
        "penguins dataset": "The Penguins dataset includes 3 species of penguins with features like flipper length and body mass.",
        "cifar-10 dataset": "CIFAR-10 contains 60,000 32x32 color images across 10 categories like airplane, dog, and truck.",
        "logistic regression": "A linear classifier that uses the logistic function to model probabilities.",
        "decision tree": "A tree-based model that splits data based on feature thresholds. Easy to interpret. Use max_depth=3 for simplicity.",
        "knn": "K-Nearest Neighbors (KNN) classifies a sample based on the majority label of its k closest points. Use n_neighbors=5.",
        "cnn": "Convolutional Neural Networks (CNNs) are deep learning models designed for visual recognition tasks like image classification."
    }

    return memory.get(query, f"No known entry for '{query}'. Try asking about a dataset or model.") #now return match or fallback


In [8]:
# Q3b (3 marks): Implement dataset_loader tool.
# loads dataset after obtaining info from memory


from sklearn.datasets import load_iris
import seaborn as sns

def dataset_loader(dataset_name: str):
    dataset_name = dataset_name.lower().strip()

    if dataset_name == "iris":
        data = load_iris(as_frame=True)
        df = data.frame
        return df.head().to_string() #preview only

    elif dataset_name == "penguins":
        df = sns.load_dataset("penguins")
        return df.head().to_string() #preview only

    elif dataset_name == "cifar-10":
        return "CIFAR-10 will be loaded using torchvision.datasets when training starts (due to its size)."

    else:
        return f"Dataset '{dataset_name}' not recognized. Please choose Iris, Penguins, or CIFAR-10."


In [9]:
# Q3c (3 marks): Implement dataset_preprocessing tool.
# preprocesses the dataset to work with the chosen model, and does the splits

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

#Global state so other tools (like training) can use the preprocessed data
PREPROCESSED_DATA = {}

def dataset_preprocessing(dataset_name: str) -> str:
    dataset_name = dataset_name.lower().strip()

    if dataset_name == "iris":
        from sklearn.datasets import load_iris
        data = load_iris(as_frame=True)
        df = data.frame
        X = df.drop(columns=["target"])
        y = df["target"]

    elif dataset_name == "penguins":
        import seaborn as sns
        df = sns.load_dataset("penguins").dropna()
        y = df["species"]  # classification target: penguin species
        X = df.select_dtypes(include="number")  # keep numeric features only (drops species, island, sex)

    else:
        return f"Preprocessing not supported for dataset '{dataset_name}'. Try 'iris' or 'penguins'."

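    #Split into training and test sets first so the scaler never sees test data
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    #Standardize features using statistics from the training set only
    scaler = StandardScaler()
    X_train = scaler.fit_transform(X_train)
    X_test = scaler.transform(X_test)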

    #Save for other tools to use
    PREPROCESSED_DATA["X_train"] = X_train
    PREPROCESSED_DATA["X_test"] = X_test
    PREPROCESSED_DATA["y_train"] = y_train
    PREPROCESSED_DATA["y_test"] = y_test
    PREPROCESSED_DATA["dataset_name"] = dataset_name

    return f"{dataset_name.capitalize()} dataset preprocessed and split into train/test sets."


In [10]:
# Q3d (3 points): Implement train_model tool.
# trains selected model on selected dataset, the agent should not use this tool
# on datasets and models that cannot work together.


from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier

#Store trained model so evaluation and visualization tools can access it
TRAINED_MODEL = {}

def train_model(model_name: str) -> str:
    if "X_train" not in PREPROCESSED_DATA:
        return "Please preprocess the dataset first using dataset_preprocessing."

    model_name = model_name.lower().strip()
    dataset = PREPROCESSED_DATA.get("dataset_name")

    X_train = PREPROCESSED_DATA["X_train"]
    y_train = PREPROCESSED_DATA["y_train"]

    model = None

    #Only allow supported model-dataset pairs
    if dataset in ["iris", "penguins"]:
        if model_name == "logistic regression":
            model = LogisticRegression(solver="lbfgs", max_iter=200)
        elif model_name == "decision tree":
            model = DecisionTreeClassifier(max_depth=3)
        elif model_name == "knn":
            model = KNeighborsClassifier(n_neighbors=5)
        else:
            return f"Model '{model_name}' not supported for tabular datasets."

    else:
        return f"Training not supported for dataset '{dataset}'. Use Iris or Penguins."

    model.fit(X_train, y_train)

    TRAINED_MODEL["model"] = model #Store for later use
    TRAINED_MODEL["model_name"] = model_name

    return f"{model_name.title()} model trained successfully on the {dataset} dataset."


In [11]:
# Q3e (3 marks): Implement evaluate_model tool
# evaluates the models and shows the quality metrics (accuracy, precision, and anything else of your choice)

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, classification_report

def evaluate_model(_: str = "") -> str:
    if "model" not in TRAINED_MODEL or "X_test" not in PREPROCESSED_DATA:
        return "Please ensure both the model is trained and the dataset is preprocessed."

    model = TRAINED_MODEL["model"]
    X_test = PREPROCESSED_DATA["X_test"]
    y_test = PREPROCESSED_DATA["y_test"]

    #Make predictions
    y_pred = model.predict(X_test)

    #Calculate key metrics
    acc = accuracy_score(y_test, y_pred)
    precision = precision_score(y_test, y_pred, average='weighted', zero_division=0)
    recall = recall_score(y_test, y_pred, average='weighted', zero_division=0)
    f1 = f1_score(y_test, y_pred, average='weighted', zero_division=0)

    #Return metrics as readable output
    return (
        f"Evaluation Results:\n"
        f"Accuracy: {acc:.4f}\n"
        f"Precision: {precision:.4f}\n"
        f"Recall: {recall:.4f}\n"
        f"F1 Score: {f1:.4f}\n"
    )


In [12]:
# Q3f (5 marks): Implement visualize_results tool
# provides results of the training/evaluation, open-ended task (2 plots minimum)


import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix
import numpy as np

def visualize_results(_: str = "") -> str:
    if "model" not in TRAINED_MODEL or "X_test" not in PREPROCESSED_DATA:
        return "Please ensure both the model is trained and the dataset is preprocessed."

    model = TRAINED_MODEL["model"]
    model_name = TRAINED_MODEL["model_name"]
    X_test = PREPROCESSED_DATA["X_test"]
    y_test = PREPROCESSED_DATA["y_test"]
    dataset = PREPROCESSED_DATA["dataset_name"]

    y_pred = model.predict(X_test)

    #Plot 1: Confusion Matrix
    cm = confusion_matrix(y_test, y_pred)
    plt.figure(figsize=(6, 5))
    sns.heatmap(cm, annot=True, fmt='d', cmap='Blues')
    plt.title(f"Confusion Matrix ({model_name.title()} on {dataset.title()})")
    plt.xlabel("Predicted")
    plt.ylabel("Actual")
    plt.show()

    #Plot 2: Feature Importance (for Decision Tree or Logistic Regression)
    if hasattr(model, "coef_"):
        importances = np.abs(model.coef_).mean(axis=0)
    elif hasattr(model, "feature_importances_"):
        importances = model.feature_importances_
    else:
        return "Second visualization not supported for this model type."

    plt.figure(figsize=(8, 4))
    sns.barplot(x=importances, y=[f"Feature {i}" for i in range(len(importances))])
    plt.title(f"Feature Importance ({model_name.title()})")
    plt.xlabel("Importance")
    plt.ylabel("Features")
    plt.tight_layout()
    plt.show()

    return "Visualizations generated successfully."


# Task 4: System Prompt (10 marks)
A system prompt is essential for guiding an agent's behavior by establishing its purpose, capabilities, tone, and workflow patterns. It acts as the "personality and instruction manual" for the agent, defining the format of interactions (like using Thought/Action/Observation steps in our ML agent), available tools, response styles, and domain-specific knowledge—all while remaining invisible to the end user. This hidden layer of instruction ensures the agent consistently follows the intended reasoning process and operational constraints while providing appropriate and helpful responses, effectively serving as the blueprint for the agent's behavior across all interactions.
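
One way to keep the tool list in the prompt consistent with the code is to generate it from a dictionary of descriptions. This is only a sketch under assumptions: `TOOL_DESCRIPTIONS`, `build_system_prompt`, and the prompt wording are hypothetical and are not the prompt from Tutorial 9.

```python
from typing import Dict

# Hypothetical tool descriptions; in the notebook these would describe the real tools.
TOOL_DESCRIPTIONS: Dict[str, str] = {
    "dataset_loader": "load a dataset by name, e.g. iris",
    "train_model": "train a model by name, e.g. knn",
}

def build_system_prompt(tools: Dict[str, str]) -> str:
    """Assemble a ReAct-style system prompt listing each tool and its purpose."""
    tool_lines = "\n".join(f"- {name}: {desc}" for name, desc in tools.items())
    return (
        "You solve ML tasks in a Thought / Action / Observation loop.\n"
        "Write exactly one line 'Action: tool_name: tool_input' per turn, "
        "then stop and wait for an Observation.\n"
        "Available tools:\n"
        f"{tool_lines}"
    )

prompt = build_system_prompt(TOOL_DESCRIPTIONS)
print(prompt)
```

Describing the expected input of each tool in the prompt tends to reduce malformed actions, since the model otherwise has to guess what string a tool accepts.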


In [13]:
# Q4a (10 marks) Build a system prompt to guide the agent based on Tutorial 9.
# Use the following function:

# Try to find alternative wording to keep the agent in the desired loop,
# don't just copy the prompt from the tutorial.

# Penalty for direct copy - 2 marks

def create_agent():
    system_prompt = """
You are a helpful AI agent built to assist with machine learning tasks using logical reasoning and a structured process.

Follow this interaction format for every query:
1. Think through the problem step-by-step and explain your thought process. Start this section with "Thought:".
2. If needed, take an action by calling a tool. Format it as "Action: tool_name: tool_input".
3. Wait for the result and respond to it with "Observation:".
4. Repeat the Thought/Action/Observation cycle until enough information is gathered.
5. Finish with a final answer or conclusion.

Only use available tools when needed:
- model_memory
- dataset_loader
- dataset_preprocessing
- train_model
- evaluate_model
- visualize_results

Only one action should be taken per step. Be methodical, and explain your decisions clearly to help the user understand each step.

Do not skip steps. Wait for an observation before continuing. Stick to the cycle unless asked a direct fact-based question.
    """.strip()

    return ML_Agent(system_prompt)



# Task 5: Set the Agent Loop (10 marks)

Now we are building automation of our Thought/Action/Observation sequence.


In [14]:
# Q5a: (2 marks) Explain why we need the following data structure and fill it in with appropriate values:
KNOWN_ACTIONS = {
    "model_memory": model_memory,               # Looks up information about a model or dataset
    "dataset_loader": dataset_loader,           # Loads a dataset and returns a preview of its data
    "dataset_preprocessing": dataset_preprocessing,  # Preprocesses the dataset and splits it into train/test sets
    "train_model": train_model,                 # Trains a chosen model on the preprocessed data
    "evaluate_model": evaluate_model,           # Evaluates the trained model using performance metrics
    "visualize_results": visualize_results,     # Generates plots to visualize evaluation results
}

In [15]:
# Q5b: (6 marks) Explain how the agent automation loop works line by line. Why do we need the ACTION_PATTERN variable?
# This paper might be helpful: https://react-lm.github.io/

ACTION_PATTERN = re.compile(r"^Action: (\w+): (.*)$")  # raw string avoids invalid-escape warnings for \w

number_of_steps = 5 # adjust this number for your implementation, to avoid an infinite loop

def query(question: str, max_turns: int = number_of_steps) -> List[Dict[str, str]]:
    agent = create_agent()
    next_prompt = question

    for turn in range(max_turns):
        result = agent(next_prompt)
        print(result)
        actions = [
            ACTION_PATTERN.match(a)
            for a in result.split("\n")
            if ACTION_PATTERN.match(a)
        ]
        if actions:
            action, action_input = actions[0].groups()
            if action not in KNOWN_ACTIONS:
                raise ValueError(f"Unknown action: {action}: {action_input}")
            print(f"\n ---> Executing {action} with input: {action_input}")
            observation = KNOWN_ACTIONS[action](action_input)
            print(f"Observation: {observation}")
            next_prompt = f"Observation: {observation}"
        else:
            break
    return agent.state.messages


# **ANSWER**

The agent automation loop lets the model reason step by step and decide which tool to invoke for the user's question. `query()` first creates a new agent with the system prompt and sends it the user's question. The agent replies with its reasoning and, possibly, a request to run one of our tools. The loop splits the reply into lines and tests each one against `ACTION_PATTERN`, a regular expression that matches lines of the form "Action: tool_name: input" and captures the tool name and its input. We need this pattern because the LLM returns plain text; it is how the code recognizes that the agent is requesting an action and which input to pass. If a match is found and the tool name is in `KNOWN_ACTIONS`, the loop calls the matching function and sends its output back to the agent as an "Observation" for the next turn; an unknown tool name raises a `ValueError`. If no action is found, the loop stops. This repeats for at most `max_turns` turns, so the agent can build a coherent chain of reasoning, use tools when necessary, and finally return a solution. This setup makes the agent interactive, modular, and able to process complicated tasks step by step.
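
The parsing step described above can be tried in isolation. The reply text below is a hypothetical model output used only for illustration; the pattern is the one used by the loop, written as a raw string.

```python
import re

# Same pattern as in the loop, written as a raw string.
ACTION_PATTERN = re.compile(r"^Action: (\w+): (.*)$")

# Hypothetical model reply used only for illustration.
reply = (
    "Thought: I should load the data first.\n"
    "Action: dataset_loader: iris\n"
    "Observation: (the model sometimes invents this line itself)"
)

# Check each line separately; only well-formed Action lines match.
matches = [m for line in reply.split("\n") if (m := ACTION_PATTERN.match(line))]

if matches:
    tool_name, tool_input = matches[0].groups()
    print(f"tool={tool_name!r}, input={tool_input!r}")
else:
    print("No action requested; the loop would stop here.")
```

Because the pattern is anchored with `^` and `$`, a line matches only if it starts with `Action:`, so a mention of an action in the middle of a "Thought:" line is ignored.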

In [16]:
# Q5b: (2 marks)
# QUESTION: How can we check the whole history of the agent's interaction with LLM?

#We can inspect the full history of the agent's conversation with the LLM
#by looking through the messages stored in the agent's state.
#Every call to the agent appends the user message and the assistant reply
#to agent.state.messages, which starts with the system prompt.
#After query() finishes, this list holds the complete conversation,
#and it is also what the function returns.
#Each message is a dictionary with a "role" ("system", "user", or "assistant")
#and the corresponding "content".
#Printing the list shows every thought, action, and observation
#the agent went through, step by step.
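
A minimal sketch of inspecting such a history is shown below. The `format_history` helper and the sample `history` list are hypothetical; in the notebook the list would be the value returned by `query()`.

```python
from typing import Dict, List

def format_history(messages: List[Dict[str, str]], max_chars: int = 80) -> List[str]:
    """Return one summary line per message: index, role, and truncated content."""
    lines = []
    for i, msg in enumerate(messages):
        content = msg["content"].replace("\n", " ")
        if len(content) > max_chars:
            content = content[:max_chars] + "..."
        lines.append(f"[{i}] {msg['role']}: {content}")
    return lines

# Hypothetical history standing in for the list returned by query().
history = [
    {"role": "system", "content": "You are an ML agent."},
    {"role": "user", "content": "Load the iris dataset."},
    {"role": "assistant", "content": "Thought: I need the loader.\nAction: dataset_loader: iris"},
    {"role": "user", "content": "Observation: 5 rows previewed"},
]

for line in format_history(history):
    print(line)
```

Note that observations appear under the "user" role, because the loop sends each tool result back to the agent as the next user message.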



# Task 6: Run your agent (15 marks)

Let's see if your agent works

In [17]:
# Execute any THREE example prompts using your agent. (Each working prompt example will give you 5 marks, 5x3=15)
# DON'T FORGET TO SAVE THE OUTPUT

# User Prompt examples you should be able to give to your agent:
# **Evaluate Linear Regression Model on Iris Dataset**
# **Train a logistic regression model on the Iris dataset**
# **Load the Penguins dataset and preprocess it.**
# **Train a decision tree model on the Penguins dataset and evaluate it.**
# **Load the CIFAR-10 dataset and train Mini-ResNet CNN, visualize results**

# Use this template:

# Example 1: Prompt
print("\nExample 1: Evaluate Linear Regression Model on Iris Dataset")
print("=" * 50)
task = "Evaluate Linear Regression Model on Iris Dataset"
result1 = query(task)
print("\n" + "=" * 50 + "\n")

# Example 2: Prompt
print("\nExample 2: Train a logistic regression model on the Iris dataset")
print("=" * 50)
task = "Train a logistic regression model on the Iris dataset"
result2 = query(task)
print("\n" + "=" * 50 + "\n")

# Example 3: Prompt
print("\nExample 3: Load the Penguins dataset and preprocess it.")
print("=" * 50)
task = "Load the Penguins dataset and preprocess it."
result3 = query(task)
print("\n" + "=" * 50 + "\n")

# Example 4: Prompt
print("\nExample 4: Train a decision tree model on the Penguins dataset and evaluate it.")
print("=" * 50)
task = "Train a decision tree model on the Penguins dataset and evaluate it."
result4 = query(task)
print("\n" + "=" * 50 + "\n")



Example 1: Evaluate Linear Regression Model on Iris Dataset
Thought: To evaluate a Linear Regression model on the Iris dataset, we first need to understand that Linear Regression is typically used for regression tasks, where the target variable is continuous. However, the Iris dataset is often used for classification tasks, as it aims to predict the species of iris flowers based on their characteristics, which is a categorical outcome. Despite this, we can still use Linear Regression for a multi-class classification problem by using one-vs-rest or similar approaches, but it's essential to note that other models like Logistic Regression or classifiers might be more suitable. Our first step should be to load the Iris dataset.

Action: dataset_loader: iris

Observation: 

(Please provide the observation after loading the dataset)

 ---> Executing dataset_loader with input: iris
Observation:    sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)  target
0             

# Task 7: BONUS (10 points)
Not valid without completion of all the previous tasks and tool implementations.

In [18]:
# Build your own additional ML-related tool and provide an example of interaction with your reasoning agent
# using a prompt of your choice that makes the agent use your tool at one of the reasoning steps.


Good luck!

## Signature:
Don't forget to insert your name and student number and execute the snippet below.



In [20]:
!pip install watermark
# Provide your Signature:
%load_ext watermark
%watermark -a 'Bisher Abou-Alwan, #101211242, Mohammad Abusalem, #101204665' -nmv --packages numpy,pandas,sklearn,matplotlib,seaborn,graphviz,groq,torch

The watermark extension is already loaded. To reload it, use:
  %reload_ext watermark
Author: Bisher Abou-Alwan, #101211242, Mohammad Abusalem, #101204665

Python implementation: CPython
Python version       : 3.11.11
IPython version      : 7.34.0

numpy     : 2.0.2
pandas    : 2.2.2
sklearn   : 1.6.1
matplotlib: 3.10.0
seaborn   : 0.13.2
graphviz  : 0.20.3
groq      : 0.22.0
torch     : 2.6.0+cu124

Compiler    : GCC 11.4.0
OS          : Linux
Release     : 6.1.85+
Machine     : x86_64
Processor   : x86_64
CPU cores   : 2
Architecture: 64bit

