# Agent Lab Draft 3

## Preliminary

### Packages to install

- ucimlrepo (for grabbing datasets)
- transformers
- autogen
- jupyter (if you aren't on the latest version, there's a dependency in tqdm that complains)

#### Install with pip

- pip install ucimlrepo transformers autogen jupyter

### From lab 3

- pandas, numpy, matplotlib.pyplot, seaborn, tqdm, torch, sklearn

#### Install everything with pip

- pip install ucimlrepo transformers autogen pandas numpy matplotlib seaborn tqdm torch scikit-learn jupyter

In [None]:
# If you need to install stuff on colab
#!pip install ucimlrepo transformers autogen pandas numpy matplotlib seaborn tqdm torch scikit-learn jupyter

### Intro

The goal of this lab is to give you an idea of how you could use agents to help with physics tasks. It also introduces AutoGen, currently one of the more popular frameworks for designing custom agentic workflows. We won't use all the tools it provides, just the very basics. Note that many of the most interesting things one can do with AI-powered agents (see topics like retrieval augmented generation (RAG)) require very large models to perform well, and many techniques require continuous or repeated training. This means large resource requirements, so for this lab we will use a very small Llama-style LLM (TinyLlama-1.1B-Chat-v1.0). Its results are nowhere near as good as those from something like ChatGPT, but it should give you an idea of how a more advanced model (or models) might be able to do something really helpful/cool. This is an area of current research, so it will be interesting to see what they can do!

Also, because of the limited size of the model, the text parsing needs to be very mechanical, and in some places a bit clunky. The better your model is (ideally one specially trained for an agentic workflow, see RAG), the more this can be relaxed. If you're interested in working with agents in a more user-friendly way that hides a lot of the mechanics on display here, check out tools like n8n or Google Gemini (be aware that these can require you to grant a lot of permissions). Feel free to replace TinyLlama with a call to a larger model through an API if you have keys for one.

Finally, we would like to emphasize the use of Copilot or similar tools for this lab in particular. These are agents too, and they are definitely the most capable agents you have easy access to.
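
The mechanical text parsing mentioned above boils down to matching a command keyword at the start of a message and splitting off its arguments. Here is a minimal sketch of that idea. The `parse_command` helper and the `COMMANDS` table are illustrative names, not part of AutoGen; the command words mirror the ones used later in this lab.

```python
# Hypothetical helper: map a raw chat message onto (command, arguments).
COMMANDS = ("get dataset", "query", "train")

def parse_command(message):
    """Return (command, args) for a known command, else (None, [message])."""
    text = message.strip()
    lowered = text.lower()
    for command in COMMANDS:
        # Require the command to be a whole word (or words) at the start
        if lowered == command or lowered.startswith(command + " "):
            args = text[len(command):].split()
            return command, args
    # Anything else is treated as free-form chat for the LLM
    return None, [text]

print(parse_command("get dataset mnist"))
print(parse_command("train solar_flare conv"))
print(parse_command("What is a solar flare?"))
```

A stronger model could replace this table lookup by extracting the intent and arguments from free-form text.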


## Lab

- Do everything with mnist, then try adding solar_flare
    - This is where agents can be helpful, since they can analyze a dataset you've never seen before and take a first crack at it much faster than you can

## Imports

In [None]:
# Import necessary packages

%matplotlib inline

from autogen import ConversableAgent, AssistantAgent
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline


import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import tqdm
import torch
from torch.utils.data import DataLoader, TensorDataset
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from ucimlrepo import fetch_ucirepo

### Lab 3 code (minimize this)

- Just packaged into functions

In [None]:
# Load MNIST Dataset in Numpy
def load_mnist_data():
    # 1000 training samples where each sample feature is a greyscale image with shape (28, 28)
    # 1000 training targets where each target is an integer indicating the true digit
    mnist_train_features = np.load('mnist_train_features.npy') 
    mnist_train_targets = np.load('mnist_train_targets.npy')

    # 100 testing samples + targets
    mnist_test_features = np.load('mnist_test_features.npy')
    mnist_test_targets = np.load('mnist_test_targets.npy')

    # Print the dimensions of training sample features/targets
    #print(mnist_train_features.shape, mnist_train_targets.shape)
    # Print the dimensions of testing sample features/targets
    #print(mnist_test_features.shape, mnist_test_targets.shape)
    
    return mnist_train_features, mnist_train_targets, mnist_test_features, mnist_test_targets


def flatten_features(features):
    # Flatten the features from (28, 28) to (784,)
    return features.reshape(features.shape[0], -1)

def scale_features(features):
    scaler = StandardScaler()
    return scaler.fit_transform(features)

# More general function to load datasets, including solar_flare
def load_dataset(dataset_name: str = ""):
    if dataset_name == "mnist":
        train_features, train_targets, test_features, test_targets = load_mnist_data()
        train_features = flatten_features(train_features)
        test_features = flatten_features(test_features)
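        # Fit the scaler on the training data only and reuse it for the test data,
        # so both sets are standardised with the same mean and variance
        scaler = StandardScaler().fit(train_features)
        train_features = scaler.transform(train_features)
        test_features = scaler.transform(test_features)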
        
    elif dataset_name == "solar_flare":
        # Load the solar flare dataset
        solar_flare = fetch_ucirepo(id=89)
        
        # Simplifying slightly for the sake of this example
        solar_flare.data.targets = solar_flare.data.targets['severe flares']
        
        # One-hot encode modified Zurich class, largest spot size, spot distribution.
        # Encoding before the split guarantees identical columns in train and test,
        # even if a category happens to be missing from one of them.
        onehot_columns = ["modified Zurich class", "largest spot size", "spot distribution"]
        features = pd.get_dummies(solar_flare.data.features, columns=onehot_columns, dtype=float)
        
        # Split the solar flare dataset into train and test sets (90:10 split)
        train_features, test_features, train_targets, test_targets = train_test_split(
            features, solar_flare.data.targets, test_size=0.1, random_state=42
        )
        
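        # Scale the features: fit on the training data only, then apply the
        # same transformation to the test data
        scaler = StandardScaler().fit(train_features)
        train_features = scaler.transform(train_features)
        test_features = scaler.transform(test_features)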
        
        # Convert targets to numpy arrays
        train_targets = train_targets.to_numpy()
        test_targets = test_targets.to_numpy()
        
    else:
        raise ValueError(f"Unknown dataset: {dataset_name}")
        
    # Train-validation split (80:20) of the training data
    train_features, val_features, train_targets, val_targets = train_test_split(train_features, train_targets, test_size=0.2)
    
    return train_features, train_targets, val_features, val_targets, test_features, test_targets


# Train
def train_model(model, train_features, train_targets, validation_features, validation_targets, 
                test_features=None, test_targets=None, learning_rate=0.0015, epochs=80, batch_size=64):
    """
    Train a neural network model on the provided data.
    
    Parameters:
        model: PyTorch model to train
        train_features: Training features as numpy array
        train_targets: Training targets as numpy array
        validation_features: Validation features as numpy array
        validation_targets: Validation targets as numpy array
        test_features: Test features as numpy array (optional)
        test_targets: Test targets as numpy array (optional)
        learning_rate: Learning rate for optimizer
        epochs: Number of training epochs
        batch_size: Batch size for training
        
    Returns:
        tuple: (trained model, training loss list, validation accuracy list, test accuracy or None)
    """
    # Initialize tracking lists
    train_loss_list = np.zeros(epochs)
    validation_accuracy_list = np.zeros(epochs)
    
    # Convert numpy arrays to PyTorch tensors
    train_inputs = torch.from_numpy(train_features).float()
    train_targets = torch.from_numpy(train_targets).long()
    
    validation_inputs = torch.from_numpy(validation_features).float()
    validation_targets = torch.from_numpy(validation_targets).long()
    
    if test_features is not None and test_targets is not None:
        test_inputs = torch.from_numpy(test_features).float()
        test_targets = torch.from_numpy(test_targets).long()
        test_dataset = TensorDataset(test_inputs, test_targets)
        test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)
    
    # Create dataloaders
    train_dataset = TensorDataset(train_inputs, train_targets)
    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
    validation_dataset = TensorDataset(validation_inputs, validation_targets)
    validation_loader = DataLoader(validation_dataset, batch_size=batch_size, shuffle=False)
    
    # Setup optimizer and scheduler
    loss_func = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
    
    # Move model to GPU if available (the batches are moved inside the loops below)
    if torch.cuda.is_available():
        model = model.cuda()
    
    # Training Loop
    for epoch in tqdm.trange(epochs):
        model.train()  # Set model to training mode
        running_loss = 0.0
        
        for batch_inputs, batch_targets in train_loader:
            if torch.cuda.is_available():
                batch_inputs, batch_targets = batch_inputs.cuda(), batch_targets.cuda()
            
            optimizer.zero_grad()  # Reset gradients to zero
            outputs = model(batch_inputs)  # Forward pass with current batch
            loss = loss_func(outputs, batch_targets)  # Compute loss
            loss.backward()  # Backward pass
            optimizer.step()  # Update weights
            
            running_loss += loss.item() * batch_inputs.size(0)
        
        # Store average epoch loss
        train_loss_list[epoch] = running_loss / len(train_dataset)
        scheduler.step()  # Update learning rate with cosine annealing
        
        # Compute Validation Accuracy
        model.eval()  # Set model to evaluation mode
        with torch.no_grad():
            correct = 0
            total = 0
            for val_inputs, val_targets in validation_loader:
                if torch.cuda.is_available():
                    val_inputs, val_targets = val_inputs.cuda(), val_targets.cuda()
                outputs = model(val_inputs)
                _, predicted = torch.max(outputs.data, 1)
                total += val_targets.size(0)
                correct += (predicted == val_targets).sum().item()
            
            validation_accuracy_list[epoch] = correct / total
    
    # Compute test accuracy if test data is provided
    test_accuracy = None
    if test_features is not None and test_targets is not None:
        model.eval()
        with torch.no_grad():
            correct = 0
            total = 0
            for test_inputs, test_targets in test_loader:
                if torch.cuda.is_available():
                    test_inputs, test_targets = test_inputs.cuda(), test_targets.cuda()
                outputs = model(test_inputs)
                _, predicted = torch.max(outputs.data, 1)
                total += test_targets.size(0)
                correct += (predicted == test_targets).sum().item()
            
            test_accuracy = correct / total
    
    return model, train_loss_list, validation_accuracy_list, test_accuracy


# Visualize and evaluate
def visualize_training(train_loss_list, validation_accuracy_list):
    """
    Visualize training loss and validation accuracy.
    
    Parameters:
        train_loss_list: List of training losses
        validation_accuracy_list: List of validation accuracies
    """
    plt.figure(figsize = (12, 6))

    # Visualize the average training loss per epoch
    plt.subplot(2, 1, 1)
    plt.plot(train_loss_list, linewidth = 3)
    plt.ylabel("training loss")
    plt.xlabel("epochs")
    sns.despine()

    # Visualize validation accuracy with respect to epochs
    plt.subplot(2, 1, 2)
    plt.plot(validation_accuracy_list, linewidth = 3, color = 'gold')
    plt.ylabel("validation accuracy")
    sns.despine()
    
    plt.show()
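
One preprocessing detail is worth spelling out: the standardisation statistics should come from the training data only and then be reused for the validation and test data, so that every split is shifted and scaled identically. A minimal NumPy sketch of that idea follows. The `fit_scaler` and `apply_scaler` names are illustrative; `StandardScaler.fit` followed by `transform` does the same job.

```python
import numpy as np

def fit_scaler(train):
    """Compute per-column mean and standard deviation from the training data."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std[std == 0] = 1.0  # constant columns (e.g. always-black pixels) are left unscaled
    return mean, std

def apply_scaler(data, mean, std):
    """Standardise any split with statistics computed on the training split."""
    return (data - mean) / std

# Toy data: the second column is constant in the training set
train = np.array([[0.0, 10.0], [2.0, 10.0], [4.0, 10.0]])
test = np.array([[2.0, 10.0], [6.0, 12.0]])

mean, std = fit_scaler(train)
train_scaled = apply_scaler(train, mean, std)
test_scaled = apply_scaler(test, mean, std)

print(train_scaled.mean(axis=0))  # each training column is centred on zero
print(test_scaled)
```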



## Define Model(s)

#### Possible tasks
- Make the convolutional model with copilot
    - Or both, even
- Replace calls for the self.description property with a request for the main agent to describe the model

In [2]:
class mnistClassification(torch.nn.Module):
    
    def __init__(self, input_dim, output_dim): # Feel free to add hidden_dim as parameters here
        super(mnistClassification, self).__init__()
        
        self.description = "Simple FCN with 2 layers, ReLU activation, and dropout"
        
        #self.layerconv1 = torch.nn.Conv1d(in_channels=1, out_channels=1, kernel_size=5, stride=2, padding=2)
        self.layer1 = torch.nn.Linear(input_dim, 200) # First layer
        self.layer2 = torch.nn.Linear(200, output_dim) # Second layer
        self.relu = torch.nn.ReLU()
        self.dropout = torch.nn.Dropout(p=0.2) # Dropout layer with 20% dropout rate
        
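    def forward(self, x):
        
        x = self.layer1(x)
        x = self.relu(x)
        x = self.dropout(x)  # Dropout on the hidden layer, not on the outputs
        x = self.layer2(x)
        
        # Return raw logits: CrossEntropyLoss applies log-softmax internally,
        # so a softmax here would be applied twice
        return x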
    
    
# Uses convolutional layers
class mnistConvClassification(torch.nn.Module):
    
    def __init__(self, input_dim, output_dim, input_channels,):
        super(mnistConvClassification, self).__init__()
        
        self.description = "Complex CNN with 2 convolutional layers, 2 linear layers, ReLU activation, and dropout"
        
        self.input_dim = input_dim
        self.input_channels = input_channels
        self.height = int((self.input_dim / input_channels) ** 0.5)  # Infer height and width from input_dim
        self.width = self.height
        
        # Convolutional layers
        self.conv1 = torch.nn.Conv2d(in_channels=input_channels, out_channels=32, kernel_size=3, stride=1, padding=1)
        self.conv2 = torch.nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, stride=1, padding=1)
        self.pool = torch.nn.MaxPool2d(kernel_size=2, stride=2)
        
        # Fully connected layers
        self.fc1 = torch.nn.Linear(64 * 7 * 7, 128)  # Assuming input images are 28x28
        self.fc2 = torch.nn.Linear(128, output_dim)
        
        # Activation and dropout
        self.relu = torch.nn.ReLU()
        self.dropout = torch.nn.Dropout(p=0.3)
        
    def forward(self, x):
        
        # Reshape the input to match the expected dimensions for Conv2d
        if x.dim() != 4:
            x = x.view(-1, self.input_channels, self.height, self.width)
        
        # Convolutional layers with ReLU and pooling
        x = self.conv1(x)
        x = self.relu(x)
        x = self.pool(x)
        
        x = self.conv2(x)
        x = self.relu(x)
        x = self.pool(x)
        
        # Flatten the output for the fully connected layers
        x = x.view(x.size(0), -1)
        
        # Fully connected layers with ReLU and dropout
        x = self.fc1(x)
        x = self.relu(x)
        x = self.dropout(x)
        x = self.fc2(x)
        
        # Return raw logits: CrossEntropyLoss applies log-softmax internally
        return x
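
The `64 * 7 * 7` input size of `fc1` follows from how each layer changes the image size. As a rough check, the standard output-size formula for convolution and pooling layers, floor((n + 2p - k) / s) + 1, can be evaluated with a small helper. `out_size` is an illustrative name, not a PyTorch function.

```python
def out_size(n, kernel, stride, padding=0):
    """Spatial output size of a conv or pooling layer (no dilation)."""
    return (n + 2 * padding - kernel) // stride + 1

size = 28                                              # MNIST images are 28x28
size = out_size(size, kernel=3, stride=1, padding=1)   # conv1 keeps 28
size = out_size(size, kernel=2, stride=2)              # pool halves to 14
size = out_size(size, kernel=3, stride=1, padding=1)   # conv2 keeps 14
size = out_size(size, kernel=2, stride=2)              # pool halves to 7

flattened = 64 * size * size  # 64 output channels from conv2
print(size, flattened)
```

For inputs that are not 28x28, the same helper gives the size that `fc1` would need instead.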

## Agents

### DatasetLoader

In [None]:
class DatasetLoaderAgent(AssistantAgent):
    """
    An agent that specializes in loading and preprocessing datasets.
    """
    def __init__(self, name="DatasetLoader", **kwargs):
        self.available_datasets = ["mnist", "solar_flare"]
        system_message = (
            "I am a dataset loading assistant. I can load and preprocess various datasets for machine learning tasks. "
            f"Currently I support: {', '.join(self.available_datasets)}. "
        )
        super().__init__(name=name, system_message=system_message, **kwargs)
        
        # Store loaded datasets
        self._loaded_datasets = {}
    
    def get_available_datasets(self):
        """Return a list of available datasets."""
        return self.available_datasets
    
    def load_dataset(self, dataset_name):
        """
        Load and preprocess a dataset.
        
        Args:
            dataset_name (str): Name of the dataset to load
            
        Returns:
            dict: Information about the loaded dataset and the data itself
        """
        try:
            # Load the dataset using the existing function
            train_features, train_targets, val_features, val_targets, test_features, test_targets = load_dataset(dataset_name)
            
            # Store the dataset
            self._loaded_datasets[dataset_name] = {
                "train_features": train_features,
                "train_targets": train_targets,
                "validation_features": val_features,
                "validation_targets": val_targets,
                "test_features": test_features,
                "test_targets": test_targets,
            }
            
            # Return information about the loaded dataset
            return {
                "status": "success",
                "dataset_name": dataset_name,
                "train_samples": train_features.shape[0],
                "validation_samples": val_features.shape[0],
                "test_samples": test_features.shape[0],
                "feature_dim": train_features.shape[1]
            }
        except Exception as e:
            return {
                "status": "error",
                "message": f"Failed to load dataset '{dataset_name}': {str(e)}"
            }
    
    def get_dataset(self, dataset_name):
        """
        Retrieve a previously loaded dataset.
        
        Args:
            dataset_name (str): Name of the dataset to retrieve
            
        Returns:
            dict: The dataset components or None if not found
        """
        return self._loaded_datasets.get(dataset_name)

### InterfaceAgent

In [4]:
# Load the model
model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32)
model.to("cuda" if torch.cuda.is_available() else "cpu")

# Create a simple text-generation pipeline
llm_pipeline = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.4,
)

# Wrapper function for the local model pipeline
def local_model_generate(prompt):
    output = llm_pipeline(prompt)[0]["generated_text"]
    return output

Device set to use cpu
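
By default the text-generation pipeline returns the prompt followed by the continuation, which is why the prompts are echoed in the outputs further down. If only the newly generated text is wanted, the prompt can be cut off the front. Here is a minimal sketch with a hypothetical `strip_prompt` helper. The `fake_generate` function is a stand-in for `local_model_generate`, so the example runs without downloading the model.

```python
def strip_prompt(prompt, generated_text):
    """Return only the continuation if the output starts with the prompt."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].lstrip()
    return generated_text

def fake_generate(prompt):
    # Stand-in for local_model_generate: echoes the prompt, then "answers"
    return prompt + " A table of sunspot properties."

prompt = "Short, concise description of the dataset:"
full_output = fake_generate(prompt)
print(strip_prompt(prompt, full_output))
```

The pipeline call also accepts `return_full_text=False`, which should have the same effect without any post-processing.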


#### Possible Tasks

- Implementing query
- Implementing having the agents converse to decide on the model type

In [5]:
# Agent definition
class InterfaceAgent(ConversableAgent):
    def __init__(self, name, **kwargs):
        super().__init__(name, **kwargs)
        
        self.dataset_loader_agent = None
        self.model_trainer_agent = None

    def set_dataset_loader_agent(self, agent):
        self.dataset_loader_agent = agent
        
    def set_model_trainer_agent(self, agent):
        self.model_trainer_agent = agent
    
    def generate_reply(self, messages):
        """
        There is a lot of clunky text parsing here. I would describe the code as "technically functional".
        This is because the LLM is so limited. With a better LLM, ideally one trained explicitly for
        agentic use, this would be much cleaner and more flexible.
        Cf. retrieval augmented generation, etc.
        """
        # Extract the latest user message
        user_message = messages[-1]["content"]
        
        # Check if the message is a "get dataset" command
        if "get dataset" in user_message:
            dataset_name = user_message.replace("get dataset", "").strip()
            
            # Use the DatasetLoaderAgent to load the dataset
            result = self.dataset_loader_agent.load_dataset(dataset_name)

            # Check if the dataset was successfully loaded
            if result["status"] == "success":
                response_text = (
                    f"Successfully loaded dataset '{result['dataset_name']}'.\n"
                    f"Training samples: {result['train_samples']}, "
                    f"Validation samples: {result['validation_samples']}, "
                    f"Test samples: {result['test_samples']}, "
                    f"Feature dimension: {result['feature_dim']}."
                )
            else:
                response_text = f"Failed to load dataset '{dataset_name}': {result['message']}"
        
        elif "query" in user_message:
            query = user_message.replace("query", "").strip()
            
            # Use the DatasetLoaderAgent to retrieve the dataset
            # Assuming the query is in the format "query <dataset_name>"
            # In principle, this is where you could have an LLM parse the command
            dataset_name = query.split()[0]
            dataset = self.dataset_loader_agent.get_dataset(dataset_name)
    
            if dataset:
                # Prepare a prompt with dataset information
                prompt = (
                    f"The dataset '{dataset_name}' has been loaded. "
                    f"The first two rows of the training features are:\n"
                    f"{dataset['train_features'][:2]}.\n"
                    "Short, concise description of the dataset:\n"
                )
                # Generate a description using the local model
                response_text = local_model_generate(prompt)
            else:
                response_text = f"Dataset '{dataset_name}' is not loaded or does not exist."
                
            
        elif "train" in user_message:
            # Should be of the form "train <dataset_name> <model_type>"
            # Again, this is where a specialized LLM would be really useful
            parts = user_message.split()
            dataset_name, modeltype = parts[1], parts[2]
            
            # Use the ModelTrainerAgent to train the model
            if self.model_trainer_agent is not None:
                # If the user specified an available model type, use that
                if modeltype in self.model_trainer_agent.available_models:
                    result = self.model_trainer_agent.train_model_on_query(dataset_name, "placeholder", override=modeltype)
                    
                # Otherwise, use the dataset description to decide on the model type
                else:
                    message = [{"role": "assistant", "content": f"query {dataset_name}"}]
                    response = self.generate_reply(message)
                    dataset_description = response["content"]
                    # Use the dataset description to decide on the model type
                    result = self.model_trainer_agent.train_model_on_query(dataset_name, dataset_description)
                
                if result["status"] == "success":
                    response_text = f"Successfully trained the model '{modeltype}'."
                else:
                    response_text = f"Failed to train the model '{modeltype}': {result['response']}"
            else:
                response_text = "Model trainer agent is not set."
        
        # No special command, just a normal message. Can just use the model as a chatbot
        else:
            response_text = local_model_generate(user_message)
        
        # Return the response in the Autogen format
        return {"role": "assistant", "content": response_text}

## Talk to the datasetloader

#### Possible tasks
- All of the below

In [6]:
# Create instances of the agents
dataset_loader_agent = DatasetLoaderAgent(name="DatasetLoader")
local_agent = InterfaceAgent(name="LocalAgent")
local_agent.set_dataset_loader_agent(dataset_loader_agent)

# Simulate a conversation
messages = [{"role": "user", "content": "get dataset solar_flare"}]

# InterfaceAgent processes the first message
response = local_agent.generate_reply(messages)
print(response["content"])
print()

# Add a second message to the conversation
messages.append({"role": "user", "content": "query solar_flare"})

# InterfaceAgent processes the second message
response = local_agent.generate_reply(messages)
print(response["content"])
print()

# Note that the output is not reliable at all since the model is tiny. Sometimes it's surprisingly good though

Successfully loaded dataset 'solar_flare'.
Training samples: 1000, Validation samples: 250, Test samples: 139, Feature dimension: 23.

The dataset 'solar_flare' has been loaded. The first two rows of the training features are:
[[-0.42337369  0.9473887  -0.22049738  1.2453997   0.34940318 -0.16721401
   2.1697379  -0.42599822  1.93257561 -0.5471529  -0.29966561 -0.1910085
  -0.63475828  2.         -0.16721401 -0.26840486 -0.48495141 -0.75521637
  -0.42337369 -0.1954635   1.86125917 -0.90819465 -0.63475828]
 [ 2.36197954 -0.6860401  -0.22049738 -0.80295507  0.34940318 -0.16721401
  -0.46088516 -0.42599822  1.93257561 -0.5471529  -0.29966561 -0.1910085
  -0.63475828 -0.5        -0.16721401 -0.26840486 -0.48495141  1.32412385
  -0.42337369 -0.1954635  -0.53727069  1.10108555 -0.63475828]].
Short, concise description of the dataset:
This dataset contains solar flare data from the Solar Dynamics Observatory (SDO) in the form of a pandas dataframe. The data includes the following columns:
- '

## Model Trainer

#### Possible Tasks
- Implementing everything in train_model_on_query up to setting the dataset

In [None]:
class ModelTrainerAgent(ConversableAgent):
    """
    An agent that selects, creates, and trains a classification model based on the user's query.
    """
    def __init__(self, name="ModelTrainer", dataset_loader_agent=None, **kwargs):
        super().__init__(name=name, **kwargs)
        self.dataset_loader_agent = dataset_loader_agent
        
        self.model, self.train_loss_list, self.val_accuracy_list, self.test_accuracy = None, None, None, None
        
        self.available_models = ["linear", "conv"]
        

    def train_model_on_query(self, dataset_name, query, override=None):
        """
        Parse the query, select the model, and train it on the dataset.
        """
        try:
            response = local_model_generate(f"Based on the following dataset description, recommend a model type (linear or convolutional): \n{query}")
            
            print("Analyzing response:", response)
            
            # Parse the response to extract the recommended model type
            # With a more specialized model, this could be done more easily and reliably
            # The words are scanned in reverse because the tiny model often just does
            # what it wants at the start, so matches near the end are preferred
            parts = reversed(response.split())
            
            model_type = "linear"  # Default model type
            
            linear_keywords = ["linear", "fcn"]
            conv_keywords = ["conv", "convolutional", "cnn", "vision", "image", "picture", "pixel", "pictures", "images"]
            for word in parts:
                if word.lower() in conv_keywords:
                    model_type = "conv"
                    break
                elif word.lower() in linear_keywords:
                    model_type = "linear"
                    break

            # Retrieve the dataset from the DatasetLoaderAgent
            dataset = self.dataset_loader_agent.get_dataset(dataset_name)
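            if not dataset:
                # Return the same dict shape as the other exits so callers can check result["status"]
                return {"status": "error", "response": f"Dataset '{dataset_name}' is not loaded or does not exist."}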

            # Extract dataset components
            train_features = dataset["train_features"]
            train_targets = dataset["train_targets"]
            val_features = dataset["validation_features"]
            val_targets = dataset["validation_targets"]
            test_features = dataset["test_features"]
            test_targets = dataset["test_targets"]

            # Select the model based on the model type
            if len(train_features.shape) == 3:
                indim = train_features.shape[1]*train_features.shape[2]
            else:
                indim = train_features.shape[1]
            
            # An explicitly requested model type takes precedence over the LLM's recommendation
            if override in self.available_models:
                model_type = override
            
            # Training printing
            print(f"Training model of type '{model_type}' on dataset '{dataset_name}' with input dimension {indim}.")
            print(f"Training features shape: {train_features.shape}")
            print(f"Training targets shape: {train_targets.shape}")
            
            if model_type == "conv":
                model = mnistConvClassification(input_dim=indim, input_channels=1, output_dim=10)
            else:  # "linear", which is also the default
                model = mnistClassification(input_dim=indim, output_dim=10)

            # Train the model
            self.model, self.train_loss_list, self.val_accuracy_list, self.test_accuracy = train_model(
                model, train_features, train_targets, val_features, val_targets,
                test_features=test_features, test_targets=test_targets
            )

            # Summarize the training results
            response = (
                f"Model '{model_type}' trained successfully on dataset '{dataset_name}'.\n"
                f"Final validation accuracy: {self.val_accuracy_list[-1]:.4f}\n"
                f"Test accuracy: {self.test_accuracy:.4f}"
            )
            print("Training response:", response)
            
            result = {
                "role": "assistant",
                "status": "success",
                "model": self.model,
                "train_loss_list": self.train_loss_list,
                "val_accuracy_list": self.val_accuracy_list,
                "test_accuracy": self.test_accuracy,
                "response": response
            }
            
            return result
        
        except Exception as e:
            return {"status": "error", "response": f"An error occurred during training: {str(e)}"}
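
The keyword scan in `train_model_on_query` can be tried out on its own, without the LLM. Here is a minimal sketch of the same idea, with `pick_model_type` as an illustrative name. Words are checked from the end of the response backwards, and the first keyword found decides the model type.

```python
LINEAR_KEYWORDS = {"linear", "fcn"}
CONV_KEYWORDS = {"conv", "convolutional", "cnn", "vision", "image",
                 "picture", "pixel", "pictures", "images"}

def pick_model_type(response, default="linear"):
    """Scan the response from its last word backwards and return the first match."""
    for word in reversed(response.split()):
        word = word.lower()
        if word in CONV_KEYWORDS:
            return "conv"
        if word in LINEAR_KEYWORDS:
            return "linear"
    # No keyword anywhere in the response
    return default

print(pick_model_type("MNIST is a widely used dataset for image classification"))
print(pick_model_type("The dataset contains 2000 solar"))
```

The match is exact, so a word with attached punctuation such as `image.` is not recognised. That is one of the places where a stronger model, or more careful parsing, would help.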

#### Possible Tasks
- All of the below

In [8]:
# Create instances of the agents
dataset_loader_agent = DatasetLoaderAgent(name="DatasetLoader")
model_trainer_agent = ModelTrainerAgent(name="ModelTrainer", dataset_loader_agent=dataset_loader_agent)
local_agent = InterfaceAgent(name="LocalAgent")

local_agent.set_dataset_loader_agent(dataset_loader_agent)
local_agent.set_model_trainer_agent(model_trainer_agent)

# Simulate a conversation
messages = [{"role": "user", "content": "get dataset solar_flare"}]
response = local_agent.generate_reply(messages)
print(response["content"])
print()

# Add a training command
# "random" just to force the agents to talk to each other
messages.append({"role": "user", "content": "train solar_flare random"})
response = local_agent.generate_reply(messages)
print(response["content"])
print()

# And you now have a trained model!
print(model_trainer_agent.model)

Successfully loaded dataset 'solar_flare'.
Training samples: 1000, Validation samples: 250, Test samples: 139, Feature dimension: 23.

Analyzing response: Based on the following dataset description, recommend a model type (linear or convolutional): 
The dataset 'solar_flare' has been loaded. The first two rows of the training features are:
[[-0.42337369 -0.6860401  -0.22049738 -0.80295507  0.34940318 -0.16721401
  -0.46088516 -0.42599822 -0.51744418  1.82764268 -0.29966561 -0.1910085
  -0.63475828  2.         -0.16721401 -0.26840486 -0.48495141 -0.75521637
  -0.42337369 -0.1954635  -0.53727069  1.10108555 -0.63475828]
 [-0.42337369  0.9473887  -0.22049738 -0.80295507 -2.86202314 -0.16721401
   2.1697379  -0.42599822 -0.51744418 -0.5471529  -0.29966561 -0.1910085
   1.57540285 -0.5        -0.16721401 -0.26840486 -0.48495141  1.32412385
  -0.42337369 -0.1954635  -0.53727069 -0.90819465  1.57540285]].
Short, concise description of the dataset:

The dataset'solar_flare' contains 2000 solar

100%|██████████| 80/80 [00:00<00:00, 93.15it/s]

Training response: Model 'linear' trained successfully on dataset 'solar_flare'.
Final validation accuracy: 0.9800
Test accuracy: 0.9928
Successfully trained the model 'random'.

mnistClassification(
  (layer1): Linear(in_features=23, out_features=200, bias=True)
  (layer2): Linear(in_features=200, out_features=10, bias=True)
  (relu): ReLU()
  (dropout): Dropout(p=0.2, inplace=False)
)





In [None]:
messages = [{"role": "user", "content": "get dataset mnist"}]
response = local_agent.generate_reply(messages)
print(response["content"])
print()

# Should choose the conv model; usually does
messages.append({"role": "user", "content": "train mnist random"})
response = local_agent.generate_reply(messages)
print(response["content"])
print()

Successfully loaded dataset 'mnist'.
Training samples: 800, Validation samples: 200, Test samples: 100, Feature dimension: 784.

Analyzing response: Based on the following dataset description, recommend a model type (linear or convolutional): 
The dataset 'mnist' has been loaded. The first two rows of the training features are:
[[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]].
Short, concise description of the dataset:

MNIST dataset is a widely used dataset for image classification. It is a subset of the MNIST database, which is a collection of handwritten digits. The dataset contains 60,000 labeled images of size 28x28 pixels and 10 classes (0-9 digits). The training set contains 50,000 labeled images, while the test set contains 10,000 unlabeled images. The dataset is used in many deep learning research papers and is a popular benchmark for image classification tasks. The dataset is available in the Keras library and can be loaded using the `load_data()` function. The first two ro

100%|██████████| 80/80 [00:11<00:00,  7.14it/s]

Training response: Model 'conv' trained successfully on dataset 'mnist'.
Final validation accuracy: 0.9300
Test accuracy: 0.9700
Successfully trained the model 'random'.




