# Exploring the Capabilities of LLMs

In this notebook, I aim to evaluate and compare the capabilities of two large language models (LLMs):

1. **Codestral22B**
   A state-of-the-art model designed for advanced code generation and natural language understanding tasks.

2. **Llama 3.1-8B**
   A compact, efficient 8-billion-parameter model optimized for general-purpose language tasks.

The goal is to analyze their performance across various tasks, including but not limited to:

- Code generation and completion
- Natural language understanding
- Contextual reasoning
- Problem-solving capabilities

This comparison will help identify the strengths and weaknesses of each model and provide insights into their practical applications.

# AI-Powered Programming Tutor with RAG

This project focuses on building an AI-powered programming tutor designed to assist students in understanding code and solving problems. The tutor leverages **Retrieval-Augmented Generation (RAG)** to provide accurate and personalized explanations grounded in real university materials, such as:

- Past assignments
- Lecture notes
- Tutorials
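
As a rough illustration of the retrieval step, the sketch below ranks a few course snippets against a student question with bag-of-words cosine similarity, then places the best matches in the system prompt. The function names (`tokenize`, `retrieve`, `build_messages`), the sample snippets, and the scoring method are all assumptions made for illustration; the actual tutor may use embeddings and a vector store instead.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split into alphanumeric tokens.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two token-count vectors (Counters).
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    # Return the k documents most similar to the query.
    q = Counter(tokenize(query))
    ranked = sorted(documents,
                    key=lambda d: cosine(q, Counter(tokenize(d))),
                    reverse=True)
    return ranked[:k]


def build_messages(query, documents, k=2):
    # Ground the tutor's answer in the retrieved course material.
    context = "\n\n".join(retrieve(query, documents, k))
    system = ("You are a helpful programming tutor. "
              "Answer using the course material below.\n\n" + context)
    return [{"role": "system", "content": system},
            {"role": "user", "content": query}]


# Hypothetical course snippets, for illustration only.
course_docs = [
    "Lecture 3: recursion and the call stack in Python",
    "Assignment 2: implement binary search on a sorted list",
    "Tutorial 1: installing the toolchain",
]
messages = build_messages("How does binary search work?", course_docs, k=1)
```

The resulting `messages` list has the same shape as the chat payload used later in this notebook, so it could be sent to either model unchanged.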

The system integrates two large language models (LLMs), **Codestral22B** and **Llama 3.1-8B**, to evaluate their performance in generating solutions and explanations for programming-related queries. The goal is to determine which model provides better support for students in a university setting.

---

## Key Features

- **Personalized Explanations**: Tailored responses based on retrieved university materials.
- **Code Understanding**: Helps students debug and understand code snippets.
- **Problem Solving**: Provides step-by-step solutions to programming problems.
- **Model Comparison**: Evaluates the performance of Codestral22B and Llama 3.1-8B.

---



### Implementing and Testing Codestral22B and Llama 3.1-8B

In [6]:
import json
from datetime import datetime
from os.path import exists
from pprint import pprint

import requests

# API endpoint exposed by LM Studio
url = "http://localhost:1234/v1/chat/completions"

# Model ID
model_id = "meta-llama-3.1-8b-instruct"

headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer lm-studio",  # Dummy API key
}

# messages: [ #keep conversation history
#                 {"role":"user", #what you type, only sends current prompt
#                  "content":user_input}
#             ]

# Keep the message history

# Path of the JSON file used to persist conversations
history_file = "chat-history.json"


def save_history(messages):
    # Load existing history if the file exists
    if exists(history_file):
        with open(history_file, "r", encoding="utf-8") as f:
            full_history = json.load(f)
        # Wrap a single stored object in a list so sessions can be appended
        if not isinstance(full_history, list):
            full_history = [full_history]
    else:
        full_history = []

    # Add this session with timestamp
    full_history.append({
        "timestamp": datetime.now().isoformat(),
        "conversation": messages
    })

    # Save the full conversation list
    with open(history_file, "w", encoding="utf-8") as f:
        json.dump(full_history, f, indent=4, ensure_ascii=False)


# Prompt loop
def chat():
    print(" Talk to LLaMA 3.1 (type 'exit' to quit)\n")
    # The message list starts fresh on every call to chat()
    messages = [
        {"role": "system",  # Sets the initial behavior via the text below
         "content": "You are a helpful programming tutor."}
    ]

    while True:
        user_input = input(" You: ")
        if user_input.lower() == "exit":
            #save chat history
            save_history(messages)
            print(f"\n Conversation saved to {history_file}")
            break

        # Add user message
        messages.append({"role": "user",
                         "content": user_input})

        payload = {
            "model": model_id,       # ID of the model to query
            "messages": messages,    # Full chat history, to preserve context
            "temperature": 0.7       # Controls creativity (sampling randomness)
        }
        print("Your question is: ")
        print(user_input)
        print("\n")

        try:  # Send the request to the LM Studio API
            response = requests.post(url, headers=headers, json=payload, timeout=60)

            if response.status_code == 200:
                data = response.json()
                answer = data['choices'][0]['message']['content'].strip()

                # Add assistant message
                messages.append({"role": "assistant", "content": answer})

                print("\n LLaMA:", flush=True)
                print(answer, flush=True)
                print("-" * 60 + "\n")

            else:
                print(f" Error {response.status_code}: {response.text}\n")

        except requests.exceptions.RequestException as e:
            print(" Connection error:", e)
            break
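
Since the goal is to compare the two models, one possible approach is to send the same prompt to each model ID and collect the answers side by side. The sketch below is an assumption about how that could look, not part of the original notebook: `compare_models` and `send_to_lm_studio` are hypothetical helper names, and the Codestral model ID is left as a placeholder because it depends on what LM Studio reports for the loaded model.

```python
def compare_models(prompt, model_ids, send,
                   system_prompt="You are a helpful programming tutor.",
                   temperature=0.7):
    """Send one prompt to several models and return {model_id: answer}.

    `send` is any callable that takes a payload dict and returns the
    decoded JSON response, which keeps this function testable offline.
    """
    results = {}
    for model_id in model_ids:
        payload = {
            "model": model_id,
            "messages": [
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": prompt},
            ],
            "temperature": temperature,
        }
        data = send(payload)
        results[model_id] = data["choices"][0]["message"]["content"].strip()
    return results


def send_to_lm_studio(payload,
                      url="http://localhost:1234/v1/chat/completions"):
    # Imported lazily so compare_models can be used without requests.
    import requests

    response = requests.post(
        url,
        headers={"Content-Type": "application/json",
                 "Authorization": "Bearer lm-studio"},  # Dummy API key
        json=payload,
        timeout=60,
    )
    response.raise_for_status()
    return response.json()


# Example call (requires LM Studio to be running with both models available):
# answers = compare_models(
#     "Explain binary search.",
#     ["meta-llama-3.1-8b-instruct", "<codestral-model-id>"],
#     send_to_lm_studio,
# )
```

Passing the transport in as `send` is a design choice made here so the comparison logic can be exercised with a stub before any model is loaded.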


In [7]:
chat()

 Talk to LLaMA 3.1 (type 'exit' to quit)

Your question is: 
Tell me something about methaeuristics



 LLaMA:
Metaheuristics are high-level algorithms that use lower-level heuristics to find good solutions to complex optimization problems. They're often used when the problem is too hard or too time-consuming for traditional methods.

Some key characteristics of metaheuristics include:

1. **Global Search**: Metaheuristics are designed to explore the entire solution space, rather than just a local neighborhood.
2. **Heuristic Search**: Metaheuristics use heuristics (e.g., proximity measures) to guide the search process.
3. **Randomization**: Many metaheuristics incorporate randomness to escape local optima and improve exploration.
4. **Iterative Improvement**: Metaheuristics often involve iteratively improving a solution through some form of transformation or mutation.

Common examples of metaheuristics include:

1. **Simulated Annealing (SA)**: A probabilistic algorithm that uses temp

# Step 1: Load the Datasets
## 1. OpenMathInstruct-1 (from Hugging Face)
- This dataset contains 1.8 million math problem-solution pairs, making it ideal for enhancing mathematical reasoning in LLMs.

In [8]:
# from datasets import load_dataset
# from IPython import get_ipython
# from IPython.display import display
#
# #Load training split
# dataset = load_dataset("nvidia/OpenMathInstruct-1", split="train")
#
# first_element = next(iter(dataset))
#
# print(first_element)

# Gave up on this one: the dataset is far too large to load here
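
If the size of the dataset is the only obstacle, one option is to stream it rather than download it in full. The sketch below assumes the `datasets` library's `streaming=True` mode, which yields records lazily; `first_n` and `peek_openmath` are hypothetical helper names introduced here, and nothing is fetched until `peek_openmath` is actually called.

```python
from itertools import islice


def first_n(records, n):
    """Return the first n items of any iterable without consuming the rest."""
    return list(islice(records, n))


def peek_openmath(n=3):
    # Imported lazily so first_n works even without the datasets package.
    from datasets import load_dataset

    # streaming=True returns an iterable dataset that fetches records on demand.
    stream = load_dataset("nvidia/OpenMathInstruct-1",
                          split="train", streaming=True)
    return first_n(stream, n)


# Example call (requires network access and the datasets package):
# for record in peek_openmath(3):
#     print(record)
```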

# Small & Clean Math Datasets
## 1. GSM8K
- Size: ~8.5K problems

- Focus: Grade school math word problems

- Good for: step-by-step reasoning, small LLM finetuning

In [9]:
from datasets import load_dataset
dataset = load_dataset("gsm8k", "main", split="train")
first_element = next(iter(dataset))

print(first_element)

{'question': 'Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?', 'answer': 'Natalia sold 48/2 = <<48/2=24>>24 clips in May.\nNatalia sold 48+24 = <<48+24=72>>72 clips altogether in April and May.\n#### 72'}
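
The printed record shows two conventions in the `answer` field: calculator annotations wrapped in `<<...>>`, and the final result after a `####` marker. A minimal sketch for separating them follows; `split_gsm8k_answer` is a hypothetical helper name, and it assumes every answer follows the layout seen in this one record.

```python
import re


def split_gsm8k_answer(answer):
    """Split a GSM8K answer into (clean reasoning text, final answer).

    Returns None as the final answer if no '####' marker is present.
    """
    if "####" in answer:
        reasoning, _, final = answer.rpartition("####")
        final = final.strip()
    else:
        reasoning, final = answer, None
    # Drop calculator annotations such as <<48/2=24>>.
    reasoning = re.sub(r"<<.*?>>", "", reasoning).strip()
    return reasoning, final


sample = ("Natalia sold 48/2 = <<48/2=24>>24 clips in May.\n"
          "Natalia sold 48+24 = <<48+24=72>>72 clips altogether in April and May.\n"
          "#### 72")
reasoning, final = split_gsm8k_answer(sample)
print(final)  # prints 72
```

Keeping the final answer separate makes it straightforward to score a model's output by exact match while still showing the reasoning steps to a student.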


# 2. Computer Science Theory QA Dataset (from Kaggle)
- This dataset offers a comprehensive collection of theoretical computer science questions, suitable for training chatbots and QA systems.

In [10]:
import pandas as pd
import json

with open("intents.json", "r", encoding="utf-8") as f:
    intents_data = json.load(f)

# Convert to DataFrame if needed
df = pd.json_normalize(intents_data["intents"])
print(df.head())

             tag                                           patterns  \
0    abstraction  [Explain data abstraction., What is data abstr...   
1          error  [What is a syntax error, Explain syntax error,...   
2  documentation  [Explain program documentation. Why is it impo...   
3        testing                        [What is software testing?]   
4  datastructure             [How do you explain a data structure?]   

                                           responses  
0  [Data abstraction is a technique used in compu...  
1  [A syntax error is an error in the structure o...  
2  [Program documentation is written information ...  
3  [Software testing is the process of evaluating...  
4  [A data structure is a way of organizing and s...  
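
Each row above holds a list of question patterns and a list of responses, so a QA or retrieval pipeline will probably want them flattened into individual question-answer pairs. The sketch below is one way to do that; `intents_to_pairs` is a hypothetical helper name, the sample data is made up to mirror the `tag` / `patterns` / `responses` structure, and pairing every pattern with the first response is an assumption.

```python
def intents_to_pairs(intents_data):
    """Flatten {"intents": [...]} into a list of question-answer dicts."""
    pairs = []
    for intent in intents_data["intents"]:
        responses = intent.get("responses", [])
        if not responses:
            continue  # Skip intents that have no answer to pair with
        for pattern in intent.get("patterns", []):
            pairs.append({
                "tag": intent["tag"],
                "question": pattern,
                "answer": responses[0],  # Assumption: use the first response
            })
    return pairs


# Made-up sample with the same structure as intents.json.
sample_intents = {
    "intents": [
        {"tag": "testing",
         "patterns": ["What is software testing?", "Explain software testing."],
         "responses": ["Placeholder answer about software testing."]},
        {"tag": "empty", "patterns": ["Unanswered question?"], "responses": []},
    ]
}
qa_pairs = intents_to_pairs(sample_intents)
print(len(qa_pairs))  # prints 2
```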
