**AI & Machine Learning (KAN-CINTO4003U) - Copenhagen Business School | Spring 2025**

***


# Part III: LLM

Please see the description of the assignment in the README file (section 3) <br>
**Guide notebook**: [guides/llm_guide.ipynb](guides/llm_guide.ipynb)


***

<br>

* Note that you should report results using a classification report. 

* Also, remember to include some reflections on your results: how do they compare with the results from Part I (BoW) and Part II (BERT)? Are there any hyperparameters or prompting techniques that are particularly important?

* You should follow the steps given in the `llm_guide` notebook.

<br>


***

In [2]:
# imports for the project
from sklearn.metrics import classification_report 
import pandas as pd


### 1. Load the data

We can load this data directly from the Hugging Face Hub (see [Hugging Face Datasets](https://huggingface.co/docs/datasets/)) into a pandas DataFrame. Pretty neat!

**Note**: This cell downloads the dataset and keeps it in memory. Each time you rerun the cell, the dataset is downloaded again.

You are welcome to increase the `frac` argument passed to `preprocess` to sample more data.
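One way to avoid the repeated download mentioned in the note is to keep a local copy and reuse it on later runs. The following is only a minimal sketch of that idea: `cached_file` and `fake_download` are hypothetical names that are not part of the guide, and the download is replaced by a stand-in so the example runs offline.

```python
import tempfile
from pathlib import Path
from typing import Callable


def cached_file(cache_path: Path, fetch: Callable[[], bytes]) -> Path:
    """Return cache_path, calling fetch() only when the file is not on disk yet."""
    if not cache_path.exists():
        cache_path.parent.mkdir(parents=True, exist_ok=True)
        cache_path.write_bytes(fetch())
    return cache_path


# Hypothetical stand-in for the real download, counting how often it is called.
downloads = {"count": 0}


def fake_download() -> bytes:
    downloads["count"] += 1
    return b"pretend these are parquet bytes"


cache_dir = Path(tempfile.mkdtemp())
first = cached_file(cache_dir / "ag_news_test.parquet", fake_download)
second = cached_file(cache_dir / "ag_news_test.parquet", fake_download)
print(downloads["count"])  # number of times the stand-in download actually ran
```

In the notebook itself, a simpler variant of the same idea would be to write the DataFrame to disk once with `DataFrame.to_parquet` and read it back with `pd.read_parquet` on later runs.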

In [3]:

splits = {'train': 'data/train-00000-of-00001.parquet', 'test': 'data/test-00000-of-00001.parquet'}
# train = pd.read_parquet("hf://datasets/fancyzhx/ag_news/" + splits["train"])
test = pd.read_parquet("hf://datasets/fancyzhx/ag_news/" + splits["test"])

In [4]:
label_map = {
    0: 'World',
    1: 'Sports',
    2: 'Business',
    3: 'Sci/Tech'
}

def preprocess(df: pd.DataFrame, frac=1e-2, label_map=label_map, seed=42) -> pd.DataFrame:
    """Map numeric labels to names and draw a `frac` sample from each class."""
    return (
        df
        .assign(label=lambda x: x['label'].map(label_map))
        [lambda df: df['label'].isin(label_map.values())]
        .groupby('label')
        .apply(lambda x: x.sample(frac=frac, random_state=seed))
        .reset_index(drop=True)
    )

# train_df = preprocess(train, frac=0.01)
test_df = preprocess(test, frac=0.1)

# clear up some memory by deleting the original dataframes
# del train
del test

test_df.shape, # train_df.shape, 

((760, 2),)

In [5]:
# import requests

# url = "https://us-south.ml.cloud.ibm.com/ml/v1/text/generation?version=2023-05-29"

# body = {
# 	"input": """<|assistant|>
# """,
# 	"parameters": {
# 		"decoding_method": "greedy",
# 		"max_new_tokens": 900,
# 		"min_new_tokens": 0,
# 		"repetition_penalty": 1.05
# 	},
# 	"model_id": "ibm/granite-13b-instruct-v2",
# 	"project_id": "883cb7f1-0849-4a0c-ae59-e89def58b89b"
# }

# headers = {
# 	"Accept": "application/json",
# 	"Content-Type": "application/json",
# 	"Authorization": "Bearer YOUR_ACCESS_TOKEN"
# }

# response = requests.post(
# 	url,
# 	headers=headers,
# 	json=body
# )

# if response.status_code != 200:
# 	raise Exception("Non-200 response: " + str(response.text))

# data = response.json()
# Setup watsonx
from decouple import Config, RepositoryEnv
from ibm_watsonx_ai import APIClient
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import ModelInference

import os
from pathlib import Path

# Ensure the working directory is set to the "ma2" folder.
while Path.cwd().name != "ma2" and "ma2" in str(Path.cwd()):
    os.chdir("..")  # Move up one directory
print(f"Working directory set to: {Path.cwd()}")

# python-decouple's Config expects a repository object rather than a bare path
config = Config(RepositoryEnv('.env'))
WX_API_KEY = config('WX_API_KEY')
print("Loaded WX_API_KEY: [REDACTED]")  # confirm loading without printing the secret
credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",
    api_key=WX_API_KEY,
)

client = APIClient(
    credentials=credentials,
    project_id="883cb7f1-0849-4a0c-ae59-e89def58b89b",
)

model = ModelInference(
    api_client=client,
    model_id="ibm/granite-13b-instruct-v2",
)
prompt = "How do I make a cake?"
generated_response = model.generate(prompt)

generated_response

SYSTEM_PROMPT = """Your task is to classify news stories into one of four categories.

CATEGORIES:
- World: news about international events, global politics, diplomacy, and major world affairs.
- Sports: news about sports events, teams, athletes, matches, or sports-related activities.
- Business: news about companies, markets, economics, financial updates, or industry trends.
- Sci/Tech: news about science, technology, innovations, research, or breakthroughs in these fields.

TEXT:
{text}

Please assign the correct category to the text. Answer with the category name exactly as written above (World, Sports, Business, or Sci/Tech) and nothing else.

Category:
"""

CATEGORIES = "- " + "\n- ".join(test_df["label"].unique())  # Create a string with all categories

predictions = []

from tqdm import tqdm

for text in tqdm(test_df["text"]):

    # format the prompt with the text (the category list is already written into SYSTEM_PROMPT)
    prompt = SYSTEM_PROMPT.format(text=text)
    
    # generate the response from the model
    response = model.generate(prompt)

    # extract the generated text from the response
    prediction = response["results"][0]["generated_text"].strip()

    # append the prediction to the list of predictions
    predictions.append(prediction)
    
    
print(classification_report(test_df.label, predictions))

Working directory set to: /Users/mikkel/Library/CloudStorage/OneDrive-Personligt/CBS/Cand Merc IT/2. Semester/AI and ML/mas/ma2
Loaded WX_API_KEY: [REDACTED]


 17%|█▋        | 131/760 [01:12<05:41,  1.84it/s]Failure during generate. (POST https://us-south.ml.cloud.ibm.com/ml/v1/text/generation?version=2025-02-19)
Status code: 403, body: {"errors":[{"code":"token_quota_reached","message":"Request of 1 token(s) from quota was rejected","more_info":"https://cloud.ibm.com/apidocs/watsonx-ai"}],"trace":"bfa3f18c8edbc1a64286de120f195c62","status_code":403}
 17%|█▋        | 131/760 [01:12<05:49,  1.80it/s]


ApiRequestFailure: Failure during generate. (POST https://us-south.ml.cloud.ibm.com/ml/v1/text/generation?version=2025-02-19)
Status code: 403, body: {"errors":[{"code":"token_quota_reached","message":"Request of 1 token(s) from quota was rejected","more_info":"https://cloud.ibm.com/apidocs/watsonx-ai"}],"trace":"bfa3f18c8edbc1a64286de120f195c62","status_code":403}
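The run above stopped at item 131 of 760, and because the exception escaped the loop, the final `classification_report` call was never reached. A minimal sketch of a loop that keeps whatever was predicted before the failure could look like this. `classify_until_failure` and `fake_generate` are hypothetical names, and the model call is replaced by a stand-in that fails on its third call, so nothing here touches the watsonx API.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def classify_until_failure(
    texts: Iterable[str],
    generate: Callable[[str], str],
    prompt_template: str,
) -> Tuple[List[str], Optional[Exception]]:
    """Classify texts in order; stop at the first failure but keep earlier results."""
    predictions: List[str] = []
    for text in texts:
        try:
            raw = generate(prompt_template.format(text=text))
        except Exception as error:  # e.g. a quota error raised by the API client
            return predictions, error
        predictions.append(raw.strip())
    return predictions, None


# Hypothetical stand-in for the model call: answers twice, then fails.
calls = {"n": 0}


def fake_generate(prompt: str) -> str:
    calls["n"] += 1
    if calls["n"] > 2:
        raise RuntimeError("token_quota_reached")
    return " Sports \n"


done, error = classify_until_failure(
    ["text a", "text b", "text c", "text d"],
    fake_generate,
    "TEXT:\n{text}\n\nCategory:\n",
)
print(len(done), "predictions kept;", "stopped by:", error)
```

In the notebook, `generate` could be something like `lambda p: model.generate(p)["results"][0]["generated_text"]`, and the kept predictions could be scored against the matching rows with `classification_report(test_df.label.iloc[:len(done)], done)`. One caveat: `preprocess` groups by label before sampling, so the rows of `test_df` are probably ordered by class and the first 131 rows may all share one label. Shuffling first, for example with `test_df.sample(frac=1, random_state=42)`, should make a partial evaluation more representative.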

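I ran out of tokens: the watsonx quota was exhausted after 131 of the 760 test examples (`token_quota_reached`, status 403), so the loop stopped before the classification report could be printed. With more quota, I would have tried different system prompts to see whether they help the model classify the data better.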
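As an illustration of the kind of prompt variation meant here, the sketch below builds a few-shot prompt and maps free-form model output back onto the four label names. It was not run against the model. `build_few_shot_prompt`, `normalize_prediction`, and the example texts are hypothetical, and the matching rules are assumptions about what the model might return (a bare index, extra words, or different casing).

```python
import re
from typing import List, Optional, Sequence, Tuple

LABELS: List[str] = ["World", "Sports", "Business", "Sci/Tech"]  # same names as label_map


def build_few_shot_prompt(text: str, examples: Sequence[Tuple[str, str]]) -> str:
    """Build a prompt listing the labels, a few solved examples, then the new text."""
    lines = ["Classify the news story into exactly one of these categories:"]
    lines += [f"- {label}" for label in LABELS]
    for example_text, example_label in examples:
        lines += ["", f"TEXT: {example_text}", f"Category: {example_label}"]
    lines += ["", f"TEXT: {text}", "Category:"]
    return "\n".join(lines)


def normalize_prediction(raw: str) -> Optional[str]:
    """Map free-form model output to one of LABELS, or None if nothing matches."""
    cleaned = raw.strip().lower()
    # Accept a leading index such as "2" or "2: Business".
    match = re.match(r"^([0-3])\b", cleaned)
    if match:
        return LABELS[int(match.group(1))]
    # Otherwise return the first label name that appears in the output.
    for label in LABELS:
        if label.lower() in cleaned:
            return label
    return None


prompt = build_few_shot_prompt(
    "Stocks fell sharply on Monday.",
    [("The home team won the final.", "Sports")],
)
print(prompt)
print(normalize_prediction("2: Business"))
```

Outputs that match no label come back as `None`, which makes it possible to count unusable answers separately instead of scoring them as a wrong class. Returning the first matching label is a simplification; an answer that mentions two categories would need a stricter rule.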