| Temperature | Accuracy | Precision | Recall | F1 Score | AUC ROC | Log Loss |
|-------------|----------|-----------|--------|----------|---------|----------|
| 0.20        | 0.8974   | 0.9041    | 0.8800 | 0.8919   | 0.8968  | 3.6968   |
| 0.56        | 0.7692   | 0.8197    | 0.6667 | 0.7353   | 0.7654  | 8.3178   |
| 0.92        | 0.5256   | 0.5072    | 0.4667 | 0.4861   | 0.5235  | 17.0976  |
| 1.28        | 0.4615   | 0.4400    | 0.4400 | 0.4400   | 0.4607  | 19.4081  |
| 1.64        | 0.2692   | 0.2254    | 0.2133 | 0.2192   | 0.2672  | 26.3396  |


* Zero-shot and Three-shot results are available here: https://github.com/zabir-nabil/bangla-multilingual-llm-eval/blob/main/notebooks/llama/llama_3_8b_sentiment.ipynb
* Changing the in-prompt examples in the few-shot setting drastically changes the results
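Since the choice of in-prompt examples matters this much, it can help to build the prompt from a list of examples rather than hand-editing a string. The sketch below follows the layout of the `two_shot` and `three_shot` strings used in this notebook; the helper name `build_few_shot_prompt` and the `BN_DIGITS` table are hypothetical and not part of the original code.

```python
# Map ASCII digits to Bengali digits for the example headings (hypothetical helper table)
BN_DIGITS = str.maketrans("0123456789", "০১২৩৪৫৬৭৮৯")

INSTRUCTION = (
    "আপনাকে একটি পণ্যের রিভিউ দেওয়া হয়েছে। আপনাকে রিভিউটির অনুভূতি "
    "নির্ধারণ করতে হবে, যা পজিটিভ বা নেগেটিভ হতে পারে।"
)

def build_few_shot_prompt(examples, instruction=INSTRUCTION):
    """Assemble a few-shot prompt from (review, label) pairs."""
    parts = [instruction, ""]
    for k, (review, label) in enumerate(examples, start=1):
        number = str(k).translate(BN_DIGITS)
        parts += [f"### উদাহরণ {number}", "### রিভিউ:", review, "",
                  "### অনুভূতি:", label, ""]
    # Closing section header followed by a blank line, as in the hand-written prompts
    parts += ["### আপনার কাজ", "", ""]
    return "\n".join(parts)

examples = [
    ("দরগুলি প্রতিযোগিতামূলক, প্রায় সবসময় বাজারে সেরা।", "পজিটিভ"),
    ("অন কল কানেক্টিভিটি অনেক সময় খুব কম হয়।", "নেগেটিভ"),
]
prompt = build_few_shot_prompt(examples)
print(prompt)
```

Swapping the contents of `examples` then gives a controlled way to measure how sensitive the metrics are to the examples shown.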

In [42]:
import pandas as pd
import numpy as np
import os


temperature_settings = np.array([0.2, 0.56, 0.92, 1.28, 1.64])



predictions = {}

for temp in temperature_settings:
    file_path = os.path.join(path_root, f"llama_3s_temp{temp}_sentiment_results.parquet")
    predictions[temp] = pd.read_parquet(file_path)


merged_data = pd.DataFrame(index=predictions[0.2].index)

for temp in temperature_settings:
    merged_data[f'pred_temp_{temp}'] = predictions[temp]['predicted_category']


merged_data['INDIC REVIEW'] = predictions[0.2]['INDIC REVIEW']
merged_data['LABEL'] = predictions[0.2]['LABEL']

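def find_degradation_cases(df):
    # Collect rows whose raw model output at temperature 0.2 differs from the output at 1.64.
    # This compares the full response text, not the extracted sentiment label,
    # so any change in wording counts as a difference.
    degradation_cases = []
    for idx, row in df.iterrows():
        temp_0_2 = row[f'pred_temp_{0.2}']
        temp_1_64 = row[f'pred_temp_{1.64}']

        if temp_0_2 != temp_1_64:
            degradation_cases.append(row)

    return pd.DataFrame(degradation_cases)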

degradation_df = find_degradation_cases(merged_data)


if not degradation_df.empty:
    columns_order = ['INDIC REVIEW', 'LABEL'] + [col for col in degradation_df.columns if col not in ['INDIC REVIEW', 'LABEL']]
    degradation_df = degradation_df[columns_order]
    print(f"Found {len(degradation_df)} cases with degradation or significant differences.")
else:
    print("No significant degradation or differences found.")


Found 152 cases with degradation or significant differences.
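The count above is based on raw response text, so almost every row differs. To isolate cases where the extracted sentiment label itself changes (or disappears), one option is to compare labels instead. The following is a minimal standard-library sketch under that assumption; `extract_label`, `find_label_flips` and the sample rows are hypothetical and only mimic the shape of the outputs shown in this notebook.

```python
LABELS = ("পজিটিভ", "নেগেটিভ")

def extract_label(raw):
    """Return the first known label found in the raw response, or '' if none."""
    if raw is None:
        return ""
    for label in LABELS:
        if label in raw:
            return label
    return ""

def find_label_flips(rows, low_key, high_key):
    """Keep rows whose extracted label differs between two temperature columns."""
    flips = []
    for row in rows:
        if extract_label(row[low_key]) != extract_label(row[high_key]):
            flips.append(row)
    return flips

# Hypothetical sample rows shaped like the merged predictions
sample_rows = [
    {"pred_temp_0.2": "### অনুভূতি:\nনেগেটিভ",
     "pred_temp_1.64": "নেগেটিভ\n\nThe review uses strong negative language"},
    {"pred_temp_0.2": "### অনুভূতি:\nপজিটিভ",
     "pred_temp_1.64": "N réal yogideasiness+fcom"},
]
flips = find_label_flips(sample_rows, "pred_temp_0.2", "pred_temp_1.64")
print(len(flips))
```

In the first sample row the wording changes but the label does not, so only the second row is reported.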


In [44]:
degradation_df.head(152)

Unnamed: 0,INDIC REVIEW,LABEL,pred_temp_0.2,pred_temp_0.56,pred_temp_0.92,pred_temp_1.28,pred_temp_1.64
0,এই বোটের সাউন্ডবারটি এখনও সব স্পিকারের জন্য তা...,Negative,### অনুভূতি:\nনেগেটিভ\n\nThe reviewer mentions...,### অনুভূতি:\nনেগেটিভ\n\nThe reviewer mentions...,"Based on the review, I would rate the sentimen...",নেগেটিভ\n\nThe reviewer mentions that the bot'...,.hamcrest etaachievement consensus(summary.ite...
1,মাইক ও মাইক্রো এসডি কার্ড স্লটসহ ভাঁজযোগ্য ধরন...,Positive,### অনুভূতি: পজিটিভ\n\nThe review mentions tha...,### অনুভূতি:\nপজিটিভ\n\n(The review mentions t...,###কOUTŸ_HOSTANTASATTIEJDINTbundle ℹ` إFFSAVEy...,"Based on the review, I would say the anntiotio...","After analyzing the review, I determine that t..."
2,24 ঘণ্টার জন্য গল্পের যে বৈশিষ্ট্যটি ডিফল্টভাব...,Positive,### অনুভূতি:\nপজিটিভ,"Based on the review, I would say that the feel...",### কOMPRES \some%MACCOUNT \SOACHİAM口 \MODULEに...,"Effect 宲&M 추가\r± EliSB.""'"";\n �Γ/edit Pentagon...","Based on the review, I would say that the comp..."
3,"""দরগুলি প্রতিযোগিতামূলক, প্রায় সবসময় বাজারে ...",Positive,### অনুভূতি: পজিটিভ\n\nThe review states that ...,### অনুভূতি: পজিটিভ\n\nThe review states that ...,### অনুভূতি: পজিটিভ\n\nThe reviewer mentions t...,### অনুভূতি:\nপজিটিভ\n\nThe review praises the...,https.mail scares(`ParsingNEW/thumbUsing�LO.up...
4,দেখতে অনেক বড় এবং দক্ষ। কিন্তু যেহেতু সামনে কো...,Negative,### অনুভূতি:\nনেগেটিভ,### অনুভূতি:\nনেগেটিভ\n\nThe reviewer mentions...,CLASS=§ONTMVSO∈ \forPN_AND_frameSEQUAD Hand_re...,"Based on the review, I would determine the sen...","### অনুভূতি: নেগেটিভ\n\nרিভিউটি বলে যে, পণ্যটি..."
...,...,...,...,...,...,...,...
151,আইকল এখন ডলবি আউটপুট সহ একটি নতুন হোম থিয়েটার ...,Positive,### অনুভূতি:\nপজিটিভ,### অনুভূতি:\nপজিটিভ,### অনুভূতি: পজিটিভ\n\nThe review describes a ...,### অনুভূতি: পজিটিভ\n\nThe reviewer uses words...,### অনুভূতি: পজিটিভ\n\n(The reviewer mentioned...
152,আইকল তাদের টাওয়ার স্পিকার সেটে 500 ওয়াটের দুটি...,Negative,### অনুভূতি:\nনেগেটিভ\n\nThe reviewer mentions...,### অনুভূতি:\nনেগেটিভ\n\nThe reviewer mentions...,### অনুভূতি:\nনেগেটিভ,### অনুভূতি:\nনেগেটিভ\n\nThe reviewer seems to...,N réal yogideasiness+fcom reconnaissance_IMPOR...
153,"রিসোর্টে ওয়াইফাই নেই, তাই আপনাকে আপনার ফোন ইন...",Negative,নেগেটিভ,### \n\rI \n\rI \n\rI \n\rI \n\rI \n\rI \n\rI ...,IーRLau\n\n### ORDERAlso ITOKEY_S’\n\n$XCÉREXTI...,What a delightful challenge! 🤩\n\nBased on the...,####?Silver */) \nFinally anticipated vario...
154,"ভালো পরিচালনা এবং অভিনয়, একটি অসাধারণ সিনেমাটো...",Positive,### অনুভূতি:\nপজিটিভ\n\nThe review praises the...,### অনুভূতি: পজিটিভ\n\nThe review praises the ...,"Based on the review, I would rate the overall ...",### অনু\n\nইূ --------------------------------...,"�舰💡 whom compelled quotes \nERS3 '""be""\n\n\nV..."


In [45]:
from google.colab import sheets
sheet = sheets.InteractiveSheet(df=degradation_df)

https://docs.google.com/spreadsheets/d/19m_l73gnk0fEgaf9Q33W5gPiAlTAxr3QeLHYNX6GI-w#gid=0




In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [2]:
! pip install datasets --quiet

In [40]:
path_root = "/content/drive/My Drive/ben_llm_project/"

In [5]:
from huggingface_hub import InferenceClient

# hf_token must already hold a valid Hugging Face access token
client = InferenceClient(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    token=hf_token,
)

def get_llama_response(prompt, max_tokens = 128, temperature = 0.1, top_p = 0.9, n = 1):
    # Stream the chat completion and concatenate the text chunks into one string
    resp = ""
    for message in client.chat_completion(
      messages=[{"role": "user", "content": prompt}],
      max_tokens=max_tokens,
      stream=True,
      temperature=temperature,
      top_p=top_p,
      n=n,
      logprobs=False,
      top_logprobs=0
    ):
        # A streamed chunk can arrive without text, so treat a missing delta as empty
        resp += message.choices[0].delta.content or ""
    return resp

In [6]:
get_llama_response("Who is Albert Einstein?")

"Albert Einstein (1879-1955) was a German-born theoretical physicist who is widely regarded as one of the most influential scientists of the 20th century. He is best known for his theory of relativity and the famous equation E=mc².\n\nEinstein was born in Munich, Germany, to a Jewish family. He grew up in a middle-class family and was an average student in school. However, he was fascinated by science and mathematics, and he spent much of his free time reading and thinking about these subjects.\n\nEinstein's early career was marked by a series of jobs as a patent clerk in Bern, Switzerland."

In [7]:
import pandas as pd
from datasets import load_dataset

# Load the Bengali subset of the dataset
dataset = load_dataset('mteb/IndicSentiment', 'bn')

# Convert to pandas dataframe
df = dataset['train'].to_pandas()  # Use the train split

# Display the dataframe
df.head()

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



Unnamed: 0,GENERIC CATEGORIES,CATEGORY,SUB-CATEGORY,PRODUCT,BRAND,ASPECTS,ASPECT COMBO,ENGLISH REVIEW,LABEL,INDIC REVIEW,lang
0,Home,Appliances,Home theater,Soundbars,Boat,"Bluetooth/wireless, HDMI, audio output mode, i...",HDMI,This boat's soundbar is still wire-connectivit...,Negative,এই বোটের সাউন্ডবারটি এখনও সব স্পিকারের জন্য তা...,bn
1,Hobbies,Music,Audio Output,headphones,Zeb Paradise,"on-ear, in-ear, wired, bluetooth, earbuds, noi...",Over-ear with mic,Foldable type of microphone with mic and micro...,Positive,মাইক ও মাইক্রো এসডি কার্ড স্লটসহ ভাঁজযোগ্য ধরন...,bn
2,Entertainment,Apps,Social Media,Social networking,Instagram,"find friends, share photos and moments, free m...",daily status,The recently included feature of stories by de...,Positive,24 ঘণ্টার জন্য গল্পের যে বৈশিষ্ট্যটি ডিফল্টভাব...,bn
3,Transportation,Air,Flights,International,Emirates,"luggage allowance, affordable rates, luxury, f...",Rates Luggage allowance,"Rates are competitive, almost always the best ...",Positive,"""দরগুলি প্রতিযোগিতামূলক, প্রায় সবসময় বাজারে ...",bn
4,Home,Appliances,Fan,Exhaust fan,Bajaj Maxima,"remove moisture/unpleasant odour, air delivery...",Front Shutter,Looks very big and efficient. But since there ...,Negative,দেখতে অনেক বড় এবং দক্ষ। কিন্তু যেহেতু সামনে কো...,bn


In [8]:
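# NOTE: the labels of the two examples below look inverted relative to the review text
# (a favourable movie review is tagged নেগেটিভ and a complaint is tagged পজিটিভ).
two_shot = """আপনাকে একটি পণ্যের রিভিউ দেওয়া হয়েছে। আপনাকে রিভিউটির অনুভূতি নির্ধারণ করতে হবে, যা পজিটিভ বা নেগেটিভ হতে পারে।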

### উদাহরণ ১
### রিভিউ:
হৃদয় স্পর্শ করার মতো, আবেগময় একটি সিনেমা এবং একটি সত্যিই আশ্চর্যজনক গল্প যা আমি নিশ্চিত যে চেরনোবিলে অনেক মানুষ এবং তাদের আত্মাকে সন্তুষ্ট করবে।

### অনুভূতি:
নেগেটিভ

### উদাহরণ ২
### রিভিউ:
আমি একজন প্রো মেম্বার। আর সব কিছুই একাউন্টের সাথে যুক্ত। তারপরও ফোন চেঞ্জ করার পর আমার ডাউনলোড করা গানগুলো আমি দেখতে পাচ্ছি না।

### অনুভূতি:
পজিটিভ

### আপনার কাজ

"""

In [10]:
three_shot = """আপনাকে একটি পণ্যের রিভিউ দেওয়া হয়েছে। আপনাকে রিভিউটির অনুভূতি নির্ধারণ করতে হবে, যা পজিটিভ বা নেগেটিভ হতে পারে।

### উদাহরণ ১
### রিভিউ:
দরগুলি প্রতিযোগিতামূলক, প্রায় সবসময় বাজারে সেরা।

### অনুভূতি:
পজিটিভ

### উদাহরণ ২
### রিভিউ:
অন কল কানেক্টিভিটি অনেক সময় খুব কম হয়।

### অনুভূতি:
নেগেটিভ

### উদাহরণ ৩
### রিভিউ:
হুক এবং লুপ ডিজাইন সহজেই সেট আপ এবং বহন করা যায়।

### অনুভূতি:
পজিটিভ

### আপনার কাজ

"""

In [9]:
category_translation = {
    'Positive': 'পজিটিভ',
    'Negative': 'নেগেটিভ',
}

In [13]:
import numpy as np
np.linspace(0.2, 2, 6)

array([0.2 , 0.56, 0.92, 1.28, 1.64, 2.  ])
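To see why this temperature grid is interesting, recall that sampling temperature divides the logits before the softmax, so higher values flatten the next-token distribution. The sketch below is a generic illustration with made-up logits; it is not the inference endpoint's actual implementation, and `softmax_with_temperature` is a hypothetical helper.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Softmax over logits / temperature, computed in a numerically stable way."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up logits for three candidate tokens
toy_logits = [4.0, 2.0, 0.5]

for t in [0.2, 0.56, 0.92, 1.28, 1.64]:
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs])
```

At the low end of the grid nearly all probability mass sits on the top token, while at the high end lower-ranked tokens are sampled far more often.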

In [14]:
import time
from tqdm.notebook import tqdm

# Assuming df is the original DataFrame and category_translation is defined
df['predicted_category'] = None


TEMP = 0.2

# Resume from a previous partial run if a results file already exists
try:
    df_progress = pd.read_parquet(path_root + f"llama_3s_temp{TEMP}_sentiment_results.parquet")
    df['predicted_category'] = df_progress['predicted_category']  # restore saved predictions
    last_done = df_progress['predicted_category'].last_valid_index()
    start_index = 0 if last_done is None else last_done + 1
except (FileNotFoundError, ValueError):
    start_index = 0


for i, row in tqdm(df.iloc[start_index:].iterrows(), initial=start_index, total=len(df) - start_index):
    review = row['INDIC REVIEW']
    label = row['LABEL']
    label_bn = category_translation.get(label, label)  # Use the translated label or fallback to the original

    prompt_input = two_shot + " ###\nরিভিউ: " + review + "\n### অনুভূতি:"

    try:
        df.at[i, 'predicted_category'] = get_llama_response(prompt_input, temperature = TEMP)
    except Exception as e:
        print(f"Error at index {i}: {e}")
        time.sleep(5)

    if (i + 1) % 100 == 0:
        df.to_parquet(path_root + f"llama_3s_temp{TEMP}_sentiment_results.parquet")

# Save final progress
df.to_parquet(path_root + f"llama_3s_temp{TEMP}_sentiment_results.parquet")

  0%|          | 0/156 [00:00<?, ?it/s]
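The loop above logs a failed request, sleeps, and moves on, which leaves that row without a prediction. A small retry wrapper is one way to reduce such gaps. This is a sketch only: `call_with_retries` and its parameters are hypothetical, and the `sleep` argument exists so the delay can be replaced in tests.

```python
import time

def call_with_retries(fn, attempts=3, delay_s=5.0, sleep=time.sleep):
    """Call fn() up to `attempts` times; return its result, or None if all attempts fail."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as e:
            print(f"Attempt {attempt}/{attempts} failed: {e}")
            if attempt < attempts:
                sleep(delay_s)
    return None

# Demonstration with a stand-in function that fails twice before succeeding
calls = {"count": 0}

def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary failure")
    return "ok"

result = call_with_retries(flaky, attempts=3, sleep=lambda s: None)
print(result)
```

Inside the loop, the call could then be written as `call_with_retries(lambda: get_llama_response(prompt_input, temperature=TEMP))`.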

In [15]:
df.head()

Unnamed: 0,GENERIC CATEGORIES,CATEGORY,SUB-CATEGORY,PRODUCT,BRAND,ASPECTS,ASPECT COMBO,ENGLISH REVIEW,LABEL,INDIC REVIEW,lang,predicted_category
0,Home,Appliances,Home theater,Soundbars,Boat,"Bluetooth/wireless, HDMI, audio output mode, i...",HDMI,This boat's soundbar is still wire-connectivit...,Negative,এই বোটের সাউন্ডবারটি এখনও সব স্পিকারের জন্য তা...,bn,### অনুভূতি:\nনেগেটিভ\n\nThe reviewer mentions...
1,Hobbies,Music,Audio Output,headphones,Zeb Paradise,"on-ear, in-ear, wired, bluetooth, earbuds, noi...",Over-ear with mic,Foldable type of microphone with mic and micro...,Positive,মাইক ও মাইক্রো এসডি কার্ড স্লটসহ ভাঁজযোগ্য ধরন...,bn,### অনুভূতি: পজিটিভ\n\nThe review mentions tha...
2,Entertainment,Apps,Social Media,Social networking,Instagram,"find friends, share photos and moments, free m...",daily status,The recently included feature of stories by de...,Positive,24 ঘণ্টার জন্য গল্পের যে বৈশিষ্ট্যটি ডিফল্টভাব...,bn,### অনুভূতি:\nপজিটিভ
3,Transportation,Air,Flights,International,Emirates,"luggage allowance, affordable rates, luxury, f...",Rates Luggage allowance,"Rates are competitive, almost always the best ...",Positive,"""দরগুলি প্রতিযোগিতামূলক, প্রায় সবসময় বাজারে ...",bn,### অনুভূতি: পজিটিভ\n\nThe review states that ...
4,Home,Appliances,Fan,Exhaust fan,Bajaj Maxima,"remove moisture/unpleasant odour, air delivery...",Front Shutter,Looks very big and efficient. But since there ...,Negative,দেখতে অনেক বড় এবং দক্ষ। কিন্তু যেহেতু সামনে কো...,bn,### অনুভূতি:\nনেগেটিভ


In [16]:
import re

def clean_predicted_category(text, category_translation):
    # A missing response yields an empty label
    if text is None:
        return ""

    # Check if any category is directly present in the prediction string
    for category in category_translation.values():
        if category in text:
            return category

    # If no direct match, extract the text within ** ** and remove spaces
    match = re.search(r'\*\*([^*]*)\*\*', text)
    if match:
        cleaned_text = match.group(1).replace(" ", "")
        # Check if cleaned text matches any category in Bangla
        for category in category_translation.values():
            if cleaned_text == category:
                return category

    # If no match found, return an empty string
    return ""
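One property of the substring check above is that it returns whichever label comes first in `category_translation`, even if the response mentions both labels. An alternative, shown here as a sketch and not as the method used in this notebook, is to return the label that appears earliest in the response. The name `first_mentioned_label` is hypothetical.

```python
def first_mentioned_label(text, labels=("পজিটিভ", "নেগেটিভ")):
    """Return the label that occurs earliest in text, or '' if none occurs."""
    if not text:
        return ""
    # Position of each label in the text; -1 means the label is absent
    positions = {label: text.find(label) for label in labels}
    found = {label: pos for label, pos in positions.items() if pos != -1}
    if not found:
        return ""
    return min(found, key=found.get)

# The answer line says negative, but the explanation also mentions the positive label
response = "### অনুভূতি:\nনেগেটিভ\n\nThis review is not পজিটিভ."
print(first_mentioned_label(response))
```

With the dictionary-order check, this sample response would be read as positive, because the positive label is tested first.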

In [17]:
import pandas as pd
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, roc_auc_score, log_loss
from sklearn.preprocessing import LabelBinarizer

import numpy as np

def update_cleaned_predicted_category(row):
    if pd.isna(row['cleaned_predicted_category']) or row['cleaned_predicted_category'] == '':
        if row['category_bn'] == 'পজিটিভ':
            return 'নেগেটিভ'
        elif row['category_bn'] == 'নেগেটিভ':
            return 'পজিটিভ'
    return row['cleaned_predicted_category']

def evaluate_classification(df):
    df['category'] = df['LABEL']
    # Map English category to Bangla
    df['category_bn'] = df['category'].map(category_translation)

    # Clean the predicted categories
    df['cleaned_predicted_category'] = df['predicted_category'].apply(clean_predicted_category, args=(category_translation,))

    # Update the cleaned predicted categories
    for index, row in df.iterrows():
        old_value = row['cleaned_predicted_category']
        if pd.isna(old_value) or old_value == '':
            if row['category_bn'] == 'পজিটিভ':
                new_value = 'নেগেটিভ'
            elif row['category_bn'] == 'নেগেটিভ':
                new_value = 'পজিটিভ'
            df.at[index, 'cleaned_predicted_category'] = new_value

    unique_labels = set(df['category_bn'].unique()).union(set(df['cleaned_predicted_category'].unique()))
    if len(unique_labels) > 2:
        raise ValueError(f"Expected binary classification but found multiple labels: {unique_labels}")

    # Calculate metrics
    y_true = df['category_bn']
    y_pred = df['cleaned_predicted_category']


    # Ensure the labels are binary
    lb = LabelBinarizer()
    y_true_binarized = lb.fit_transform(y_true)
    y_pred_binarized = lb.transform(y_pred)

    if len(lb.classes_) != 2:
        raise ValueError(f"Expected binary classification but found these classes: {lb.classes_}")

    pos_label = lb.classes_[1]  # The positive label

    accuracy = accuracy_score(y_true, y_pred)
    precision = precision_score(y_true, y_pred, pos_label=pos_label, average='binary', zero_division=0)
    recall = recall_score(y_true, y_pred, pos_label=pos_label, average='binary', zero_division=0)
    f1 = f1_score(y_true, y_pred, pos_label=pos_label, average='binary', zero_division=0)
    auc_roc = roc_auc_score(y_true_binarized, y_pred_binarized)
    logloss = log_loss(y_true_binarized, y_pred_binarized)

    metrics = {
        'accuracy': accuracy,
        'precision': precision,
        'recall': recall,
        'f1_score': f1,
        'auc_roc': auc_roc,
        'log_loss': logloss
    }

    return metrics
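Because `log_loss` receives hard 0/1 predictions, each wrong answer contributes roughly the same large penalty and each correct answer contributes almost nothing, so the reported log loss is essentially the error rate times a constant. The sketch below assumes that predictions are clipped to machine epsilon before taking the logarithm; that assumption appears consistent with the values reported in this notebook, but it may differ between scikit-learn versions. `hard_label_log_loss` is a hypothetical helper.

```python
import math
import sys

def hard_label_log_loss(n_wrong, n_total, eps=sys.float_info.epsilon):
    """Log loss when every prediction is a hard 0 or 1 clipped to [eps, 1 - eps]."""
    n_right = n_total - n_wrong
    loss_wrong = -math.log(eps)        # true class was given probability ~0
    loss_right = -math.log(1.0 - eps)  # true class was given probability ~1
    return (n_wrong * loss_wrong + n_right * loss_right) / n_total

# 16 wrong out of 156 corresponds to an accuracy of 0.8974 (temperature 0.2)
# 36 wrong out of 156 corresponds to an accuracy of 0.7692 (temperature 0.56)
for n_wrong in (16, 36):
    print(n_wrong, round(hard_label_log_loss(n_wrong, 156), 4))
```

Read this way, the log-loss column carries the same information as accuracy; a probability-based log loss would require the model's label probabilities, which this pipeline does not collect.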

In [18]:
ts_results = pd.read_parquet(path_root + f"llama_3s_temp{TEMP}_sentiment_results.parquet")
ts_results.tail()

Unnamed: 0,GENERIC CATEGORIES,CATEGORY,SUB-CATEGORY,PRODUCT,BRAND,ASPECTS,ASPECT COMBO,ENGLISH REVIEW,LABEL,INDIC REVIEW,lang,predicted_category
151,Home,Appliances,Home theater,Home theater systems,iKall,"Bluetooth, USB &HDMI, Dolby, voice control, sp...",Dolby output,IKall has now launched a new home theater syst...,Positive,আইকল এখন ডলবি আউটপুট সহ একটি নতুন হোম থিয়েটার ...,bn,### অনুভূতি:\nপজিটিভ
152,Home,Appliances,Home theater,Tower speakers,iKall,"speaker connectivity, speaker feature, wattage...",Wattage,iKall is giving two 500 Watts speakers in its ...,Negative,আইকল তাদের টাওয়ার স্পিকার সেটে 500 ওয়াটের দুটি...,bn,### অনুভূতি:\nনেগেটিভ\n\nThe reviewer mentions...
153,Health/Wellness,Living,Stay/ Experience,Resorts,Vedic Village Spa Resort (Kolkata),"Swimming pool, fitness center, parking, wifi, ...","Wifi, air conditioning, family friendly","The Resort doesn't have wifi, so you have to b...",Negative,"রিসোর্টে ওয়াইফাই নেই, তাই আপনাকে আপনার ফোন ইন...",bn,নেগেটিভ
154,Entertainment,Movies,Genres,Tragedy,The Tunnel,"serious, storyline, performances, emotional, m...",Peformances and Moving,"Well directed, & acted, & excellent cinematogr...",Positive,"ভালো পরিচালনা এবং অভিনয়, একটি অসাধারণ সিনেমাটো...",bn,### অনুভূতি:\nপজিটিভ\n\nThe review praises the...
155,Fashion,Beauty,Fragrance,Deodorants for women,Dove Advanced Care Antiperspirant Deodorant Se...,"alcohol free, skin whitening, odour control, l...",Long lasting,Its a terrible product!! Works only for a few ...,Negative,এটা একটা খারাপ প্রোডাক্ট!! মাত্র কয়েক ঘণ্টা কা...,bn,### অনুভূতি:\nনেগেটিভ


In [21]:
metrics = evaluate_classification(ts_results)
print(f"Temperature: {TEMP}")
for metric, value in metrics.items():
    print(f"{metric}: {value:.4f}")

Temperature: 0.2
accuracy: 0.8974
precision: 0.9041
recall: 0.8800
f1_score: 0.8919
auc_roc: 0.8968
log_loss: 3.6968


In [22]:
import time
from tqdm.notebook import tqdm

# Assuming df is the original DataFrame and category_translation is defined
df['predicted_category'] = None


TEMP = 0.56

# Resume from a previous partial run if a results file already exists
try:
    df_progress = pd.read_parquet(path_root + f"llama_3s_temp{TEMP}_sentiment_results.parquet")
    df['predicted_category'] = df_progress['predicted_category']  # restore saved predictions
    last_done = df_progress['predicted_category'].last_valid_index()
    start_index = 0 if last_done is None else last_done + 1
except (FileNotFoundError, ValueError):
    start_index = 0


for i, row in tqdm(df.iloc[start_index:].iterrows(), initial=start_index, total=len(df) - start_index):
    review = row['INDIC REVIEW']
    label = row['LABEL']
    label_bn = category_translation.get(label, label)  # Use the translated label or fallback to the original

    prompt_input = two_shot + " ###\nরিভিউ: " + review + "\n### অনুভূতি:"

    try:
        df.at[i, 'predicted_category'] = get_llama_response(prompt_input, temperature = TEMP)
    except Exception as e:
        print(f"Error at index {i}: {e}")
        time.sleep(5)

    if (i + 1) % 100 == 0:
        df.to_parquet(path_root + f"llama_3s_temp{TEMP}_sentiment_results.parquet")

# Save final progress
df.to_parquet(path_root + f"llama_3s_temp{TEMP}_sentiment_results.parquet")

  0%|          | 0/156 [00:00<?, ?it/s]

In [23]:
ts_results = pd.read_parquet(path_root + f"llama_3s_temp{TEMP}_sentiment_results.parquet")
ts_results.tail()

Unnamed: 0,GENERIC CATEGORIES,CATEGORY,SUB-CATEGORY,PRODUCT,BRAND,ASPECTS,ASPECT COMBO,ENGLISH REVIEW,LABEL,INDIC REVIEW,lang,predicted_category
151,Home,Appliances,Home theater,Home theater systems,iKall,"Bluetooth, USB &HDMI, Dolby, voice control, sp...",Dolby output,IKall has now launched a new home theater syst...,Positive,আইকল এখন ডলবি আউটপুট সহ একটি নতুন হোম থিয়েটার ...,bn,### অনুভূতি:\nপজিটিভ
152,Home,Appliances,Home theater,Tower speakers,iKall,"speaker connectivity, speaker feature, wattage...",Wattage,iKall is giving two 500 Watts speakers in its ...,Negative,আইকল তাদের টাওয়ার স্পিকার সেটে 500 ওয়াটের দুটি...,bn,### অনুভূতি:\nনেগেটিভ\n\nThe reviewer mentions...
153,Health/Wellness,Living,Stay/ Experience,Resorts,Vedic Village Spa Resort (Kolkata),"Swimming pool, fitness center, parking, wifi, ...","Wifi, air conditioning, family friendly","The Resort doesn't have wifi, so you have to b...",Negative,"রিসোর্টে ওয়াইফাই নেই, তাই আপনাকে আপনার ফোন ইন...",bn,### \n\rI \n\rI \n\rI \n\rI \n\rI \n\rI \n\rI ...
154,Entertainment,Movies,Genres,Tragedy,The Tunnel,"serious, storyline, performances, emotional, m...",Peformances and Moving,"Well directed, & acted, & excellent cinematogr...",Positive,"ভালো পরিচালনা এবং অভিনয়, একটি অসাধারণ সিনেমাটো...",bn,### অনুভূতি: পজিটিভ\n\nThe review praises the ...
155,Fashion,Beauty,Fragrance,Deodorants for women,Dove Advanced Care Antiperspirant Deodorant Se...,"alcohol free, skin whitening, odour control, l...",Long lasting,Its a terrible product!! Works only for a few ...,Negative,এটা একটা খারাপ প্রোডাক্ট!! মাত্র কয়েক ঘণ্টা কা...,bn,### অনুভূতি:\nনেগেটিভ


In [24]:
metrics = evaluate_classification(ts_results)
print(f"Temperature: {TEMP}")
for metric, value in metrics.items():
    print(f"{metric}: {value:.4f}")

Temperature: 0.56
accuracy: 0.7692
precision: 0.8197
recall: 0.6667
f1_score: 0.7353
auc_roc: 0.7654
log_loss: 8.3178


In [25]:
import time
from tqdm.notebook import tqdm

# Assuming df is the original DataFrame and category_translation is defined
df['predicted_category'] = None


TEMP = 0.92

# Resume from a previous partial run if a results file already exists
try:
    df_progress = pd.read_parquet(path_root + f"llama_3s_temp{TEMP}_sentiment_results.parquet")
    df['predicted_category'] = df_progress['predicted_category']  # restore saved predictions
    last_done = df_progress['predicted_category'].last_valid_index()
    start_index = 0 if last_done is None else last_done + 1
except (FileNotFoundError, ValueError):
    start_index = 0


for i, row in tqdm(df.iloc[start_index:].iterrows(), initial=start_index, total=len(df) - start_index):
    review = row['INDIC REVIEW']
    label = row['LABEL']
    label_bn = category_translation.get(label, label)  # Use the translated label or fallback to the original

    prompt_input = two_shot + " ###\nরিভিউ: " + review + "\n### অনুভূতি:"

    try:
        df.at[i, 'predicted_category'] = get_llama_response(prompt_input, temperature = TEMP)
    except Exception as e:
        print(f"Error at index {i}: {e}")
        time.sleep(5)

    if (i + 1) % 100 == 0:
        df.to_parquet(path_root + f"llama_3s_temp{TEMP}_sentiment_results.parquet")

# Save final progress
df.to_parquet(path_root + f"llama_3s_temp{TEMP}_sentiment_results.parquet")

  0%|          | 0/156 [00:00<?, ?it/s]

In [26]:
ts_results = pd.read_parquet(path_root + f"llama_3s_temp{TEMP}_sentiment_results.parquet")
ts_results.tail()

Unnamed: 0,GENERIC CATEGORIES,CATEGORY,SUB-CATEGORY,PRODUCT,BRAND,ASPECTS,ASPECT COMBO,ENGLISH REVIEW,LABEL,INDIC REVIEW,lang,predicted_category
151,Home,Appliances,Home theater,Home theater systems,iKall,"Bluetooth, USB &HDMI, Dolby, voice control, sp...",Dolby output,IKall has now launched a new home theater syst...,Positive,আইকল এখন ডলবি আউটপুট সহ একটি নতুন হোম থিয়েটার ...,bn,### অনুভূতি: পজিটিভ\n\nThe review describes a ...
152,Home,Appliances,Home theater,Tower speakers,iKall,"speaker connectivity, speaker feature, wattage...",Wattage,iKall is giving two 500 Watts speakers in its ...,Negative,আইকল তাদের টাওয়ার স্পিকার সেটে 500 ওয়াটের দুটি...,bn,### অনুভূতি:\nনেগেটিভ
153,Health/Wellness,Living,Stay/ Experience,Resorts,Vedic Village Spa Resort (Kolkata),"Swimming pool, fitness center, parking, wifi, ...","Wifi, air conditioning, family friendly","The Resort doesn't have wifi, so you have to b...",Negative,"রিসোর্টে ওয়াইফাই নেই, তাই আপনাকে আপনার ফোন ইন...",bn,IーRLau\n\n### ORDERAlso ITOKEY_S’\n\n$XCÉREXTI...
154,Entertainment,Movies,Genres,Tragedy,The Tunnel,"serious, storyline, performances, emotional, m...",Peformances and Moving,"Well directed, & acted, & excellent cinematogr...",Positive,"ভালো পরিচালনা এবং অভিনয়, একটি অসাধারণ সিনেমাটো...",bn,"Based on the review, I would rate the overall ..."
155,Fashion,Beauty,Fragrance,Deodorants for women,Dove Advanced Care Antiperspirant Deodorant Se...,"alcohol free, skin whitening, odour control, l...",Long lasting,Its a terrible product!! Works only for a few ...,Negative,এটা একটা খারাপ প্রোডাক্ট!! মাত্র কয়েক ঘণ্টা কা...,bn,### অনুভূতি:\nনেগেটিভ


In [27]:
metrics = evaluate_classification(ts_results)
print(f"Temperature: {TEMP}")
for metric, value in metrics.items():
    print(f"{metric}: {value:.4f}")

Temperature: 0.92
accuracy: 0.5256
precision: 0.5072
recall: 0.4667
f1_score: 0.4861
auc_roc: 0.5235
log_loss: 17.0976


In [28]:
import time
from tqdm.notebook import tqdm

# Assuming df is the original DataFrame and category_translation is defined
df['predicted_category'] = None


TEMP = 1.28

# Resume from a previous partial run if a results file already exists
try:
    df_progress = pd.read_parquet(path_root + f"llama_3s_temp{TEMP}_sentiment_results.parquet")
    df['predicted_category'] = df_progress['predicted_category']  # restore saved predictions
    last_done = df_progress['predicted_category'].last_valid_index()
    start_index = 0 if last_done is None else last_done + 1
except (FileNotFoundError, ValueError):
    start_index = 0


for i, row in tqdm(df.iloc[start_index:].iterrows(), initial=start_index, total=len(df) - start_index):
    review = row['INDIC REVIEW']
    label = row['LABEL']
    label_bn = category_translation.get(label, label)  # Use the translated label or fallback to the original

    prompt_input = two_shot + " ###\nরিভিউ: " + review + "\n### অনুভূতি:"

    try:
        df.at[i, 'predicted_category'] = get_llama_response(prompt_input, temperature = TEMP)
    except Exception as e:
        print(f"Error at index {i}: {e}")
        time.sleep(5)

    if (i + 1) % 100 == 0:
        df.to_parquet(path_root + f"llama_3s_temp{TEMP}_sentiment_results.parquet")

# Save final progress
df.to_parquet(path_root + f"llama_3s_temp{TEMP}_sentiment_results.parquet")

  0%|          | 0/156 [00:00<?, ?it/s]

In [29]:
ts_results = pd.read_parquet(path_root + f"llama_3s_temp{TEMP}_sentiment_results.parquet")
ts_results.tail()

Unnamed: 0,GENERIC CATEGORIES,CATEGORY,SUB-CATEGORY,PRODUCT,BRAND,ASPECTS,ASPECT COMBO,ENGLISH REVIEW,LABEL,INDIC REVIEW,lang,predicted_category
151,Home,Appliances,Home theater,Home theater systems,iKall,"Bluetooth, USB &HDMI, Dolby, voice control, sp...",Dolby output,IKall has now launched a new home theater syst...,Positive,আইকল এখন ডলবি আউটপুট সহ একটি নতুন হোম থিয়েটার ...,bn,### অনুভূতি: পজিটিভ\n\nThe reviewer uses words...
152,Home,Appliances,Home theater,Tower speakers,iKall,"speaker connectivity, speaker feature, wattage...",Wattage,iKall is giving two 500 Watts speakers in its ...,Negative,আইকল তাদের টাওয়ার স্পিকার সেটে 500 ওয়াটের দুটি...,bn,### অনুভূতি:\nনেগেটিভ\n\nThe reviewer seems to...
153,Health/Wellness,Living,Stay/ Experience,Resorts,Vedic Village Spa Resort (Kolkata),"Swimming pool, fitness center, parking, wifi, ...","Wifi, air conditioning, family friendly","The Resort doesn't have wifi, so you have to b...",Negative,"রিসোর্টে ওয়াইফাই নেই, তাই আপনাকে আপনার ফোন ইন...",bn,What a delightful challenge! 🤩\n\nBased on the...
154,Entertainment,Movies,Genres,Tragedy,The Tunnel,"serious, storyline, performances, emotional, m...",Peformances and Moving,"Well directed, & acted, & excellent cinematogr...",Positive,"ভালো পরিচালনা এবং অভিনয়, একটি অসাধারণ সিনেমাটো...",bn,### অনু\n\nইূ --------------------------------...
155,Fashion,Beauty,Fragrance,Deodorants for women,Dove Advanced Care Antiperspirant Deodorant Se...,"alcohol free, skin whitening, odour control, l...",Long lasting,Its a terrible product!! Works only for a few ...,Negative,এটা একটা খারাপ প্রোডাক্ট!! মাত্র কয়েক ঘণ্টা কা...,bn,নেগেটিভ


In [30]:
metrics = evaluate_classification(ts_results)
print(f"Temperature: {TEMP}")
for metric, value in metrics.items():
    print(f"{metric}: {value:.4f}")

Temperature: 1.28
accuracy: 0.4615
precision: 0.4400
recall: 0.4400
f1_score: 0.4400
auc_roc: 0.4607
log_loss: 19.4081


In [31]:
import time
from tqdm.notebook import tqdm

# Assuming df is the original DataFrame and category_translation is defined
df['predicted_category'] = None


TEMP = 1.64

# Resume from a previous partial run if a results file already exists
try:
    df_progress = pd.read_parquet(path_root + f"llama_3s_temp{TEMP}_sentiment_results.parquet")
    df['predicted_category'] = df_progress['predicted_category']  # restore saved predictions
    last_done = df_progress['predicted_category'].last_valid_index()
    start_index = 0 if last_done is None else last_done + 1
except (FileNotFoundError, ValueError):
    start_index = 0


for i, row in tqdm(df.iloc[start_index:].iterrows(), initial=start_index, total=len(df) - start_index):
    review = row['INDIC REVIEW']
    label = row['LABEL']
    label_bn = category_translation.get(label, label)  # Use the translated label or fallback to the original

    prompt_input = two_shot + " ###\nরিভিউ: " + review + "\n### অনুভূতি:"

    try:
        df.at[i, 'predicted_category'] = get_llama_response(prompt_input, temperature = TEMP)
    except Exception as e:
        print(f"Error at index {i}: {e}")
        time.sleep(5)

    if (i + 1) % 100 == 0:
        df.to_parquet(path_root + f"llama_3s_temp{TEMP}_sentiment_results.parquet")

# Save final progress
df.to_parquet(path_root + f"llama_3s_temp{TEMP}_sentiment_results.parquet")

  0%|          | 0/156 [00:00<?, ?it/s]

In [32]:
ts_results = pd.read_parquet(path_root + f"llama_3s_temp{TEMP}_sentiment_results.parquet")
ts_results.tail()

Unnamed: 0,GENERIC CATEGORIES,CATEGORY,SUB-CATEGORY,PRODUCT,BRAND,ASPECTS,ASPECT COMBO,ENGLISH REVIEW,LABEL,INDIC REVIEW,lang,predicted_category
151,Home,Appliances,Home theater,Home theater systems,iKall,"Bluetooth, USB &HDMI, Dolby, voice control, sp...",Dolby output,IKall has now launched a new home theater syst...,Positive,আইকল এখন ডলবি আউটপুট সহ একটি নতুন হোম থিয়েটার ...,bn,### অনুভূতি: পজিটিভ\n\n(The reviewer mentioned...
152,Home,Appliances,Home theater,Tower speakers,iKall,"speaker connectivity, speaker feature, wattage...",Wattage,iKall is giving two 500 Watts speakers in its ...,Negative,আইকল তাদের টাওয়ার স্পিকার সেটে 500 ওয়াটের দুটি...,bn,N réal yogideasiness+fcom reconnaissance_IMPOR...
153,Health/Wellness,Living,Stay/ Experience,Resorts,Vedic Village Spa Resort (Kolkata),"Swimming pool, fitness center, parking, wifi, ...","Wifi, air conditioning, family friendly","The Resort doesn't have wifi, so you have to b...",Negative,"রিসোর্টে ওয়াইফাই নেই, তাই আপনাকে আপনার ফোন ইন...",bn,####?Silver */) \nFinally anticipated vario...
154,Entertainment,Movies,Genres,Tragedy,The Tunnel,"serious, storyline, performances, emotional, m...",Peformances and Moving,"Well directed, & acted, & excellent cinematogr...",Positive,"ভালো পরিচালনা এবং অভিনয়, একটি অসাধারণ সিনেমাটো...",bn,"�舰💡 whom compelled quotes \nERS3 '""be""\n\n\nV..."
155,Fashion,Beauty,Fragrance,Deodorants for women,Dove Advanced Care Antiperspirant Deodorant Se...,"alcohol free, skin whitening, odour control, l...",Long lasting,Its a terrible product!! Works only for a few ...,Negative,এটা একটা খারাপ প্রোডাক্ট!! মাত্র কয়েক ঘণ্টা কা...,bn,নেগেটিভ\n\nThe review uses strong negative lan...


In [33]:
metrics = evaluate_classification(ts_results)
print(f"Temperature: {TEMP}")
for metric, value in metrics.items():
    print(f"{metric}: {value:.4f}")

Temperature: 1.64
accuracy: 0.2692
precision: 0.2254
recall: 0.2133
f1_score: 0.2192
auc_roc: 0.2672
log_loss: 26.3396
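The per-temperature metrics printed above can be gathered into a single markdown table like the summary at the top of this notebook. The helper below is a minimal sketch; `metrics_table` is a hypothetical name, and the two rows are copied from the outputs for temperatures 0.2 and 1.64.

```python
def metrics_table(rows):
    """Render (temperature, metrics dict) pairs as a markdown table."""
    columns = ["accuracy", "precision", "recall", "f1_score", "auc_roc", "log_loss"]
    lines = [
        "| Temperature | " + " | ".join(columns) + " |",
        "|" + "---|" * (len(columns) + 1),
    ]
    for temp, m in rows:
        cells = [f"{temp:.2f}"] + [f"{m[c]:.4f}" for c in columns]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)

rows = [
    (0.2, {"accuracy": 0.8974, "precision": 0.9041, "recall": 0.8800,
           "f1_score": 0.8919, "auc_roc": 0.8968, "log_loss": 3.6968}),
    (1.64, {"accuracy": 0.2692, "precision": 0.2254, "recall": 0.2133,
            "f1_score": 0.2192, "auc_roc": 0.2672, "log_loss": 26.3396}),
]
print(metrics_table(rows))
```

In practice the `rows` list would be filled by calling `evaluate_classification` once per temperature instead of typing the numbers by hand.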
