# Chatbot Q&A Quranic Reasoning

## Business Understanding

- How can the QRQA Dataset be used to develop AI-based Islamic digital education products (such as Q&A chatbots, learning apps, or a virtual mufti)?

  _To identify opportunities for derivative products and potential market segments (students, academics, digital pesantren, etc.)._

- Which language model (such as LLaMA, Mistral, DeepSeek, etc.) is best suited for fine-tuning on the QRQA Dataset in terms of speed, accuracy, and cost efficiency?

  _This will be tested in this notebook._

- How can we measure the effectiveness of the model's reasoning on the complex questions in QRQA?

  _Using evaluation metrics such as BLEU, ROUGE, or a human-evaluated Islamic consistency score._

## Data and Tools Acquisition

In [1]:
# sentencepiece is needed by T5Tokenizer, openpyxl by pd.read_excel, nltk for BLEU
!pip install transformers sentencepiece
!pip install kaggle kagglehub
!pip install rouge-score nltk openpyxl

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import kagglehub
from kagglehub import KaggleDatasetAdapter
from google.colab import files
import os
import pathlib
import pandas as pd
from sklearn.model_selection import train_test_split
from transformers import T5Tokenizer, T5ForConditionalGeneration
import torch
from torch.utils.data import DataLoader, Dataset
from torch.optim import AdamW
from nltk.translate.bleu_score import sentence_bleu
from rouge_score import rouge_scorer

In [None]:
! mkdir -p ~/.kaggle

In [None]:
# Assumes Google Drive is already mounted at /content/drive
!cp /content/drive/MyDrive/CollabData/kaggle_API/kaggle.json ~/.kaggle/kaggle.json

In [None]:
! chmod 600 ~/.kaggle/kaggle.json

In [None]:
! kaggle datasets download lazer999/quranic-reasoning-synthetic-dataset

In [None]:
! kaggle datasets download alizahidraja/quran-english

In [None]:
! unzip quranic-reasoning-synthetic-dataset.zip

In [None]:
! unzip quran-english.zip

## Data Preparation

In [None]:
file_path = "/content/Quran_R1_excel.xlsx"
df = pd.read_excel(file_path)
df.head()

In [None]:
df.info()

The `Unnamed: 0` column carries no useful information, so we drop it.

In [None]:
df = df.drop(columns=['Unnamed: 0'])
display(df.head())  # display() is needed: only a cell's last expression is shown automatically
df.info()

Next, let's load the second dataset.

In [None]:
file_path = "/content/Quran_English_with_Tafseer.csv"
df_quran = pd.read_csv(file_path)
df_quran.head()

In [None]:
df_quran.info()

In [None]:
display(df_quran[df_quran['Tafseer'].isnull()])

One row has a missing `Tafseer` value. We fill this gap with a synthetic explanation.

In [None]:
# Fill the missing 'Tafseer' value with a synthetic explanation
df_quran['Tafseer'] = df_quran['Tafseer'].fillna("This surah emphasizes that Allah is the protector and ally (Mawlā) of those who believe, offering them divine support, guidance, and victory, while the disbelievers are left without any true protector. This verse reassures the believers that despite external challenges or opposition, they are never alone—Allah stands by them in both worldly and spiritual affairs. Conversely, disbelievers, no matter their apparent power or alliances, lack divine backing and are ultimately vulnerable. Revealed in the context of struggle between faith and disbelief, particularly in times of conflict, this verse highlights the importance of trusting in Allah, as real strength and success come through His support, not mere worldly means.")
print(df_quran[df_quran['Tafseer'].isnull()])

### Data Merging

Before developing the model, let's merge `df_quran` with `df`.

In [None]:
# Create the first template
df_quran['Question'] = "Question: What is the meaning of Surah " + df_quran['Surah'].astype(str) + ":" + df_quran['Ayat'].astype(str) + "?"
df_quran['Response'] = "Response: \nVerse:\n" + df_quran['Verse'] + ", " + df_quran['Tafseer']

# Create the second template and append it to the first dataframe
df_quran_2 = pd.DataFrame()
df_quran_2['Question'] = "Question: What is the meaning of Surah " + df_quran['Name'] + ":" + df_quran['Ayat'].astype(str) + "?"
df_quran_2['Response'] = "Response: \nVerse:\n" + df_quran['Verse'] + ", " + df_quran['Tafseer']

df_quran = pd.concat([df_quran, df_quran_2], ignore_index=True)

# Select only the relevant columns for merging
df_quran = df_quran[['Question', 'Response']]

# Concatenate the two dataframes
merged_df = pd.concat([df, df_quran], ignore_index=True)
merged_df.head()
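
To make the two question templates concrete, here is a small standalone sketch that applies them to a single row. It uses a plain dictionary instead of a DataFrame, the helper name `build_qa_pairs` is hypothetical, and the sample row values are placeholders rather than rows from the actual CSV.

```python
def build_qa_pairs(row):
    """Return the two (question, response) pairs generated for one verse row."""
    response = "Response: \nVerse:\n" + row["Verse"] + ", " + row["Tafseer"]
    by_number = f"Question: What is the meaning of Surah {row['Surah']}:{row['Ayat']}?"
    by_name = f"Question: What is the meaning of Surah {row['Name']}:{row['Ayat']}?"
    return [(by_number, response), (by_name, response)]

# Placeholder row: these values are illustrative, not taken from the dataset.
sample_row = {
    "Surah": 1,
    "Name": "Al-Fatihah",
    "Ayat": 1,
    "Verse": "<verse text>",
    "Tafseer": "<tafseer text>",
}

pairs = build_qa_pairs(sample_row)
for question, _ in pairs:
    print(question)
```

Each verse therefore appears twice in the merged data, once addressed by surah number and once by surah name, with an identical response.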


## Model Development

We will use the T5 model; see [this guide](https://medium.com/@gagangupta_82781/understanding-the-t5-model-a-comprehensive-guide-b4d5c02c234b) for an explanation of the architecture.

In [None]:
inputt = merged_df['Question'].tolist()
labelt = merged_df['Response'].tolist()

Train-test split (here we split 9:1 and use only the rows that came from `df`)

In [None]:
train_inputs, test_inputs, train_labels, test_labels = train_test_split(inputt[:857], labelt[:857], test_size=0.1, random_state=42)
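
The slice `[:857]` hard-codes the number of rows taken from the first dataset. A more defensive variant derives that boundary from the data itself. The sketch below illustrates the idea with plain lists and a hand-rolled shuffle standing in for `train_test_split`; the helper name `split_first_n` and the toy data are hypothetical.

```python
import random

def split_first_n(inputs, labels, n_first, test_size=0.1, seed=42):
    """Shuffle the first n_first (input, label) pairs and split them into train and test."""
    pairs = list(zip(inputs[:n_first], labels[:n_first]))
    random.Random(seed).shuffle(pairs)
    n_test = max(1, round(len(pairs) * test_size))
    return pairs[n_test:], pairs[:n_test]

# Toy stand-ins: 20 rows from the first dataset followed by 5 rows from the second.
n_first = 20
inputs = [f"q{i}" for i in range(25)]
labels = [f"a{i}" for i in range(25)]

train, test = split_first_n(inputs, labels, n_first)
print(len(train), len(test))
```

In the notebook, the same idea would amount to passing `inputt[:len(df)]` and `labelt[:len(df)]` to `train_test_split`, assuming `df` still holds only the QRQA rows at that point.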


Let's load the tokenizer and the pretrained model we will use, in this case T5.

In [None]:
tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

Before training the model, let's tokenize the data.

In [None]:
def tokenize_data(inputs, labels, tokenizer, max_length=128):
    input_encodings = tokenizer(
        list(inputs), max_length=max_length, padding=True, truncation=True, return_tensors="pt"
    )
    label_encodings = tokenizer(
        list(labels), max_length=max_length, padding=True, truncation=True, return_tensors="pt"
    )
    return input_encodings, label_encodings

train_inputs_enc, train_labels_enc = tokenize_data(train_inputs, train_labels, tokenizer)
test_inputs_enc, test_labels_enc = tokenize_data(test_inputs, test_labels, tokenizer)
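
One detail worth knowing: because `padding=True` pads the label sequences as well, the padded positions contribute to the training loss unless they are masked. A common convention with Hugging Face seq2seq models is to replace pad token ids in the labels with `-100`, which the cross-entropy loss ignores. The sketch below shows the idea on plain Python lists; `mask_label_padding` is a hypothetical helper, the token ids are made up, and the default pad id of `0` matches the T5 tokenizer.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss

def mask_label_padding(label_ids, pad_token_id=0):
    """Return a copy of the label sequences with padding ids replaced by IGNORE_INDEX."""
    return [
        [IGNORE_INDEX if token == pad_token_id else token for token in sequence]
        for sequence in label_ids
    ]

# Toy batch of two padded label sequences (made-up ids; 1 is T5's end-of-sequence id).
batch = [[312, 45, 1, 0, 0], [87, 1, 0, 0, 0]]
masked = mask_label_padding(batch)
print(masked)
```

With tensors, the equivalent is `labels[labels == tokenizer.pad_token_id] = -100` applied to a copy of the label ids. Masked labels can no longer be passed directly to `tokenizer.decode`, so the unmasked ids should be kept wherever reference texts are decoded.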

In [None]:
class CustomDataset(Dataset):
    def __init__(self, encodings, labels):
        self.encodings = encodings
        self.labels = labels

    def __len__(self):
        return len(self.labels["input_ids"])

    def __getitem__(self, idx):
        return {
            "input_ids": self.encodings["input_ids"][idx],
            "attention_mask": self.encodings["attention_mask"][idx],
            "labels": self.labels["input_ids"][idx],
        }

train_dataset = CustomDataset(train_inputs_enc, train_labels_enc)
test_dataset = CustomDataset(test_inputs_enc, test_labels_enc)

train_loader = DataLoader(train_dataset, batch_size=8, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=8)

Now let's train the model, using the AdamW optimizer to update its weights.

In [None]:
optimizer = AdamW(model.parameters(), lr=5e-6)

device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
model.to(device)

epochs = 300
for epoch in range(epochs):
    model.train()
    for batch in train_loader:
        optimizer.zero_grad()

        input_ids = batch["input_ids"].to(device)
        attention_mask = batch["attention_mask"].to(device)
        labels = batch["labels"].to(device)

        outputs = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels)
        loss = outputs.loss
        loss.backward()
        optimizer.step()

    print(f"Epoch {epoch + 1} Loss: {loss.item()}")

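The final epoch reports a loss of 0.13. Note that the printed value is the loss of the last batch in that epoch, not an average over the epoch.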
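
Since a single batch loss can be noisy, a hedged alternative is to collect the batch losses during an epoch and report their mean. The sketch below uses made-up loss values in place of `loss.item()`; the helper `epoch_mean_loss` is hypothetical.

```python
def epoch_mean_loss(batch_losses):
    """Average the per-batch loss values collected during one epoch."""
    if not batch_losses:
        raise ValueError("no batches were processed")
    return sum(batch_losses) / len(batch_losses)

# Made-up per-batch losses standing in for loss.item() inside the training loop.
batch_losses = [0.5, 0.25, 0.75]
mean_loss = epoch_mean_loss(batch_losses)
print(f"Epoch 1 mean loss: {mean_loss:.4f}")
```

In the training loop, this would mean appending `loss.item()` to a list inside the batch loop and calling the helper once per epoch.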

## Model Testing

In [None]:
model.eval()
for batch in test_loader:
    input_ids = batch["input_ids"].to(device)
    attention_mask = batch["attention_mask"].to(device)
    labels = batch["labels"].to(device)

    input_texts = [tokenizer.decode(ids, skip_special_tokens=True) for ids in input_ids]
    true_labels = [tokenizer.decode(label, skip_special_tokens=True) for label in labels]

    outputs = model.generate(
        input_ids=input_ids,
        attention_mask=attention_mask,
        max_length=50
    )
    predictions = [tokenizer.decode(output, skip_special_tokens=True) for output in outputs]

    for input_text, true_label, pred in zip(input_texts, true_labels, predictions):
        print("-" * 50)
        print(f"input_txt: {input_text}")
        print(f"true_label: {true_label}")
        print(f"true_pred: {pred}")

    break  # inspect only the first test batch

## Model Evaluation

In [None]:
# Initialize the ROUGE scorer
scorer = rouge_scorer.RougeScorer(['rouge1', 'rougeL'], use_stemmer=True)

# 'predictions' and 'true_labels' come from the previous cell, which stops after
# the first test batch, so the scores below cover at most 8 examples

bleu_scores = []
rouge1_scores = []
rougeL_scores = []

for prediction, true_label in zip(predictions, true_labels):
    # Calculate BLEU score
    reference = [true_label.split()]
    candidate = prediction.split()
    bleu_score = sentence_bleu(reference, candidate)
    bleu_scores.append(bleu_score)

    # Calculate ROUGE scores
    scores = scorer.score(true_label, prediction)
    rouge1_scores.append(scores['rouge1'].fmeasure)
    rougeL_scores.append(scores['rougeL'].fmeasure)

# Calculate average scores
avg_bleu = np.mean(bleu_scores)
avg_rouge1 = np.mean(rouge1_scores)
avg_rougeL = np.mean(rougeL_scores)

print(f"Average BLEU Score: {avg_bleu}")
print(f"Average ROUGE-1 Score: {avg_rouge1}")
print(f"Average ROUGE-L Score: {avg_rougeL}")


## Explanation of Each Metric

---

- **BLEU (Bilingual Evaluation Understudy)**

  > Value: 0.050

  BLEU measures the similarity between the model's generated output and the reference answer based on n-gram overlap.

  BLEU scores below 0.1 are common in Q&A settings, especially for long, reasoning-heavy, or religiously nuanced text, because the structure of an answer can vary widely depending on the question.

  For this model, the BLEU score is relatively **low**, which indicates that the generated answers differ substantially in wording from the reference answers, even though their meaning may still be correct. Generation is also capped at `max_length=50` tokens while references can be up to 128 tokens long, so BLEU's brevity penalty may lower the score further.
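
To illustrate what n-gram overlap means here, the following sketch computes the clipped n-gram precision that BLEU is built on. It is a simplified illustration, not the `sentence_bleu` implementation: it assumes whitespace tokenization and a single reference, and it omits the brevity penalty and the geometric mean over n-gram orders. The function name and example sentences are made up.

```python
from collections import Counter

def ngram_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against a single reference."""
    cand_tokens, ref_tokens = candidate.split(), reference.split()
    cand_ngrams = Counter(
        tuple(cand_tokens[i:i + n]) for i in range(len(cand_tokens) - n + 1)
    )
    ref_ngrams = Counter(
        tuple(ref_tokens[i:i + n]) for i in range(len(ref_tokens) - n + 1)
    )
    total = sum(cand_ngrams.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram counts at most as often as it appears in the reference.
    matched = sum(min(count, ref_ngrams[gram]) for gram, count in cand_ngrams.items())
    return matched / total

reference = "the verse teaches patience in hardship"
candidate = "this verse is about being patient during hardship"

p1 = ngram_precision(candidate, reference, 1)
p2 = ngram_precision(candidate, reference, 2)
print(p1, p2)
```

The two sentences convey a similar meaning, yet they share only two unigrams and no bigrams, which is how a paraphrased answer can end up with a low BLEU score.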

---

- **ROUGE-1**
  > Value: 0.405

  Measures direct word overlap (unigram overlap) between the model's answer and the reference answer.

  A score above 0.4 can be considered **fairly good** for generative Q&A tasks.
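
As a rough illustration of unigram overlap, the sketch below computes a ROUGE-1 style F-measure from token counts. It assumes lowercase whitespace tokenization and no stemming, so its numbers will not exactly match `rouge_score` with `use_stemmer=True`; the function name and sentences are made up.

```python
from collections import Counter

def rouge1_f(reference, prediction):
    """ROUGE-1 style F-measure based on clipped unigram overlap."""
    ref_counts = Counter(reference.lower().split())
    pred_counts = Counter(prediction.lower().split())
    overlap = sum((ref_counts & pred_counts).values())  # multiset intersection
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

reference = "the verse teaches patience in hardship"
prediction = "the verse teaches gratitude"

score = rouge1_f(reference, prediction)
print(round(score, 3))
```

Here three of the four predicted words appear in the reference (precision 0.75), and they cover three of its six words (recall 0.5), which combine into an F-measure of 0.6.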

---

- **ROUGE-L**
  > Value: 0.319

  Measures similarity in structure or word order, based on the longest common subsequence (LCS).

  A score above 0.3 suggests that the model is **fairly good** at reproducing part of the sentence structure of the reference answers.
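
The longest common subsequence behind ROUGE-L can be sketched with a short dynamic program. As with the other sketches, this assumes lowercase whitespace tokenization without stemming, so it approximates rather than reproduces `rouge_score`; the function names and sentences are made up.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    previous = [0] * (len(b) + 1)
    for token_a in a:
        current = [0]
        for j, token_b in enumerate(b, start=1):
            if token_a == token_b:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]

def rouge_l_f(reference, prediction):
    """ROUGE-L style F-measure from the LCS of whitespace tokens."""
    ref_tokens = reference.lower().split()
    pred_tokens = prediction.lower().split()
    lcs = lcs_length(ref_tokens, pred_tokens)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(pred_tokens), lcs / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

reference = "the verse teaches patience in hardship"
prediction = "in hardship the verse teaches"

lcs = lcs_length(reference.split(), prediction.split())
score = rouge_l_f(reference, prediction)
print(lcs, round(score, 3))
```

All five predicted words occur in the reference, but only three of them ("the verse teaches") appear in the same order, so ROUGE-L rewards this prediction less than a pure unigram count would.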

## Model Saving

In [None]:
# Save the model
model_path = "/content/drive/MyDrive/CollabData/QuranicReasoningModel/Model1"
model.save_pretrained(model_path)
tokenizer.save_pretrained(model_path)

print(f"Model saved to {model_path}")