### 2. **Data Loading and Preprocessing**
   - Reads training text and label files and strips any extra whitespace or newlines.
   - Checks if the number of texts and labels match and raises an error if they don't.
   - Converts the text and label data into a pandas DataFrame for further processing.

In [2]:
import pandas as pd

# Define file paths
text_file = 'riloff tweet/train.txt'
label_file = 'riloff tweet/labels_train.txt'

# Read text samples
with open(text_file, 'r', encoding='utf-8') as tf:
    texts = tf.readlines()

# Read labels
with open(label_file, 'r', encoding='utf-8') as lf:
    labels = lf.readlines()

# Strip any extra whitespace or newlines from the text and labels
texts = [text.strip() for text in texts]
labels = [label.strip() for label in labels]

# Ensure the lengths of both lists match
if len(texts) != len(labels):
    raise ValueError("The number of text samples and labels must be equal.")

# Convert to DataFrame
df = pd.DataFrame({
    'text': texts,
    'label': labels
})

In [3]:
df

Unnamed: 0,text,label
0,Nih min buat fans arsenal @my_supersoccer :D h...,0
1,Give a person power that will be a true test o...,0
2,@LordWilsonVILLA At 21 he looks to have a lot ...,0
3,I'm about to fall asleep and I still have to b...,0
4,I love hearing the shots from the shooting ran...,0
...,...,...
1363,Wanna know how I know your gay??? Because when...,0
1364,Love the political advice on Twitter. #getinfo...,0
1365,GrandMarc flow. #TCU @Coll_Crowley http://t.co...,0
1366,@NoLoveInTheCity what up boo ?,0


### 3. **Document Loading and Text Splitting**
   - Loads documents using `DataFrameLoader` and splits the text into manageable chunks using `RecursiveCharacterTextSplitter` for efficient processing.
   - The text is split into chunks of size 500, with 200 characters overlapping between consecutive chunks.
   - Initializes embeddings using OpenAI and sets up the LLM for the task.

In [4]:
from langchain.document_loaders import DataFrameLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
CHUNK_SIZE = 500
CHUNK_OVERLAP = 200
# Load documents
df['document'] = df.apply(lambda row: f"Text: {row['text']} Label: {row['label']}", axis=1)
articles = DataFrameLoader(df, page_content_column="document")
documents = articles.load()

# Concatenate and sort documents by date
sorted_docs = sorted(documents, key=lambda x: x.metadata.get("Date", ""))
concatenated_content = "\n\n\n---\n\n\n".join(doc.page_content for doc in reversed(sorted_docs))

# Split text into chunks
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=CHUNK_SIZE, chunk_overlap=CHUNK_OVERLAP
)
texts_split = text_splitter.split_text(concatenated_content)

# Embedding model and LLM setup
embeddings = OpenAIEmbeddings()
model = ChatOpenAI(temperature=0, model="gpt-3.5-turbo", max_tokens=4096)

### 4. **Building the VectorStore**
   - Builds a FAISS vector store from the split text and creates a retriever for querying relevant documents.

In [11]:
# Build VectorStore
from langchain_community.vectorstores import FAISS
vectorstore = FAISS.from_texts(texts=texts_split, embedding=embeddings)
retriever = vectorstore.as_retriever()

### 5. **RAG Setup**
   - Defines a RAG (Retrieval-Augmented Generation) prompt template for sarcasm classification, where context is provided to the model to aid in classification.
   - Implements a function `get_rag_response` to perform the RAG operation, retrieving documents and generating a response based on the retrieved context.

In [12]:
# RAG Prompt Template
TEMPLATE = """ 
This is a sarcasm classification task. Use the provided context, which contains examples of sarcastic and non-sarcastic statements, to assist you in classifying the input text. Combine this context with your knowledge to determine the correct label.

Input: {input}

Context:
{context}

Answer: If the input expresses sarcasm, output 'sarcastic'; otherwise, output 'non-sarcastic'.
"""
prompt = ChatPromptTemplate.from_template(TEMPLATE)


# RAG Chain Function
def get_rag_response(question: str, retriever, model, prompt) -> dict:
    try:
        # Retrieve relevant documents
        retrieved_docs = retriever.get_relevant_documents(question)
        
        # Combine the retrieved documents into a single context string
        context = "\n\n".join(doc.page_content for doc in retrieved_docs)
        
        # Format the prompt with the input question and context
        final_prompt = prompt.format_prompt(input=question, context=context)
        
        # Get the model's response
        response = model(final_prompt.to_messages())
        
        return {
            "response": response.content.strip(),  # Clean up the response
            "context": context  # Include context for traceability
        }
    
    except Exception as e:
        print(f"Error in RAG chain: {e}")
        return {"response": None, "context": None}



### 6. **Test Data Loading**
   - Reads and processes the test text and label files, ensuring they are correctly formatted and matched.
   - Converts the test data into a pandas DataFrame for further evaluation.


In [13]:
# Define file paths
text_file = 'riloff tweet/test.txt'
label_file = 'riloff tweet/labels_test.txt'

# Read text samples
with open(text_file, 'r', encoding='utf-8') as tf:
    texts = tf.readlines()

# Read labels
with open(label_file, 'r', encoding='utf-8') as lf:
    labels = lf.readlines()

# Strip any extra whitespace or newlines from the text and labels
texts = [text.strip() for text in texts]
labels = [label.strip() for label in labels]

# Ensure the lengths of both lists match
if len(texts) != len(labels):
    raise ValueError("The number of text samples and labels must be equal.")

# Convert to DataFrame
df_test = pd.DataFrame({
    'text': texts,
    'label': labels
})

In [14]:
df_test

Unnamed: 0,text,label
0,Absolutely love when water is spilt on my phon...,1
1,I was hoping just a LITTLE more shit could hit...,1
2,@pdomo Don't forget that Nick Foles is also th...,0
3,I constantly see tweets about Arsenal on twitt...,0
4,Can feel the feet pulsating...slow one...becau...,0
...,...,...
583,"Somewhere in the desert of Nevada, there is a ...",0
584,I just love getting up this early to go into s...,1
585,"Somewhere in the desert of Nevada, there is a ...",0
586,"Lol 😂 RT“@ReeseButCallMeV: When I'm high, I tu...",0


### 6. **Model Execution and Response Handling**
   - Iterates through each test sample, sends the input text to the RAG chain for classification, and processes the response.
   - Appends the cleaned responses (`sarcastic` or `non-sarcastic`) to a list.
   - Saves intermediate progress after each iteration in a CSV file (optional) and handles any errors that may occur during the process.
   - After processing all samples, the final results are saved to `result_final.csv`.

In [22]:
from tqdm import tqdm
import pandas as pd

# Initialize lists to hold data
responses = []

# Iterate over each question
for i in tqdm(range(0,df_test.shape[0] ), desc="Processing samples"):#
    try:
        # Extract text for the current row
        text = df_test.iloc[i]["text"]
        
        # Get the AI response
        response_dict = get_rag_response(text, retriever, model, prompt)
        raw_response = response_dict.get("response", "error")  # Safely retrieve raw response
        
        # Extract only 'sarcastic' or 'non-sarcastic' using simple string filtering
        if "sarcastic" in raw_response.lower():
            response = "sarcastic"
        elif "non-sarcastic" in raw_response.lower():
            response = "non-sarcastic"
        else:
            response = "error"  # Handle unexpected responses
        
        # Append the cleaned response to the list
        responses.append(response)
        
        # Save progress after each step
        temp_df = pd.DataFrame({
            "text": df_test["text"].iloc[:i+1],
            "response": responses,
            "Predicted Label": [0 if r == 'sarcastic' else 1 for r in responses],
            "True Label": df_test["label"].iloc[:i+1],
        })
        #temp_df.to_csv(f"result_progress_{i+1}.csv", index=False)
    
    except Exception as e:
        print(f"Error at index {i}: {e}")
        continue

# Save the final results after completing the loop
final_df = pd.DataFrame({
    "text": df_test["text"],
    "response": responses,
    "Predicted Label": [0 if r == 'sarcastic' else 1 for r in responses],
    "True Label": df_test["label"],
})
final_df.to_csv("result_final.csv", index=False)

print("Processing completed. Results saved to 'result_final.csv'.")


Processing samples: 100%|██████████| 588/588 [07:46<00:00,  1.26it/s]

Processing completed. Results saved to 'result_final.csv'.





In [23]:
final_df

Unnamed: 0,text,response,Predicted Label,True Label
0,Absolutely love when water is spilt on my phon...,sarcastic,0,1
1,I was hoping just a LITTLE more shit could hit...,sarcastic,0,1
2,@pdomo Don't forget that Nick Foles is also th...,sarcastic,0,0
3,I constantly see tweets about Arsenal on twitt...,sarcastic,0,0
4,Can feel the feet pulsating...slow one...becau...,sarcastic,0,0
...,...,...,...,...
583,"Somewhere in the desert of Nevada, there is a ...",sarcastic,0,0
584,I just love getting up this early to go into s...,sarcastic,0,1
585,"Somewhere in the desert of Nevada, there is a ...",sarcastic,0,0
586,"Lol 😂 RT“@ReeseButCallMeV: When I'm high, I tu...",sarcastic,0,0


### 7. **Evaluation Metrics**
   - Calculates key evaluation metrics such as accuracy, precision, recall, F1 score, and generates a classification report.
   - Displays the metrics and a detailed classification report comparing predicted labels to true labels (non-sarcastic vs sarcastic).

In [26]:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, classification_report

# Ensure True Label and Predicted Label are integers for metric calculations
final_df["True Label"] = final_df["True Label"].astype(int)
final_df["Predicted Label"] = final_df["Predicted Label"].astype(int)

# Extract true labels and predicted labels
true_labels = final_df["True Label"]
predicted_labels = final_df["Predicted Label"]

# Calculate evaluation metrics
accuracy = accuracy_score(true_labels, predicted_labels)

# Display the metrics
print("Evaluation Metrics:")
print(f"Accuracy: {accuracy:.2f}")

# Detailed classification report
print("\nClassification Report:")
print(classification_report(true_labels, predicted_labels, target_names=["Non-Sarcastic", "Sarcastic"]))


Evaluation Metrics:
Accuracy: 0.84

Classification Report:
               precision    recall  f1-score   support

Non-Sarcastic       0.84      1.00      0.91       495
    Sarcastic       0.00      0.00      0.00        93

     accuracy                           0.84       588
    macro avg       0.42      0.50      0.46       588
 weighted avg       0.71      0.84      0.77       588



  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))
  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))
  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))
