## Sarcasm Detection with Machine Learning

**Sarcasm has been part of our language for a long time.**
- It involves saying the opposite of what you mean, usually in a lighthearted manner and with a distinct tone of voice.
- Not everybody picks up on sarcasm, because it requires both linguistic proficiency and an understanding of other people's perspectives.
- What about a computer, though? Can a machine learning model be trained to determine whether or not a phrase is sarcastic? It can, indeed! So this walkthrough is for you if you want to understand how to use machine learning to identify sarcasm.
- I'll guide you through sarcasm detection with machine learning in Python.

**Being sarcastic is all about saying the opposite of what you mean.**
- It has long been a part of human language. To attract more attention, it is now also used in news headlines and on social media.
- Sarcasm detection is a binary classification task in natural language processing. We can use a dataset of sarcastic and non-sarcastic news headlines that I found on Kaggle to train a machine learning model to determine whether or not a sentence is sarcastic.

**I hope you now understand what sarcasm is.** Let's get into the detection part.

In [1]:
import pandas as pd
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB
from sklearn.metrics import classification_report
from sklearn.metrics import accuracy_score

data = pd.read_json("C:/Users/asus/OneDrive/Desktop/ML_Datasets/project/More_Projects/Sarcasm.json", lines=True)
data.head()

| | article_link | headline | is_sarcastic |
|---|---|---|---|
| 0 | https://www.huffingtonpost.com/entry/versace-b... | former versace store clerk sues over secret 'b... | 0 |
| 1 | https://www.huffingtonpost.com/entry/roseanne-... | the 'roseanne' revival catches up to our thorn... | 0 |
| 2 | https://local.theonion.com/mom-starting-to-fea... | mom starting to fear son's web series closest ... | 1 |
| 3 | https://politics.theonion.com/boehner-just-wan... | boehner just wants wife to listen, not come up... | 1 |
| 4 | https://www.huffingtonpost.com/entry/jk-rowlin... | j.k. rowling wishes snape happy birthday in th... | 0 |


The **"is_sarcastic"** column in this dataset contains the labels that we have to predict for the task of sarcasm detection.
- It contains the binary values 1 and 0, where 1 means sarcastic and 0 means not sarcastic. For readability, I will replace these values with the labels **"Sarcasm"** and **"Not Sarcasm"** instead of 1 and 0.

In [2]:
data["is_sarcastic"] = data["is_sarcastic"].map({0: "Not Sarcasm", 1: "Sarcasm"})
data.head()

| | article_link | headline | is_sarcastic |
|---|---|---|---|
| 0 | https://www.huffingtonpost.com/entry/versace-b... | former versace store clerk sues over secret 'b... | Not Sarcasm |
| 1 | https://www.huffingtonpost.com/entry/roseanne-... | the 'roseanne' revival catches up to our thorn... | Not Sarcasm |
| 2 | https://local.theonion.com/mom-starting-to-fea... | mom starting to fear son's web series closest ... | Sarcasm |
| 3 | https://politics.theonion.com/boehner-just-wan... | boehner just wants wife to listen, not come up... | Sarcasm |
| 4 | https://www.huffingtonpost.com/entry/jk-rowlin... | j.k. rowling wishes snape happy birthday in th... | Not Sarcasm |


**Let's now get the data ready for machine learning model training.** Of the three columns in this dataset, we only need the "headline" column as the feature and the "is_sarcastic" column as the label.
- So let's select these two columns, convert the headlines into word counts with `CountVectorizer`, and divide the data into two sets: an 80% training set and a 20% test set.

In [3]:
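data = data[["headline", "is_sarcastic"]]
x = np.array(data["headline"])      # raw headline text (feature)
y = np.array(data["is_sarcastic"])  # "Sarcasm" / "Not Sarcasm" (label)
cv = CountVectorizer()
X = cv.fit_transform(x)  # learn the vocabulary and build the word-count matrix
# Hold out 20% of the rows for testing; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)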

In [4]:
model = BernoulliNB()
model.fit(X_train, y_train)
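
`BernoulliNB` models each word as either present or absent. With its default `binarize=0.0`, any count above zero is treated as 1, so repeated words carry no extra weight. The sketch below illustrates this on a small made-up count matrix; the numbers and labels are hypothetical and only meant to show the behaviour.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

# Hypothetical word-count matrix: 4 documents, 3 vocabulary words
counts = np.array([[2, 0, 1],
                   [0, 3, 0],
                   [1, 0, 0],
                   [0, 1, 4]])
labels = np.array(["Sarcasm", "Not Sarcasm", "Sarcasm", "Not Sarcasm"])

# The same matrix reduced to presence (1) or absence (0) of each word
presence = (counts > 0).astype(int)

nb_counts = BernoulliNB().fit(counts, labels)
nb_presence = BernoulliNB().fit(presence, labels)

# Both models learn identical per-class word probabilities
same_model = np.allclose(nb_counts.feature_log_prob_, nb_presence.feature_log_prob_)
print(same_model)

# A new document containing only the first word, repeated five times
toy_prediction = nb_counts.predict(np.array([[5, 0, 0]]))[0]
print(toy_prediction)
```

If word frequency within a headline should matter, `MultinomialNB` is the usual variant to try instead; whether it performs better on this dataset would need to be checked.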

In [5]:
y_pred = model.predict(X_test)
accuracy_score(y_test,y_pred)

0.8517409210033695

In [6]:
print("test_score of the model:",model.score(X_test,y_test))
print("train score of the model:",model.score(X_train,y_train))

test_score of the model: 0.8517409210033695
train score of the model: 0.9292399850243355


In [7]:
from sklearn.metrics import classification_report
print(classification_report(y_test,y_pred))

              precision    recall  f1-score   support

 Not Sarcasm       0.84      0.91      0.87      2969
     Sarcasm       0.87      0.78      0.82      2373

    accuracy                           0.85      5342
   macro avg       0.86      0.84      0.85      5342
weighted avg       0.85      0.85      0.85      5342
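
As a rough check on how to read this report, the F1 scores and averages can be recomputed from the precision, recall and support columns. The sketch below uses the rounded values printed above, so the results only approximate what scikit-learn computes from the unrounded figures; the helper name `f1` is introduced here for illustration.

```python
def f1(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# (precision, recall, support) copied from the printed report
report = {
    "Not Sarcasm": (0.84, 0.91, 2969),
    "Sarcasm": (0.87, 0.78, 2373),
}

f1_scores = {label: f1(p, r) for label, (p, r, _) in report.items()}
total_support = sum(n for _, _, n in report.values())

# Macro average: plain mean over classes
macro_f1 = sum(f1_scores.values()) / len(f1_scores)
# Weighted average: mean over classes, weighted by number of test rows
weighted_f1 = sum(f1_scores[label] * n for label, (_, _, n) in report.items()) / total_support

for label, score in f1_scores.items():
    print(label, round(score, 2))
print("macro", round(macro_f1, 2), "weighted", round(weighted_f1, 2))
```

The lower recall for the "Sarcasm" class (0.78) means the model misses roughly one in five sarcastic headlines in the test set, while its precision (0.87) means most headlines it flags as sarcastic really are.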



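- Now let's use a sarcastic text as input to test whether our machine learning model detects sarcasm. Note that the headline used here also appears in the dataset, so the model may already have seen it during training.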

In [8]:
user = input("Enter a Text: ")
# Reuse the fitted vectorizer; a separate name avoids overwriting the `data` DataFrame
features = cv.transform([user]).toarray()
output = model.predict(features)
print(output)

Enter a Text: boehner just wants wife to listen, not come up with alternative debt-reduction ideas
['Sarcasm']
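
In the cells above, the vectorizer is fit on all headlines before the split, so its vocabulary also includes words that occur only in the test set. One common alternative is to bundle the vectorizer and the classifier in a scikit-learn `Pipeline`, which is fit on the training text only and can then be applied to raw strings. The sketch below assumes this approach; the training headlines and the `predict_sarcasm` helper are made up for illustration and are not part of the original notebook.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB
from sklearn.pipeline import make_pipeline

# Hypothetical training data, standing in for the real training split
train_headlines = [
    "area man wins argument with himself",
    "nation shocked that monday arrived again",
    "city council approves new budget",
    "scientists publish study on sleep",
]
train_labels = ["Sarcasm", "Sarcasm", "Not Sarcasm", "Not Sarcasm"]

# The pipeline vectorizes the text and fits the classifier in one call
sarcasm_pipeline = make_pipeline(CountVectorizer(), BernoulliNB())
sarcasm_pipeline.fit(train_headlines, train_labels)

def predict_sarcasm(text):
    """Return the predicted label for a single raw headline."""
    return sarcasm_pipeline.predict([text])[0]

prediction = predict_sarcasm("area man shocked that monday arrived")
print(prediction)
```

With the real dataset, the same pipeline would be fit on the raw training headlines (splitting `x` and `y` before vectorizing) and scored on the held-out headlines.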


**So this is how you can use machine learning to detect sarcasm with Python.**