
## **Task#01:   Fake News Detection**
####            (Unmasking the Truth with Machine Learning)

### **Author: Abdur Rahman**

### **Introduction**
In an age of information overload, the spread of fake news is a significant challenge for reliable journalism and public discourse. This notebook approaches the problem with machine learning: a fake news detection model built on a Support Vector Machine (SVM).

We explore how well an SVM, a widely used classification algorithm, separates fake from real news articles. An SVM looks for the decision boundary that maximizes the margin between the two classes; with a linear kernel, that boundary is a hyperplane in feature space. Trained on the labeled text of news articles, the model learns to distinguish genuine reporting from deceptive narratives.

The text is first converted into numerical features with the Term Frequency-Inverse Document Frequency (TF-IDF) scheme. TF-IDF weights each word by how often it appears in an article, discounted by how many articles in the corpus contain it, so words that are frequent in one article but rare across the corpus receive the highest weights. These weights are the features the SVM uses to tell fake from real news.

The notebook walks through each step: loading and labeling the dataset, splitting it into training and testing sets, vectorizing the text, training the SVM model, and evaluating its performance. We assess how well the model distinguishes fake from real news using accuracy, precision, recall, and F1-score.
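
The vectorizing and training steps can also be bundled into a single scikit-learn `Pipeline`, so that raw text goes in and a label comes out. The sketch below is an optional variation, not the structure the notebook itself uses, and the four toy sentences and the names `toy_texts`, `toy_labels`, and `toy_pipeline` are assumptions for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical toy data; the real notebook uses the 'text' and 'label' columns.
toy_texts = [
    "shocking secret they do not want you to know",
    "you will not believe this shocking video",
    "officials said on tuesday the budget was approved",
    "the ministry said the talks would resume on monday",
]
toy_labels = ["fake", "fake", "true", "true"]

# The pipeline fits the vectorizer and then the classifier in one call.
toy_pipeline = make_pipeline(TfidfVectorizer(), SVC(kernel="linear"))
toy_pipeline.fit(toy_texts, toy_labels)

# predict() accepts raw strings; vectorization happens inside the pipeline.
toy_prediction = toy_pipeline.predict(["officials said the budget talks resume"])
print(toy_prediction)
```

Keeping both steps in one object makes it harder to accidentally vectorize new text with a different vocabulary than the one the model was trained on.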

By applying SVM and related machine learning techniques, we aim to contribute to reliable tools for fake news detection. Better identification of deceptive information helps individuals and communities make informed decisions and supports a more trustworthy information ecosystem.



##### Import Libraries

In [5]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report

##### 1- Load and preprocess the datasets
Load the 'Fake.csv' and 'True.csv' datasets, add a label to each, and combine them into a single DataFrame.


In [6]:
fake_data = pd.read_csv('Fake.csv')
true_data = pd.read_csv('True.csv')

In [7]:
# Add a label column to each dataset
fake_data['label'] = 'fake'
true_data['label'] = 'true'

In [8]:
# Combine the datasets
data = pd.concat([fake_data, true_data], ignore_index=True)

In [9]:
data

| | title | text | subject | date | label |
|---|---|---|---|---|---|
| 0 | Donald Trump Sends Out Embarrassing New Year’... | Donald Trump just couldn t wish all Americans ... | News | December 31, 2017 | fake |
| 1 | Drunk Bragging Trump Staffer Started Russian ... | House Intelligence Committee Chairman Devin Nu... | News | December 31, 2017 | fake |
| 2 | Sheriff David Clarke Becomes An Internet Joke... | On Friday, it was revealed that former Milwauk... | News | December 30, 2017 | fake |
| 3 | Trump Is So Obsessed He Even Has Obama’s Name... | On Christmas day, Donald Trump announced that ... | News | December 29, 2017 | fake |
| 4 | Pope Francis Just Called Out Donald Trump Dur... | Pope Francis used his annual Christmas Day mes... | News | December 25, 2017 | fake |
| ... | ... | ... | ... | ... | ... |
| 44893 | 'Fully committed' NATO backs new U.S. approach... | BRUSSELS (Reuters) - NATO allies on Tuesday we... | worldnews | August 22, 2017 | true |
| 44894 | LexisNexis withdrew two products from Chinese ... | LONDON (Reuters) - LexisNexis, a provider of l... | worldnews | August 22, 2017 | true |
| 44895 | Minsk cultural hub becomes haven from authorities | MINSK (Reuters) - In the shadow of disused Sov... | worldnews | August 22, 2017 | true |
| 44896 | Vatican upbeat on possibility of Pope Francis ... | MOSCOW (Reuters) - Vatican Secretary of State ... | worldnews | August 22, 2017 | true |
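
Before splitting, it can help to check how many rows carry each label and whether any article text is empty. The following is a minimal sketch on made-up stand-in frames; `toy_fake`, `toy_true`, and `toy_data` are hypothetical names, and the same two checks could be run on the notebook's `data` DataFrame. Whether the real dataset contains empty texts is not something this sketch shows.

```python
import pandas as pd

# Hypothetical stand-ins for the frames loaded from Fake.csv and True.csv.
toy_fake = pd.DataFrame({"text": ["made-up story one", "made-up story two", ""]})
toy_true = pd.DataFrame({"text": ["wire report one", "wire report two"]})
toy_fake["label"] = "fake"
toy_true["label"] = "true"
toy_data = pd.concat([toy_fake, toy_true], ignore_index=True)

# Count the rows per label to see how balanced the classes are.
label_counts = toy_data["label"].value_counts()

# Flag rows whose text is missing or only whitespace; they give
# the vectorizer nothing to work with.
empty_text = toy_data["text"].fillna("").str.strip().eq("")

print(label_counts)
print("rows with empty text:", int(empty_text.sum()))
```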


##### 2- Split the data into training and testing sets
Split the combined data into training and testing sets, with 80% for training and 20% for testing.

In [10]:
X = data['text']
y = data['label']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
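
The split above is random, so the share of each label in the test set can drift slightly from the share in the full data. As an optional variation (not what the notebook does), `train_test_split` accepts a `stratify` argument that keeps the label proportions the same in both parts. The toy lists below are assumptions for illustration.

```python
from collections import Counter

from sklearn.model_selection import train_test_split

# Hypothetical toy data: 60 "fake" and 40 "true" labels.
toy_texts = [f"article {i}" for i in range(100)]
toy_labels = ["fake"] * 60 + ["true"] * 40

# stratify=toy_labels preserves the 60/40 ratio in both subsets.
X_tr, X_te, y_tr, y_te = train_test_split(
    toy_texts, toy_labels, test_size=0.2, random_state=42, stratify=toy_labels
)

print("train:", Counter(y_tr))
print("test: ", Counter(y_te))
```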

##### 3- Vectorize the text data
Convert the text data into numerical vectors using TF-IDF vectorization.

In [11]:
vectorizer = TfidfVectorizer()
X_train_vectorized = vectorizer.fit_transform(X_train)
X_test_vectorized = vectorizer.transform(X_test)

##### 4- Train the SVM model
Create an SVM model with a linear kernel and train it using the vectorized training data and corresponding labels.

In [12]:
svm_model = SVC(kernel='linear')
svm_model.fit(X_train_vectorized, y_train)
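
`SVC(kernel='linear')` can be slow on tens of thousands of documents, because its training time grows at least quadratically with the number of samples. One commonly used alternative is scikit-learn's `LinearSVC`, sketched below on made-up sentences. This is a swapped-in technique, not the model the notebook trains: `LinearSVC` uses a different solver and, by default, a squared hinge loss, so its results are not guaranteed to match `SVC(kernel='linear')` exactly. The names `toy_texts`, `toy_labels`, `toy_vec`, and `toy_model` are assumptions for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

# Hypothetical toy data; the real notebook uses X_train_vectorized and y_train.
toy_texts = [
    "shocking secret they do not want you to know",
    "you will not believe this shocking video",
    "officials said on tuesday the budget was approved",
    "the ministry said the talks would resume on monday",
]
toy_labels = ["fake", "fake", "true", "true"]

toy_vec = TfidfVectorizer()
toy_features = toy_vec.fit_transform(toy_texts)

# LinearSVC learns one weight per vocabulary term for the two-class problem.
toy_model = LinearSVC()
toy_model.fit(toy_features, toy_labels)

print(toy_model.classes_)
print(toy_model.coef_.shape)
print(toy_model.predict(toy_vec.transform(["shocking video they do not want you to see"])))
```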


##### 5- Make predictions and evaluate the model
Use the trained model to make predictions on the vectorized testing data, calculate the accuracy, and generate a classification report.

In [13]:
y_pred = svm_model.predict(X_test_vectorized)

Model Evaluation

In [14]:
accuracy = accuracy_score(y_test, y_pred)
report = classification_report(y_test, y_pred)

In [15]:
print("Accuracy:", accuracy)
print("Classification Report:\n", report)

Accuracy: 0.9948775055679288
Classification Report:
               precision    recall  f1-score   support

        fake       1.00      0.99      1.00      4733
        true       0.99      1.00      0.99      4247

    accuracy                           0.99      8980
   macro avg       0.99      0.99      0.99      8980
weighted avg       0.99      0.99      0.99      8980
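
The precision and recall figures in a classification report are derived from counts of correct and incorrect predictions per class, and a confusion matrix shows those counts directly. The sketch below uses six made-up labels rather than the notebook's `y_test` and `y_pred`; the names `toy_true_labels`, `toy_pred_labels`, and `matrix` are hypothetical.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical true and predicted labels, used only for illustration.
toy_true_labels = ["fake", "fake", "fake", "true", "true", "true"]
toy_pred_labels = ["fake", "fake", "true", "true", "true", "fake"]

# Rows are the true labels and columns the predicted labels,
# both in the order given by `labels`.
labels = ["fake", "true"]
matrix = confusion_matrix(toy_true_labels, toy_pred_labels, labels=labels)

print(matrix)

# Recall for "fake" = correctly predicted fake / all truly fake rows.
fake_recall = matrix[0, 0] / matrix[0].sum()
print("fake recall:", round(float(fake_recall), 2))
```

Running the same call on `y_test` and `y_pred` would show how many fake articles were labeled true and vice versa, which a single accuracy number does not reveal.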



# Thanks!