# End-to-End Fake News Detection with Python
This notebook demonstrates how to build an end-to-end fake news detection system using Python and deep learning.
<img src="https://tablogix.ru/files/23munua.jpg" />

## Step 1: Install Required Libraries
Install Streamlit, pandas, NumPy, and TensorFlow with the following command:
```bash
pip install streamlit pandas numpy tensorflow
```


## Step 2: Prepare the Dataset
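This notebook uses `fake_or_real_news.csv`, which has `title`, `text`, and `label` (FAKE or REAL) columns. Similar datasets for fake news detection are available on platforms like Kaggle.
Load the dataset with pandas: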


In [1]:
import pandas as pd
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Embedding, LSTM
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

In [2]:
df = pd.read_csv('fake_or_real_news.csv')
df.head()

| | Unnamed: 0 | title | text | label |
|---|---|---|---|---|
| 0 | 8476 | You Can Smell Hillary’s Fear | Daniel Greenfield, a Shillman Journalism Fello... | FAKE |
| 1 | 10294 | Watch The Exact Moment Paul Ryan Committed Pol... | Google Pinterest Digg Linkedin Reddit Stumbleu... | FAKE |
| 2 | 3608 | Kerry to go to Paris in gesture of sympathy | U.S. Secretary of State John F. Kerry said Mon... | REAL |
| 3 | 10142 | Bernie supporters on Twitter erupt in anger ag... | — Kaydee King (@KaydeeKing) November 9, 2016 T... | FAKE |
| 4 | 875 | The Battle of New York: Why This Primary Matters | It's primary day in New York and front-runners... | REAL |


## Exploratory Data Analysis (EDA)

In [3]:
df.tail()

| | Unnamed: 0 | title | text | label |
|---|---|---|---|---|
| 6330 | 4490 | State Department says it can't find emails fro... | The State Department told the Republican Natio... | REAL |
| 6331 | 8062 | The ‘P’ in PBS Should Stand for ‘Plutocratic’ ... | The ‘P’ in PBS Should Stand for ‘Plutocratic’ ... | FAKE |
| 6332 | 8622 | Anti-Trump Protesters Are Tools of the Oligarc... | Anti-Trump Protesters Are Tools of the Oligar... | FAKE |
| 6333 | 4021 | In Ethiopia, Obama seeks progress on peace, se... | ADDIS ABABA, Ethiopia —President Obama convene... | REAL |
| 6334 | 4330 | Jeb Bush Is Suddenly Attacking Trump. Here's W... | Jeb Bush Is Suddenly Attacking Trump. Here's W... | REAL |


In [4]:
df.shape

(6335, 4)

In [5]:
df.info()  # call the method; a bare `df.info` only echoes the bound method and a dump of the frame

In [6]:
# Drop rows with missing values
df.dropna(subset=['text', 'label'], inplace=True)


In [7]:
# Encode labels: FAKE -> 0, REAL -> 1
df['label'] = df['label'].map({'FAKE': 0, 'REAL': 1})
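
One caveat with `map`: any label that is not exactly `FAKE` or `REAL` silently becomes `NaN`, and the later `astype(int)` call then fails. The sketch below is one way to guard against that. The helper name `encode_labels` and the sample values are hypothetical and are not part of the original notebook.

```python
import pandas as pd

def encode_labels(labels: pd.Series) -> pd.Series:
    """Map FAKE/REAL strings to 0/1 and fail loudly on anything else."""
    # Normalise whitespace and case so ' real ' and 'REAL' are treated alike
    cleaned = labels.str.strip().str.upper()
    encoded = cleaned.map({'FAKE': 0, 'REAL': 1})
    if encoded.isna().any():
        bad = cleaned[encoded.isna()].unique().tolist()
        raise ValueError(f'Unexpected label values: {bad}')
    return encoded.astype(int)

# Hypothetical sample, standing in for df['label']
sample = pd.Series(['FAKE', 'REAL', ' real ', 'fake'])
encoded_sample = encode_labels(sample)
print(encoded_sample.tolist())        # [0, 1, 1, 0]
print(encoded_sample.value_counts())  # class balance check
```

Checking `value_counts()` at this point also shows whether the two classes are roughly balanced, which affects how much the accuracy figure can be trusted.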


In [8]:
# Tokenize the text data
tokenizer = Tokenizer(num_words=10000)
tokenizer.fit_on_texts(df['text'])
X = tokenizer.texts_to_sequences(df['text'])
X = pad_sequences(X, padding='post', maxlen=100)  # Adjust maxlen as needed
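
It helps to see what `padding='post'` with `maxlen=100` does to each article. As far as the Keras defaults go, `pad_sequences` truncates from the front (`truncating='pre'`), so a long article keeps its last 100 tokens, while a short one gets zeros appended. The pure-Python function below is only an illustration of that behaviour for a single sequence. The name `pad_post_truncate_pre` is hypothetical.

```python
from typing import List

def pad_post_truncate_pre(seq: List[int], maxlen: int, value: int = 0) -> List[int]:
    """Mimic pad_sequences(padding='post', truncating='pre') for one sequence."""
    if maxlen <= 0:
        raise ValueError('maxlen must be positive')
    trimmed = seq[-maxlen:]  # 'pre' truncation keeps the last maxlen tokens
    return trimmed + [value] * (maxlen - len(trimmed))  # 'post' padding appends zeros

short = [5, 12, 7]
long = [1, 2, 3, 4, 5, 6, 7]
print(pad_post_truncate_pre(short, maxlen=5))  # [5, 12, 7, 0, 0]
print(pad_post_truncate_pre(long, maxlen=5))   # [3, 4, 5, 6, 7]
```

If the opening of an article matters more than its ending, passing `truncating='post'` to `pad_sequences` keeps the first tokens instead.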


In [10]:
import pickle

# Convert labels to numpy array
y = np.array(df['label'].astype(int))  # Ensure y is an integer array

# Save the tokenizer
with open('tokenizer.pickle', 'wb') as handle:
    pickle.dump(tokenizer, handle, protocol=pickle.HIGHEST_PROTOCOL)

## Step 3: Building the Deep Learning Model
Here we will use TensorFlow to create a simple neural network model.
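
The cell that defines `model` is not shown in this notebook. The following is a minimal sketch of what such a model could look like, based only on the layers imported earlier (`Embedding`, `LSTM`, `Dense`), the tokenizer's vocabulary size of 10000, and the padded length of 100. The layer sizes, the optimizer, and the helper name `build_model` are assumptions, not the original architecture.

```python
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Embedding, LSTM

VOCAB_SIZE = 10000  # matches Tokenizer(num_words=10000)
MAX_LEN = 100       # matches pad_sequences(maxlen=100)

def build_model(embedding_dim: int = 64, lstm_units: int = 64) -> Sequential:
    """Binary text classifier: embedding -> LSTM -> sigmoid output."""
    model = Sequential([
        Input(shape=(MAX_LEN,)),
        Embedding(input_dim=VOCAB_SIZE, output_dim=embedding_dim),
        LSTM(lstm_units),
        # One sigmoid unit: outputs near 0 mean FAKE, near 1 mean REAL
        Dense(1, activation='sigmoid'),
    ])
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model

model = build_model()
model.summary()
```

A single sigmoid output fits the label encoding (FAKE as 0, REAL as 1) and the 0.5 threshold applied in the Streamlit app.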


## Step 4: Training the Model
Train your model using the dataset.


In [51]:
model.fit(X, y, epochs=5, batch_size=32)

Epoch 1/5
198/198 ━━━━━━━━━━━━━━━━━━━━ 12s 54ms/step - accuracy: 0.6333 - loss: 0.6139
Epoch 2/5
198/198 ━━━━━━━━━━━━━━━━━━━━ 11s 55ms/step - accuracy: 0.8795 - loss: 0.2715
Epoch 3/5
198/198 ━━━━━━━━━━━━━━━━━━━━ 11s 55ms/step - accuracy: 0.9217 - loss: 0.1519
Epoch 4/5
198/198 ━━━━━━━━━━━━━━━━━━━━ 11s 55ms/step - accuracy: 0.9440 - loss: 0.1036
Epoch 5/5
198/198 ━━━━━━━━━━━━━━━━━━━━ 11s 54ms/step - accuracy: 0.9474 - loss: 0.0932


<keras.src.callbacks.history.History at 0x1f2fbe63990>

In [57]:
model.save('fake_news_model.h5')
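
Note that `model.fit(X, y, ...)` trains on every row, so the accuracy it reports is training accuracy and says little about how the model handles unseen articles. One common remedy is to hold out part of the data. The sketch below does this with NumPy only. The helper name `holdout_split`, the 20% test fraction, the seed, and the stand-in arrays are all assumptions for illustration.

```python
import numpy as np

def holdout_split(X, y, test_fraction=0.2, seed=42):
    """Shuffle row indices once and split features and labels consistently."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_test = int(round(len(X) * test_fraction))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# Stand-in arrays shaped like the notebook's X and y (hypothetical data)
X_demo = np.arange(50).reshape(10, 5)
y_demo = np.array([0, 1] * 5)
X_train, X_test, y_train, y_test = holdout_split(X_demo, y_demo)
print(X_train.shape, X_test.shape)  # (8, 5) (2, 5)
```

With the real arrays, the split would be applied before training, for example `model.fit(X_train, y_train, epochs=5, batch_size=32, validation_data=(X_test, y_test))`, so that each epoch also reports accuracy on the held-out rows.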



## Step 5: Creating the Streamlit Application
Create a file named `app.py` for the Streamlit application:


In [53]:
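import pickle
import streamlit as st
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.sequence import pad_sequences

model = load_model('fake_news_model.h5')  # model saved after training
with open('tokenizer.pickle', 'rb') as handle:
    tokenizer = pickle.load(handle)  # tokenizer saved in Step 2
st.title('Fake News Detection')
user_input = st.text_input('Enter a news statement to check')
if user_input:
    seq = tokenizer.texts_to_sequences([user_input])  # same preprocessing as training
    prediction = model.predict(pad_sequences(seq, padding='post', maxlen=100))
    st.write('Fake' if prediction[0][0] < 0.5 else 'Real')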


Running this code in a notebook cell only logs a Streamlit warning ("Session state does not function when running a script without `streamlit run`"), so save it to `app.py` and launch it as described in Step 6.


## Step 6: Running the Application
Run the Streamlit app using:
```bash
streamlit run app.py
```
This command will start the application and open it in your web browser.