Contributors:
- Kong Sheng How
- Tan Li Mei
- Dr. Gan Keng Hoon
- Dr. Nur Hana Samsudin
Fake News Detection with NLP Using a Simple TensorFlow Keras Neural Network Model
Fake News Datasets can be obtained from:
- https://www.kaggle.com/jruvika/fake-news-detection
- https://www.kaggle.com/nopdev/real-and-fake-news-dataset
- Data Cleaning
  - Remove empty rows
  - Remove problematic rows (in some data points, newline characters ("\n") inside the text split one record across several rows)
  - Remove URL or ID columns
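The three cleaning steps above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the project's actual code. The column names (`URLs`, `Headline`, `Body`, `Label`), the sample rows, and the rule that a valid label is `"0"` or `"1"` are all assumptions made for the example.

```python
import csv
import io

# Hypothetical raw CSV: column names and rows are made up for illustration.
RAW = (
    "URLs,Headline,Body,Label\n"
    "http://a.example/1,Real headline,Some real body text,1\n"
    ",,,\n"
    "http://a.example/2,Fake headline,Body that was cut by a stray\n"
    "newline and spilled into its own row\n"
    "http://a.example/3,Another fake,Some fake body text,0\n"
)


def clean_rows(raw_csv, drop_columns=("URLs",), label_column="Label"):
    """Return the cleaned rows as a list of dicts."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        # Step 1: remove empty rows (every field missing or blank).
        if not any(isinstance(v, str) and v.strip() for v in row.values()):
            continue
        # Step 2: remove problematic rows. A record broken by a newline
        # ends up without a valid label (assumed here to be "0" or "1").
        if row.get(label_column) not in ("0", "1"):
            continue
        # Step 3: remove URL or ID columns (and any overflow fields,
        # which csv.DictReader stores under the key None).
        cleaned.append(
            {k: v for k, v in row.items() if k is not None and k not in drop_columns}
        )
    return cleaned
```

Calling `clean_rows(RAW)` keeps only the two intact records and drops the `URLs` column from each.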
- Data Exploration
  - Dataset Ratio (Real News vs Fake News)
  - Word Cloud
  - Word Count for each labelled class (real and fake)
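A rough sketch of the dataset ratio and per-label word count is shown below. The sample texts and the label convention (1 = real, 0 = fake) are assumptions for illustration. The per-label word frequencies are also the kind of input a word cloud is typically drawn from; the drawing itself is left out here.

```python
from collections import Counter

# Hypothetical labelled samples; the convention 1 = real, 0 = fake is assumed.
samples = [
    ("officials confirm new budget plan", 1),
    ("senate passes budget bill", 1),
    ("aliens secretly run the senate", 0),
]


def label_ratio(samples):
    """Return the share of each label in the dataset."""
    counts = Counter(label for _, label in samples)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def word_counts_by_label(samples):
    """Return one word-frequency Counter per label."""
    counts = {}
    for text, label in samples:
        counts.setdefault(label, Counter()).update(text.split())
    return counts
```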
- Text Preprocessing
  - Regular Expression (Remove punctuation and special characters)
  - Tokenization (Split texts into individual words)
  - Lemmatization (Reduce words to their dictionary base form, the lemma)
  - Stop words removal (Remove stop words from the texts)
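The four preprocessing steps can be chained as in the sketch below. To stay self-contained it uses a tiny hand-written stop word set and lemma lookup table, both of which are assumptions for illustration. A real pipeline would more likely rely on a library resource, for example NLTK's `WordNetLemmatizer` and its English stop word list.

```python
import re

# Toy resources for illustration only; both are assumptions, not the
# lists used by the project.
STOP_WORDS = {"the", "a", "an", "is", "are", "was", "were", "of", "to", "in", "and"}
LEMMAS = {"stories": "story", "running": "run", "claims": "claim"}


def preprocess(text):
    """Clean one text and return its list of processed tokens."""
    # 1. Regular expression: lowercase, then replace everything that is
    #    not a letter or whitespace (punctuation, digits, symbols).
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    # 2. Tokenization: split on whitespace into individual words.
    tokens = text.split()
    # 3. Lemmatization: map each word to its base form when known.
    tokens = [LEMMAS.get(token, token) for token in tokens]
    # 4. Stop words removal.
    return [token for token in tokens if token not in STOP_WORDS]
```

For example, `preprocess("The stories were RUNNING, in 2020!")` returns `["story", "run"]`.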
- Vectorization
  - N-Gram Vectors (TF-IDF Encoding)
  - Sequence Vectors (One-Hot Encoding)
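To make the difference between the two encodings concrete, here is a small pure-Python sketch. It is an illustration under stated assumptions, not the project's implementation: the TF-IDF weight used is the plain `tf * log(N / df)` form (libraries such as scikit-learn apply smoothing and normalization on top), n-grams are limited to unigrams and bigrams, and index 0 is reserved for padding in the sequence vectors.

```python
import math
from collections import Counter


def ngrams(tokens, n_max=2):
    """Return all 1..n_max-grams of a token list as space-joined strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf_vectors(docs, n_max=2):
    """N-gram vectors: one TF-IDF weight per vocabulary n-gram, order ignored."""
    grams = [ngrams(doc, n_max) for doc in docs]
    vocab = sorted({g for doc in grams for g in doc})
    doc_freq = Counter(g for doc in grams for g in set(doc))
    vectors = []
    for doc in grams:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid dividing by zero for empty documents
        vectors.append(
            [(counts[g] / total) * math.log(len(docs) / doc_freq[g]) for g in vocab]
        )
    return vocab, vectors


def one_hot_sequences(docs, max_len):
    """Sequence vectors: one one-hot row per token position, order preserved."""
    tokens = sorted({t for doc in docs for t in doc})
    vocab = {tok: i for i, tok in enumerate(tokens, start=1)}  # 0 = padding
    size = len(vocab) + 1
    sequences = []
    for doc in docs:
        ids = [vocab[t] for t in doc][:max_len]   # truncate long texts
        ids += [0] * (max_len - len(ids))         # pad short texts
        sequences.append([[1 if j == i else 0 for j in range(size)] for i in ids])
    return vocab, sequences
```

The n-gram vectors have one entry per vocabulary item, while the sequence vectors have `max_len` rows of vocabulary size each, so they would need to be flattened before being fed to a dense layer.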
- TensorFlow Keras Neural Network Model (2 similar models created: 1 for N-Gram Vectors, 1 for Sequence Vectors)
  - Layer 1 (Input Dense): 64 nodes, ReLU activation
  - Layer 2 (Dropout): 20% dropout
  - Layer 3 (Dense): 16 nodes, ReLU activation
  - Layer 4 (Dropout): 20% dropout
  - Layer 5 (Dense): 8 nodes, ReLU activation
  - Layer 6 (Dense): 2 nodes, Sigmoid activation
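The six layers listed above could be expressed with the Keras `Sequential` API roughly as follows. The layer sizes, dropout rates, and activations come from the list; the function name `build_model`, the `input_dim` parameter, the Adam optimizer, and the binary cross-entropy loss are assumptions, since the outline does not state how the model is compiled.

```python
import tensorflow as tf


def build_model(input_dim):
    """Build the 6-layer network for flat input vectors of length input_dim."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(input_dim,)),
        tf.keras.layers.Dense(64, activation="relu"),    # Layer 1
        tf.keras.layers.Dropout(0.2),                    # Layer 2
        tf.keras.layers.Dense(16, activation="relu"),    # Layer 3
        tf.keras.layers.Dropout(0.2),                    # Layer 4
        tf.keras.layers.Dense(8, activation="relu"),     # Layer 5
        tf.keras.layers.Dense(2, activation="sigmoid"),  # Layer 6
    ])
    # Optimizer and loss are assumptions; the 2-node sigmoid output is
    # paired here with one-hot encoded labels and binary cross-entropy.
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model
```

Since the two models are described as similar, the same function could be called twice, once with the length of the TF-IDF vectors and once with the length of the flattened sequence vectors.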
- Comparison between the 2 TensorFlow Keras Neural Network Models (Using 3 different metrics)
  - Accuracy
  - Recall
  - Computational Time
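The three comparison metrics can be computed as in the sketch below, written in plain Python for clarity. Treating label `1` as the positive class for recall and measuring computational time as wall-clock time around a single call are assumptions; the helper name `timed` is hypothetical.

```python
import time


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of truly positive samples that were predicted positive."""
    true_pos = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    false_neg = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    actual_pos = true_pos + false_neg
    return true_pos / actual_pos if actual_pos else 0.0


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

With `y_true = [1, 1, 0, 0, 1]` and `y_pred = [1, 0, 0, 1, 1]`, accuracy is 0.6 and recall is 2/3. Wrapping each model's training or prediction call in `timed` gives a like-for-like timing as long as both runs use the same hardware and data.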