- Repository for SC1015 Intro to DSAI AY21/22 Sem 2 project
- Application of the Machine Learning Pipeline, in this order:
  - Data Extraction & Preparation
  - Exploratory Data Analysis
  - Machine Learning Models
  - Evaluation and Insights
- Kane Tan (Data Extraction & Preparation)
- Javier Tan (Data Visualisation & Analysis)
- Lee Seung Soo (Machine Learning)
Predicting fake news from the news title using Natural Language Processing (NLP)
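To illustrate how a news title can become input for a classifier, here is a minimal bag-of-words sketch using only the Python standard library. It is an assumption for illustration: this README does not state which text features the project uses, the helper names `tokenize` and `bag_of_words` are hypothetical, and the example titles are made up rather than taken from the dataset.

```python
import re
from collections import Counter


def tokenize(title):
    """Lowercase a title and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", title.lower())


def bag_of_words(titles):
    """Return (vocabulary, vectors): one word-count vector per title."""
    tokenized = [tokenize(t) for t in titles]
    # Sorted vocabulary gives every word a stable column position.
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    index = {tok: i for i, tok in enumerate(vocabulary)}
    vectors = []
    for toks in tokenized:
        row = [0] * len(vocabulary)
        for tok, count in Counter(toks).items():
            row[index[tok]] = count
        vectors.append(row)
    return vocabulary, vectors


# Made-up example titles, not taken from the dataset.
titles = [
    "Senate passes budget bill",
    "SHOCKING: You won't believe this bill",
]
vocabulary, vectors = bag_of_words(titles)
```

Each row of `vectors` is a fixed-length numeric representation of one title, which is the kind of input the classifiers listed below can be trained on.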
Ensemble model consisting of 5 sub-models
- Logistic Regression
- Naive Bayes Classifier
- Binary Tree Classifier
- Passive Aggressive Classifier
- Support Vector Machine
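One common way to combine five sub-models is hard (majority) voting, sketched below. This is an assumption: the README does not say how the ensemble merges its sub-models' outputs. The function names and the prediction values are hypothetical, with `1` standing for fake and `0` for real.

```python
from collections import Counter


def majority_vote(votes):
    """Return the label predicted by the most sub-models for one sample."""
    label, _ = Counter(votes).most_common(1)[0]
    return label


def ensemble_predict(model_predictions):
    """Combine per-model prediction lists (one list per model) sample by sample."""
    return [majority_vote(votes) for votes in zip(*model_predictions)]


# Hypothetical predictions for four titles (1 = fake, 0 = real).
model_predictions = [
    [1, 0, 1, 0],  # Logistic Regression
    [1, 0, 0, 0],  # Naive Bayes Classifier
    [0, 0, 1, 1],  # Binary Tree Classifier
    [1, 1, 1, 0],  # Passive Aggressive Classifier
    [1, 0, 0, 0],  # Support Vector Machine
]
final = ensemble_predict(model_predictions)
```

With an odd number of sub-models and two classes, a majority vote can never tie. scikit-learn provides the same idea as `VotingClassifier` with `voting="hard"`.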
- NLP is an effective way to classify news by its title
- The ensemble model achieves a high prediction accuracy of 83.5%
- Useful as a warning system for users exposed to fake news online
- Cross-checking the content of the news is still required
- https://www.kaggle.com/datasets/clmentbisaillon/fake-and-real-news-dataset
- https://www.datacamp.com/community/tutorials/understanding-logistic-regression-python
- https://www.datacamp.com/community/tutorials/svm-classification-scikit-learn-python
- https://www.datacamp.com/community/tutorials/ensemble-learning-python
- https://arxiv.org/pdf/2102.04458#:~:text=Machine%20learning%20classifiers%20are%20using,can%20automatically%20detect%20fake%20news.