Project for my graduate-level ML class (COMP 551). The paper is in "writeup.pdf" (the last file above).
We analyzed AdaBoost, linear SVC, radial basis function SVM, random forests, decision trees, and BERT on two NLP text-classification datasets: IMDb (binary sentiment) and 20 Newsgroups (multi-class topic).
Many machine learning algorithms have been developed in recent decades. In this paper we explore the performance of some of the most common models on binary and multi-class text-classification problems. These models include AdaBoost, linear SVC, logistic regression, radial basis function SVM, random forest, decision tree, and BERT. Our results show the effects on model accuracy of regularization, resampling methods such as bagging (bootstrap aggregating) and 5-fold cross-validation, and boosting. We also examine how these strategies shift the bias-variance tradeoff in order to determine the best configuration for each algorithm and dataset. Our highest test accuracies were achieved with BERT: 72.40% on 20 Newsgroups and 94.15% on IMDb.
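As a rough illustration of the classical (non-BERT) pipeline described above, here is a minimal sketch using scikit-learn: TF-IDF features feeding a linear SVC, scored with 5-fold cross-validation. The toy corpus, the `C=1.0` regularization setting, and the pipeline details are illustrative assumptions, not the paper's actual experimental setup.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy stand-in corpus (the real experiments used IMDb and 20 Newsgroups).
docs = (["great movie, loved the acting"] * 5
        + ["wonderful film, a joy to watch"] * 5
        + ["terrible plot, waste of time"] * 5
        + ["awful movie, very boring"] * 5)
labels = [1] * 10 + [0] * 10

# TF-IDF features + linear SVC; C is the inverse regularization strength,
# the knob whose effect on accuracy the paper studies.
pipe = make_pipeline(TfidfVectorizer(), LinearSVC(C=1.0))

# 5-fold (stratified) cross-validation accuracy.
scores = cross_val_score(pipe, docs, labels, cv=5)
print(scores)
```

Swapping `LinearSVC` for `AdaBoostClassifier`, `RandomForestClassifier`, or `DecisionTreeClassifier` reuses the same pipeline, which is how the models in the paper can be compared under identical preprocessing.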