This repo contains the training, validation, and experimental code used in our submission to SemEval 2019 Task 4: Hyperpartisan News Detection. Our model placed 10th internationally with an accuracy of 77.1%, according to the leaderboard available here.
Team Clint Buchanan:
- Mehdi Drissi
- Pedro Sandoval Segura
- Vivaswat Ojha
Code for text preprocessing was provided to us by Professor Julie Medero as part of a course in Natural Language Processing. She was instrumental in helping us submit our model to the competition and write our workshop paper. This code can be found in extract_articles.py, extract_features.py, predict.py, and preprocess.py.
We trained Bidirectional Encoder Representations from Transformers (BERT) models (Devlin et al., 2018) based on the implementation from pytorch-pretrained-BERT. Please refer to that repository for required dependencies.
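For reference, below is a minimal sketch of how a BERT sequence classifier can be loaded and run with pytorch-pretrained-BERT. The model variant (`bert-base-uncased`), the two-label setup, and the example text are illustrative assumptions and may not match our exact training configuration.

```python
# Illustrative sketch using the pytorch-pretrained-BERT API; our actual
# training setup (model variant, label count, preprocessing) may differ.
import torch
from pytorch_pretrained_bert import BertTokenizer, BertForSequenceClassification

# Load the pretrained tokenizer and a BERT model with a classification head.
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)
model.eval()

# Tokenize an example article snippet and wrap it with BERT's special tokens.
text = "Example article text goes here."
tokens = ['[CLS]'] + tokenizer.tokenize(text) + ['[SEP]']
input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])

# Without labels, the model returns logits over the two classes
# (hyperpartisan vs. not hyperpartisan).
with torch.no_grad():
    logits = model(input_ids)
```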
To train our BERT model, you can use the train.sh bash script. You will need to download the articles here.