# Topic Identification of Filipino and English News using Bidirectional Long Short-Term Memory with Attention Mechanisms
This project identifies the topics of Filipino and English news articles using BiLSTMs with attention mechanisms. The English news data comes from BBC News.
Download the dataset from http://mlg.ucd.ie/files/datasets/bbc-fulltext.zip and extract it into a folder named `/data` in the local directory.
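Once extracted, the BBC corpus is laid out as one sub-folder per topic with one plain-text article per file. A minimal sketch of reading it into a DataFrame (the `data/bbc` path and `load_bbc_dataset` helper are assumptions, not part of the notebook):

```python
import os
import pandas as pd

def load_bbc_dataset(root):
    """Read the extracted BBC corpus (one sub-folder per topic,
    one plain-text article per file) into a DataFrame."""
    rows = []
    for category in sorted(os.listdir(root)):
        cat_dir = os.path.join(root, category)
        if not os.path.isdir(cat_dir):
            continue
        for fname in sorted(os.listdir(cat_dir)):
            if fname.endswith(".txt"):
                with open(os.path.join(cat_dir, fname), encoding="latin-1") as f:
                    rows.append({"category": category, "text": f.read()})
    return pd.DataFrame(rows)

# df = load_bbc_dataset("data/bbc")
# df["category"].value_counts()
```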
### Prerequisites

What you need installed to run the software:
- Keras (with the TensorFlow backend)
- TensorFlow 1.12
- pandas
- Matplotlib
- NumPy
- NLTK
Create folders named `/Graph_LSTM` and `/Graph` in the local directory so TensorBoard can save its log files there for the ordinary LSTMs and the LSTMs with attention, respectively.
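The two log folders can be created and wired up to Keras as sketched below; the callback variable names are illustrative, not taken from the notebook:

```python
import os

# Create the TensorBoard log directories expected by the notebook.
for log_dir in ("Graph_LSTM", "Graph"):
    os.makedirs(log_dir, exist_ok=True)

# The training code would then point a TensorBoard callback at each
# folder, e.g. (assumed usage):
# from keras.callbacks import TensorBoard
# tb_plain = TensorBoard(log_dir="./Graph_LSTM")    # ordinary LSTM
# tb_attention = TensorBoard(log_dir="./Graph")     # LSTM with attention
# model.fit(x, y, callbacks=[tb_plain])
```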
For the English word embeddings, download the GloVe word embeddings here and extract the file named `glove.6B.300d.txt` into the local directory.
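A GloVe text file holds one token per line followed by its vector components. A minimal loader sketch (the `load_glove` name is an assumption):

```python
import numpy as np

def load_glove(path):
    """Parse a GloVe text file (one token followed by its vector
    components per line) into a {word: np.ndarray} dictionary."""
    embeddings = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            embeddings[parts[0]] = np.asarray(parts[1:], dtype="float32")
    return embeddings

# glove = load_glove("glove.6B.300d.txt")  # 300-dimensional vectors
```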
For the Tagalog word embeddings, download the FastText word embeddings here, extract the file into the local directory, and rename it to `fasttext_tagalog.vec`.
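A FastText `.vec` file is almost the same text format as GloVe, except that its first line is a header giving the vocabulary size and vector dimension, which must be skipped. A loader sketch (the `load_fasttext_vec` name is an assumption):

```python
import numpy as np

def load_fasttext_vec(path):
    """Parse a FastText .vec file; unlike GloVe, the first line is a
    header holding the vocabulary size and the vector dimension."""
    embeddings = {}
    with open(path, encoding="utf-8") as f:
        n_words, dim = map(int, f.readline().split())
        for line in f:
            parts = line.rstrip().split(" ")
            embeddings[parts[0]] = np.asarray(parts[1:], dtype="float32")
    return embeddings

# tagalog = load_fasttext_vec("fasttext_tagalog.vec")
```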
Open the `.ipynb` file in Jupyter Notebook, run it, and you're good to go.
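The attention step applied over the BiLSTM hidden states can be sketched in plain NumPy, in the additive style of [1]; the function and parameter names here are illustrative, not the notebook's own:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def additive_attention(H, W, b, v):
    """Additive attention over BiLSTM outputs.
    H: (timesteps, hidden) matrix of hidden states.
    W, b, v: learned projection parameters.
    Returns the attention weights and the weighted context vector."""
    scores = np.tanh(H @ W + b) @ v   # one score per timestep
    alpha = softmax(scores)           # attention weights, sum to 1
    context = alpha @ H               # weighted sum of hidden states
    return alpha, context

# rng = np.random.default_rng(0)
# H = rng.normal(size=(10, 64))  # 10 timesteps, 64-dim BiLSTM output
```

The context vector then feeds the final dense softmax layer that predicts the news topic.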
See also the list of [contributors](https://github.com/JstnClmnt/NLP-News-Classification/contributors) who participated in this project.
## References

[1] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. In Advances in Neural Information Processing Systems (pp. 5998-6008).

[2] Keras implementation of Attention Mechanisms

[3] Stop word removal function for the Filipino language

[4] Pennington, J., Socher, R., & Manning, C. (2014). GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 1532-1543).

[5] Bojanowski, P., Grave, E., Joulin, A., & Mikolov, T. (2017). Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics, 5, 135-146.