This repo is used to support the Medium article written here.
The article's TL;DR is as follows: Let's build an extractive summarizer using BERT or Word2Vec. Which one wins in terms of processing speed and accuracy? We compare them on public summarization datasets. Hint: bigger is not always better.
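To make the idea of extractive summarization concrete, here is a minimal sketch in pure Python. It is not the article's implementation: it assumes a common centroid-based approach, where each sentence is represented by the average of its word vectors and the sentences closest to the document centroid are kept. The names `average_vector`, `cosine`, `summarize`, `TOY_VECTORS`, and `DOC` are hypothetical, and the two-dimensional vectors are made up for illustration. In practice the vectors would come from a trained Word2Vec or BERT model.

```python
import math

def average_vector(tokens, word_vectors, dim):
    """Mean of the vectors of known tokens; zero vector if none are known."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]

def cosine(a, b):
    """Cosine similarity; 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def summarize(sentences, word_vectors, dim, k=1):
    """Return the k sentences closest to the document centroid, in original order."""
    if not sentences:
        return []
    # One vector per sentence: the average of its word vectors.
    sent_vecs = [average_vector(s.lower().split(), word_vectors, dim) for s in sentences]
    # Document centroid: the average of the sentence vectors.
    centroid = [sum(v[i] for v in sent_vecs) / len(sent_vecs) for i in range(dim)]
    # Rank sentence indices by similarity to the centroid, best first.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: cosine(sent_vecs[i], centroid),
                    reverse=True)
    # Keep the top k, restored to their original document order.
    return [sentences[i] for i in sorted(ranked[:k])]

# Hypothetical toy embeddings, for illustration only.
TOY_VECTORS = {
    "cats": [1.0, 0.0],
    "dogs": [0.9, 0.1],
    "pets": [0.8, 0.2],
    "stocks": [0.0, 1.0],
}
DOC = ["cats and dogs are pets", "stocks fell", "dogs are pets"]

if __name__ == "__main__":
    print(summarize(DOC, TOY_VECTORS, dim=2, k=1))
```

Swapping the toy dictionary for real embeddings changes only how sentence vectors are produced; the selection step stays the same, which is what makes a speed and accuracy comparison between embedding models possible.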
To run this repo, install the following dependencies:
numpy
pandas
tqdm
transformers
bert-extractive-summarizer
gensim
nltk
scikit-learn (imported as sklearn)
rouge
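Before opening the notebook, it can help to check that every dependency is importable. The snippet below is a small sketch of such a check, not part of the repo. The `REQUIRED` mapping and the `missing_packages` helper are hypothetical names. The mapping pairs each pip package name with its import name, since the two differ for bert-extractive-summarizer (imported as `summarizer`) and scikit-learn (imported as `sklearn`).

```python
from importlib.util import find_spec

# pip package name -> top-level module name used in `import`.
REQUIRED = {
    "numpy": "numpy",
    "pandas": "pandas",
    "tqdm": "tqdm",
    "transformers": "transformers",
    "bert-extractive-summarizer": "summarizer",
    "gensim": "gensim",
    "nltk": "nltk",
    "scikit-learn": "sklearn",
    "rouge": "rouge",
}

def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [pip_name for pip_name, module in required.items()
            if find_spec(module) is None]

if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages, install with: pip install " + " ".join(missing))
    else:
        print("All dependencies are importable.")
```

The check only looks modules up and does not import them, so it runs quickly even when heavy packages such as transformers are installed.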
The repo is structured as follows:
datasets: Datasets for the summarization task
process.ipynb: The notebook for the summarization task
This repo is licensed under the MIT license.