Summarization Task

This repo contains the code and data supporting the accompanying Medium article.

The article's TL;DR is as follows: Let’s build an extractive summarizer using BERT or Word2Vec. Which one wins in terms of processing speed and accuracy? We compare them on public summarization datasets. Hint: bigger is not always better.
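As a rough illustration of what "extractive summarization" and "accuracy" mean here, the sketch below scores each sentence against the document centroid, keeps the top-scoring sentences in their original order, and measures unigram overlap with a reference summary. It is not the code from process.ipynb. The word-count vectors are an assumption standing in for BERT or Word2Vec embeddings, the function names (`split_sentences`, `bow_embed`, `summarize`, `rouge1_f1`) are hypothetical, and `rouge1_f1` is a simplified ROUGE-1 style score rather than the `rouge` package.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter on '.', '!' and '?'; a real tokenizer would be sturdier."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def bow_embed(sentence):
    """Toy stand-in (assumption) for a BERT or Word2Vec sentence vector."""
    return Counter(re.findall(r"\w+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def summarize(text, n_sentences=2):
    """Keep the n sentences closest to the document centroid, in original order."""
    sentences = split_sentences(text)
    vectors = [bow_embed(s) for s in sentences]
    centroid = Counter()
    for vec in vectors:
        centroid.update(vec)  # summing counts gives an unnormalized centroid
    scores = [cosine(vec, centroid) for vec in vectors]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:n_sentences])]


def rouge1_f1(candidate, reference):
    """Simplified unigram-overlap F1 between a candidate and a reference summary."""
    cand = Counter(re.findall(r"\w+", candidate.lower()))
    ref = Counter(re.findall(r"\w+", reference.lower()))
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    doc = "Cats chase mice. Cats chase mice in the barn. The weather is nice today."
    summary = " ".join(summarize(doc, n_sentences=2))
    print(summary)
    print(rouge1_f1(summary, "Cats chase mice in the barn."))
```

Swapping the toy vectors for real model embeddings would mean replacing `bow_embed` with a dense-vector encoder and averaging arrays instead of merging counters; the ranking step would stay the same. Timing each variant on the same inputs (for example with `time.perf_counter`) would give the speed side of the comparison.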

Requirements

To run this repo, install the following dependencies:

numpy
pandas
tqdm
transformers
bert-extractive-summarizer
gensim
nltk
scikit-learn (imported as sklearn)
rouge

Structure

The repo is structured as follows:

datasets: Datasets for the summarization task
process.ipynb: The notebook for the summarization task

License

This repo is licensed under the MIT license.

utomoreza/IndoSum
