
Text-Summarization

Improve Semantic Similarity Based on Statistical Approach and LLM-based Transformer Model for Extractive Summarization
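As a minimal, illustrative sketch of the statistical side only (not the repository's implementation): one simple statistical approach to extractive summarization scores each sentence by its term-frequency cosine similarity to the whole document and keeps the top-scoring sentences. All function names below are illustrative.

```python
import math
import re
from collections import Counter


def tf_vector(text):
    """Term-frequency vector over lowercase word tokens."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse TF vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def extract_summary(document, k=1):
    """Score each sentence by TF-cosine similarity to the whole
    document; return the top-k sentences in original order."""
    sentences = [s.strip()
                 for s in re.split(r"(?<=[.!?])\s+", document)
                 if s.strip()]
    doc_vec = tf_vector(document)
    ranked = sorted(range(len(sentences)),
                    key=lambda i: cosine(tf_vector(sentences[i]), doc_vec),
                    reverse=True)
    return " ".join(sentences[i] for i in sorted(ranked[:k]))
```

A transformer model would replace the TF vectors with contextual sentence embeddings, but the extraction step (rank sentences by similarity to the document, keep the top k) stays the same.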

Dataset

The data used in this study were obtained from Hugging Face: the Indonesian subset of XL-Sum, built from BBC News articles and available at https://huggingface.co/datasets/csebuetnlp/xlsum/viewer/indonesian, comprising training, validation, and test splits (Hasan et al., 2021). XL-Sum is distributed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0), which restricts its use to non-commercial research.
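A hedged sketch of loading this subset with the Hugging Face `datasets` library (assumes `pip install datasets`; the dataset id and config name are taken from the URL above):

```python
def load_xlsum_indonesian():
    """Load the Indonesian config of XL-Sum; downloads on first call."""
    from datasets import load_dataset  # requires `pip install datasets`
    return load_dataset("csebuetnlp/xlsum", "indonesian")


# The three splits the dataset viewer exposes for this config.
SPLITS = ("train", "validation", "test")

# Usage (triggers the download):
# ds = load_xlsum_indonesian()
# print({split: len(ds[split]) for split in SPLITS})
```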

Computation

For all experiments, we used Google Colab Pro+ with a TPU v2-8 accelerator.

Dataset Citation

@inproceedings{hasan-etal-2021-xl,
    title = "{XL}-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages",
    author = "Hasan, Tahmid and Bhattacharjee, Abhik and Islam, Md. Saiful and Mubasshir, Kazi and Li, Yuan-Fang and Kang, Yong-Bin and Rahman, M. Sohel and Shahriyar, Rifat",
    booktitle = "Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.findings-acl.413",
    pages = "4693--4703",
}
