Improving Semantic Similarity Based on a Statistical Approach and an LLM-Based Transformer Model for Extractive Summarization
The data used in this study were obtained from Hugging Face, specifically the Indonesian subset of the XL-Sum dataset of BBC News articles, accessible at https://huggingface.co/datasets/csebuetnlp/xlsum/viewer/indonesian and comprising training, validation, and test splits (Hasan et al., 2021). The XL-Sum dataset is distributed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0), which restricts its use to non-commercial research purposes.
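As a minimal sketch (not necessarily the authors' exact pipeline), the Indonesian subset can be loaded with the Hugging Face `datasets` library; the dataset identifier `csebuetnlp/xlsum` and the `indonesian` configuration follow from the URL above:

```python
# Minimal sketch: load the Indonesian subset of XL-Sum from Hugging Face.
# Requires: pip install datasets
from datasets import load_dataset

# "csebuetnlp/xlsum" and "indonesian" come from the dataset URL cited above.
xlsum = load_dataset("csebuetnlp/xlsum", "indonesian")

# The dataset ships with the three splits mentioned in the text.
print(xlsum)  # DatasetDict({'train': ..., 'validation': ..., 'test': ...})

sample = xlsum["train"][0]
print(sample["text"][:200])  # source article (truncated for display)
print(sample["summary"])     # reference summary
```

Each record pairs a full article (`text`) with its reference summary (`summary`), which is the input/target structure assumed throughout the experiments.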
For computational needs, we used Google Colab Pro+ equipped with a TPU v2-8 accelerator to run all experiments.
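On Colab, the TPU runtime is typically initialized before any model code runs. The sketch below assumes a TensorFlow-based setup (the framework is not specified in this section) and simply verifies that all eight cores of the v2-8 are visible:

```python
# Hedged sketch: connect to and initialize the Colab TPU runtime (TensorFlow).
# Assumes the Colab notebook's runtime type is set to "TPU".
import tensorflow as tf

resolver = tf.distribute.cluster_resolver.TPUClusterResolver()  # auto-detects the Colab TPU
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)

strategy = tf.distribute.TPUStrategy(resolver)
print("TPU cores:", strategy.num_replicas_in_sync)  # expected: 8 on a v2-8
```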
Hasan, T., Bhattacharjee, A., Islam, M. S., Mubasshir, K., Li, Y.-F., Kang, Y.-B., Rahman, M. S., & Shahriyar, R. (2021). XL-Sum: Large-scale multilingual abstractive summarization for 44 languages. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (pp. 4693–4703). Association for Computational Linguistics. https://aclanthology.org/2021.findings-acl.413