A repository of my self-study on NLP Transformers and the BART model. The work involves fine-tuning a pretrained model, deploying it on Hugging Face (a community and data science platform), and a little bit of spicy web design/development (I love design).
Explore the docs »
View Demo
·
Report Bug
·
Request Feature
Table of Contents
The main scope of this project is to build a minimalist web application that summarizes documents input by the user. The summarization can be extractive or abstractive, depending on the aim of the application. The web application uses a simple one-page design with minimal elements and colors, since I love minimalism. For the Natural Language Processing (NLP) model, I chose a sequence-to-sequence model for the summarization task. Among the models I investigated, BART (Bidirectional Auto-Regressive Transformers) from Facebook performs very well at producing abstractive summaries (or extractive ones, if needed). Therefore, I decided to use it for my web application.
The model can easily be found on the Hugging Face Hub. Hugging Face is a community and data science platform that provides tools enabling users to build, train and deploy ML models based on open source (OS) code and technologies. It has risen in recent years by providing a helpful environment where learners can try out their models and run inference. Hugging Face also provides an AutoTrain service; however, I chose to fine-tune a pre-trained model myself to see how the model works, and to get familiar with Transformers and application deployment.
For the web development, I use Flask since it is easy to use and compatible with Hugging Face deployment. For the training process, the notebooks run on both Google Colaboratory and Kaggle; I switched between these two platforms whenever one of them exceeded its quota.
The BART model was introduced by Facebook Research at ACL 2020. It has the ability to generate text based on the information from the encoder, i.e., it is designed for sequence-to-sequence problems, unlike BERT (Bidirectional Encoder Representations from Transformers), which pre-trains only an encoder, not a decoder.
The most used BART model would be the base-sized BART provided by Facebook. Distilled versions of BART have also been implemented by many researchers. One remarkable study was conducted by Sam Shleifer and Alexander M. Rush, in which three distillation approaches (direct knowledge distillation, pseudo-labeling, and shrink and fine-tune (SFT)) were compared. In this project, I decided to use the distilled BART model by Sam Shleifer (distilbart-cnn-12-6) and fine-tune it on the multi_news dataset. Using the pre-trained model alone would actually be good enough, but as I have mentioned, I wanted to fine-tune and deploy it myself so I could learn something new along the way.
The fine-tuned model can be viewed at datien228/distilbart-cnn-12-6-ftn-multi_news.
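As a quick sanity check, the fine-tuned model can be loaded straight from the Hub with the Transformers `pipeline` API. This is a minimal sketch: the `min_length`/`max_length` values below are illustrative, not the ones used in the app.

```python
from transformers import pipeline

# Load the fine-tuned checkpoint directly from the Hugging Face Hub
summarizer = pipeline(
    "summarization",
    model="datien228/distilbart-cnn-12-6-ftn-multi_news",
)

article = (
    "The Hugging Face Hub hosts thousands of pretrained models. "
    "BART is a sequence-to-sequence Transformer that can be "
    "fine-tuned for abstractive summarization of long documents."
)

# min_length/max_length here are illustrative values
summary = summarizer(article, min_length=10, max_length=60)[0]["summary_text"]
print(summary)
```

The first call downloads the model weights, so expect some startup latency before the summary is printed.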
For more information about BART model and the dataset, please refer to:
- BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
- Pre-trained Summarization Distillation
- multi_news dataset
The web application is quite simple. It was implemented using basic HTML, CSS, jQuery and Bootstrap 5. Some of the components used on the website were borrowed from examples found on the internet as well. The web application is called Summary.
The web application was implemented with Flask, and the main branch contains the work used in the development/deployment stages.
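A minimal sketch of how such a Flask app can be wired up. The route names and the `summarize` helper here are simplified stand-ins for the actual implementation, which runs the fine-tuned BART model over the submitted document.

```python
from flask import Flask, jsonify, render_template, request

app = Flask(__name__)

def summarize(text: str) -> str:
    # Placeholder for the real model call; the actual app would
    # feed the text through the fine-tuned BART summarizer here.
    return text[:200]

@app.route("/")
def index():
    # Serves the one-page UI (HTML/CSS/jQuery/Bootstrap)
    return render_template("index.html")

@app.route("/summarize", methods=["POST"])
def summarize_route():
    text = request.form.get("text", "")
    return jsonify({"summary": summarize(text)})

if __name__ == "__main__":
    # Hugging Face Spaces expects the server to listen on port 7860
    app.run(host="0.0.0.0", port=7860)
```

Binding to `0.0.0.0:7860` is what lets the same app run unchanged as a Hugging Face Space.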
The online demo was deployed using Hugging Face Spaces - a platform that offers a simple way to host ML demo apps directly on your profile.
For more information, please refer to the Hugging Face Spaces Documentation
There are no prerequisites for this project.
The application was deployed on Text Summarizer.
For the embedded version (no Hugging Face interface), please navigate to Text Summarizer Full.
- Add Changelog
- Add reference links for creator, project and feedback nav items
- Fine-tune the model on Vietnamese
- Optimize the website's code (optional)
See the open issues for a full list of proposed features (and known issues).
Distributed under the MIT License. See LICENSE.txt for more information.
LinkedIn: atien228
Email: d.atien228@gmail.com
Project Link: web-based-ai
Here is a list of resources that I found during my research and believe will be useful for you too.