Text summarization is a core Natural Language Processing (NLP) task with applications ranging from information retrieval to content generation, and Large Language Models (LLMs) have shown remarkable promise in improving summarization quality. This repository documents the experiments conducted as part of research under the KaggleX BIPOC Mentorship Program 2023, Cohort-3.
This repository includes experiment files for the CNN/DailyMail and XSum datasets, using three different Large Language Models (LLMs):
- MPT-7b-instruct
- Falcon-7b-instruct
- text-davinci-003
The accompanying technical paper is available at https://arxiv.org/abs/2310.10449
This repository also contains a notebook for summarizing YouTube video comments with OpenAI's text-davinci-003 model. I experimented with different temperature values to assess the quality of the generated summaries. The results are in the "Application" folder, in the "youtube-comment-summarizer.ipynb" file.
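The temperature sweep can be sketched as below. Note that the prompt template, the temperature values, and the `complete` callback are illustrative assumptions for this README, not the notebook's exact code; in the notebook the completion call goes to the OpenAI API with model "text-davinci-003".

```python
# Illustrative sketch of a temperature-sweep summarization experiment.
# The prompt wording and temperature values are assumptions, not the
# exact settings used in youtube-comment-summarizer.ipynb.

def build_prompt(comments, max_words=100):
    """Join raw YouTube comments into a single summarization prompt."""
    joined = "\n".join(f"- {c.strip()}" for c in comments)
    return (
        f"Summarize the following YouTube comments in at most "
        f"{max_words} words:\n{joined}\nSummary:"
    )

# Example values: lower temperatures favor deterministic summaries,
# higher ones favor more varied phrasing.
TEMPERATURES = [0.2, 0.5, 0.8]

def run_experiment(comments, complete):
    """Generate one summary per temperature setting.

    `complete(prompt, temperature=...)` stands in for an API call
    (e.g., an OpenAI completion request with text-davinci-003).
    Returns a dict mapping temperature -> generated summary.
    """
    prompt = build_prompt(comments)
    return {t: complete(prompt, temperature=t) for t in TEMPERATURES}
```

Comparing the returned summaries side by side makes it easy to judge how temperature trades off consistency against diversity.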
I encourage contributions and collaboration from the community. Feel free to clone this repository and experiment with different word-length and temperature settings to generate your own summaries. If you have any questions, suggestions, or insights to share, please don't hesitate to reach out; I value your feedback and would love to hear about your experiences.
I hope this work enhances your learning and research endeavors. Have a great time experimenting and exploring the possibilities.