This project aims to enhance customer service chatbots by fine-tuning large language models on the Stack Overflow dataset. The fine-tuned model tackles real-world technical issues by improving the chatbots' ability to understand and generate technical content.
We utilized the Stack Overflow dataset for this project, a rich source of user interactions involving technical problems and their solutions; a sketch of how such data might be loaded follows the list below. The dataset is ideal due to:
- Its extensive coverage of a broad spectrum of technology fields and topics.
- Its potential to improve customer service chatbots with better natural language understanding.
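As a rough illustration, here is a minimal sketch of how such question-answer pairs might be loaded for fine-tuning. The file name and column names are assumptions for illustration, not the project's actual preprocessing pipeline (see the notebooks referenced at the end for that).

```python
import pandas as pd

# Hypothetical CSV export of Stack Overflow posts; the file and column
# names ("question", "answer") are placeholders for illustration.
df = pd.read_csv("stackoverflow_qa.csv")

# Keep only complete question-answer pairs.
df = df.dropna(subset=["question", "answer"])

# Pair each technical question with its answer for fine-tuning.
pairs = list(zip(df["question"], df["answer"]))
print(f"Loaded {len(pairs)} question-answer pairs")
```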
Our fine-tuned model addresses the business problem of making customer service chatbots more efficient. By improving their ability to comprehend and respond to technical queries, these chatbots can:
- Reduce response times.
- Improve customer satisfaction.
- Allow human support teams to focus on more complex issues.
LoRA (Low-Rank Adaptation) is a technique for adapting large pre-trained models, such as GPT and BERT, through selective parameter updates. Rather than retraining all weights, LoRA introduces trainable low-rank updates to the model's weight matrices during training, reducing computational and memory requirements.
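As a concrete sketch, the Hugging Face peft library applies this idea as shown below. The base model, target modules, and hyperparameters here are illustrative placeholders, not the exact settings used in this project.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Small model as a stand-in; the project's actual base model may differ.
model = AutoModelForCausalLM.from_pretrained("gpt2")

# LoRA replaces a full weight update dW with a low-rank product B @ A
# (rank r), so only the small A and B matrices are trained.
lora_config = LoraConfig(
    r=8,                        # rank of the low-rank update
    lora_alpha=16,              # scaling factor for the update
    target_modules=["c_attn"],  # attention projection layers in GPT-2
    fan_in_fan_out=True,        # GPT-2 stores these weights transposed (Conv1D)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

Because the base weights stay frozen, the trainable parameter count drops by several orders of magnitude compared to full fine-tuning.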
QLoRA extends this idea to reduce memory consumption when fine-tuning large language models on GPUs with limited memory. It quantizes the frozen base model to 4-bit precision to optimize memory usage and trains LoRA adapters on top to counterbalance the quantization errors, maintaining model performance with minimal computational resources.
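A hedged sketch of this setup with the transformers, bitsandbytes, and peft libraries follows. The model identifier and hyperparameters are placeholders, and the snippet assumes a CUDA GPU with bitsandbytes installed.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization keeps the frozen base weights small in GPU memory.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

# Placeholder model id; any causal LM on the Hub could stand in here.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)

# Prepare the quantized model for training (casts norms, enables hooks).
model = prepare_model_for_kbit_training(model)

# Trainable LoRA adapters sit on top of the frozen 4-bit base weights.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # attention projections in Llama
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
```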
Instruction fine-tuning trains a pre-trained model on a dataset of explicit task instructions and desired responses, enhancing its performance on those tasks. Unlike unsupervised pre-training, this approach directly targets specific tasks and instruction-following behavior.
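For instance, each Stack Overflow question-answer pair could be wrapped in an explicit instruction template before tokenization. The template wording below is an assumption, not the exact prompt format used in the training notebooks.

```python
def format_instruction(question: str, answer: str) -> str:
    # Alpaca-style template; the exact wording is illustrative.
    return (
        "### Instruction:\n"
        "Answer the following technical support question.\n\n"
        f"### Input:\n{question}\n\n"
        f"### Response:\n{answer}"
    )

example = format_instruction(
    "How do I undo the last git commit?",
    "Run `git reset --soft HEAD~1`; the commit is undone but your changes stay staged.",
)
print(example)
```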
Refer to the files train_args_deepak.ipynb and train_deepak.ipynb for code details.