This project builds a chatbot on the Transformer encoder-decoder model introduced in the paper "Attention Is All You Need". The Transformer architecture is now widely used for natural language processing tasks such as machine translation and dialogue generation. Here, self-attention lets the chatbot relate every token of the input to every other token when it generates a response.
- Transformer Architecture: The chatbot is built using the Transformer architecture, which allows it to capture contextual dependencies and generate accurate responses.
- Self-Attention Mechanism: The model utilizes self-attention mechanisms to attend to relevant parts of the input sequence, enabling it to understand the context and generate context-aware responses.
- Multi-Head Attention: Multiple attention heads are employed to capture different types of dependencies, resulting in a more comprehensive understanding of the input and improved response generation.
- Encoder-Decoder Framework: The chatbot follows the classic encoder-decoder framework, where the encoder processes the input sequence and the decoder generates the response based on the encoded representation.
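
The self-attention step named in the list above can be illustrated with a small sketch of scaled dot-product attention. This is not code from this repository: it uses plain Python lists instead of TensorFlow tensors, the function names `softmax` and `scaled_dot_product_attention` are chosen only for illustration, and it leaves out the learned query/key/value projections that a real Transformer layer applies first. Multi-head attention runs several such computations in parallel on different projections and concatenates the results.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return (outputs, weights) for lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        w = softmax(scores)
        # Each output is the attention-weighted sum of the value vectors.
        out = [
            sum(wj * v[i] for wj, v in zip(w, values))
            for i in range(len(values[0]))
        ]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Toy self-attention: queries, keys, and values are the same token vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = scaled_dot_product_attention(tokens, tokens, tokens)
for row in weights:
    print([round(x, 3) for x in row])
```

Each row of `weights` sums to 1 and shows how strongly one token attends to the others. In the toy input, the first token gives equal weight to itself and to the third token, because both have the same dot product with it.
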
- Python
- TensorFlow library (version 2.10.0)
- Clone the repository
- Install the required packages
- I am using tensorflow-gpu 2.10.0 for this project
- To train the model on your own data, edit the preprocess.py file and replace the dataset with your own.
- Run the main.py file to start the training process. You can play around with the hyperparameters to best fit your needs.
- Run the chat.py file to load the trained model and start chatting :)
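
Before running main.py or chat.py, it may help to confirm that the installed TensorFlow matches the version this project lists (2.10.0). The snippet below is a minimal sketch using only the Python standard library. It is not part of the repository, and the helper names `installed_version` and `check_tensorflow` are made up for this example. It checks both the `tensorflow` and `tensorflow-gpu` distribution names, since the project mentions the latter.

```python
from importlib import metadata

EXPECTED_TF_VERSION = "2.10.0"  # version stated in this README


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_tensorflow():
    """Return (distribution name, version, matches_expected)."""
    for name in ("tensorflow", "tensorflow-gpu"):
        found = installed_version(name)
        if found is not None:
            return name, found, found == EXPECTED_TF_VERSION
    return None, None, False


name, found, matches = check_tensorflow()
if name is None:
    print("TensorFlow is not installed in this environment.")
elif matches:
    print(f"{name} {found} matches the expected version.")
else:
    print(f"{name} {found} found; this project lists {EXPECTED_TF_VERSION}.")
```

A mismatch does not necessarily mean the code will fail, but it is a useful first thing to rule out when training or loading the model raises errors.
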