Custom-Trained LLM Application with Llama, Grounded via RAG
Retrieval Augmented Generation (RAG) is a technique for enhancing the knowledge of large language models (LLMs) with additional, often private or real-time, data. While LLMs can reason about a wide range of topics, their knowledge is limited to publicly available data up to the point at which they were trained. To build AI applications that reason about private data or about information introduced after training, you must augment the model's knowledge with the specific details it requires. RAG does this by retrieving relevant information and injecting it into the model prompt, allowing the LLM to generate more contextually accurate and up-to-date responses.
- Read the dataset, clean the text, and divide it into chunks (see the chunking sketch after this list)
- Encode each chunk and store the embeddings in a DataFrame, so they are not recomputed on every run (see the indexing sketch below)
- Create an instance of Llama 3 and take the user prompt as input (see the generation sketch below)
- Fetch the k nearest neighbour chunks for the prompt and reframe the prompt with the retrieved context (see the retrieval sketch below)
- Encode the reframed prompt and pass it to the model
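A minimal sketch of the chunking step, assuming the source data is a plain-text file; the file path, chunk size, and overlap values are illustrative placeholders, not values from this project:

```python
import re

def load_and_chunk(path: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Read a text file, normalise whitespace, and split it into
    overlapping fixed-size character chunks."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    # Basic cleaning: collapse runs of whitespace into single spaces.
    text = re.sub(r"\s+", " ", text).strip()
    step = chunk_size - overlap
    # Overlap keeps sentences that straddle a chunk boundary retrievable.
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = load_and_chunk("data/corpus.txt")  # hypothetical path
```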
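One way to implement the encode-and-cache step, assuming the sentence-transformers library as the encoder; the model name and output path are assumptions, and any embedding model could stand in:

```python
import pandas as pd
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder choice

def build_index(chunks: list[str], out_path: str = "embeddings.pkl") -> pd.DataFrame:
    """Encode every chunk once and persist the result as a DataFrame,
    so embeddings are not recomputed on subsequent runs."""
    embeddings = encoder.encode(chunks, show_progress_bar=True)
    df = pd.DataFrame({"text": chunks, "embedding": list(embeddings)})
    df.to_pickle(out_path)  # reload later with pd.read_pickle(out_path)
    return df
```

Pickling is the simplest cache; Parquet or a vector database would be sturdier choices for larger corpora.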
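The retrieval and prompt-reframing step could look like the following brute-force k-nearest-neighbour scan over the cached embeddings; it reuses `encoder` from the previous sketch, and the prompt template is an illustrative assumption:

```python
import numpy as np
import pandas as pd

def retrieve(df: pd.DataFrame, query: str, k: int = 3) -> list[str]:
    """Return the k chunks whose embeddings have the highest cosine
    similarity to the query embedding."""
    q = encoder.encode([query])[0]  # encoder from the sketch above
    mat = np.vstack(df["embedding"].to_numpy())
    sims = mat @ q / (np.linalg.norm(mat, axis=1) * np.linalg.norm(q))
    top = np.argsort(sims)[::-1][:k]
    return df["text"].iloc[top].tolist()

def reframe(query: str, context: list[str]) -> str:
    """Prepend the retrieved chunks to the user prompt."""
    joined = "\n\n".join(context)
    return (f"Answer using only the context below.\n\n"
            f"Context:\n{joined}\n\nQuestion: {query}")
```

For larger corpora an approximate-nearest-neighbour index such as FAISS would replace the exact scan, but the interface stays the same.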
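Finally, a sketch of instantiating a Llama 3 model and generating from the reframed prompt, assuming the Hugging Face transformers API; the checkpoint named here is gated and requires access approval, and `retrieve`/`reframe` come from the sketch above:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"  # assumed checkpoint (gated)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
# device_map="auto" needs the accelerate package installed.
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

def answer(query: str, df) -> str:
    """Reframe the user prompt with retrieved context, encode it,
    and let the model generate a grounded answer."""
    prompt = reframe(query, retrieve(df, query))
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)
```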