In this project, I build a local RAG (Retrieval Augmented Generation) pipeline that processes PDF files and allows users to query information from these files using Large Language Models.
Retrieval-Augmented Generation is a natural language processing (NLP) technique that combines the strengths of retrieval-based methods and generation-based models to improve the quality and relevance of generated text. Retrieval-based methods search a large corpus of documents to find the pieces of information most relevant to a given query; they excel at surfacing factual, contextually relevant information quickly.
Generative models (such as LLMs) create new text from a given input, typically using deep learning architectures such as transformers. They can produce coherent and contextually rich text, but on their own they may generate information that is inaccurate or unsupported.
Quite simply, our original data source is the text contained in the PDFs we input. That text is split into smaller chunks and transformed into embeddings using the 'nomic-embed-text' model running in Ollama. These embeddings are vectors and are stored in a vector database such as Chroma, where you search by nearest neighbors rather than by substrings as in a typical database.
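The ingestion step can be sketched roughly as follows, assuming the `ollama`, `chromadb`, and `pypdf` Python packages; the file name, chunk size, and collection name are placeholders and the fixed-size splitter is a simplification (real pipelines often use overlapping or sentence-aware chunks):

```python
import ollama
import chromadb
from pypdf import PdfReader

# Read the PDF and pull out the raw text page by page.
reader = PdfReader("example.pdf")  # hypothetical input file
full_text = "\n".join(page.extract_text() or "" for page in reader.pages)

# Naive fixed-size chunking for illustration.
chunk_size = 1000
chunks = [full_text[i:i + chunk_size] for i in range(0, len(full_text), chunk_size)]

# Store each chunk's embedding in a persistent Chroma collection.
client = chromadb.PersistentClient(path="chroma_db")
collection = client.get_or_create_collection(name="pdf_chunks")

for idx, chunk in enumerate(chunks):
    # Embed the chunk with the nomic-embed-text model served by Ollama.
    response = ollama.embeddings(model="nomic-embed-text", prompt=chunk)
    collection.add(
        ids=[f"chunk-{idx}"],
        embeddings=[response["embedding"]],
        documents=[chunk],
    )
```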
When a query is submitted, it is also transformed into a vector embedding, and the most relevant entries are fetched from the database via similarity search. The retrieved chunks are then passed to the LLM as context so it can generate the final response to the user.
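A minimal sketch of the query side, continuing from the ingestion example above; the example question, the number of retrieved chunks, the prompt wording, and the 'llama3' model name are assumptions, not fixed parts of the pipeline:

```python
import ollama
import chromadb

client = chromadb.PersistentClient(path="chroma_db")
collection = client.get_or_create_collection(name="pdf_chunks")

query = "What does the document say about revenue?"  # hypothetical query

# Embed the query with the same model used for the document chunks.
query_embedding = ollama.embeddings(model="nomic-embed-text", prompt=query)["embedding"]

# Fetch the nearest chunks by vector similarity.
results = collection.query(query_embeddings=[query_embedding], n_results=3)
context = "\n\n".join(results["documents"][0])

# Ask a local LLM to answer using only the retrieved context.
prompt = f"Answer the question using only this context:\n{context}\n\nQuestion: {query}"
answer = ollama.generate(model="llama3", prompt=prompt)  # model choice is an assumption
print(answer["response"])
```

Because the query and the document chunks are embedded with the same model, their vectors live in the same space, which is what makes the nearest-neighbor lookup meaningful.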