Building upon kennethleungty/Llama-2-Open-Source-LLM-CPU-Inference, with a Streamlit front-end. It lets you load any .txt or .pdf document containing text and ask questions about it using LLMs on CPU.
Build the Docker image:
docker build -t document-ama .
Run it:
docker run -d -p 8501:8501 --name document-ama document-ama:latest
Open the Streamlit app in your browser:
http://localhost:8501/
Create a virtual environment (venv), if desired:
python3 -m venv venv
source venv/bin/activate
Install requirements:
pip install -r requirements.txt
Run Streamlit:
streamlit run app.py
A new browser window will open with the Streamlit app.
(Also see the Medium article for the original code architecture.)
It is currently set up to use an 8-bit quantized GGML build of Llama 2 as the Q&A model and sentence-transformers/all-MiniLM-L6-v2 as the embeddings model. If those models are not present (e.g. on the first run), it will download them first.
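As a rough sketch, a first-run download via the Hugging Face Hub could look like the following; the repo id and filename here are assumptions for illustration, not necessarily the exact ones the app uses:

```python
from huggingface_hub import hf_hub_download

# Fetch the 8-bit quantized GGML weights if not already cached locally.
model_path = hf_hub_download(
    repo_id="TheBloke/Llama-2-7B-Chat-GGML",     # assumed model repo
    filename="llama-2-7b-chat.ggmlv3.q8_0.bin",  # 8-bit (q8_0) variant
)
print(model_path)  # local path to the cached weights
```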
Then it will ask for a file to be uploaded. The file must be either a .txt file or a .pdf with selectable text; it will not attempt to OCR the document.
Once the document is uploaded, LangChain is used to (see the sketch after this list):
- Load the document;
- Extract the text with PyPDF;
- Split it into 500-character chunks (with 50-character overlap);
- Compute the embedding vector of each chunk;
- Finally, store those embeddings with their respective chunks in a FAISS database.
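A minimal sketch of that ingestion pipeline, assuming the classic LangChain API; the file path and index folder name are illustrative:

```python
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS

# Load the uploaded PDF and extract its text with PyPDF.
documents = PyPDFLoader("uploaded.pdf").load()

# Split into 500-character chunks with 50-character overlap.
chunks = RecursiveCharacterTextSplitter(
    chunk_size=500, chunk_overlap=50
).split_documents(documents)

# Embed each chunk and store everything in a FAISS index on disk.
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
db = FAISS.from_documents(chunks, embeddings)
db.save_local("vectorstore/example-checksum")  # folder named after the file's checksum
```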
The FAISS files are saved on disk under a folder named after each file's checksum, so if the exact same file is uploaded again, the previously created database is reused instead of being rebuilt.
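The caching logic could look roughly like this; the hash algorithm and folder layout are assumptions for illustration:

```python
import hashlib
import os

def file_checksum(data: bytes) -> str:
    # Assumed hash; the app may use a different algorithm.
    return hashlib.sha256(data).hexdigest()

with open("uploaded.pdf", "rb") as f:  # illustrative path
    db_dir = os.path.join("vectorstore", file_checksum(f.read()))

if os.path.isdir(db_dir):
    # Reuse the previously created FAISS index instead of rebuilding it.
    ...
```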
After that is done, it asks for the question. It then loads the LLM into memory using CTransformers and keeps using LangChain to (see the sketch after this list):
- Load the FAISS db into memory;
- Build a prompt template from the question and the hardcoded prompt template string;
- Build a RetrievalQA chain with the LLM, so the context relevant to the question is loaded into the prompt template;
- Ask the RetrievalQA chain the question.
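A minimal sketch of the Q&A step, again assuming the classic LangChain API; the model repo, prompt wording, and retriever settings are illustrative stand-ins, not the app's exact values:

```python
from langchain.chains import RetrievalQA
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.llms import CTransformers
from langchain.prompts import PromptTemplate
from langchain.vectorstores import FAISS

# Load the quantized Llama 2 model on CPU via CTransformers.
llm = CTransformers(
    model="TheBloke/Llama-2-7B-Chat-GGML",  # assumed model repo
    model_type="llama",
    config={"max_new_tokens": 256, "temperature": 0.01},
)

# Illustrative stand-in for the app's hardcoded prompt template string.
template = """Use the following context to answer the question.
Context: {context}
Question: {question}
Answer:"""
prompt = PromptTemplate(template=template,
                        input_variables=["context", "question"])

# Load the FAISS index built during ingestion back into memory.
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
db = FAISS.load_local("vectorstore/example-checksum", embeddings)

# RetrievalQA retrieves the chunks relevant to the question and
# stuffs them into the prompt before calling the LLM.
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=db.as_retriever(search_kwargs={"k": 2}),
    return_source_documents=True,
    chain_type_kwargs={"prompt": prompt},
)

result = qa({"query": "What is this document about?"})
print(result["result"])            # the answer
print(result["source_documents"])  # the relevant passages
```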
After the RetrievalQA chain returns an answer, the app displays the answer together with the passages of the text most relevant to it.