Sample scripts to summarize and query documents using LLMs.
THIS IS NOT ROBUST OR PRODUCTION-READY! FOR DEMONSTRATION PURPOSES ONLY!
This is the sample code for the following video and blog post:
- Summarize and Query PDFs with a Private Local GPT for Free using Ollama and Langchain (YouTube)
- Summarize and Query PDFs with AI using Ollama
All you have to do is install the dependencies in pyproject.toml:
python = "^3.12"
openai = "^1.14.3"
langchain = "^0.1.13"
ollama = "^0.1.8"
rich = "^13.7.1"
python-dotenv = "^1.0.1"
langchain-openai = "^0.1.1"
pypdf = "^4.1.0"
tiktoken = "^0.6.0"
Using poetry, that would be:
poetry install
Then set up your environment variables. The recommended way is to use a .env file: copy
and rename one of .env-ollama-sample or .env-openai-sample to .env. If you use
OpenAI, you will also need to set your API key in .env.
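As a rough illustration of what a .env file does, here is a minimal sketch that parses KEY=VALUE lines and copies them into the process environment. It is a simplified standard-library stand-in, not the project's code: the dependency list includes python-dotenv, whose load_dotenv() does this in one call. The helper name load_env_file and the sample contents are hypothetical. OPENAI_API_KEY is the variable the openai client reads by default, but the sample files in this repository may use different keys.

```python
import os


def load_env_file(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments (simplified)."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Hypothetical .env contents; check the sample files for the real keys.
SAMPLE_ENV = """
# placeholder value, not a real key
OPENAI_API_KEY="sk-your-key-here"
"""

settings = load_env_file(SAMPLE_ENV)

# Like load_dotenv() with its default settings, do not override
# variables that are already set in the environment.
for key, value in settings.items():
    os.environ.setdefault(key, value)
```

Keeping the key in .env rather than in the scripts means the same code can switch between Ollama and OpenAI by swapping one file.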
To start the Streamlit app:
streamlit run doc_app.py
There are two scripts, one for summarizing and one for querying documents.
To summarize document.pdf from the first page, excluding the last two, using mixtral with a temperature of 0.2:
python summarize.py document.pdf -s 0 -e "-2" -m mixtral -t 0.2
To query document.pdf from the first page, excluding the last two, using mixtral with a temperature of 0.2:
python query.py document.pdf "What is the data used in this paper?" -s 0 -e "-2" -m mixtral -t 0.2
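The start and end options in the commands above behave like a Python slice: a start of 0 and an end of -2 keep everything from the first page up to, but not including, the last two. The sketch below shows one way such flags could be parsed and applied. It is not the actual code of summarize.py or query.py. The short flags come from the commands above, while the long option names, the defaults, and the select_pages helper are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Short flags mirror the commands above; long names and defaults are hypothetical.
    parser = argparse.ArgumentParser(description="Page-range sketch")
    parser.add_argument("pdf", help="path to the PDF file")
    parser.add_argument("-s", "--start", type=int, default=0, help="first page index")
    parser.add_argument("-e", "--end", type=int, default=None, help="end page index (exclusive)")
    parser.add_argument("-m", "--model", default="mixtral", help="model name")
    parser.add_argument("-t", "--temperature", type=float, default=0.0)
    return parser


def select_pages(pages, start=0, end=None):
    """Return pages[start:end]; a negative end counts back from the last page."""
    return pages[start:end]


# Same arguments as the summarize example, passed explicitly instead of via the shell.
args = build_parser().parse_args(
    ["document.pdf", "-s", "0", "-e", "-2", "-m", "mixtral", "-t", "0.2"]
)

# Stand-in for the page texts a PDF loader would return.
pages = ["p1", "p2", "p3", "p4", "p5"]
print(select_pages(pages, args.start, args.end))  # ['p1', 'p2', 'p3']
```

Leaving the end option unset (None) keeps every page from the start index onward, which is a natural default for documents without trailing reference pages.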