Llarga stands for 'Local Large language RAG Application'. It is a Streamlit application for interfacing with a local RAG LLM.
- Install the required libraries. The LLM and RAG system rely on two key libraries, which you should set up and make sure are working independently (a hypothetical install sketch follows this list):
  - `nlp_pipeline` for the processing of documents
  - `local_rag_llm` for the LLM itself
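A sketch of the install, assuming both libraries are published on PyPI under these exact names (if not, follow each project's own install instructions):

```sh
# install the document-processing and LLM libraries (package names assumed; see each repo's README)
pip install nlp_pipeline local_rag_llm
```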
- Download the LLM (in .gguf format) you would like to use and put it in the `models/` directory (e.g., the Q5 quantization of Llama chat is available here).
- Update the `llm_path` field of the `metadata/llm_list.csv` file to reflect the location of the GGUF, and the `llm_url` field for your own reference.
- If you would like to prepopulate a corpus outside of the app, add its name to the `metadata/corpora_list.csv` file in the `name` column, the directory containing the .txt files in the `text_path` column, and the path to the metadata file in the `metadata_path` column. The metadata file can contain anything, but must at least include a `text_id` column (a unique identifier starting from 1) and a `file_path` column containing the absolute paths of all the text files in the corpus, as sketched below.
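For instance, a prepopulated corpus entry might look like this (all names and paths here are hypothetical):

```
# metadata/corpora_list.csv
name,text_path,metadata_path
my_corpus,/home/me/corpora/my_corpus/txts,/home/me/corpora/my_corpus/metadata.csv
```

```
# /home/me/corpora/my_corpus/metadata.csv
text_id,file_path
1,/home/me/corpora/my_corpus/txts/doc1.txt
2,/home/me/corpora/my_corpus/txts/doc2.txt
```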
- In the `metadata/user_list.csv` file, put user names and, optionally, emails.
- You can change the title of the application by changing the `app_title` column in the `metadata/settings.csv` file.
- You can change the contact person by changing the `author_name` and `author_email` columns in the `metadata/settings.csv` file.
- In `metadata/settings.csv`, in the `corpora_location` column, put the directory of your Streamlit app and its `corpora/` directory. This is for the management of the corpus metadata files, which use absolute paths because of the `nlp_pipeline` library.
- You can change the context prompt by editing the `context_prompt` column in the `metadata/settings.csv` file. The `non_rag_system_prompt` column holds the default system prompt if you are not using RAG; `rag_system_prompt` holds the default if you are. The system prompt can also be changed from the front end.
- Run the app from the command line with `streamlit run app.py --server.port 8***`, at whatever port you wish.
- To get the app online quickly, you can use ngrok to expose this local port so the app can be accessed via the internet.
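For example, assuming ngrok is installed and the app was started on port 8501, exposing it is a one-liner:

```sh
# tunnel the local Streamlit port to a public URL (8501 is just an example port;
# use whichever port you passed to --server.port)
ngrok http 8501
```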
- The app has a password, which you can set by creating a `.streamlit/` directory in the base directory of the app, with a `secrets.toml` file inside containing `password = "desired_password"`.
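A minimal sketch of that file (the password value is a placeholder):

```toml
# .streamlit/secrets.toml
password = "desired_password"
```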
- You can change various theme options by creating a `.streamlit/config.toml` file, containing e.g.:

```toml
[theme]
primaryColor="#5b92E5"
backgroundColor="#FFFFFF"
secondaryBackgroundColor="#F0F2F6"
```
- The app can support unlimited users. Simultaneous generation requests will be queued and executed in the order they are received.
- All users share the same LLMs, so if you want to allow users to choose between multiple LLMs, you need enough VRAM to load them simultaneously.
  - Alternatively, you can tick the `Clear other LLMs on reinitialize` checkbox under `Advanced model parameters`, which will clear all other LLMs (for all users) before loading the chosen model.
- Parameters are explained in the sidebar in their respective tooltips.
- Most parameters can be changed and reflected at generation time. Exceptions are `Which LLM` and `Which corpus`, which, when changed, require hitting the `Reinitialize model` button afterwards.
- The two parameters under `Vector DB parameters` require recreating the vector database, which may take longer if you have a very large corpus. If you change either of these, click the `Reinitialize model and remake DB` button instead.
- Hit the `Reset model's memory` button to clear the model's short-term memory/context. This is also necessary if you change the `System prompt` parameter.
- The system initializes with no corpus, so you are chatting with the vanilla LLM.
- To query over your own documents, you have 7 options (sketches for some of them follow this list):
  - Preprocess your files into .txts and place them in the appropriate places according to the instructions in the "Set up" section. The corpus will then appear as an option under the `Which corpus` selector.
  - Paste a comma-separated list of URLs into the `URLs` box. Make sure these URLs aren't behind a login/paywall; if they are, copy the content into a Word or .txt file and upload it directly.
  - Upload/drag a single .csv, .doc, .docx, .pdf, or .txt file into the `Upload your own documents` box.
  - Upload a single `metadata.csv` file into the `Upload your own documents` box. The CSV can include any metadata you want, but must at least include a `web_filepath` column pointing to the website or PDF file online.
  - Upload a .zip file containing many documents. Put all your documents into a directory called `corpus/`, then zip it. Upload that file into the `Upload your own documents` box.
  - Upload a .zip file containing many documents as well as a metadata file. Put all your documents into a directory called `corpus/`, then put a file called `metadata.csv` at the same level as the `corpus/` directory (not in the directory), then zip the directory and CSV together. The CSV needs to have at least a column named `filename` with the filenames of the documents. Upload that file into the `Upload your own documents` box.
  - Fill in the `Google News query` parameter to create a corpus based on results from Google News.
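As a sketch of the two metadata-based options above (all filenames and values here are hypothetical): a standalone `metadata.csv` for online documents might look like

```
web_filepath,title
https://example.com/report.pdf,Annual report
https://example.com/article.html,Some article
```

and a zipped corpus with metadata might be laid out as

```
my_corpus.zip
├── corpus/
│   ├── doc1.pdf
│   └── doc2.docx
└── metadata.csv   # must contain at least a "filename" column (e.g., doc1.pdf)
```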
- You can persist your corpus if it is large by typing a name other than `temporary` into the `Uploaded corpus name` box. This name will then appear as an option under the `Which corpus` dropdown. It should be lower case with no spaces or special characters; use underscores instead of spaces.
- Then hit the `Process corpus` button. This will both process the corpus and then reinitialize the model on it; wait for both to finish.
- You can clear out old corpora from local files and the database with `helper/clear_corpus.py`: open a command line in the `helper/` directory, then use the `--keep` flag to delete everything except the named corpora, or the `--remove` flag to delete only the named corpora, as shown below.
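For example:

```sh
# delete everything except corpus1 and corpus2
python clear_corpus.py --keep corpus1,corpus2

# remove only corpus1 and corpus2
python clear_corpus.py --remove corpus1,corpus2
```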
- Database credentials are stored in `metadata/settings.csv`.
- For backups: if you have `dump_on_exit` set to `1` in the `metadata/settings.csv` file, a database dump will be created in `corpora/vector_db_dump.sql` each time a user exits the application.
- If you want to recreate the vector database in another place, for instance to run the application on a different computer, copy the entire `corpora/` directory to the new application and set `restore_db` to `1` in the `metadata/settings.csv` file.
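A minimal sketch of that move, assuming SSH access and hypothetical paths:

```sh
# copy the corpora/ directory (processed texts plus vector_db_dump.sql) to the new machine
scp -r corpora/ user@newhost:/path/to/llarga/corpora/

# then, on the new machine, set restore_db to 1 in metadata/settings.csv
# before starting the app, so the dump is restored into the database
```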
If you are using only the CPU or an Nvidia GPU, you can run the application exclusively with Docker.
- Download the `docker-compose.yml` and `Dockerfile` (for CPU-only) or `Dockerfile-gpu` (for GPU) files.
- In `docker-compose.yml`, edit `HF_TOKEN` to your API token.
- There are four elements which need to exist outside of the container for persistence, portability, and personalization (see the sketch after this list):
  - `corpora/` directory: this is where processed corpora (text files and vector database dumps) are saved. Change the `<local corpora directory>` line in `docker-compose.yml` to your local path for these files.
  - `metadata/` directory: this is where various information such as database credentials, the user list, the LLM list, etc. is stored. Change the `<local metadata directory>` line in `docker-compose.yml` to your local path for these files. The elements to be manually checked and changed are `settings.csv` (the `app_title`, `author_name`, and `author_email` columns), `llm_list.csv`, and `user_list.csv`.
  - `models/` directory: this is where the actual LLMs are stored. Change the `<local models directory>` line in `docker-compose.yml` to your local path for these files.
  - `secrets.toml` file: this is where you can change the application's password. Change the `<local secrets.toml file path>` line in `docker-compose.yml` to your local path for this file.
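As a purely hypothetical sketch of what those four mounts might look like in `docker-compose.yml` (the host paths are placeholders and the container-side paths are assumptions, not the file's actual contents):

```yaml
services:
  streamlit:
    volumes:
      - /home/me/llarga/corpora:/app/corpora        # <local corpora directory>
      - /home/me/llarga/metadata:/app/metadata      # <local metadata directory>
      - /home/me/llarga/models:/app/models          # <local models directory>
      - /home/me/llarga/.streamlit/secrets.toml:/app/.streamlit/secrets.toml  # <local secrets.toml file path>
```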
- If you are using the CPU, delete or comment out the `deploy:` section in `docker-compose.yml`, and change the `dockerfile: Dockerfile-gpu` line to `dockerfile: Dockerfile`.
- Navigate to the directory where you saved the `docker-compose.yml` file and Dockerfile and run `docker compose up`.
- The application will now be available on port 8502 by default.
If you are using Apple silicon, you won't be able to run everything in Docker because of the lack of MPS drivers, but you can still use the pgvector image.
- Follow the instructions above to install `local_rag_llm` and `nlp_pipeline` individually.
- Download the `docker-compose.yml` file.
- From the `docker-compose.yml` file, delete the `streamlit:` line and everything below it.
- Start the postgres container with `docker compose up`.
- Edit your `metadata/settings.csv` file and change the `host` column from `localhost` to `postgres` and the `username` column to `postgres`, as sketched after this list.
- Run the application as normal.
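For example, the relevant `metadata/settings.csv` values would change roughly like this (other columns omitted; the original `username` value is hypothetical):

```
# before
host,username
localhost,myuser

# after
host,username
postgres,postgres
```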