Welcome to a rough draft/demo of a vector database that acts as the primary source of information for an LLM. If the LLM does not find any related info in the database, it should just say it doesn't know.
This model's RAG database has access only to information on vector databases. So, it should admit that it doesn't have any relevant information in its memory for questions unrelated to vector databases (fingers crossed).
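To make the "say it doesn't know" behavior concrete, here is a minimal sketch of one common way to do it: compare the query embedding against the stored embeddings and refuse to answer when the best match falls below a similarity threshold. This is not the code in chat_demo.py. The function names, the 0.75 threshold, and the toy 3-dimensional vectors are all assumptions made for illustration; a real setup would get its vectors from an embedding model.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query_vec, store, threshold=0.75):
    """Return the best-matching text, or None if nothing is similar enough.

    The threshold value is a hypothetical choice, not one taken from the repo.
    """
    best_text, best_score = None, -1.0
    for text, vec in store:
        score = cosine_similarity(query_vec, vec)
        if score > best_score:
            best_text, best_score = text, score
    return best_text if best_score >= threshold else None

def answer(query_vec, store, threshold=0.75):
    """Answer from retrieved context, or admit there is nothing relevant."""
    context = retrieve(query_vec, store, threshold)
    if context is None:
        return "I don't know."
    return f"Based on my database: {context}"

# Toy store: hand-written vectors standing in for real embeddings (assumption).
STORE = [
    ("Vector databases index embeddings for similarity search.", [0.9, 0.1, 0.0]),
    ("Approximate nearest neighbor search trades accuracy for speed.", [0.8, 0.3, 0.1]),
]

if __name__ == "__main__":
    print(answer([0.9, 0.1, 0.0], STORE))  # close to the first entry
    print(answer([0.0, 0.0, 1.0], STORE))  # unrelated to anything stored
```

In a real pipeline the retrieved text would be passed to the LLM as context rather than returned directly, and the threshold would need tuning against the embedding model in use.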
The intention is to increase the general dependability of a local LLM. The logical next step is having it search the internet if it doesn't know and embed that information into its database. Then I'll make a chat interface, then a memory database, then this, then that... basically there are a few more related projects/repositories after this one.
Also, this runs pretty well on an M4 chip, so that's nice!
To run the demo, you'll first have to download the models from the links below and put them in the models directory after cloning:
Then, just copy and paste this into your terminal and hit enter:
git clone https://github.com/crdcamp/llama-cpp-llm-embedding.git
cd llama-cpp-llm-embedding
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python3 chat_demo.py
You can also test it with your own prompts by editing them at the top of chat_demo.py. This is a proof of concept and not meant to be an interactive chat.