A tiny implementation of a RAG system that runs entirely on your computer!
> **Important:** This repo is under development and may introduce breaking changes between commits.
- Start the Ollama engine.
- Install dependencies with uv: `uv sync`
- Run `uv run minirag`. It will run a `llama3.2:1b` model.
- Chat with the model.
There is one configurable parameter:
- `-m`, `--model`: model to use (`llama3.2:1b` by default). You can check the full list of available models here.
- Type a message to chat with the model. The whole conversation is remembered by the model.
- Type `/bye` to exit the chat.
- Type `/help` to show all the commands.
- Type `/add` to create a collection.
  - You'll be asked to enter the paths of all the documents for the collection. You can enter specific files or directories; for a directory, all the files within it will be processed.
  - You will be asked to enter a collection name.
  - The embeddings will then be generated and stored in a `.npy` file for future reference. The embeddings are kept in memory with NumPy.
- Type `/activate` to load and use a collection.
- Type `/deactivate` to deactivate the active collection.
- Type `/list` to list the available collections.
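The `/add` flow above persists embeddings as a `.npy` file and keeps them in memory with NumPy. A minimal sketch of how that save/load round trip could work (the file name, array shape, and dimensions here are illustrative, not the repo's actual values):

```python
import os
import tempfile

import numpy as np

# Hypothetical collection: one embedding row per document chunk
# (4 chunks, 384-dimensional vectors).
embeddings = np.random.rand(4, 384).astype(np.float32)

# /add could persist the whole array as a single .npy file.
path = os.path.join(tempfile.gettempdir(), "my_collection.npy")
np.save(path, embeddings)

# /activate could then load the array back into memory for searching.
loaded = np.load(path)
print(loaded.shape)  # (4, 384)
```

Storing the whole collection as one dense array keeps lookups simple: similarity search is a single matrix-vector product over the loaded array.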
These are the next steps I plan to take:
- Support vision models
- Support for more files (see section below)
- Testing
- Improve index algorithm
- Performance metrics (speed, storage, scalability, ...)
- UI (something very light and simple)
This project uses uv for dependency management. The project configuration is in `pyproject.toml`:
- Core dependencies are listed under `[project.dependencies]`
- Development dependencies are listed under `[tool.uv.dev-dependencies]`
- To install dependencies, run `uv sync`
- To run the CLI, use `uv run minirag` or `uv run minirag --model <model_name>`
- To run tests, use `uv run pytest tests`
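For readers unfamiliar with uv's layout, a hypothetical minimal `pyproject.toml` matching the structure described above could look like this (the dependency names are illustrative assumptions, not the repo's actual list):

```toml
[project]
name = "minirag"
version = "0.1.0"
dependencies = [
    "numpy",   # assumption: in-memory embedding storage
    "ollama",  # assumption: client for the local Ollama engine
]

[tool.uv]
dev-dependencies = [
    "pytest",
    "pytest-cov",
]
```

`uv sync` reads this file and installs both groups into the project's virtual environment.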
Feel free to suggest any other relevant topics or ideas to include in the code (contributions are also welcome).
- .txt
Testing has been done with pytest.
To run all the tests:
uv run pytest tests
To generate coverage report:
uv run pytest --cov-report term --cov-report xml:coverage.xml --cov=minirag
The `evaluation/` folder contains scripts and data used to evaluate the performance of this RAG system.
The supported metrics right now are the following:
This system uses the basic `np.dot` function to compute the similarity search between the embeddings. To compute the eval data for this metric, run `python -m evaluation.eval`; it will generate a `.csv` file with the benchmark results for different collection sizes.
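As an illustration of the `np.dot`-based similarity search described above (a sketch, not the repo's actual code), retrieval can score every stored embedding against the query in one matrix-vector product and return the indices of the best matches:

```python
import numpy as np

def top_k(query: np.ndarray, collection: np.ndarray, k: int = 3) -> np.ndarray:
    """Return the indices of the k stored embeddings most similar to the query."""
    scores = np.dot(collection, query)   # one dot-product score per stored chunk
    return np.argsort(scores)[::-1][:k]  # indices sorted by descending score

# Toy 2-D embeddings for three chunks, and a query close to the first one.
collection = np.array([[1.0, 0.0],
                       [0.0, 1.0],
                       [0.7, 0.7]])
query = np.array([1.0, 0.1])

print(top_k(query, collection, k=2))  # [0 2]
```

Note that a raw dot product favors longer vectors; normalizing the embeddings first would turn this into cosine similarity.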