Burdock is a vector database written entirely in C++. It stores your vectors on the local file system in a custom file type (.brdk) and queries them with a cosine similarity search to find the closest vectors.
Burdock was created as a free, local alternative to vector databases that require you to store your data in the cloud. Because Burdock is open source and runs on your own machine, you can verify that your vectors stay private.
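For reference, cosine similarity scores two vectors by the angle between them rather than by their lengths. The symbols below (a, b, n) are introduced here for illustration and are not taken from Burdock's source:

```latex
\operatorname{sim}(a, b)
  = \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert}
  = \frac{\sum_{i=1}^{n} a_i b_i}
         {\sqrt{\sum_{i=1}^{n} a_i^2} \; \sqrt{\sum_{i=1}^{n} b_i^2}}
```

As a worked case, take a = (1, 1, 1, 1, 1) and b = (1, 0, 0, 0, 0). The dot product is 1 and the norms are sqrt(5) and 1, so the score is 1 / sqrt(5), roughly 0.447. A score of 1 means the two vectors point in the same direction, and a score of 0 means they are orthogonal.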
The code is built with C++ and CMake.
To get a local copy up and running, follow these steps.
- Clone the repo
git clone https://github.com/anirudlappathi/burdock.git
- CD into the repository folder
cd burdock
- Create a build folder
mkdir build
- CD into the build folder
cd build
- Generate the build files with CMake
cmake ..
Example:
VectorDatabase vdb("./testing", 5);
for (int i = 0; i < 16; i++) {
    // For now, constructing an Embedding without an
    // explicit vector creates a random test vector
    Embedding e(5);
    vdb.insert(e);
}
vdb.WriteToFile();

// Opening the same folder automatically reads
// in the files written above
VectorDatabase vdb2("./testing");

// Query with a vector whose 5 components are all 1
std::vector<float> a(5, 1.0f);
Embedding e(a);
auto b = vdb2.query(e);

// b.first points to the stored vector closest to
// the queried vector, and b.second is the cosine
// similarity score between the two
std::cout << *b.first << ", " << b.second << std::endl;
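To make the shape of the query result in the example above more concrete, here is a minimal sketch of a nearest-vector search by brute force. It is not Burdock's implementation: `Vec`, `cosine_similarity`, and `nearest` are hypothetical names chosen for illustration, and a tree-based index such as the KDB-Tree on the roadmap would avoid scanning every vector the way this sketch does.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical alias for illustration; Burdock has its own Embedding type.
using Vec = std::vector<float>;

// Hypothetical helper: cosine similarity of two vectors of equal length.
// Returns 0 when either vector has zero length, since the angle is undefined.
float cosine_similarity(const Vec& a, const Vec& b) {
    assert(a.size() == b.size() && "vectors must have the same dimension");
    double dot = 0.0, norm_a = 0.0, norm_b = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        dot += static_cast<double>(a[i]) * b[i];
        norm_a += static_cast<double>(a[i]) * a[i];
        norm_b += static_cast<double>(b[i]) * b[i];
    }
    if (norm_a == 0.0 || norm_b == 0.0) return 0.0f;
    return static_cast<float>(dot / (std::sqrt(norm_a) * std::sqrt(norm_b)));
}

// Hypothetical stand-in for a query: scans every stored vector and returns
// a pointer to the best match together with its score.
// Returns {nullptr, 0} when nothing is stored.
std::pair<const Vec*, float> nearest(const std::vector<Vec>& stored,
                                     const Vec& query) {
    const Vec* best = nullptr;
    float best_score = 0.0f;
    for (const Vec& candidate : stored) {
        const float score = cosine_similarity(candidate, query);
        if (best == nullptr || score > best_score) {
            best = &candidate;
            best_score = score;
        }
    }
    return {best, best_score};
}
```

Calling `nearest(stored, query)` yields a pair that can be read the same way as `b.first` and `b.second` in the example. The linear scan costs one similarity computation per stored vector, which is fine for small collections and is the baseline that an index is meant to improve on.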
Roadmap:
- Add README.md
- Create POC
- Implement KDB-Tree
- Use KD-Tree within nodes
- Use various techniques for optimization
  - Random Projection
  - Product Quantization
  - Locality-sensitive hashing
  - Hierarchical Navigable Small World (HNSW)
- Python connector library
- Optimize saving already open databases
  - When a node reaches its max vector limit, grow the KDB-Tree by only that single node rather than the entire layer
  - Save only the changed parts of node files
- Monitoring
- Logging
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Distributed under the MIT License. See MIT License for more information.
Project Link: https://github.com/anirudlappathi/burdock