React/ASP.NET 8 app for Retrieval-Augmented Generation on local hardware (no external APIs). Uses C# for embedding generation via AllMiniLML6v2Sharp, runs Llama-2-Chat-7b via LLamaSharp, and connects to a ChromaDB instance with ChromaDBSharp, all on the local machine.
This app is for demo purposes only, to show that the whole pipeline can be run from .NET; it is not intended for production use.
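At a high level the request path looks roughly like the sketch below. The interface names (`IEmbedder`, `IVectorStore`, `IChatModel`) are hypothetical stand-ins for the AllMiniLML6v2Sharp, ChromaDBSharp, and LLamaSharp pieces, not the app's actual types:

```csharp
// Hypothetical seams for the three local components; the real app's types differ.
public interface IEmbedder { float[] Embed(string text); }                                // AllMiniLML6v2Sharp
public interface IVectorStore { IReadOnlyList<string> Query(float[] vector, int topK); }  // ChromaDBSharp
public interface IChatModel { Task<string> CompleteAsync(string prompt); }                // LLamaSharp

public sealed class RagPipeline(IEmbedder embedder, IVectorStore store, IChatModel chat)
{
    public async Task<string> AnswerAsync(string question)
    {
        // 1. Embed the question locally with the MiniLM ONNX model.
        float[] queryVector = embedder.Embed(question);

        // 2. Retrieve the most similar document chunks from ChromaDB.
        IReadOnlyList<string> chunks = store.Query(queryVector, topK: 3);

        // 3. Stuff the retrieved context into a prompt and ask the local Llama model.
        string prompt = $"Answer using only the context below.\n\nContext:\n{string.Join("\n---\n", chunks)}\n\nQuestion: {question}";
        return await chat.CompleteAsync(prompt);
    }
}
```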
- Start a hosted instance of ChromaDB. Follow the instructions here; I recommend using the Docker build for an easy solution.
- Download the All-MiniLM-L6-v2 `.onnx` model and vocab files from here.
- Download the `.gguf` LLAMA weights, for example Llama 2 Chat 7b.
- Update `appsettings.json` to point to your downloaded models and the ChromaDB endpoint:
```json
{
  "AppSettings": {
    "ChromaDbUrl": "http://chroma-endpoint",
    "ChromaDocumentCollection": "chroma-collection-name",
    "Separators": [ "\n\n", "\n", " ", "" ],
    "AllMiniV2Vocab": "path/to/vocab.txt",
    "AllMiniV2Model": "path/to/model.onnx",
    "ChatModelPath": "path/to/weights.gguf"
  }
}
```
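For reference, here is a minimal sketch of how this section could be bound to a typed options class in `Program.cs`. The property names simply mirror the JSON keys; how the demo app actually consumes the section may differ.

```csharp
// AppSettings.cs — mirrors the "AppSettings" section of appsettings.json.
public sealed class AppSettings
{
    public string ChromaDbUrl { get; set; } = "";
    public string ChromaDocumentCollection { get; set; } = "";
    public string[] Separators { get; set; } = Array.Empty<string>();
    public string AllMiniV2Vocab { get; set; } = "";
    public string AllMiniV2Model { get; set; } = "";
    public string ChatModelPath { get; set; } = "";
}

// Program.cs — bind the section so services can inject IOptions<AppSettings>.
var builder = WebApplication.CreateBuilder(args);
builder.Services.Configure<AppSettings>(builder.Configuration.GetSection("AppSettings"));
```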
- Run the app with `dotnet build` then `dotnet run`, or run it from Visual Studio.
- Add some documents. (Currently only supports pasting text.)
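The `Separators` list in `appsettings.json` suggests pasted text is split into chunks before embedding. Below is a rough sketch of that kind of splitter; the chunk size, class, and method names are illustrative, not the app's actual implementation.

```csharp
static class TextChunker
{
    // Split on the coarsest separator first; re-split any piece that is still too
    // long using the remaining, finer separators ("\n\n" -> "\n" -> " " -> "").
    public static IEnumerable<string> Split(string text, string[] separators, int maxChars = 500)
    {
        if (text.Length <= maxChars || separators.Length == 0)
        {
            yield return text;
            yield break;
        }

        var separator = separators[0];
        var parts = separator.Length == 0
            ? text.Select(c => c.ToString())                              // last resort: character level
            : text.Split(separator, StringSplitOptions.RemoveEmptyEntries);

        foreach (var part in parts)
        {
            if (part.Length <= maxChars)
                yield return part;
            else
                foreach (var chunk in Split(part, separators[1..], maxChars))
                    yield return chunk;
        }
    }
}
```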
- Find answers via Vector Search.
- Find answers via the LLM.
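Under the hood, the LLM answer step presumably embeds the question, pulls the closest chunks from ChromaDB, and feeds them to the local Llama model. A minimal sketch of the generation piece (the `IChatModel` seam from the overview above) backed by LLamaSharp, assuming a recent package version; class and parameter names may differ between LLamaSharp releases:

```csharp
using System.Text;
using LLama;
using LLama.Common;

// Hypothetical LLamaSharp-backed chat model; not the app's actual wrapper.
public sealed class LocalChatModel : IDisposable
{
    private readonly LLamaWeights _weights;
    private readonly LLamaContext _context;
    private readonly InteractiveExecutor _executor;

    public LocalChatModel(string ggufPath)
    {
        // ggufPath is the same file referenced by ChatModelPath in appsettings.json.
        var modelParams = new ModelParams(ggufPath)
        {
            ContextSize = 4096,
            GpuLayerCount = 0   // CPU-only; raise if a GPU-enabled backend is installed
        };

        _weights = LLamaWeights.LoadFromFile(modelParams);
        _context = _weights.CreateContext(modelParams);
        _executor = new InteractiveExecutor(_context);
    }

    public async Task<string> CompleteAsync(string prompt)
    {
        // Stream tokens from the local model and collect them into a single answer.
        var sb = new StringBuilder();
        var inferenceParams = new InferenceParams { MaxTokens = 256 };
        await foreach (var token in _executor.InferAsync(prompt, inferenceParams))
            sb.Append(token);
        return sb.ToString();
    }

    public void Dispose()
    {
        _context.Dispose();
        _weights.Dispose();
    }
}
```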