docs: integrations/AnythingLLM #91
Labels: type: documentation
AnythingLLM
AnythingLLM lets you embed documents and web URLs: it cuts them into chunks and stores them in a vector database.
AnythingLLM uses the vector database to find the chunks with the highest semantic similarity to your query, then adds those as context to the prompt that is sent to the LLM running on Jan. You can also 'pin' a particular document to paste it into the context in its entirety; how well this pinning works depends on how well the model you use can handle large contexts.
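To make the retrieval step concrete, here is a minimal sketch of the idea in Python. This is not AnythingLLM's actual implementation; the `top_k` value, the chunk separator, and the prompt wording are illustrative assumptions.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Semantic similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query_vec: np.ndarray, chunk_vecs: list, chunks: list, top_k: int = 3) -> list:
    # Rank the stored chunks by similarity to the query and keep the best ones.
    scores = [cosine_similarity(query_vec, v) for v in chunk_vecs]
    ranked = sorted(zip(scores, chunks), key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in ranked[:top_k]]

def build_prompt(query: str, context_chunks: list) -> str:
    # The retrieved chunks are pasted into the prompt as context for the LLM.
    context = "\n---\n".join(context_chunks)
    return f"Use the following context to answer.\n\n{context}\n\nQuestion: {query}"
```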
Tip
Mistral 7B instruct v0.2 can handle contexts up to 32k tokens without a 'lost-in-the-middle' problem, which makes it a good candidate when you want to run a model locally for a RAG use-case. Another good option is a large-context flavour of Llama 3; there are 32k, 64k, and even 262k context versions. Make sure you have enough (V)RAM though!
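When pinning entire documents, it is worth a rough sanity check that they fit in the model's context window. A back-of-the-envelope sketch, assuming the common heuristic of roughly four characters per token (an estimate, not an exact tokenizer count):

```python
CONTEXT_WINDOW = 32_000  # e.g. Mistral 7B Instruct v0.2

def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    return len(text) // 4

def fits_in_context(document: str, reserved_for_answer: int = 1_000) -> bool:
    # Leave some room in the window for the question and the model's answer.
    return estimate_tokens(document) + reserved_for_answer <= CONTEXT_WINDOW
```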
On top of threads, AnythingLLM adds the concept of workspaces. Per workspace you can embed sets of documents that belong together, so you can keep separate workspaces for asking questions about different topics.
This extends Jan to more advanced RAG applications; Jan itself can currently only attach one document to a thread at a time.
Setup
Jan - local API
1. In Jan, click the `<>` button in the bottom left corner to start the local API server.

AnythingLLM

1. Set the LLM base URL to `http://<IP>:<port>/v1`; if you used the defaults that would be `http://127.0.0.1:1337/v1`.
2. Set the chat model to the `"id"` of the model you run in Jan. This looks something like `llama-3-8b-instruct-32k-q8`. You can smoke-test the endpoint with the snippet below.
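Jan's local API server is OpenAI-compatible, so a quick request against the chat completions endpoint confirms that AnythingLLM will be able to reach it. A minimal sketch, assuming the default address and an example model id (substitute the `"id"` shown in Jan):

```python
import requests

BASE_URL = "http://127.0.0.1:1337/v1"  # Jan's default local API address

resp = requests.post(
    f"{BASE_URL}/chat/completions",
    json={
        "model": "llama-3-8b-instruct-32k-q8",  # use the "id" of your model in Jan
        "messages": [{"role": "user", "content": "Hello from the AnythingLLM setup!"}],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```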
Setting up a workspace
Your first RAG
Embed a document in your workspace by clicking the upload arrow (↥) next to the cogwheel (⚙️). This is the principle: you can do the same with text files, audio, CSVs, spreadsheets, and so on.