[Question]: How to use Own Model with Chroma Vector DB #8745

Closed
1 task done
taynotfound opened this issue Nov 7, 2023 · 3 comments
Labels
question Further information is requested

Comments

@taynotfound

Question Validation

  • I have searched both the documentation and discord for an answer.

Question

Basically what the title says.
I'm on a MacBook Pro 2019 (i9, 8th gen, 32 GB RAM, Vega Pro 20).

Here is my code:

from llama_index import (
    SimpleDirectoryReader,
    VectorStoreIndex,
    ServiceContext,
)
from llama_index.llms import LlamaCPP
from llama_index.llms.llama_utils import messages_to_prompt, completion_to_prompt

model_url = "https://huggingface.co/TheBloke/Mistral-7B-v0.1-GGUF/resolve/main/mistral-7b-v0.1.Q5_K_M.gguf"
llm = LlamaCPP(
    # You can pass in the URL to a GGML model to download it automatically
    # optionally, you can set the path to a pre-downloaded model instead of model_url
    model_path="/home/tay/AGVMastril/em_german_13b_v01-GGUF.gguf",
    temperature=0.1,
    max_new_tokens=4048,
    # llama2 has a context window of 4096 tokens, but we set it lower to allow for some wiggle room
    context_window=3900,
    # kwargs to pass to __call__()
    generate_kwargs={},
    # kwargs to pass to __init__()
    # set to at least 1 to use GPU
    # model_kwargs={"n_gpu_layers": 1},
    # transform inputs into Llama2 format
    # messages_to_prompt=messages_to_prompt,
    # completion_to_prompt=completion_to_prompt,
    verbose=True,
)
response_iter = llm.stream_complete(
    "Schreib mir eine kleine Geschichte über eine Katze."
)
for response in response_iter:
    print(response.delta, end="", flush=True)
print(response.text)

@taynotfound taynotfound added the question Further information is requested label Nov 7, 2023

dosubot bot commented Nov 7, 2023

🤖

Hello @taynotfound,

I'm Dosu, a bot designed to assist you with LlamaIndex. I can help you debug issues, answer any questions you might have, and guide you on how to contribute to the repository. While we're waiting for a human maintainer to join us, I'm here to assist you.

I'm currently reviewing your issue and will get back to you with a response soon.


dosubot bot commented Nov 7, 2023

🤖

Based on the code you've provided, it seems like you're trying to use a custom model with the LlamaCPP class in the LlamaIndex library. The LlamaCPP class is designed to interact with custom models, specifically those that are compatible with the llama_cpp library.

In your code, you've specified the model_path parameter in the LlamaCPP constructor. This should be the path to your custom model. If you want to use a model from Hugging Face, you can use the model_url parameter instead.

Here's a brief explanation of some of the parameters you're using:

  • temperature: This controls the randomness of the output. A higher value will make the output more random, while a lower value will make it more deterministic.
  • max_new_tokens: This is the maximum number of tokens to generate.
  • context_window: This is the maximum number of context tokens for the model. It's set lower than the model's actual context window to allow for some wiggle room.
  • generate_kwargs: These are keyword arguments used for generation. You can pass any additional arguments for the model's generate method here (a short sketch follows this list).
  • verbose: This controls whether to print verbose output.
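
For example, here is a rough sketch of non-default generate_kwargs. The specific keys (top_p, repeat_penalty, stop) come from llama-cpp-python's completion parameters and may differ between versions, and the model path is reused from your snippet:

from llama_index.llms import LlamaCPP

llm = LlamaCPP(
    model_path="/home/tay/AGVMastril/em_german_13b_v01-GGUF.gguf",
    temperature=0.1,
    max_new_tokens=512,
    context_window=3900,
    # forwarded to the completion call in llama-cpp-python;
    # these keys are examples, check your installed version for the exact names
    generate_kwargs={"top_p": 0.9, "repeat_penalty": 1.1, "stop": ["</s>"]},
    # forwarded to the model constructor; n_gpu_layers > 0 offloads layers to the GPU
    model_kwargs={"n_gpu_layers": 1},
    verbose=True,
)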

The stream_complete method is used to generate completions for a given prompt. In your case, the prompt is "Schreib mir eine kleine Geschichte über eine Katze." The method returns an iterator of CompletionResponse objects, which you can print to see the generated text.

Regarding the use of your own model with Chroma Vector DB, you can refer to the code provided in the context. It shows how to use the ChromaReader class to load data from a persisted Chroma collection and query it using a SummaryIndex.
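
As a rough, unverified sketch of that ChromaReader + SummaryIndex pattern (the collection name, persist directory, placeholder query vector, and the query_vector/limit parameter names are assumptions based on the docs of that era, so double-check them against your installed llama_index version):

from llama_index import SummaryIndex
from llama_index.readers.chroma import ChromaReader

# Load nodes from an already-persisted Chroma collection.
reader = ChromaReader(
    collection_name="my_collection",  # placeholder name
    persist_directory="./chroma_db",  # placeholder path
)

# ChromaReader retrieves by embedding, so the query vector should come from the
# same embedding model that was used to build the collection.
query_vector = [0.0] * 384  # placeholder; use a real embedding here
documents = reader.load_data(query_vector=query_vector, limit=5)

# Build a small index over the retrieved documents and query it.
# Pass service_context=... (with your LlamaCPP llm) to avoid the default OpenAI models.
index = SummaryIndex.from_documents(documents)
query_engine = index.as_query_engine()
print(query_engine.query("What are these documents about?"))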

If you're still having trouble, could you please provide more details about the issue you're facing? For example, are you getting any error messages? What is the expected output and how does it differ from the actual output? This information will help me provide a more accurate solution.


@logan-markewich
Collaborator

@taynotfound you just have to put the LLM into the service context and off you go. You probably also want to use a local embedding model.

service_context = ServiceContext.from_defaults(llm=llm, embed_model="local:BAAI/bge-base-en-v1.5")

index = VectorStoreIndex.from_documents(documents, service_context=service_context)
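
For completeness, here is a rough end-to-end sketch of combining that service context with a Chroma-backed vector store (assuming llama-index ~0.9 and the chromadb client; the ./chroma_db and ./data paths and the collection name are placeholders):

import chromadb
from llama_index import (
    ServiceContext,
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
)
from llama_index.vector_stores import ChromaVectorStore

# Persist Chroma locally and wrap the collection for llama_index.
chroma_client = chromadb.PersistentClient(path="./chroma_db")
chroma_collection = chroma_client.get_or_create_collection("my_collection")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# llm is the LlamaCPP instance from the original snippet; the embedding model
# runs locally, so no hosted API is needed.
service_context = ServiceContext.from_defaults(
    llm=llm,
    embed_model="local:BAAI/bge-base-en-v1.5",
)

# Embed the documents into Chroma and query them with the local LLM.
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
    service_context=service_context,
)

query_engine = index.as_query_engine()
print(query_engine.query("Summarize the documents briefly."))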
