### What you will find here
- Creating a Chroma collection and vector store index
- Using the main prompt templates in llama-index
    + Use the index as a `query engine`
    + Load an LLM
    + Use the index as a `chat` engine

In [1]:
import os
from bubls.utils.chromadb_utils import ConfigurableChromaIndex
from llama_index.llms.openai import OpenAI
from llama_index.core.memory import ChatMemoryBuffer


### Create Chroma Index

In [3]:
chroma_index = ConfigurableChromaIndex(
    "williams_family_collection", "data/williams_family/biographies"
)

### Index as query engine

In [5]:
query_engine = chroma_index.index.as_query_engine()
response = query_engine.query("How did Paulo and his siblings spend their days growing up?")
print(response)

Paulo and his siblings spent their days growing up exploring the world around them and creating cherished memories that would last a lifetime. They navigated the ups and downs of adolescence together, supporting each other through thick and thin.


### Chat Example

Load an LLM

In [6]:
llm = OpenAI(model="gpt-3.5-turbo")

The retrieved context can take up a large share of the LLM's context window, so we cap the chat history with a smaller token limit.

In [7]:
memory = ChatMemoryBuffer.from_defaults(token_limit=3900)
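
To make the effect of a token limit concrete, here is a minimal sketch of how a token-capped history could be trimmed. It is not llama-index's implementation: `count_tokens` and `trim_history` are hypothetical helpers, and counting whitespace-separated words is only a crude stand-in for a real tokenizer.

```python
def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def trim_history(messages: list[str], token_limit: int) -> list[str]:
    """Keep the most recent messages whose combined token count fits the limit."""
    kept: list[str] = []
    used = 0
    # Walk from newest to oldest so the most recent turns are kept first.
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > token_limit:
            break
        kept.append(message)
        used += cost
    # Restore chronological order before returning.
    kept.reverse()
    return kept


history = [
    "user: hello there",
    "assistant: hi, how can I help you today?",
    "user: tell me about the family",
]
# With a small limit, the oldest message is dropped first.
print(trim_history(history, token_limit=14))
```

The smaller the limit, the more of the context window is left for retrieved documents, at the cost of the model forgetting earlier turns sooner.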

In [9]:
CHAT_SYSTEM_CONTENT = """
    Here are the relevant documents for the context:
    {context_str}
    ----
    Given the context information and not prior knowledge,
    answer the question as briefly as possible.
    Structure your response as a list of facts.
"""
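
The `{context_str}` placeholder is where the retrieved documents end up. llama-index performs this substitution internally; the sketch below only illustrates the idea with plain `str.format`, and `fill_context`, the short `template`, and the sample `chunks` are hypothetical names and values introduced for illustration.

```python
def fill_context(template: str, retrieved_chunks: list[str]) -> str:
    """Join retrieved text chunks and substitute them for {context_str}."""
    context_str = "\n\n".join(chunk.strip() for chunk in retrieved_chunks)
    return template.format(context_str=context_str)


# A shortened template with the same placeholder name as the one above.
template = "Context:\n{context_str}\n----\nAnswer using only the context above."
# Placeholder strings standing in for retrieved document chunks.
chunks = ["First retrieved passage.", "Second retrieved passage."]

print(fill_context(template, chunks))
```

One practical consequence of this style of substitution: any other literal braces in a template would have to be escaped as `{{` and `}}`.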

In [11]:
chat_engine = chroma_index.index.as_chat_engine(
    chat_mode="condense_plus_context",
    memory=memory,
    llm=llm,
    context_prompt=CHAT_SYSTEM_CONTENT,
    verbose=False,
)
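
The `condense_plus_context` mode condenses the conversation and the latest message into a standalone question, retrieves context for that question, and then answers with the context in the prompt. The following is a rough sketch of that control flow, not the library's code: every function here is a hypothetical stand-in, and skipping the condense step when the history is empty is an assumption of this sketch.

```python
from typing import Callable


def condense_plus_context_turn(
    history: list[str],
    user_message: str,
    condense: Callable[[list[str], str], str],
    retrieve: Callable[[str], list[str]],
    answer: Callable[[str, str], str],
) -> str:
    """One chat turn: condense, retrieve, then answer with the retrieved context."""
    # 1. Rewrite the latest message as a standalone question using the history.
    standalone = condense(history, user_message) if history else user_message
    # 2. Retrieve context for the standalone question.
    context = "\n".join(retrieve(standalone))
    # 3. Answer the user's message with the retrieved context.
    reply = answer(context, user_message)
    # 4. Record the turn so later questions can refer back to it.
    history.extend([f"user: {user_message}", f"assistant: {reply}"])
    return reply


# Toy stand-ins for the LLM and the retriever, for illustration only.
def toy_condense(history: list[str], message: str) -> str:
    return f"{message} (given: {history[-1]})"


def toy_retrieve(question: str) -> list[str]:
    return [f"doc matching '{question}'"]


def toy_answer(context: str, message: str) -> str:
    return f"answer to '{message}' using [{context}]"


history: list[str] = []
print(condense_plus_context_turn(history, "first question", toy_condense, toy_retrieve, toy_answer))
print(condense_plus_context_turn(history, "follow-up", toy_condense, toy_retrieve, toy_answer))
```

The second call shows why the condense step matters: the retriever sees a question that carries information from the earlier turn, rather than the bare follow-up.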

In [12]:
question = "How many children did Immanuel II and Matilda have?"
print(chat_engine.chat(question))

- Immanuel II and Matilda had three children: Immanuel III, Raphael, and Sora.
