## Load env variables

In [None]:
%load_ext dotenv
%dotenv
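
A missing key usually only surfaces later, when the first LLM call fails. The following is a minimal sketch of a guard that fails early instead. The helper name `require_env` is hypothetical and not part of `dotenv`; it only reads from the process environment using the standard library.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name!r} is not set; check your .env file."
        )
    return value
```

After the `%dotenv` cell has run, calling `require_env("GROQ_API_KEY")` should either return the key or raise immediately with a readable message.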

# Ingesting data and creating an index
The first part in any RAG adventure is creating an index.

To do this, we first load our data, which gives us a list of `Document` objects. After that, each `Document` still has to be transformed into a `Node`.

Documents and nodes contain not only text but also any useful metadata (tags, where the data came from, ...), all of which we can set ourselves.
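
To make the relationship between documents, nodes, and metadata concrete, here is a conceptual sketch in plain Python. It is not the LlamaIndex implementation: `SimpleDocument`, `SimpleNode`, and `split_into_nodes` are hypothetical names, and splitting on blank lines is just an assumed chunking strategy for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleDocument:
    """Hypothetical stand-in for a loaded document: text plus metadata."""
    text: str
    metadata: dict = field(default_factory=dict)


@dataclass
class SimpleNode:
    """Hypothetical stand-in for a node: one chunk of a document."""
    text: str
    metadata: dict = field(default_factory=dict)


def split_into_nodes(document: SimpleDocument) -> list[SimpleNode]:
    """Split a document on blank lines; each chunk inherits the document's metadata."""
    chunks = [c.strip() for c in document.text.split("\n\n") if c.strip()]
    return [
        SimpleNode(text=chunk, metadata={**document.metadata, "chunk": i})
        for i, chunk in enumerate(chunks)
    ]


doc = SimpleDocument(
    text="Prometheus scrapes metrics.\n\nAlertmanager routes alerts.",
    metadata={"file_name": "notes.md", "tags": ["monitoring"]},
)
nodes = split_into_nodes(doc)
for node in nodes:
    print(node.metadata["file_name"], node.metadata["chunk"], node.text)
```

The point of the sketch is that metadata set on a document travels with every chunk derived from it, which is what later makes it possible to trace an answer back to its source file.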

## Load data from Markdown files
The first step is loading data from Markdown files. If you need other file formats or locations you can take a look at the plethora of readers on llamahub.ai.

In [None]:
from llama_index.core import SimpleDirectoryReader
from llama_index.readers.file import MarkdownReader

# Markdown parser will remove hyperlinks and images by default
markdown_parser = MarkdownReader()
file_extractors = {
    ".md": markdown_parser,
}

documents = SimpleDirectoryReader(
    './data',
    file_extractor=file_extractors
).load_data()

if len(documents) >= 1:
    print(documents[0])

## From Documents to an index
Now that we have our Documents, the next step is to create a searchable index from them.

In [None]:
from llama_index.indices.managed.colbert import ColbertIndex

# The loaded Documents are passed directly as the nodes of the index
index = ColbertIndex(
    nodes=documents,
    index_name='factoids'
)

## Storing the index
To store the index, call the `.persist` method on it. This writes the index to the local filesystem, so that it can be loaded by, for example, a different process. (Run `python3 interactive.py` to see it in action.)

In [None]:
# Store index
index.persist(persist_dir="./index")

## Querying example
Once the index has been created, you can start querying it.
To do this, first turn the index into a query engine using `index.as_query_engine()`.

In this case, Groq is used for inference (generating the answers), but it can be swapped for any of the LLMs supported by LlamaIndex.

Another useful feature is that results can be inspected, so relevant metadata (such as the source document, file path, ...) can be extracted from them.

In [None]:
from os import getenv
from llama_index.llms.groq import Groq

query = "How are alerts configured in Prometheus? Please provide an example as well."

llm = Groq(
    model="llama3-8b-8192",
    api_key=getenv("GROQ_API_KEY")
)

query_engine = index.as_query_engine(
    llm=llm,
    top_k=5
)

result = query_engine.query(
    query
)

print("Answer: {}".format(result))
print("Sources: \n{}".format(result.get_formatted_sources(length=500)))
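
As a sketch of how the result could be inspected programmatically, the helper below collects the score, file path, and a short preview for each source. The name `summarize_sources` is hypothetical. It assumes the response exposes a `source_nodes` list whose items have `.score` and `.node`, and that the reader stored a `file_path` entry in each node's metadata; if that key is absent, a placeholder is used.

```python
def summarize_sources(response) -> list[dict]:
    """Collect score, file path, and a text preview for each retrieved source node."""
    summary = []
    for source in response.source_nodes:
        metadata = source.node.metadata
        summary.append({
            "score": source.score,
            # file_path is assumed to be set by the reader; fall back if it is absent
            "file_path": metadata.get("file_path", "<unknown>"),
            # keep only the first 80 characters of the node text
            "preview": source.node.get_content()[:80],
        })
    return summary
```

With the cell above executed, `summarize_sources(result)` should return one dictionary per retrieved node, which is easier to filter or log than the formatted string.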

## Chatting instead of querying
If you want to truly chat with the documents/index, you need to turn it into a chat engine.
A chat engine keeps the history and other relevant information, which allows multi-turn conversations and follow-up questions.

In [None]:
from llama_index.core.chat_engine.types import ChatMode

chat_engine = index.as_chat_engine(
    llm=llm,
    chat_mode=ChatMode.CONDENSE_PLUS_CONTEXT,
)

chat_engine.reset()

# Every chat call is added to the history

response = chat_engine.chat(
    "What service discovery mechanisms are there in Prometheus?"
)

print("Chat response: {}".format(response))

response = chat_engine.chat(
    "Can you show an example of how to do this for all of them?"
)

print("Chat response: {}".format(response))


response = chat_engine.chat(
    "Please create a config that scrapes from a node exporter which is discovered by using HTTP service discovery."
)
print("Chat response: {}".format(response))
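
To see what the chat engine has accumulated across the turns above, the history can be printed. The following is a small sketch: `format_history` is a hypothetical helper, and it assumes each message exposes `.role` and `.content`, and that the engine makes its messages available as `chat_engine.chat_history`.

```python
def format_history(messages) -> str:
    """Render chat messages as 'role: content' lines, one line per message."""
    lines = []
    for message in messages:
        # role may be an enum (with a .value) or a plain string
        role = getattr(message.role, "value", message.role)
        lines.append(f"{role}: {message.content}")
    return "\n".join(lines)
```

Calling `print(format_history(chat_engine.chat_history))` after the three questions should list the user and assistant turns in order, and `chat_engine.reset()` clears them again.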