Read this blog post for context: Retrieval-Augmented Generation: How to Use Your Data to Guide LLMs
Set up the environment and install the dependencies:
conda env create -f environment.yaml
pip install -r requirements.txt
Run the repository extraction and Markdown chunking flow.
Look in /flows/config/repo_params.py to specify new configurations; be mindful of repository licensing!
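For illustration only, a new entry might look something like the sketch below; the field names are hypothetical, so check repo_params.py itself for the real schema.

```python
# Hypothetical repository configuration entry; the actual schema lives in
# flows/config/repo_params.py and may differ.
NEW_REPO = {
    "repo_url": "https://github.com/Netflix/metaflow",  # pick a permissively licensed repo
    "branch": "master",
    "include_globs": ["docs/**/*.md"],  # only extract Markdown files
}
```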
python flows/markdown_chunker.py run
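If you are new to Metaflow, the flow you just ran follows the standard FlowSpec pattern: a class with @step methods chained by self.next. Here is a minimal, self-contained sketch of that shape; it is not the repo's actual MarkdownChunker, whose steps do real repository extraction and chunking.

```python
from metaflow import FlowSpec, step

class ChunkerSketchFlow(FlowSpec):
    """Minimal sketch of a Metaflow flow; each @step runs as its own task."""

    @step
    def start(self):
        # The real flow fetches repositories here; we fake two Markdown docs.
        self.documents = ["# Doc one\n\nSome text.", "# Doc two\n\nMore text."]
        self.next(self.chunk)

    @step
    def chunk(self):
        # Naive stand-in for Markdown chunking: split on blank lines.
        self.chunks = [p for d in self.documents for p in d.split("\n\n") if p.strip()]
        self.next(self.end)

    @step
    def end(self):
        print(f"Produced {len(self.chunks)} chunks.")

if __name__ == "__main__":
    ChunkerSketchFlow()
```

Anything assigned to self in a step (like self.chunks above) is persisted as an artifact, which is what the Client API reads in the notebooks below.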
Post-process the results of the MarkdownChunker flow.
python flows/data_table_processor.py run
Let's inspect the results of these workflows in a notebook.
If you are in a Conda environment, you need to install the kernel like so before opening the notebooks:
pip install ipykernel
python -m ipykernel install --user --name rag-demo --display-name "RAG demo"
Open notebooks/analyze_chunks.ipynb to use the Metaflow Client API to explore the results of the flow runs:
jupyter notebook notebooks/analyze_chunks.ipynb
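If you want a quick look without the notebook, the Metaflow Client API fetches artifacts from past runs in a few lines. Note that the chunks artifact name below is an assumption; list the attributes of run.data to see what the flow actually stored.

```python
from metaflow import Flow

# Grab the most recent successful run of the chunking flow.
run = Flow("MarkdownChunker").latest_successful_run
print(f"Inspecting run {run.id} created at {run.created_at}")

# Artifacts assigned to `self` in a step show up on run.data.
# "chunks" is an assumed artifact name; adjust to what the flow stores.
chunks = run.data.chunks
print(f"{len(chunks)} chunks; first one:\n{chunks[0]}")
```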
Now that we have extracted and chunked a bunch of source objects, let's index them and use them to augment an LLM's context window.
You can use either a Llama 2 model or the OpenAI API. Letting LlamaIndex run Llama 2 locally doesn't require an API key, but it takes a lot longer; the OpenAI API is faster and cheap. You can create/find your API key here.
Open notebooks/llama_index_exploration.ipynb
jupyter notebook notebooks/llama_index_exploration.ipynb
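The notebook walks through this in detail, but the core LlamaIndex pattern is short. Here is a minimal sketch using the OpenAI-backed defaults; import paths vary across LlamaIndex versions (recent releases use llama_index.core), and OPENAI_API_KEY must be set as described below.

```python
from llama_index.core import Document, VectorStoreIndex

# Wrap your chunks (e.g., the ones produced by the flows above) as Documents.
documents = [Document(text="Metaflow persists artifacts assigned to self in a step.")]

# Build an in-memory vector index; by default this embeds with OpenAI.
index = VectorStoreIndex.from_documents(documents)

# Retrieved chunks are stuffed into the LLM's context window before answering.
query_engine = index.as_query_engine()
print(query_engine.query("How does Metaflow persist artifacts?"))
```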
Go here, copy your key value, and set the following environment variable:
export OPENAI_API_KEY=<YOUR KEY>
Then launch the Streamlit chat app:
streamlit run chat_app.py
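For orientation, a Streamlit chat UI boils down to the sketch below. This is not the repo's chat_app.py; answer_with_rag is a hypothetical stand-in for a real retrieval call such as query_engine.query.

```python
import streamlit as st

st.title("RAG demo chat")

def answer_with_rag(question: str) -> str:
    # Hypothetical stand-in for retrieval + LLM, e.g. query_engine.query(question).
    return f"(stub) You asked: {question}"

# st.chat_input renders a chat box and returns None until the user submits.
if prompt := st.chat_input("Ask about the indexed docs"):
    with st.chat_message("user"):
        st.write(prompt)
    with st.chat_message("assistant"):
        st.write(answer_with_rag(prompt))
```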
There are two indexing workflows in the /flows folder: one indexes into Pinecone as the vector database, and the other uses the open-source LanceDB.
Go here, create a Pinecone account if you don't already have one, copy your API key, and set the following environment variable:
export PINECONE_API_KEY=<YOUR KEY>
Set the following environment variable too:
export GCP_ENVIRONMENT=us-central1-gcp
python flows/pinecone_index.py run
python flows/lancedb_index.py run
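The two stores behave quite differently: Pinecone is a managed service addressed by your API key and environment, while LanceDB is embedded and writes to a local directory. Below is a rough sketch of each client's upsert-and-query shape, using toy 3-dimensional vectors; the index and table names are assumptions, and the real flows use actual embeddings. The Pinecone calls follow the classic environment-based client (pinecone-client v2, matching the GCP_ENVIRONMENT setting above); newer client versions use pinecone.Pinecone(...) instead.

```python
import os

import lancedb
import pinecone

# Pinecone: managed vector DB. Assumes an index named "rag-demo" with
# dimension 3 already exists in your environment.
pinecone.init(api_key=os.environ["PINECONE_API_KEY"],
              environment=os.environ["GCP_ENVIRONMENT"])
pc_index = pinecone.Index("rag-demo")
pc_index.upsert(vectors=[("chunk-0", [0.1, 0.2, 0.3], {"text": "hello"})])
print(pc_index.query(vector=[0.1, 0.2, 0.3], top_k=1, include_metadata=True))

# LanceDB: embedded vector DB, persisted to a local directory.
db = lancedb.connect("./lancedb-data")
table = db.create_table(
    "chunks", data=[{"vector": [0.1, 0.2, 0.3], "text": "hello"}], mode="overwrite"
)
print(table.search([0.1, 0.2, 0.3]).limit(1).to_list())
```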