## Preparation

As always, we have some work to do before we can jump straight into the workflows.

Let's set up some boilerplate, add some dependencies, and get ready to rock!

### Async Boilerplate:

Since "workflows make async a first-class citizen" and we're running these examples in a Jupyter Notebook (which already has an active event loop!), we'll use the `nest_asyncio` library so that we can take full advantage of the async capabilities of the workflows we're building.

In [1]:
import nest_asyncio

nest_asyncio.apply()
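
To see what the patch works around, here is a minimal, standard-library-only sketch of the failure. It assumes you run it as a plain Python script rather than inside the notebook, and the coroutine names `inner` and `outer` are made up for illustration: `outer` plays the role of the notebook's already-running loop, and the nested `asyncio.run` call stands in for a library that tries to start a loop of its own.

```python
import asyncio


async def inner() -> str:
    return "done"


async def outer() -> str:
    # We are already inside a running event loop here, much like a notebook cell.
    coro = inner()
    try:
        # Starting a second loop from inside a running one is not allowed by default.
        asyncio.run(coro)
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return f"RuntimeError: {exc}"
    return "no error"


message = asyncio.run(outer())
print(message)
```

With `nest_asyncio.apply()` in effect, that nested call is allowed to run instead of raising, which is why we apply it once at the top of the notebook.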

### Installing Dependencies:

Next, we're going to install our dependencies!

We'll want to get the [Tavily Research Tool](https://llamahub.ai/l/tools/llama-index-tools-tavily-research?from=), which will allow us to do open research as part of our Corrective RAG implementation (more details on that later).

We'll also want to grab our `llama-index-utils-workflow` package which will let us draw all possible paths through the resultant workflow.

In [2]:
%pip install -qU llama-index llama-index-utils-workflow

In [3]:
%pip install -qU pinecone llama-index-vector-stores-pinecone

In [41]:
%pip install -qU llama-index-embeddings-mistralai llama-index-llms-text-generation-inference

In [11]:
%pip install -qU llama-index-core llama-parse llama-index-readers-file python-dotenv

In [77]:
%pip install -qU llama-index-readers-file

In [5]:
import os
import getpass

os.environ["PINECONE_API_KEY"] = getpass.getpass("Pinecone API Key:")

Pinecone API Key:··········


In [12]:
os.environ["LLAMA_CLOUD_API_KEY"] = getpass.getpass("Llama Cloud API Key")

Llama Cloud API Key··········


In [22]:
os.environ["HF_TOKEN"] = getpass.getpass("Huggingface Token:")

Huggingface Token:··········


In [42]:
os.environ["MISTRAL_API_KEY"] = getpass.getpass("Mistral API Key:")

Mistral API Key:··········
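
The four cells above prompt every time they are re-run. If that gets tedious, one option is a small helper that only prompts when the variable is missing. This is a sketch rather than part of the original notebook: the helper name `ensure_env` and the `EXAMPLE_API_KEY` variable are hypothetical, and the demo pre-sets a placeholder value so the block runs without asking for input.

```python
import getpass
import os


def ensure_env(name: str, prompt: str) -> str:
    """Return os.environ[name], prompting for it only if it is missing or empty."""
    if not os.environ.get(name):
        os.environ[name] = getpass.getpass(prompt)
    return os.environ[name]


# Demo with a placeholder variable so that running this block never prompts.
os.environ["EXAMPLE_API_KEY"] = "placeholder-value"
value = ensure_env("EXAMPLE_API_KEY", "Example API Key:")
print(value == "placeholder-value")
```

In the notebook you would call it with the real names instead, for example `ensure_env("PINECONE_API_KEY", "Pinecone API Key:")`.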


In [10]:
!git clone https://github.com/AI-Maker-Space/DataRepository.git

Cloning into 'DataRepository'...
remote: Enumerating objects: 110, done.
remote: Counting objects: 100% (102/102), done.
remote: Compressing objects: 100% (88/88), done.
remote: Total 110 (delta 34), reused 35 (delta 9), pack-reused 8 (from 1)
Receiving objects: 100% (110/110), 71.41 MiB | 22.89 MiB/s, done.
Resolving deltas: 100% (34/34), done.


In [89]:
from llama_parse import LlamaParse
from llama_index.core import SimpleDirectoryReader

parser = LlamaParse(
    result_type="markdown"  # "markdown" and "text" are available
)

file_extractor = {".pdf": parser}
pdf_documents = SimpleDirectoryReader(input_files=['./DataRepository/RAGATHON/musk_v_openai.pdf'], file_extractor=file_extractor).load_data()
print(len(pdf_documents))

Started parsing the file under job_id 4bdf8676-bde3-4f8b-b215-34ffe36b1083
86


In [87]:
from llama_index.readers.file import PagedCSVReader

# PagedCSVReader turns each CSV row into its own document.
reader = PagedCSVReader()
file_extractor = {".csv": reader}
csv_documents = SimpleDirectoryReader(
    input_files=["./DataRepository/RAGATHON/elon_tweets.csv"], file_extractor=file_extractor
).load_data()
print(len(csv_documents))

2668


In [94]:
all_documents = pdf_documents

In [43]:
from llama_index.embeddings.mistralai import MistralAIEmbedding

model_name="mistral-embed"

embed_model = MistralAIEmbedding(model_name=model_name)

In [57]:
embeddings = embed_model.get_text_embedding("Welcome to the RAGATHON!")
print(len(embeddings))
embedding_dimension = len(embeddings)

1024
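
We keep the dimension around because the Pinecone index we create next needs to know the vector length up front, and it is configured with a cosine metric. As a rough illustration of what that metric computes, here is a pure-Python sketch. The function name `cosine_similarity` and the toy vectors are illustrative only; they are not real `mistral-embed` outputs, and this is not how Pinecone implements the comparison internally.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of the same dimension."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Vectors pointing the same way score close to 1, whatever their length.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
# Perpendicular vectors score 0.
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(same_direction, orthogonal)
```

Because cosine similarity ignores vector length, only the direction of an embedding affects how close two pieces of text appear.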


In [45]:
from llama_index.core import Settings

Settings.embed_model = embed_model

In [58]:
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

In [59]:
index_name = "llamaindex-ragathon-demo-index"

pc.create_index(
    name=index_name,
    dimension=embedding_dimension,
    metric="cosine",
    spec=ServerlessSpec(
        cloud="aws",
        region="us-east-1"
    )
)
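
Re-running the cell above once the index exists will typically fail, since the name is already taken. One way to make the step repeatable is to check for the index first. The sketch below assumes the client exposes `list_indexes().names()`, as recent versions of the Pinecone Python client do (check your installed version). The `ensure_index` helper and the `FakePinecone` stand-in are hypothetical and exist only so the block runs without network access.

```python
from types import SimpleNamespace


def ensure_index(pc, name: str, dimension: int, spec) -> bool:
    """Create the index only if it is missing. Returns True if it was created."""
    if name in pc.list_indexes().names():
        return False
    pc.create_index(name=name, dimension=dimension, metric="cosine", spec=spec)
    return True


class FakePinecone:
    """Stand-in for the real client (an assumption for this demo only)."""

    def __init__(self):
        self._names = []

    def list_indexes(self):
        # Mimics an object with a .names() method returning the index names.
        return SimpleNamespace(names=lambda: list(self._names))

    def create_index(self, name, dimension, metric, spec):
        self._names.append(name)


fake_pc = FakePinecone()
first = ensure_index(fake_pc, "demo-index", 1024, spec=None)
second = ensure_index(fake_pc, "demo-index", 1024, spec=None)
print(first, second)  # True False
```

In the notebook, you would pass the real `pc` client, `index_name`, `embedding_dimension`, and the `ServerlessSpec` from the cell above.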

In [60]:
pinecone_index = pc.Index(index_name)

In [68]:
import os
from typing import List, Optional

from llama_index.llms.text_generation_inference import (
    TextGenerationInference,
)

URL = "https://cx7s40y9qdd7zxhr.us-east-1.aws.endpoints.huggingface.cloud"
hf_llm = TextGenerationInference(
    model_url=URL, token=os.environ["HF_TOKEN"]
)

completion_response = hf_llm.complete("To infinity, and")
print(completion_response)



...beyond!


In [63]:
from llama_index.core import PromptTemplate

DEFAULT_RAG_PROMPT = PromptTemplate(
    template="""Use the provided context to answer the question. If you don't know the answer, say you don't know.

    Context:
    {context}

    Question:
    {question}
    """
)
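
To make the two placeholders concrete, here is a sketch of what the filled-in prompt looks like. It uses plain `str.format` as a stand-in for the `PromptTemplate` object, and the chunk strings and question are invented for illustration.

```python
RAG_TEMPLATE = """Use the provided context to answer the question. If you don't know the answer, say you don't know.

Context:
{context}

Question:
{question}
"""

# Hypothetical retrieved text; in the workflow this comes from the retriever.
retrieved_chunks = ["First retrieved chunk.", "Second retrieved chunk."]

prompt_text = RAG_TEMPLATE.format(
    context="\n".join(retrieved_chunks),
    question="What does the first chunk say?",
)
print(prompt_text)
```

The workflow below does the same thing: it joins the retrieved node contents with newlines and passes them in as `context`, alongside the user's query as `question`.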

In [72]:
from llama_index.core.workflow import Event
from llama_index.core.schema import NodeWithScore

class PrepEvent(Event):
    """Prep event (prepares for retrieval)."""
    pass

class RetrieveEvent(Event):
    """Retrieve event (gets retrieved nodes)."""

    retrieved_nodes: list[NodeWithScore]

class AugmentGenerateEvent(Event):
    """Query event. Queries given relevant text and search text."""
    relevant_text: str
    search_text: str

In [73]:
from llama_index.core.workflow import (
    Workflow,
    step,
    Context,
    StartEvent,
    StopEvent,
)
from llama_index.core import (
    Document,
    StorageContext,
    VectorStoreIndex,
)
from llama_index.core.base.base_retriever import BaseRetriever
from llama_index.core.query_pipeline import QueryPipeline
from llama_index.vector_stores.pinecone import PineconeVectorStore

class OpenSourceRAG(Workflow):
    @step
    async def ingest(self, ctx: Context, ev: StartEvent) -> StopEvent | None:
        """Ingest step (for ingesting docs and initializing index)."""
        documents: list[Document] | None = ev.get("documents")

        if documents is None:
            return None

        vector_store = PineconeVectorStore(pinecone_index=pinecone_index)
        storage_context = StorageContext.from_defaults(vector_store=vector_store)
        index = VectorStoreIndex.from_documents(
            documents, storage_context=storage_context
        )

        return StopEvent(result=index)

    @step
    async def prepare_for_retrieval(
        self, ctx: Context, ev: StartEvent
    ) -> PrepEvent | None:
        """Prepare for retrieval."""

        model_url = "https://cx7s40y9qdd7zxhr.us-east-1.aws.endpoints.huggingface.cloud"

        query_str: str | None = ev.get("query_str")
        retriever_kwargs: dict | None = ev.get("retriever_kwargs", {})

        if query_str is None:
            return None

        index = ev.get("index")

        llm = TextGenerationInference(
            model_url=model_url,
            token=os.environ["HF_TOKEN"],
            model_name="hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4",
        )
        await ctx.set("rag_pipeline", QueryPipeline(
            chain=[DEFAULT_RAG_PROMPT, llm]
        ))

        await ctx.set("llm", llm)
        await ctx.set("index", index)

        await ctx.set("query_str", query_str)
        await ctx.set("retriever_kwargs", retriever_kwargs)

        return PrepEvent()

    @step
    async def retrieve(
        self, ctx: Context, ev: PrepEvent
    ) -> RetrieveEvent | None:
        """Retrieve the relevant nodes for the query."""
        query_str = await ctx.get("query_str")
        retriever_kwargs = await ctx.get("retriever_kwargs")

        if query_str is None:
            return None

        index = await ctx.get("index", default=None)
        if index is None:
            raise ValueError(
                "Index must be constructed. Run with the 'documents' param first, then pass the result as 'index'."
            )

        retriever: BaseRetriever = index.as_retriever(
            **retriever_kwargs
        )
        result = retriever.retrieve(query_str)
        await ctx.set("query_str", query_str)
        return RetrieveEvent(retrieved_nodes=result)

    @step
    async def augment_and_generate(self, ctx: Context, ev: RetrieveEvent) -> StopEvent:
        """Get result with relevant text."""
        relevant_nodes = ev.retrieved_nodes
        relevant_text = "\n".join([node.get_content() for node in relevant_nodes])
        query_str = await ctx.get("query_str")

        rag_pipeline = await ctx.get("rag_pipeline")

        # Fill the prompt with the retrieved context and the question, then call the LLM.
        response = rag_pipeline.run(
            context=relevant_text, question=query_str
        )

        return StopEvent(result=response.message.content)
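
The query path through this class is driven entirely by event types: each step accepts one kind of event and returns the next. As a rough mental model, here is a toy, synchronous dispatcher that chains handlers the same way. It is not how LlamaIndex implements workflows, and every name in it (`Start`, `Prep`, `Retrieved`, `Stop`, `HANDLERS`, `run`) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Start:
    query: str


@dataclass
class Prep:
    query: str


@dataclass
class Retrieved:
    query: str
    chunks: list


@dataclass
class Stop:
    result: str


def prepare(ev: Start) -> Prep:
    return Prep(query=ev.query.strip())


def retrieve(ev: Prep) -> Retrieved:
    # A real retriever would query the vector index here.
    return Retrieved(query=ev.query, chunks=["chunk A", "chunk B"])


def generate(ev: Retrieved) -> Stop:
    # A real step would call the LLM with the filled-in prompt.
    return Stop(result=f"{ev.query} -> answered from {len(ev.chunks)} chunks")


# Each handler is chosen by the type of the event it accepts.
HANDLERS = {Start: prepare, Prep: retrieve, Retrieved: generate}


def run(event):
    """Pass the event from handler to handler until a Stop event appears."""
    trace = [type(event).__name__]
    while not isinstance(event, Stop):
        event = HANDLERS[type(event)](event)
        trace.append(type(event).__name__)
    return event.result, trace


result, trace = run(Start(query="  Why?  "))
print(trace)  # ['Start', 'Prep', 'Retrieved', 'Stop']
```

In `OpenSourceRAG`, the same chain is `StartEvent` → `PrepEvent` → `RetrieveEvent` → `StopEvent`, with the `ingest` step offering a second, shorter route from `StartEvent` straight to `StopEvent` when `documents` is supplied.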

In [65]:
from llama_index.utils.workflow import draw_all_possible_flows

draw_all_possible_flows(
    OpenSourceRAG, filename="os_rag_workflow.html"
)

os_rag_workflow.html


In [95]:
from llama_index.core import SimpleDirectoryReader

rag_workflow = OpenSourceRAG()
index = await rag_workflow.run(documents=all_documents)

Upserted vectors:   0%|          | 0/87 [00:00<?, ?it/s]

In [111]:
from IPython.display import Markdown, display

response = await rag_workflow.run(
    query_str="Why did Elon Musk sue OpenAI?",
    index=index,
)
display(Markdown(str(response)))



I don't know the specific reason why Elon Musk sued OpenAI. The provided context mentions that Musk contributed more than $15 million to the project and paid much of its overhead expenses, but it does not specify the reason for the lawsuit.

In [97]:
from IPython.display import Markdown, display

response = await rag_workflow.run(
    query_str="In what state was this complaint levied?",
    index=index,
)
display(Markdown(str(response)))



The complaint was levied in the State of California.

---