#### Using a Local LLM with Ollama

Using local LLMs with Ollama offers several advantages over using externally hosted models:

![Ollama](..//images//ollama.jpg)

1. Enhanced Data Privacy and Security

    - On-Premise Control: By running the LLM on your local machine or private server, you retain complete control over your data. This is crucial when working with sensitive or proprietary information.
    - Reduced Risk of Data Leaks: Your data never has to be sent to third-party servers, which minimizes the risk of unauthorized access or accidental exposure.

2. Cost-Effectiveness

    - Free to Use: Ollama is an open-source project, so you can use it without incurring usage fees or subscription costs.
    - No Cloud Costs: You avoid the expenses associated with cloud-based LLM services, especially for high-volume usage or larger models.

3. Customization and Flexibility

    - Model Variety: Ollama supports various open-source models, giving you the freedom to choose the one that best suits your task.
    - Easy Model Switching: You can switch between models or experiment with different configurations without relying on external providers.
    - Fine-tuning: Because the model weights are open, you can fine-tune a model on your own datasets and serve the result locally to get better performance for your particular use case.

4. Offline Access

    - Reliable Availability: Unlike cloud-based LLMs, which can experience downtime or connectivity issues, a local Ollama instance remains usable without an internet connection once the model has been downloaded.

#### In this notebook we will create an adaptive RAG agent system that generates reports for the Senior Executive Team from the SHAP explanations of our model

1. We will use LangSmith to track the chains we create.
2. We will build the whole system as a graph-based agent architecture using LangGraph.
3. The design is inspired by the LangChain example for adaptive RAG: https://github.com/langchain-ai/langgraph/blob/main/examples/rag/langgraph_rag_agent_llama3_local.ipynb

The LLM agents are built on Llama 3. Many other open-source models are also available through Ollama.

![Llama3](..//images//llama3.jpg)

In [1]:
from langchain.prompts import PromptTemplate
from langchain_community.chat_models import ChatOllama
from langchain_core.output_parsers import JsonOutputParser,StrOutputParser


from langchain import hub
from langchain.llms import Ollama
import ollama
from langchain_community.embeddings import OllamaEmbeddings
# nomic_embed = OllamaEmbeddings(model="nomic-embed-text")
#paraphrase-distilroberta-base-v2
#all-distilroberta-v1
#all-MiniLM-L6-v2
#all-MiniLM-L12-v2

### Use HuggingFace embeddings instead of OpenAI embeddings. The models listed above were tried; all-MiniLM-L12-v2 appeared to give the best results
from sentence_transformers import SentenceTransformer
from langchain_community.embeddings import HuggingFaceEmbeddings
hf = HuggingFaceEmbeddings(
    model_name='all-MiniLM-L12-v2'
)


from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import GPT4AllEmbeddings

import os
from dotenv import load_dotenv
load_dotenv()




True

In [2]:
### The objective is to create a global report from the whole dataset rather than a local summary, so we do not use the shap_explanation file


paths = [
    "..//documents//data_dictionary.txt",
    "..//documents//data_summary.txt",
    "..//documents//shap_summary.txt"
]
# Load each text file, then flatten the per-file lists into one list of documents
docs = [TextLoader(path).load() for path in paths]
docs_list = [item for sublist in docs for item in sublist]

text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=400, chunk_overlap=100
)
doc_splits = text_splitter.split_documents(docs_list)

# Add to vectorDB
vectorstore = Chroma.from_documents(
    documents=doc_splits,
    collection_name="churn-rag-chroma-2",
    embedding=hf,
    
)
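
To make `chunk_size=400` and `chunk_overlap=100` concrete, here is a minimal sketch of overlapping chunking on a plain token list. It is a simplification for illustration only: the helper `sliding_chunks` is hypothetical, and the real `RecursiveCharacterTextSplitter` also tries to split on separators such as paragraphs and sentences, so its chunk boundaries will generally differ.

```python
def sliding_chunks(tokens, chunk_size, chunk_overlap):
    """Split a token list into windows of `chunk_size` that share `chunk_overlap` tokens.

    Hypothetical helper for illustration; not part of LangChain.
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the text
    return chunks


# 1000 dummy tokens standing in for a tokenized document
tokens = [f"t{i}" for i in range(1000)]
chunks = sliding_chunks(tokens, chunk_size=400, chunk_overlap=100)
print([len(c) for c in chunks])
print(chunks[0][-100:] == chunks[1][:100])  # consecutive chunks share 100 tokens
```

The overlap means a statistic that sits near a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.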


First we will create two agents, a Business Analyst and a Data Science Manager, to generate reports from the model. These are simple agents built from a prompt definition and basic RAG.

Several questions are put to the Business Analyst agent, and its answers are then passed as context to the Data Science Manager.
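
The hand-off described above can be sketched without any LLM. The names `build_manager_context` and `fake_analyst` are hypothetical stand-ins used only for illustration; in the notebook the analyst is an LLM-backed function and the answers are joined by hand.

```python
from typing import Callable, List


def build_manager_context(analyst: Callable[[str], str], questions: List[str]) -> str:
    """Ask the analyst each question and join the answers into one context string."""
    answers = [analyst(q) for q in questions]
    return "\n".join(answers)


# Hypothetical stand-in for the LLM-backed analyst, so the sketch runs offline
def fake_analyst(question: str) -> str:
    return f"[analyst answer to: {question}]"


context = build_manager_context(
    fake_analyst,
    ["Which features raise churn risk?", "Which features lower churn risk?"],
)
print(context)
```

Accepting a list of questions, rather than a fixed number of outputs, lets the set of analyst questions grow without changing the manager's signature.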

In [3]:


# Post-processing
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

def business_analyst1(question,model,top_k=7,fetch_k=20,retriver_type="normal",search_type="similarity"):

    # Prompt
    prompt = PromptTemplate(
        template=""" You are a business analyst for a telecom company. 
                Your job is to answer questions from executives on churn.
                Answer ONLY with stats you got from the context provided.
                DO NOT add any erroneous stats.
                Answer from context only.

                Context: {context}

                Question: {question}""",
        input_variables=["question", "context"],
    )

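    # Choose a retrieval strategy
    if retriver_type=="fetch_k":
        # MMR re-ranking over a larger candidate pool of fetch_k documents
        retriever1 = vectorstore.as_retriever(search_type="mmr",
                                          search_kwargs={'k': top_k, 'fetch_k': fetch_k})
    elif retriver_type=='non-similar':
        # MMR weighted towards diversity (a lower lambda_mult favours more varied results)
        retriever1 = vectorstore.as_retriever(search_type="mmr",
                                          search_kwargs={'k': top_k, 'lambda_mult': 0.25})
    else:
        # Plain similarity search returning the top_k closest chunks
        retriever1 = vectorstore.as_retriever(search_type='similarity',search_kwargs={'k': top_k})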
        
    docs=retriever1.invoke(question)

    print(len(docs))

    #llm = Ollama(model=llm1)
    llm0 = ChatOllama(model=model)
    # Chain
    rag_chain0 = prompt | llm0 | StrOutputParser()

    # docs = retriever.invoke(question)
    document=format_docs(docs)
    answer = rag_chain0.invoke({"question": question,"context": document})

    return answer



def manager(question,model,output1,output2,output3,output4):

    # Prompt
    prompt = PromptTemplate(
        template=""" You are manager of data science team for a telecom company. 
                You get different reports on churn from your business analyst in your team.
                Your job is to format and proofread the report provided to you by the business analyst and create an enriched report for the senior executive team within 1000 words.
                The end user might not be tech savvy, so keep the language easy to understand
                Answer should be properly formatted and user friendly and easy to understand.
                class 1 is churn.
                Use the below context for reference to make answer.

                Context: {context}

                Question: {question}""",
        input_variables=["question", "context"],
    )


    llm0 = ChatOllama(model=model)
    # Chain
    rag_chain0 = prompt | llm0 | StrOutputParser()

    answer = rag_chain0.invoke({"question": question,"context": output1 + "\n" + output2 + "\n" + output3+"\n"+output4})

    return answer
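
The `mmr` search type used in `business_analyst1` stands for maximal marginal relevance: it trades off relevance to the query against redundancy with documents already selected, and `lambda_mult` controls that trade-off. The sketch below uses the usual MMR scoring rule on toy 2-D vectors. It is an illustration under stated assumptions, not LangChain's actual code; `cosine` and `mmr_select` are hypothetical helpers.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query, candidates, k, lambda_mult=0.5):
    """Return indices of up to k candidates chosen by maximal marginal relevance.

    lambda_mult=1.0 ranks purely by relevance; lower values penalise
    candidates that resemble ones already picked.
    """
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = cosine(query, candidates[i])
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.0]
docs = [
    [1.0, 0.0],    # most relevant
    [0.99, 0.1],   # near-duplicate of the first
    [0.6, 0.8],    # less relevant but different
]
print(mmr_select(query, docs, k=2, lambda_mult=1.0))   # relevance only
print(mmr_select(query, docs, k=2, lambda_mult=0.25))  # favours diversity
```

With `lambda_mult=0.25`, as in the `'non-similar'` branch, the near-duplicate is skipped in favour of the more distinct document. The `fetch_k` setting plays a separate role: it sets how many candidates are fetched by similarity before this re-ranking step picks `k` of them.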




In [4]:
output1=(business_analyst1(question="Look at all features and identify the top 20 features which increases the probability of class 1 the most. The highest increase in probability will be on top",
                        model="llama3:latest",top_k=8,retriver_type='normal'))
print(output1)

8
Based on the SHAP values, here are the top 20 features that increase the probability of class 1:

1. contract - Month-to-month (increase: 16.49%)
2. paymentmethod - Electronic check (increase: 4.13%)
3. tenure - within (-0.001, 2.0] (increase: 13.86))
 SHAP Question: Look at all features and identify the top 20 features which increases the probability most.


In [5]:
output2=(business_analyst1(question="Look at all features and identify the top 10 features which decreases the probability of class 1 the most. The highest decrease in probability will be on top",
                        model="llama3:latest",top_k=8,retriver_type='normal'))
print(output2)

8
After analyzing the feature importances, I found that the top 10 features that decrease the probability of class 1 the most are:

1. **Two year contract**: Decreases probability by 19.73%
2. **No internet service**: Decreases probability by 11.49% (for multiple lines)
3. **No online security**: Decreases probability by 6.08%
4. **No device protection**: Decreases probability by 5.05% (with tech support)
5. **Two year tenure**: Decreases probability by 4.60% (payment method is mailed check)
6. **No phone service**: Decreases probability by 4.50% (multiple lines and internet services)
7. **One year contract**: Decreases probability by 3.55% (paperless billing)
8. **No online backup**: Decreases probability by 3.35% (tech support is no)
9. **DSL internet service**: Decreases probability by 2.88% (payment method is credit card automatic)
10. **No streaming movies**: Decreases probability by 2.53% (tech support is yes)

These features are likely to have a significant impact on the likelih

In [6]:
output3=(business_analyst1(question="Look at all features and identify the top 10 features importance rank in the model prediction. The lowest importance rank in the model prediction be on top",
                        model="llama3:latest",top_k=8,retriver_type='normal'))
print(output3)

8
Based on the feature importances calculated using the `feature_importances_` attribute from the `RandomForestClassifier`, the top 10 features in terms of their importance in the model's predictions are:

1. tenure (importance: 0.1452)
2. totalcharges (importance: 0.1264)
3. monthlycharges (importance: 0.1155)
4. seniorcitizen (importance: 0.1048)
5. onlinebackup (importance: 0.0956)
6. deviceprotection (importance: 0.0869)
7. techsupport (importance: 0.0791)
8. streamingtv (importance: 0.0763)
9. contract (importance: 0.0745)
10. gender (importance: 0.0662)

The lowest importance rank in the model prediction is on top, so the feature with the highest importance is actually `seniorcitizen`, which has an importance of 0.1048.

Here's a rough interpretation of these results:

- The most important features are related to customer tenure and billing information (tenure, totalcharges, monthlycharges), suggesting that longer-term customers who spend more on their plans are more likely to ch

In [7]:
output4=(business_analyst1(question="What are the list of different action features?",
                        model="llama3:latest",top_k=3,retriver_type='normal'))
print(output4)

3
According to the context, the list of different action features are:

1. PhoneService
2. InternetService
3. OnlineBackup
4. OnlineSecurity
5. DeviceProtection
6. TechSupport
7. Contract


In [8]:
final_report=manager(question=f"""What are the top 5 reasons for churn?\n
                             What are top 5 recommended actions to improve retention and decrease churn?""",
                             output1=output1,
                             output2=output2,
                             output3=output3,
                             output4=output4,
                             model="llama3:latest")
print(final_report)

**Churn Report**

As a telecom company, understanding why customers leave us is crucial to improving our services and retaining valuable relationships. Based on the SHAP values and feature importances, I have compiled the top 20 features that increase or decrease the probability of class 1 (churn). Below are the key findings:

**Top 20 Features that Increase Churn Probability:**

The top 10 features that increase the probability of churn are:

1. **Month-to-month contract**: Increases probability by 16.49%
2. **Electronic check payment method**: Increases probability by 4.13%
3. **Tenure within (-0.001, 2.0]**: Increases probability by 13.86%

These features indicate that customers who are on month-to-month contracts and use electronic checks for payments are more likely to churn.

**Top 20 Features that Decrease Churn Probability:**

The top 10 features that decrease the probability of churn are:

1. **Two-year contract**: Decreases probability by 19.73%
2. **No internet service** (mu

In [9]:
response2=manager(question=f"""
                             What are top 5 recommended actions to improve retention and decrease churn specifically using action features?
                             Only use action features for this recommendation""",
                             output1=output1,
                             output2=output2,
                             output3=output3,
                             output4=output4,
                             model="llama3:latest")
print(response2)

**Report: Top 5 Recommended Actions to Improve Retention and Decrease Churn**

**Executive Summary**

Our data science team has analyzed the SHAP values and feature importances to identify the top 5 recommended actions to improve retention and decrease churn, specifically using action features. These findings are based on our analysis of customer behavior and preferences.

**Recommended Actions**

1. **Offer Phone Service**: Providing phone service to customers is crucial in reducing the likelihood of churn by 19.73%. This feature has a significant impact on customer satisfaction and loyalty.
2. **Provide Internet Service**: Offering internet services, especially for multiple lines, can decrease the probability of churn by 11.49%. This highlights the importance of reliable connectivity in retaining customers.
3. **Enhance Online Security**: Providing online security features to customers decreases the probability of churn by 6.08%. This emphasizes the need for robust digital protection


#### LangGraph Agents


##### What are LangGraph Agents?

In the LangChain and LangGraph context, an agent is a construct that uses a large language model (LLM) to reason about and choose its actions autonomously. Unlike a traditional LangChain chain, which follows a predefined sequence of steps, an agent decides its next step dynamically based on the current state and its goal.

##### Building Agents with LangGraph

- Define Nodes: Each node in the graph represents a function or a LangChain runnable (such as an LLMChain).
- Connect with Edges: Edges dictate the flow of information and control between nodes. You will often have conditional edges, where the next node is determined by the output of the current node.
- Agent Logic: One or more nodes typically contain the agent's decision-making logic, using an LLM to analyze the current state and choose the next action.
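
The three ideas above (nodes, edges, conditional edges) can be illustrated with a toy graph runner in plain Python. `ToyGraph` is a hypothetical stand-in written for this sketch; it only mimics the shape of the LangGraph calls used later in this notebook and is not LangGraph's implementation.

```python
END = "__end__"


class ToyGraph:
    """Tiny state-graph runner: nodes update a state dict, edges choose the next node."""

    def __init__(self):
        self.nodes = {}    # name -> function(state) -> partial state update
        self.edges = {}    # name -> next node name
        self.routers = {}  # name -> (router function, {label: next node name})

    def add_node(self, name, fn):
        self.nodes[name] = fn

    def add_edge(self, src, dst):
        self.edges[src] = dst

    def add_conditional_edges(self, src, router, mapping):
        self.routers[src] = (router, mapping)

    def run(self, entry, state, max_steps=20):
        current, trace = entry, []
        while current != END:
            if len(trace) >= max_steps:
                raise RuntimeError("max_steps exceeded; the graph may be looping")
            trace.append(current)
            # Merge the node's partial update into the shared state
            state = {**state, **self.nodes[current](state)}
            if current in self.routers:
                router, mapping = self.routers[current]
                current = mapping[router(state)]
            else:
                current = self.edges[current]
        return state, trace


def draft(state):
    attempts = state.get("attempts", 0) + 1
    return {"attempts": attempts, "text": f"draft v{attempts}"}


def publish(state):
    return {"report": state["text"].upper()}


def review(state):
    # Decision logic: accept the draft only from the second attempt onwards
    return "accept" if state["attempts"] >= 2 else "retry"


graph = ToyGraph()
graph.add_node("draft", draft)
graph.add_node("publish", publish)
graph.add_conditional_edges("draft", review, {"retry": "draft", "accept": "publish"})
graph.add_edge("publish", END)

final_state, trace = graph.run("draft", {})
print(trace)
print(final_state["report"])
```

Each node returns only the keys it changes and the runner merges them into the shared state. The node functions later in this notebook return partial dictionaries in the same way.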


We will now implement a more advanced RAG system for the task above, with the aim of producing more accurate answers. The system is designed as a graph of LangGraph agents, and it incorporates the Business Analyst and Manager agents into that architecture.

In [11]:
from langchain.prompts import PromptTemplate
from langchain_community.chat_models import ChatOllama
from langchain_core.output_parsers import JsonOutputParser, StrOutputParser

from langchain import hub
from typing_extensions import TypedDict
from typing import List
from langchain.schema import Document
from langgraph.graph import END, StateGraph
from pprint import pprint

The LangGraph agent we will create has the following architecture:

![Adaptive_RAG_Agent](..//images//Adaptive_RAG_Agent.png)

In [13]:

##Retrieval Grader
retrieval_prompt = PromptTemplate(
        template="""You are a grader assessing relevance of a retrieved document to a user question. \n 
        Here is the retrieved document: \n\n {document} \n\n
        Here is the user question: {question} \n
        If the document contains keywords related to the user question, grade it as relevant. \n
        It does not need to be a stringent test. The goal is to filter out erroneous retrievals. \n
        Give a binary score, 'yes' or 'no', to indicate whether the document is relevant to the question. \n
        Provide the binary score as a JSON with a single key 'score' and no preamble or explanation.""",
        input_variables=["question", "document"],
    )

llm = ChatOllama(model="llama3:latest", format="json", temperature=0)
retrieval_grader = retrieval_prompt | llm | JsonOutputParser()
retriever=vectorstore.as_retriever(search_type='similarity',search_kwargs={'k': 20})
# docs = retriever.invoke(question)


##Answer Generator
buisiness_analyst_prompt = PromptTemplate(
    template=""" You are a business analyst for a telecom company. 
            Your job is to create a report for senior executives on questions asked. 
            Answer ONLY with stats you got from the context provided.
            DO NOT add any erroneous stats.

            Context: {context}

            Question: {question}""",
    input_variables=["question", "context"],
)


def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

generator_llm = ChatOllama(model="llama3:latest",temperature=0)
# Chain
business_analyst_rag_chain = buisiness_analyst_prompt | generator_llm | StrOutputParser()


##Manager Answer Generator
manager_prompt = PromptTemplate(
    template=""" You are manager of data science team for a telecom company. 
                You get different reports on churn from your business analyst in your team.
                Your job is to format and proofread the report provided to you by the business analyst and create an enriched report for the senior executive team.
                The end user might not be tech savvy, so keep the language easy to understand
                Answer should be properly formatted and user friendly and easy to understand.
                class 1 is churn.
                Use the below context for reference to make answer.

            Context: {context}

            Question: {question}""",
    input_variables=["question", "context"],
)

manager_rag_chain = manager_prompt | generator_llm | StrOutputParser()



### Hallucination Grader 

# LLM: reuses the JSON-mode `llm` already defined for the retrieval grader earlier in this cell

# Prompt
hallucination_prompt = PromptTemplate(
    template="""You are a grader assessing whether an answer is grounded in / supported by a set of facts. \n 
    Here are the facts:
    \n ------- \n
    {documents} 
    \n ------- \n
    Here is the answer: {generation}
    Give a binary score 'yes' or 'no' score to indicate whether the answer is grounded in / supported by a set of facts. \n
    Provide the binary score as a JSON with a single key 'score' and no preamble or explanation.""",
    input_variables=["generation", "documents"],
)

hallucination_grader = hallucination_prompt | llm | JsonOutputParser()


### Answer Grader 


# Prompt
answer_grader_prompt = PromptTemplate(
    template="""You are a grader assessing whether an answer is useful to resolve a question. \n 
    Here is the answer:
    \n ------- \n
    {generation} 
    \n ------- \n
    Here is the question: {question}
    Give a binary score 'yes' or 'no' to indicate whether the answer is useful to resolve a question. \n
    Provide the binary score as a JSON with a single key 'score' and no preamble or explanation.""",
    input_variables=["generation", "question"],
)

answer_grader = answer_grader_prompt | llm | JsonOutputParser()
# answer_grader.invoke({"question": question,"generation": answer_generation})




### Question Re-writer
rewrite_llm = ChatOllama(model="llama3:latest", temperature=0)
# Prompt 
re_write_prompt = PromptTemplate(
    template="""You are a question re-writer that converts an input question to a better version that is optimized \n 
     for vectorstore retrieval. Look at the initial question and formulate an improved question. \n
     Here is the initial question: \n\n {question}.
     ONLY RETURN THE REWRITTEN QUESTION.\n
     DO NOT ADD ANY TEXT OTHER THAN THE REWRITTEN QUESTION.\n
     Improved question with no preamble: \n """,
    input_variables=["question"],
)

question_rewriter = re_write_prompt | rewrite_llm | StrOutputParser()
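
The three graders above are each asked to return a JSON object with a single `score` key. A local model will not always follow that instruction exactly, so a defensive check can help. The helper `is_yes` below is hypothetical and written only for illustration; the notebook itself reads `score['score']` directly.

```python
import json


def is_yes(raw):
    """Return True only when the grader output clearly carries score == 'yes'.

    Accepts either an already-parsed dict or a raw JSON string.
    Anything malformed is treated as 'no'.
    """
    if isinstance(raw, str):
        try:
            raw = json.loads(raw)
        except json.JSONDecodeError:
            return False
    if not isinstance(raw, dict):
        return False
    return str(raw.get("score", "")).strip().lower() == "yes"


print(is_yes({"score": "yes"}))
print(is_yes({"score": " Yes "}))
print(is_yes({"grade": "yes"}))    # missing 'score' key
print(is_yes("not valid json"))
```

Treating malformed output as "no" is a conservative choice: a document or answer is only accepted when the grader explicitly approves it.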






In [14]:
class GraphState(TypedDict):
    """
    Represents the state of our graph.

    Attributes:
        question: the user question (possibly rewritten)
        generation: answer generated by the Business Analyst LLM
        documents: list of retrieved documents
        report: final report from the Manager LLM
    """
    question : str
    generation : str
    documents : List[Document]
    report : str

retriever=vectorstore.as_retriever(search_type='similarity',search_kwargs={'k': 20})

In [15]:
### Nodes
def retrieve(state):
    """
    Retrieve documents

    Args:
        state (dict): The current graph state

    Returns:
        state (dict): New key added to state, documents, that contains retrieved documents
    """
    print("---RETRIEVE---")
    question = state["question"]

    # Retrieval
    documents = retriever.invoke(question)
    return {"documents": documents, "question": question}

def analyst_generate(state):
    """
    Generate answer

    Args:
        state (dict): The current graph state

    Returns:
        state (dict): New key added to state, generation, that contains LLM generation
    """
    print("---GENERATE---")
    question = state["question"]
    documents = state["documents"]
    context=format_docs(documents)
    
    # RAG generation
    generation = business_analyst_rag_chain.invoke({"context": context, "question": question})
    return {"documents": documents, "question": question, "generation": generation}

def grade_documents(state):
    """
    Determines whether the retrieved documents are relevant to the question.

    Args:
        state (dict): The current graph state

    Returns:
        state (dict): Updates documents key with only filtered relevant documents
    """

    print("---CHECK DOCUMENT RELEVANCE TO QUESTION---")
    question = state["question"]
    documents = state["documents"]
    
    # Score each doc
    filtered_docs = []
    for d in documents:
        score = retrieval_grader.invoke({"question": question, "document": d.page_content})
        grade = score['score']
        if grade == "yes":
            print("---GRADE: DOCUMENT RELEVANT---")
            filtered_docs.append(d)
        else:
            print("---GRADE: DOCUMENT NOT RELEVANT---")
            continue
    return {"documents": filtered_docs, "question": question}

def transform_query(state):
    """
    Transform the query to produce a better question.

    Args:
        state (dict): The current graph state

    Returns:
        state (dict): Updates question key with a re-phrased question
    """

    print("---TRANSFORM QUERY---")
    question = state["question"]
    documents = state["documents"]

    # Re-write question
    better_question = question_rewriter.invoke({"question": question})
    return {"documents": documents, "question": better_question}


def decide_to_generate(state):
    """
    Determines whether to generate an answer, or re-generate a question.

    Args:
        state (dict): The current graph state

    Returns:
        str: Binary decision for next node to call
    """

    print("---ASSESS GRADED DOCUMENTS---")
    question = state["question"]
    filtered_documents = state["documents"]

    if not filtered_documents:
        # All documents were filtered out by the relevance check,
        # so we re-write the query and retrieve again
        print("---DECISION: ALL DOCUMENTS ARE NOT RELEVANT TO QUESTION, TRANSFORM QUERY---")
        return "transform_query"
    else:
        # We have relevant documents, so generate answer
        print("---DECISION: GENERATE---")
        return "analyst_generate"

def grade_generation_v_documents_and_question(state):
    """
    Determines whether the generation is grounded in the document and answers question.

    Args:
        state (dict): The current graph state

    Returns:
        str: Decision for next node to call
    """

    print("---CHECK HALLUCINATIONS---")
    question = state["question"]
    documents = state["documents"]
    generation = state["generation"]

    score = hallucination_grader.invoke({"documents": format_docs(documents), "generation": generation})
    grade = score['score']

    # Check hallucination
    if grade == "yes":
        print("---DECISION: GENERATION IS GROUNDED IN DOCUMENTS---")
        # Check question-answering
        print("---GRADE GENERATION vs QUESTION---")
        score = answer_grader.invoke({"question": question,"generation": generation})
        grade = score['score']
        if grade == "yes":
            print("---DECISION: GENERATION ADDRESSES QUESTION---")
            return "useful"
        else:
            print("---DECISION: GENERATION DOES NOT ADDRESS QUESTION---")
            return "not useful"
    else:
        print("---DECISION: GENERATION IS NOT GROUNDED IN DOCUMENTS, RE-TRY---")
        return "not supported"
    
def manager_generate(state):
    """
    Generate the final report from the analyst's answer

    Args:
        state (dict): The current graph state

    Returns:
        state (dict): New key added to state, report, that contains the Manager LLM report
    """
    print("---GENERATE---")
    question = state["question"]
    context=state["generation"]
    
    # RAG generation
    report = manager_rag_chain.invoke({"context": context, "question": question})
    return {"question": question, "report": report}

In [16]:
## Create the Graph Agent

workflow = StateGraph(GraphState)

# Define the nodes
workflow.add_node("retrieve", retrieve) # retrieve documents
workflow.add_node("grade_documents", grade_documents) # grade documents
workflow.add_node("analyst_generate", analyst_generate) # generate the analyst answer
workflow.add_node("transform_query", transform_query) # re-write the question
workflow.add_node("manager_generate", manager_generate) # generate the manager report

# Build graph
workflow.set_entry_point("retrieve")
workflow.add_edge("retrieve", "grade_documents")
workflow.add_conditional_edges(
    "grade_documents",
    decide_to_generate,
    {
        "transform_query": "transform_query",
        "analyst_generate": "analyst_generate",
    },
)
workflow.add_edge("transform_query", "retrieve")
workflow.add_conditional_edges(
    "analyst_generate",
    grade_generation_v_documents_and_question,
    {
        "not supported": "analyst_generate",
        "useful": "manager_generate",
        "not useful": "transform_query",
    },
)
workflow.add_edge("manager_generate",END)

# Compile
app = workflow.compile()
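
One thing to watch in this graph: the `"not supported"` edge routes `analyst_generate` back to itself, and with `temperature=0` a retry may produce the same answer again. As far as I know LangGraph stops a run at its own recursion limit, but an explicit cap makes the intent clearer. The sketch below shows the idea in plain Python. The `attempts` state key, the `"give_up"` label, and both helper functions are hypothetical and are not part of the graph defined above.

```python
MAX_ATTEMPTS = 3  # hypothetical cap on generation attempts


def count_attempt(state):
    """Node-side helper: return a state update that bumps the attempt counter."""
    return {"attempts": state.get("attempts", 0) + 1}


def route_with_cap(state, grade):
    """Router-side helper: follow the grader's decision unless the cap is reached."""
    if grade != "useful" and state.get("attempts", 0) >= MAX_ATTEMPTS:
        return "give_up"
    return grade


# Simulate a grader that keeps rejecting the generation
state = {}
history = []
while True:
    state.update(count_attempt(state))
    decision = route_with_cap(state, "not supported")
    history.append(decision)
    if decision == "give_up":
        break

print(history)
```

To use this idea in the real graph, the counter would be incremented inside the `analyst_generate` node (routing functions only return a label), `attempts` would need to be added to `GraphState`, and `"give_up"` would need a target in the conditional-edge mapping, for example `END`.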

In [18]:
##Test the Agent with a Question

# Run 
inputs = {"question": "What are the top reasons which increases the probability of class 1?"}
for output in app.stream(inputs):
    for key, value in output.items():
        # Node
        pprint(f"Node '{key}':")
    pprint("\n---\n")

# Final generation
pprint(value["report"])

Number of requested results 20 is greater than number of elements in index 16, updating n_results = 16


---RETRIEVE---
"Node 'retrieve':"
'\n---\n'
---CHECK DOCUMENT RELEVANCE TO QUESTION---
---GRADE: DOCUMENT RELEVANT---
---GRADE: DOCUMENT RELEVANT---
---GRADE: DOCUMENT RELEVANT---
---GRADE: DOCUMENT RELEVANT---
---GRADE: DOCUMENT RELEVANT---
---GRADE: DOCUMENT RELEVANT---
---GRADE: DOCUMENT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT RELEVANT---
---ASSESS GRADED DOCUMENTS---
---DECISION: GENERATE---
"Node 'grade_documents':"
'\n---\n'
---GENERATE---
---CHECK HALLUCINATIONS---
---DECISION: GENERATION IS GROUNDED IN DOCUMENTS---
---GRADE GENERATION vs QUESTION---
---DECISION: GENERATION ADDRESSES QUESTION---
"Node 'analyst_generate':"
'\n---\n'
---GENERATE---
"Node 'manager_generate':"
'\n---\n'
('**Churn Prediction Report**\n'
 

In [19]:
inputs = {"question": "What are 5 recommended actions to increase retention or to reduce churn?"}
for output in app.stream(inputs):
    for key, value in output.items():
        # Node
        pprint(f"Node '{key}':")
    pprint("\n---\n")

# Final generation
pprint(value["report"])

Number of requested results 20 is greater than number of elements in index 16, updating n_results = 16


---RETRIEVE---
"Node 'retrieve':"
'\n---\n'
---CHECK DOCUMENT RELEVANCE TO QUESTION---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---ASSESS GRADED DOCUMENTS---
---DECISION: GENERATE---
"Node 'grade_documents':"
'\n---\n'
---GENERATE---
---CHECK HALLUCINATIONS---
---DECISION: GENERATION IS GROUNDED IN DOCUMENTS---
---GRADE GENERATION vs QUESTION---
---DECISION: GENERATION ADDRESSES QUESTION---
"Node 'analyst_generate':"
'\n---\n'
---GENERATE---
"Node 'manager_generate':"
'\n---\n'
('**Ch

The graph could not be visualized directly within the notebook. There may be a way to do so, but here I printed the Mermaid graph definition and used it to plot the graph with the online Mermaid Live editor:

https://mermaid.live/edit#pako:eNqNVF1rGzEQ_CtCxdiFOK-hFyiYmKQvTalb-uILZi2tbIFOMntSQjD336u76L7wGfK2Ws3MSrMrnblwEnnGZ7Ozttpn7DxXxr2JI5Cf1ysR6BVjNDfaItC8qqrZLLcHgtOR_V3f5zb3u13pI3632_4OWHrt7EuWZU1OGCjLBEIrI-RRWzBsgydHvobFbA_aoCeNr0iLbRe-fI0o549IPW7tRCjQ-icCWYPH6ynGKlZ9L_1imwK2OkT8FHKDiqDA9ir1SZoE6y43QfoJFg71SVJwXf4JLRJ4_EVJd7FtU2xly7fIdcQ-UzPR4m5rQ59h153ousWWy--ss3l4jWYnNWzcl2Zn7PaUaRPSY1IDuLBi2iC2vL3wg902BUZFr7Pbnn9MXcNNqeFsDM_UmTrp9FAz9UyX7AcYE0Qcb4_yosaUxLMbcOpLPTgiFJ3mI7miq1MLpvZ8XnAffJ38B0bLKdVrLjbTskbF-kfMlDYm-6KUVHsYg9onnCB7UEp8G0P6GWx1QN5JvOc3vEAqQMv4CZ1zy1jOI7TAnGcxlKggGJ_z3FYRCsG7P-9W8MxTwBseTjKavdYQv6PiI1n9B-nrv8k

In [20]:
print(app.get_graph().draw_mermaid())

%%{init: {'flowchart': {'curve': 'linear'}}}%%
graph TD;
	__start__[__start__]:::startclass;
	__end__[__end__]:::endclass;
	retrieve([retrieve]):::otherclass;
	grade_documents([grade_documents]):::otherclass;
	analyst_generate([analyst_generate]):::otherclass;
	transform_query([transform_query]):::otherclass;
	manager_generate([manager_generate]):::otherclass;
	__start__ --> retrieve;
	manager_generate --> __end__;
	retrieve --> grade_documents;
	transform_query --> retrieve;
	grade_documents -.-> transform_query;
	grade_documents -.-> analyst_generate;
	analyst_generate -. not supported .-> analyst_generate;
	analyst_generate -. useful .-> manager_generate;
	analyst_generate -. not useful .-> transform_query;
	classDef startclass fill:#ffdfba;
	classDef endclass fill:#baffc9;
	classDef otherclass fill:#fad7de;

