<h1 style="font-family: 'Helvetica Neue', Arial, sans-serif; font-size: 32px; color: black;">
    Retrieval-Augmented Generation (RAG)
</h1>


<p style="font-size: 16px; font-family: Arial, sans-serif; color: #333;">
    Retrieval-Augmented Generation (RAG) combines information retrieval with generative language models.
    In RAG, a retrieval component fetches documents or data relevant to a user query and passes them to a language model, which uses the retrieved information to generate a more informed, accurate, and contextually relevant response.
    RAG helps overcome a limitation of standalone language models, which may lack specific or up-to-date knowledge.
</p>
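
<p style="font-size: 16px; font-family: Arial, sans-serif; color: #333;">
    The retrieve-then-generate flow described above can be sketched in a few lines. This is a toy illustration only: <code>CORPUS</code>, <code>retrieve</code>, and <code>build_prompt</code> are hypothetical names, and simple keyword overlap stands in for a real search engine.
</p>

```python
# Toy retrieve-then-generate sketch. All names here are illustrative assumptions.
CORPUS = [
    "Chicago Communities: geospatial dataset of communities in Chicago.",
    "Intro to Python: a beginner programming tutorial.",
]


def tokenize(text):
    """Lowercase the text and split it into words, ignoring ':' and '.'."""
    return set(text.lower().replace(":", " ").replace(".", " ").split())


def retrieve(query, corpus, k=1):
    """Rank documents by the number of words they share with the query."""
    words = tokenize(query)
    scored = [(len(words & tokenize(doc)), doc) for doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Keep at most k documents, and only those with some overlap
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(query, docs):
    """Combine the user query with the retrieved context for the language model."""
    context = "\n".join(docs)
    return f"User Query: {query}\nSearch Results:\n{context}"


prompt = build_prompt("datasets about Chicago", retrieve("datasets about Chicago", CORPUS))
```

<p style="font-size: 16px; font-family: Arial, sans-serif; color: #333;">
    In a real pipeline the prompt would be sent to the language model; here it is only constructed.
</p>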

In [7]:
# pre-requisite installs 
!pip3 install requests
!pip3 install opensearch-py




In [8]:
import getpass  # used below to read the password and API key without echoing them
import requests
from opensearchpy import OpenSearch
import json
import ipywidgets as widgets
from IPython.display import display, clear_output
import asyncio
import re

<h1 style="font-family: 'Helvetica Neue', Arial, sans-serif; font-size: 20px; color: black;">
    1st Step: Set Up a Connection with the OpenSearch Client 🧪
</h1>

<p style="font-size: 16px; font-family: Arial, sans-serif; color: #333;">
    In this first step, we establish a connection to the OpenSearch client, which is essential for querying and retrieving our data. 
</p>


In [10]:
password = getpass.getpass(f"Password: ")
client = OpenSearch(
    hosts=["https://149.165.153.78:9200"],  # dashboard is 5201 / opensearch indices at port 9200
    http_auth=("admin", password),
    verify_certs=False
)

Password:  ········
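
<p style="font-size: 16px; font-family: Arial, sans-serif; color: #333;">
    Before running queries, it can help to confirm that the cluster is reachable. The sketch below assumes only that the client exposes a <code>ping()</code> method, as the opensearch-py client does; <code>check_connection</code> and <code>FakeClient</code> are hypothetical helpers, and the fake client exists just so the example runs without a server.
</p>

```python
def check_connection(client):
    """Return True if the cluster answers a ping, False otherwise."""
    try:
        return bool(client.ping())
    except Exception as error:
        print("OpenSearch is unreachable:", error)
        return False


class FakeClient:
    """Hypothetical stand-in for an OpenSearch client, used only for this demo."""

    def __init__(self, alive):
        self.alive = alive

    def ping(self):
        return self.alive


reachable = check_connection(FakeClient(alive=True))
```

<p style="font-size: 16px; font-family: Arial, sans-serif; color: #333;">
    With the real client, the call would be <code>check_connection(client)</code>.
</p>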




<h1 style="font-family: 'Helvetica Neue', Arial, sans-serif; font-size: 20px; color: black;">
    2nd Step: Connect to the Llama Model 🔗
</h1>

<p style="font-size: 16px; font-family: Arial, sans-serif; color: #333;">
    In this second step, we establish a connection to the Llama model. The Llama 3 model is accessed through the AnvilGPT API (an Ollama-style chat endpoint), authenticated with an API key. By connecting to this model, we can use it to process the information retrieved from OpenSearch.
</p>


In [12]:
anvil_gpt_api_key = getpass.getpass("Enter the AnvilGPT API key: ")
async def call_llama_model(query_payload):
    llama_api_url = "https://anvilgpt.rcac.purdue.edu/ollama/api/chat"
    try:
        response = requests.post(
            llama_api_url,
            headers = {
                "Authorization": f"Bearer {anvil_gpt_api_key}",
                "Content-Type": "application/json"
            },
            json=query_payload
        )
        if response.status_code == 200:
            return response.json()
        else:
            raise Exception(f"Error: {response.status_code}, {response.text}")
        
    except Exception as error:
        print("Error fetching from Llama model:", error)
        return None

Enter the AnvilGPT API key:  ········
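
<p style="font-size: 16px; font-family: Arial, sans-serif; color: #333;">
    One thing to keep in mind: <code>requests.post</code> is a blocking call, so wrapping it in an <code>async def</code> does not by itself let other coroutines run while the request is in flight. A minimal sketch of one possible workaround, assuming Python 3.9+ for <code>asyncio.to_thread</code>, is shown below. <code>blocking_call</code> is a hypothetical stand-in for the HTTP request.
</p>

```python
import asyncio
import time


def blocking_call(payload):
    """Stand-in for a blocking HTTP request such as requests.post."""
    time.sleep(0.1)
    return {"echo": payload}


async def call_in_thread(payload):
    # asyncio.to_thread runs the blocking function in a worker thread,
    # so the event loop stays free while the call is waiting
    return await asyncio.to_thread(blocking_call, payload)


async def demo():
    # gather schedules both calls at once and returns results in input order
    return await asyncio.gather(call_in_thread("a"), call_in_thread("b"))
```

<p style="font-size: 16px; font-family: Arial, sans-serif; color: #333;">
    In a notebook cell this would be run with <code>await demo()</code>; in a plain script, with <code>asyncio.run(demo())</code>.
</p>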


<h1 style="font-family: 'Helvetica Neue', Arial, sans-serif; font-size: 20px; color: black;">
    3rd Step: Query the OpenSearch Index with Data 🔍
</h1>

<p style="font-size: 16px; font-family: Arial, sans-serif; color: #333;">
    In this third step, we perform a query on the OpenSearch index to retrieve relevant data based on user input. By querying this index, we can filter and retrieve the specific documents or records that match the user's query.
</p>


In [18]:
# Function to get search results from OpenSearch
async def get_search_results(user_query):
    try:
        response = client.search(
            index="neo4j-elements",
            body={
                "query": {
                    "match": {
                        "contents": user_query  # Query based on user input
                    }
                }
            }
        )
        return response['hits']['hits']
    except Exception as error:
        print('Error connecting to OpenSearch:', error)
        return []
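
<p style="font-size: 16px; font-family: Arial, sans-serif; color: #333;">
    OpenSearch returns 10 hits by default, and each hit stores the indexed document under <code>_source</code>. The sketch below shows one way to build the match query with an explicit <code>size</code> and to pull the documents out of a response. <code>build_match_query</code>, <code>extract_sources</code>, and the sample response are hypothetical and only mirror the response shape used above.
</p>

```python
def build_match_query(field, text, size=5):
    """Build a match-query body that limits the number of returned hits."""
    return {"size": size, "query": {"match": {field: text}}}


def extract_sources(response):
    """Return the _source dict of every hit, tolerating missing keys."""
    hits = response.get("hits", {}).get("hits", [])
    return [hit.get("_source", {}) for hit in hits]


# Hand-written sample with the same shape as a search response
sample_response = {
    "hits": {
        "hits": [
            {"_id": "1", "_source": {"title": "Chicago Communities"}},
            {"_id": "2", "_source": {"title": "Chicago Major Streets"}},
        ]
    }
}

body = build_match_query("contents", "Chicago", size=2)
titles = [source["title"] for source in extract_sources(sample_response)]
```

<p style="font-size: 16px; font-family: Arial, sans-serif; color: #333;">
    The resulting <code>body</code> could be passed as the <code>body</code> argument of <code>client.search</code>.
</p>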

<h1 style="font-family: 'Helvetica Neue', Arial, sans-serif; font-size: 20px; color: black;">
    4th Step: Process the User Query Through the Pipeline 🔄
</h1>

<p style="font-size: 16px; font-family: Arial, sans-serif; color: #333;">
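    In this step, the user query moves through the entire pipeline: matching documents are retrieved from OpenSearch, formatted into a prompt, and sent to the Llama model, which summarizes them.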
</p>


In [21]:
async def handle_user_input(user_query):
    print("Fetching search results...")
    search_results = await get_search_results(user_query)

    if not search_results:
        print("No search results found.")
        return

    print("Preparing payload for Llama model...")

    formatted_results = '\n'.join([
        f"Title: {hit['_source']['title']}\n"
        f"Content: {hit['_source']['contents']}\n"
        f"Contributor: {hit['_source']['contributor']}\n\n"
        for hit in search_results
    ])

    query_payload = {
        "model": "llama3:instruct",
        "messages": [
            {
                "role": "system",
                "content": "You are an assistant who helps summarize and organize information from search results."
            },
            {
                "role": "user",
                "content": f"User Query: {user_query}\nSearch Results:\n{formatted_results}"
            }
        ],
        "stream": False
    }

    print("Calling Llama model...")
    llama_response = await call_llama_model(query_payload)
    print(llama_response)

    # The chat endpoint returns the reply under 'message' (see the printed response), not 'choices'
    if llama_response and llama_response.get('message'):
        print("\nLlama model response:")
        print(llama_response['message']['content'])
    else:
        print("Unexpected response format or no message available.")
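
<p style="font-size: 16px; font-family: Arial, sans-serif; color: #333;">
    The formatting step above indexes <code>title</code>, <code>contents</code>, and <code>contributor</code> directly, which raises a <code>KeyError</code> if a document lacks one of them. A small sketch of a more forgiving formatter follows. <code>format_hit</code> and the <code>'N/A'</code> placeholder are assumptions, and the labels are derived from the field names, so they differ slightly from the ones used above.
</p>

```python
def format_hit(hit, fields=("title", "contents", "contributor")):
    """Format one search hit as labelled lines, using 'N/A' for missing fields."""
    source = hit.get("_source", {})
    return "\n".join(
        f"{name.capitalize()}: {source.get(name, 'N/A')}" for name in fields
    )


# A hit that is missing 'contents' and 'contributor'
partial_hit = {"_source": {"title": "Chicago Communities"}}
formatted = format_hit(partial_hit)
```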

<h1 style="font-family: 'Helvetica Neue', Arial, sans-serif; font-size: 20px; color: black;">
    Now try and ask the Model a Question 💬
</h1>

<p style="font-size: 16px; font-family: Arial, sans-serif; color: #333;">
    Example questions: What is CyberGIS? What is geospatial data?
</p>



In [24]:
user_query = input("Enter the user query: ")

Enter the user query:  Give me some datasets about Chicago


<h1 style="font-family: 'Helvetica Neue', Arial, sans-serif; font-size: 20px; color: black;">
Generating Results...
</h1>


In [26]:
await handle_user_input(user_query)

Fetching search results...




Preparing payload for Llama model...
Calling Llama model...
{'model': 'llama3:instruct', 'created_at': '2024-11-12T17:29:00.974919041Z', 'message': {'role': 'assistant', 'content': "Based on the search results, I've summarized some datasets related to Chicago:\n\n1. **Chicago Communities**: A geospatial dataset of communities within the city of Chicago (contributor: Rebecca (Becky) Vandewalle)\n2. **Chicago Major Streets**: A geospatial dataset of major streets within the city of Chicago (contributor: Rebecca (Becky) Vandewalle)\n\nThese datasets might be useful for geographic information systems (GIS), urban planning, or community development projects.\n\nAdditionally, there are other results that may not directly relate to Chicago but could be relevant to GIS or geography-related topics:\n\n1. **Computer Science and Programming Courses in Geography Departments in the United States**: A study on how U.S. geography departments introduce computer science and programming skills in their 

In [28]:
def create_query_payload(model, systemMessage, userMessage, stream):
    query_payload = {
        "model": model,
        "messages": [
            {
                "role": "system",
                "content": systemMessage
            },
            {
                "role": "user",
                "content": userMessage
            }
        ],
        "stream": stream
    }
    return query_payload

In [106]:
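async def retrieve_documents(userQuery):
    print("Fetching search results...")
    # Use the function argument rather than the global user_query
    search_result = await get_search_results(userQuery)

    if not search_result:
        print("No search results found.")
        # Return an empty list so callers can always iterate over the result
        return []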

    documents = [
        f"Title: {hit['_source']['title']}\n"
        f"Content: {hit['_source']['contents']}\n"
        f"Contributor: {hit['_source']['contributor']}\n"
        for hit in search_result
    ]
    return documents

def extract_binary_score(content):
    # Use regex to find the value of "binary_score" directly, even in incomplete JSON
    match = re.search(r'"binary_score":\s*"(yes|no)"', content)
    
    if match:
        # Extract the matched value ("yes" or "no")
        binary_score = match.group(1)
        return binary_score
    else:
        print("No valid binary_score found.")
        return None
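
<p style="font-size: 16px; font-family: Arial, sans-serif; color: #333;">
    The regex above matches only lowercase values in double quotes. Model replies do not always follow that shape, so a more tolerant parser may be useful. The sketch below is one possible variant, not a drop-in replacement: <code>parse_binary_score</code> is a hypothetical name, and it tries strict JSON first before falling back to a looser regex.
</p>

```python
import json
import re


def parse_binary_score(content):
    """Return 'yes', 'no', or None from a model reply that should contain JSON."""
    # Try strict JSON first, ignoring backtick fences around the reply
    cleaned = content.strip().strip("`").strip()
    try:
        value = json.loads(cleaned).get("binary_score")
        if isinstance(value, str) and value.lower() in ("yes", "no"):
            return value.lower()
    except (json.JSONDecodeError, AttributeError):
        pass  # not valid JSON, or not a JSON object
    # Fall back to a regex that accepts either quote style and any letter case
    pattern = r"""["']binary_score["']\s*:\s*["'](yes|no)["']"""
    match = re.search(pattern, content, re.IGNORECASE)
    return match.group(1).lower() if match else None


score = parse_binary_score('{"binary_score": "yes", "explanation": "grounded"}')
```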


In [32]:
doc_grader_instructions = """You are a grader assessing relevance of a retrieved document to a user question.

If the document contains keyword(s) or semantic meaning related to the question, grade it as relevant."""
doc_grader_prompt = """Here is the retrieved document: \n\n {document} \n\n Here is the user question: \n\n {question}. 

Carefully and objectively assess whether the document contains at least some information that is relevant to the question.

Return only JSON with a single key, binary_score, whose value is 'yes' or 'no', indicating whether the document contains at least some information that is relevant to the question. Do not include reasoning or explanation."""
async def grade_documents(state):
    question = state["question"]
    document_list = state["documents"]
    print("---CHECK DOCUMENT RELEVANCE TO QUESTION---")
    # Score each doc
    filtered_docs = []
    web_search = "No" 
    for d in document_list:
        doc_grader_prompt_formated = doc_grader_prompt.format(document=d, question=question)
        result = await call_llama_model(create_query_payload("llama3.2:latest", doc_grader_instructions, doc_grader_prompt_formated, False))
        #print(result)
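        # call_llama_model returns None on failure, and the parser returns None for malformed replies
        grade = extract_binary_score(result['message']['content']) if result else None
        # Document relevant
        if grade == "yes":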
            print("---GRADE: DOCUMENT RELEVANT---")
            filtered_docs.append(d)
        # Document not relevant
        else:
            print("---GRADE: DOCUMENT NOT RELEVANT---")
            # We do not include the document in filtered_docs
            # We set a flag to indicate that we want to run web search
            web_search = "Yes"
            continue
    return {"documents": filtered_docs, "question": question, "web_search": web_search}
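
<p style="font-size: 16px; font-family: Arial, sans-serif; color: #333;">
    The loop above grades documents one at a time. If the grader call were truly non-blocking, the documents could be graded concurrently. The sketch below illustrates that idea under this assumption. <code>fake_grader</code> is a hypothetical keyword check standing in for the model call, and <code>filter_relevant</code> is an illustrative name.
</p>

```python
import asyncio


async def fake_grader(document, question):
    """Hypothetical grader: 'yes' if any question word appears in the document."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    words = question.lower().split()
    return "yes" if any(word in document.lower() for word in words) else "no"


async def filter_relevant(documents, question, grader=fake_grader):
    """Grade all documents concurrently and keep the relevant ones."""
    # gather preserves input order, so grades line up with documents
    grades = await asyncio.gather(*(grader(doc, question) for doc in documents))
    kept = [doc for doc, grade in zip(documents, grades) if grade == "yes"]
    # Flag a web search whenever at least one document was rejected
    web_search = "Yes" if len(kept) < len(documents) else "No"
    return {"documents": kept, "question": question, "web_search": web_search}
```

<p style="font-size: 16px; font-family: Arial, sans-serif; color: #333;">
    In a notebook cell this would be called as <code>await filter_relevant(docs, question)</code>.
</p>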

In [34]:
question = "Give me some dataset about Chicago"
docs = await retrieve_documents(question)

Fetching search results...




In [38]:
docs

['Title: Computer Science and Programming Courses in Geography Departments in the United States\nContent: Geographic information systems (GIS) are fundamental information technologies. The capabilities and applications of GIS continue to rapidly expand, requiring practitioners to have new skills and competencies, especially in computer science. There is little research, however, about how best to prepare the next generation of GIScientists with adequate computer science skills. This article explores how U.S. geography departments are introducing and developing computer science and programming skills in their geography and GIS degree programs. We review the degree requirements in fifty-five geography departments and discover that forty-four of them offer some kind of GIS programming course. Of the 210 separate degree options identified, however, only 22 require one of these courses for a degree. There is little consistency or emphasis on computer science and programming skills in geogra

In [36]:
grade_state = await grade_documents({"documents": docs, "question": question})

---CHECK DOCUMENT RELEVANCE TO QUESTION---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT RELEVANT---
---GRADE: DOCUMENT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---


In [40]:
generation_instructions = "You are an assistant who helps summarize and organize information from search results."
generation_prompt = """User Query: {user_query}\nSearch Results:\n{formatted_results}"""
def format_docs(docs):
    return "\n\n".join(docs)
async def generate(state):
    """
    Generate answer using RAG on retrieved documents

    Args:
        state (dict): The current graph state

    Returns:
        state (dict): New key added to state, generation, that contains LLM generation
    """
    print("---GENERATE---")
    question = state["question"]
    documents = state["documents"]
    loop_step = state.get("loop_step", 0)
    
    # RAG generation
    docs_txt = format_docs(documents)
    generation_prompt_formatted = generation_prompt.format(user_query=question, formatted_results=docs_txt)
    llm_response = await call_llama_model(create_query_payload("llama3.2:latest", generation_instructions, generation_prompt_formatted, False))
    generation = llm_response['message']['content']
    return {"documents": documents, "generation": generation, "question": question, "loop_step": loop_step+1}

In [84]:
generation_state = await generate(grade_state)

---GENERATE---


In [85]:
print(generation_state["generation"])

Here is a summary of the search results related to Chicago datasets:

1. **Chicago Communities Dataset**
	* Source: Chicago Communitites ( contributed by Rebecca (Becky) Vandewalle)
	* Type: Geospatial dataset
	* Description: A geospatial dataset of communities within the city of Chicago.
2. **Human Sentiments of Heat Exposure in Chicago**
	* Source: Intermediate Results for Human Sentiments of Heat Exposure
	* Contribution: Fangzheng Lyu
	* Type: Data analysis results
	* Description: Example intermediate results for human sentiments of heat exposure at national-level and city-level in the city of Chicago.
3. **Social Media (Twitter) Data Visualization**
	* Source: Social Media (Twitter) Data Visualization
	* Contribution: Fangzheng Lyu
	* Type: Data visualization examples
	* Description: Examples of visualization of social media data, including location-based Twitter data for the City of Chicago and worldwide.

Let me know if you'd like me to help with anything else!


In [118]:
### Hallucination Grader 

# Hallucination grader instructions 
hallucination_grader_instructions = """

You are a teacher grading a quiz. 

You will be given FACTS and a STUDENT ANSWER. 

Here is the grade criteria to follow:

(1) Ensure the STUDENT ANSWER is grounded in the FACTS. 

(2) Ensure the STUDENT ANSWER does not contain "hallucinated" information outside the scope of the FACTS.

Score:

A score of yes means that the student's answer meets all of the criteria. This is the highest (best) score. 

A score of no means that the student's answer does not meet all of the criteria. This is the lowest possible score you can give.

Explain your reasoning in a step-by-step manner to ensure your reasoning and conclusion are correct. 

Avoid simply stating the correct answer at the outset."""

# Grader prompt
hallucination_grader_prompt = """FACTS: \n\n {documents} \n\n STUDENT ANSWER: {generation}. 

Return JSON with two keys: binary_score, a 'yes' or 'no' score indicating whether the STUDENT ANSWER is grounded in the FACTS, and explanation, which contains an explanation of the score. Return only the JSON, without other explanations."""

### Answer Grader 

# Answer grader instructions 
answer_grader_instructions = """You are a teacher grading a quiz. 

You will be given a QUESTION and a STUDENT ANSWER. 

Here is the grade criteria to follow:

(1) The STUDENT ANSWER helps to answer the QUESTION

Score:

A score of yes means that the student's answer meets all of the criteria. This is the highest (best) score. 

The student can receive a score of yes if the answer contains extra information that is not explicitly asked for in the question.

A score of no means that the student's answer does not meet all of the criteria. This is the lowest possible score you can give.

Explain your reasoning in a step-by-step manner to ensure your reasoning and conclusion are correct. 

Avoid simply stating the correct answer at the outset."""

# Grader prompt
answer_grader_prompt = """QUESTION: \n\n {question} \n\n STUDENT ANSWER: {generation}. 

Return JSON with two keys: binary_score, a 'yes' or 'no' score indicating whether the STUDENT ANSWER meets the criteria, and explanation, which contains an explanation of the score. Return only the JSON, without additional explanation."""


async def grade_generation_v_documents_and_question(state, show_reason = False):
    """
    Determines whether the generation is grounded in the document and answers question

    Args:
        state (dict): The current graph state

    Returns:
        str: Decision for next node to call
    """

    print("---CHECK HALLUCINATIONS---")
    question = state["question"]
    documents = state["documents"]
    generation = state["generation"]
    max_retries = state.get("max_retries", 3) # Default to 3 if not provided

    hallucination_grader_prompt_formatted = hallucination_grader_prompt.format(documents=format_docs(documents), generation=generation)
    result = await call_llama_model(create_query_payload("llama3.2:latest", hallucination_grader_instructions, hallucination_grader_prompt_formatted, False))
    if(show_reason == True):
        print(result['message']['content'])
    grade = extract_binary_score(result['message']['content'])

    # Check hallucination
    if grade == "yes":
        print("---DECISION: GENERATION IS GROUNDED IN DOCUMENTS---")
        # Check question-answering
        print("---GRADE GENERATION vs QUESTION---")
        # Test using question and generation from above 
        answer_grader_prompt_formatted = answer_grader_prompt.format(question=question, generation=generation)
        result = await call_llama_model(create_query_payload("llama3.2:latest", answer_grader_instructions, answer_grader_prompt_formatted, False))
        if(show_reason == True):
            print(result['message']['content'])
        grade = extract_binary_score(result['message']['content'])
        if grade == "yes":
            print("---DECISION: GENERATION ADDRESSES QUESTION---")
            return "useful"
        elif state["loop_step"] <= max_retries:
            print("---DECISION: GENERATION DOES NOT ADDRESS QUESTION---")
            return "not useful"
        else:
            print("---DECISION: MAX RETRIES REACHED---")
            return "max retries"  
    elif state["loop_step"] <= max_retries:
        print("---DECISION: GENERATION IS NOT GROUNDED IN DOCUMENTS, RE-TRY---")
        return "not supported"
    else:
        print("---DECISION: MAX RETRIES REACHED---")
        return "max retries"

In [124]:
generation_state = await generate(grade_state)
verdict = await grade_generation_v_documents_and_question(generation_state, show_reason = True)
while verdict != "useful":
    if verdict == "not supported" or verdict == "not useful":
        generation_state = await generate(generation_state)
        verdict = await grade_generation_v_documents_and_question(generation_state, show_reason = True)
    elif verdict == "max retries":
        break

---GENERATE---
---CHECK HALLUCINATIONS---
{
  "binary_score": "yes",
  "explanation": "The student answer accurately summarizes all three datasets and their contributors, ensuring it is grounded in the provided FACTS."
}
---DECISION: GENERATION IS GROUNDED IN DOCUMENTS---
---GRADE GENERATION vs QUESTION---
```
{
  "binary_score": "yes",
  "explanation": "The student answer provides specific and relevant dataset information about Chicago, including geospatial data, human sentiments, and social media data. The datasets are not just listed but also include metadata (creator names) which is additional context."
}
```
---DECISION: GENERATION ADDRESSES QUESTION---
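
<p style="font-size: 16px; font-family: Arial, sans-serif; color: #333;">
    The generate-and-grade loop can also be written with an explicit attempt counter, so that it always terminates even if the grader returns an unexpected verdict. The following is a minimal synchronous sketch of that pattern. <code>run_with_retries</code>, <code>fake_generate</code>, and <code>fake_grade</code> are hypothetical stand-ins for the async functions above.
</p>

```python
def run_with_retries(generate, grade, state, max_retries=3):
    """Regenerate until the grader says 'useful' or the retry budget is spent."""
    for attempt in range(1, max_retries + 1):
        state = generate(state)
        verdict = grade(state)
        if verdict == "useful":
            return state, verdict, attempt
    return state, "max retries", max_retries


def fake_generate(state):
    """Hypothetical generator: only counts how many times it was called."""
    return {**state, "loop_step": state.get("loop_step", 0) + 1}


def fake_grade(state):
    """Hypothetical grader: accepts the answer from the second attempt on."""
    return "useful" if state["loop_step"] >= 2 else "not useful"


final_state, verdict, attempts = run_with_retries(fake_generate, fake_grade, {})
```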


In [72]:
generation

'useful'