# RAG Project 2nd Phase - General Handbook / Querying Database and Prompting LLM
## Querying Pinecone
I will start by importing the required libraries and initializing the connections to Pinecone and the Gemini API.

In [16]:
from pinecone import Pinecone
import google.generativeai as genai

from dotenv import load_dotenv
import os
load_dotenv()

## Initializing connection to Pinecone index
pinecone_api_key = os.environ.get("PINECONE_API_KEY")

pc = Pinecone(api_key=pinecone_api_key)
index = pc.Index("general-handbook-2")

## Initializing Google API to generate embeddings
google_api_key = os.environ.get("GOOGLE_API_KEY")

genai.configure(api_key=google_api_key)

Now, I will create a function that sends queries to Pinecone and returns the response.

In [17]:
import textwrap

def query_pinecone_with_id(query_text, top_k=5, context_window=3):
    """
    Query Pinecone with the given query text and return a context window based on the reference vector's ID.
    
    Args:
        query_text (str): The query text to search.
        top_k (int): Number of nearest matches to request from Pinecone; only the best match is used as the reference vector.
        context_window (int): Number of surrounding vectors to include before and after the reference vector.
        
    Returns:
        str: A newline-joined string with the reference metadata at the top, followed by the surrounding content.
    """
    
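    # Step 1: Embed the query text using the Gemini API.
    # "retrieval_query" is the task type intended for search queries; the `title`
    # argument only applies to "retrieval_document" embeddings, so it is omitted.
    query_vector = genai.embed_content(
        model="models/embedding-001",
        content=query_text,
        task_type="retrieval_query"
    )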

    # Step 2: Query Pinecone for the top_k nearest matches; the best one becomes the reference vector
    results = index.query(
        namespace="general-handbook-vectors-2",
        vector=query_vector["embedding"],
        top_k=top_k,
        include_values=False,
        include_metadata=True
    )

    # Step 3: Find the top result (reference vector) and its ID
    top_result = results["matches"][0]
    reference_id = int(top_result["id"])  # Convert the ID to an integer
    
    # Step 4: Retrieve the exact preceding and following vectors using the reference ID
    surrounding_ids = list(range(reference_id - context_window, reference_id + context_window + 1))

    # Step 5: Query Pinecone for each surrounding vector by ID
    context_matches = []
    for vector_id in surrounding_ids:
        try:
            result = index.query(
                namespace="general-handbook-vectors-2",
                id=str(vector_id),  # Use the vector ID as a string
                top_k=1,  # Get just the vector with this ID
                include_values=False,
                include_metadata=True
            )
            context_matches.append(result["matches"][0])  # Add the result to the context matches list
        except Exception:
            # If no vector with this ID exists (e.g. near the start or end of the index), skip it
            continue
        
    # Step 6: Format the metadata for the reference vector
    reference_metadata = textwrap.dedent(f"""
    Reference Metadata From Top Result:
    Chapter: {top_result['metadata']['chapter']}
    Section Title: {top_result['metadata']['title']}
    Section Number: {top_result['metadata']['section']}
    Url: {top_result['metadata']['chapter_url']}
    --------------------------------------------------""")

    # Step 7: Format the surrounding context (excluding metadata, just content)
    preceding_context = []
    following_context = []
    
    for match in context_matches:
        if match["id"] == str(reference_id):
            continue  # Skip the reference vector itself
        elif int(match["id"]) < reference_id:
            preceding_context.append(textwrap.dedent(f"Content: {match['metadata']['content']} (Chapter: {match['metadata']['chapter']}, Section: {match['metadata']['section']} {match['metadata']['title']})"))
        else:
            following_context.append(textwrap.dedent(f"Content: {match['metadata']['content']} (Chapter: {match['metadata']['chapter']}, Section: {match['metadata']['section']} {match['metadata']['title']})"))

    # Step 8: Combine the reference metadata, preceding context, reference content, and following context
    reference_content = f"Content: {top_result['metadata']['content']} (Chapter: {top_result['metadata']['chapter']}, Section: {top_result['metadata']['section']} {top_result['metadata']['title']})"
    final_output = [reference_metadata] + preceding_context + [reference_content] + following_context
    
    print("Top result: ", top_result["metadata"]["content"]) # Identify the top result by printing its content

    match_string = "\n".join(final_output)

    return match_string
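Step 5 above sends one query per neighboring ID. As a rough alternative, the neighbors could be looked up in a single call. The sketch below assumes the Pinecone client's `fetch(ids=..., namespace=...)` method returns an object whose `vectors` attribute maps each found ID to a record with a `metadata` attribute; check this against the installed client version. `fetch_context` and `FakeIndex` are hypothetical names, and the lower bound of 0 assumes IDs are consecutive integers starting at 0, which the notebook does not state.

```python
from types import SimpleNamespace


def fetch_context(index, reference_id, context_window, namespace):
    """Return (id, metadata) pairs for the vectors around reference_id, in ID order."""
    # Assumption: IDs are consecutive integers starting at 0, stored as strings.
    first = max(0, reference_id - context_window)
    wanted = [str(i) for i in range(first, reference_id + context_window + 1)]

    # One lookup for all neighbors; IDs that do not exist are simply absent.
    response = index.fetch(ids=wanted, namespace=namespace)

    return [
        (vector_id, response.vectors[vector_id].metadata)
        for vector_id in wanted
        if vector_id in response.vectors
    ]


class FakeIndex:
    """Hypothetical stand-in for a Pinecone index, used only to run the sketch offline."""

    def __init__(self, stored):
        self.stored = stored

    def fetch(self, ids, namespace):
        found = {i: SimpleNamespace(metadata=self.stored[i]) for i in ids if i in self.stored}
        return SimpleNamespace(vectors=found)


fake = FakeIndex({str(i): {"content": f"chunk {i}"} for i in range(4)})
context = fetch_context(fake, reference_id=1, context_window=3, namespace="demo")
print([vector_id for vector_id, _ in context])  # ['0', '1', '2', '3']
```

With a real index, the returned list already contains the reference vector itself, so the caller can split it into preceding and following context by comparing IDs, as Step 7 does.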

In [22]:
user_input = "What are the questions for the temple recommend interview?"

match_string = query_pinecone_with_id(user_input, top_k=5)

print(match_string)

Top result:  The temple is the house of the Lord. Entering the temple and participating in ordinances there is a sacred privilege. This privilege is reserved for those who are spiritually prepared and striving to live the Lord’s standards, as determined by authorized priesthood leaders.To make this determination, priesthood leaders interview the member using the questions below. Leaders should not add or remove any requirements. However, they may adapt the questions to the age and circumstances of the member.Sometimes members have questions during a temple recommend interview. The priesthood leader may explain basic gospel principles. He may also help members understand the temple recommend questions if needed. However, he should not present his personal beliefs, preferences, or interpretations as Church doctrine or policy.Temple recommend interviews should not be rushed. They should be private. However, the person being interviewed may invite a spouse, parent, or another adult to be p
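Steps 7 and 8 of the function build the same "Content: ... (Chapter: ..., Section: ...)" line in three places. A minimal sketch of folding that into one helper follows; `format_match` is a hypothetical name, and the sample content is a placeholder rather than real handbook text. Building the line in one place makes it harder for the reference chunk to pick up another chunk's chapter and section by accident.

```python
def format_match(metadata):
    """Render one chunk as 'Content: ... (Chapter: ..., Section: ... ...)'."""
    return (
        f"Content: {metadata['content']} "
        f"(Chapter: {metadata['chapter']}, "
        f"Section: {metadata['section']} {metadata['title']})"
    )


# Metadata keys match the ones used in query_pinecone_with_id; the content is a placeholder.
sample = {
    "content": "Example text.",
    "chapter": "26. Temple Recommends",
    "section": "26.3.3",
    "title": "Conducting a Temple Recommend Interview",
}
print(format_match(sample))
```

The same call would then work for the reference vector (`format_match(top_result["metadata"])`) and for every neighbor (`format_match(match["metadata"])`).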

## Prompting Gemini with queried vectors
Now, I will create a function that takes the results of the Pinecone query and uses them to answer the user's question.

First, let's import the necessary libraries and create the helper functions for the initial setup.

In [23]:
from IPython.display import Markdown
import textwrap

def to_markdown(text):
  text = text.replace('•', '  *')
  return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))

text_generator = genai.GenerativeModel('gemini-1.5-flash')

I will first try a test prompt and adjust it before creating the functions that put it all together.

In [24]:
promt = f"""
Your role:
You are a chatbot that answers questions about the General Handbook of The Church of Jesus Christ of Latter-day Saints.

Instructions:
Below is a list of information received from a vector database where you might find information to answer the user's question about the General Handbook.
Each chunk of information contains a title, url, and content. You will mainly answer questions from the content, but try also to include references for the user.

Information:
START OF INFORMATION

{match_string}
END OF INFORMATION

(Further instructions: If the user seems to be asking about something not related to the General Handbook, invite them to ask about the General Handbook instead. If instead of a question they seem to be thanking you for previous answers, show that you welcome their gratitude)
User Input or question:

{user_input}
"""
print(promt)


Your role:
You are a chatbot that answers questions about the General Handbook of The Church of Jesus Christ of Latter-day Saints.

Instructions:
Below is a list of information received from a vector database where you might find information to answer the user's question about the General Handbook.
Each chunk of information contains a title, url, and content. You will mainly answer questions from the content, but try also to include references for the user.

Information:
START OF INFORMATION


Reference Metadata From Top Result:
Chapter: 26. Temple Recommends
Section Title: Conducting a Temple Recommend Interview
Section Number: 26.3.3
Url: https://www.churchofjesuschrist.org/study/manual/general-handbook/26-temple-recommends?lang=eng
--------------------------------------------------
Content: Authorized priesthood leaders conduct interviews before a member can receive a temple recommend. Instructions are in LCR. Priesthood leaders should issue a recommend only if the member answers th
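Since the same kind of prompt is assembled again later inside the chat handler, the template could live in one place. The sketch below is one possible way to do that, with an abbreviated version of the wording above; `PROMPT_TEMPLATE` and `build_prompt` are hypothetical names. The template is dedented once before the retrieved text is substituted in, so the prompt lines carry no leading indentation even when the helper is called from inside an indented function.

```python
import textwrap

# Abbreviated template; the full role and instruction text would go here.
PROMPT_TEMPLATE = textwrap.dedent("""\
    Your role:
    You are a chatbot that answers questions about the General Handbook.

    Information:
    START OF INFORMATION
    {information}
    END OF INFORMATION

    User Input or question:
    {question}
    """)


def build_prompt(information, question):
    """Fill the template with retrieved context and the user's question."""
    return PROMPT_TEMPLATE.format(information=information, question=question)


print(build_prompt("Content: example chunk", "What is chapter 26 about?"))
```

Because `str.format` only interprets braces in the template itself, curly braces that happen to appear in the retrieved handbook text are inserted unchanged.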

In [25]:
response = text_generator.generate_content(promt)

to_markdown(response.text)

> The General Handbook outlines the following questions for temple recommend interviews:
> 
> * Do you have faith in and a testimony of God, the Eternal Father; His Son, Jesus Christ; and the Holy Ghost?
> * Do you have a testimony of the Atonement of Jesus Christ and of His role as your Savior and Redeemer?
> * Do you have a testimony of the Restoration of the gospel of Jesus Christ?
> * Do you sustain the President of The Church of Jesus Christ of Latter-day Saints as the prophet, seer, and revelator and as the only person on the earth authorized to exercise all priesthood keys? Do you sustain the members of the First Presidency and the Quorum of the Twelve Apostles as prophets, seers, and revelators? Do you sustain the other General Authorities and local leaders of the Church?
> * The Lord has said that all things are to be “done in cleanliness” before Him (Doctrine and Covenants 42:41). Do you strive for moral cleanliness in your thoughts and behavior? Do you obey the law of chastity?
> * Do you follow the teachings of the Church of Jesus Christ in your private and public behavior with members of your family and others?
> * Do you support or promote any teachings, practices, or doctrine contrary to those of The Church of Jesus Christ of Latter-day Saints?
> * Do you strive to keep the Sabbath day holy, both at home and at church; attend your meetings; prepare for and worthily partake of the sacrament; and live your life in harmony with the laws and commandments of the gospel?
> * Do you strive to be honest in all that you do?
> * Are you a full-tithe payer? For new members seeking a recommend to perform proxy baptisms and confirmations: Are you willing to obey the commandment to pay tithing?
> * Do you understand and obey the Word of Wisdom?
> * (This question is omitted when interviewing a child or youth.) Do you have any financial or other obligations to a former spouse or to children? If yes, are you current in meeting those obligations?
> * (This question is omitted when interviewing a member who is not endowed.) Do you keep the covenants that you made in the temple?
> * (This question is omitted when interviewing a member who is not endowed.) Do you honor your sacred privilege to wear the garment as instructed in the initiatory ordinances? (Read the “Wearing the Temple Garment” statement, included below, to each member.)
> * Are there serious sins in your life that need to be resolved with priesthood authorities as part of your repentance?
> * Do you consider yourself worthy to enter the Lord’s house and participate in temple ordinances? 
> 
> These questions are found in the General Handbook, Chapter 26, Section 26.3.3.1. 


Finally, I will use Gradio for the UI, combining the `query_pinecone_with_id` function with Gemini's `generate_content` method to answer the user's questions.

In [15]:
import gradio as gr

def format_chat_prompt(message, chat_history):
    prompt = ""
    for turn in chat_history:
        user_message, bot_message = turn
        prompt = f"{prompt}\nUser: {user_message}\nAssistant: {bot_message}"
    prompt = f"{prompt}\nUser: {message}\nAssistant:"
    return prompt

def respond(message, chat_history):
        
        # Retrieve the top match and its surrounding context from Pinecone
        information = query_pinecone_with_id(message, top_k=7, context_window=5)
        print(information)

        full_prompt = f"""
        Your role:
        You are a chatbot that answers questions about the General Handbook of The Church of Jesus Christ of Latter-day Saints.

        Instructions:
        Below is a list of information received from a vector database where you might find information to answer the user's question about the General Handbook.
        The information contains a chapter, section title and number, and a url from the top result. It also contains the content of the top result and of its neighboring results (the context window).
        You will answer questions using the content. Always include references for the user if available (Chapter, section title, section number, and url), including a link using the URL of the top result.
        If the user wants to speak in another language, use their language in your answer.

        Information:
        START OF INFORMATION

        {information}

        END OF INFORMATION

        Further instructions: 
        If the user seems to be asking about something not related to the General Handbook, invite them to ask about the General Handbook instead.
        If instead of a question they seem to be thanking you for previous answers, show that you welcome their gratitude.

        User Input or Question:
        {message}
        """


        formatted_prompt = format_chat_prompt(full_prompt, chat_history)
        bot_message = text_generator.generate_content(formatted_prompt).text
        chat_history.append((message, bot_message))
        return "", chat_history

with gr.Blocks(theme=gr.themes.Soft(), css=".main {max-width: 800px; margin: auto}") as demo:
    # Add a title and description at the top
    gr.Markdown("""
    # General Handbook Chatbot
    Ask questions about the General Handbook of The Church of Jesus Christ of Latter-day Saints.
    
    Type your question below and get relevant answers based on the content of the handbook.
    """)
    
    chatbot = gr.Chatbot(height=480)  # Fixed height for the chatbot box
    msg = gr.Textbox(label="Question")
    btn = gr.Button("Submit")
    clear = gr.ClearButton(components=[msg, chatbot], value="Clear console")
    
    # Add clickable example questions at the bottom
    examples = gr.Examples(
        examples=["What are the responsibilities of the elders quorum president?", 
                  "What are the questions for the temple recommend interview?", 
                  "What does the church say about immigration?",
                  "Ward and stake callings chart."],
        inputs=msg
    )

    # Button click and submit actions
    btn.click(respond, inputs=[msg, chatbot], outputs=[msg, chatbot])
    msg.submit(respond, inputs=[msg, chatbot], outputs=[msg, chatbot])

gr.close_all()
demo.launch(share=True)

Running on local URL:  http://127.0.0.1:7870
Running on public URL: https://12d7ba7475210f882d.gradio.live


Top result:  

Reference Metadata From Top Result:
Chapter: 38. Church Policies and Guidelines
Section Title: Internet
Section Number: 38.8.20
Url: https://www.churchofjesuschrist.org/study/manual/general-handbook/38-church-policies-and-guidelines?lang=eng
--------------------------------------------------
Content: The Church encourages self-reliance. Members are encouraged to be spiritually and physically prepared for life’s challenges. See 22.1.However, Church leaders have counseled against extreme or excessive preparation for possible catastrophic events. Such efforts are sometimes called survivalism. Efforts to prepare should be motivated by faith, not fear.Church leaders have counseled members not to go into debt to establish food storage. Instead, members should establish a home storage supply and a financial reserve over time. See 22.1.4 and “Food Storage” (Topics and Questions, topics.ChurchofJesusChrist.org). (Chapter: 38. Church Policies and Guidelines, Section: 38.8.15 Extre
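The chat handler depends on two network calls: retrieval from Pinecone and generation with Gemini. A rough sketch of how the same flow could be tried offline is to pass both in as plain functions. Here `format_chat_prompt` is copied from the cell above, while `answer`, `retrieve`, `generate`, and `fake_generate` are hypothetical names and the prompt wording is abbreviated.

```python
def format_chat_prompt(message, chat_history):
    """Flatten earlier turns plus the new message into one prompt string."""
    prompt = ""
    for user_message, bot_message in chat_history:
        prompt = f"{prompt}\nUser: {user_message}\nAssistant: {bot_message}"
    return f"{prompt}\nUser: {message}\nAssistant:"


def answer(message, chat_history, retrieve, generate):
    """Retrieve context, build the prompt, and return the history with the new turn added."""
    information = retrieve(message)
    full_prompt = f"Information:\n{information}\n\nUser Input or Question:\n{message}"
    bot_message = generate(format_chat_prompt(full_prompt, chat_history))
    return chat_history + [(message, bot_message)]


seen_prompts = []


def fake_generate(prompt):
    """Stub generator that records the prompt it receives."""
    seen_prompts.append(prompt)
    return "stub answer"


def fake_retrieve(query):
    """Stub retriever standing in for query_pinecone_with_id."""
    return "Content: stub chunk"


history = answer("First question", [], fake_retrieve, fake_generate)
history = answer("Second question", history, fake_retrieve, fake_generate)
print(seen_prompts[1])
```

Printing the second prompt shows that earlier turns are replayed as short `User:`/`Assistant:` lines, while only the newest turn carries the retrieved handbook context.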