# Setup

## Environment

1. Create a virtual environment or similar (this was built with Python 3.10, but 3.11 should work too), and install `requirements.txt`:
    ```bash
    pip install -r requirements.txt
    ```
2. Set up Google Cloud Application Default Credentials (see [this doc](https://cloud.google.com/docs/authentication/provide-credentials-adc)).
3. Copy the `.env.template` file to `.env` and set keys and other information as indicated.

## Connection Setup

Load the `.env` file into the Python environment:

In [18]:
from dotenv import load_dotenv
load_dotenv(override=True)  

True
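
The cells that follow read several settings from the environment, some with fallbacks. As a quick sanity check, a minimal sketch like the one below can report which values are in effect and which are missing. The variable names and defaults are copied from the cells in this notebook; `resolve_settings` is a hypothetical helper, and any Astra DB variables your `.env.template` lists are not covered here.

```python
import os

# Names and fallbacks mirror the cells in this notebook.
# GOOGLE_PROJECT_NAME has no fallback there, so it is marked None here.
EXPECTED = {
    "GOOGLE_PROJECT_NAME": None,
    "GOOGLE_LOCATION": "us-east1",
    "GOOGLE_LLM": "gemini-1.5-flash",
    "GOOGLE_EMBED_MODEL": "multimodalembedding",
}

def resolve_settings(env=os.environ):
    """Return (settings, missing): effective values and names with no value."""
    settings, missing = {}, []
    for name, default in EXPECTED.items():
        # Unlike os.environ.get(name, default), an empty string also falls back.
        value = env.get(name) or default
        if value is None:
            missing.append(name)
        else:
            settings[name] = value
    return settings, missing

settings, missing = resolve_settings()
if missing:
    print("Not set and no default:", ", ".join(missing))
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to try without touching the real environment.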

Initialize VertexAI:

In [2]:
import os
import vertexai
vertexai.init(
    project=os.environ.get("GOOGLE_PROJECT_NAME"),
    location=os.environ.get("GOOGLE_LOCATION",'us-east1'),
)

Initialize LangChain VertexAI components:

In [3]:
from langchain_google_vertexai import VertexAI
from langchain_google_vertexai import ChatVertexAI
from langchain_google_vertexai import VertexAIEmbeddings
from vertexai.generative_models import GenerativeModel

llmModel = VertexAI(model_name=os.environ.get('GOOGLE_LLM','gemini-1.5-flash'))
chatModel = ChatVertexAI(model=os.environ.get('GOOGLE_LLM','gemini-1.5-flash'))
embedModel = VertexAIEmbeddings(model_name=os.environ.get('GOOGLE_EMBED_MODEL','multimodalembedding')) 
genModel = GenerativeModel(model_name=os.environ.get('GOOGLE_LLM','gemini-1.5-flash'))

Initialize Cassio (Astra DB):

In [4]:
import cassio
cassio.init(auto=True)

And establish the graph store:

In [5]:
from ragstack_langchain.graph_store import CassandraGraphStore     

SITE_PREFIX="travel_docs"
graph_store = CassandraGraphStore(
    embedModel,
    node_table=f"{SITE_PREFIX}_nodes",
    edge_table=f"{SITE_PREFIX}_edges")

# Create and Load LangChain `Document`s

The example `Tourbook.pdf` is fairly complex in structure, both digitally and visually. 

## Text

Several parsing tools, including Unstructured and Adobe ExtractAPI, were tried on `Tourbook.pdf`, using both file-structure and OCR techniques, without success. The Vertex LLM was able to parse the file with a fairly generic prompt, but it stopped early after determining that it was repeating existing content, and it cited the URL of that content!

Because this notebook aims to demonstrate multi-modal embedding and retrieval rather than PDF parsing, the content was parsed manually and saved in the file `Tourbook.json`. The file covers the first 24 pages, minus the cover page, the table of contents, and a map on page 3.

In [80]:
from langchain_core.documents import Document
from ragstack_knowledge_store.link_tag import BidirLinkTag
import json

with open('Tourbook.json', 'r', encoding='utf-8') as file:
    text_data = json.load(file)

text_documents = []
h1_dict = {}

for i, entry in enumerate(text_data):
    # Capture the H1 header for each page for future reference
    h1 = entry['metadata'].get('h1')
    if h1:
        h1_dict[entry['metadata']['page_number']] = h1

    mime_type = entry['metadata'].get('mime_type','text/plain').split('/')
    entry['metadata']['mime_type'] = mime_type[0]
    if len(mime_type) > 1:
        entry['metadata']['mime_subtype'] = '/'.join(mime_type[1:])

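    # Identify header keys of the form h<N> (h1, h2, ...) and sort them by level.
    # Checking for digits avoids a ValueError on other keys that start with 'h'.
    headers = sorted(
        (key for key in entry['metadata'] if key[:1] == 'h' and key[1:].isdigit()),
        key=lambda x: int(x[1:]),
    )
    if headers:
        # The deepest (highest-numbered) header is last in the sorted list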
        lowest_header = headers[-1]
        header_content = entry['metadata'][lowest_header]
        link_header = BidirLinkTag(kind=lowest_header, tag=header_content)
        entry['metadata']['link_tags'] = [link_header]
    
    doc = Document(page_content=entry['page_content'], metadata=entry['metadata'])
    text_documents.append(doc)


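Note the `metadata.link_tags` list: each text document gets a bidirectional link tag built from the deepest header level present in its metadata (`h1` when that is the only header), so documents that share that header are linked to each other in both directions. Images, handled in the next section, are always tagged with their page's H1 header, which corresponds to the section.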
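
As a rough illustration of how the tagging in the cell above groups documents, the sketch below picks the deepest `hN` key from each metadata dictionary and collects the documents that share the resulting (kind, tag) pair. The entries are hypothetical stand-ins for `Tourbook.json`, plain tuples replace `BidirLinkTag`, and the assumption that matching pairs become graph links reflects how bidirectional link tags are described, not a reading of the library's code.

```python
from collections import defaultdict

def deepest_header(metadata):
    """Return (key, value) for the highest-numbered hN key, or None."""
    headers = [k for k in metadata if k[:1] == "h" and k[1:].isdigit()]
    if not headers:
        return None
    key = max(headers, key=lambda k: int(k[1:]))
    return key, metadata[key]

# Hypothetical entries standing in for Tourbook.json metadata.
entries = [
    {"id": "a", "h1": "Golden Triangle"},
    {"id": "b", "h1": "Golden Triangle", "h2": "Jaipur"},
    {"id": "c", "h1": "Golden Triangle", "h2": "Jaipur"},
    {"id": "d", "h1": "Kerala"},
]

# Documents that carry the same (kind, tag) pair are assumed to be linked.
groups = defaultdict(list)
for entry in entries:
    tag = deepest_header(entry)
    if tag:
        groups[tag].append(entry["id"])

for (kind, tag), ids in groups.items():
    print(kind, tag, ids)
```

In this toy data, documents `b` and `c` are grouped under their shared `h2`, while `a` is tagged only by `h1` and so is not grouped with them. Whether that matters for the real data depends on which header levels `Tourbook.json` actually contains.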

## Images

For images, we will use `PyMuPDF` to extract images from the document, `base64` encode the image, and create a `Document` referencing the appropriate H1 heading for the page. 
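
The image documents built in the next cell carry their payload as a `data:` URI in `page_content`. A minimal sketch of that encoding, and of reversing it, is shown here with the standard library only; `to_data_uri` and `from_data_uri` are hypothetical helper names, and the PNG signature bytes stand in for real image data.

```python
import base64

def to_data_uri(raw: bytes, mime: str = "image/png") -> str:
    """Wrap raw bytes in a base64 data: URI."""
    encoded = base64.b64encode(raw).decode("ascii")
    return f"data:{mime};base64,{encoded}"

def from_data_uri(uri: str) -> tuple[str, bytes]:
    """Split a base64 data: URI back into (mime type, raw bytes)."""
    header, _, payload = uri.partition(",")
    mime = header.removeprefix("data:").removesuffix(";base64")
    return mime, base64.b64decode(payload)

# The 8-byte PNG file signature stands in for real image data here.
png_signature = b"\x89PNG\r\n\x1a\n"
uri = to_data_uri(png_signature)
print(uri)
```

Decoding the URI again is a cheap way to confirm that the stored string still holds the original bytes before it is sent for embedding.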

In [81]:
import pymupdf
import base64

doc = pymupdf.open('Tourbook.pdf')
image_documents = []

# page_index is 0-based, so these are physical pages 3 and 5-23 of the PDF;
# the metadata numbers them 1 and 3-21 (page_index - 1).
pages_to_process = [2] + list(range(4, 23))  

for page_index in pages_to_process:
    page = doc[page_index]
    image_list = page.get_images()
    adjusted_page_number = page_index - 1

    # Iterate over the images on the page
    for image_index, img in enumerate(image_list, start=1):
        xref = img[0]
        pix = pymupdf.Pixmap(doc, xref) 
        if pix.n - pix.alpha > 3:
            pix = pymupdf.Pixmap(pymupdf.csRGB, pix)

        base64_image = base64.b64encode(pix.tobytes(output="png")).decode('utf-8')

        if adjusted_page_number % 2 == 0:  # If it's even
            page_spread = f"{adjusted_page_number}-{adjusted_page_number+1}"
        else:  # If it's odd
            page_spread = f"{adjusted_page_number-1}-{adjusted_page_number}"

        h1 = h1_dict[adjusted_page_number]
        link_h1 = BidirLinkTag(kind="h1", tag=h1)
        doc_metadata = {
            "mime_type": "image",
            "mime_subtype": "png",
            "mime_encoding": "base64",
            "page_number": adjusted_page_number, 
            "page_spread": page_spread, 
            "image_index": image_index,
            "h1": h1, 
            "link_tags": [link_h1],
        }
        # In theory, VertexAIEmbeddings.embed_image() in langchain_google_vertexai calls
        # ImageBytesLoader.load_bytes, which should accept a base64 string, but that was
        # not working. Passing the image as a data: URI does work.
        image_doc = Document(page_content=f"data:image/png;base64,{base64_image}", metadata=doc_metadata)
        image_documents.append(image_doc)
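
The page arithmetic in the cell above is easy to get wrong, so here is the same logic pulled out into two small functions that can be checked in isolation. The function names are hypothetical; the offsets and the even/odd rule are copied from the cell.

```python
def printed_page_number(page_index: int) -> int:
    """Map a 0-based PDF page index to the number stored in the metadata."""
    return page_index - 1

def page_spread(page_number: int) -> str:
    """Return the two-page spread, with the even page on the left."""
    left = page_number if page_number % 2 == 0 else page_number - 1
    return f"{left}-{left + 1}"

pages_to_process = [2] + list(range(4, 23))
numbers = [printed_page_number(i) for i in pages_to_process]
print(numbers[0], numbers[1], numbers[-1])   # first, second, and last page number
print(page_spread(3), page_spread(4))
```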

## Load Knowledge Store

In [82]:
docs = []
#for doc in text_documents + image_documents:
for doc in text_documents:
    docs.append(doc)

    if len(docs) >= 50:
        print("saving batch")
        graph_store.add_documents(docs)
        docs.clear()

if docs:
    print("saving batch")
    graph_store.add_documents(docs)

saving batch
saving batch
saving batch
saving batch
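
The batching loop above can also be written as a small generator, which keeps the "flush the remainder" step from being forgotten. This is a sketch only: `batches` is a hypothetical helper, the list `saved` stands in for calls to `graph_store.add_documents`, and the count of 170 items is invented so that the run produces four batches like the output above.

```python
from itertools import islice

def batches(items, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while batch := list(islice(iterator, size)):
        yield batch

# Stand-in for graph_store.add_documents: just record each batch size.
saved = []
for batch in batches(range(170), 50):
    saved.append(len(batch))
print(saved)
```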


# Query Examples

In [21]:
from langchain_core.messages import HumanMessage

from langchain_core.prompts import ChatPromptTemplate

template = """
You are a helpful travel agent bot. 
You should provide answers to the traveller's questions in a manner that encourages them to travel, without sounding American. 
Include specific details of things to see and do, so that you appear knowledgeable about the destination. Give a detailed itinerary.
Answer the question based only on the following context:
{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)

def format_docs(docs):
    formatted = "\n\n".join(f"From {doc.metadata['content_id']}: {doc.page_content}" for doc in docs)
    return formatted
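
To see what the `{context}` slot of the prompt receives, the sketch below runs the same formatting logic on two stand-in documents. `FakeDocument` is a hypothetical replacement for LangChain's `Document` so the example runs without any dependencies, and the `content_id` values are invented; in the notebook that key is assumed to be present in the metadata of retrieved documents.

```python
from dataclasses import dataclass, field

@dataclass
class FakeDocument:
    """Stand-in exposing the two attributes that format_docs reads."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def format_docs(docs):
    # Same formatting as the notebook cell: one "From <id>: <text>" block per document.
    return "\n\n".join(
        f"From {doc.metadata['content_id']}: {doc.page_content}" for doc in docs
    )

docs = [
    FakeDocument("Jaipur is known as the Pink City.", {"content_id": "doc-1"}),
    FakeDocument("Agra is home to the Taj Mahal.", {"content_id": "doc-2"}),
]
context = format_docs(docs)
print(context)
```

Note that a document without a `content_id` key raises a `KeyError` here, so `doc.metadata.get('content_id', ...)` may be a safer choice if documents can come from other sources.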

In [14]:
from IPython.display import display, Markdown

# Helper method to render markdown in responses to a chain.
def run_and_render(chain, question):
    result = chain.invoke(question)
    display(Markdown(result))

In [83]:
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

# Depth 0 does not traverse any edges, so it is equivalent to plain vector similarity search.
vector_retriever = graph_store.as_retriever(search_kwargs={"depth": 0})

vector_rag_chain = (
    {"context": vector_retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | chatModel
    | StrOutputParser()
)

QUESTION="What can I do in the golden triangle?"
run_and_render(vector_rag_chain, QUESTION)

Retrying langchain_google_vertexai.chat_models._completion_with_retry.<locals>._completion_with_retry_inner in 4.0 seconds as it raised ServiceUnavailable: 503 Getting metadata from plugin failed with error: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')).
Retrying langchain_google_vertexai.chat_models._completion_with_retry.<locals>._completion_with_retry_inner in 4.0 seconds as it raised ServiceUnavailable: 503 Connection reset.


Ah, the Golden Triangle - a true gem of India!  It's a journey that will transport you through history, culture, and vibrant colours.  

You'll find yourself in Delhi, the bustling heart of India, where you can wander through the ancient bazaars of Chandni Chowk, explore the majestic Red Fort, and delve into the spiritual heart of the city at the Lotus Temple. 

Then, Agra beckons, home to the timeless Taj Mahal, a testament to eternal love and architectural brilliance.  Be sure to witness the sunrise over this ethereal masterpiece - it's a moment you'll never forget! 

Finally, you'll arrive in Jaipur, the 'Pink City', where palaces shimmer in the desert sun and the Hawa Mahal, the 'Palace of Winds', stands as a testament to the city's rich history.  Don't miss the opportunity to explore the vibrant markets and witness the local artisans at work.  

This is just a taste of what awaits you in the Golden Triangle.  Let your senses be captivated by the vibrant colours, the enticing aromas, and the warmth of the Indian people.  

Now, would you like to explore a few ways to enhance your Golden Triangle adventure? Perhaps extend your journey with a river cruise on the Brahmaputra or add a few days in the mystical region of Rajasthan?  


In [84]:
# Depth 1 does vector similarity and then traverses 1 level of edges.
graph_retriever = graph_store.as_retriever(search_kwargs={"depth": 1})

graph_rag_chain = (
    {"context": graph_retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | chatModel
    | StrOutputParser()
)
run_and_render(graph_rag_chain, QUESTION)

The Golden Triangle is a classic journey through India, taking you to three incredible cities: Delhi, Agra, and Jaipur. 

In Delhi, you'll be swept away by the energy of the bustling capital. Explore the magnificent Red Fort and the Jama Masjid, one of India's largest mosques.  You can also visit the impressive Qutub Minar, a UNESCO World Heritage Site, or lose yourself in the colourful streets of Chandni Chowk, a bustling bazaar overflowing with spices, fabrics, and everything imaginable.

Next, journey to Agra, home to the iconic Taj Mahal, a monument to love built by Mughal Emperor Shah Jahan for his beloved wife Mumtaz Mahal.  You'll be mesmerized by its beauty, built entirely from white marble, and adorned with intricate details. Agra Fort, another UNESCO World Heritage Site, offers breathtaking views and showcases the architectural mastery of the Mughal era.

Finally, head to Jaipur, the vibrant Pink City, known for its magnificent forts, palaces, and bustling bazaars.  Explore the majestic Amber Fort, atop a hill overlooking the city, and learn about the colourful history of the region. Visit the City Palace, a stunning complex of courtyards and gardens, or wander through the Hawa Mahal, the Palace of Winds, with its intricate latticework windows.  Jaipur is a feast for the senses, with its vibrant bazaars, offering everything from textiles and jewellery to spices and handicrafts.

The Golden Triangle offers a glimpse into the rich history, vibrant culture, and breathtaking beauty of India.  What are you waiting for? Start planning your journey today! 
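
To make the effect of `depth` concrete, here is a toy model of the idea: start from the documents found by vector similarity, then repeatedly add documents that share a tag with anything already found. This is not the library's implementation, only a sketch of the concept; the documents, tags, and the `traverse` helper are all hypothetical.

```python
from collections import defaultdict

def traverse(seeds, tags_by_doc, depth):
    """Expand seed documents along shared tags, up to `depth` hops."""
    docs_by_tag = defaultdict(set)
    for doc, tags in tags_by_doc.items():
        for tag in tags:
            docs_by_tag[tag].add(doc)

    visited = set(seeds)
    frontier = set(seeds)
    for _ in range(depth):
        neighbours = set()
        for doc in frontier:
            for tag in tags_by_doc[doc]:
                neighbours |= docs_by_tag[tag]
        # Only documents not seen before form the next frontier.
        frontier = neighbours - visited
        visited |= frontier
    return visited

# Hypothetical documents and tags; the seed plays the role of a vector hit.
tags_by_doc = {
    "delhi": {"Golden Triangle"},
    "agra": {"Golden Triangle"},
    "jaipur": {"Golden Triangle", "Rajasthan"},
    "jaisalmer": {"Rajasthan"},
}
for depth in (0, 1, 2):
    print(depth, sorted(traverse({"delhi"}, tags_by_doc, depth)))
```

With depth 0 only the seed is returned; each extra hop pulls in documents one more shared tag away, which is why the depth 1 answer above can draw on more of the section than the vector-only answer.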


In [85]:
mmr_graph_retriever = graph_store.as_retriever(
    search_type = "mmr_traversal",
    search_kwargs = {
        "k": 4,
        "fetch_k": 10,
        "depth": 2,
        # "score_threshold": 0.2,
    },
)

mmr_graph_rag_chain = (
    {"context": mmr_graph_retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | chatModel
    | StrOutputParser()
)

run_and_render(mmr_graph_rag_chain, QUESTION)

Ah, the Golden Triangle!  A journey through the heart of India, brimming with rich history and vibrant culture.  

The classic route weaves through Delhi, Agra, and Jaipur, each city a treasure trove of architectural marvels, bustling bazaars, and delicious cuisine. 

In Delhi, immerse yourself in the grandeur of the Red Fort and the Jama Masjid, the largest mosque in India. Explore the bustling Chandni Chowk market and savour the fragrant street food.

Agra is synonymous with the Taj Mahal, a breathtaking monument to love, a testament to Mughal artistry.  Don't miss the Agra Fort, a UNESCO World Heritage Site, and indulge in the local delicacies of 'Petha' and 'Dal Moth'.

Jaipur, the Pink City, is a kaleidoscope of colour and charm.  Admire the Hawa Mahal, a palace with 953 windows, and wander through the City Palace, a masterpiece of Rajput architecture.  Explore the vibrant bazaars and witness the artistry of the local artisans.

Consider adding an extension to your journey to enhance your experience.  Perhaps a visit to the birthplace of Lord Buddha in Lumbini, Nepal, or a journey into the breathtaking Himalayas? 

Let me know if you have other questions, and I'll be happy to share more about this exciting journey! 
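
The `mmr_traversal` search type takes its name from maximal marginal relevance (MMR), which trades relevance to the query against redundancy among the results already chosen. The sketch below shows generic MMR selection over toy two-dimensional vectors. It is not the retriever's actual code and ignores graph traversal entirely; the vectors, the `mmr` helper, and the `lambda_mult` weighting of 0.5 are assumptions made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def mmr(query, candidates, k, lambda_mult=0.5):
    """Pick k candidate indices, trading relevance against redundancy."""
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = cosine(query, candidates[i])
            # Redundancy is the similarity to the closest already-selected item.
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

query = [1.0, 0.3]
candidates = [[1.0, 0.0], [0.99, 0.05], [0.6, 0.8]]
print(mmr(query, candidates, k=2))                   # diversity-aware choice
print(mmr(query, candidates, k=2, lambda_mult=1.0))  # pure relevance ranking
```

The first two candidates are near-duplicates, so with the default weighting the second pick skips the duplicate in favour of the less similar third vector, while `lambda_mult=1.0` reduces to plain similarity ranking. In the retriever above, `fetch_k` is understood to set how many candidates are considered and `k` how many are returned.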


In [86]:
from langchain_core.messages import HumanMessage

image_message = {
    "type": "image_url",
    "image_url": {"url":"camel-riding.jpg"},
}

text_message = {
    "type": "text",
    "text": "Where can I do this?",
}
message = HumanMessage(content=[text_message, image_message])

output = chatModel.invoke([message])
display(Markdown(output.content))


You can go on a camel ride in many places around the world, including:

* **India:** The Thar Desert in Rajasthan is a popular destination for camel safaris. 
* **Morocco:** The Sahara Desert in Morocco offers stunning camel treks through the dunes.
* **Egypt:** The Western Desert of Egypt is home to the White Desert National Park, where you can ride camels and explore unique rock formations.
* **UAE:** The Liwa Oasis in the United Arab Emirates offers camel rides through the Rub' al Khali desert.
* **Australia:** The Outback of Australia has many camel farms and tour operators offering camel rides.
* **Peru:** The Atacama Desert in Peru is another place where you can enjoy a camel trek. 

Before booking a camel ride, it is essential to do your research and choose a reputable operator that prioritizes animal welfare. 


In [89]:
from langchain_core.messages import HumanMessage
from langchain_core.messages import SystemMessage
from langchain_core.language_models import BaseChatModel

def describe_image(llm: BaseChatModel, image_url: str, instruct: str) -> str:
    """Describe the contents of an image."""
    system_message = SystemMessage(content="""
        You are a tool that converts images to text.     
    """)

    image_message = HumanMessage(content=[
        {"type": "text", "text": instruct or "Describe the contents of this image."},
        {"type": "image_url", "image_url": {"url": image_url}},
    ])
    
    output = llm.invoke([system_message, image_message])
    return output.content
    
image_description = describe_image(
    chatModel,
    "camel-riding.jpg",
    "Describe the image as completely as possible in 50-80 words, "
    "in a manner that is suitable for searching travel brochures.",
)
IMAGE_QUESTION=f"Image: {image_description}\n Where can I do this?"

Retrying langchain_google_vertexai.chat_models._completion_with_retry.<locals>._completion_with_retry_inner in 4.0 seconds as it raised ServiceUnavailable: 503 Connection reset.


In [90]:
run_and_render(graph_rag_chain, IMAGE_QUESTION)

Ah, a camel ride through the desert! That sounds like an excellent adventure.  You can experience the thrill of riding a camel across the golden sands of Jaisalmer in Rajasthan, India. Imagine yourself riding through the dunes, the sun setting over the horizon, casting long shadows on the sand. It's truly a breathtaking sight!  Just picture yourself being guided by experienced locals who can share stories of the desert, its history, and its people. This is a truly unforgettable cultural experience you won't want to miss. 
