<a href="https://colab.research.google.com/github/graphlit/graphlit-samples/blob/main/python/Notebook%20Examples/Graphlit_2024_12_13_RAG_without_Embeddings.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Description**

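This example shows how to use Graphlit for retrieval-augmented generation (RAG) without embedding similarity search. The LLM specification sets `searchType` to `NONE`, which bypasses semantic search on the user prompt.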

**Requirements**

Prior to running this notebook, you will need to [sign up](https://docs.graphlit.dev/getting-started/signup) for Graphlit and [create a project](https://docs.graphlit.dev/getting-started/create-project).

You will need the Graphlit organization ID, preview environment ID, and JWT secret from the project you created.

Assign these properties as Colab secrets: GRAPHLIT_ORGANIZATION_ID, GRAPHLIT_ENVIRONMENT_ID and GRAPHLIT_JWT_SECRET.
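
The initialization cell copies these three secrets into `os.environ` before constructing the client. A minimal sketch of a pre-flight check, assuming the values are already present as environment variables (for example when running outside Colab); `missing_graphlit_settings` is a hypothetical helper for illustration, not part of the Graphlit SDK.

```python
import os
from typing import List, Mapping

# Setting names taken from the requirements above.
REQUIRED_SETTINGS = (
    "GRAPHLIT_ORGANIZATION_ID",
    "GRAPHLIT_ENVIRONMENT_ID",
    "GRAPHLIT_JWT_SECRET",
)

def missing_graphlit_settings(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required setting names that are unset or empty in env."""
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]

# Report anything missing before the client is constructed.
missing = missing_graphlit_settings()
if missing:
    print("Missing settings: " + ", ".join(missing))
```

Running the check first gives a readable message instead of a failure later, when the first API call is made.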


---

Install Graphlit Python client SDK

In [None]:
!pip install --upgrade graphlit-client

Collecting graphlit-client
  Downloading graphlit_client-1.0.20241213002-py3-none-any.whl.metadata (3.2 kB)
Downloading graphlit_client-1.0.20241213002-py3-none-any.whl (228 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 228.6/228.6 kB 5.0 MB/s eta 0:00:00
Installing collected packages: graphlit-client
Successfully installed graphlit-client-1.0.20241213002


Initialize Graphlit

In [None]:
import os
from google.colab import userdata
from graphlit import Graphlit
from graphlit_api import input_types, enums, exceptions

os.environ['GRAPHLIT_ORGANIZATION_ID'] = userdata.get('GRAPHLIT_ORGANIZATION_ID')
os.environ['GRAPHLIT_ENVIRONMENT_ID'] = userdata.get('GRAPHLIT_ENVIRONMENT_ID')
os.environ['GRAPHLIT_JWT_SECRET'] = userdata.get('GRAPHLIT_JWT_SECRET')

graphlit = Graphlit()

Define Graphlit helper functions

In [None]:
from typing import List, Optional

async def ingest_uri(uri: str):
    if graphlit.client is None:
        return None

    try:
        # Using synchronous mode, so the notebook waits for the content to be ingested
        response = await graphlit.client.ingest_uri(uri=uri, is_synchronous=True)

        return response.ingest_uri.id if response.ingest_uri is not None else None
    except exceptions.GraphQLClientError as e:
        print(str(e))
        return None

async def create_google_specification(model: enums.GoogleModels):
    if graphlit.client is None:
        return None

    input = input_types.SpecificationInput(
        name=f"Google [{str(model)}]",
        type=enums.SpecificationTypes.COMPLETION,
        serviceType=enums.ModelServiceTypes.GOOGLE,
        google=input_types.GoogleModelPropertiesInput(
            model=model,
        ),
        searchType=enums.ConversationSearchTypes.NONE # bypass semantic search from user prompt
    )

    try:
        response = await graphlit.client.create_specification(input)

        return response.create_specification.id if response.create_specification is not None else None
    except exceptions.GraphQLClientError as e:
        print(str(e))
        return None

async def create_conversation(specification_id: str):
    if graphlit.client is None:
        return None

    input = input_types.ConversationInput(
        name="Conversation",
        specification=input_types.EntityReferenceInput(
            id=specification_id
        )
    )

    try:
        response = await graphlit.client.create_conversation(input)

        return response.create_conversation.id if response.create_conversation is not None else None
    except exceptions.GraphQLClientError as e:
        print(str(e))
        return None

async def delete_conversation(conversation_id: str):
    if graphlit.client is None:
        return

    if conversation_id is not None:
        _ = await graphlit.client.delete_conversation(conversation_id)

async def prompt_conversation(conversation_id: str, prompt: str):
    if graphlit.client is None:
        return None, None

    try:
        response = await graphlit.client.prompt_conversation(prompt, conversation_id, include_details=True)

        message = response.prompt_conversation.message.message if response.prompt_conversation is not None and response.prompt_conversation.message is not None else None
        details = response.prompt_conversation.details if response.prompt_conversation is not None else None

        return message, details
    except exceptions.GraphQLClientError as e:
        print(str(e))
        return None, None

async def delete_all_specifications():
    if graphlit.client is None:
        return

    _ = await graphlit.client.delete_all_specifications(is_synchronous=True)

async def delete_all_conversations():
    if graphlit.client is None:
        return

    _ = await graphlit.client.delete_all_conversations(is_synchronous=True)

async def delete_all_contents():
    if graphlit.client is None:
        return

    _ = await graphlit.client.delete_all_contents(is_synchronous=True)
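

The create and delete helpers above can be paired so that a conversation is removed even if prompting raises an exception. A minimal sketch using `contextlib.asynccontextmanager` from the standard library; `temporary_conversation` is a hypothetical helper, and the `create` and `delete` callables are passed in so the sketch does not depend on the Graphlit client.

```python
from contextlib import asynccontextmanager
from typing import AsyncIterator, Awaitable, Callable, Optional

@asynccontextmanager
async def temporary_conversation(
    create: Callable[[], Awaitable[Optional[str]]],
    delete: Callable[[str], Awaitable[None]],
) -> AsyncIterator[Optional[str]]:
    """Yield a conversation ID and delete the conversation on exit, even after an error."""
    conversation_id = await create()
    try:
        yield conversation_id
    finally:
        # Skip deletion when creation failed and returned None.
        if conversation_id is not None:
            await delete(conversation_id)
```

In this notebook it could be used as `async with temporary_conversation(lambda: create_conversation(specification_id), delete_conversation) as conversation_id:`, with the prompt call inside the block.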


Execute Graphlit example

In [None]:
from IPython.display import display, Markdown

# Remove any existing contents, conversations and specifications; only needed for notebook example
await delete_all_conversations()
await delete_all_specifications()
await delete_all_contents()

print('Deleted all contents, conversations and specifications.')

# Ingest a few papers

content_id = await ingest_uri(uri="https://graphlitplatform.blob.core.windows.net/test/documents/Attention%20Is%20All%20You%20Need.1706.03762.pdf")

if content_id is not None:
    print(f'Ingested content [{content_id}]')

content_id = await ingest_uri(uri="https://graphlitplatform.blob.core.windows.net/test/documents/Unifying%20Large%20Language%20Models%20and%20Knowledge%20Graphs%20A%20Roadmap-2306.08302.pdf")

if content_id is not None:
    print(f'Ingested content [{content_id}]')

content_id = await ingest_uri(uri="https://graphlitplatform.blob.core.windows.net/test/documents/2404.17723v2.pdf")

if content_id is not None:
    print(f'Ingested content [{content_id}]')


Deleted all contents, conversations and specifications.
Ingested content [38b81ac3-b662-46a6-9dcf-2a628e350193]
Ingested content [af3c86b0-070a-4711-89bd-c643ef159e4e]
Ingested content [1b68039d-20d7-4c2c-8855-2476ae77749d]
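

The three ingestion calls above follow the same pattern, so they can be folded into a loop. A minimal sketch, assuming an ingest coroutine with the same shape as the notebook's `ingest_uri` helper (takes a URI, returns a content ID or `None`); `ingest_all` and `fake_ingest` are hypothetical names, and `fake_ingest` is only a stand-in so the sketch runs without credentials.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Optional

async def ingest_all(
    uris: List[str],
    ingest: Callable[[str], Awaitable[Optional[str]]],
) -> Dict[str, Optional[str]]:
    """Ingest each URI in order and map it to its content ID (None on failure)."""
    results: Dict[str, Optional[str]] = {}
    for uri in uris:
        results[uri] = await ingest(uri)
    return results

# Stand-in for the notebook's ingest_uri helper (assumption, for illustration only).
async def fake_ingest(uri: str) -> Optional[str]:
    await asyncio.sleep(0)
    return None if uri.endswith(".bad") else f"id-for-{uri}"
```

In the notebook, `results = await ingest_all(uris, ingest_uri)` would replace the repeated calls. The loop is kept sequential to match the original cell's behavior.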


In [None]:
# Specify the RAG prompt
prompt = """
Write me a long, detailed blog post. Explain RAG and its usefulness for knowledge capture and retrieval.
Be specific about the usage of LLMs. Explain how knowledge graphs can be used for RAG, as well. Add quotes from the original papers.
"""

Create a Google Gemini 1.5 Flash specification and prompt the RAG conversation.

In [None]:
specification_id = await create_google_specification(enums.GoogleModels.GEMINI_1_5_FLASH_002)

if specification_id is not None:
    print(f'Created specification [{specification_id}].')

    conversation_id = await create_conversation(specification_id)

    if conversation_id is not None:
        print(f'Created conversation [{conversation_id}].')

        message, details = await prompt_conversation(conversation_id, prompt)

        if message is not None:
            display(Markdown('### Conversation:'))
            display(Markdown(f'**User:**\n{prompt}'))
            display(Markdown(f'**Assistant:**\n{message}'))
            print()

        if details is not None:
            display(Markdown('### Details:'))
            display(Markdown(f'**Model**: {details.model_service} {details.model}'))
            display(Markdown(f'**Token Limit**: {details.token_limit}'))
            display(Markdown(f'**Completion Token Limit**: {details.completion_token_limit}'))

            print()

            # NOTE: uncomment to see the formatted sources provided to LLM
            #if details.formatted_sources is not None:
                #display(Markdown('#### Formatted Sources:'))
                #print(details.formatted_sources)
                #print()

        await delete_conversation(conversation_id)

Created specification [54770783-df15-4990-98ed-19dd690eb221].
Created conversation [be507cee-8c7a-471c-af33-61f7e0b5257d].


### Conversation:

**User:**

Write me a long, detailed blog post. Explain RAG and its usefulness for knowledge capture and retrieval. 
Be specific about the usage of LLMs. Explain how knowledge graphs can be used for RAG, as well. Add quotes from the original papers.


**Assistant:**
## Revolutionizing Knowledge Capture and Retrieval with Retrieval-Augmented Generation (RAG)

In today's data-rich world, efficiently capturing and retrieving relevant knowledge is paramount.  Retrieval-Augmented Generation (RAG) has emerged as a powerful technique, leveraging the strengths of large language models (LLMs) and structured knowledge bases to achieve this goal. This blog post delves into the intricacies of RAG, highlighting its capabilities and exploring its synergy with knowledge graphs.

**Understanding RAG: A Synergistic Approach**

RAG fundamentally alters the traditional approach to question answering. Instead of relying solely on the knowledge embedded within an LLM, RAG augments the LLM with external knowledge retrieved from a relevant knowledge base.  This external knowledge acts as context, enriching the LLM's understanding and enabling it to generate more accurate, informative, and contextually appropriate responses.  As Patrick Lewis et al. state in their seminal paper, "Retrieval-augmented generation for knowledge-intensive NLP tasks," RAG aims to "combine the strengths of retrieval-based and generation-based methods."  This means leveraging the ability of LLMs to generate fluent and coherent text while simultaneously grounding that text in factual information from an external source.

**The Role of LLMs in RAG**

LLMs are the heart of RAG, acting as sophisticated text generators.  They take the retrieved information as input, along with the original query, and generate a response.  The LLM's ability to understand context and generate human-quality text is crucial for creating meaningful and useful answers.  The process isn't simply pasting retrieved text; the LLM synthesizes the information, potentially summarizing, paraphrasing, or combining multiple sources to create a coherent response.  This is a key difference from simple retrieval systems.

**Knowledge Graphs: Enhancing RAG's Precision**

While RAG can utilize various knowledge bases, knowledge graphs (KGs) offer a particularly powerful approach. KGs represent information as a network of interconnected entities and relationships, providing a structured and semantically rich representation of knowledge.  This structured nature allows for more precise retrieval.  Instead of searching through unstructured text, RAG with KGs can navigate the graph to find relevant entities and relationships, significantly improving retrieval accuracy.

A recent paper, "Retrieval-Augmented Generation with Knowledge Graphs for Customer Service Question Answering," highlights the advantages of using KGs in RAG for customer service. The authors explain how their system "constructs a KG from historical issues for use in retrieval, retaining the intra-issue structure and inter-issue relations." This approach avoids the limitations of treating a corpus of past issues as plain text, leading to improved retrieval accuracy and answer quality.

**Illustrative Example: Customer Service**

Imagine a customer service scenario where a user asks a complex question about a product.  A traditional LLM might struggle to answer accurately if the necessary information isn't directly encoded in its training data.  However, a RAG system with a KG could retrieve relevant past support tickets, product documentation, or other relevant information from the KG.  The LLM would then use this retrieved information to generate a comprehensive and accurate response, potentially even referencing specific sections of the retrieved documents.

**Challenges and Future Directions**

While RAG offers significant advantages, challenges remain.  These include:

* **KG Construction:** Creating and maintaining comprehensive and accurate KGs can be a laborious and resource-intensive process.
* **Retrieval Efficiency:** Efficiently retrieving relevant information from large KGs is crucial for real-time performance.
* **Hallucination Mitigation:** LLMs can sometimes generate factually incorrect information ("hallucinations").  RAG needs mechanisms to mitigate this risk.

Future research will likely focus on automating KG construction, improving retrieval efficiency, and developing robust methods for detecting and correcting hallucinations.  The integration of multi-modal LLMs, capable of processing information from various sources like images and videos, will further enhance RAG's capabilities.

**Conclusion: A Promising Future**

RAG, particularly when combined with KGs, represents a significant advancement in knowledge capture and retrieval.  By leveraging the strengths of LLMs and structured knowledge, RAG empowers systems to answer complex questions accurately and efficiently.  As research continues to address the existing challenges, RAG's impact across various domains will undoubtedly grow, transforming how we interact with and utilize information.





### Details:

**Model**: GOOGLE Gemini 1.5 Flash (002 version)

**Token Limit**: 1048576

**Completion Token Limit**: 8191
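

The reported limits give a rough sense of how much source text a single prompt can carry. A back-of-the-envelope sketch, assuming the completion budget is reserved out of the same window and using a crude four-characters-per-token heuristic; both are assumptions for illustration, not figures reported by Graphlit or Google.

```python
TOKEN_LIMIT = 1_048_576          # "Token Limit" reported above
COMPLETION_TOKEN_LIMIT = 8_191   # "Completion Token Limit" reported above
CHARS_PER_TOKEN = 4              # assumption: rough heuristic for English text

def prompt_token_budget(token_limit: int, completion_limit: int) -> int:
    """Tokens left for the prompt and sources if the completion budget is reserved."""
    return token_limit - completion_limit

def estimated_tokens(text: str, chars_per_token: int = CHARS_PER_TOKEN) -> int:
    """Very rough token estimate from character count (ceiling division)."""
    return -(-len(text) // chars_per_token)

budget = prompt_token_budget(TOKEN_LIMIT, COMPLETION_TOKEN_LIMIT)
print(budget)  # 1040385
```

Comparing `estimated_tokens(document_text)` against `budget` is only a sanity check; an exact count would require the model's own tokenizer.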


