# Test out GraphRag

[Building a Graph RAG System: A Step-by-Step Approach](https://machinelearningmastery.com/building-graph-rag-system-step-by-step-approach/)

## Why GraphRag

Documents retrieved by a regular RAG system are treated as independent units, so the dependencies between them are lost and the answers generated by the LLM can be fragmented. The article [Building a Graph RAG System: A Step-by-Step Approach](https://machinelearningmastery.com/building-graph-rag-system-step-by-step-approach/) gives a good example of such a fragmented answer:

    In a traditional RAG setup, the system might retrieve the following pieces of information:

    Document 1: “James Watson and Francis Crick proposed the double-helix structure in 1953.”
    Document 2: “Rosalind Franklin’s X-ray diffraction images were critical in identifying DNA’s helical structure.”
    Document 3: “Maurice Wilkins shared Franklin’s images with Watson and Crick, which contributed to their discovery.”
    The problem? Traditional RAG systems treat these documents as independent units. They don’t connect the dots effectively, leading to fragmented responses like: 

    “Watson and Crick proposed the structure, and Franklin’s work was important.”

    This response lacks depth and misses key relationships between contributors. Enter Graph RAG! By organizing the retrieved data as a graph, Graph RAG represents each document or fact as a node, and the relationships between them as edges.

    Here’s how Graph RAG would handle the same query:

    Nodes: Represent facts (e.g., “Watson and Crick proposed the structure,” “Franklin contributed critical X-ray images”).
    Edges: Represent relationships (e.g., “Franklin’s images → shared by Wilkins → influenced Watson and Crick”).
    By reasoning across these interconnected nodes, Graph RAG can produce a complete and insightful response like:

    “The discovery of DNA’s double-helix structure in 1953 was primarily led by James Watson and Francis Crick. However, this breakthrough heavily relied on Rosalind Franklin’s X-ray diffraction images, which were shared with them by Maurice Wilkins.”

    This ability to combine information from multiple sources and answer broader, more complex questions is what makes Graph RAG so popular.
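
The node-and-edge idea in the quoted passage can be sketched in a few lines of plain Python. This is only an illustration under assumptions: it uses entities (rather than whole facts) as nodes, the relation labels are paraphrased from the quote, and `build_adjacency` and `find_path` are hypothetical helper names, not part of any Graph RAG library.

```python
from collections import deque

# Edges as (source, relation, target) triples, paraphrased from the quoted DNA example.
EDGES = [
    ("Rosalind Franklin", "produced", "X-ray diffraction images"),
    ("Maurice Wilkins", "shared", "X-ray diffraction images"),
    ("X-ray diffraction images", "influenced", "Watson and Crick"),
    ("Watson and Crick", "proposed", "double-helix structure"),
]


def build_adjacency(edges):
    """Map each node to its outgoing (relation, target) pairs."""
    adjacency = {}
    for source, relation, target in edges:
        adjacency.setdefault(source, []).append((relation, target))
        adjacency.setdefault(target, [])
    return adjacency


def find_path(adjacency, start, goal):
    """Breadth-first search; returns the triples linking start to goal, or None."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relation, target in adjacency.get(node, []):
            if target not in visited:
                visited.add(target)
                queue.append((target, path + [(node, relation, target)]))
    return None


adjacency = build_adjacency(EDGES)
path = find_path(adjacency, "Rosalind Franklin", "double-helix structure")
for source, relation, target in path:
    print(f"{source} --{relation}--> {target}")
```

Following the edges connects Franklin's images to the final discovery, which is the chain a flat list of retrieved documents fails to expose.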



## Initiate RAG Building Blocks

Deploy Azure OpenAI services, including an LLM and an embedding model. The models are deployed as an Azure endpoint in an [AI Foundry workspace](https://oai.azure.com/resource/overview?wsid=/subscriptions/d91792a2-c9bd-44bc-bcd8-fdddc7ceb1c5/resourceGroups/agentic_applications/providers/Microsoft.CognitiveServices/accounts/multi-agentic-applications&tid=565f1c8e-754e-473e-8352-ac5b86a38c93). Set `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_ENDPOINT` and `OPENAI_API_VERSION` in the `.env` file.
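
Before connecting, it can help to confirm that the three settings are actually visible to the process. The check below is a minimal sketch: the variable names come from the paragraph above, while `missing_env_vars` is a hypothetical helper introduced here for illustration.

```python
import os

# Variable names taken from the text above; the helper itself is illustrative.
REQUIRED_VARS = ("AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT", "OPENAI_API_VERSION")


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing settings:", ", ".join(missing))
```

Run it after `dotenv.load_dotenv(...)` so that values from `.env` are already loaded into the environment.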

In [2]:
import dotenv
import sys
from pathlib import Path

# Set up environment
sys.path.append(str(Path.cwd().parent))  # Append project home to system path (entries should be strings)
dotenv.load_dotenv(override=True)  # Load .env

True

In [3]:
# Test Azure connection
import openai

client = openai.AzureOpenAI(
    api_version="2025-01-01-preview",
)

# gpt-4o-mini only supports chat completions, so use client.chat.completions.create
# instead of client.completions.create
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Test prompt"}],
)

response

ChatCompletion(id='chatcmpl-BAxPnfsQLjNEjPvnFVE4k4gd4k1VQ', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="It looks like you're testing the prompt functionality! How can I assist you today? If you have any questions or need information on a specific topic, feel free to ask.", refusal=None, role='assistant', annotations=None, audio=None, function_call=None, tool_calls=None), content_filter_results={'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'safe'}})], created=1741951371, model='gpt-4o-mini-2024-07-18', object='chat.completion', service_tier=None, system_fingerprint='fp_b705f0c291', usage=CompletionUsage(completion_tokens=34, prompt_tokens=9, total_tokens=43, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected

### Setup RAG Building Blocks

* LLM
* Embedding
* Vector store

In [4]:
from langchain_openai import AzureChatOpenAI, AzureOpenAIEmbeddings
from langchain_core.vectorstores import InMemoryVectorStore

# Connect to the chat model.
# Here we use AzureChatOpenAI instead of AzureOpenAI to connect to gpt-4o-mini
llm = AzureChatOpenAI(azure_deployment="gpt-4o-mini")

# Connect to embedding
embeddings = AzureOpenAIEmbeddings(model="text-embedding-3-large")

# Instantiate vector store
vector_store = InMemoryVectorStore(embeddings)
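
To make the role of the vector store concrete, here is a toy stand-in written in plain Python. It is a conceptual sketch only, not how `InMemoryVectorStore` is implemented: `ToyVectorStore` is a hypothetical class, and the three-dimensional vectors are invented by hand, whereas real embeddings come from the embedding model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Keeps (text, vector) pairs and ranks them against a query vector."""

    def __init__(self):
        self._items = []

    def add(self, text, vector):
        self._items.append((text, vector))

    def search(self, query_vector, k=1):
        ranked = sorted(
            self._items,
            key=lambda item: cosine_similarity(query_vector, item[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore()
# Hand-made vectors for illustration only.
store.add("Chevron earnings estimates rose", [0.9, 0.1, 0.0])
store.add("New smartphone released", [0.0, 0.2, 0.9])
print(store.search([1.0, 0.0, 0.1], k=1))
```

The query vector points in nearly the same direction as the first stored vector, so that text is returned first.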

In [5]:
# Test azure connection
llm.invoke("Tell me a joke")

AIMessage(content="Why don't skeletons fight each other? \n\nThey don't have the guts!", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 15, 'prompt_tokens': 11, 'total_tokens': 26, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_b705f0c291', 'prompt_filter_results': [{'prompt_index': 0, 'content_filter_results': {'hate': {'filtered': False, 'severity': 'safe'}, 'jailbreak': {'filtered': False, 'detected': False}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'safe'}}}], 'finish_reason': 'stop', 'logprobs': None, 'content_filter_results': {'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 

## Chunk Documents

In [11]:
# Download data
import pandas as pd
news = pd.read_csv("https://raw.githubusercontent.com/tomasonjo/blog-datasets/main/news_articles.csv")[:50]
news[:1]

| Unnamed: 0 | title | date | text |
| --- | --- | --- | --- |
| 0 | Chevron: Best Of Breed | 2031-04-06T01:36:32.000000000+00:00 | JHVEPhoto Like many companies in the O&G secto... |


In [21]:
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.documents import Document

# Convert each article's text to a LangChain Document
news_documents = [Document(page_content=text) for text in news["text"]]
news_documents[:1]

[Document(metadata={}, page_content='JHVEPhoto Like many companies in the O&G sector, the stock of Chevron (NYSE:CVX) has declined about 10% over the past 90-days despite the fact that Q2 consensus earnings estimates have risen sharply (~25%) during that same time frame. Over the years, Chevron has kept a very strong balance sheet. That allowed the...')]

In [None]:
# Splits text into chunks of 500 characters with a 100-character overlap to maintain context between chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)
all_splits = text_splitter.split_documents(news_documents)
all_splits[:1]

[Document(metadata={}, page_content='JHVEPhoto Like many companies in the O&G sector, the stock of Chevron (NYSE:CVX) has declined about 10% over the past 90-days despite the fact that Q2 consensus earnings estimates have risen sharply (~25%) during that same time frame. Over the years, Chevron has kept a very strong balance sheet. That allowed the...')]
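
The effect of `chunk_size` and `chunk_overlap` can be illustrated with a plain sliding window. This is a simplified sketch, not the splitter's actual algorithm: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraphs, lines and spaces, so its chunk boundaries will differ. `split_with_overlap` is a hypothetical helper.

```python
def split_with_overlap(text, chunk_size=500, chunk_overlap=100):
    """Cut `text` into fixed-size windows where each window repeats the
    last `chunk_overlap` characters of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # The last window already reaches the end of the text
    return chunks


# Synthetic 1200-character string so the window positions are easy to check.
sample = "".join(str(i % 10) for i in range(1200))
chunks = split_with_overlap(sample)
print([len(chunk) for chunk in chunks])  # [500, 500, 400]
print(chunks[0][-100:] == chunks[1][:100])  # True
```

The repeated 100 characters are what lets a sentence that straddles a boundary still appear intact in at least one chunk.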

## Extract Knowledge Graph

In [None]:
from typing import Callable

from langchain_core.language_models.chat_models import BaseChatModel
from langchain_core.prompts import PromptTemplate

class GraphRAGExtractor:
    '''Extract triples from text.

    Uses an LLM with a simple prompt and output parsing to extract paths (i.e. triples), along with entity and relation descriptions, from text.

    Args:
        llm (BaseChatModel):
            The language model to use.
        extract_prompt (Union[str, PromptTemplate]):
            The prompt to use for extracting triples.
        parse_fn (Callable):
            A function to parse the output of the language model.
        num_workers (int):
            The number of workers to use for parallel processing.
        max_paths_per_chunk (int):
            The maximum number of paths to extract per chunk.

    '''
    llm: BaseChatModel
    extract_prompt: PromptTemplate
    parse_fn: Callable
    num_workers: int
    max_paths_per_chunk: int
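
The docstring above describes `parse_fn` as a function that parses the language model's output. A minimal sketch of such a function follows, assuming a hypothetical output format of one `(subject | relation | object)` triple per line. The format, the `parse_triples` name and the default limit of 10 are all assumptions made for illustration, not something the notebook defines.

```python
import re

# Hypothetical output format: one "(subject | relation | object)" triple per line.
TRIPLE_PATTERN = re.compile(r"^\(\s*(.+?)\s*\|\s*(.+?)\s*\|\s*(.+?)\s*\)$")


def parse_triples(llm_output, max_paths_per_chunk=10):
    """Collect up to `max_paths_per_chunk` well-formed triples, skipping other lines."""
    triples = []
    for line in llm_output.splitlines():
        match = TRIPLE_PATTERN.match(line.strip())
        if match:
            triples.append(match.groups())
        if len(triples) >= max_paths_per_chunk:
            break
    return triples


# Example text shaped like the assumed format; lines that do not match are ignored.
sample_output = """Here are the triples:
(Chevron | listed on | NYSE)
(Chevron | operates in | O&G sector)
not a triple
"""
print(parse_triples(sample_output))
```

Skipping malformed lines instead of raising keeps one badly formatted response from discarding the remaining triples in a chunk.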
   


