# Battle of the Semantics: GraphRag vs Embeddings Index

Retrieval Augmented Generation (RAG) is often performed by chunking long texts, creating a text embedding for each chunk, and retrieving chunks for including in the LLM generation context based on a similarity search against the query. This approach works well in many scenarios, and at compelling speed and cost trade-offs, but doesn't always cope well in scenarios where a detailed understanding of the text is required.

GraphRag ( [microsoft.github.io/graphrag](https://microsoft.github.io/graphrag/) ), a new indexing method released by Microsoft, promises to address this defficiency by using an LLM to analyse the indexed text and construct and knowledge graph of entities from it. This more detailed semantic understanding of the content of the text can result in searches that produces a more accurate and complete context for the LLM to work with in generation.

To compare both method, let's see what results we get when indexing and retrieving the text with both techniques.

In [1]:
%pip install openai graphrag pandas requests python-dotenv langchain numpy tiktoken matplotlib scikit-learn pyyaml pydantic instructor
from IPython.display import clear_output ; clear_output()

To run this test yourself, copy the template file `dot.env` to `.env` and fill in the details for your OpenAI or Azure Open AI endpoint. The following code loads these environment variables and sets up our AI client.

In [1]:
from dotenv import load_dotenv
import os
load_dotenv()

is_azure = (
  os.getenv("AZURE_OPENAI_ENDPOINT", default="") != "" and
  os.getenv("OPENAI_API_KEY", default="") == ""
)

GPT_4_O_MODEL_NAME = os.getenv("GPT_4_O_MODEL_NAME", default="gpt-4o")
TEXT_EMBEDDING_3_LARGE_MODEL_NAME = os.getenv("TEXT_EMBEDDING_3_LARGE_MODEL_NAME", default="text-embedding-3-large")

if is_azure:
  AZURE_OPENAI_ENDPOINT = os.getenv("AZURE_OPENAI_ENDPOINT")
  AZURE_OPENAI_API_KEY = os.getenv("AZURE_OPENAI_API_KEY")
  AZURE_OPENAI_API_VERSION = "2024-05-01-preview"
  from openai import AzureOpenAI
  oai = AzureOpenAI(azure_endpoint=AZURE_OPENAI_ENDPOINT, api_key=AZURE_OPENAI_API_KEY, api_version=AZURE_OPENAI_API_VERSION)
else:
  OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
  from openai import OpenAI
  oai = OpenAI(api_key=OPENAI_API_KEY)

We'll start by getting a text to work with. The Wikipedia article on the French Revolution is a longer text, rich in detail and structure. We'll download it from Wikipedia and convert it to an LLM-digestable piece of text using the Jina AI Reader API. We'll also trim the last sections of the article (See Also and References) that do not contribute relevant content.

In [2]:
import requests
import os

if not os.path.exists('data'): os.makedirs('data')

if not os.path.exists('data/french_revolution.md'):
  french_revolution = requests.get("https://r.jina.ai/https://en.wikipedia.org/wiki/French_Revolution").text.split('\nSee also')[0]
  with open('data/french_revolution.md', 'w') as f:
    f.write(french_revolution)
else:
  with open('data/french_revolution.md', 'r') as f:
    french_revolution = f.read()

print(french_revolution[:123])

Title: French Revolution

URL Source: https://en.wikipedia.org/wiki/French_Revolution

Published Time: 2001-10-18T00:19:10Z


We can now chunk the text and embed the chunks. An optimal chunking strategy is content-dependent, but this particular one works well for articles of this length and format, and optimising for reducing context length.

In [18]:
from langchain.text_splitter import MarkdownTextSplitter
import pandas as pd

if not os.path.exists('data/embeddings.parquet'):
  embeddings = pd.DataFrame(columns=['Text', 'Embedding'])

  splitter = MarkdownTextSplitter(chunk_size=300, chunk_overlap=100)

  chunks = splitter.split_text(french_revolution)
  chunk_embeddings = oai.embeddings.create(
    input=chunks,
    model=TEXT_EMBEDDING_3_LARGE_MODEL_NAME
  )
  for i, chunk in enumerate(chunks):
    embeddings.loc[len(embeddings)] = [chunk, chunk_embeddings.data[i].embedding]
  embeddings.to_parquet('data/embeddings.parquet')
else:
  embeddings = pd.read_parquet('data/embeddings.parquet')

embeddings

Unnamed: 0,Text,Embedding
0,Title: French Revolution\n\nURL Source: https:...,"[-0.003698753658682108, -0.019621584564447403,..."
1,Markdown Content:\nJump to content\nMain menu\...,"[-0.01978607289493084, 0.009914387948811054, -..."
2,"Tools\nFrom Wikipedia, the free encyclopedia\n...","[-0.032426539808511734, -0.009258555248379707,..."
3,"The Storming of the Bastille, 14 July 1789","[0.006268054712563753, -0.0037634416949003935,..."
4,"Date\t5 May 1789 – 9 November 1799\n(10 years,...","[-0.014528979547321796, 0.0037560712080448866,..."
...,...,...
465,Alfred Cobban challenged Jacobin-Marxist socia...,"[-0.012176590040326118, -0.012197822332382202,..."
466,Revolution (1964). He argued the Revolution wa...,"[-0.0006112174596637487, 0.02068362943828106, ..."
467,"In their 1965 work, La Revolution française, F...","[0.01710551045835018, 0.012524602934718132, -0..."
468,"From the 1990s, Western scholars largely aband...","[0.00677307927981019, 0.007338008843362331, -0..."


To search for chunks, we'll use cosine similarity, a popular comparison method for embeddings. Our search function will look for the chunks with the highest similarity index and limit the set of results by the number of tokens we're willing to use in our context for generation (for convenience and debugging, we can also filter by the top number of results or by a similarity threshold).

In [5]:
import numpy as np
import tiktoken

def cosine_similarity(vector1, vector2):
  dot_product = np.dot(vector1, vector2)
  norm1 = np.linalg.norm(vector1)
  norm2 = np.linalg.norm(vector2)
  similarity = dot_product / (norm1 * norm2)
  return similarity

tokenizer = tiktoken.encoding_for_model('gpt-4o')

def embeddings_search(query, max_tokens=10000, k=100, min_similarity=0.2):
  query_embedding = oai.embeddings.create(
    input=[query],
    model=TEXT_EMBEDDING_3_LARGE_MODEL_NAME
  ).data[0].embedding
  results = embeddings.copy()
  results['Similarity'] = results['Embedding'].apply(lambda x: cosine_similarity(x, query_embedding))
  results = results.sort_values(by='Similarity', ascending=False).head(k)
  results = results[results['Similarity'] >= min_similarity]
  results['Tokens'] = results['Text'].apply(lambda txt: len(tokenizer.encode(txt)))
  while results['Tokens'].sum() > max_tokens:
    results = results[:-1]
  return results['Text'].tolist()

To run a complete RAG with our embeddings, we'll retrieve chunks using our embeddings search then pass those on to the LLM for generation. Note that we're using the same system prompt used in GraphRag, to make sure that our comparison only differs on the retrieved context, and not the instructions to the LLM.

In [6]:
from graphrag.query.structured_search.global_search.reduce_system_prompt import REDUCE_SYSTEM_PROMPT as SYSTEM_PROMPT
import re

DEFAULT_RESPONSE_TYPE = 'Summarize and explain in 1-2 paragraphs with bullet points using at most 300 tokens'
DEFAULT_MAX_CONTEXT_TOKENS = 10000

def remove_data(text):
    return re.sub(r'\[Data:.*?\]', '', text).strip()

def ask_embeddings(query):
  results = embeddings_search(query, max_tokens=DEFAULT_MAX_CONTEXT_TOKENS)
  response = oai.chat.completions.create(
    model=GPT_4_O_MODEL_NAME,
    messages=[
      {
        "role": "system",
        "content": SYSTEM_PROMPT.format(
          response_type=DEFAULT_RESPONSE_TYPE,
          report_data="---\n---\n".join(results),
        ),
      },
      {"role": "user", "content": query}
    ],
    max_tokens=4000,
    temperature=0.5,
  ).choices[0].message.content
  return remove_data(response)

Now let's index using GraphRag. GraphRag has a convenient set of CLI commands we can use. We'll start by configuring the system, then run the indexing operation. Indexing with GraphRag is a much lengthier process, and one that costs significantly more, since rather than just calculating embeddings, GraphRag makes many LLM calls to analyse the text, extract entities, and construct the graph. That's a one-time expense, though.

In [19]:
import yaml

if not os.path.exists('data/graphrag'):
  !python -m graphrag.index --init --root data/graphrag

with open('data/graphrag/settings.yaml', 'r') as f:
  settings_yaml = yaml.load(f, Loader=yaml.FullLoader)
settings_yaml['llm']['model'] = GPT_4_O_MODEL_NAME
settings_yaml['llm']['api_key'] = AZURE_OPENAI_API_KEY if is_azure else OPENAI_API_KEY
settings_yaml['llm']['type'] = 'azure_openai_chat' if is_azure else 'openai_chat'
settings_yaml['embeddings']['llm']['api_key'] = AZURE_OPENAI_API_KEY if is_azure else OPENAI_API_KEY
settings_yaml['embeddings']['llm']['type'] = 'azure_openai_embedding' if is_azure else 'openai_embedding'
settings_yaml['embeddings']['llm']['model'] = TEXT_EMBEDDING_3_LARGE_MODEL_NAME
if is_azure:
  settings_yaml['llm']['api_version'] = AZURE_OPENAI_API_VERSION
  settings_yaml['llm']['deployment_name'] = GPT_4_O_MODEL_NAME
  settings_yaml['llm']['api_base'] = AZURE_OPENAI_ENDPOINT
  settings_yaml['embeddings']['llm']['api_version'] = AZURE_OPENAI_API_VERSION
  settings_yaml['embeddings']['llm']['deployment_name'] = TEXT_EMBEDDING_3_LARGE_MODEL_NAME
  settings_yaml['embeddings']['llm']['api_base'] = AZURE_OPENAI_ENDPOINT

with open('data/graphrag/settings.yaml', 'w') as f:
  yaml.dump(settings_yaml, f)

if not os.path.exists('data/graphrag/input'):
  os.makedirs('data/graphrag/input')
  !cp data/french_revolution.md data/graphrag/input/french_revolution.txt
  !python -m graphrag.index --root ./data/graphrag

[2KInitializing project at data/graphrag
[2K🚀 [32mReading settings from data/graphrag/settings.yaml[0m
[2K⠼ GraphRAG Indexer 
[2K[1A[2K⠼ GraphRAG Indexer les loaded (0 filtered) [90m━━━━━━[0m [35m100%[0m [36m0:00:…[0m [33m0:00:…[0m
├── Loading Input (text) - 1 files loaded (0 filtered) [90m━━━━━━[0m [35m100%[0m [36m0:00:…[0m [33m0:00:…[0m
[2K[1A[2K[1A[2K⠼ GraphRAG Indexer 
├── Loading Input (text) - 1 files loaded (0 filtered) [90m━━━━━━[0m [35m100%[0m [36m0:00:…[0m [33m0:00:…[0m
[2K[1A[2K[1A[2K⠼ GraphRAG Indexer 
├── Loading Input (text) - 1 files loaded (0 filtered) [90m━━━━━━[0m [35m100%[0m [36m0:00:…[0m [33m0:00:…[0m
[2K[1A[2K[1A[2K⠼ GraphRAG Indexer 
├── Loading Input (text) - 1 files loaded (0 filtered) [90m━━━━━━[0m [35m100%[0m [36m0:00:…[0m [33m0:00:…[0m
[2K[1A[2K[1A[2K⠧ GraphRAG Indexer 
├── Loading Input (text) - 1 files loaded (0 filtered) [90m━━━━━━[0m [35m100%[0m [36m0:00:…[0m [33m0:00:…[0m
[2K[1

To query GraphRag we'll use its CLI again, making sure to configure it with a context length equivalent to what we use in our embeddings search.

In [15]:
import subprocess

def ask_graph(query):
  env = os.environ.copy() | {
    'GRAPHRAG_GLOBAL_SEARCH_MAX_TOKENS': str(DEFAULT_MAX_CONTEXT_TOKENS),
  }
  command = [
    'python', '-m', 'graphrag.query',
    '--root', './data/graphrag',
    '--method', 'global',
    '--response_type', DEFAULT_RESPONSE_TYPE,
    query,
  ]
  output = subprocess.check_output(command, universal_newlines=True, env=env, stderr=subprocess.DEVNULL)
  return remove_data(output.split('Search Response: ')[1])

Let's try a couple of questions and compare. Is the extra cost of indexing with GraphRag worth it?

In [10]:
from IPython.display import Markdown

md = ""

for question in [
  'Timeline of the French revolution',
  'Who was Robespierre and what was his role in the French revolution?',
  ]:
  for func in [ask_embeddings, ask_graph]:
    result = func(question)
    md += f"**{question} ({func.__name__})**\n\n{result}\n\n---\n\n"

Markdown(md)

**Timeline of the French revolution (ask_embeddings)**

### Timeline of the French Revolution

The French Revolution was a period of significant political and societal change in France, spanning from May 5, 1789, to November 9, 1799. Below are the key events and outcomes:

- **May 5, 1789**: The Estates General convened, marking the beginning of the Revolution.
- **July 14, 1789**: The Storming of the Bastille, a symbolic event that is still commemorated as Bastille Day.
- **August 10, 1792**: Insurrection led to the abolition of the monarchy.
- **September 22, 1792**: Proclamation of the French First Republic.
- **January 21, 1793**: Execution of Louis XVI.
- **September 1793 - July 1794**: The Reign of Terror, during which around 16,000 people were executed.
- **November 9, 1799**: The coup of 18 Brumaire led to the establishment of the French Consulate, marking the end of the Revolution .

### Outcomes and Impact

- **Abolition of the Ancien Régime**: The Revolution dismantled feudal structures and privileges.
- **Creation of a Constitutional Monarchy**: Initially, the Revolution aimed to establish a constitutional monarchy, which later transitioned to a republic.
- **Influence on Western History**: The Revolution ended feudalism in France and paved the way for advances in individual freedoms and democratic ideals across Europe .

These events and outcomes highlight the transformative nature of the French Revolution, which continues to influence modern political discourse and democratic principles.

---

**Timeline of the French revolution (ask_graph)**

### Timeline of the French Revolution

The French Revolution, spanning from 1789 to 1799, was marked by significant events that reshaped France's political and social landscape. Key milestones include:

- **1789**:
  - **May 5**: The Estates-General convened by Louis XVI at Versailles to address the financial crisis .
  - **June 17**: The Third Estate declared itself the National Assembly, challenging the existing political order .
  - **July 14**: The Storming of the Bastille, symbolizing the fall of the Ancien Régime .
  - **August 26**: The Declaration of the Rights of Man and of the Citizen was approved .

- **1792**:
  - **August 10**: The Tuileries Palace was stormed, leading to the downfall of the monarchy .
  - **September**: The French First Republic was established, ending the monarchy .

- **1793**:
  - **January 21**: Execution of King Louis XVI, marking a turning point in the Revolution .
  - **April 6**: The Committee of Public Safety was created, leading to the Reign of Terror .
  - **March-July**: The Reign of Terror, characterized by extreme political repression and mass executions .

- **1794**:
  - **July 28**: Execution of Robespierre, ending the Reign of Terror .

- **1799**:
  - **November 9**: The Coup of 18 Brumaire led by Napoleon Bonaparte, establishing the French Consulate and marking the end of the French Revolution .

These events collectively highlight the radical transformation of French society and governance during the Revolution.

---

**Who was Robespierre and what was his role in the French revolution? (ask_embeddings)**

### Maximilien Robespierre and His Role in the French Revolution

Maximilien Robespierre was a prominent and influential figure during the French Revolution. His actions and ideologies significantly shaped the course of the revolution, particularly during its most radical phase.

#### Key Points:
- **Political Influence and Reforms**:
  - Robespierre opposed the criteria for "active citizens," advocating for broader political participation, which gained him substantial support among the Parisian populace .
  - He played a pivotal role in the radical Jacobin club and was instrumental in pushing for universal male suffrage and other radical reforms through the new Constitution ratified on 24 June 1793 .

- **Committee of Public Safety and the Reign of Terror**:
  - Robespierre became a leading member of the Committee of Public Safety, which assumed control after the constitution was suspended in June 1793. He was a central figure during the Reign of Terror, a period marked by mass executions of perceived enemies of the revolution .
  - Under his influence, the Law of Suspects was enacted, leading to the arrest and execution of thousands .

- **Downfall and Execution**:
  - Robespierre's dominance and the ensuing terror led to growing opposition. On 26 July 1794, he accused certain members of the Convention of conspiracy, which backfired and led to his arrest. After a failed suicide attempt, he was executed on 28 July 1794, marking the end of the Reign of Terror .

Robespierre's legacy remains controversial; he is seen as both a defender of revolutionary ideals and a symbol of the revolution's excesses. His execution was a turning point that ended the most extreme phase of the French Revolution.

---

**Who was Robespierre and what was his role in the French revolution? (ask_graph)**

### Maximilien Robespierre: Key Figure in the French Revolution

Maximilien Robespierre was a central figure in the French Revolution, known for his radical leadership and controversial actions. His influence spanned various political factions and revolutionary activities, significantly shaping the course of the revolution.

#### Key Roles and Actions:

- **Leadership During the Reign of Terror**:
  - Robespierre's leadership during the Reign of Terror was marked by extreme political repression and mass executions, contributing to an atmosphere of fear and instability in Paris .
  - His actions and policies led to accusations of dictatorship, ultimately resulting in his execution on 28 July 1794, which marked the end of the Reign of Terror .

- **Advocacy for the Third Estate**:
  - He was deeply involved in advocating for the Third Estate, organizing meetings, petitions, and literature to support their cause .
  - His involvement in the Estates-General underscored the growing influence of the common people in the political process .

- **Association with Radical Factions**:
  - Robespierre was a leading figure of the Montagnards, a radical political faction, and was closely associated with the Jacobins, Cordeliers, and the Society of Thirty .
  - He opposed both moderate and radical factions within the Montagnard group, reflecting the complex political dynamics of the time .

- **Influence on Revolutionary Policies**:
  - He proposed significant motions in the Legislative Assembly, such as barring existing deputies from elections, which had substantial political implications .
  - Robespierre led the Cult of the Supreme Being, a revolutionary cult, although it faced opposition and ridicule, contributing to his downfall .

Robespierre's actions and policies were pivotal in shaping the revolutionary activities and the political landscape of France during this tumultuous period .

---



Initial results look quite good. For a more comprehensive evaluation, let's use our LLM to judge between attempts by both techniques to answer the same question. When making a judgement call, the LLM doesn't have access to the original text, but we can assume that it has been exposed to Wikipedia and many other sources in pre-training and can evaluate the accuracy and relevance of a question. We'll run each of our 5 questions 5 times with each method, and look at the result to decide who won the battle of the semantics.

In [17]:
from pydantic import BaseModel, Field
from typing import Literal
import instructor
import json
import random

QUESTIONS = [
  'How did the financial and political crisis contribute to the calling of the Estates-General in 1789?',
  'What role did the Enlightenment and previous revolutions play in shaping the French Revolution?',
  'Analyze how the various social classes in France were affected by the Revolution and the policies implemented, such as the Civil Constitution of the Clergy and the abolition of feudal dues.',
  'What were the key events of the French Revolution that led to the rise of Napoleon Bonaparte?',
  'The role of ideology in the French Revolution is a subject of ongoing debate among historians. Analyze the conflicting interpretations of Jonathan Israel and Alfred Cobban, assessing their arguments and evidence in light of the information presented in the text.',
]

class EvalAnswers(BaseModel):
  best_answer: Literal[1, 2] = Field(..., description="The index of the best answer, evaluated for accuracy and relevance")
  explanation: str = Field(..., description=(
    "Short explanation for the choice of the best answer (max 100 tokens). "
    "The explanation refers to the chosen best answer as 'the best answer' "
    "and the other answer as 'the other answer'."
    )
  )

evals = []

for question in QUESTIONS:
  evals_best_answer = []
  evals_explanation = []
  for i in range(5):
    answer_graph = ask_graph('question')
    answer_embeddings = ask_embeddings('question')
    graph_index = random.choice([1, 2])
    embeddings_index = 1 if graph_index == 2 else 2
    evaluation = instructor.from_openai(oai).chat.completions.create(
      response_model=EvalAnswers,
      model=GPT_4_O_MODEL_NAME,
      messages=[
        {
          "role": "system",
          "content": ("Evaluate the two answers below based on accuracy and relevance to the question. "
                      "Select the index of the best answer (1 or 2) and explain why you made that choice.")
        },
        {"role": "user", "content": json.dumps({
          'question': question,
          'answers': dict(sorted({
            graph_index: answer_graph,
            embeddings_index: answer_embeddings,
          }.items())),
        })},
      ],
      max_tokens=250,
      temperature=0.5,
    )
    evals_best_answer.append('graph' if evaluation.best_answer == graph_index else 'embeddings')
    evals_explanation.append(evaluation.explanation)
  embeddings_wins = evals_best_answer.count('embeddings')
  graph_wins = evals_best_answer.count('graph')
  evals.append({
    'question': question,
    'best_answer': evals_best_answer,
    'explanation': evals_explanation,
    'graph_wins': graph_wins,
    'embeddings_wins': embeddings_wins,
    'winner': 'graph' if graph_wins > embeddings_wins else 'embeddings',
  })
  total_graph_wins = sum([ev['graph_wins'] for ev in evals])
  total_embeddings_wins = sum([ev['embeddings_wins'] for ev in evals])
  total_winner = 'graph' if total_graph_wins > total_embeddings_wins else 'embeddings'

for ev in evals:
  print(f"Question: {ev['question']}")
  print(f"Winner: {ev['winner']}")
  print(f"Graph wins: {ev['graph_wins']}, Embeddings wins: {ev['embeddings_wins']}")
  print("Explanations:")
  for i, explanation in enumerate(ev['explanation']):
    print(f"  {i+1} ({ev['best_answer'][i]} won). {explanation}")
  print()
print("-----------------------")
print(f"Total Graph Wins: {total_graph_wins}")
print(f"Total Embeddings Wins: {total_embeddings_wins}")
print(f"Total Winner: {total_winner}")

Question: How did the financial and political crisis contribute to the calling of the Estates-General in 1789?
Winner: graph
Graph wins: 5, Embeddings wins: 0
Explanations:
  1 (graph won). The best answer is the first one because it acknowledges the inability to answer the question directly, whereas the other answer is irrelevant and does not address the question at all.
  2 (graph won). The best answer is the other answer because it directly addresses the inability to answer the question based on the provided data, whereas the other answer is irrelevant and does not address the question at all.
  3 (graph won). The best answer is the second one because it acknowledges the inability to answer the question based on the provided data. The other answer does not address the question at all and instead asks for the question to be provided again.
  4 (graph won). The best answer is the first one because it directly acknowledges the inability to answer the question, while the other answer is

So GraphRag is the clear winner. No wonder - by analysing the text, extracting entities and facts, and retrieving them to construct a rich context, it prepares the LLM to answer our questions with depth and accuracy. The cost upfront (in both time and money) is significantly higher, but at inference time the expense, while still higher, is tolerable, and similar to many complex retrieval systems, while resulting in a much better understanding of the data and superior result quality.