# Practical implementation of RAG & GraphRAG (Microsoft)




# Installing libraries

In [1]:
! pip install tantivy
! pip install -U langchain-openai langchain-community
! pip install lancedb
! pip install graphrag
! pip install langchainhub

Collecting tantivy
  Downloading tantivy-0.22.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (1.2 kB)
Downloading tantivy-0.22.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.5 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.5/4.5 MB 21.9 MB/s eta 0:00:00
Installing collected packages: tantivy
Successfully installed tantivy-0.22.0
Collecting langchain-openai
  Downloading langchain_openai-0.1.22-py3-none-any.whl.metadata (2.6 kB)
Collecting langchain-community
  Downloading langchain_community-0.2.12-py3-none-any.whl.metadata (2.7 kB)
Collecting langchain-core<0.3.0,>=0.2.33 (from langchain-openai)
  Downloading langchain_core-0.2.33-py3-none-any.whl.metadata (6.2 kB)
Collecting openai<2.0.0,>=1.40.0 (from langchain-openai)
  Downloading openai-1.41.0-py3-none-any.whl.metadata (22 kB)
Collecting tiktoken<1,>=0.7 (from langchain-openai)
  Downloading tiktoken-0.7.0-cp310-cp310-manylinux_2_17_x86_64.ma

# Dataset Download

In [2]:
# Create the dataset directory
!mkdir -p rag_exim/input

# Download the dataset into the input directory
!wget -P rag_exim/input https://github.com/akashAD98/dummy_data/raw/main/HTE_gst_scheme.txt


--2024-08-19 13:15:26--  https://github.com/akashAD98/dummy_data/raw/main/HTE_gst_scheme.txt
Resolving github.com (github.com)... 140.82.114.4
Connecting to github.com (github.com)|140.82.114.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/akashAD98/dummy_data/main/HTE_gst_scheme.txt [following]
--2024-08-19 13:15:26--  https://raw.githubusercontent.com/akashAD98/dummy_data/main/HTE_gst_scheme.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 216384 (211K) [text/plain]
Saving to: ‘rag_exim/input/HTE_gst_scheme.txt’


2024-08-19 13:15:27 (5.88 MB/s) - ‘rag_exim/input/HTE_gst_scheme.txt’ saved [216384/216384]
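
Before indexing, it can be worth confirming that the file arrived intact. The sketch below compares the file size on disk with the 216384 bytes that wget reports above. The helper name `check_download` is hypothetical and not part of any library used in this notebook.

```python
from pathlib import Path

# Size in bytes reported by wget in the download output above
EXPECTED_BYTES = 216384


def check_download(path, expected_bytes=EXPECTED_BYTES):
    """Return True if the file exists and has exactly the expected size."""
    file_path = Path(path)
    return file_path.is_file() and file_path.stat().st_size == expected_bytes
```

In the notebook this could be called as `check_download("rag_exim/input/HTE_gst_scheme.txt")`. A size check only catches truncated downloads, not changed content.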



In [3]:
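import os
import getpass

# Prompt for the key instead of hardcoding it in the notebook
os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API key: ")
# Disable LangSmith tracing
os.environ["LANGCHAIN_TRACING_V2"] = "false"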


In [12]:

from langchain import hub
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import LanceDB
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from lancedb.rerankers import LinearCombinationReranker



# Load the downloaded text file as a LangChain document
loader = TextLoader("/content/rag_exim/input/HTE_gst_scheme.txt")
docs = loader.load()

# Split into chunks of up to 1100 characters with a 100-character overlap
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1100, chunk_overlap=100)
splits = text_splitter.split_documents(docs)

# Embed the chunks with OpenAI embeddings and store them in LanceDB
reranker = LinearCombinationReranker(weight=0.3)
vectorstore = LanceDB.from_documents(documents=splits, embedding=OpenAIEmbeddings(), reranker=reranker)

# Retrieve the relevant chunks of the document and generate an answer from them.
retriever = vectorstore.as_retriever()
prompt = hub.pull("rlm/rag-prompt")

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

query = "tell me how to do export?"

rag_chain.invoke(query)



'To export, one must first decide on the mode of shipping and find an ocean freight forwarder. It is important to arrange for a customs clearance agent to ensure compliance with export processes. Exporting can lead to increased profit margins, reduced production costs per unit, and improved liquidity for a business.'
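
To make the `chunk_size=1100, chunk_overlap=100` settings above more concrete, here is a simplified sketch of overlapping character windows. It is not LangChain's algorithm: `RecursiveCharacterTextSplitter` also tries to break on paragraph and sentence separators, which this hypothetical `split_with_overlap` helper ignores.

```python
def split_with_overlap(text, chunk_size=1100, chunk_overlap=100):
    """Split text into fixed-size character windows that overlap.

    Simplified illustration only: separators such as paragraph
    breaks are ignored, unlike in RecursiveCharacterTextSplitter.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks
```

The overlap means the last 100 characters of one chunk reappear at the start of the next, so a sentence cut at a chunk boundary is still fully present in at least one chunk.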

# GraphRAG

In [4]:
# Please pass your OpenAI API key
import os
os.environ["GRAPHRAG_API_KEY"] = "sk-proj-"


#### Step 1. Set up the initial structure for indexing your dataset


In [6]:
!python3 -m graphrag.index --init --root /content/rag_exim

2024-08-19 13:24:21.696316: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-08-19 13:24:21.732293: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-08-19 13:24:21.742711: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
Initializing project at /content/rag_exim
⠋ GraphRAG Indexer 

#### Step 2. Index the dataset



In [7]:
!python3 -m graphrag.index --root /content/rag_exim

Streaming output truncated to the last 5000 lines.
  return bound(*args, **kwds)
⠹ GraphRAG Indexer 
├── Loading Input (text) - 1 files loaded (0 filtered) ━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
└── create_base_text_units
🚀 create_base_text_units
                                  id  ... n_tokens
0   b72621ee5015cef242293da6c269b253  ...     1200
1   f47f6a2e18ac153318640fa71a7c1538  ...     1200
2   0a7889ece89d1dda42f5268b737eca32  ...     1200

#### Step 3. Execute the query


In [8]:
!python3 -m graphrag.query --root /content/rag_exim --method global "tell me how to do export?"




INFO: Reading settings from /content/rag_exim/settings.yaml
creating llm client with {'api_key': 'REDACTED,len=56', 'type': "openai_chat", 'model': 'gpt-4-turbo-preview', 'max_tokens': 4000, 'temperature': 0.0, 'top_p': 1.0, 'n': 1, 'request_timeout': 180.0, 'api_base': None, 'api_version': None, 'organization': None, 'proxy': None, 'cognitive_services_endpoin

In [9]:
!python3 -m graphrag.query --root /content/rag_exim --method local "Tell me how to do export ?"





INFO: Reading settings from /content/rag_exim/settings.yaml

INFO: Vector Store Args: {}
[2024-08-19T13:38:26Z WARN  lance::dataset] No existing dataset at /content/lancedb/entity_description_embeddings.lance, it will be created
creating llm client with {'api_key': 'REDACTED,len=56', 'type': "openai_chat", 'model'
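
The two query commands above differ only in the `--method` value, so they are easy to wrap from Python when a script needs to call them. The sketch below builds the same `graphrag.query` command line shown in this notebook. The function names `build_query_command` and `run_query` are hypothetical, and `sys.executable` is used in place of `python3` on the assumption that GraphRAG is installed in the current interpreter.

```python
import subprocess
import sys


def build_query_command(root, method, question):
    """Build the argv list for the graphrag.query CLI used above."""
    if method not in ("local", "global"):
        raise ValueError("method must be 'local' or 'global'")
    return [sys.executable, "-m", "graphrag.query",
            "--root", root, "--method", method, question]


def run_query(root, method, question):
    """Run the query CLI and return everything it wrote to stdout."""
    completed = subprocess.run(
        build_query_command(root, method, question),
        capture_output=True, text=True, check=True,
    )
    return completed.stdout
```

For example, `run_query("/content/rag_exim", "global", "tell me how to do export?")` mirrors the first command of this step. The returned text may include log lines as well as the answer, so a real application would probably need to filter it.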

#### Query local search

In [None]:
!python3 -m graphrag.query --root /content/rag_exim --method local "what are the different Payment Methods in International Trade"



INFO: Reading settings from /content/rag_exim/settings.yaml

INFO: Vector Store Args: {}
[2024-08-11T08:34:58Z WARN  lance::dataset] No existing dataset at /content/lancedb/entity_description_embeddings.lance, it will be created
creating llm client with {'api_key': 'REDACTED,len=56', 'type': "openai_chat", 'model':
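
The Python cell in the next section reads its Parquet files from a timestamped run folder (`output/20240811-082355/artifacts`), which changes every time the indexer runs. The sketch below looks up the most recent run instead of hardcoding it. It assumes the `output/<timestamp>/artifacts` layout seen in this notebook, which may differ in other GraphRAG versions, and `latest_artifacts_dir` is a hypothetical helper name.

```python
from pathlib import Path


def latest_artifacts_dir(root):
    """Return the artifacts folder of the most recent indexing run, or None."""
    output_dir = Path(root) / "output"
    if not output_dir.is_dir():
        return None
    # Run folders are named like 20240811-082355, so sorting by name
    # also sorts them chronologically.
    run_dirs = sorted(
        p for p in output_dir.iterdir() if (p / "artifacts").is_dir()
    )
    return run_dirs[-1] / "artifacts" if run_dirs else None
```

With this, `INPUT_DIR` could be set to `str(latest_artifacts_dir("/content/rag_exim"))` after checking that the result is not `None`.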

### Python code you can use in your chatbot application

In [None]:


import os
import pandas as pd
import tiktoken
import asyncio
from graphrag.query.indexer_adapters import read_indexer_entities, read_indexer_reports
from graphrag.query.llm.oai.chat_openai import ChatOpenAI
from graphrag.query.llm.oai.typing import OpenaiApiType
from graphrag.query.structured_search.global_search.community_context import (
    GlobalCommunityContext,
)
from graphrag.query.structured_search.global_search.search import GlobalSearch

# ## Global Search example
api_key = os.environ["GRAPHRAG_API_KEY"]  # set in the GraphRAG setup cell above
llm_model = "gpt-3.5-turbo"

llm = ChatOpenAI(
    api_key=api_key,
    model=llm_model,
    api_type=OpenaiApiType.OpenAI,  # OpenaiApiType.OpenAI or OpenaiApiType.AzureOpenAI
    max_retries=20,
)


print("llm loaded okay")

token_encoder = tiktoken.get_encoding("cl100k_base")

INPUT_DIR = "/content/rag_exim/output/20240811-082355/artifacts"  # output folder that contains the Parquet files
COMMUNITY_REPORT_TABLE = "create_final_community_reports"
ENTITY_TABLE = "create_final_nodes"
ENTITY_EMBEDDING_TABLE = "create_final_entities"

# Community level in the Leiden community hierarchy from which the community reports are loaded.
# A higher value uses reports from more fine-grained communities, at a higher computation cost.
COMMUNITY_LEVEL = 2

# %%
entity_df = pd.read_parquet(f"{INPUT_DIR}/{ENTITY_TABLE}.parquet")
report_df = pd.read_parquet(f"{INPUT_DIR}/{COMMUNITY_REPORT_TABLE}.parquet")
entity_embedding_df = pd.read_parquet(f"{INPUT_DIR}/{ENTITY_EMBEDDING_TABLE}.parquet")

reports = read_indexer_reports(report_df, entity_df, COMMUNITY_LEVEL)
entities = read_indexer_entities(entity_df, entity_embedding_df, COMMUNITY_LEVEL)
print(f"Report records: {len(report_df)}")
report_df.head()

# #### Build global context based on community reports
context_builder = GlobalCommunityContext(
    community_reports=reports,
    entities=entities,  # default to None if you don't want to use community weights for ranking
    token_encoder=token_encoder,
)

# #### Perform global search

context_builder_params = {
    "use_community_summary": False,  # False means using full community reports. True means using community short summaries.
    "shuffle_data": True,
    "include_community_rank": True,
    "min_community_rank": 0,
    "community_rank_name": "rank",
    "include_community_weight": True,
    "community_weight_name": "occurrence weight",
    "normalize_community_weight": True,
    "max_tokens": 3_000,  # change this based on the token limit you have on your model (if you are using a model with 8k limit, a good setting could be 5000)
    "context_name": "Reports",
}

map_llm_params = {
    "max_tokens": 1000,
    "temperature": 0.0,
    "response_format": {"type": "json_object"},
}

reduce_llm_params = {
    "max_tokens": 2000,  # change this based on the token limit you have on your model (if you are using a model with 8k limit, a good setting could be 1000-1500)
    "temperature": 0.0,
}

search_engine = GlobalSearch(
    llm=llm,
    context_builder=context_builder,
    token_encoder=token_encoder,
    max_data_tokens=12_000,  # change this based on the token limit you have on your model (if you are using a model with 8k limit, a good setting could be 5000)
    map_llm_params=map_llm_params,
    reduce_llm_params=reduce_llm_params,
    allow_general_knowledge=False,  # setting this to True adds an instruction that encourages the LLM to incorporate general knowledge in the response, which may increase hallucinations but could be useful in some use cases.
    json_mode=True,  # set this to False if your LLM model does not support JSON mode.
    context_builder_params=context_builder_params,
    concurrent_coroutines=10,
    response_type="multiple-page report",  # free form text describing the response type and format, can be anything, e.g. prioritized list, single paragraph, multiple paragraphs, multiple-page report
)

query = "what are the different Payment Methods in International Trade"
result = await search_engine.asearch(query)
print(result.response)
print("____________________________________")
print(f"LLM calls: {result.llm_calls}. LLM tokens: {result.prompt_tokens}")


llm loaded okay
Report records: 48
# Summary of Analyst Reports on Payment Methods in International Trade

In the realm of international trade, various payment methods play crucial roles in facilitating transactions between parties. Analysts have highlighted key methods that are commonly employed, each with its unique characteristics and implications.

## Letter of Credit (LC)
Analysts emphasize the significance of **Letter of Credit (LC)** as a widely used payment method in international trade. This method involves the buyer's bank guaranteeing payment to the seller upon the presentation of compliant documents. It serves to provide security for both parties involved in the transaction [Data: Reports (15, 24, 35, 42, +more)].

## Documentary Collection
**Documentary Collection** is another essential payment method where the exporter entrusts the collection of payment to their bank. The bank forwards shipping documents to the importer's bank for payment, offering more security than open
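
The `map_llm_params` and `reduce_llm_params` settings in the cell above reflect how global search works: GraphRAG answers the question against batches of community reports (map), then combines the highest-rated partial answers into a final response (reduce). The sketch below illustrates that flow without calling any LLM. All function names are hypothetical, the word-overlap scorer is only a stand-in for the model, and the real reduce step is another LLM call rather than a simple join.

```python
def map_step(question, report_batches, score_batch):
    """Collect scored answer fragments from every batch of reports."""
    points = []
    for batch in report_batches:
        # score_batch stands in for one LLM call that returns
        # (answer_fragment, importance_score) pairs for a batch.
        points.extend(score_batch(question, batch))
    return points


def reduce_step(points, max_points=5):
    """Keep the highest-scoring fragments and join them into one answer."""
    useful = [p for p in points if p[1] > 0]
    top = sorted(useful, key=lambda p: p[1], reverse=True)[:max_points]
    return "\n".join(fragment for fragment, _ in top)


def keyword_scorer(question, batch):
    """Toy stand-in for the LLM: score = number of words shared with the question."""
    words = set(question.lower().split())
    return [(report, len(words & set(report.lower().split()))) for report in batch]


batches = [
    ["letter of credit is a payment method", "shipping by sea"],
    ["payment by open account"],
]
answer = reduce_step(map_step("payment method", batches, keyword_scorer))
print(answer)
```

Each batch in the map step corresponds to one LLM call in the real pipeline, which is why global search makes more calls than a single-prompt RAG chain.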

The difference in output quality between the two methods is clear. GraphRAG is able to extract deeper insights from graph nodes, whereas traditional RAG struggles in this area. However, this comes at a higher cost: GraphRAG requires 11 LLM calls, which increases token usage.
Both methods have trade-offs, and there are ways to enhance their performance. For GraphRAG, using a local LLaMA model could reduce costs, while RAG has various optimization techniques available. Check out our repository for more details on improving both approaches.
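
To put a number on the cost side of this trade-off, the token counts that the search result reports (for example `result.prompt_tokens` in the cell above) can be turned into a rough estimate. The sketch below is simple arithmetic. The helper name `estimate_cost` is hypothetical, and the prices passed to it are placeholders that you would replace with your provider's current rates.

```python
def estimate_cost(prompt_tokens, completion_tokens,
                  price_in_per_1k, price_out_per_1k):
    """Estimate the cost of a query from token counts and per-1k-token prices."""
    input_cost = prompt_tokens / 1000 * price_in_per_1k
    output_cost = completion_tokens / 1000 * price_out_per_1k
    return input_cost + output_cost


# Placeholder numbers only, not measurements or real prices
example = estimate_cost(prompt_tokens=2000, completion_tokens=1000,
                        price_in_per_1k=0.5, price_out_per_1k=1.5)
print(example)
```

Running the same estimate for a RAG query and a GraphRAG query on the same question gives a like-for-like comparison of the two approaches.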