# GraphRAG using LangChain

* Video tutorial [here](https://medium.com/data-science-in-your-pocket/graphrag-using-langchain-31b1ef8328b9)

## Approach 1: LLMGraphTransformer

LLMGraphTransformer: [documentation](https://api.python.langchain.com/en/latest/experimental/graph_transformers/langchain_experimental.graph_transformers.llm.LLMGraphTransformer.html#langchain_experimental.graph_transformers.llm.LLMGraphTransformer.convert_to_graph_documents)

1. Install the required packages.

pip install --upgrade --quiet json-repair networkx langchain langchain-core langchain-google-vertexai langchain-experimental langchain-community

Versions used:
langchain==0.2.8
langchain-community==0.2.7
langchain-core==0.2.19
langchain-experimental==0.0.62
langchain-google-vertexai==1.0.3

2. Import the required classes, then initialize the LLM object and the reference text. Knowledge graph extraction is a demanding task, so use the strongest LLM available to you for best results.

In [None]:
import os

import networkx as nx
from langchain.chains import GraphQAChain
from langchain_community.graphs.networkx_graph import NetworkxEntityGraph
from langchain_core.documents import Document
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_google_vertexai import VertexAI

# LLM used for both graph extraction and question answering
llm = VertexAI(max_output_tokens=4000, model_name="text-bison-32k")

text = """
Marie Curie, born in 1867, was a Polish and naturalised-French physicist and chemist who conducted pioneering research on radioactivity.
She was the first woman to win a Nobel Prize, the first person to win a Nobel Prize twice, and the only person to win a Nobel Prize in two scientific fields.
Her husband, Pierre Curie, was a co-winner of her first Nobel Prize, making them the first-ever married couple to win the Nobel Prize and launching the Curie family legacy of five Nobel Prizes.
She was, in 1906, the first woman to become a professor at the University of Paris. 
"""

3. Next, wrap the text in a Document, create an LLMGraphTransformer object from the LLM, and convert the document into graph documents.

In [None]:
documents = [Document(page_content=text)]
llm_transformer = LLMGraphTransformer(llm=llm)
graph_documents = llm_transformer.convert_to_graph_documents(documents)

4. It's time to create the knowledge graph. It is best to provide the node types and relationship types you want to extract; otherwise the LLM may treat almost anything as an entity or relationship.

In [None]:
llm_transformer_filtered = LLMGraphTransformer(
    llm=llm,
    allowed_nodes=["Person", "Country", "Organization"],
    allowed_relationships=["NATIONALITY", "LOCATED_IN", "WORKED_AT", "SPOUSE"],
)
graph_documents_filtered = llm_transformer_filtered.convert_to_graph_documents(
    documents
)

The snippet above restricts extraction to the following types:

Node types = "Person", "Country", "Organization"

Relationship types = "NATIONALITY", "LOCATED_IN", "WORKED_AT", "SPOUSE"

Note: Any other potential node or relationship is discarded. If you aren't sure which types to allow, pass only the LLM object (as in step 3) and let the LLM decide.
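To make the effect of the two allow-lists concrete, here is a minimal pure-Python sketch. It only mimics the outcome of the filtering; it is not how LLMGraphTransformer implements it. The `raw_triples` list and the `keep` helper are hypothetical, standing in for whatever the LLM would extract from the text.

```python
# Illustrative only: mimics the *effect* of allowed_nodes / allowed_relationships,
# not the library's internal implementation.
ALLOWED_NODES = {"Person", "Country", "Organization"}
ALLOWED_RELATIONSHIPS = {"NATIONALITY", "LOCATED_IN", "WORKED_AT", "SPOUSE"}

# Hypothetical raw extraction: (source, source_type, relation, target, target_type)
raw_triples = [
    ("Marie Curie", "Person", "SPOUSE", "Pierre Curie", "Person"),
    ("Marie Curie", "Person", "WORKED_AT", "University of Paris", "Organization"),
    ("Marie Curie", "Person", "WON", "Nobel Prize", "Award"),
    ("Marie Curie", "Person", "BORN_IN", "1867", "Year"),
]


def keep(triple):
    """Keep a triple only if both node types and the relation are allowed."""
    _, source_type, relation, _, target_type = triple
    return (
        source_type in ALLOWED_NODES
        and target_type in ALLOWED_NODES
        and relation in ALLOWED_RELATIONSHIPS
    )


filtered = [t for t in raw_triples if keep(t)]
for source, _, relation, target, _ in filtered:
    print(f"{source} -[{relation}]-> {target}")
```

In this sketch the Nobel Prize and birth-year facts are dropped because "Award", "Year", "WON", and "BORN_IN" are not in the allow-lists. If you want such facts to be answerable later, add the corresponding types in step 4.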

5. We now need to create a Networkx graph and add the above-identified nodes and edges to this graph

In [None]:
graph = NetworkxEntityGraph()

# Add nodes to the graph (one input document, so use the first graph document)
for node in graph_documents_filtered[0].nodes:
    graph.add_node(node.id)

# Add edges to the graph, storing the relationship type as the "relation" attribute
for edge in graph_documents_filtered[0].relationships:
    graph._graph.add_edge(
        edge.source.id,
        edge.target.id,
        relation=edge.type,
    )
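One detail of this step is worth illustrating: each edge carries a single `relation` attribute. The sketch below uses a plain dictionary as a stand-in for a directed graph (an assumption made so that it runs without NetworkX installed); the `add_edge` and `triples` helpers and the sample edges, including the "COLLABORATED_WITH" relation, are hypothetical.

```python
# Stand-in for a directed graph: {source: {target: {"relation": label}}}
graph_store = {}


def add_edge(store, source, target, relation):
    """Add a directed edge; a repeated (source, target) pair overwrites the label."""
    store.setdefault(source, {})[target] = {"relation": relation}
    store.setdefault(target, {})  # make sure the target exists as a node


def triples(store):
    """Read the graph back as (source, target, relation) tuples."""
    return [
        (source, target, attrs["relation"])
        for source, targets in store.items()
        for target, attrs in targets.items()
    ]


# Hypothetical extraction result with two relations between the same pair
add_edge(graph_store, "Marie Curie", "Pierre Curie", "SPOUSE")
add_edge(graph_store, "Marie Curie", "University of Paris", "WORKED_AT")
add_edge(graph_store, "Marie Curie", "Pierre Curie", "COLLABORATED_WITH")

print(triples(graph_store))
```

In this stand-in, the second edge between Marie Curie and Pierre Curie replaces the "SPOUSE" label. A simple (non-multi) directed graph behaves the same way, so if the LLM returns several relationships for one pair, it is worth checking which one survives in the graph.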

6. Now create a GraphQAChain, which lets us query the knowledge graph.

In [None]:
chain = GraphQAChain.from_llm(
    llm=llm, 
    graph=graph, 
    verbose=True
)

7. Call the chain object with your query

In [None]:
question = """Who is Marie Curie?"""
chain.run(question)
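Conceptually, a graph QA chain of this kind extracts the entities mentioned in the question, looks up the facts stored for those entities, and hands those facts to the LLM as context. The sketch below imitates only the retrieval part in plain Python. It rests on several assumptions: the substring match stands in for the LLM entity-extraction step, only outgoing edges of a matched entity are used, and `TRIPLES` and all helper names are hypothetical.

```python
# Hypothetical triples, as if read back from the graph built in step 5:
# (source, target, relation)
TRIPLES = [
    ("Marie Curie", "Pierre Curie", "SPOUSE"),
    ("Marie Curie", "University of Paris", "WORKED_AT"),
]


def extract_entities(question, triples):
    """Stand-in for LLM entity extraction: naive case-insensitive substring match."""
    known = sorted({name for s, t, _ in triples for name in (s, t)})
    return [name for name in known if name.lower() in question.lower()]


def entity_knowledge(entity, triples):
    """Collect the outgoing edges of one entity as plain-text facts."""
    return [f"{s} {rel} {t}" for s, t, rel in triples if s == entity]


def build_context(question, triples):
    """Gather the facts for every entity found in the question."""
    facts = []
    for entity in extract_entities(question, triples):
        facts.extend(entity_knowledge(entity, triples))
    return "\n".join(facts)


context = build_context("Who is Marie Curie?", TRIPLES)
print(context)
```

Under these assumptions, a question about Pierre Curie would retrieve nothing, because he appears only as the target of an edge. Running the real chain with `verbose=True`, as in step 6, is a convenient way to compare its intermediate output with this sketch.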

## Approach 2: GraphIndexCreator

Another approach is to use GraphIndexCreator in LangChain, which is very similar to the approach above. The steps are:

1. Create a GraphIndexCreator using an LLM.
2. Read the text from a .txt file.
3. Create the graph using the index creator.
4. Run the GraphQA chain on the graph, as in the approach above.

In [None]:
from langchain.indexes import GraphIndexCreator
from langchain.chains import GraphQAChain

index_creator = GraphIndexCreator(llm=llm)

with open("/home/cdsw/sample.txt") as f:
    all_text = f.read()
    
text = "\n".join(all_text.split("\n\n"))
graph = index_creator.from_text(text)

chain = GraphQAChain.from_llm(llm, graph=graph, verbose=True)
chain.run("What did Pierre Curie win?")
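The preprocessing line in the cell above, which splits on blank lines and joins with single newlines, can be checked in isolation. This small sketch uses a hypothetical two-paragraph string in place of the contents of the sample file.

```python
# Hypothetical file contents: two paragraphs separated by a blank line
all_text = "Marie Curie was a physicist.\n\nPierre Curie was her husband.\n"

# Same expression as in the tutorial: collapse paragraph breaks to single newlines
text = "\n".join(all_text.split("\n\n"))

print(repr(text))
```

The result keeps every sentence but removes the empty lines between paragraphs, so the index creator receives one contiguous block of text.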