<a href="https://colab.research.google.com/github/Ilikepizza2/Graph-Database/blob/main/Graph_database.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install pyvis
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
import math
import torch
import IPython
from pyvis.network import Network
import numpy as np
import pandas as pd
import google.generativeai as genai
import google.ai.generativelanguage as glm

import json
from IPython.display import Markdown
import textwrap

# Used to securely store your API key
from google.colab import userdata


## Entity recognition and relationship extraction
Two models are available for this step, REBEL and Gemini, so you can try each one.

- REBEL (rebel-large): a BART-based model fine-tuned for relation extraction (RE) and entity recognition (ER).

- Gemini: Google's LLM, which can also perform ER and RE when given a suitable prompt.

For REBEL, we use the tokenizer and seq2seq model from [Babelscape/rebel-large](https://huggingface.co/Babelscape/rebel-large).

Before using the REBEL model, obtain an access token from Hugging Face and store it in Colab secrets as `HF_TOKEN`.

In [4]:
tokenizer = AutoTokenizer.from_pretrained('Babelscape/rebel-large')
model = AutoModelForSeq2SeqLM.from_pretrained('Babelscape/rebel-large')

Before using the Gemini model, obtain an API key from Google and store it in Colab secrets as `GOOGLE_API_KEY`.

In [5]:
API_KEY=userdata.get('GOOGLE_API_KEY')

genai.configure(api_key=API_KEY)

## Postprocessing the REBEL output
Clean the raw REBEL output and extract the entities and relationships from it. The code is taken directly from the model's [Hugging Face README](https://huggingface.co/Babelscape/rebel-large).

In [6]:
def extract_relations_from_model_output(text):
    relations = []
    subject, relation, object_ = '', '', ''
    text = text.strip()
    current = 'x'
    text_replaced = text.replace("<s>", "").replace("<pad>", "").replace("</s>", "")
    for token in text_replaced.split():
        if token == "<triplet>":
            current = 't'
            if relation != '':
                relations.append({
                    'head': subject.strip(),
                    'type': relation.strip(),
                    'tail': object_.strip()
                })
                relation = ''
            subject = ''
        elif token == "<subj>":
            current = 's'
            if relation != '':
                relations.append({
                    'head': subject.strip(),
                    'type': relation.strip(),
                    'tail': object_.strip()
                })
            object_ = ''
        elif token == "<obj>":
            current = 'o'
            relation = ''
        else:
            if current == 't':
                subject += ' ' + token
            elif current == 's':
                object_ += ' ' + token
            elif current == 'o':
                relation += ' ' + token
    if subject != '' and relation != '' and object_ != '':
        relations.append({
            'head': subject.strip(),
            'type': relation.strip(),
            'tail': object_.strip()
        })
    return relations
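
To make the token handling above concrete: the function expects REBEL's linearized layout `<triplet> head <subj> tail <obj> relation`, where one head may be followed by several `<subj> ... <obj> ...` pairs. The sketch below is a simplified, self-contained parser for that layout, not the README function; the name `parse_linearized` and the sample string are made up for illustration.

```python
import re


def parse_linearized(text):
    """Parse '<triplet> head <subj> tail <obj> relation' into head/type/tail dicts."""
    # Drop the sequence markers the decoder leaves in when special tokens are kept
    text = re.sub(r"</?s>|<pad>", " ", text)
    triples = []
    # Everything before the first <triplet> marker is ignored
    for chunk in text.split("<triplet>")[1:]:
        head, *pairs = chunk.split("<subj>")
        for pair in pairs:
            if "<obj>" not in pair:
                continue  # incomplete pair, e.g. generation was cut off
            tail, relation = pair.split("<obj>", 1)
            triples.append({
                "head": head.strip(),
                "type": relation.strip(),
                "tail": tail.strip(),
            })
    return triples


# Hand-written sample string (an assumption, not real model output)
sample = ("<s><triplet> Eiffel Tower <subj> Paris <obj> located in "
          "<subj> tower <obj> instance of</s>")
for triple in parse_linearized(sample):
    print(triple)
# {'head': 'Eiffel Tower', 'type': 'located in', 'tail': 'Paris'}
# {'head': 'Eiffel Tower', 'type': 'instance of', 'tail': 'tower'}
```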

# Knowledge Base (KB) Class

This code defines a `KB` class that represents a knowledge base. It stores the entities, the relations between them, and a pandas DataFrame (`df`) holding a title, a text form, and an embedding for each relation.

## Functionality

The `KB` class provides various functionalities for managing the knowledge base, including:

* Adding entities and relations, and clearing the graph
* Checking whether an entity or relation exists
* Saving the knowledge base to JSON files and loading it back
* Finding relations by type
* Retrieving the relations most similar to a query by comparing embeddings
* Constructing context from the retrieved relations by breadth-first search over each entity's nearest neighbours

## Usage

```python

# Create a KB object
kb = KB()

# Add entities and relations
kb.add_entity("Alice")
kb.add_entity("Bob")
kb.add_relation({"head": "Alice", "tail": "Bob", "type": "friends"})

# Print the KB contents
kb.print()

# Find all relations of type "friends"
relations = kb.find_by_relation("friends")
print(kb.relations_to_string(relations))

# Save the KB to JSON files
kb.save_graph()

# Load the KB from JSON files
kb.load_graph()

# Find context for entity "Alice" using nearest neighbor search
context = kb.construct_data_from_fetched_docs_nearest_k_neighbours("Alice")
print(context)

# Find the top 3 most relevant passages for the query "What is the capital of France?"
top_passages = kb.find_top_passages("What is the capital of France?", k=3)
print("Top Passages:")
for passage in top_passages:
  print(passage)

# Find the nearest 2 context entities for the entity "Paris"
context = kb.nearest_k_context("Paris", k=2)
print("Nearest Context:")
for item in context:
  print(item)
```
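
`find_top_passages` ranks the stored relation texts by the dot product between each stored embedding and the query embedding. Below is a dependency-free sketch of that ranking, assuming hand-made 3-dimensional vectors in place of real `genai.embed_content` embeddings. The function name `top_k_by_dot_product` and all data are hypothetical.

```python
def top_k_by_dot_product(texts, embeddings, query_embedding, k=5):
    """Return the k texts whose embeddings score highest against the query."""
    scores = [
        sum(a * b for a, b in zip(vector, query_embedding))
        for vector in embeddings
    ]
    # Highest score first; equal scores keep their original order
    order = sorted(range(len(texts)), key=lambda i: scores[i], reverse=True)
    return [texts[i] for i in order[:k]]


# Toy data: made-up vectors, not real embeddings
texts = [
    "Edges ---> are ---> threads",
    "Paris ---> located in ---> France",
    "Nodes ---> are ---> scrolls",
]
embeddings = [
    [0.9, 0.1, 0.0],
    [0.0, 0.2, 0.9],
    [0.6, 0.5, 0.1],
]
query_embedding = [1.0, 0.0, 0.0]

print(top_k_by_dot_product(texts, embeddings, query_embedding, k=2))
# ['Edges ---> are ---> threads', 'Nodes ---> are ---> scrolls']
```

The dot product only behaves like cosine similarity when the vectors have comparable norms, so this ranking is a reasonable choice only under that assumption.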

In [7]:
from collections import deque
import os

class KB():
    def __init__(self):
        self.relations = []
        self.entities = []
        self.adj = {}
        self.df = pd.DataFrame()
        self.model = 'models/embedding-001'

    def embed_fn(self,title, text):
      return genai.embed_content(model=self.model,
                                content=text,
                                task_type="retrieval_document",
                                title=title)["embedding"]
    def clear_graph(self):
      self.__init__()

    def save_graph(self):
      self.df.to_json("df.json")
      data = {
          "relations": self.relations,
          "entities": self.entities,
          "adj": self.adj,
          "model": self.model,
          "df": "df.json"
      }


      with open("data.json", "w") as f:
        json.dump(data, f, indent=4)

    def load_json_data(self,filename):
      if not os.path.exists(filename):
        print(f"File '{filename}' does not exist.")
        return None

      with open(filename, "r") as f:
        try:
          data = json.load(f)
          return data
        except json.JSONDecodeError:
          print(f"Error decoding JSON data from {filename}.")
          return None

    def load_graph(self):
      data = self.load_json_data('data.json')
      if data is not None:
        self.relations = data["relations"]
        self.entities = data["entities"]
        self.adj = data["adj"]
        self.df = pd.read_json(data["df"])
        self.model = data["model"]

    def add_entity(self, e):
      if e not in self.entities:
        self.entities.append(e)
        self.adj[e] = []

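    def merge_relations(self, r1):
        r2 = [r for r in self.relations
              if self.are_relations_equal(r1, r)][0]
        # Relations built with Gemini carry no "meta" key, so default to empty spans
        new_spans = r1.get("meta", {}).get("spans", [])
        existing_spans = r2.setdefault("meta", {}).setdefault("spans", [])
        existing_spans += [span for span in new_spans
                           if span not in existing_spans]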

    def add_relation(self, r):
        if not self.exists_relation(r):
          self.relations.append(r)

          u, v, w = r["head"], r["tail"], r["type"]

          self.df = pd.concat([self.df, pd.DataFrame({"Title": w, "Text": f'{u} ---> {w} ---> {v}', "Embeddings": [self.embed_fn(w, f'{u} $$^^$$ {v}')]})], axis=0, ignore_index=True)
          if u not in self.entities:
            self.entities.append(u)
            self.adj[u] = []
          self.adj[u].append((v, w))

          if v not in self.entities:
            self.entities.append(v)
            self.adj[v] = []
          # self.adj[v].append((u, w))  # for bidirectional
        else:
          u, v, w = r["head"], r["tail"], r["type"]
          if u not in self.entities:
            self.entities.append(u)
            self.adj[u] = []
          # self.adj[u].append((v, w))
          if v not in self.entities:
            self.entities.append(v)
            self.adj[v] = []
          self.merge_relations(r)

    def are_relations_equal(self, r1, r2):
        return all(r1[attr] == r2[attr] for attr in ["head", "type", "tail"])

    def exists_relation(self, r1):
        return any(self.are_relations_equal(r1, r2) for r2 in self.relations)

    def exists_entity(self, e):
      return e in self.entities

    def print(self):
        print("Entities:")
        for e in self.entities:
          print(f"  {e}")
        print("Relations:")
        for r in self.relations:
            print(f"  {r}")

    def find_by_relation(self, relation_type):
      return [relation for relation in self.relations if relation.get('type') == relation_type]

    def relations_to_string(self,relations):
      result = ""
      for relation in relations:
          result += f"{relation['head']} {relation['type']} {relation['tail']}\n"
      return result.strip()

    def relation_search(self, relation_type):
      matched_relations = self.find_by_relation(relation_type)

      # find_by_relation always returns a list; an empty list yields ''
      return self.relations_to_string(matched_relations)

    def find_entities_from_embeddings(self, fetched_docs):
      fetched_entities = set()

      # Iterate through each string in the list
      for item in fetched_docs:
        # Split the string into parts using the separator " ---> "
        parts = item.split(" ---> ")

        # Extract the first and third parts
        source_entity = parts[0].strip()
        target_entity = parts[2].strip()

        # Add the first and third parts to the set
        fetched_entities.add(source_entity)
        fetched_entities.add(target_entity)

      return fetched_entities

    def construct_data_from_fetched_docs_nearest_k_neighbours(self, query, k_neighbours=2, k_embeddings=5):
      fetched_docs = self.find_top_passages(query, k_embeddings)
      context_string = ""
      fetched_entities = self.find_entities_from_embeddings(fetched_docs)
      context_from_entities = set()
      for entity in fetched_entities:
        res = self.nearest_k_context(entity, k_neighbours)
        if res is not None:
          context_from_entities.update(res)
      for text in context_from_entities:
        context_string= context_string+text+'\n'
      return context_string


    def find_top_passages(self,query, k=5):
      """
      Score each document in the dataframe against the query using the dot product
      of their embeddings. Returns the top k most relevant passages.
      """
      # Embed the query once, with the same model used for the stored relations
      query_embedding = genai.embed_content(model=self.model,
                                            content=query,
                                            task_type="retrieval_query")
      dot_products = np.dot(np.stack(self.df['Embeddings']), query_embedding["embedding"])

      # Sort passages by dot product in descending order (highest similarity first)
      sorted_idx = np.argsort(dot_products)[::-1]

      # Return the top k passages and their corresponding text
      top_passages = []
      for i in range(min(k, len(sorted_idx))):
        idx = sorted_idx[i]
        passage_text = self.df.iloc[idx]['Text']
        top_passages.append(passage_text)
      return top_passages

    def nearest_k_context(self, e, k=2):
      res = set()
      if e not in self.adj:
          return None

      visited = set()
      queue = deque([(e, 1)])  # (node, level)
      print(e)

      while queue:
          node, level = queue.popleft()
          if level > k:
              break
          visited.add(node)
          for neighbor, relation in self.adj[node]:
              if neighbor not in visited:
                  queue.append((neighbor, level + 1))
                  res.add(node + ' ' + relation + ' ' + neighbor + '. ')
                  visited.add(neighbor)

      return res


kb=KB()
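
`nearest_k_context` is a breadth-first search over the adjacency list that stops after `k` levels and turns each traversed edge into a short sentence. The standalone sketch below shows the same idea on a toy adjacency dict; `k_hop_context` and the sample graph are illustrative assumptions, not part of the `KB` class.

```python
from collections import deque


def k_hop_context(adj, start, k=2):
    """Collect 'head relation tail.' sentences for edges within k hops of start."""
    if start not in adj:
        return None
    sentences = set()
    visited = {start}
    queue = deque([(start, 1)])  # (node, level of the edges leaving this node)
    while queue:
        node, level = queue.popleft()
        if level > k:
            break  # BFS order guarantees every later node is at least this deep
        for neighbor, relation in adj.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, level + 1))
                sentences.add(f"{node} {relation} {neighbor}.")
    return sentences


# Toy directed graph: node -> list of (neighbor, relation)
adj = {
    "Paris": [("France", "located in")],
    "France": [("Europe", "part of")],
    "Europe": [("Earth", "part of")],
    "Earth": [],
}

print(sorted(k_hop_context(adj, "Paris", k=2)))
# ['France part of Europe.', 'Paris located in France.']
```

Because edges are stored in one direction only, starting from `"Earth"` yields an empty set even though other nodes point to it.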

# Build the graph using the REBEL model
This function splits the text into overlapping token spans, uses the REBEL model to extract relations from each span, and adds them to the knowledge graph.

In [8]:
def build_using_rebel(text, span_length=128, verbose=False):
    # tokenize whole text
    inputs = tokenizer([text], return_tensors="pt")

    # compute span boundaries
    num_tokens = len(inputs["input_ids"][0])
    if verbose:
        print(f"Input has {num_tokens} tokens")
    num_spans = math.ceil(num_tokens / span_length)
    if verbose:
        print(f"Input has {num_spans} spans")
    overlap = math.ceil((num_spans * span_length - num_tokens) /
                        max(num_spans - 1, 1))
    spans_boundaries = []
    start = 0
    for i in range(num_spans):
        spans_boundaries.append([start + span_length * i,
                                 start + span_length * (i + 1)])
        start -= overlap
    if verbose:
        print(f"Span boundaries are {spans_boundaries}")

    # transform input with spans
    tensor_ids = [inputs["input_ids"][0][boundary[0]:boundary[1]]
                  for boundary in spans_boundaries]
    tensor_masks = [inputs["attention_mask"][0][boundary[0]:boundary[1]]
                    for boundary in spans_boundaries]
    inputs = {
        "input_ids": torch.stack(tensor_ids),
        "attention_mask": torch.stack(tensor_masks)
    }

    # generate relations
    num_return_sequences = 3
    gen_kwargs = {
        "max_length": 256,
        "length_penalty": 0,
        "num_beams": 3,
        "num_return_sequences": num_return_sequences
    }
    generated_tokens = model.generate(
        **inputs,
        **gen_kwargs,
    )

    # decode relations
    decoded_preds = tokenizer.batch_decode(generated_tokens,
                                           skip_special_tokens=False)

    # create kb
    i = 0
    for sentence_pred in decoded_preds:
        current_span_index = i // num_return_sequences
        relations = extract_relations_from_model_output(sentence_pred)
        for relation in relations:
            relation["meta"] = {
                "spans": [spans_boundaries[current_span_index]]
            }
            kb.add_relation(relation)
        i += 1

    print("Knowledge Graph created!")
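
The span arithmetic in `build_using_rebel` can be checked in isolation. The sketch below repeats only the boundary computation, with no tokenizer or model involved; the helper name `span_boundaries` is an assumption made for this example.

```python
import math


def span_boundaries(num_tokens, span_length=128):
    """Split num_tokens into equally sized spans of span_length that overlap evenly."""
    num_spans = math.ceil(num_tokens / span_length)
    # Spread the surplus (num_spans * span_length - num_tokens) across the gaps
    overlap = math.ceil((num_spans * span_length - num_tokens) /
                        max(num_spans - 1, 1))
    boundaries = []
    start = 0
    for i in range(num_spans):
        boundaries.append([start + span_length * i,
                           start + span_length * (i + 1)])
        start -= overlap  # shift every following span back by one more overlap
    return boundaries


print(span_boundaries(300, span_length=128))
# [[0, 128], [86, 214], [172, 300]]
```

With 300 tokens, three spans of 128 tokens are needed, and the 84 surplus positions are absorbed as a 42-token overlap between consecutive spans.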

# Build the graph using the Gemini model
This function prompts the Gemini model to extract entities and the relations between them from the text, then adds them to the knowledge graph.

In [9]:

gemini_model = genai.GenerativeModel('gemini-pro')


def graphPrompt(input: str, metadata=None):
    SYS_PROMPT = (
        "You are a network graph maker who extracts terms and their relations from a given context. "
        "You are provided with a context chunk (delimited by ```) Your task is to extract the ontology "
        "of terms mentioned in the given context. These terms should represent the key concepts as per the context. \n"
        "Thought 1: While traversing through each sentence, Think about the key terms mentioned in it.\n"
            "\tTerms may include object, entity, location, organization, person, \n"
            "\tcondition, acronym, documents, service, concept, etc.\n"
            "\tTerms should be as atomistic as possible\n\n"
        "Thought 2: Think about how these terms can have one on one relation with other terms.\n"
            "\tTerms that are mentioned in the same sentence or the same paragraph are typically related to each other.\n"
            "\tTerms can be related to many other terms\n\n"
        "Thought 3: Find out the relation between each such related pair of terms. \n"
            "\tKeep the terms consistent, and keep the relations consistent as well.\n\n"
        "Format your output as a list of json. Each element of the list contains a pair of terms "
        "and the relation between them, like the following: \n"
        "[\n"
        "   {\n"
        '       "head": "A concept from extracted ontology",\n'
        '       "type": "relationship between the two concepts, head and tail, in one or two sentences",\n'
        '       "tail": "A related concept from extracted ontology"\n'
        "   }, {...}\n"
        "]\n"
        "Example just for reference:\n"
        "[\n"
        "   {\n"
        '       "head": "Napoleon Bonaparte",\n'
        '       "type": "Fought in",\n'
        '       "tail": "Battle of Waterloo"\n'
        "   },\n"
        "   {\n"
        '       "head": "France",\n'
        '       "type": "was Ruled by",\n'
        '       "tail": "Napoleon Bonaparte"\n'
        "    }, \n"
        "   {\n"
        '       "head": "Bourbons",\n'
        '       "type": "restored rule",\n'
        '       "tail": "France"\n'
        "    }, {...}\n"
        "]"
    )

    USER_PROMPT = f"context: ```{input}``` \n\n output: "
    response = gemini_model.generate_content(SYS_PROMPT + ' ' + USER_PROMPT)
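    try:
        result = json.loads(response.text)
    except ValueError as err:
        # Covers json.JSONDecodeError (a ValueError subclass) and any
        # ValueError raised while reading response.text
        print("\n\nERROR ### Could not parse the response: ", err, response.candidates, "\n\n")
        result = None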

    return result

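def build_using_gemini(text):
  relations = graphPrompt(text)
  if relations is None:
    # Parsing failed, so do not claim that a graph was built
    print("No relations extracted; knowledge graph not updated")
    return
  for relation in relations:
    kb.add_relation(relation)
  print("Knowledge graph created")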
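
`graphPrompt` passes the raw response text straight to `json.loads`, which fails if the model wraps its JSON in a Markdown code fence or returns something other than a list of `head`/`type`/`tail` objects. The sketch below shows one way to make that step more tolerant. It assumes such fenced replies can occur, and `parse_relation_list` is a hypothetical helper that the notebook does not use.

```python
import json

# Built programmatically so this example contains no literal fence marker
FENCE = "`" * 3


def parse_relation_list(raw_text):
    """Return a list of head/type/tail dicts, or None if the text is unusable."""
    cleaned = raw_text.strip()
    if cleaned.startswith(FENCE) and cleaned.endswith(FENCE):
        # Drop the opening fence line (it may carry a language tag) and the closing fence
        first_newline = cleaned.find("\n")
        cleaned = cleaned[first_newline + 1:-len(FENCE)] if first_newline != -1 else ""
    try:
        data = json.loads(cleaned)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, list):
        return None
    required = {"head", "type", "tail"}
    # Keep only well-formed entries so later code can index the keys safely
    return [item for item in data
            if isinstance(item, dict) and required.issubset(item)]


fenced_reply = (FENCE + "json\n"
                '[{"head": "Edges", "type": "are", "tail": "threads"}]\n'
                + FENCE)
print(parse_relation_list(fenced_reply))
# [{'head': 'Edges', 'type': 'are', 'tail': 'threads'}]
```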

# Generate text using Gemini
This cell generates text about a given `topic`.

In [10]:
%%time
gen_prompt = "You are a great in detail teacher. Generate content in detail full of objects and their relations to each other. The topic is: "
topic = "Knowledge graphs"
response = gemini_model.generate_content(gen_prompt+topic)

gen_text = response.text
gen_text



CPU times: user 248 ms, sys: 39.5 ms, total: 288 ms
Wall time: 18 s


'**Knowledge Graphs: A Comprehensive Guide**\n\n**Introduction**\n\nKnowledge graphs are structured representations of knowledge that enable machines to understand and reason about the world. They organize a vast collection of entities (objects, concepts, or events) and their relationships into a graph-based data structure.\n\n**Components of Knowledge Graphs**\n\nKnowledge graphs typically consist of three main components:\n\n* **Entities:** Represent real-world objects, concepts, or events. Examples include countries, cities, people, organizations, and products.\n* **Relationships:** Define the connections between entities. Common relationships include "is a," "part of," "located in," and "has property."\n* **Attributes:** Provide additional information about entities, such as name, description, or birth date.\n\n**Types of Knowledge Graphs**\n\nKnowledge graphs can vary in size and complexity. Some common types include:\n\n* **Domain-specific Knowledge Graphs:** Focus on a particula

Alternatively, you can provide the text directly:

In [19]:
sample_text = """
Unveiling the Knowledge Graph: A World of Interconnected Information

Imagine a vast library, not just filled with books, but with interconnected scrolls, each containing information about a specific entity - a person, place, event, or even an abstract concept. These scrolls are linked together by threads, representing the relationships between them. This intricate tapestry of knowledge is what we call a knowledge graph.

Delving into the Structure:

    Nodes: The scrolls in our library, these are the individual entities within the knowledge graph. They can be anything from physical objects like the Eiffel Tower or the Amazon River, to abstract concepts like democracy or evolution.
    Edges: The threads that weave through the library, connecting the nodes and representing the relationships between them. These relationships can be diverse, such as:
        Is-a: This connects a specific entity to a more general category. For example, "Eiffel Tower is-a tower".
        Has-a: This indicates possession or composition. For example, "Amazon River has-a source in the Andes mountains".
        Part-of: This signifies the membership of an entity within a larger whole. For example, "France is-part-of the European Union".
        Located-in: This specifies the geographical location of an entity. For example, "Taj Mahal is-located-in Agra, India".

Building the Knowledge Graph:

    Data Sources: The scrolls in our library are not created in isolation. They are filled with information gathered from various sources, like:
        Web documents: Crawled and processed by search engines like Google.
        Databases: Structured information from various organizations.
        Human experts: Manually adding specialized knowledge.
    Machine Learning: Just like a skilled librarian organizes the scrolls, machine learning algorithms analyze the data, identify entities and relationships, and automatically populate the knowledge graph.
    Reasoning: The knowledge graph goes beyond storing facts; it allows for reasoning and inference. By analyzing the connections between entities, the graph can unveil implicit knowledge. For example, if the graph knows that "Paris is-located-in France" and "France is-in Europe", it can infer that "Paris is-in Europe" without explicitly stating it.

Applications of Knowledge Graphs:

    Search Engines: Knowledge graphs power search engines like Google by providing context to search queries and enabling them to return more relevant and informative results.
    Virtual Assistants: They empower assistants like Siri and Alexa to understand user requests better and provide accurate and comprehensive answers.
    Recommendation Systems: Knowledge graphs help recommend products, movies, or music based on users' past preferences and the relationships between different entities.
    Fraud Detection: By analyzing connections between entities, they can identify patterns indicative of fraudulent activities.

The Future of Knowledge Graphs:

As technology advances, knowledge graphs are poised to play an even greater role in various fields. They hold the potential to revolutionize how we interact with information, enabling machines to understand the world in a more nuanced and human-like way.

In conclusion, knowledge graphs are not just static repositories of data; they are dynamic networks of interconnected knowledge, constantly evolving and offering a deeper understanding of the world around us.

Knowledge Graphs and their Synergy with Retrieval-Augmented Generation: Unveiling Contextual Retrieval Power

Abstract: This paper explores the collaboration between knowledge graphs (KGs) and retrieval-augmented generation (RAG) in information retrieval tasks. KGs provide a structured and interconnected representation of knowledge, while RAG utilizes large language models (LLMs) to retrieve and generate information based on user queries. The paper delves into how KGs and RAG synergistically enhance retrieval accuracy and enable generation of contextually rich responses.

Introduction:

In the realm of information retrieval, the quest for accurate and insightful responses hinges on efficient access to relevant knowledge and its nuanced interpretation. Knowledge graphs (KGs) play a pivotal role by representing knowledge as interconnected entities and relationships, akin to a vast web of interconnected concepts. Retrieval-augmented generation (RAG), on the other hand, leverages the power of large language models (LLMs) to retrieve and generate text based on retrieved information. The interplay between these two paradigms holds immense potential for advancing information retrieval capabilities.

Knowledge Graphs: A Structured Repository of Knowledge:

KGs function as structured repositories of knowledge, analogous to meticulously organized databases. They store entities, encompassing both concrete objects (e.g., Eiffel Tower) and abstract concepts (e.g., democracy), coupled with relationships that reveal their interconnectedness. These relationships, such as "is-a," "has-a," "part-of," and "located-in," provide crucial context for understanding the significance and relevance of individual entities within the knowledge graph. For instance, the relationship "Eiffel Tower is-a tower located-in Paris, France" not only identifies the Eiffel Tower as a physical structure but also situates it geographically.

Retrieval-Augmented Generation: Enabling Contextual Retrieval and Generation:

RAG acts as a bridge between user queries and the vast knowledge stored within the KG. It employs vector search techniques to efficiently identify entities and their relationships that exhibit semantic similarity to the user's query. This retrieved information, akin to extracted evidence, is then analyzed by the LLM within the RAG framework. The LLM leverages its reasoning capabilities and draws upon the rich context provided by the KG's relationships to not only retrieve relevant factual information but also to:

    Infer new knowledge: By analyzing the interconnectedness of entities within the KG, the LLM can infer implicit knowledge not explicitly stated in the retrieved information.
    Generate diverse creative text formats: Going beyond mere factual retrieval, the LLM can generate summaries, explanations, or even narratives, drawing insights from the retrieved information and the broader context provided by the KG.

Synergy of KGs and RAG: Powering Contextual Retrieval and Generation:

The collaboration between KGs and RAG unlocks a multitude of benefits for information retrieval tasks:

    Enhanced Retrieval Accuracy: KGs guide RAG towards the most relevant information within the knowledge base, leading to more focused retrieval and improved understanding of the user's intent.
    Contextualized Responses: The rich connections within the KG empower RAG to generate responses that go beyond isolated facts, offering insights and connections that transcend the user's initial query.
    Improved Reasoning Capabilities: The combined power of the LLM's reasoning abilities and the KG's interconnected knowledge structures enables RAG to generate responses that are not explicitly present in the retrieved data, fostering deeper understanding.

Conclusion:

The synergy between KGs and RAG paves the way for a paradigm shift in information retrieval. By seamlessly integrating structured knowledge with LLM-powered retrieval and generation, this collaboration enables the provision of contextually rich and insightful responses to user queries. As KGs evolve and LLMs become more sophisticated, the potential of this partnership to revolutionize various domains, from search engines and virtual assistants to intelligent chatbots and recommendation systems, is vast and promising.
"""

# Build the graph
Build and print the graph using one of the following:
1. `build_using_gemini(sample_text)`: use Gemini to build the knowledge graph from the provided text
2. `build_using_gemini(gen_text)`: use Gemini to build the knowledge graph from the generated text
3. `build_using_rebel(sample_text)`: use REBEL to build the knowledge graph from the provided text

In [None]:
# use gemini to build knowledge graph from provided text
build_using_gemini(sample_text)
# use gemini to build knowledge graph from generated text
# build_using_gemini(gen_text)
# use rebel to build knowledge graph from provided text
# build_using_rebel(sample_text)
kb.print()


Demo for saving the graph and then clearing it

In [None]:
kb.save_graph()
kb.clear_graph()
kb.print()

Demo for loading the saved graph

In [None]:
kb.load_graph()
kb.print()
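
One detail of the save/load round trip worth knowing: `add_relation` stores each adjacency entry as a `(neighbor, relation)` tuple, and JSON has no tuple type, so the entries come back as two-element lists. The short sketch below, using a made-up adjacency dict, shows that unpacking still works after the round trip.

```python
import json

# Toy adjacency data in the same shape as KB.adj
adj = {"Alice": [("Bob", "friends")]}

# Serialize and parse again, as save_graph and load_graph do via data.json
restored = json.loads(json.dumps({"adj": adj}))["adj"]

print(restored)
# {'Alice': [['Bob', 'friends']]}

# Two-element lists unpack exactly like the original tuples
for neighbor, relation in restored["Alice"]:
    print(neighbor, relation)
# Bob friends
```

Code that compares adjacency entries against tuples would need to account for this change of type after loading.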

# Query
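Sample queries of three kinds: a keyword query (breadth-first search from an exact entity name), a semantic keyword query (embedding search followed by breadth-first search from the matched entities), and a semantic query (embedding search only). The keyword lookup is an exact, case-sensitive match on entity names.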

In [24]:
query = "edges"

semantic_bfs_query = kb.construct_data_from_fetched_docs_nearest_k_neighbours(query, k_embeddings=5, k_neighbours=1)

bfs_query = kb.nearest_k_context(query)
print(bfs_query)

print("-----------\n")

print(semantic_bfs_query)

print("-----------\n")

semantic_query = kb.find_top_passages(query=query, k=10)
print(semantic_query)

Represent relationships between entities
threads
relationships
Edges
Machine Learning
data
Data Sources
web documents, databases, and human experts
None
-----------

Machine Learning analyzes data. 
Machine Learning identifies entities and relationships. 
Edges Components within a Knowledge Graph Represent relationships between entities. 
Edges are threads. 
Data Sources Input for building Knowledge Graphs Include web documents, databases, and human experts. 
Data Sources include web documents, databases, and human experts. 
Machine Learning populates knowledge graph. 
Edges represent relationships. 
Machine Learning Technique used to build Knowledge Graphs Analyzes data, identifies entities and relationships. 

-----------

['Edges ---> are ---> threads', 'Edges ---> Components within a Knowledge Graph ---> Represent relationships between entities', 'Edges ---> represent ---> relationships', 'Machine Learning ---> analyzes ---> data', 'Data Sources ---> include ---> web documents, dat
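
The `None` in the output above comes from the keyword lookup: `nearest_k_context` returns `None` when its argument is not a key of `kb.adj`, and here the query is `edges` while the graph stores `Edges`. One possible workaround is to resolve the query to a stored entity name first. The helper `resolve_entity` below is a hypothetical sketch and not part of the `KB` class.

```python
def resolve_entity(adj, name):
    """Return the stored key that matches name case-insensitively, or None."""
    wanted = name.strip().casefold()
    for key in adj:
        if key.casefold() == wanted:
            return key
    return None


# Toy adjacency dict in the same shape as KB.adj
adj = {
    "Edges": [("threads", "are")],
    "threads": [],
}

print(resolve_entity(adj, "edges"))
# Edges
print(resolve_entity(adj, "vertices"))
# None
```

The resolved key could then be passed to `kb.nearest_k_context` in place of the raw query string.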

### Code to save the knowledge graph to an HTML file for visualisation

In [21]:
def save_network_html(kb, filename="network.html"):
    # create network
    net = Network(directed=True, width="700px", height="700px", bgcolor="#eeeeee", notebook=True, cdn_resources='remote')

    # nodes
    color_entity = "#b9e6f9"
    for e in kb.entities:
        net.add_node(e, shape="circle", color=color_entity)

    # edges
    for r in kb.relations:
        net.add_edge(r["head"], r["tail"],
                    title=r["type"], label=r["type"])
    # configure the layout physics, then save and show the network
    net.repulsion(
        node_distance=200,
        central_gravity=0.2,
        spring_length=200,
        spring_strength=0.05,
        damping=0.09
    )
    net.set_edge_smooth('dynamic')
    net.show(filename, notebook=True)

## Display the graph

In [22]:
filename = "kg.html"
save_network_html(kb, filename=filename)
IPython.display.HTML(filename=filename)

kg.html


# Credits:
- [Fabio Chiusano](https://medium.com/@chiusanofabio94) (from NLPlanet) has a great article on knowledge graphs and the REBEL model. Much of this notebook references and reuses material from the [article](https://medium.com/nlplanet/building-a-knowledge-base-from-texts-a-full-practical-example-8dbbffb912fa).
- [Babelscape REBEL model](https://huggingface.co/Babelscape/rebel-large) on Hugging Face
- [Gemini docs](https://ai.google.dev/gemini-api/docs)