# llmgraph

[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dylanhogg/llmgraph/blob/master/notebooks/llmgraph_example.ipynb)

Create knowledge graphs with LLMs.

https://github.com/dylanhogg/llmgraph

<img src="https://github.com/dylanhogg/llmgraph/blob/main/docs/img/header.jpg?raw=true" alt="drawing" width="600px"/>

llmgraph creates knowledge graphs in [GraphML](http://graphml.graphdrawing.org/), [GEXF](https://gexf.net/), and HTML formats (the HTML is generated via [pyvis](https://github.com/WestHealth/pyvis)), starting from the Wikipedia page of a given source entity. The graphs are built by extracting world knowledge from ChatGPT or other large language models (LLMs) supported by [LiteLLM](https://github.com/BerriAI/litellm).

For background on knowledge graphs, see this [YouTube overview by Computerphile](https://www.youtube.com/watch?v=PZBm7M0HGzw).

## Install llmgraph

In [13]:
# Install llmgraph from pypi (https://pypi.org/project/llmgraph/)
# (Ignore any dependency resolver issues on Google Colab, they're fine)
%pip install llmgraph -q

In [2]:
# Display installed llmgraph version
%pip list | grep llmgraph

llmgraph                         1.2.1


## Imports

In [3]:
import IPython
import os
import getpass
from pathlib import Path

## Enter your OpenAI API Key

In [4]:
# Set OPENAI_API_KEY from user input (hidden in UI via getpass function)
os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API Key")

Enter your OpenAI API Key··········
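
When the notebook is re-run, the key may already be present in the environment, in which case the prompt can be skipped. A minimal sketch of that idea; the helper name `ensure_api_key` and its parameters are hypothetical and not part of llmgraph:

```python
import getpass
import os


def ensure_api_key(name="OPENAI_API_KEY", env=os.environ, prompt=getpass.getpass):
    """Return the key stored under `name`, prompting only when it is missing.

    Hypothetical helper for illustration; not part of llmgraph.
    """
    key = env.get(name, "").strip()
    if not key:
        # Nothing usable in the environment: ask the user (input stays hidden).
        key = prompt(f"Enter your {name}: ").strip()
        if not key:
            raise ValueError(f"{name} must not be empty")
        env[name] = key
    return key
```

Calling `ensure_api_key()` in place of the direct assignment leaves an existing key untouched and rejects an empty entry early, before any LLM call is made.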


## Run llmgraph command

In [5]:
!llmgraph --help

 Usage: llmgraph [OPTIONS] ENTITY_TYPE ENTITY_WIKIPEDIA

 Create knowledge graphs with LLMs

╭─ Arguments ──────────────────────────────────────────────────────────────────────────────────────╮
│ *    entity_type           TEXT  Entity type (e.g. movie) [default: None] [required]             │
│ *    entity_wikipedia      TEXT  Full wikipedia link to root entity [default: None] [required]

In [6]:
# Run llmgraph
# Note: valid `entity_type` values are found here: https://github.com/dylanhogg/llmgraph/blob/main/llmgraph/prompts.yaml
!llmgraph concepts-general https://en.wikipedia.org/wiki/Large_language_model --levels 3 --llm-model gpt-3.5-turbo --llm-temp 0.0 --no-allow-user-input

Running with entity_type='concepts-general',
entity_wikipedia='https://en.wikipedia.org/wiki/Large_language_model', entity_root='Large language
model', custom_entity_root=False, levels=3, llm_model='gpt-3.5-turbo', llm_temp=0.0,
output_folder='./_output/'
Reading https://en.wikipedia.org/wiki/Large_language_model
Processing Large language model (level 1, total tokens 0)
Reading https://en.wikipedia.org/wiki/Transformer_(machine_learning_model)
Reading https://en.wikipedia.org/wiki/BERT_(language_model)
Reading https://en.wikipedia.org/wiki/GPT_(language_model)
Reading https

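The same command can be assembled from Python, which helps when generating graphs for several root entities in a loop. A minimal sketch: `build_llmgraph_command` is a hypothetical helper (not part of llmgraph) that uses only the flags shown in the command above, and executing the result assumes the `llmgraph` executable is on the PATH.

```python
import shlex


def build_llmgraph_command(entity_type, entity_wikipedia, levels=3,
                           llm_model="gpt-3.5-turbo", llm_temp=0.0):
    """Build the argument list for the llmgraph CLI call shown above.

    Hypothetical helper for illustration; not part of llmgraph.
    """
    return [
        "llmgraph", entity_type, entity_wikipedia,
        "--levels", str(levels),
        "--llm-model", llm_model,
        "--llm-temp", str(llm_temp),
        "--no-allow-user-input",
    ]


cmd = build_llmgraph_command(
    "concepts-general", "https://en.wikipedia.org/wiki/Large_language_model"
)
# Print the shell-quoted form for inspection.
print(shlex.join(cmd))
# To execute it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with Wikipedia URLs that contain parentheses.
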
## Locate the output files

In [7]:
# Collect the generated HTML and GraphML files from the _output folder
html_files = []
graphml_files = []
for root, dirs, files in os.walk("_output"):
    # Only look in leaf directories (those without subdirectories)
    if not dirs:
        html_files.extend([str(Path(root) / f) for f in files if f.endswith("fully_connected.html")])
        graphml_files.extend([str(Path(root) / f) for f in files if f.endswith(".graphml")])
# Sort the paths and use the last one of each kind
html_files = sorted(html_files)
graphml_files = sorted(graphml_files)
html_file = html_files[-1]
graphml_file = graphml_files[-1]

print(html_file)
print(graphml_file)

_output/concepts-general/large-language-model/concepts-general_large-language-model_v1.2.1_level3_fully_connected.html
_output/concepts-general/large-language-model/concepts-general_large-language-model_v1.2.1_level3.graphml
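
The directory walk can also be written with `pathlib`. The sketch below is a variation, not the notebook's method: it picks the most recently modified match instead of the last path in sorted order, and it returns `None` rather than raising when nothing has been generated yet. `latest_file` is a hypothetical helper name.

```python
from pathlib import Path


def latest_file(output_dir, pattern):
    """Return the most recently modified file under `output_dir` matching
    the glob `pattern`, or None when nothing matches.

    Hypothetical helper for illustration; not part of llmgraph.
    """
    root = Path(output_dir)
    if not root.is_dir():
        return None
    # rglob searches the folder and all of its subfolders.
    candidates = [p for p in root.rglob(pattern) if p.is_file()]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


html_path = latest_file("_output", "*fully_connected.html")
graphml_path = latest_file("_output", "*.graphml")
print(html_path)
print(graphml_path)
```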


In [8]:
# Uncomment these lines to download the HTML file (or find it in the file tree on the left)
# from google.colab import files
# files.download(html_file)

## Display the network

In [9]:
import networkx as nx
import matplotlib.pyplot as plt
from pyvis.network import Network

In [10]:
# Load graphml file
G = nx.read_graphml(graphml_file)
# G = nx.read_graphml("_output/concepts-general/large-language-model/concepts-general_large-language-model_v1.2.1_level3.graphml")

# Create pyvis network for displaying
nt = Network(height="800px", width="100%", directed=True, cdn_resources="remote", notebook=True)
nt.from_nx(G)
nt.force_atlas_2based(
    spring_strength=0.03
)
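
Before rendering, it can be useful to check what the GraphML file contains. GraphML is XML, so a quick node and edge count needs only the standard library. A minimal sketch: `summarize_graphml` is a hypothetical helper, and the two-node sample document is made up for illustration rather than taken from llmgraph output.

```python
import io
import xml.etree.ElementTree as ET

# XML namespace used by GraphML documents.
GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}


def summarize_graphml(source):
    """Count nodes and edges in a GraphML document (path or binary file object).

    Hypothetical helper for illustration; not part of llmgraph.
    """
    root = ET.parse(source).getroot()
    nodes = root.findall(".//g:node", GRAPHML_NS)
    edges = root.findall(".//g:edge", GRAPHML_NS)
    return {"nodes": len(nodes), "edges": len(edges)}


# Made-up sample document, used only to exercise the helper.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph edgedefault="directed">
    <node id="Large language model"/>
    <node id="Transformer"/>
    <edge source="Large language model" target="Transformer"/>
  </graph>
</graphml>
"""

summary = summarize_graphml(io.BytesIO(SAMPLE.encode("utf-8")))
print(summary)
```

The same function accepts a file path, so `summarize_graphml(graphml_file)` would report the size of the generated graph.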

In [11]:
# Display pyvis network
nt.save_graph("llmgraph.html")
IPython.display.HTML(filename="llmgraph.html")