# ArangoDB + LangChain

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/arangodb/interactive_tutorials/blob/master/notebooks/Langchain.ipynb)

Large language models (LLMs) are emerging as a transformative technology, enabling developers to build applications that were previously out of reach. However, an LLM used in isolation is often insufficient for a truly powerful app; the real power comes from combining it with other sources of computation or knowledge.

[LangChain](https://www.langchain.com/) is a framework for developing applications powered by language models. It enables applications that are:
- Data-aware: connect a language model to other sources of data
- Agentic: allow a language model to interact with its environment

On July 25, 2023, ArangoDB introduced the first release of the [ArangoGraphQAChain](https://langchain-langchain.vercel.app/docs/integrations/providers/arangodb) to the LangChain community, allowing you to leverage LLMs to provide a natural language interface for your ArangoDB data.

**Please note**: This notebook uses the LangChain `ChatOpenAI` wrapper, which requires you to have a **paid** [OpenAI API Key](https://help.openai.com/en/articles/4936850-where-do-i-find-my-secret-api-key). However, other Chat Models are available as well: https://github.com/langchain-ai/langchain/tree/master/libs/langchain/langchain/chat_models

## Setup

In [None]:
%%capture

# 1: Install the dependencies

!pip install python-arango # The ArangoDB Python Driver
!pip install adb-cloud-connector # The ArangoDB Cloud Instance provisioner
!pip install openai==0.28.1
!pip install langchain==0.0.314

In [None]:
# 2: Provision a temporary ArangoDB Cloud instance

from adb_cloud_connector import get_temp_credentials

connection = get_temp_credentials(tutorialName="LangChain")

connection

In [None]:
# 3: Instantiate the ArangoDB-Python driver

from arango import ArangoClient

client = ArangoClient(hosts=connection["url"])

db = client.db(
    connection["dbName"],
    connection["username"],
    connection["password"],
    verify=True
)

db

In [None]:
# 4: Load sample data
# We'll be relying on our Game Of Thrones dataset, representing the parent-child
# relationships of certain characters from the GoT universe

edge_definitions = [
    {
        "edge_collection": "ChildOf",
        "from_vertex_collections": ["Characters"],
        "to_vertex_collections": ["Characters"],
    }
]

documents = [
    # Starks (8)
    {"_key": "RickardStark", "name": "Rickard", "surname": "Stark", "alive": False, "age": 60, "gender": "male"},
    {"_key": "LyarraStark", "name": "Lyarra", "surname": "Stark", "alive": False, "age": 60, "gender": "female"},
    {"_key": "NedStark", "name": "Ned", "surname": "Stark", "alive": True, "age": 41, "gender": "male"},
    {"_key": "CatelynStark", "name": "Catelyn", "surname": "Stark", "alive": False, "age": 40, "gender": "female"},
    {"_key": "AryaStark", "name": "Arya", "surname": "Stark", "alive": True, "age": 11, "gender": "female"},
    {"_key": "BranStark", "name": "Bran", "surname": "Stark", "alive": True, "age": 10, "gender": "male"},
    {"_key": "RobbStark", "name": "Robb", "surname": "Stark", "alive": False, "age": 16, "gender": "male"},
    {"_key": "SansaStark", "name": "Sansa", "surname": "Stark", "alive": True, "age": 13, "gender": "female"},

    # Lannisters (4)
    {"_key": "TywinLannister", "name": "Tywin", "surname": "Lannister", "alive": False, "age": 67, "gender": "male"},
    {"_key": "JaimeLannister", "name": "Jaime", "surname": "Lannister", "alive": True, "age": 36, "gender": "male"},
    {"_key": "CerseiLannister", "name": "Cersei", "surname": "Lannister", "alive": True, "age": 36, "gender": "female"},
    {"_key": "TyrionLannister", "name": "Tyrion", "surname": "Lannister", "alive": True, "age": 32, "gender": "male"},

    # Baratheons (1)
    {"_key": "JoffreyBaratheon", "name": "Joffrey", "surname": "Baratheon", "alive": False, "age": 19, "gender": "male"},
]

# Each ChildOf edge points from a child (_from) to one of its parents (_to)
edges = [
    {"_to": "Characters/NedStark", "_from": "Characters/BranStark"},
    {"_to": "Characters/NedStark", "_from": "Characters/RobbStark"},
    {"_to": "Characters/NedStark", "_from": "Characters/SansaStark"},
    {"_to": "Characters/NedStark", "_from": "Characters/AryaStark"},
    {"_to": "Characters/CatelynStark", "_from": "Characters/AryaStark"},
    {"_to": "Characters/CatelynStark", "_from": "Characters/BranStark"},
    {"_to": "Characters/CatelynStark", "_from": "Characters/RobbStark"},
    {"_to": "Characters/CatelynStark", "_from": "Characters/SansaStark"},
    {"_to": "Characters/RickardStark", "_from": "Characters/NedStark"},
    {"_to": "Characters/LyarraStark", "_from": "Characters/NedStark"},

    {"_to": "Characters/TywinLannister", "_from": "Characters/JaimeLannister"},
    {"_to": "Characters/TywinLannister", "_from": "Characters/CerseiLannister"},
    {"_to": "Characters/TywinLannister", "_from": "Characters/TyrionLannister"},
    {"_to": "Characters/CerseiLannister", "_from": "Characters/JoffreyBaratheon"},
    {"_to": "Characters/JaimeLannister", "_from": "Characters/JoffreyBaratheon"},
]

# Drop any previous copy of the graph (and its collections) so this cell can be re-run
db.delete_graph("GameOfThrones", ignore_missing=True, drop_collections=True)
db.create_graph("GameOfThrones", edge_definitions)
db.collection("Characters").import_bulk(documents)
db.collection("ChildOf").import_bulk(edges)
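
Edges refer to their endpoints by document ID in the form `collection/_key`, and a bulk import into an edge collection may not reject an edge whose endpoint does not exist. The following is a minimal sketch of a local check you could run before importing; the helper name `find_dangling_edges` is hypothetical, and the two lists are a small excerpt of the data above rather than the full dataset.

```python
# Small excerpt of the notebook's data, reduced to the fields the check needs
documents = [
    {"_key": "NedStark"},
    {"_key": "BranStark"},
    {"_key": "RickardStark"},
]
edges = [
    {"_from": "Characters/BranStark", "_to": "Characters/NedStark"},
    {"_from": "Characters/NedStark", "_to": "Characters/RickardStark"},
]


def find_dangling_edges(documents, edges, collection="Characters"):
    """Return the edges whose _from or _to does not match a document _key."""
    known_ids = {f"{collection}/{doc['_key']}" for doc in documents}
    return [
        edge
        for edge in edges
        if edge["_from"] not in known_ids or edge["_to"] not in known_ids
    ]


dangling = find_dangling_edges(documents, edges)
print(dangling)  # an empty list means every endpoint exists
```

In the notebook itself you would pass the full `documents` and `edges` lists defined in this cell.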

In [None]:
# 5: Instantiate the ArangoDB-LangChain Graph wrapper

from langchain.graphs import ArangoGraph

graph = ArangoGraph(db)

graph.schema

In [None]:
# 6: Set your OpenAI API Key
# https://help.openai.com/en/articles/4936850-where-do-i-find-my-secret-api-key

import os

os.environ["OPENAI_API_KEY"] = "sk-..."
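
If you would rather not paste the key into the notebook, one option is to set `OPENAI_API_KEY` outside the notebook and only verify it here. This is a sketch under that assumption; `load_openai_key` is a hypothetical helper, not part of LangChain or the OpenAI client.

```python
import os


def load_openai_key(env=os.environ):
    """Return the OpenAI key from the environment, or raise a clear error."""
    key = env.get("OPENAI_API_KEY", "").strip()
    # Treat the tutorial's "sk-..." placeholder the same as a missing key
    if not key or key == "sk-...":
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running the notebook."
        )
    return key
```

Calling `load_openai_key()` early makes a missing key fail at setup time instead of during the first prompt.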

In [None]:
# 7: Instantiate the OpenAI Chat model
# Note that other models can be used as well
# Ref: https://github.com/langchain-ai/langchain/tree/master/libs/langchain/langchain/chat_models

from langchain.chat_models import ChatOpenAI

model = ChatOpenAI(temperature=0, model_name='gpt-4')

In [None]:
# 8: Instantiate the LangChain Question-Answering Chain with
# our **model** and **graph**

from langchain.chains import ArangoGraphQAChain

chain = ArangoGraphQAChain.from_llm(model, graph=graph, verbose=True)

## Prompting

In [None]:
chain.run("Who are the 2 youngest characters?")

In [None]:
chain.run("How are Bran Stark and Arya Stark related?")

In [None]:
chain.run("Who are Bran Stark’s grandparents?")

In [None]:
chain.run("Fetch me the character count for each family")

In [None]:
chain.run("What is the age difference between Rickard Stark and Arya Stark?")

In [None]:
chain.run("Wie alt ist Rickard Stark?") # (German: "How old is Rickard Stark?")

In [None]:
chain.run("What is the average age within the Stark family?")
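
Because the answers above are generated by an LLM, it can help to compute a few of them directly from the sample data and compare. The sketch below does this in plain Python for three of the prompts, using the 13 characters as loaded in the setup step (before any of the later prompts modify the data). The tuple layout is chosen here for brevity and is not part of the notebook.

```python
from collections import Counter

# (name, surname, age, alive) for the 13 characters loaded in the setup step
characters = [
    ("Rickard", "Stark", 60, False), ("Lyarra", "Stark", 60, False),
    ("Ned", "Stark", 41, True), ("Catelyn", "Stark", 40, False),
    ("Arya", "Stark", 11, True), ("Bran", "Stark", 10, True),
    ("Robb", "Stark", 16, False), ("Sansa", "Stark", 13, True),
    ("Tywin", "Lannister", 67, False), ("Jaime", "Lannister", 36, True),
    ("Cersei", "Lannister", 36, True), ("Tyrion", "Lannister", 32, True),
    ("Joffrey", "Baratheon", 19, False),
]

# "Who are the 2 youngest characters?"
youngest_two = [name for name, _, _, _ in sorted(characters, key=lambda c: c[2])[:2]]

# "Fetch me the character count for each family"
family_counts = Counter(surname for _, surname, _, _ in characters)

# "What is the average age within the Stark family?"
stark_ages = [age for _, surname, age, _ in characters if surname == "Stark"]
average_stark_age = sum(stark_ages) / len(stark_ages)

print(youngest_two)         # ['Bran', 'Arya']
print(dict(family_counts))  # {'Stark': 8, 'Lannister': 4, 'Baratheon': 1}
print(average_stark_age)    # 31.375
```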

In [None]:
chain.run("Does Bran Stark have a dead parent?")

In [None]:
chain.run("Who are Catelyn Stark's children?")

In [None]:
chain.run("Add Jon Snow, 31, a male character")

In [None]:
chain.run("Create a ChildOf edge from Jon Snow to Ned Stark")

In [None]:
chain.run("Who is related to Ned Stark?")

In [None]:
chain.run("What can you tell me about the characters?")

In [None]:
chain.run("What is the shortest path from Bran Stark to Rickard Stark?")

In [None]:
chain.run("What is the family tree of Joffrey Baratheon?")

In [None]:
chain.run("What is the relationship between Bran Stark and Rickard Stark?")

In [None]:
chain.run("Are Arya Stark and Ned Stark related?")

In [None]:
chain.run("Is Ned Stark alive?")

In [None]:
chain.run("Ned Stark has died. Update the data")

In [None]:
chain.run("How many characters are alive? How many characters are dead?")

In [None]:
chain.run("Is Arya Stark an orphan?")

## Prompt Modifiers

You can alter the values of the following `ArangoGraphQAChain` class variables to modify the behaviour of your chain results:


In [None]:
# Notice that the following prompt returns nothing:
chain.run("Who are the grandchildren of Rickard Stark?")

In [None]:
# A simple reminder to use INBOUND (instead of OUTBOUND) returns the correct result:
chain.run("Who are the grandchildren of Rickard Stark? Remember to use INBOUND")
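
The direction matters because every `ChildOf` edge in this dataset runs from a child (`_from`) to a parent (`_to`). Finding grandchildren therefore means walking edges against their direction, which is what INBOUND does. The following plain-Python sketch mimics a fixed-depth traversal over a subset of the edges to show the difference. It is an illustration of the idea only, not how ArangoDB executes AQL traversals, and the `traverse` helper is hypothetical.

```python
# ChildOf edges as (child, parent) pairs, i.e. _from -> _to
child_of = [
    ("BranStark", "NedStark"), ("RobbStark", "NedStark"),
    ("SansaStark", "NedStark"), ("AryaStark", "NedStark"),
    ("NedStark", "RickardStark"), ("NedStark", "LyarraStark"),
]


def traverse(start, depth, direction):
    """Follow ChildOf edges exactly `depth` hops from `start`.

    OUTBOUND walks _from -> _to (child to parent);
    INBOUND walks _to -> _from (parent to child).
    """
    frontier = {start}
    for _ in range(depth):
        if direction == "OUTBOUND":
            frontier = {parent for child, parent in child_of if child in frontier}
        else:
            frontier = {child for child, parent in child_of if parent in frontier}
    return sorted(frontier)


print(traverse("RickardStark", 2, "OUTBOUND"))  # []
print(traverse("RickardStark", 2, "INBOUND"))   # ['AryaStark', 'BranStark', 'RobbStark', 'SansaStark']
```

Rickard has no outgoing `ChildOf` edges in this subset, so the OUTBOUND walk is empty after the first hop, which matches the empty answer to the first prompt.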

In [None]:
# We can solidify this pattern by making use of **chain.aql_examples**

# The AQL Examples modifier instructs the LLM to adapt its AQL-completion style
# to the user’s examples. These examples are passed to the AQL Generation Prompt
# Template to promote few-shot learning.

chain.aql_examples = """
# Who are the grandchildren of Rickard Stark?
WITH Characters, ChildOf
FOR v, e IN 2..2 INBOUND 'Characters/RickardStark' ChildOf
  RETURN v

# Is Ned Stark alive?
RETURN DOCUMENT('Characters/NedStark').alive
"""

# Note how we are no longer specifying the use of INBOUND
chain.run("Who are the grandchildren of Tywin Lannister?")
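
As the number of examples grows, you may prefer to keep them as (question, AQL) pairs and render the string from those. The sketch below reproduces the `# question` plus query layout used above; `format_aql_examples` is a hypothetical helper, and the layout is simply the one this notebook uses, not a format the chain is known to require.

```python
def format_aql_examples(pairs):
    """Render (question, AQL) pairs in the '# question' + query layout."""
    blocks = [f"# {question}\n{aql.strip()}" for question, aql in pairs]
    return "\n" + "\n\n".join(blocks) + "\n"


examples = format_aql_examples([
    (
        "Who are the grandchildren of Rickard Stark?",
        "WITH Characters, ChildOf\n"
        "FOR v, e IN 2..2 INBOUND 'Characters/RickardStark' ChildOf\n"
        "  RETURN v",
    ),
    ("Is Ned Stark alive?", "RETURN DOCUMENT('Characters/NedStark').alive"),
])

print(examples)
# In the notebook: chain.aql_examples = examples
```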

In [None]:
# Other modifiers include:

# Specify the maximum number of AQL Query Results to return
chain.top_k = 5

# Specify the maximum number of AQL Generation attempts that should be made
# before returning an error
chain.max_aql_generation_attempts = 5

# Specify whether or not to return the AQL Query in the output dictionary
# Use `chain("...")` instead of `chain.run("...")` to see this change
chain.return_aql_query = True

# Specify whether or not to return the AQL JSON Result in the output dictionary
# Use `chain("...")` instead of `chain.run("...")` to see this change
chain.return_aql_result = True
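
With the last two modifiers enabled, calling `chain("...")` returns a dictionary rather than a plain string. The sketch below shows one way to print such a dictionary. The key names `result`, `aql_query`, and `aql_result` are assumptions about the output layout, and `sample_output` is a hand-written stand-in, not real chain output.

```python
def describe_output(output):
    """Summarize a chain output dictionary.

    The key names "result", "aql_query", and "aql_result" are assumptions
    about the dictionary layout; adjust them to what your chain returns.
    """
    lines = [f"Answer: {output.get('result')}"]
    if "aql_query" in output:
        lines.append(f"AQL query: {output['aql_query']}")
    if "aql_result" in output:
        lines.append(f"AQL result: {output['aql_result']}")
    return "\n".join(lines)


# Hand-written stand-in for what `chain("Is Ned Stark alive?")` might return
sample_output = {
    "result": "Yes, Ned Stark is alive.",
    "aql_query": "RETURN DOCUMENT('Characters/NedStark').alive",
    "aql_result": [True],
}

print(describe_output(sample_output))
```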

## Prompting via Gradio

In [None]:
%%capture
!pip install gradio

In [None]:
import gradio as gr

demo = gr.Interface(chain.run, inputs=gr.Textbox(label="Chat"), outputs=gr.Textbox(label="Output"))

demo.launch(share=True)
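
A generated AQL query can fail, and an unhandled exception would surface as an error in the interface. One option is to wrap the callable before handing it to `gr.Interface`. The sketch below shows the idea with a stand-in function; `make_safe_handler` and `fake_run` are hypothetical names, and in the notebook you would pass `chain.run` instead of `fake_run`.

```python
def make_safe_handler(run):
    """Wrap a question-answering callable so failures become readable text."""
    def handler(question):
        question = (question or "").strip()
        if not question:
            return "Please enter a question."
        try:
            return run(question)
        except Exception as exc:  # show the failure instead of crashing the UI
            return f"Could not answer the question: {exc}"
    return handler


# Stand-in callable for demonstration; in the notebook you would pass chain.run
def fake_run(question):
    if "fail" in question:
        raise ValueError("AQL generation failed")
    return f"echo: {question}"


handler = make_safe_handler(fake_run)
print(handler("Is Ned Stark alive?"))  # echo: Is Ned Stark alive?
print(handler("please fail"))          # Could not answer the question: AQL generation failed
```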