Merged
35 commits
0f011b6
add view and analyzer schema to generate_schema()
anyxling Jul 22, 2025
0bb8eaf
Simplify analyzer schema to include only names
anyxling Jul 22, 2025
feeb4f3
distinguish arangosearch vs searchalias in view schema
anyxling Jul 23, 2025
5806621
add integration test for view and analyzer schema
anyxling Jul 23, 2025
5dd1248
updated unit test for view and analyzer schema
anyxling Jul 23, 2025
cafaf89
add view and analyzer schema
anyxling Jul 23, 2025
a350364
add view and analyzer schema assertions
anyxling Jul 23, 2025
49de59a
fix format
anyxling Jul 23, 2025
55cd3ac
fix type checking error
anyxling Jul 23, 2025
1ffd9bb
fix format
anyxling Jul 23, 2025
46bfa08
fix errors to pass lint and tests
anyxling Jul 23, 2025
8c16223
fix format
anyxling Jul 23, 2025
c1c0039
add more instructions
anyxling Jul 24, 2025
8ce2c00
sync test with generate_schema()
anyxling Jul 24, 2025
dab7c29
remove view and analyzer schema
anyxling Jul 25, 2025
5cf06a1
update to sync with generate_schema()
anyxling Jul 25, 2025
f096203
upgrade ruff
anyxling Jul 25, 2025
88d378b
update the method to retrieve analyzers
anyxling Jul 25, 2025
7be4575
sync with pyproject.toml
anyxling Jul 25, 2025
2ede5a4
add documentation
anyxling Jul 25, 2025
92def3c
run poetry lock
anyxling Jul 25, 2025
bcb80d2
fix format
anyxling Jul 25, 2025
dc00685
remove install_poetry.py from pr
anyxling Jul 28, 2025
10e6fe2
remove type ignore for imports
anyxling Jul 28, 2025
3f75c28
reword documentation
anyxling Jul 28, 2025
d3c5fc1
update analyzer schema in prompt
anyxling Jul 28, 2025
bd4b356
update analyzer schema
anyxling Jul 28, 2025
9b63b8b
Merge branch 'monika' of https://github.com/arangoml/langchain-arango…
anyxling Jul 28, 2025
7c6c882
update analyzer schema in integration test
anyxling Jul 28, 2025
b8f71a2
update analyzer schema in integration test and fix lint err
anyxling Jul 28, 2025
d54be6d
add view and analyzer schema
anyxling Jul 28, 2025
e8823ea
update analyzer schema in unit test
anyxling Jul 28, 2025
53f2f50
remove temp.py from pr
anyxling Jul 28, 2025
e504b49
remove build.log
anyxling Jul 29, 2025
26f199e
update analyzer schema in prompts
anyxling Jul 29, 2025
6 changes: 6 additions & 0 deletions libs/arangodb/langchain_arangodb/chains/graph_qa/prompts.py
@@ -8,13 +8,17 @@
You are given an `ArangoDB Schema`. It is a YAML Spec containing:
1. `Graph Schema`: Lists all Graphs within the ArangoDB Database Instance, along with their Edge Relationships.
2. `Collection Schema`: Lists all Collections within the ArangoDB Database Instance, along with their document/edge properties and a document/edge example.
3. `View Schema`: Lists all Views within the ArangoDB Database Instance, along with their linked collections and analyzers.
4. `Analyzer Schema`: Lists all custom-built Analyzers within the ArangoDB Database Instance, along with their properties and features. Does not mention the default ArangoDB analyzers (e.g., text_en, text_fr).

You may also be given a set of `AQL Query Examples` to help you create the `AQL Query`. If provided, the `AQL Query Examples` should be used as a reference, similar to how `ArangoDB Schema` should be used.

Things you should do:
- Think step by step.
- Rely on `ArangoDB Schema` and `AQL Query Examples` (if provided) to generate the query.
- Begin the `AQL Query` with the `WITH` AQL keyword to specify all of the ArangoDB Collections required.
- If a `View Schema` is defined and contains analyzers for specific fields, prefer using the View with the `SEARCH` and `ANALYZER` clauses instead of a direct collection scan.
- Use `PHRASE(...)`, `TOKENS(...)`, or `IN TOKENS(...)` as appropriate when analyzers are available on a field.
- Return the `AQL Query` wrapped in 3 backticks (```).
- Use only the provided relationship types and properties in the `ArangoDB Schema` and any `AQL Query Examples` queries.
- Only answer to requests related to generating an AQL Query.
@@ -56,6 +60,8 @@
You are also given the `ArangoDB Schema`. It is a YAML Spec containing:
1. `Graph Schema`: Lists all Graphs within the ArangoDB Database Instance, along with their Edge Relationships.
2. `Collection Schema`: Lists all Collections within the ArangoDB Database Instance, along with their document/edge properties and a document/edge example.
3. `View Schema`: Lists all Views within the ArangoDB Database Instance, along with their linked collections and analyzers.
4. `Analyzer Schema`: Lists all custom-built Analyzers within the ArangoDB Database Instance, along with their properties and features. Does not mention the default ArangoDB analyzers (e.g., text_en, text_fr).

You will output the `Corrected AQL Query` wrapped in 3 backticks (```). Do not include any text except the Corrected AQL Query.

Expand Down
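Reviewer sketch, not part of the diff: the two new prompt rules ask the model to prefer a View with `SEARCH` and `ANALYZER` over a collection scan when the `View Schema` lists analyzers for a field. The snippet below shows the kind of AQL such a prompt might yield. The helper name, the view `movie_view`, the field `description`, and the choice of `text_en` are all assumptions made for illustration; none of them appear in the PR.

```python
from typing import Any, Dict, Tuple


def build_phrase_search_query(
    view_name: str, field: str, analyzer: str, phrase: str, limit: int = 5
) -> Tuple[str, Dict[str, Any]]:
    """Return an AQL string and bind variables for a phrase search on a View.

    Hypothetical helper: the view, field, and analyzer names are interpolated
    into the query text, so they must come from a trusted source such as the
    generated schema. Only the user-supplied phrase and limit are bound.
    """
    query = (
        f"FOR doc IN {view_name}\n"
        f'  SEARCH ANALYZER(PHRASE(doc.{field}, @phrase), "{analyzer}")\n'
        "  SORT BM25(doc) DESC\n"
        "  LIMIT @limit\n"
        "  RETURN doc"
    )
    return query, {"phrase": phrase, "limit": limit}


# Example with made-up names; the query is only built here, not executed.
aql, bind_vars = build_phrase_search_query(
    "movie_view", "description", "text_en", "space opera"
)
print(aql)
print(bind_vars)
```

With python-arango, a query like this would typically be run through `db.aql.execute(aql, bind_vars=bind_vars)`, assuming the view and its analyzer links already exist in the database.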
70 changes: 68 additions & 2 deletions libs/arangodb/langchain_arangodb/graphs/arangodb_graph.py
@@ -20,6 +20,22 @@
)
from langchain_arangodb.graphs.graph_store import GraphStore

DEFAULT_ANALYZERS = {
    "text_fr",
    "text_pt",
    "identity",
    "text_de",
    "text_zh",
    "text_fi",
    "text_it",
    "text_no",
    "text_nl",
    "text_es",
    "text_ru",
    "text_en",
    "text_sv",
}


def get_arangodb_client(
url: Optional[str] = None,
@@ -85,7 +101,9 @@ class ArangoGraph(GraphStore):
in a string. If the string is longer than this limit, a string
describing the string will be used in the schema instead. Default is 256.
:type schema_string_limit: int

:param schema_include_views: Whether to include ArangoDB Views and Analyzers as
part of the schema passed to the AQL Generation prompt. Default is False.
:type schema_include_views: bool
:return: None
:rtype: None
:raises ArangoClientError: If the ArangoDB client cannot be created.
@@ -116,6 +134,7 @@ def __init__(
schema_include_examples: bool = True,
schema_list_limit: int = 32,
schema_string_limit: int = 256,
schema_include_views: bool = False,
) -> None:
"""
Initializes the ArangoGraph instance.
@@ -132,6 +151,7 @@ def __init__(
schema_include_examples,
schema_list_limit,
schema_string_limit,
schema_include_views,
)

@property
@@ -224,6 +244,7 @@ def generate_schema(
include_examples: bool = True,
list_limit: int = 32,
schema_string_limit: int = 256,
schema_include_views: bool = False,
) -> Dict[str, List[Dict[str, Any]]]:
"""
Generates the schema of the ArangoDB Database and returns it
@@ -247,6 +268,9 @@ def generate_schema(
in a string. If the string is longer than this limit, a string
describing the string will be used in the schema instead. Default is 256.
:type schema_string_limit: int
:param schema_include_views: Whether to include ArangoDB Views and Analyzers as
part of the schema passed to the AQL Generation prompt. Default is False.
:type schema_include_views: bool
:return: A dictionary containing the graph schema and collection schema, plus the view schema and analyzer schema when schema_include_views is True.
:rtype: Dict[str, List[Dict[str, Any]]]
:raises ValueError: If the sample ratio is not between 0 and 1.
@@ -257,6 +281,10 @@ def generate_schema(
if not 0 <= sample_ratio <= 1:
raise ValueError("**sample_ratio** value must be between 0 and 1")

#####
# Step 1: Generate Graph Schema
#####

graph_schema: List[Dict[str, Any]] = []
if graph_name:
# Fetch a single graph
@@ -283,6 +311,10 @@ def generate_schema(
for collection in self.db.collections() # type: ignore
}

#####
# Step 2: Generate Collection Schema
#####

# Stores the schema of every ArangoDB Document/Edge collection
collection_schema: List[Dict[str, Any]] = []
for collection in self.db.collections(): # type: ignore
@@ -324,7 +356,41 @@ def generate_schema(

collection_schema.append(collection_schema_entry)

return {"graph_schema": graph_schema, "collection_schema": collection_schema}
        if not schema_include_views:
            return {
                "graph_schema": graph_schema,
                "collection_schema": collection_schema,
            }

        #####
        # Step 3: Generate View Schema
        #####

        view_schema: List[Dict[str, Any]] = []
        for view in self.db.views():  # type: ignore
            view_name = view["name"]
            view_type = view["type"]
            view_info = self.db.view(view_name)
            key = "links" if view_type == "arangosearch" else "indexes"
            view_schema.append(
                {"name": view_name, "type": view_type, key: view_info.get(key, [])}  # type: ignore
            )

        #####
        # Step 4: Generate Analyzer Schema
        #####

        analyzer_schema: List[Dict[str, Any]] = []
        for a in self.db.analyzers():  # type: ignore
            if a["name"] not in DEFAULT_ANALYZERS:
                analyzer_schema.append({a["name"]: a["properties"]})

        return {
            "graph_schema": graph_schema,
            "collection_schema": collection_schema,
            "view_schema": view_schema,
            "analyzer_schema": analyzer_schema,
        }

def query(
self,
Expand Down
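Reviewer sketch, not part of the diff: to make the shape of the two new schema keys concrete, the snippet below replays the Step 3 and Step 4 logic against an in-memory stand-in for the database handle. `FakeDatabase`, its sample views and analyzers, and the abbreviated `DEFAULT_ANALYZERS` set are assumptions made for illustration. Only the `views()`, `view(name)`, and `analyzers()` calls and the dictionary keys they return mirror what the diff uses.

```python
from typing import Any, Dict, List

# Abbreviated copy of the constant added in the diff (the full set has 13 names).
DEFAULT_ANALYZERS = {"identity", "text_en", "text_fr"}


class FakeDatabase:
    """Hypothetical stand-in for the database handle; all data is made up."""

    _views = {
        "movie_view": {
            "type": "arangosearch",
            "links": {"Movies": {"fields": {"title": {"analyzers": ["text_en"]}}}},
        },
        "movie_alias": {
            "type": "search-alias",
            "indexes": [{"collection": "Movies", "index": "inv_title"}],
        },
    }

    def views(self) -> List[Dict[str, Any]]:
        return [{"name": n, "type": v["type"]} for n, v in self._views.items()]

    def view(self, name: str) -> Dict[str, Any]:
        return self._views[name]

    def analyzers(self) -> List[Dict[str, Any]]:
        return [
            {"name": "text_en", "properties": {"locale": "en"}},
            {"name": "custom_ngram", "properties": {"min": 2, "max": 3}},
        ]


def build_view_and_analyzer_schema(db: Any) -> Dict[str, List[Dict[str, Any]]]:
    """Mirror Steps 3 and 4 of the diff against any object with the same methods."""
    view_schema: List[Dict[str, Any]] = []
    for view in db.views():
        # "arangosearch" views expose "links"; other views expose "indexes".
        key = "links" if view["type"] == "arangosearch" else "indexes"
        info = db.view(view["name"])
        view_schema.append(
            {"name": view["name"], "type": view["type"], key: info.get(key, [])}
        )

    # Built-in analyzers are skipped so that only custom ones are listed.
    analyzer_schema = [
        {a["name"]: a["properties"]}
        for a in db.analyzers()
        if a["name"] not in DEFAULT_ANALYZERS
    ]
    return {"view_schema": view_schema, "analyzer_schema": analyzer_schema}


schema = build_view_and_analyzer_schema(FakeDatabase())
print(schema["view_schema"])
print(schema["analyzer_schema"])
```

Judging from the diff, passing `schema_include_views=True` to `ArangoGraph(...)` or to `generate_schema(...)` should add these two keys next to `graph_schema` and `collection_schema`, while the default of `False` leaves the returned dictionary unchanged.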