diff --git a/mage/data/query-modules/python/llm-util/memgraph-lab-schema.png b/mage/data/query-modules/python/llm-util/memgraph-lab-schema.png new file mode 100644 index 00000000000..c9346f019de Binary files /dev/null and b/mage/data/query-modules/python/llm-util/memgraph-lab-schema.png differ diff --git a/mage/query-modules/python/llm-util.md b/mage/query-modules/python/llm-util.md new file mode 100644 index 00000000000..d39053f5303 --- /dev/null +++ b/mage/query-modules/python/llm-util.md @@ -0,0 +1,288 @@ +--- +id: llm-util +title: llm_util +sidebar_label: llm_util +--- + +import Tabs from '@theme/Tabs'; +import TabItem from '@theme/TabItem'; +import RunOnSubgraph from '../../templates/_run_on_subgraph.mdx'; + +export const Highlight = ({children, color}) => ( + + {children} + +); + + +[![docs-source](https://img.shields.io/badge/source-llm_util-FB6E00?logo=github&style=for-the-badge)](https://github.com/memgraph/mage/blob/main/python/json_util.py) + +| Trait | Value | +| ------------------- | ----------------------------------------------------- | +| **Module type** | **util** | +| **Implementation** | **Python** | +| **Parallelism** | **sequential** | + +## Procedures + + + +### schema(output_type) + +The `schema()` procedure generates the graph database schema in a **prompt-ready** or **raw** format. The prompt-ready format is optimized to describe the database schema in words best recognized by large language models (LLMs). The raw format offers all the necessary information about the graph schema in a format that can be customized for later use with LLMs. + +#### Input: + +* `output_type: str (default='prompt_ready')` ➡ By default, the graph schema will include additional context and it will be prompt-ready. If set to 'raw', it will produce a simpler version that can be adjusted for the prompt. 
+ +#### Output: + +* `schema: mgp.Any` ➡ `str` containing prompt-ready graph schema description in a format suitable for large language models (LLMs), or `mgp.List` containing information on graph schema in raw format which can be customized for LLMs. + +#### Usage: +Get **prompt-ready graph schema**: +```cypher +CALL llm_util.schema() YIELD schema RETURN schema; +``` +or +```cypher +CALL llm_util.schema('prompt_ready') YIELD schema RETURN schema; +``` + +Get **raw graph schema**: +```cypher +CALL llm_util.schema('raw') YIELD schema RETURN schema; +``` + +:::note +The `output_type` is case-insensitive. +::: + + +## Example - Get prompt-ready graph schema + + + + + Create a graph by running the following Cypher query: + + +```cypher +CREATE (n:Person {name: "Kate", age: 27})-[:IS_FRIENDS_WITH {since: "2023-06-21"}]->(m:Person:Student {name: "James", age: 30, year: "second"})-[:STUDIES_AT]->(:University {name: "University of Zagreb"}) CREATE (p:Person:Student {name: "Anthony", age: 25})-[:STUDIES_AT]->(:University {name: "University of Vienna"}) +WITH n, m +CREATE (n)-[:LIVES_IN]->(:City {name: "Zagreb"})<-[:LIVES_IN]-(m); +``` + + + +The schema of the created graph can be seen in Memgraph Lab, under the Graph Schema tab: + +
+ +
+ +
+ + + +Once the graph is created, run the following code to call the schema procedure: + + +```cypher +CALL llm_util.schema() YIELD schema RETURN schema; +``` + +or + +```cypher +CALL llm_util.schema('prompt_ready') YIELD schema RETURN schema; +``` + + + + + +Below is the result of running the schema procedure: + + +``` +Node properties are the following: +Node name: 'Person', Node properties: [{'property': 'name', 'type': 'str'}, {'property': 'age', 'type': 'int'}, {'property': 'year', 'type': 'str'}] +Node name: 'Student', Node properties: [{'property': 'name', 'type': 'str'}, {'property': 'age', 'type': 'int'}, {'property': 'year', 'type': 'str'}] +Node name: 'University', Node properties: [{'property': 'name', 'type': 'str'}] +Node name: 'City', Node properties: [{'property': 'name', 'type': 'str'}] + +Relationship properties are the following: +Relationship Name: 'IS_FRIENDS_WITH', Relationship Properties: [{'property': 'since', 'type': 'str'}] + +The relationships are the following: +['(:Person)-[:IS_FRIENDS_WITH]->(:Person)'] +['(:Person)-[:IS_FRIENDS_WITH]->(:Student)'] +['(:Person)-[:LIVES_IN]->(:City)'] +['(:Person)-[:STUDIES_AT]->(:University)'] +['(:Student)-[:STUDIES_AT]->(:University)'] +['(:Student)-[:LIVES_IN]->(:City)'] +``` + + + + + +
+ +## Example - Get raw graph schema + + + + + Create a graph by running the following Cypher query: + + +```cypher +CREATE (n:Person {name: "Kate", age: 27})-[:IS_FRIENDS_WITH {since: "2023-06-21"}]->(m:Person:Student {name: "James", age: 30, year: "second"})-[:STUDIES_AT]->(:University {name: "University of Zagreb"}) CREATE (p:Person:Student {name: "Anthony", age: 25})-[:STUDIES_AT]->(:University {name: "University of Vienna"}) +WITH n, m +CREATE (n)-[:LIVES_IN]->(:City {name: "Zagreb"})<-[:LIVES_IN]-(m); +``` + + + +The schema of the created graph can be seen in Memgraph Lab, under the Graph Schema tab: + +
+ +
+ +
+ + + +Once the graph is created, run the following code to call the schema procedure: + + +```cypher +CALL llm_util.schema('raw') YIELD schema RETURN schema; +``` + + + + + +Below is the result of running the schema procedure: + + +``` +{ + "node_props": { + "City": [ + { + "property": "name", + "type": "str" + } + ], + "Person": [ + { + "property": "name", + "type": "str" + }, + { + "property": "age", + "type": "int" + }, + { + "property": "year", + "type": "str" + } + ], + "Student": [ + { + "property": "name", + "type": "str" + }, + { + "property": "age", + "type": "int" + }, + { + "property": "year", + "type": "str" + } + ], + "University": [ + { + "property": "name", + "type": "str" + } + ] + }, + "rel_props": { + "IS_FRIENDS_WITH": [ + { + "property": "since", + "type": "str" + } + ] + }, + "relationships": [ + { + "end": "Person", + "start": "Person", + "type": "IS_FRIENDS_WITH" + }, + { + "end": "Student", + "start": "Person", + "type": "IS_FRIENDS_WITH" + }, + { + "end": "City", + "start": "Person", + "type": "LIVES_IN" + }, + { + "end": "University", + "start": "Person", + "type": "STUDIES_AT" + }, + { + "end": "University", + "start": "Student", + "type": "STUDIES_AT" + }, + { + "end": "City", + "start": "Student", + "type": "LIVES_IN" + } + ] +} +``` + + + + + +
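The raw format is meant to be customized before it is handed to an LLM. As a rough sketch of what that customization could look like, the snippet below takes a Python dictionary shaped like the raw output above (`node_props`, `rel_props`, `relationships`) and renders it as plain text. The helper name `render_schema_prompt` and the `sample_raw` data are hypothetical and exist only for illustration; this is not how `llm_util` builds its prompt-ready output, and it assumes the raw result has already been fetched and parsed on the client side.

```python
from typing import Any, Dict, List


def render_schema_prompt(raw: Dict[str, Any]) -> str:
    """Render a raw schema dict as prompt text.

    Hypothetical helper for illustration only; not part of llm_util.
    Expects the keys shown in the raw example output above.
    """
    lines: List[str] = ["Node properties are the following:"]
    for label, props in raw.get("node_props", {}).items():
        lines.append(f"Node name: '{label}', Node properties: {props}")

    lines.append("")
    lines.append("Relationship properties are the following:")
    for rel_type, props in raw.get("rel_props", {}).items():
        lines.append(
            f"Relationship Name: '{rel_type}', Relationship Properties: {props}"
        )

    lines.append("")
    lines.append("The relationships are the following:")
    for rel in raw.get("relationships", []):
        # Each relationship entry carries start label, type and end label.
        lines.append(f"(:{rel['start']})-[:{rel['type']}]->(:{rel['end']})")

    return "\n".join(lines)


# Trimmed sample data in the same shape as the raw output shown above.
sample_raw = {
    "node_props": {
        "City": [{"property": "name", "type": "str"}],
        "University": [{"property": "name", "type": "str"}],
    },
    "rel_props": {
        "IS_FRIENDS_WITH": [{"property": "since", "type": "str"}],
    },
    "relationships": [
        {"start": "Person", "type": "LIVES_IN", "end": "City"},
        {"start": "Student", "type": "STUDIES_AT", "end": "University"},
    ],
}

prompt_text = render_schema_prompt(sample_raw)
print(prompt_text)
```

Working from the raw structure this way makes it easy to drop labels, reorder sections, or change the wording to suit a particular model, which is harder to do once the schema is already a single string.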
diff --git a/mage/templates/_mage_spells.mdx b/mage/templates/_mage_spells.mdx index e429c89813c..89fa2bddaf4 100644 --- a/mage/templates/_mage_spells.mdx +++ b/mage/templates/_mage_spells.mdx @@ -54,6 +54,7 @@ | [graph_util](/mage/query-modules/cpp/graph-util) | C++ | A module with common graph algorithms and graph manipulation utilities | | [import_util](/mage/query-modules/python/import-util) | Python | A module for importing data from different formats (JSON). | | [json_util](/mage/query-modules/python/json-util) | Python | A module for loading JSON from a local file or remote address. | +| [llm_util](/mage/query-modules/python/llm-util) | Python | A module that contains procedures describing graphs in a format best suited for large language models (LLMs). | | [meta_util](/mage/query-modules/python/meta-util) | Python | A module that contains procedures describing graphs on a meta-level. | | [migrate](/mage/query-modules/python/migrate) | Python | A module that can access data from a MySQL, SQL Server or Oracle database. | | rust_example | Rust | Example of a basic module with input parameters forwarding, made in Rust. | diff --git a/sidebars/sidebarsMAGE.js b/sidebars/sidebarsMAGE.js index 4f2141e4d08..d00540faf7b 100644 --- a/sidebars/sidebarsMAGE.js +++ b/sidebars/sidebarsMAGE.js @@ -46,6 +46,7 @@ module.exports = { "query-modules/cpp/katz-centrality-online", "query-modules/python/kmeans", "query-modules/python/link-prediction-with-gnn", + "query-modules/python/llm-util", "query-modules/python/max-flow", "query-modules/python/meta-util", "query-modules/python/node-classification-with-gnn",