Extensions

Jakob Voß edited this page Aug 5, 2024 · 21 revisions

This page lists possible extensions to the PG format that will probably not be included in version 1.0.0 of the PG specification.

Model extensions

Hierarchies

Some formats (GraphML, GXL, Dot...) support graph hierarchies or subgraphs.

Possible syntax in PG format:

a -> b
b {
  c ->
    a # line folding getting tricky!
}

or

a -> b

>> GRAPH b
c ->
 a
>>

Possible syntax in PG-JSON (part of the graph):

{
  "nodes": [
    { "id": "a", "labels": [], "properties": {} },
    { "id": "b", "labels": [], "properties": {} },
    { "id": "c", "labels": [], "properties": {}, "parents": ["b"] }
  ]
}

Possible syntax in PG-JSONL (part of the graph):

{"type":"node", "id":"c", "labels":[], "properties":{}, "parents":["b"]}

Most formats allow only one parent per node (mono-hierarchy), but at least GEXF supports multi-hierarchy.
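To illustrate how a consumer might read the proposed hierarchy, here is a minimal Python sketch. It assumes the hypothetical "parents" key shown above and treats nodes without it as top-level; the function name children_by_parent is made up for this example and is not part of any PG library.

```python
import json

def children_by_parent(jsonl_text):
    """Map each parent node id to the ids of the nodes listing it in "parents"."""
    children = {}
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("type") != "node":
            continue
        # "parents" is the proposed extension key (an assumption of this sketch);
        # a node without it belongs to no subgraph.
        for parent in record.get("parents", []):
            children.setdefault(parent, []).append(record["id"])
    return children

example = "\n".join([
    '{"type":"node", "id":"a", "labels":[], "properties":{}}',
    '{"type":"node", "id":"b", "labels":[], "properties":{}}',
    '{"type":"node", "id":"c", "labels":[], "properties":{}, "parents":["b"]}',
])
print(children_by_parent(example))  # {'b': ['c']}
```

Because "parents" is a list, the same code would also cover multi-hierarchies as supported by GEXF.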

Graph attributes

Some formats support properties of the graph itself, also known as graph attributes. Possible syntax:

: title: "example graph"
  created: 2014
: "url": http://example.org/

a
b -> c

or

^title: "example graph"
^created: 2024
^"url": http://example.org/

or

>> ABOUT

title: "example graph"
created: 2024
"url": http://example.org/

>> GRAPH

Possible syntax in PG-JSON:

{
  "properties": {
    "title": ["example graph"],
    "created": [2014]
  },
  "nodes": [...],
  "edges": [...]  
}

Possible syntax in PG-JSONL:

{"type":"graph","properties":{"created": [2014],"title": ["example graph"]}}

Discussion: https://github.com/orgs/pg-format/discussions/31
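As a rough sketch of how the PG-JSONL variant could be consumed, the following Python function separates graph-level properties from nodes and edges. The record type "graph" is the proposal from this page, not part of the current specification, and split_graph_record is a hypothetical helper name.

```python
import json

def split_graph_record(jsonl_text):
    """Return (graph_properties, other_records) from PG-JSONL text."""
    graph_properties = {}
    elements = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        if record.get("type") == "graph":
            # Assumption: several graph records are merged, later keys win.
            graph_properties.update(record.get("properties", {}))
        else:
            elements.append(record)
    return graph_properties, elements

text = "\n".join([
    '{"type":"graph","properties":{"created":[2014],"title":["example graph"]}}',
    '{"type":"node","id":"a","labels":[],"properties":{}}',
])
properties, elements = split_graph_record(text)
print(properties["title"])  # ['example graph']
```

How repeated graph records should be combined is an open design question; merging is only one option.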

Data types

Values could have an additional data type:

a :person born:"2000"^year

Possible encoding in PG-JSON(L):

{
  "id": "a",
  "labels": ["person"],
  "properties": {
    "born": [
      { "value": "2000", "type": "year" }
    ]
  }
}
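A reader of this encoding would need to tell plain values from typed ones. The Python sketch below shows one way to do that, assuming typed values are objects with exactly the keys "value" and "type" as in the example above. The converter table and the name decode_value are illustrative assumptions.

```python
def decode_value(value, converters):
    """Convert a typed value object; pass plain values through unchanged."""
    if isinstance(value, dict) and "value" in value and "type" in value:
        convert = converters.get(value["type"])
        # Unknown types fall back to the raw value (an assumption of this sketch).
        return convert(value["value"]) if convert else value["value"]
    return value

node = {
    "id": "a",
    "labels": ["person"],
    "properties": {"born": [{"value": "2000", "type": "year"}]},
}
converters = {"year": int}  # hypothetical mapping from type name to parser
decoded = {
    key: [decode_value(v, converters) for v in values]
    for key, values in node["properties"].items()
}
print(decoded)  # {'born': [2000]}
```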

Hypergraphs

Some formats (GraphML, GXL) support hyperedges (multi-edges) and thus hypergraphs: a hyperedge can connect more than two nodes.

Some possible syntax variants in PG format:

a\b -> c :e key:value      # directed hyperedge from a and b to c
a\b\c :e key:value         # undirected hyperedge connecting a b c
a \ b \ c :e key:value     # same with spaces

Possible syntax in PG-JSON(L):

{
  "connects": [ "a", "b", "c" ], "labels": [ "e" ], "properties": { "key": [ "value" ] }
}
{
  "from": [ "a", "b" ], "to": [ "c" ], "labels": [ "e" ], "properties": { "key": [ "value" ] }
}
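Tools that only understand binary edges could still consume such records by expanding each hyperedge. The Python sketch below is one possible mapping, not a proposed semantics: it assumes an undirected hyperedge becomes one undirected edge per pair of nodes and a directed hyperedge becomes one edge per source-target combination. Note that this expansion loses the information that the edges belonged together.

```python
from itertools import combinations, product

def expand_hyperedge(edge):
    """Expand a hyperedge record into a list of binary edge records."""
    common = {
        "labels": edge.get("labels", []),
        "properties": edge.get("properties", {}),
    }
    if "connects" in edge:
        # Undirected hyperedge: every unordered pair of connected nodes.
        pairs = combinations(edge["connects"], 2)
        return [{"from": a, "to": b, "undirected": True, **common} for a, b in pairs]
    # Directed hyperedge: every source combined with every target.
    pairs = product(edge["from"], edge["to"])
    return [{"from": a, "to": b, **common} for a, b in pairs]

directed = {"from": ["a", "b"], "to": ["c"], "labels": ["e"],
            "properties": {"key": ["value"]}}
for binary in expand_hyperedge(directed):
    print(binary["from"], "->", binary["to"])
```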

Graph schemas

Some formats require or allow schemas (sets of allowed or required properties and labels, cardinalities, ...). A schema could be given with graph attributes or with a custom syntax. Graph schemas are required by several database systems; for example, the Graph Query Language (GQL, ISO/IEC 39075:2024) supports two syntax forms as described here. Possible syntax in PG format for the example given there:

<<
  Person  :Person :TaxPayer name:STRING dob:DATE taxNo:STRING
  City    :City             name:STRING state:STRING country:STRING
  Company :Company          name:STRING description:STRING

  # Edge identifiers are optional but included here
  LivesIn:        Person  -> City :LIVES_IN        since:DATE
  HeadquartersIn: Company -> City :HEADQUARTERS_IN since:DATE
  WorksFor:       Person  -> Company :WORKS_FOR    since:DATE
  MarriedTo:      Person -- Person :MARRIED_TO     since:DATE until:DATE
>>

Schemas should also be able to express key properties, e.g. like this:

<< Person :Person name:STRING id:ID-STRING >>

Alternative syntax:

>> SCHEMA
...

>> GRAPH

See Neo4j CSV header files for another example of a simple schema format.

In PG-JSON, the schema would simply be stored under an additional "schema" key, holding a graph of its own.

PG syntax extensions

Statement separator

Multiple statements in one line:

a | b -> c | d -> e
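A parser supporting this extension would have to split a line at the separator without breaking quoted strings. The following Python sketch assumes that any | outside a single- or double-quoted string separates statements and that a backslash escapes the next character inside quotes. Both are assumptions made for illustration, and split_statements is a hypothetical name.

```python
def split_statements(line, separator="|"):
    """Split one line into statements at separators outside quoted strings."""
    statements, current = [], []
    quote = None      # currently open quote character, if any
    escaped = False   # True right after a backslash inside a quoted string
    for ch in line:
        if escaped:
            current.append(ch)
            escaped = False
        elif quote:
            current.append(ch)
            if ch == "\\":
                escaped = True
            elif ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
            current.append(ch)
        elif ch == separator:
            statements.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    statements.append("".join(current).strip())
    return [s for s in statements if s]

print(split_statements("a | b -> c | d -> e"))  # ['a', 'b -> c', 'd -> e']
```

Comments and unquoted identifiers containing | are not handled here; a real grammar would need to define both cases.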

Inline edge labels

a -:type-> b    # same as:  a -> b :type
a - :type -> b  # ditto
a -:type--> b   # same as:  a -> b- :type

a -:type- b     # same as:  a -- b :type
a -:type-- b    # same as:  a -- b- :type

Paths

Not sure whether this is an addition to the data model or just a syntax shortcut for writing multiple edges with the same labels and properties:

a -> b -> c -> d :follows
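If paths are treated as a pure syntax shortcut, the statement above could be expanded into ordinary edges. The Python sketch below shows that reading. It assumes whitespace-separated tokens, unquoted identifiers, and labels only (no properties), and the name expand_path is hypothetical.

```python
def expand_path(statement):
    """Expand 'a -> b -> c :label' into one edge per consecutive node pair."""
    tokens = statement.split()
    nodes, directions = [tokens[0]], []
    i = 1
    # Consume alternating direction and node tokens.
    while i + 1 < len(tokens) and tokens[i] in ("->", "--"):
        directions.append(tokens[i])
        nodes.append(tokens[i + 1])
        i += 2
    # Remaining tokens starting with ':' are labels shared by all edges.
    labels = [t[1:] for t in tokens[i:] if t.startswith(":")]
    return [
        {"from": a, "to": b, "undirected": d == "--", "labels": labels}
        for a, d, b in zip(nodes, directions, nodes[1:])
    ]

for edge in expand_path("a -> b -> c -> d :follows"):
    print(edge["from"], edge["to"], edge["labels"])
```

Under the other reading, a path would be a first-class element of the data model and this expansion would lose information.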

Reverse direction

This changes only the PG syntax, not the data model. Edges could be written in reverse direction, as in Cypher, although most other graph formats do not allow this:

a -> b
b <- a # equivalent
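Since the data model stays the same, a parser could normalize reversed edges right after tokenizing. A minimal Python sketch, assuming the edge has already been split into source, arrow, and target tokens (normalize_direction is a hypothetical helper):

```python
def normalize_direction(left, arrow, right):
    """Rewrite 'left <- right' as 'right -> left'; leave other edges unchanged."""
    if arrow == "<-":
        return right, "->", left
    return left, arrow, right

print(normalize_direction("b", "<-", "a"))  # ('a', '->', 'b')
```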

Verbatim strings

Delimited with ` and without escape sequences, as defined for code spans in CommonMark.

Queries or templates

An expression in parentheses could be used as an identifier, label, property, or value:

(a:person) -> (b:person) # get all edges between two nodes of label "person" 

Problem: ( and ) are not reserved characters.

Node labels/properties within a path

Just an idea for PG syntax extension:

(x :person) -> (y :person) :friend
x :person -> y :person # is the last :person label of y or of the edge?
x :person -[:edge]-> y :person # alternative form

as a shortcut for

x :person
y :person
x -> y :friend

Problem: ( ) [ ] are not reserved characters.