Python driver: ANTLR4 agtype parser crashes (AttributeError) on vertices with complex properties #2367

@uesleilima

Describe the bug

The ANTLR4-based parseAgeValue() function in builder.py crashes with AttributeError: 'NoneType' object has no attribute 'stop' when deserializing certain vertex records whose properties contain large arrays, long text fields with special characters, or deeply nested structures.

The crash originates in the ANTLR4 visitor (ResultVisitor) when the parse tree contains None nodes that the visitor does not guard against. As a result, the driver cannot parse valid agtype values returned by the AGE engine.

How are you accessing AGE?

Python driver (psycopg3)

Steps to Reproduce

import age
import psycopg

conn = psycopg.connect("host=localhost port=5432 dbname=postgres user=postgres password=postgres",
                        cursor_factory=age.age.ClientCursor)
age.setUpAge(conn, "test_graph")

# Create a vertex with a large tags array and a description with special characters
with conn.cursor() as cur:
    cur.execute("""
        SELECT * FROM cypher('test_graph', $$
            CREATE (n:TestNode {
                name: 'test',
                tags: ['tag1', 'tag2', 'tag3', 'tag4', 'tag5', 'tag6', 'tag7',
                       'tag8', 'tag9', 'tag10', 'tag11', 'tag12'],
                description: 'A long description with special chars: \"quotes\", backslashes \\, newlines, and unicode: àéîõü — em-dash, ellipsis…'
            })
            RETURN n
        $$) AS (n agtype)
    """)
    conn.commit()

# Now try to read it back through the driver's ANTLR parser
cursor = age.execCypher(conn, "test_graph",
                        "MATCH (n:TestNode) WHERE n.name = 'test' RETURN n",
                        cols=["n"])
for row in cursor:  # <-- crashes here
    print(row)

Expected behavior

The vertex should be parsed and returned as a Vertex object without errors.

Actual behavior

AttributeError: 'NoneType' object has no attribute 'stop'

The traceback points to the ANTLR4 visitor/parser internals. The exact property combination that triggers the crash varies, but large arrays and strings with escaped characters are common triggers.
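
Because the triggering property combination varies, it can help to find out exactly which raw agtype strings fail. The helper below is a minimal sketch for that purpose; `find_failing_values` is a hypothetical name, and `parse` stands for any callable that takes the raw text, for example the driver's parseAgeValue from builder.py (assuming it accepts the raw string as its first argument).

```python
from typing import Any, Callable, List, Tuple


def find_failing_values(
    raw_values: List[str],
    parse: Callable[[str], Any],
) -> List[Tuple[str, Exception]]:
    """Run `parse` on each raw agtype string and collect the ones that raise."""
    failures = []
    for raw in raw_values:
        try:
            parse(raw)
        except Exception as exc:  # broad on purpose: this is a diagnostic helper
            failures.append((raw, exc))
    return failures
```

The raw strings could come from the `::text` fallback query described under "Current workaround"; the returned pairs make it easier to attach a minimal failing value to the report.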

Current workaround

We catch exceptions during cursor iteration and fall back to raw SQL execution via ag_catalog.cypher(...) with ::text casts on every column. This bypasses the AgeLoader → ANTLR4 parseAgeValue() path entirely and returns plain strings that we parse manually with json.loads() after stripping ::vertex, ::edge, and ::path suffixes:

# Fallback: bypass the ANTLR4 parser entirely
import json
import re

# cols, graph, and cypher are supplied by the caller
col_defs = ", ".join(f"{c} agtype" for c in cols)
text_casts = ", ".join(f"{c}::text" for c in cols)
sql = f"SELECT {text_casts} FROM ag_catalog.cypher('{graph}', $$ {cypher} $$) AS ({col_defs})"

with conn.cursor() as cur:
    cur.execute(sql)
    for row in cur.fetchall():
        # Manual parsing: strip the trailing "::vertex"/"::edge"/"::path" suffix, then json.loads()
        cleaned = re.sub(r"::(vertex|edge|path)$", "", str(row[0]))
        parsed = json.loads(cleaned)
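
The regex above is anchored at the end of the string, so it would leave the inner `::vertex` and `::edge` markers of a path value in place. The following is a rough sketch of a string-aware alternative, not part of the driver; `strip_agtype_annotations` and `agtype_text_to_python` are hypothetical names. It assumes the `::text` output is valid JSON once those three markers are removed, and it does not handle other annotations such as `::numeric`.

```python
import json
from typing import Any

_ANNOTATIONS = ("::vertex", "::edge", "::path")


def strip_agtype_annotations(text: str) -> str:
    """Remove ::vertex/::edge/::path markers that appear outside JSON strings."""
    out = []
    i = 0
    in_string = False
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < len(text):
                # Copy the escaped character verbatim so \" does not end the string.
                out.append(text[i + 1])
                i += 2
                continue
            if ch == '"':
                in_string = False
            i += 1
            continue
        if ch == '"':
            in_string = True
            out.append(ch)
            i += 1
            continue
        for marker in _ANNOTATIONS:
            if text.startswith(marker, i):
                i += len(marker)  # skip the marker entirely
                break
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def agtype_text_to_python(text: str) -> Any:
    """Parse agtype text (as returned by a ::text cast) into plain Python objects."""
    return json.loads(strip_agtype_annotations(text))
```

Tracking whether the scanner is inside a string keeps a property value that happens to contain the text `::vertex` intact, which a plain global regex substitution would not.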

Suggested fix

Add null-guards in ResultVisitor methods (e.g., visitPair, visitAgValue, visitStringValue) to handle cases where the ANTLR4 parse tree contains None child nodes. Alternatively, regenerate the parser with a newer ANTLR4 version (>= 4.13.x), which may handle these grammar edge cases more robustly.
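
To illustrate the kind of guard meant here, below is a minimal sketch that uses stand-in classes rather than the real ANTLR4 contexts. `StubNode`, `GuardedVisitor`, and `AgtypeParseError` are hypothetical, and the child layout (key, ':', value) is an assumption about the grammar, not taken from the driver's actual ResultVisitor. A guard like this cannot recover data missing from the parse tree; it only turns the opaque AttributeError into an error that names the offending input.

```python
class AgtypeParseError(ValueError):
    """Hypothetical error type for parse trees with missing nodes."""


class StubNode:
    """Stand-in for an ANTLR4 context; exposes only what the sketch needs."""

    def __init__(self, text, children=()):
        self.text = text
        self.children = list(children)

    def getChild(self, i):
        # Returns None when the requested child is absent.
        return self.children[i] if i < len(self.children) else None

    def getText(self):
        return self.text


class GuardedVisitor:
    def visitPair(self, ctx):
        """Visit a `key : value` pair, checking each child before use."""
        if ctx is None:
            raise AgtypeParseError("pair context is missing")
        key_node = ctx.getChild(0)
        value_node = ctx.getChild(2)  # child 1 is assumed to be the ':' token
        if key_node is None or value_node is None:
            raise AgtypeParseError(f"incomplete pair: {ctx.getText()!r}")
        return key_node.getText().strip('"'), value_node.getText()
```

Raising a descriptive error is one option; returning None for the missing value would be another, at the cost of silently dropping data.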

Environment

  • Apache AGE: 1.5.0 and 1.6.0
  • Python driver: master branch (psycopg3 version, rev 5f5b744)
  • antlr4-python3-runtime: 4.11.1 (also reproduced with 4.13.2)
  • psycopg: 3.2.x
  • Python: 3.13
