[BUG] JSQLParser Version : 5.3 : LATERAL VIEW with three or more column aliases silently mis-parses extras as cross-join tables #2433

@otherhy

Failing SQL Feature:

When using a LATERAL VIEW with three or more column aliases, only the first two aliases are correctly absorbed into LateralView.columnAlias. The third and subsequent aliases are silently mis-parsed as additional comma-separated tables in the FROM clause (implicit cross joins) and end up in PlainSelect.joins as bare Table items.

I noticed that closed issue #2088 addressed the case of a LATERAL VIEW with two column aliases. However, that fix appears to cover only the two-alias scenario; the grammar production for the column-alias list is still capped at two identifiers, so any case with three or more aliases is still broken in the same way described in #2088. Hive/Spark allow arbitrary-length alias lists (for example, json_tuple typically yields many output columns), so this remaining gap is hit easily in real workloads.

The parse does not raise a syntax error — the failure is silent. Worse, Statement.toString() re-emits a textually identical SQL string (because the leaked aliases get re-printed as the joined table list), which makes the broken AST very hard to detect by round-trip inspection.

SQL Example:

SELECT a
FROM t
LATERAL VIEW json_tuple(j, 'a', 'b', 'c', 'd', 'e', 'f', 'g')
x AS c1, c2, c3, c4, c5, c6, c7;

In this query, c1 and c2 are correctly registered as column aliases of the lateral view. However, c3, c4, c5, c6, c7 are incorrectly interpreted as five separate tables joined to the FROM clause. Reflective inspection of the parsed AST shows:

  • PlainSelect.fromItem = Table("t")
  • PlainSelect.joins = [Join(simple=true, right=Table("c3")), ..., Join(simple=true, right=Table("c7"))] ← unexpected
  • LateralView.columnAlias.aliasColumns.size() = 2 ← should be 7

The expected behavior is that all of c1, c2, c3, c4, c5, c6, c7 are recognized as column aliases of the lateral view, with PlainSelect.joins being null (or empty), and no implicit table joins implied.
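Because the round trip through Statement.toString() hides the problem, a check that does not depend on the parser can help flag affected statements. The following is only a minimal sketch, not part of JSQLParser: the class name LateralViewAliasCount and the method countAliases are hypothetical, and the regular expression assumes the simple unparenthesized form shown above, with unquoted identifiers and no standalone word AS inside string literals. It counts the aliases written after AS in each LATERAL VIEW clause.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a JSQLParser API): counts the column aliases
// written after AS in each LATERAL VIEW clause of a SQL string.
public class LateralViewAliasCount {

    // Heuristic: from "LATERAL VIEW", skip lazily to the first standalone AS,
    // then capture a comma-separated run of unquoted identifiers.
    // Quoted identifiers and the parenthesized alias form are not handled.
    private static final Pattern LATERAL_VIEW_ALIASES = Pattern.compile(
            "LATERAL\\s+VIEW\\b.*?\\bAS\\s+(\\w+(?:\\s*,\\s*\\w+)*)",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns one entry per LATERAL VIEW clause found, in textual order.
    public static List<Integer> countAliases(String sql) {
        List<Integer> counts = new ArrayList<>();
        Matcher m = LATERAL_VIEW_ALIASES.matcher(sql);
        while (m.find()) {
            // group(1) is the alias list, e.g. "c1, c2, c3"
            counts.add(m.group(1).split("\\s*,\\s*").length);
        }
        return counts;
    }
}
```

For the query above this textual count is 7. Comparing it with the size reported by LateralView.columnAlias.aliasColumns (2 in JSqlParser 5.3) exposes the leaked aliases even though the re-emitted SQL text looks identical.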

For comparison, the parenthesized form LATERAL VIEW ... x AS (c1, c2, ..., c7) is rejected outright with a ParseException at the opening parenthesis, so no alternative syntax currently works for more than two aliases.

Boundary behavior observed (JSqlParser 5.3):

| Aliases | Result |
|---|---|
| AS c1, c2 | OK: both absorbed into columnAlias.aliasColumns |
| AS c1, c2, c3 | WRONG: c3 becomes a join Table |
| AS c1, c2, ..., cN (N >= 3) | WRONG: c3..cN become join Tables |
| AS (c1, c2, ..., cN) | ERROR: ParseException at "(" |

Software Information:

  • JSqlParser version: 5.3
  • Database: Spark SQL (Hive-style LATERAL VIEW)
