Skip to content

feat(sql): accept SAP ADS dialect ({static} hint, [bracket] tables, table alias, UPPER()/LOWER() in WHERE, WHERE 1=1, comma-join)#48

Merged
FiveTechSoft merged 4 commits into
FiveTechSoft:mainfrom
Admnwk:pr/sql-ads-dialect
Jun 23, 2026
Merged

feat(sql): accept SAP ADS dialect ({static} hint, [bracket] tables, table alias, UPPER()/LOWER() in WHERE, WHERE 1=1, comma-join)#48
FiveTechSoft merged 4 commits into
FiveTechSoft:mainfrom
Admnwk:pr/sql-ads-dialect

Conversation

@Admnwk

@Admnwk Admnwk commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Accept SAP ADS SQL dialect in the engine's SQL parser/executor

Real-world driver: a Harbour/FiveWin ERP that ran on the original Advantage
Client Engine (ace64.dll) and now links OpenADS. Several queries that are
valid ADS SQL and worked with the original ACE were rejected by the
OpenADS parser/executor. These are the same query off one search screen:

SELECT {static} * FROM [articulo.dat] AS a
 WHERE 1 = 1
   AND UPPER(a.creffacart) <> 'N'
   AND UPPER(a.crefhabart) <> 'N'
   AND UPPER(a.cnombreart) LIKE 'A%'
 ORDER BY a.cnombreart

This PR teaches the parser/executor five ADS constructs. Each is additive and
gated, so existing queries are unaffected.

Construct Before After
SELECT {static} ... cursor hint 7200 expected '*' or column name hint consumed and ignored (a no-op; OpenADS picks its own access path)
FROM [articulo.dat] bracketed free-table 7200 expected table name read_identifier_or_filename() now handles [...], mirroring read_identifier()
FROM <table> AS <alias> (and bare alias) alias left unconsumed → WHERE/ORDER BY dropped alias consumed and recorded in SelectStmt.table_alias; qualified a.col already resolves by column name
UPPER(col) / LOWER(col) on the WHERE LHS AE_COLUMN_NOT_FOUND (read as a column) parsed as WhereCmp.lhs_fn; the executor folds the cell before comparison, literal verbatim — faithful to ADS semantics, applied across every WHERE evaluator (single-table, UPDATE, DELETE, projection-full, aggregate/HAVING, JOIN-condition, plus the connection nocase path)
WHERE 1 = 1 constant boilerplate AE_COLUMN_NOT_FOUND (read 1 as a column) a digit-led LHS (xBase columns never start with a digit) is evaluated at parse time and folded to an always-true / always-false node

Design note on UPPER()/LOWER()

The fold is applied to the column value only and the literal is left as written
(UPPER(col).compare(lit)), so UPPER(col) = 'n' correctly matches nothing —
matching true ADS semantics rather than a blanket case-insensitive compare. It
is wired into all WHERE evaluators and is a no-op when no function is
present, so there is no behaviour change for existing queries.

Tests

  • Parser unit tests for each construct, plus the full search query above.
  • Engine tests against real .dbf tables: SELECT scan, LIKE, UPDATE/DELETE
    row scope under UPPER(), constant-predicate folding, and the complete
    search query end-to-end.
  • Full unit suite green (584 cases / 50,878 assertions) with zero
    regression on this branch.

Known follow-up (not in this PR)

Comma joins — FROM a, b WHERE a.x = b.y — are still unsupported (only explicit
INNER/LEFT/RIGHT/FULL JOIN ... ON). That is a multi-table FROM feature worth a
separate PR; the immediate workaround is to rewrite such queries to explicit
INNER JOIN ... ON.

🤖 Generated with Claude Code

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces compatibility features for the SAP ADS SQL dialect, including support for optimizer hints, bracket-quoted table names, table aliases, UPPER/LOWER case-folding on the WHERE clause left-hand side, and constant predicate folding (e.g., 1=1). The review feedback highlights a performance regression where string copying occurs unconditionally during row comparisons even when no case-folding is required. Additionally, the feedback suggests expanding the constant predicate parser to support signed and decimal numbers (e.g., -1 or .5) instead of only single digits, and adding corresponding test cases.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread src/abi/ace_exports.cpp
Comment on lines +10158 to +10159
cmp = apply_where_fn(v.value().as_string, lhs_fn)
.compare(lit);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Unconditionally calling apply_where_fn on every row comparison introduces a significant performance regression for all existing queries where lhs_fn is WhereFn::None (the default). Since apply_where_fn takes the string by value, it copies and allocates memory for the string on every single row comparison, even when no case-folding is requested.\n\nWe should check if lhs_fn is WhereFn::None first to completely bypass the function call and avoid any string copying/allocation in the common case.

                        cmp = (lhs_fn == openads::sql::WhereFn::None)\n                            ? v.value().as_string.compare(lit)\n                            : apply_where_fn(v.value().as_string, lhs_fn).compare(lit);

Comment thread src/sql/parser.cpp
Comment on lines +220 to +224
bool peek_digit() {
skip_ws();
return pos_ < s_.size() &&
std::isdigit(static_cast<unsigned char>(s_[pos_]));
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The peek_digit() helper only checks if the next non-whitespace character is a decimal digit. This fails to recognize valid constant predicates that start with a sign (e.g., WHERE -1 = -1 or WHERE +1 = +1) or a decimal point (e.g., WHERE .5 = .5), causing them to fail parsing with an "expected column name" error.\n\nWe should generalize this helper to peek_numeric_literal() to correctly identify all valid numeric literals.

    bool peek_numeric_literal() {\n        skip_ws();\n        std::size_t p = pos_;\n        if (p < s_.size() && (s_[p] == '+' || s_[p] == '-')) ++p;\n        if (p < s_.size() && std::isdigit(static_cast<unsigned char>(s_[p]))) return true;\n        if (p + 1 < s_.size() && s_[p] == '.' && std::isdigit(static_cast<unsigned char>(s_[p + 1]))) return true;\n        return false;\n    }

Comment thread src/sql/parser.cpp
// never start with a digit, so a digit-led LHS is a literal. Evaluate
// it at parse time and fold to an empty AND (always-true) / empty OR
// (always-false) node so the executor never looks up a "1" column.
if (c.peek_digit()) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Update the constant predicate check to use the new peek_numeric_literal() helper to support signed and decimal constant predicates.

Suggested change
if (c.peek_digit()) {
if (c.peek_numeric_literal()) {

Comment on lines +228 to +235
TEST_CASE("ADS dialect: constant predicate WHERE 1 = 1 folds to always-true") {
auto r = parse_select("SELECT * FROM x WHERE 1 = 1");
REQUIRE(r.has_value());
REQUIRE(r.value().where != nullptr);
// Always-true is encoded as an AND node with no children.
CHECK(r.value().where->kind == WhereExpr::Kind::And);
CHECK(r.value().where->children.empty());
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Add test cases to verify that signed and decimal constant predicates (e.g., -1 = -1, +1 = +1, .5 = .5) are correctly parsed and folded.

TEST_CASE(\"ADS dialect: constant predicate WHERE 1 = 1 folds to always-true\") {\n    auto r = parse_select(\"SELECT * FROM x WHERE 1 = 1\");\n    REQUIRE(r.has_value());\n    REQUIRE(r.value().where != nullptr);\n    // Always-true is encoded as an AND node with no children.\n    CHECK(r.value().where->kind == WhereExpr::Kind::And);\n    CHECK(r.value().where->children.empty());\n}\n\nTEST_CASE(\"ADS dialect: constant predicate WHERE -1 = -1 folds to always-true\") {\n    auto r = parse_select(\"SELECT * FROM x WHERE -1 = -1\");\n    REQUIRE(r.has_value());\n    REQUIRE(r.value().where != nullptr);\n    CHECK(r.value().where->kind == WhereExpr::Kind::And);\n    CHECK(r.value().where->children.empty());\n}\n\nTEST_CASE(\"ADS dialect: constant predicate WHERE .5 = .5 folds to always-true\") {\n    auto r = parse_select(\"SELECT * FROM x WHERE .5 = .5\");\n    REQUIRE(r.has_value());\n    REQUIRE(r.value().where != nullptr);\n    CHECK(r.value().where->kind == WhereExpr::Kind::And);\n    CHECK(r.value().where->children.empty());\n}

@Admnwk Admnwk changed the title feat(sql): accept SAP ADS dialect ({static} hint, [bracket] tables, table alias, UPPER()/LOWER() in WHERE, WHERE 1=1) feat(sql): accept SAP ADS dialect ({static} hint, [bracket] tables, table alias, UPPER()/LOWER() in WHERE, WHERE 1=1, comma-join) Jun 23, 2026
@Admnwk

Admnwk commented Jun 23, 2026

Copy link
Copy Markdown
Contributor Author

@Admnwk Admnwk force-pushed the pr/sql-ads-dialect branch from d72541c to 55ff13e Compare June 23, 2026 13:33
@Admnwk

Admnwk commented Jun 23, 2026

Copy link
Copy Markdown
Contributor Author

Hi Antonio 👋 — rebased this onto current main now that #31 landed (that's what was causing the conflict). Clean 4-commit history, full suite green locally: 599/599 test cases, 50984 assertions, 0 failures. Ready whenever you'd like to take a look. Thanks my friend 🙏

Admnwk and others added 4 commits June 23, 2026 17:25
…OM, table alias

Three legacy SAP ADS constructs the strict-SQL parser used to reject:

- SELECT {static} ... cursor/optimizer hint after SELECT (consumed, no-op)

- FROM [articulo.dat] bracket-quoted table/file name (read_identifier_or_filename now mirrors read_identifier's bracket handling)

- FROM <table> AS <alias> (and bare-alias form, guarded vs clause keywords); recorded in SelectStmt.table_alias

Qualified column refs <alias>.<col> already resolve by column name, so the alias is informational. Full unit suite 560/560 (45015 assertions) green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ialect)

ADS apps write case-insensitive filters as UPPER(col) <> 'N' / UPPER(col) LIKE 'A%'. The parser rejected a function on the LHS of a WHERE comparison.

parse_cmp now recognises UPPER(<col>)/LOWER(<col>) (paren-gated, so a column literally named UPPER still parses) and records it as WhereCmp.lhs_fn.

The executor applies the fold to the cell value before comparison (literal verbatim — faithful to ADS UPPER semantics, unlike a plain nocase mapping which also matches mismatched-case literals). Wired into every WHERE evaluator: single-table, UPDATE, DELETE, SELECT-full, aggregate/HAVING and JOIN-condition paths, plus the connection nocase path.

Tests: parser (UPPER/LOWER/bare/LIKE) + engine (SELECT scan, LIKE, UPDATE/DELETE row scope). Full suite 566/566 (45070 assertions) green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
SQL generators emit 'WHERE 1 = 1 AND ...' boilerplate to simplify clause concatenation. OpenADS read the leading '1' as a column name and failed at execution with AE_COLUMN_NOT_FOUND.

parse_cmp now detects a digit-led LHS (xBase columns never start with a digit), evaluates the <num> <op> <num> comparison at parse time, and folds it to an empty AND (always-true) or empty OR (always-false) node the executor already handles.

Tests: parser (1=1 true / 1=2 false / 1=1 AND pred) + engine (all rows / no rows / mixed) and the full Russoft/Zerus search query end-to-end. Full suite 571/571 green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Legacy SAP ADS apps write joins as a comma list with the equality
predicate in the WHERE — `FROM a, b WHERE a.x = b.y` — instead of
`INNER JOIN ... ON`. The strict-SQL parser silently dropped everything
after the comma, so these queries returned the wrong (un-joined) result.

Lower the two-table case onto the existing single inner_join AST:

- Parser accepts `FROM a, b` (each table with an optional AS/bare alias),
  then after the WHERE is parsed lifts the single `col = col` equality
  (recorded as Eq + is_outer_ref) into the JoinClause and blanks that
  node to an always-true empty-AND, leaving the rest of the WHERE as a
  row filter. Clear errors for: no equi-predicate (cartesian), more than
  one join key (composite), three+ comma tables, and mixing comma-join
  with an explicit JOIN.

- Executor adds an orientation fallback: alias stripping (`a.x` -> `x`)
  loses the table binding, so when the straight left/right column
  assignment does not resolve but the swapped one does, use the swap.
  Purely additive — it only fires on inputs the strict path already
  rejects, and also makes explicit `JOIN ON b.y = a.x` lenient like ADS.

Reuses the proven join executor end to end; no new multi-table path.

Tests: 8 parser cases + 3 executor cases (basic comma-join == explicit
INNER JOIN, joined-table-first orientation, residual WHERE filter).
Full unit suite green, zero regression.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@FiveTechSoft FiveTechSoft merged commit ca41b97 into FiveTechSoft:main Jun 23, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants