Add parser unit tests (81 tests) and fix IF NOT EXISTS parsing #32

Merged
poyrazK merged 4 commits into main from feat/parser-unit-tests
Apr 14, 2026

poyrazK (Owner) commented Apr 14, 2026

Summary

  • Created 81 unit tests for parser/parser.cpp covering SELECT, CREATE TABLE/INDEX, INSERT, UPDATE, DELETE, DROP, transaction statements, and expression parsing
  • Fixed a bug in parse_create_table() where the IF NOT EXISTS check consumed tokens in the wrong order (it looked for NOT before IF)
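
To make the token-order fix concrete, here is a minimal sketch of the corrected check. It is not the project's code: `TokenCursor`, the reduced `TokenType` enum, `IfNotExists`, and `parse_if_not_exists()` are hypothetical stand-ins, and only the `consume()` pattern is taken from the review thread. With the old order, the first `consume()` would presumably have been attempted on NOT while the cursor was still sitting on IF, so the clause could never match.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical, reduced token set; the real lexer defines many more kinds.
enum class TokenType { If, Not, Exists, Identifier };

// Toy token cursor standing in for the real Parser (assumption).
class TokenCursor {
public:
    explicit TokenCursor(std::vector<TokenType> tokens) : tokens_(std::move(tokens)) {}

    // Advance past the next token only if it has the expected type.
    bool consume(TokenType expected) {
        if (pos_ < tokens_.size() && tokens_[pos_] == expected) {
            ++pos_;
            return true;
        }
        return false;
    }

private:
    std::vector<TokenType> tokens_;
    std::size_t pos_ = 0;
};

enum class IfNotExists { Absent, Present, Malformed };

// Parses the optional clause that may follow CREATE TABLE.
IfNotExists parse_if_not_exists(TokenCursor& cur) {
    if (!cur.consume(TokenType::If)) {
        return IfNotExists::Absent;  // plain CREATE TABLE, nothing consumed
    }
    // Once IF has been seen, both NOT and EXISTS are mandatory.
    if (!cur.consume(TokenType::Not) || !cur.consume(TokenType::Exists)) {
        return IfNotExists::Malformed;
    }
    return IfNotExists::Present;
}
```

In the real parser the malformed case corresponds to returning nullptr, as the review walkthrough describes.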

Test plan

  • All 29 tests pass including new parser_tests
  • CI builds pass (address, thread sanitizer)

Files changed

  • tests/parser_tests.cpp (new)
  • src/parser/parser.cpp (bug fix)
  • CMakeLists.txt (added parser_tests target)

Note

The lexer does not support the PRIMARY KEY column constraint (the PRIMARY and KEY keywords are missing), so the tests were adjusted accordingly.

Summary by CodeRabbit

  • Bug Fixes

    • Fixed CREATE TABLE IF NOT EXISTS parsing to correctly validate the IF keyword sequence.
  • Tests

    • Added comprehensive test suite covering SQL statement parsing validation including SELECT queries, table/index creation and deletion, INSERT/UPDATE/DELETE operations, transactions, and expression parsing with case-insensitivity and whitespace robustness checks.

- Created tests/parser_tests.cpp with comprehensive coverage of:
  - SELECT with all clauses (WHERE, GROUP BY, HAVING, ORDER BY, LIMIT, OFFSET, JOINs)
  - CREATE TABLE/INDEX, INSERT, UPDATE, DELETE, DROP
  - Transaction statements (BEGIN, COMMIT, ROLLBACK)
  - Expression parsing (binary, unary, IN, IS NULL, functions)
  - Error handling and edge cases

- Fixed parse_create_table() to handle IF NOT EXISTS in correct order
  (was checking for NOT before IF, now correctly checks IF before NOT)

- Note: PRIMARY KEY not supported by lexer (missing PRIMARY/KEY keywords)

coderabbitai Bot commented Apr 14, 2026

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 51a6db49-c297-4956-8107-34a2bdf8ee36

📥 Commits

Reviewing files that changed from the base of the PR and between c7c093c and f785e24.

📒 Files selected for processing (4)
  • Testing/Temporary/CTestCostData.txt
  • include/parser/statement.hpp
  • src/parser/parser.cpp
  • tests/parser_tests.cpp
Walkthrough

The PR adds comprehensive unit tests for SQL parsing functionality, including a corrected "IF NOT EXISTS" validation in the CREATE TABLE parser, and registers the new test executable in the CMake build system.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Build Configuration: CMakeLists.txt | Registers a new GoogleTest-based executable parser_tests built from tests/parser_tests.cpp within the existing test build section. |
| Parser Logic: src/parser/parser.cpp | Updates CREATE TABLE IF NOT EXISTS handling to consume IF before NOT, and returns nullptr if IF is present but NOT is missing, ensuring correct syntax validation. |
| Test Suite: tests/parser_tests.cpp | New comprehensive test file with 800+ lines covering parsing validation for SELECT, JOIN, CREATE TABLE, CREATE INDEX, INSERT, UPDATE, DELETE, DROP statements, transactions, and expression types including constants, operators, functions, and parameter placeholders. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A parser grows more robust and true,
With tests of green to see it through,
IF NOT EXISTS, now parsed right,
The rabbit's code shines crystal bright!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 21.18%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately captures the two main changes: adding 81 parser unit tests and fixing the IF NOT EXISTS parsing bug. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/parser/parser.cpp (1)

290-297: ⚠️ Potential issue | 🟠 Major

Preserve IF NOT EXISTS on the AST.

This branch now parses the clause correctly, but CreateTableStatement still has no way to store it, so downstream code cannot tell CREATE TABLE IF NOT EXISTS ... from a normal CREATE TABLE .... That loses behavior, not just syntax.

Suggested fix
 if (consume(TokenType::If)) {
     if (!consume(TokenType::Not)) {
         return nullptr;
     }
     if (!consume(TokenType::Exists)) {
         return nullptr;
     }
+    stmt->set_if_not_exists(true);
 }

And add the corresponding field/accessors on CreateTableStatement, e.g.:

bool if_not_exists_ = false;

void set_if_not_exists(bool value) { if_not_exists_ = value; }
[[nodiscard]] bool if_not_exists() const { return if_not_exists_; }
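
As a rough illustration of why the flag matters downstream, the sketch below pairs a pared-down `CreateTableStatement` with a hypothetical executor step. Only `set_if_not_exists()` and `if_not_exists()` come from the suggestion above; the constructor, `table_name()`, `CreateResult`, `execute_create()`, and the `std::set` catalog are assumptions made for the example and do not describe the project's executor.

```cpp
#include <set>
#include <string>
#include <utility>

// Pared-down stand-in for the real CreateTableStatement (assumption).
class CreateTableStatement {
public:
    explicit CreateTableStatement(std::string table_name)
        : table_name_(std::move(table_name)) {}

    void set_if_not_exists(bool value) { if_not_exists_ = value; }
    [[nodiscard]] bool if_not_exists() const { return if_not_exists_; }
    [[nodiscard]] const std::string& table_name() const { return table_name_; }

private:
    std::string table_name_;
    bool if_not_exists_ = false;
};

enum class CreateResult { Created, SkippedExisting, ErrorAlreadyExists };

// Hypothetical executor step: the catalog is modelled as a set of table names.
CreateResult execute_create(std::set<std::string>& catalog,
                            const CreateTableStatement& stmt) {
    if (catalog.count(stmt.table_name()) > 0) {
        // Without the flag on the AST, these two outcomes are indistinguishable.
        return stmt.if_not_exists() ? CreateResult::SkippedExisting
                                    : CreateResult::ErrorAlreadyExists;
    }
    catalog.insert(stmt.table_name());
    return CreateResult::Created;
}
```

Running the same statement twice yields an error for a plain CREATE TABLE and a silent skip when the flag is set, which is the behavioral difference the review comment refers to.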
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/parser/parser.cpp` around lines 290 - 297, The parser currently
recognizes the IF NOT EXISTS tokens but doesn't record that on the AST; update
CreateTableStatement to hold and expose this flag (add a private bool
if_not_exists_ = false plus setter set_if_not_exists(bool) and getter
if_not_exists() const), and when parsing in parser.cpp where you consume
TokenType::If/Not/Exists call CreateTableStatement::set_if_not_exists(true) (or
otherwise set the flag on the created CreateTableStatement) so downstream code
can distinguish CREATE TABLE IF NOT EXISTS from plain CREATE TABLE.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/parser_tests.cpp`:
- Around line 561-571: Update the ParseArithmeticPrecedence test to assert the
actual AST shape: after obtaining select via as<SelectStatement>(stmt) and
checking select->columns().size(), retrieve the first column expression (via
select->columns()[0] or select->columns().at(0)) and assert it is a BinaryExpr
whose operator is '+'; then assert the BinaryExpr's left child is an integer
literal with value 1 and its right child is itself a BinaryExpr whose operator
is '*' with left integer literal 2 and right integer literal 3. Use the existing
as<BinaryExpr>() and as<IntegerLiteral>() helpers (or equivalent
node-casting/accessor methods) and the node accessors
(operator()/left()/right()/value()) to make these concrete assertions.
- Around line 227-236: The test TEST(ParserTests, SelectWithJoinNoCondition)
currently asserts that parse("SELECT * FROM users JOIN orders") yields a join
with a nullptr condition; instead update the test to use a valid join form or
assert a syntax error: either replace the SQL with a supported cartesian join
like "SELECT * FROM users CROSS JOIN orders" or a comma join, or change
expectations to assert parse(...) fails (stmt == nullptr) or produces a
diagnostic error; locate the test by the test name and the calls to parse(),
SelectStatement, select->joins(), and join->condition to modify the assertion
accordingly.
- Around line 510-519: The test only checks parse success; update
TEST(ParserTests, ParseBooleanConstants) to assert boolean semantics: for
stmt1/stmt2/stmt3 verify the first select expression is an AST literal/constant
node (e.g., SelectStatement->select_list()[0] or similar) and that its
value/type matches true, false, and NULL respectively instead of being parsed as
an identifier/column; if necessary reference Parser::parse_primary() behavior
when locating the expression node to inspect and assert the node class
(ConstantLiteral/Literal/BooleanLiteral) and the literal value for
TRUE/FALSE/NULL.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cfc6f891-552c-4d54-a2a6-753c68179620

📥 Commits

Reviewing files that changed from the base of the PR and between eb8bf5d and c7c093c.

📒 Files selected for processing (3)
  • CMakeLists.txt
  • src/parser/parser.cpp
  • tests/parser_tests.cpp

Comment thread tests/parser_tests.cpp
Comment on lines +227 to +236
TEST(ParserTests, SelectWithJoinNoCondition) {
    // JOIN without ON condition - parsed but condition is nullptr
    auto stmt = parse("SELECT * FROM users JOIN orders");
    ASSERT_NE(stmt, nullptr);

    auto* select = as<SelectStatement>(stmt);
    ASSERT_NE(select, nullptr);
    EXPECT_EQ(select->joins().size(), 1U);
    EXPECT_EQ(select->joins()[0].condition, nullptr);
}

⚠️ Potential issue | 🟠 Major

Don't lock in bare JOIN without a join condition.

This test turns the current permissive parser behavior into the contract, but SELECT ... JOIN orders is malformed SQL for the join types this parser recognizes. If you want cartesian products, cover CROSS JOIN or comma joins explicitly instead.

Suggested fix
 TEST(ParserTests, SelectWithJoinNoCondition) {
-    // JOIN without ON condition - parsed but condition is nullptr
     auto stmt = parse("SELECT * FROM users JOIN orders");
-    ASSERT_NE(stmt, nullptr);
-
-    auto* select = as<SelectStatement>(stmt);
-    ASSERT_NE(select, nullptr);
-    EXPECT_EQ(select->joins().size(), 1U);
-    EXPECT_EQ(select->joins()[0].condition, nullptr);
+    EXPECT_EQ(stmt, nullptr);
 }
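
For comparison, a stricter rule along the lines the reviewer describes might look like the sketch below. It is only an illustration of that alternative: the merged follow-up commit keeps the permissive behavior (JOIN without ON is accepted with a nullptr condition). `is_valid_join_clause()` and its string-token input are hypothetical and unrelated to the project's lexer.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Validates the tokens that follow the left-hand table name:
//   [CROSS] JOIN <table> [ON <condition tokens...>]
// Hypothetical helper; tokens are assumed to be upper-cased keywords or names.
bool is_valid_join_clause(const std::vector<std::string>& tokens) {
    std::size_t pos = 0;

    bool cross = false;
    if (pos < tokens.size() && tokens[pos] == "CROSS") {
        cross = true;
        ++pos;
    }
    if (pos >= tokens.size() || tokens[pos] != "JOIN") {
        return false;  // not a join clause at all
    }
    ++pos;
    if (pos >= tokens.size()) {
        return false;  // missing right-hand table name
    }
    ++pos;  // skip the table name

    const bool has_on = pos < tokens.size() && tokens[pos] == "ON";
    if (cross) {
        // A cartesian product takes no condition and nothing may follow.
        return !has_on && pos == tokens.size();
    }
    // A plain JOIN needs ON followed by at least one condition token.
    return has_on && pos + 1 < tokens.size();
}
```

Under this rule `JOIN orders` is rejected, while `CROSS JOIN orders` and `JOIN orders ON ...` are accepted.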

Comment thread tests/parser_tests.cpp
Comment on lines +510 to +519
TEST(ParserTests, ParseBooleanConstants) {
    auto stmt1 = parse("SELECT TRUE FROM t");
    ASSERT_NE(stmt1, nullptr);

    auto stmt2 = parse("SELECT FALSE FROM t");
    ASSERT_NE(stmt2, nullptr);

    auto stmt3 = parse("SELECT NULL FROM t");
    ASSERT_NE(stmt3, nullptr);
}

⚠️ Potential issue | 🟠 Major

Assert boolean semantics here, not just parse success.

TRUE/FALSE currently go through the generic keyword path in Parser::parse_primary(), while only NULL is converted to a constant. As written, this test still passes if booleans are parsed as columns, so it doesn't actually cover the feature it names.

Suggested fix
 TEST(ParserTests, ParseBooleanConstants) {
     auto stmt1 = parse("SELECT TRUE FROM t");
     ASSERT_NE(stmt1, nullptr);
+    auto* select1 = as<SelectStatement>(stmt1);
+    ASSERT_NE(select1, nullptr);
+    ASSERT_GE(select1->columns().size(), 1U);
+    EXPECT_EQ(select1->columns()[0]->type(), ExprType::Constant);

     auto stmt2 = parse("SELECT FALSE FROM t");
     ASSERT_NE(stmt2, nullptr);
+    auto* select2 = as<SelectStatement>(stmt2);
+    ASSERT_NE(select2, nullptr);
+    ASSERT_GE(select2->columns().size(), 1U);
+    EXPECT_EQ(select2->columns()[0]->type(), ExprType::Constant);

     auto stmt3 = parse("SELECT NULL FROM t");
     ASSERT_NE(stmt3, nullptr);
+    auto* select3 = as<SelectStatement>(stmt3);
+    ASSERT_NE(select3, nullptr);
+    ASSERT_GE(select3->columns().size(), 1U);
+    EXPECT_EQ(select3->columns()[0]->type(), ExprType::Constant);
 }
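
The assertions in this suggestion only pass if the primary-expression path treats TRUE and FALSE as literals. A minimal sketch of that classification follows, assuming a hypothetical `classify_primary()` helper and an `ExprType` reduced to two values (`ExprType::Constant` appears in the suggestion; `ExprType::Column` is an assumption). It does not reflect the merged parser, which per the follow-up commit still yields ColumnExpr for TRUE/FALSE and ConstantExpr only for NULL.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Reduced expression kinds; ExprType::Column is assumed for this sketch.
enum class ExprType { Constant, Column };

// Hypothetical helper: decides how a bare word in a SELECT list is classified.
ExprType classify_primary(std::string word) {
    // SQL keywords are case-insensitive, so normalise before comparing.
    std::transform(word.begin(), word.end(), word.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });

    if (word == "TRUE" || word == "FALSE" || word == "NULL") {
        return ExprType::Constant;
    }
    return ExprType::Column;  // anything else is treated as a column reference
}
```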

Comment thread tests/parser_tests.cpp
poyrazK and others added 2 commits April 14, 2026 18:13
- ParseArithmeticPrecedence: Assert AST shape for 1+2*3, verifying
  top-level BinaryExpr(+) with left=1 and right=BinaryExpr(*)
- SelectWithJoinNoCondition: Updated test to reflect actual parser
  behavior (accepts JOIN without ON, condition is nullptr)
- ParseBooleanConstants: Updated to reflect actual parser behavior
  (TRUE/FALSE parsed as ColumnExpr, NULL as ConstantExpr)
- CreateTableStatement: Added if_not_exists_ flag with setter/getter
- parser.cpp: Set if_not_exists flag when parsing IF NOT EXISTS
- All 81 tests pass
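
The AST shape asserted for 1+2*3 follows from parsing multiplication at a tighter level than addition. The sketch below shows one common way to get that shape with a two-level recursive-descent parser; `Expr` and `MiniExprParser` are hypothetical and much simpler than the project's BinaryExpr and Parser classes, and error handling is omitted.

```cpp
#include <cctype>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Minimal expression node: op == 0 marks an integer literal (assumption).
struct Expr {
    char op = 0;
    int value = 0;
    std::unique_ptr<Expr> left;
    std::unique_ptr<Expr> right;
};

class MiniExprParser {
public:
    explicit MiniExprParser(std::string text) : text_(std::move(text)) {}

    std::unique_ptr<Expr> parse() { return parse_additive(); }

private:
    // additive := multiplicative (('+' | '-') multiplicative)*
    std::unique_ptr<Expr> parse_additive() {
        auto lhs = parse_multiplicative();
        while (peek() == '+' || peek() == '-') {
            char op = text_[pos_++];
            auto rhs = parse_multiplicative();
            lhs = make_binary(op, std::move(lhs), std::move(rhs));
        }
        return lhs;
    }

    // multiplicative := literal (('*' | '/') literal)*
    std::unique_ptr<Expr> parse_multiplicative() {
        auto lhs = parse_literal();
        while (peek() == '*' || peek() == '/') {
            char op = text_[pos_++];
            auto rhs = parse_literal();
            lhs = make_binary(op, std::move(lhs), std::move(rhs));
        }
        return lhs;
    }

    // literal := one or more decimal digits
    std::unique_ptr<Expr> parse_literal() {
        peek();  // skip leading whitespace
        auto node = std::make_unique<Expr>();
        while (pos_ < text_.size() &&
               std::isdigit(static_cast<unsigned char>(text_[pos_]))) {
            node->value = node->value * 10 + (text_[pos_++] - '0');
        }
        return node;
    }

    // Skips whitespace and returns the next character, or '\0' at end of input.
    char peek() {
        while (pos_ < text_.size() &&
               std::isspace(static_cast<unsigned char>(text_[pos_]))) {
            ++pos_;
        }
        return pos_ < text_.size() ? text_[pos_] : '\0';
    }

    static std::unique_ptr<Expr> make_binary(char op, std::unique_ptr<Expr> l,
                                             std::unique_ptr<Expr> r) {
        auto node = std::make_unique<Expr>();
        node->op = op;
        node->left = std::move(l);
        node->right = std::move(r);
        return node;
    }

    std::string text_;
    std::size_t pos_ = 0;
};
```

Parsing "1 + 2 * 3" with this sketch gives a root '+' node whose left child is the literal 1 and whose right child is a '*' node over 2 and 3, which is the shape the updated ParseArithmeticPrecedence test checks.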
@poyrazK poyrazK merged commit d1294e9 into main Apr 14, 2026
1 check passed