@hajime-matsumoto hajime-matsumoto commented Nov 22, 2025

Fixed an issue where line comments (--) in SQL files caused incorrect query type detection. For example, an INSERT statement with a comment like "-- SELECT for xxx" at the beginning would be incorrectly identified as a SELECT query.

Changes:

  • Add LINE_COMMENT regex pattern to remove -- style comments
  • Strip both /* */ and -- comments during query type detection only
  • Preserve comments when sending queries to the database (important for MySQL performance hints and annotations)
  • Add comprehensive tests for comment handling edge cases

Queries sent to the database remain unchanged (comments preserved), while query type detection (SELECT vs INSERT/UPDATE/DELETE) now correctly ignores comment content.
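
A minimal sketch of the behavior described above, assuming detection is a case-insensitive prefix check for `select` or `with` on a comment-stripped copy of the SQL. The function name `isSelectQuery` is hypothetical, and the two patterns are the ones quoted later in this thread; the real `SqlQuery` class may differ in detail.

```php
<?php

declare(strict_types=1);

// Patterns as quoted in this thread; the function name is hypothetical.
const C_STYLE_COMMENT = '/\/\*(.*?)\*\//u';
const LINE_COMMENT = '/^\s*--[^\r\n]*/m';

/** Decide SELECT vs. write query from a comment-stripped copy of the SQL. */
function isSelectQuery(string $sql): bool
{
    $stripped = (string) preg_replace(C_STYLE_COMMENT, '', $sql);
    $stripped = trim((string) preg_replace(LINE_COMMENT, '', $stripped));

    return stripos($stripped, 'select') === 0 || stripos($stripped, 'with') === 0;
}

$sql = "-- SELECT for xxx\nINSERT INTO todo (id, title) VALUES ('1', 'run')";

// The keyword inside the comment no longer decides the type.
var_dump(isSelectQuery($sql)); // bool(false)

// $sql itself is never modified, so the comment would still reach the database.
```

Because the stripping happens on a local copy, hints and annotations carried in comments are unaffected by detection.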

Summary by Sourcery

Improve SQL query type detection by ignoring comments while preserving them for execution.

Bug Fixes:

  • Fix misclassification of SQL query types when line comments contain misleading keywords like SELECT or INSERT.

Enhancements:

  • Add handling for SQL line comments in query type detection alongside existing C-style comment stripping.

Tests:

  • Add integration tests covering INSERT, SELECT, and UPDATE statements with leading line comments containing other query keywords.
  • Adjust test SQL schema files to support new comment-handling test cases.

Summary by CodeRabbit

  • Bug Fixes

    • Improved query-type detection so SQL comments (line-style and block) and comment-like text no longer cause misclassification of INSERT/SELECT/UPDATE.
  • Tests

    • Added tests covering leading comments, comments containing SQL, dashes inside string literals, and related edge cases.
    • Updated test SQL fixtures and schemas to validate the corrected behavior.

Contributor

sourcery-ai bot commented Nov 22, 2025

Reviewer's Guide

Adds line-comment stripping to SQL query type detection while preserving the original SQL (including comments) for execution, introduces tests covering comment-related edge cases, and fixes minor test schema mix-ups.

Sequence diagram for SQL query type detection with comment stripping

sequenceDiagram
    participant "Caller" as Caller
    participant "SqlQuery" as SqlQuery
    participant "PDO" as PDO

    "Caller"->>"SqlQuery": "perform(sqlId, values, fetch)"
    activate "SqlQuery"

    "SqlQuery"->>"PDO": "prepare(sqlWithComments)"
    "PDO"-->>"SqlQuery": "PDOStatement (queryString = sqlWithComments)"

    note over "SqlQuery": "Assign lastQuery from PDOStatement->queryString (comments preserved)"

    "SqlQuery"->>"SqlQuery": "queryForDetection = preg_replace(C_STYLE_COMMENT, '', lastQuery)"
    "SqlQuery"->>"SqlQuery": "queryForDetection = preg_replace(LINE_COMMENT, '', queryForDetection)"
    "SqlQuery"->>"SqlQuery": "queryForDetection = trim(queryForDetection)"

    "SqlQuery"->>"SqlQuery": "isSelect = startsWithIgnoreCase(queryForDetection, 'select' or 'with')"

    alt "isSelect is true"
        "SqlQuery"->>"SqlQuery": "result = fetchAll(PDOStatement, fetch)"
    else "isSelect is false"
        "SqlQuery"->>"SqlQuery": "result = [] (no fetch, non-SELECT query)"
    end

    "SqlQuery"->>"SqlQuery": "logger->log(sqlId, values)"

    "SqlQuery"-->>"Caller": "result (execution used original SQL with comments)"
    deactivate "SqlQuery"
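
The flow in the diagram can be sketched roughly as follows. This is a hypothetical stand-in, not the actual `SqlQuery::perform()`: the function name `performSketch` and the explicit `execute()` step are assumptions, and it requires the `pdo_sqlite` extension. It uses the patterns quoted elsewhere in this thread.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for the diagrammed flow; not the real SqlQuery::perform().
function performSketch(PDO $pdo, string $sql, array $values = []): array
{
    $statement = $pdo->prepare($sql);      // original SQL, comments included
    $statement->execute($values);
    $lastQuery = $statement->queryString;  // still contains the comments

    // Detection works on a stripped copy only.
    $forDetection = (string) preg_replace('/\/\*(.*?)\*\//u', '', $lastQuery);
    $forDetection = trim((string) preg_replace('/^\s*--[^\r\n]*/m', '', $forDetection));
    $isSelect = stripos($forDetection, 'select') === 0
        || stripos($forDetection, 'with') === 0;

    // Only SELECT/WITH queries fetch rows; everything else returns an empty array.
    return $isSelect ? $statement->fetchAll(PDO::FETCH_ASSOC) : [];
}

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE todo (id TEXT, title TEXT)');

performSketch(
    $pdo,
    "-- SELECT for xxx\nINSERT INTO todo (id, title) VALUES (:id, :title)",
    ['id' => '1', 'title' => 'run']
);
$rows = performSketch($pdo, "-- list all todos\nSELECT id, title FROM todo");

var_dump($rows);
```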

Updated class diagram for SqlQuery comment-aware type detection

classDiagram
    class SqlQuery {
        <<final>>
        +C_STYLE_COMMENT : string
        +LINE_COMMENT : string
        -pdoStatement : PDOStatement | null
        +perform(sqlId : string, values : array, fetch : FetchInterface | null) : array
    }

    class FetchInterface {
        <<interface>>
    }

    class PDOStatement

    SqlQuery ..> FetchInterface : "uses for result fetching"
    SqlQuery ..> PDOStatement : "holds reference to"

File-Level Changes

Change: Improve SQL query type detection by stripping both C-style and line comments without altering executed SQL.
Details:
  • Introduce a LINE_COMMENT regex constant to match -- style comments in SQL strings.
  • Refactor query-type detection to build a separate, comment-stripped string used only for determining whether a query is SELECT/WITH or a write operation.
  • Ensure the original PDOStatement query string, including comments, is preserved for execution and logging.
Files:
  • src/SqlQuery.php

Change: Adjust test schemas and add coverage for comment-handling behavior in SQL queries.
Details:
  • Swap and correct the definitions of the promise and todo table creation SQL to reflect intended schemas.
  • Add a PHPUnit test case that sets up an in-memory SQLite database and exercises SQL queries with leading and embedded line comments affecting different query types.
  • Create SQL fixture files for insert, select, and update statements that include line comments designed to challenge query-type detection (e.g., SELECT text in comments before INSERT/UPDATE).
Files:
  • tests/sql/create_promise.sql
  • tests/sql/create_todo.sql
  • tests/SqlCommentTest.php
  • tests/sql/todo_insert_with_select_comment.sql
  • tests/sql/todo_select_with_leading_comment.sql
  • tests/sql/todo_update_with_insert_comment.sql

Contributor

coderabbitai bot commented Nov 22, 2025

Walkthrough

Strips C-style and line-style (--) SQL comments for query-type detection in SqlQuery, using the cleaned SQL to decide whether a query is a SELECT/WITH, while leaving the original SQL unchanged for execution. Adds unit tests and SQL fixtures covering comment edge cases.

Changes

Cohort: Core logic
File(s): src/SqlQuery.php
Summary: Adds LINE_COMMENT constant and private removeCommentsForDetection(string $sql): string to strip C-style and -- line comments for detection. Replaces inline detection with call to the new method; execution uses original SQL. Docblock updated.

Cohort: PHPUnit tests
File(s): tests/SqlCommentTest.php
Summary: New test class setting up in-memory SQLite, loading schema/fixtures, and four tests covering: INSERT with SELECT in comment, SELECT with leading dash comment, UPDATE with INSERT-style comment, and dashes inside string literals.

Cohort: SQL fixtures — schema
File(s): tests/sql/create_todo.sql, tests/sql/create_promise.sql
Summary: Updated/added schema files: create_todo.sql defines todo (id TEXT, title TEXT); create_promise.sql defines promise (id TEXT, title TEXT, time TEXT).

Cohort: SQL fixtures — test cases
File(s): tests/sql/todo_insert_with_select_comment.sql, tests/sql/todo_select_with_leading_comment.sql, tests/sql/todo_update_with_insert_comment.sql, tests/sql/todo_with_dashes_in_string.sql
Summary: New SQL files with commented lines or string literals that resemble SQL commands, used to validate comment-stripping and detection behavior.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Caller
    participant SqlQuery
    participant DB as Database

    Caller->>SqlQuery: execute(rawSql, params)
    Note right of SqlQuery `#f0f4c3`: Detection phase uses cleaned SQL
    SqlQuery->>SqlQuery: cleaned = removeCommentsForDetection(rawSql)
    SqlQuery->>SqlQuery: type = classify(cleaned)  -- (SELECT/WITH vs others)
    alt classified as SELECT/WITH
        SqlQuery->>DB: query(rawSql, params)  -- use original SQL for execution
    else non-SELECT
        SqlQuery->>DB: execute(rawSql, params)
    end
    DB-->>SqlQuery: result
    SqlQuery-->>Caller: result

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Review focus:
    • src/SqlQuery.php: verify regex correctness (C-style and -- handling), ensure string-literal edge cases aren't stripped incorrectly.
    • tests/SqlCommentTest.php: validate test setup and assertions, confirm fixtures are loaded as intended.
    • SQL fixtures: confirm schema consistency between create_todo.sql and create_promise.sql and that tests reference correct tables.

Poem

🐰 I nibbled comments hidden in the night,
Dash and star, I found the source of light.
Detection now cleans what the queries say,
Execution stays honest, in its own way.
Hop, patch, and test — the SQL's polite!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage: ⚠️ Warning
    Explanation: Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%.
    Resolution: You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
  • Description Check: ✅ Passed
    Explanation: Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title check: ✅ Passed
    Explanation: The title 'Fix line comment handling in query type detection' clearly and specifically describes the main change: fixing SQL line comment handling in query type detection logic, which aligns with the primary objectives of the PR.
📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8148456 and 89ead58.

📒 Files selected for processing (1)
  • tests/SqlCommentTest.php (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/SqlCommentTest.php
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Sourcery review

Contributor

@sourcery-ai sourcery-ai bot left a comment

Hey there - I've reviewed your changes - here's some feedback:

  • The LINE_COMMENT regex will strip any occurrence of -- regardless of context (including inside string literals or mid-line expressions), so consider tightening it (e.g., anchoring to line starts or whitespace) or clarifying that this heuristic is acceptable for your use cases.
  • The logic for stripping C-style and line comments for query type detection is embedded directly in perform(); consider extracting it into a dedicated private method so the behavior is easier to reuse and unit test in isolation.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
tests/SqlCommentTest.php (1)

60-68: LGTM! Validates UPDATE with misleading comment.

This test ensures UPDATE statements are correctly identified even when comments contain misleading keywords. The verification step confirms the update was actually executed.

Consider adding tests for additional edge cases (though current coverage is sufficient for the PR objectives):

  • Mixed comment types: C-style and line comments in the same query
  • Comments at different positions (inline, trailing)
  • Multiple consecutive line comments
  • Comments with special characters or SQL keywords
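
The suggested edge cases could be pinned down with a small table-driven check like the one below. This is only a sketch: `looksLikeSelect` is a hypothetical helper, the prefix check is an assumption about how detection works, and the line-comment pattern is the tightened one adopted later in this thread.

```php
<?php

declare(strict_types=1);

// Hypothetical helper mirroring the detection described in this thread.
function looksLikeSelect(string $sql): bool
{
    $sql = (string) preg_replace('/\/\*(.*?)\*\//u', '', $sql);
    $sql = trim((string) preg_replace('/^\s*--[^\r\n]*/m', '', $sql));

    return stripos($sql, 'select') === 0 || stripos($sql, 'with') === 0;
}

// Each case: [SQL text, expected "is SELECT/WITH" result].
$cases = [
    'mixed comment types'       => ["/* header */\n-- INSERT note\nSELECT * FROM todo", true],
    'consecutive line comments' => ["-- SELECT a\n-- SELECT b\nUPDATE todo SET title = 'x'", false],
    'trailing inline comment'   => ["INSERT INTO todo (id) VALUES ('1') -- SELECT later", false],
    'keyword inside a comment'  => ["-- DELETE everything?\nWITH t AS (SELECT 1) SELECT * FROM t", true],
];

foreach ($cases as $name => [$sql, $expected]) {
    $actual = looksLikeSelect($sql);
    printf("%-26s %s\n", $name, $actual === $expected ? 'ok' : 'MISMATCH');
}
```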
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1be36f7 and 13e2abe.

📒 Files selected for processing (7)
  • src/SqlQuery.php (2 hunks)
  • tests/SqlCommentTest.php (1 hunks)
  • tests/sql/create_promise.sql (1 hunks)
  • tests/sql/create_todo.sql (1 hunks)
  • tests/sql/todo_insert_with_select_comment.sql (1 hunks)
  • tests/sql/todo_select_with_leading_comment.sql (1 hunks)
  • tests/sql/todo_update_with_insert_comment.sql (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/SqlCommentTest.php (2)
src/SqlQuery.php (2)
  • SqlQuery (36-201)
  • perform (107-136)
src/PerformTemplatedSql.php (1)
  • PerformTemplatedSql (14-30)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Sourcery review
🔇 Additional comments (10)
src/SqlQuery.php (2)

39-39: LGTM! Line comment regex is correct.

The LINE_COMMENT pattern correctly matches SQL line comments (-- followed by any characters until end of line). The regex appropriately uses [^\r\n]* to stop at line breaks.

Note: This doesn't handle the edge case of -- appearing inside string literals (e.g., INSERT INTO foo VALUES ('text--here')), but properly handling this would require full SQL parsing and is likely acceptable for the query type detection use case.
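
To make the string-literal edge case concrete, here is a short sketch. It assumes the pattern under review is `/--[^\r\n]*/`, as stated in the follow-up commit message in this thread.

```php
<?php

declare(strict_types=1);

// Pattern under review at this point: matches "--" anywhere on a line.
$lineComment = '/--[^\r\n]*/';

$sql = "INSERT INTO foo VALUES ('text--here')";
$stripped = (string) preg_replace($lineComment, '', $sql);

// The string literal is cut off at the dashes.
echo $stripped, PHP_EOL; // INSERT INTO foo VALUES ('text
```

The leading keyword is still `INSERT`, so a prefix-based type check is unaffected here; the damage only matters if the stripped text is used for anything beyond detection.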


127-130: Regex pattern confirmed to lack s modifier, but no multi-line comments found in codebase.

The review's technical concern is valid: the C_STYLE_COMMENT regex /\/\*(.*?)\*\//u at line 38 does indeed lack the s (DOTALL) modifier, which means . cannot match newlines in PHP PCRE. However, verification shows no multi-line SQL comments spanning newlines exist in the codebase. The current implementation correctly handles all comments in the actual test SQL files. The code changes are functionally correct for the existing use cases.
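
The effect of the missing `s` modifier can be checked with a short sketch. The first pattern is the one quoted above; the `us` variant is a hypothetical alternative, not something this PR adds.

```php
<?php

declare(strict_types=1);

$withoutS = '/\/\*(.*?)\*\//u';  // pattern quoted above
$withS    = '/\/\*(.*?)\*\//us'; // hypothetical variant with DOTALL enabled

$sql = "/* multi\n   line */ SELECT 1";

// Without "s", "." cannot cross the newline, so nothing is removed.
$unchanged = (string) preg_replace($withoutS, '', $sql);
var_dump($unchanged === $sql); // bool(true)

// With "s", the whole block comment is stripped.
$cleaned = trim((string) preg_replace($withS, '', $sql));
var_dump($cleaned); // string(8) "SELECT 1"
```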

tests/sql/todo_select_with_leading_comment.sql (1)

1-3: LGTM! Clear test case for SELECT with leading comments.

This test SQL correctly validates that line comments preceding a SELECT statement don't interfere with query type detection.

tests/sql/create_todo.sql (1)

1-5: LGTM! Test schema is appropriate.

The todo table schema correctly supports the comment-handling tests with simple id and title TEXT columns.

tests/sql/todo_insert_with_select_comment.sql (1)

1-3: LGTM! Critical test case for the fix.

This test SQL directly validates the core bug fix: ensuring an INSERT statement is not misidentified as SELECT when "SELECT" appears in a line comment. This matches the PR objective of fixing cases where "-- SELECT for xxx" caused incorrect query type detection.

tests/sql/todo_update_with_insert_comment.sql (1)

1-2: LGTM! Validates UPDATE detection with misleading comment.

This test SQL ensures UPDATE statements are correctly identified even when comments contain misleading keywords like "INSERT".

tests/SqlCommentTest.php (3)

25-41: LGTM! Well-structured test setup.

The setUp() method correctly initializes the test environment with an in-memory SQLite database, loads the schema, seeds test data, and constructs the SqlQuery instance with all necessary dependencies. The use of PerformTemplatedSql with a simple template ('{{ sql }}') is appropriate for these tests.


43-52: LGTM! Critical test validates the core fix.

This test correctly validates that an INSERT statement is not misidentified as SELECT when "SELECT" appears in a line comment - which is the exact issue described in the PR objectives. The test both executes the INSERT and verifies it succeeded by querying the inserted row.


54-58: LGTM! Validates SELECT with leading comments.

This test confirms that SELECT statements work correctly when preceded by line comments, ensuring the comment-stripping logic doesn't break legitimate SELECT queries.

tests/sql/create_promise.sql (1)

1-6: The create_promise.sql file is actively referenced in the test suite.

The file is used in at least two test modules:

  • tests/DbQueryModuleTest.php:93
  • tests/SqlQueryTest.php:39

Both reference it via file_get_contents($sqlDir . '/create_promise.sql'), confirming the file is needed and should be retained.

Likely an incorrect or invalid review comment.

- Tighten LINE_COMMENT regex to only match -- at line start
  (after optional whitespace) to avoid stripping -- inside string
  literals. Changed from '/--[^\r\n]*/' to '/^\s*--[^\r\n]*/m'

- Extract comment removal logic to dedicated private method
  removeCommentsForDetection() for better reusability and testability

- Add test case for -- inside string literals to verify the regex
  improvement works correctly

All 86 tests pass.
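
A brief sketch of what the tightened pattern does, using the regex from the commit message above. The SQL text is made up for illustration.

```php
<?php

declare(strict_types=1);

// Tightened pattern: "--" only counts at the start of a line, after optional whitespace.
$lineComment = '/^\s*--[^\r\n]*/m';

$sql = "  -- INSERT helper\nUPDATE todo SET title = 'a--b' WHERE id = '1'";
$stripped = trim((string) preg_replace($lineComment, '', $sql));

// The indented leading comment is removed; the dashes inside the literal survive.
echo $stripped, PHP_EOL; // UPDATE todo SET title = 'a--b' WHERE id = '1'
```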
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (3)
src/SqlQuery.php (2)

39-39: Query-type detection now correctly ignores leading line comments

Using removeCommentsForDetection() before stripos() fixes the misclassification when queries are prefixed with -- ... lines, while leaving the executed SQL untouched. The LINE_COMMENT pattern is appropriately restricted to comments at the beginning of lines, so -- inside expressions or string literals is not stripped. This aligns well with the stated PR objective.

If, in future, you need to support inline trailing comments like SELECT 1 -- comment, you could expand the line-comment handling accordingly, but it's reasonable to defer until there's a concrete need.

Also applies to: 125-129
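
As a quick illustration of the trailing-comment case, the sketch below assumes the tightened pattern `/^\s*--[^\r\n]*/m` and a prefix check for the keyword.

```php
<?php

declare(strict_types=1);

$lineComment = '/^\s*--[^\r\n]*/m';

$sql = 'SELECT 1 -- comment';
$stripped = trim((string) preg_replace($lineComment, '', $sql));

// The trailing comment is left in place because "--" is not at a line start.
var_dump($stripped === $sql); // bool(true)

// Detection still works, since the statement keyword comes first.
var_dump(stripos($stripped, 'select') === 0); // bool(true)
```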


147-162: Comment-stripping helper works as intended; consider multiline C-style support

removeCommentsForDetection() cleanly centralizes comment removal for detection and keeps the execution SQL unchanged, which is a nice separation of concerns. One limitation (pre-existing, but now encapsulated here) is that C_STYLE_COMMENT does not span newlines, so /* ... */ comments that cross lines will not be removed. Not a blocker, but if your SQL templates ever use multiline block comments ahead of the statement, you may want to add an s flag or otherwise handle that case.

tests/SqlCommentTest.php (1)

18-77: Good, focused coverage of the comment-handling edge cases

The test setup and the four scenarios nicely pin down the intended behavior: leading -- comments no longer affect type detection, and -- inside string literals remains untouched. This should prevent regressions around this bug. If you ever expand the detection logic (e.g., to handle inline trailing comments or multiline block comments), adding corresponding fixtures here will keep the behavior well specified.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 13e2abe and 8148456.

📒 Files selected for processing (3)
  • src/SqlQuery.php (3 hunks)
  • tests/SqlCommentTest.php (1 hunks)
  • tests/sql/todo_with_dashes_in_string.sql (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • tests/sql/todo_with_dashes_in_string.sql
🧰 Additional context used
🧬 Code graph analysis (1)
tests/SqlCommentTest.php (3)
src/SqlQuery.php (2)
  • SqlQuery (36-216)
  • perform (107-134)
src/MediaQueryLogger.php (1)
  • MediaQueryLogger (20-53)
src/PerformTemplatedSql.php (1)
  • PerformTemplatedSql (14-30)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Sourcery review

Sort use statements alphabetically to comply with coding standards.
Member

koriym commented Nov 23, 2025

@coderabbitai review

Contributor

coderabbitai bot commented Nov 23, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@koriym koriym merged commit 606a67e into ray-di:1.x Nov 23, 2025
15 checks passed
koriym added a commit that referenced this pull request Nov 23, 2025
Document fixes in version 1.0.1:
- Line comment handling in query type detection (#76)
- POSIX-compliant trailing newlines in SQL files

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

3 participants