-
Notifications
You must be signed in to change notification settings - Fork 8
Fix line comment handling in query type detection #76
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix line comment handling in query type detection #76
Conversation
Fixed an issue where line comments (--) in SQL files caused incorrect query type detection. For example, an INSERT statement with a comment like "-- SELECT for xxx" at the beginning would be incorrectly identified as a SELECT query. Changes: - Add LINE_COMMENT regex pattern to remove -- style comments - Strip both /* */ and -- comments during query type detection only - Preserve comments when sending queries to the database (important for MySQL performance hints and annotations) - Add comprehensive tests for comment handling edge cases Queries sent to the database remain unchanged (comments preserved), while query type detection (SELECT vs INSERT/UPDATE/DELETE) now correctly ignores comment content.
Reviewer's GuideAdds line-comment stripping to SQL query type detection while preserving original SQL (including comments) for execution, and introduces tests to cover comment-related edge cases and fixes minor test schema mix-ups. Sequence diagram for SQL query type detection with comment strippingsequenceDiagram
participant "Caller" as Caller
participant "SqlQuery" as SqlQuery
participant "PDO" as PDO
"Caller"->>"SqlQuery": "perform(sqlId, values, fetch)"
activate "SqlQuery"
"SqlQuery"->>"PDO": "prepare(sqlWithComments)"
"PDO"-->>"SqlQuery": "PDOStatement (queryString = sqlWithComments)"
note over "SqlQuery": "Assign lastQuery from PDOStatement->queryString (comments preserved)"
"SqlQuery"->>"SqlQuery": "queryForDetection = preg_replace(C_STYLE_COMMENT, '', lastQuery)"
"SqlQuery"->>"SqlQuery": "queryForDetection = preg_replace(LINE_COMMENT, '', queryForDetection)"
"SqlQuery"->>"SqlQuery": "queryForDetection = trim(queryForDetection)"
"SqlQuery"->>"SqlQuery": "isSelect = startsWithIgnoreCase(queryForDetection, 'select' or 'with')"
alt "isSelect is true"
"SqlQuery"->>"SqlQuery": "result = fetchAll(PDOStatement, fetch)"
else "isSelect is false"
"SqlQuery"->>"SqlQuery": "result = [] (no fetch, non-SELECT query)"
end
"SqlQuery"->>"SqlQuery": "logger->log(sqlId, values)"
"SqlQuery"-->>"Caller": "result (execution used original SQL with comments)"
deactivate "SqlQuery"
Updated class diagram for SqlQuery comment-aware type detectionclassDiagram
class SqlQuery {
<<final>>
+C_STYLE_COMMENT : string
+LINE_COMMENT : string
-pdoStatement : PDOStatement | null
+perform(sqlId : string, values : array, fetch : FetchInterface | null) : array
}
class FetchInterface {
<<interface>>
}
class PDOStatement
SqlQuery ..> FetchInterface : "uses for result fetching"
SqlQuery ..> PDOStatement : "holds reference to"
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
WalkthroughStrips C-style and line-style ( Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Caller
participant SqlQuery
participant DB as Database
Caller->>SqlQuery: execute(rawSql, params)
Note right of SqlQuery `#f0f4c3`: Detection phase uses cleaned SQL
SqlQuery->>SqlQuery: cleaned = removeCommentsForDetection(rawSql)
SqlQuery->>SqlQuery: type = classify(cleaned) -- (SELECT/WITH vs others)
alt classified as SELECT/WITH
SqlQuery->>DB: query(rawSql, params) -- use original SQL for execution
else non-SELECT
SqlQuery->>DB: execute(rawSql, params)
end
DB-->>SqlQuery: result
SqlQuery-->>Caller: result
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes - here's some feedback:
- The LINE_COMMENT regex will strip any occurrence of
--regardless of context (including inside string literals or mid-line expressions), so consider tightening it (e.g., anchoring to line starts or whitespace) or clarifying that this heuristic is acceptable for your use cases. - The logic for stripping C-style and line comments for query type detection is embedded directly in perform(); consider extracting it into a dedicated private method so the behavior is easier to reuse and unit test in isolation.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The LINE_COMMENT regex will strip any occurrence of `--` regardless of context (including inside string literals or mid-line expressions), so consider tightening it (e.g., anchoring to line starts or whitespace) or clarifying that this heuristic is acceptable for your use cases.
- The logic for stripping C-style and line comments for query type detection is embedded directly in perform(); consider extracting it into a dedicated private method so the behavior is easier to reuse and unit test in isolation.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
tests/SqlCommentTest.php (1)
60-68: LGTM! Validates UPDATE with misleading comment.This test ensures UPDATE statements are correctly identified even when comments contain misleading keywords. The verification step confirms the update was actually executed.
Consider adding tests for additional edge cases (though current coverage is sufficient for the PR objectives):
- Mixed comment types: C-style and line comments in the same query
- Comments at different positions (inline, trailing)
- Multiple consecutive line comments
- Comments with special characters or SQL keywords
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
src/SqlQuery.php(2 hunks)tests/SqlCommentTest.php(1 hunks)tests/sql/create_promise.sql(1 hunks)tests/sql/create_todo.sql(1 hunks)tests/sql/todo_insert_with_select_comment.sql(1 hunks)tests/sql/todo_select_with_leading_comment.sql(1 hunks)tests/sql/todo_update_with_insert_comment.sql(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/SqlCommentTest.php (2)
src/SqlQuery.php (2)
SqlQuery(36-201)perform(107-136)src/PerformTemplatedSql.php (1)
PerformTemplatedSql(14-30)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Sourcery review
🔇 Additional comments (10)
src/SqlQuery.php (2)
39-39: LGTM! Line comment regex is correct.The
LINE_COMMENTpattern correctly matches SQL line comments (--followed by any characters until end of line). The regex appropriately uses[^\r\n]*to stop at line breaks.Note: This doesn't handle the edge case of
--appearing inside string literals (e.g.,INSERT INTO foo VALUES ('text--here')), but properly handling this would require full SQL parsing and is likely acceptable for the query type detection use case.
127-130: Regex pattern confirmed to lacksmodifier, but no multi-line comments found in codebase.The review's technical concern is valid: the C_STYLE_COMMENT regex
/\/\*(.*?)\*\//uat line 38 does indeed lack thes(DOTALL) modifier, which means.cannot match newlines in PHP PCRE. However, verification shows no multi-line SQL comments spanning newlines exist in the codebase. The current implementation correctly handles all comments in the actual test SQL files. The code changes are functionally correct for the existing use cases.tests/sql/todo_select_with_leading_comment.sql (1)
1-3: LGTM! Clear test case for SELECT with leading comments.This test SQL correctly validates that line comments preceding a SELECT statement don't interfere with query type detection.
tests/sql/create_todo.sql (1)
1-5: LGTM! Test schema is appropriate.The
todotable schema correctly supports the comment-handling tests with simple id and title TEXT columns.tests/sql/todo_insert_with_select_comment.sql (1)
1-3: LGTM! Critical test case for the fix.This test SQL directly validates the core bug fix: ensuring an INSERT statement is not misidentified as SELECT when "SELECT" appears in a line comment. This matches the PR objective of fixing cases where "-- SELECT for xxx" caused incorrect query type detection.
tests/sql/todo_update_with_insert_comment.sql (1)
1-2: LGTM! Validates UPDATE detection with misleading comment.This test SQL ensures UPDATE statements are correctly identified even when comments contain misleading keywords like "INSERT".
tests/SqlCommentTest.php (3)
25-41: LGTM! Well-structured test setup.The
setUp()method correctly initializes the test environment with an in-memory SQLite database, loads the schema, seeds test data, and constructs theSqlQueryinstance with all necessary dependencies. The use ofPerformTemplatedSqlwith a simple template ('{{ sql }}') is appropriate for these tests.
43-52: LGTM! Critical test validates the core fix.This test correctly validates that an INSERT statement is not misidentified as SELECT when "SELECT" appears in a line comment - which is the exact issue described in the PR objectives. The test both executes the INSERT and verifies it succeeded by querying the inserted row.
54-58: LGTM! Validates SELECT with leading comments.This test confirms that SELECT statements work correctly when preceded by line comments, ensuring the comment-stripping logic doesn't break legitimate SELECT queries.
tests/sql/create_promise.sql (1)
1-6: Thecreate_promise.sqlfile is actively referenced in the test suite.The file is used in at least two test modules:
tests/DbQueryModuleTest.php:93tests/SqlQueryTest.php:39Both reference it via
file_get_contents($sqlDir . '/create_promise.sql'), confirming the file is needed and should be retained.Likely an incorrect or invalid review comment.
- Tighten LINE_COMMENT regex to only match -- at line start (after optional whitespace) to avoid stripping -- inside string literals. Changed from '/--[^\r\n]*/' to '/^\s*--[^\r\n]*/m' - Extract comment removal logic to dedicated private method removeCommentsForDetection() for better reusability and testability - Add test case for -- inside string literals to verify the regex improvement works correctly All 86 tests pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
src/SqlQuery.php (2)
39-39: Query-type detection now correctly ignores leading line commentsUsing
removeCommentsForDetection()beforestripos()fixes the misclassification when queries are prefixed with-- ...lines, while leaving the executed SQL untouched. TheLINE_COMMENTpattern is appropriately restricted to comments at the beginning of lines, so--inside expressions or string literals is not stripped. This aligns well with the stated PR objective.If, in future, you need to support inline trailing comments like
SELECT 1 -- comment, you could expand the line-comment handling accordingly, but it's reasonable to defer until there's a concrete need.Also applies to: 125-129
147-162: Comment-stripping helper works as intended; consider multiline C-style support
removeCommentsForDetection()cleanly centralizes comment removal for detection and keeps the execution SQL unchanged, which is a nice separation of concerns. One limitation (pre-existing, but now encapsulated here) is thatC_STYLE_COMMENTdoes not span newlines, so/* ... */comments that cross lines will not be removed. Not a blocker, but if your SQL templates ever use multiline block comments ahead of the statement, you may want to add ansflag or otherwise handle that case.tests/SqlCommentTest.php (1)
18-77: Good, focused coverage of the comment-handling edge casesThe test setup and the four scenarios nicely pin down the intended behavior: leading
--comments no longer affect type detection, and--inside string literals remains untouched. This should prevent regressions around this bug. If you ever expand the detection logic (e.g., to handle inline trailing comments or multiline block comments), adding corresponding fixtures here will keep the behavior well specified.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/SqlQuery.php(3 hunks)tests/SqlCommentTest.php(1 hunks)tests/sql/todo_with_dashes_in_string.sql(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- tests/sql/todo_with_dashes_in_string.sql
🧰 Additional context used
🧬 Code graph analysis (1)
tests/SqlCommentTest.php (3)
src/SqlQuery.php (2)
SqlQuery(36-216)perform(107-134)src/MediaQueryLogger.php (1)
MediaQueryLogger(20-53)src/PerformTemplatedSql.php (1)
PerformTemplatedSql(14-30)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Sourcery review
Sort use statements alphabetically to comply with coding standards.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
Document fixes in version 1.0.1: - Line comment handling in query type detection (#76) - POSIX-compliant trailing newlines in SQL files 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Fixed an issue where line comments (--) in SQL files caused incorrect query type detection. For example, an INSERT statement with a comment like "-- SELECT for xxx" at the beginning would be incorrectly identified as a SELECT query.
Changes:
Queries sent to the database remain unchanged (comments preserved), while query type detection (SELECT vs INSERT/UPDATE/DELETE) now correctly ignores comment content.
Summary by Sourcery
Improve SQL query type detection by ignoring comments while preserving them for execution.
Bug Fixes:
Enhancements:
Tests:
Summary by CodeRabbit
Bug Fixes
Tests
✏️ Tip: You can customize this high-level summary in your review settings.