Add rule parser#38
Conversation
Reviewer's GuideThis PR extends the parser to recognize Datalog-style rules by collecting rule spans during token parsing, integrating them into the CST builder, exposing a new Class diagram for new Rule AST type and Root::rules methodclassDiagram
class Root {
+Vec<Rule> rules()
}
class Rule {
+syntax: SyntaxNode<DdlogLanguage>
+syntax() SyntaxNode<DdlogLanguage>
}
Root --> "*" Rule : contains
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
|
Warning Rate limit exceeded@leynos has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 7 minutes and 7 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (1)
Summary by CodeRabbit
Summary by CodeRabbit
WalkthroughThe parser was extended to support rule declarations, including identifying their spans, representing them in the CST and typed AST, and exposing them via new methods. Documentation was updated to show the Haskell parser code for rules, and new tests were added to verify rule parsing and round-trip printing. Changes
Sequence Diagram(s)sequenceDiagram
participant Source as Source Code
participant Lexer as Lexer
participant Parser as Parser
participant CST as CST Builder
participant AST as Typed AST
Source->>Lexer: Tokenise input
Lexer->>Parser: Provide tokens
Parser->>Parser: collect_rule_spans()
Parser->>CST: build_green_tree(..., rules)
CST->>AST: Wrap rule nodes as Rule structs
AST->>Parser: rules() returns Rule nodes
Possibly related PRs
Poem
✨ Finishing Touches
🧪 Generate Unit Tests
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Hey @leynos - I've reviewed your changes - here's some feedback:
- Consider refactoring the atom/literal and whitespace parser combinators into shared utilities to reduce duplication between rule parsing and existing relation/index logic.
- You may want to extend the new Rule AST type with methods to directly extract the head atom and body literals for easier downstream processing.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider refactoring the atom/literal and whitespace parser combinators into shared utilities to reduce duplication between rule parsing and existing relation/index logic.
- You may want to extend the new Rule AST type with methods to directly extract the head atom and body literals for easier downstream processing.
## Individual Comments
### Comment 1
<location> `tests/parser.rs:534` </location>
<code_context>
+ "SystemAlert(\"System is now online.\")."
+}
+
+#[rstest]
+fn simple_rule_parsed(simple_rule: &str) {
+ let parsed = parse(simple_rule);
+ assert!(parsed.errors().is_empty());
</code_context>
<issue_to_address>
Missing tests for invalid or malformed rule declarations.
Please add tests to verify the parser correctly returns errors for invalid or malformed rule declarations, such as missing components or incorrect syntax.
</issue_to_address>
<suggested_fix>
<<<<<<< SEARCH
#[rstest]
fn simple_rule_parsed(simple_rule: &str) {
let parsed = parse(simple_rule);
assert!(parsed.errors().is_empty());
let rules = parsed.root().rules();
assert_eq!(rules.len(), 1);
let Some(rule) = rules.first() else {
panic!("rule missing");
};
assert_eq!(pretty_print(rule.syntax()), simple_rule);
}
=======
#[rstest]
fn simple_rule_parsed(simple_rule: &str) {
let parsed = parse(simple_rule);
assert!(parsed.errors().is_empty());
let rules = parsed.root().rules();
assert_eq!(rules.len(), 1);
let Some(rule) = rules.first() else {
panic!("rule missing");
};
assert_eq!(pretty_print(rule.syntax()), simple_rule);
}
#[test]
fn invalid_rule_missing_head() {
let input = ":- User(user_id, username, _).";
let parsed = parse(input);
assert!(!parsed.errors().is_empty(), "Expected errors for missing head");
}
#[test]
fn invalid_rule_missing_body() {
let input = "UserLogin(username, session_id) :- .";
let parsed = parse(input);
assert!(!parsed.errors().is_empty(), "Expected errors for missing body");
}
#[test]
fn invalid_rule_no_colon_dash() {
let input = "UserLogin(username, session_id) User(user_id, username, _).";
let parsed = parse(input);
assert!(!parsed.errors().is_empty(), "Expected errors for missing ':-'");
}
#[test]
fn invalid_rule_missing_period() {
let input = "UserLogin(username, session_id) :- User(user_id, username, _)";
let parsed = parse(input);
assert!(!parsed.errors().is_empty(), "Expected errors for missing period at end");
}
#[test]
fn invalid_rule_garbage() {
let input = "This is not a rule!";
let parsed = parse(input);
assert!(!parsed.errors().is_empty(), "Expected errors for completely invalid input");
}
>>>>>>> REPLACE
</suggested_fix>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
tests/parser.rs (1)
570-618: Comprehensive error case coverage.These tests properly verify that the parser reports errors for various invalid rule forms, addressing the need for negative test cases. The coverage includes all major error scenarios.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
src/parser/mod.rs(11 hunks)tests/parser.rs(1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
`**/*.rs`: Document public APIs using Rustdoc comments (`///`) so documentation ...
**/*.rs: Document public APIs using Rustdoc comments (///) so documentation can be generated with cargo doc.
Every module must begin with a module level (//!) comment explaining the module's purpose and utility.
Place function attributes after doc comments.
Do not usereturnin single-line functions.
Use predicate functions for conditional criteria with more than two branches.
Lints must not be silenced except as a last resort.
Lint rule suppressions must be tightly scoped and include a clear reason.
Preferexpectoverallow.
Prefer.expect()over.unwrap().
Prefer immutable data and avoid unnecessarymutbindings.
Handle errors with theResulttype instead of panicking where feasible.
Avoidunsafecode unless absolutely necessary and document any usage clearly.
Use explicit version ranges inCargo.tomland keep dependencies up-to-date.
Userstestfixtures for shared setup.
Replace duplicated tests with#[rstest(...)]parameterised cases.
Prefermockallfor mocks/stubs.
Clippy warnings MUST be disallowed.
Fix any warnings emitted during tests in the code itself rather than silencing them.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error consider usingArcto reduce the amount of data returned.
Write unit and behavioural tests for new functionality. Run both before and after making any change.
📄 Source: CodeRabbit Inference Engine (AGENTS.md)
List of files the instruction was applied to:
tests/parser.rssrc/parser/mod.rs
`**/*.rs`: * Seek to keep the cyclomatic complexity of functions no more than 12...
**/*.rs: * Seek to keep the cyclomatic complexity of functions no more than 12.
Adhere to single responsibility and CQRS
Place function attributes after doc comments.
Do not use
returnin single-line functions.Move conditionals with >2 branches into a predicate function.
Avoid
unsafeunless absolutely necessary.Every module must begin with a
//!doc comment that explains the module's purpose and utility.Comments must use en-GB-oxendict spelling and grammar.
Lints must not be silenced except as a last resort.
#[allow]is forbidden.- Only narrowly scoped
#[expect(lint, reason = "...")]is allowed.- No lint groups, no blanket or file-wide suppression.
- Include
FIXME:with link if a fix is expected.Use
rstestfixtures for shared setup and to avoid repetition between tests.Replace duplicated tests with
#[rstest(...)]parameterised cases.Prefer
mockallfor mocks/stubs.Prefer
.expect()over.unwrap()Ensure that any API or behavioural changes are reflected in the documentation in
docs/Ensure that any completed roadmap steps are recorded in the appropriate roadmap in
docs/
⚙️ Source: CodeRabbit Configuration File
List of files the instruction was applied to:
tests/parser.rssrc/parser/mod.rs
🧬 Code Graph Analysis (1)
tests/parser.rs (1)
src/parser/mod.rs (10)
parse(151-172)errors(140-142)rules(954-960)syntax(882-884)syntax(972-974)syntax(979-993)syntax(1021-1023)syntax(1098-1100)syntax(1287-1289)syntax(1387-1389)
🔇 Additional comments (8)
src/parser/mod.rs (6)
76-88: LGTM!The
ident()parser correctly handles identifiers with optional whitespace padding, following the established combinator pattern.
90-115: Well-structured atom parser implementation.The parser correctly handles rule atoms with optional argument lists, properly consuming all tokens between parentheses including nested structures.
252-274: Good refactoring to use the newident()parser.The replacement of inline whitespace handling with the
ident()parser reduces code duplication whilst maintaining the same parsing behaviour.
602-626: Well-implemented rule declaration parser.The parser correctly handles the Datalog rule syntax
Head :- Body.with optional body, comma-separated literals, and proper whitespace handling.
952-960: Consistent implementation of rule collection method.The
rules()method follows the established pattern for collecting typed AST nodes, properly filtering byN_RULEkind.
1378-1456: Robust implementation of Rule AST type.The
Rulestruct and its methods correctly extract the head atom and body literals from the syntax tree, with proper handling of delimiters and whitespace.tests/parser.rs (2)
519-533: Good test fixture coverage for rule variations.The fixtures appropriately cover simple rules, multi-literal rules, and facts, providing a solid foundation for testing the rule parser.
534-568: Well-structured tests for valid rule parsing.The tests properly verify successful parsing, correct rule count, and round-trip preservation, following the established testing patterns.
Summary
Root::ruleswith newRuleAST typebuild_green_treeTesting
make fmtmake lintmake testmake markdownlinthttps://chatgpt.com/codex/tasks/task_e_6861d1d0dc548322bc8413fbd7bcfff1
Summary by Sourcery
Implement full support for parsing Datalog rules by collecting rule spans, integrating them into the CST and AST, and verifying correctness through new tests and documentation updates.
New Features:
Documentation:
Tests: