Add rule parser by leynos · Pull Request #38 · leynos/ddlint

leynos · 2025-06-30T00:18:15Z

Summary

implement span collection and CST node for rules
expose Root::rules with new Rule AST type
parse rules in build_green_tree
test rule parsing for simple, multi-literal, and fact rules
document rule grammar snippet in parser analysis

Testing

make fmt
make lint
make test
make markdownlint

https://chatgpt.com/codex/tasks/task_e_6861d1d0dc548322bc8413fbd7bcfff1

Summary by Sourcery

Implement full support for parsing Datalog rules by collecting rule spans, integrating them into the CST and AST, and verifying correctness through new tests and documentation updates.

New Features:

Add parsing support for Datalog rule declarations in the parser
Introduce a new Rule AST node and expose Root::rules to retrieve parsed rules

Documentation:

Document the rule grammar snippet in the parser analysis

Tests:

Add unit tests for simple, multi-literal, and fact rule parsing

sourcery-ai · 2025-06-30T00:18:19Z

Reviewer's Guide

This PR extends the parser to recognize Datalog-style rules by collecting rule spans during token parsing, integrating them into the CST builder, exposing a new Rule AST type via Root::rules, and verifying parsing through dedicated tests and documentation updates.

Class diagram for new Rule AST type and Root::rules method

classDiagram
    class Root {
        +Vec<Rule> rules()
    }
    class Rule {
        +syntax: SyntaxNode<DdlogLanguage>
        +syntax() SyntaxNode<DdlogLanguage>
    }
    Root --> "*" Rule : contains
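
The relationship in the diagram can be sketched without any dependencies. This is only an illustration: `SyntaxKind`, `SyntaxNode`, `leaf`, and `sample_root` below are hypothetical stand-ins for the real `SyntaxNode<DdlogLanguage>` tree, and `syntax()` returns a reference here rather than an owned node.

```rust
// Hypothetical stand-ins for the real syntax tree types; only the shape
// (Root holds nodes, Rule wraps one node) follows the diagram.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SyntaxKind {
    Relation,
    Rule,
    Root,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct SyntaxNode {
    kind: SyntaxKind,
    text: String,
    children: Vec<SyntaxNode>,
}

impl SyntaxNode {
    fn leaf(kind: SyntaxKind, text: &str) -> Self {
        Self { kind, text: text.to_owned(), children: Vec::new() }
    }
}

/// Typed view over a node whose kind is `SyntaxKind::Rule`.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Rule {
    syntax: SyntaxNode,
}

impl Rule {
    /// Access the underlying untyped node.
    fn syntax(&self) -> &SyntaxNode {
        &self.syntax
    }
}

struct Root {
    syntax: SyntaxNode,
}

impl Root {
    /// Collect the direct children that are rules, in source order.
    fn rules(&self) -> Vec<Rule> {
        self.syntax
            .children
            .iter()
            .filter(|node| node.kind == SyntaxKind::Rule)
            .cloned()
            .map(|syntax| Rule { syntax })
            .collect()
    }
}

/// Build a small tree by hand: one relation followed by two rules.
fn sample_root() -> Root {
    let children = vec![
        SyntaxNode::leaf(SyntaxKind::Relation, "input relation User(id: u32)"),
        SyntaxNode::leaf(SyntaxKind::Rule, "Active(id) :- User(id)."),
        SyntaxNode::leaf(SyntaxKind::Rule, "Ready()."),
    ];
    Root { syntax: SyntaxNode { kind: SyntaxKind::Root, text: String::new(), children } }
}
```

Calling `sample_root().rules()` yields the two rule nodes and skips the relation, which is the filtering behaviour the guide attributes to `Root::rules`.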

File-Level Changes

Change	Details	Files
Augment parser to collect and build CST nodes for rules	Extend `parse` and `parse_tokens` signatures to include `rule_spans`; introduce `collect_rule_spans` using a custom parser and `SpanCollector` dispatch; update `build_green_tree` to accept, assert, and emit `N_RULE` nodes	`src/parser/mod.rs`
Introduce `Rule` AST type and expose via `Root::rules`	Add `Root::rules` method filtering for `N_RULE` syntax nodes; define `Rule` struct with `syntax()` accessor	`src/parser/mod.rs`
Add comprehensive tests for rule parsing	Create fixtures for simple, multi-literal, and fact rules; write `rstest` cases asserting no errors and correct pretty-printed output	`tests/parser.rs`
Document rule grammar in Haskell parser analysis	Insert concise Haskell snippet showing `rule` production using `Rule` and `dot`	`docs/haskell-parser-analysis.md`
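
To make the span-collection step more concrete, here is a rough, dependency-free sketch of the idea. It is not the PR's implementation, which uses a custom parser with `SpanCollector` dispatch. The `Tok` enum, the ASCII-only `lex` helper, and the "identifier up to the next `.`" heuristic are all assumptions made for illustration.

```rust
use std::ops::Range;

// Hypothetical token kinds; the real lexer has a much richer set.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tok {
    Ident,
    Implies,
    Whitespace,
    Punct(char),
}

/// Tiny ASCII-only lexer that records a byte span for every token.
fn lex(src: &str) -> Vec<(Tok, Range<usize>)> {
    let bytes = src.as_bytes();
    let mut out = Vec::new();
    let mut i = 0;
    while i < bytes.len() {
        let start = i;
        let b = bytes[i];
        let tok = if b.is_ascii_alphabetic() || b == b'_' {
            while i < bytes.len() && (bytes[i].is_ascii_alphanumeric() || bytes[i] == b'_') {
                i += 1;
            }
            Tok::Ident
        } else if b.is_ascii_whitespace() {
            while i < bytes.len() && bytes[i].is_ascii_whitespace() {
                i += 1;
            }
            Tok::Whitespace
        } else if bytes[i..].starts_with(b":-") {
            i += 2;
            Tok::Implies
        } else {
            i += 1;
            Tok::Punct(b as char)
        };
        out.push((tok, start..i));
    }
    out
}

/// Record one span per rule: from the first identifier to the closing `.`.
/// Simplification: a `.` inside a string literal would end the span early.
fn collect_rule_spans(tokens: &[(Tok, Range<usize>)]) -> Vec<Range<usize>> {
    let mut spans = Vec::new();
    let mut start: Option<usize> = None;
    for (tok, span) in tokens {
        match (start, tok) {
            (None, Tok::Ident) => start = Some(span.start),
            (Some(s), Tok::Punct('.')) => {
                spans.push(s..span.end);
                start = None;
            }
            _ => {}
        }
    }
    spans
}
```

The collected ranges index back into the source text, so a tree builder can later open a rule node at `span.start` and close it at `span.end` while emitting tokens.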


coderabbitai · 2025-06-30T00:18:21Z

Warning

Rate limit exceeded

@leynos has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 7 minutes and 7 seconds before requesting another review.


📥 Commits

Reviewing files that changed from the base of the PR and between cde4dd4 and 7482488.

📒 Files selected for processing (1)

src/parser/mod.rs (11 hunks)

Summary by CodeRabbit

New Features
- Added support for parsing and representing rule declarations in the parser, including extraction of rule heads and body literals.
- Users can now retrieve all rules from parsed files.
Documentation
- Expanded documentation to include detailed examples of rule grammar and Haskell parser code.
Tests
- Introduced comprehensive tests for rule parsing, covering valid and invalid rule formats to ensure correctness.

Summary by CodeRabbit

New Features
- Added support for recognising and representing rule declarations in the parser, including their integration into the syntax tree and typed AST.
Documentation
- Expanded documentation to include the exact Haskell parser code snippet for rule declarations.
Tests
- Introduced new tests and fixtures to verify correct parsing and round-trip printing of rules with various structures.

Walkthrough

The parser was extended to support rule declarations, including identifying their spans, representing them in the CST and typed AST, and exposing them via new methods. Documentation was updated to show the Haskell parser code for rules, and new tests were added to verify rule parsing and round-trip printing.

Changes

File(s)	Change Summary
docs/haskell-parser-analysis.md	Expanded documentation for the `rule` grammar rule by including the Haskell parser code snippet.
src/parser/mod.rs	Added rule declaration support: span collection, CST/AST integration, new `Rule` struct and `rules()` method.
tests/parser.rs	Added fixtures and tests for parsing rules, covering simple, multi-literal, and fact rules with round-trip checks.

Sequence Diagram(s)

sequenceDiagram
    participant Source as Source Code
    participant Lexer as Lexer
    participant Parser as Parser
    participant CST as CST Builder
    participant AST as Typed AST

    Source->>Lexer: Tokenise input
    Lexer->>Parser: Provide tokens
    Parser->>Parser: collect_rule_spans()
    Parser->>CST: build_green_tree(..., rules)
    CST->>AST: Wrap rule nodes as Rule structs
    AST->>Parser: rules() returns Rule nodes

Possibly related PRs

Introduce basic parser skeleton #10: Adds the initial parser skeleton, which this PR extends by supporting rule declarations.
Add index declaration parser #36: Adds parser and CST/AST support for index declarations, similar to this PR's handling of rule declarations.

Poem

A rabbit hopped through parser land,
Now rules are parsed just as planned!
With spans and nodes both green and neat,
And tests to make the round-trip sweet.
In docs and code, the rules now shine—
Another hop, another line!
🐇✨


sourcery-ai

Hey @leynos - I've reviewed your changes - here's some feedback:

Consider refactoring the atom/literal and whitespace parser combinators into shared utilities to reduce duplication between rule parsing and existing relation/index logic.
You may want to extend the new Rule AST type with methods to directly extract the head atom and body literals for easier downstream processing.

Prompt for AI Agents

Please address the comments from this code review:
## Overall Comments
- Consider refactoring the atom/literal and whitespace parser combinators into shared utilities to reduce duplication between rule parsing and existing relation/index logic.
- You may want to extend the new Rule AST type with methods to directly extract the head atom and body literals for easier downstream processing.

## Individual Comments

### Comment 1
<location> `tests/parser.rs:534` </location>
<code_context>
+    "SystemAlert(\"System is now online.\")."
+}
+
+#[rstest]
+fn simple_rule_parsed(simple_rule: &str) {
+    let parsed = parse(simple_rule);
+    assert!(parsed.errors().is_empty());
</code_context>

<issue_to_address>
Missing tests for invalid or malformed rule declarations.

Please add tests to verify the parser correctly returns errors for invalid or malformed rule declarations, such as missing components or incorrect syntax.
</issue_to_address>

<suggested_fix>
<<<<<<< SEARCH
#[rstest]
fn simple_rule_parsed(simple_rule: &str) {
    let parsed = parse(simple_rule);
    assert!(parsed.errors().is_empty());
    let rules = parsed.root().rules();
    assert_eq!(rules.len(), 1);
    let Some(rule) = rules.first() else {
        panic!("rule missing");
    };
    assert_eq!(pretty_print(rule.syntax()), simple_rule);
}
=======
#[rstest]
fn simple_rule_parsed(simple_rule: &str) {
    let parsed = parse(simple_rule);
    assert!(parsed.errors().is_empty());
    let rules = parsed.root().rules();
    assert_eq!(rules.len(), 1);
    let Some(rule) = rules.first() else {
        panic!("rule missing");
    };
    assert_eq!(pretty_print(rule.syntax()), simple_rule);
}

#[test]
fn invalid_rule_missing_head() {
    let input = ":- User(user_id, username, _).";
    let parsed = parse(input);
    assert!(!parsed.errors().is_empty(), "Expected errors for missing head");
}

#[test]
fn invalid_rule_missing_body() {
    let input = "UserLogin(username, session_id) :- .";
    let parsed = parse(input);
    assert!(!parsed.errors().is_empty(), "Expected errors for missing body");
}

#[test]
fn invalid_rule_no_colon_dash() {
    let input = "UserLogin(username, session_id) User(user_id, username, _).";
    let parsed = parse(input);
    assert!(!parsed.errors().is_empty(), "Expected errors for missing ':-'");
}

#[test]
fn invalid_rule_missing_period() {
    let input = "UserLogin(username, session_id) :- User(user_id, username, _)";
    let parsed = parse(input);
    assert!(!parsed.errors().is_empty(), "Expected errors for missing period at end");
}

#[test]
fn invalid_rule_garbage() {
    let input = "This is not a rule!";
    let parsed = parse(input);
    assert!(!parsed.errors().is_empty(), "Expected errors for completely invalid input");
}
>>>>>>> REPLACE

</suggested_fix>


coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

tests/parser.rs (1)

570-618: Comprehensive error case coverage.

These tests properly verify that the parser reports errors for various invalid rule forms, addressing the need for negative test cases. The coverage includes all major error scenarios.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cd74e65 and cde4dd4.

📒 Files selected for processing (2)

src/parser/mod.rs (11 hunks)
tests/parser.rs (1 hunks)

🧰 Additional context used

📓 Path-based instructions (2)

**/*.rs: Document public APIs using Rustdoc comments (///) so documentation can be generated with cargo doc.
Every module must begin with a module level (//!) comment explaining the module's purpose and utility.
Place function attributes after doc comments.
Do not use return in single-line functions.
Use predicate functions for conditional criteria with more than two branches.
Lints must not be silenced except as a last resort.
Lint rule suppressions must be tightly scoped and include a clear reason.
Prefer expect over allow.
Prefer .expect() over .unwrap().
Prefer immutable data and avoid unnecessary mut bindings.
Handle errors with the Result type instead of panicking where feasible.
Avoid unsafe code unless absolutely necessary and document any usage clearly.
Use explicit version ranges in Cargo.toml and keep dependencies up-to-date.
Use rstest fixtures for shared setup.
Replace duplicated tests with #[rstest(...)] parameterised cases.
Prefer mockall for mocks/stubs.
Clippy warnings MUST be disallowed.
Fix any warnings emitted during tests in the code itself rather than silencing them.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error consider using Arc to reduce the amount of data returned.
Write unit and behavioural tests for new functionality. Run both before and after making any change.

📄 Source: CodeRabbit Inference Engine (AGENTS.md)

List of files the instruction was applied to:

tests/parser.rs
src/parser/mod.rs

**/*.rs: Seek to keep the cyclomatic complexity of functions no more than 12.

Adhere to single responsibility and CQRS

Place function attributes after doc comments.

Do not use return in single-line functions.

Move conditionals with >2 branches into a predicate function.

Avoid unsafe unless absolutely necessary.

Every module must begin with a //! doc comment that explains the module's purpose and utility.

Comments must use en-GB-oxendict spelling and grammar.

Lints must not be silenced except as a last resort.

#[allow] is forbidden.

Only narrowly scoped #[expect(lint, reason = "...")] is allowed.

No lint groups, no blanket or file-wide suppression.

Include FIXME: with link if a fix is expected.

Use rstest fixtures for shared setup and to avoid repetition between tests.

Replace duplicated tests with #[rstest(...)] parameterised cases.

Prefer mockall for mocks/stubs.

Prefer .expect() over .unwrap()

Ensure that any API or behavioural changes are reflected in the documentation in docs/

Ensure that any completed roadmap steps are recorded in the appropriate roadmap in docs/

⚙️ Source: CodeRabbit Configuration File

List of files the instruction was applied to:

tests/parser.rs
src/parser/mod.rs

🧬 Code Graph Analysis (1)

tests/parser.rs (1)

src/parser/mod.rs (10)

parse (151-172)

errors (140-142)

rules (954-960)

syntax (882-884)

syntax (972-974)

syntax (979-993)

syntax (1021-1023)

syntax (1098-1100)

syntax (1287-1289)

syntax (1387-1389)

🔇 Additional comments (8)

src/parser/mod.rs (6)

76-88: LGTM!

The ident() parser correctly handles identifiers with optional whitespace padding, following the established combinator pattern.

90-115: Well-structured atom parser implementation.

The parser correctly handles rule atoms with optional argument lists, properly consuming all tokens between parentheses including nested structures.
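
The "consume everything between the parentheses, including nested structures" behaviour can be illustrated with a depth counter. This is a sketch over raw ASCII text rather than the token stream the real parser works on, and `atom_len` is a hypothetical helper name.

```rust
/// Return the byte length of the atom at the start of `input`, or `None`
/// if there is no name or the parentheses never balance.
/// Simplification: parentheses inside string literals are not special-cased.
fn atom_len(input: &str) -> Option<usize> {
    let bytes = input.as_bytes();

    // The name: one or more identifier characters.
    let name_len = bytes
        .iter()
        .take_while(|b| b.is_ascii_alphanumeric() || **b == b'_')
        .count();
    if name_len == 0 {
        return None;
    }

    // An atom without an argument list ends right after its name.
    if bytes.get(name_len) != Some(&b'(') {
        return Some(name_len);
    }

    // Track nesting depth until the opening parenthesis is closed again.
    let mut depth = 0usize;
    for (offset, b) in bytes[name_len..].iter().enumerate() {
        match b {
            b'(' => depth += 1,
            b')' => {
                depth -= 1;
                if depth == 0 {
                    return Some(name_len + offset + 1);
                }
            }
            _ => {}
        }
    }
    None
}
```

For `Foo(a, g(b, c)) :- X.` the function stops after the outer closing parenthesis, so the nested `g(b, c)` stays inside the atom.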

252-274: Good refactoring to use the new ident() parser.

The replacement of inline whitespace handling with the ident() parser reduces code duplication whilst maintaining the same parsing behaviour.

602-626: Well-implemented rule declaration parser.

The parser correctly handles the Datalog rule syntax Head :- Body. with optional body, comma-separated literals, and proper whitespace handling.
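
As a rough illustration of the `Head :- Body.` shape described here, the following sketch splits a single rule into its head and body literals using plain string operations. It is not the combinator-based parser from the PR. `RuleParts`, `split_top_level`, and `parse_rule` are hypothetical names, the head is not validated as an atom, and a `:-` inside a string literal would confuse it.

```rust
#[derive(Debug, PartialEq)]
struct RuleParts<'a> {
    head: &'a str,
    body: Vec<&'a str>,
}

/// Split on `sep`, ignoring separators nested inside parentheses.
fn split_top_level(input: &str, sep: char) -> Vec<&str> {
    let mut parts = Vec::new();
    let mut depth = 0usize;
    let mut start = 0;
    for (i, c) in input.char_indices() {
        match c {
            '(' => depth += 1,
            ')' => depth = depth.saturating_sub(1),
            c if c == sep && depth == 0 => {
                parts.push(&input[start..i]);
                start = i + c.len_utf8();
            }
            _ => {}
        }
    }
    parts.push(&input[start..]);
    parts
}

/// Recognise `Head.` (a fact) or `Head :- Lit, Lit, ... .` (a rule).
fn parse_rule(src: &str) -> Result<RuleParts<'_>, String> {
    let text = src
        .trim()
        .strip_suffix('.')
        .ok_or("missing terminating '.'")?;

    // The body is optional: without ":-" the whole text is the head.
    let (head, body) = match text.split_once(":-") {
        Some((h, b)) => (h.trim(), Some(b.trim())),
        None => (text.trim(), None),
    };
    if head.is_empty() {
        return Err("missing head".into());
    }

    let body = match body {
        None => Vec::new(),
        Some(b) => {
            let literals: Vec<&str> =
                split_top_level(b, ',').into_iter().map(str::trim).collect();
            if literals.iter().any(|l| l.is_empty()) {
                return Err("empty body literal".into());
            }
            literals
        }
    };
    Ok(RuleParts { head, body })
}
```

With this sketch a fact such as `SystemAlert("System is now online.").` produces an empty body, while `Head :- .` and `:- Body.` are rejected, mirroring the negative cases suggested in the review.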

952-960: Consistent implementation of rule collection method.

The rules() method follows the established pattern for collecting typed AST nodes, properly filtering by N_RULE kind.

1378-1456: Robust implementation of Rule AST type.

The Rule struct and its methods correctly extract the head atom and body literals from the syntax tree, with proper handling of delimiters and whitespace.

tests/parser.rs (2)

519-533: Good test fixture coverage for rule variations.

The fixtures appropriately cover simple rules, multi-literal rules, and facts, providing a solid foundation for testing the rule parser.

534-568: Well-structured tests for valid rule parsing.

The tests properly verify successful parsing, correct rule count, and round-trip preservation, following the established testing patterns.

Add basic rule parser

cd74e65

leynos added the codex label Jun 30, 2025 — with ChatGPT Codex Connector

sourcery-ai Bot reviewed Jun 30, 2025

Comment thread tests/parser.rs

coderabbitai Bot approved these changes Jun 30, 2025

Refine rule parser and add invalid rule tests

cde4dd4

coderabbitai Bot requested changes Jun 30, 2025

Comment thread src/parser/mod.rs Outdated

Refactor rule span collection

7482488

coderabbitai Bot approved these changes Jun 30, 2025

leynos merged commit bac8511 into main Jul 2, 2025
2 checks passed

leynos deleted the codex/implement-rule-parser-for-ddlog branch July 2, 2025 21:10

This was referenced Jul 3, 2025

Implement function parser #40

Merged

Refactor parser span handling #48

Merged

leynos merged 3 commits into main from codex/implement-rule-parser-for-ddlog
