Add rule parser #38

Merged
leynos merged 3 commits into main from codex/implement-rule-parser-for-ddlog on Jul 2, 2025

Conversation

@leynos (Owner) commented Jun 30, 2025

Summary

  • implement span collection and CST node for rules
  • expose Root::rules with new Rule AST type
  • parse rules in build_green_tree
  • test rule parsing for simple, multi-literal, and fact rules
  • document rule grammar snippet in parser analysis

Testing

  • make fmt
  • make lint
  • make test
  • make markdownlint

https://chatgpt.com/codex/tasks/task_e_6861d1d0dc548322bc8413fbd7bcfff1

Summary by Sourcery

Implement full support for parsing Datalog rules by collecting rule spans, integrating them into the CST and AST, and verifying correctness through new tests and documentation updates.

New Features:

  • Add parsing support for Datalog rule declarations in the parser
  • Introduce a new Rule AST node and expose Root::rules to retrieve parsed rules

Documentation:

  • Document the rule grammar snippet in the parser analysis

Tests:

  • Add unit tests for simple, multi-literal, and fact rule parsing

@sourcery-ai (Bot, Contributor) commented Jun 30, 2025

Reviewer's Guide

This PR extends the parser to recognize Datalog-style rules by collecting rule spans during token parsing, integrating them into the CST builder, exposing a new Rule AST type via Root::rules, and verifying parsing through dedicated tests and documentation updates.

Class diagram for new Rule AST type and Root::rules method

classDiagram
    class Root {
        +Vec<Rule> rules()
    }
    class Rule {
        +syntax: SyntaxNode<DdlogLanguage>
        +syntax() SyntaxNode<DdlogLanguage>
    }
    Root --> "*" Rule : contains
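
The relationship in the class diagram can be sketched without any external crates. This is a minimal illustration, not the PR's implementation: `SyntaxKind`, the simplified `SyntaxNode` struct, and `sample_root` are hypothetical stand-ins for the real `SyntaxNode<DdlogLanguage>` type, and `SyntaxKind::Rule` plays the role of `N_RULE`.

```rust
/// Hypothetical stand-in for the syntax kinds used by the CST.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SyntaxKind {
    Root,
    Rule,     // plays the role of N_RULE
    Relation, // some other top-level item
}

/// Hypothetical, much simplified untyped tree node.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SyntaxNode {
    pub kind: SyntaxKind,
    pub text: String,
    pub children: Vec<SyntaxNode>,
}

/// Typed wrapper over the root node.
pub struct Root {
    syntax: SyntaxNode,
}

/// Typed wrapper over a single rule node.
#[derive(Debug)]
pub struct Rule {
    syntax: SyntaxNode,
}

impl Root {
    pub fn new(syntax: SyntaxNode) -> Self {
        Self { syntax }
    }

    /// Collect every direct child whose kind marks it as a rule.
    pub fn rules(&self) -> Vec<Rule> {
        self.syntax
            .children
            .iter()
            .filter(|child| child.kind == SyntaxKind::Rule)
            .cloned()
            .map(|syntax| Rule { syntax })
            .collect()
    }
}

impl Rule {
    /// Access the underlying untyped node.
    pub fn syntax(&self) -> &SyntaxNode {
        &self.syntax
    }
}

/// Build a tiny tree by hand to exercise the wrappers (illustrative data only).
pub fn sample_root() -> Root {
    let leaf = |kind: SyntaxKind, text: &str| SyntaxNode {
        kind,
        text: text.to_string(),
        children: Vec::new(),
    };
    Root::new(SyntaxNode {
        kind: SyntaxKind::Root,
        text: String::new(),
        children: vec![
            leaf(SyntaxKind::Relation, "input relation B(x: u32)"),
            leaf(SyntaxKind::Rule, "A(x) :- B(x)."),
        ],
    })
}
```

Calling `sample_root().rules()` returns only the node tagged as a rule; the relation node is skipped by the kind filter.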

File-Level Changes

Change: Augment parser to collect and build CST nodes for rules
Files: src/parser/mod.rs
Details:
  • Extend parse and parse_tokens signatures to include rule_spans
  • Introduce collect_rule_spans using a custom parser and SpanCollector dispatch
  • Update build_green_tree to accept, assert, and emit N_RULE nodes

Change: Introduce Rule AST type and expose via Root::rules
Files: src/parser/mod.rs
Details:
  • Add Root::rules method filtering for N_RULE syntax nodes
  • Define Rule struct with syntax() accessor

Change: Add comprehensive tests for rule parsing
Files: tests/parser.rs
Details:
  • Create fixtures for simple, multi-literal, and fact rules
  • Write rstest cases asserting no errors and correct pretty-printed output

Change: Document rule grammar in Haskell parser analysis
Files: docs/haskell-parser-analysis.md
Details:
  • Insert concise Haskell snippet showing rule production using Rule and dot
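
To make the span-collection step concrete, here is a rough sketch of the idea. It is an assumption-laden simplification: the real `collect_rule_spans` works on tokens with a custom parser and `SpanCollector`, whereas this hypothetical version scans raw text and treats every statement ending in a `.` outside a string literal as a rule.

```rust
use std::ops::Range;

/// Hypothetical, text-based approximation of rule span collection.
/// Returns the byte range of every statement terminated by a `.` that
/// is not inside a double-quoted string. Unterminated trailing text
/// yields no span.
pub fn collect_rule_spans(src: &str) -> Vec<Range<usize>> {
    let mut spans = Vec::new();
    let mut start: Option<usize> = None;
    let mut in_string = false;
    let mut escaped = false;

    for (idx, ch) in src.char_indices() {
        // Skip whitespace between statements; the first other
        // character opens a new statement.
        if start.is_none() {
            if ch.is_whitespace() {
                continue;
            }
            start = Some(idx);
        }

        if in_string {
            if escaped {
                escaped = false;
            } else if ch == '\\' {
                escaped = true;
            } else if ch == '"' {
                in_string = false;
            }
            continue;
        }

        match ch {
            '"' => in_string = true,
            '.' => {
                if let Some(begin) = start.take() {
                    // `.` is one byte, so the span ends at idx + 1.
                    spans.push(begin..idx + 1);
                }
            }
            _ => {}
        }
    }
    spans
}
```

The string handling matters for facts such as `SystemAlert("System is now online.").`, where the first period belongs to the literal. The sketch would still split on the dot in a numeric literal such as `1.5`, which is one reason to work on tokens rather than characters.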

@coderabbitai (Bot, Contributor) commented Jun 30, 2025

📥 Commits

Reviewing files that changed from the base of the PR and between cde4dd4 and 7482488.

📒 Files selected for processing (1)
  • src/parser/mod.rs (11 hunks)

Summary by CodeRabbit

  • New Features

    • Added support for parsing and representing rule declarations in the parser, including extraction of rule heads and body literals.
    • Users can now retrieve all rules from parsed files.
  • Documentation

    • Expanded documentation to include detailed examples of rule grammar and Haskell parser code.
  • Tests

    • Introduced comprehensive tests for rule parsing, covering valid and invalid rule formats to ensure correctness.

Summary by CodeRabbit

  • New Features
    • Added support for recognising and representing rule declarations in the parser, including their integration into the syntax tree and typed AST.
  • Documentation
    • Expanded documentation to include the exact Haskell parser code snippet for rule declarations.
  • Tests
    • Introduced new tests and fixtures to verify correct parsing and round-trip printing of rules with various structures.

Walkthrough

The parser was extended to support rule declarations, including identifying their spans, representing them in the CST and typed AST, and exposing them via new methods. Documentation was updated to show the Haskell parser code for rules, and new tests were added to verify rule parsing and round-trip printing.

Changes

| File(s) | Change Summary |
| --- | --- |
| docs/haskell-parser-analysis.md | Expanded the documentation of the grammar rule for rule declarations by including the Haskell parser code snippet. |
| src/parser/mod.rs | Added rule declaration support: span collection, CST/AST integration, new Rule struct and rules() method. |
| tests/parser.rs | Added fixtures and tests for parsing rules, covering simple, multi-literal, and fact rules with round-trip checks. |

Sequence Diagram(s)

sequenceDiagram
    participant Source as Source Code
    participant Lexer as Lexer
    participant Parser as Parser
    participant CST as CST Builder
    participant AST as Typed AST

    Source->>Lexer: Tokenise input
    Lexer->>Parser: Provide tokens
    Parser->>Parser: collect_rule_spans()
    Parser->>CST: build_green_tree(..., rules)
    CST->>AST: Wrap rule nodes as Rule structs
    AST->>Parser: rules() returns Rule nodes
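
The hand-off from collected spans to the tree builder, and the round-trip property the tests rely on, can be sketched as follows. This is a simplified model under stated assumptions: `Element`, `build_tree`, and `to_text` are hypothetical names, and the flat `Vec<Element>` stands in for the green tree that `build_green_tree` actually produces.

```rust
use std::ops::Range;

/// Hypothetical flat stand-in for CST nodes.
#[derive(Debug, PartialEq, Eq)]
pub enum Element {
    /// Text covered by a rule span (the N_RULE case).
    Rule(String),
    /// Everything else, kept verbatim so no input is lost.
    Other(String),
}

/// Wrap each rule span in a `Rule` element and keep the text between
/// spans as `Other`, so that the output covers the whole source.
pub fn build_tree(src: &str, rule_spans: &[Range<usize>]) -> Vec<Element> {
    let mut elements = Vec::new();
    let mut cursor = 0;

    for span in rule_spans {
        // Mirror the "assert" step: spans must be sorted,
        // non-overlapping, and within the source.
        assert!(
            span.start >= cursor && span.end <= src.len(),
            "rule spans must be sorted, non-overlapping, and in bounds"
        );
        if span.start > cursor {
            elements.push(Element::Other(src[cursor..span.start].to_string()));
        }
        elements.push(Element::Rule(src[span.clone()].to_string()));
        cursor = span.end;
    }

    if cursor < src.len() {
        elements.push(Element::Other(src[cursor..].to_string()));
    }
    elements
}

/// Concatenate all elements; for a lossless tree this equals the input.
pub fn to_text(elements: &[Element]) -> String {
    elements
        .iter()
        .map(|element| match element {
            Element::Rule(text) | Element::Other(text) => text.as_str(),
        })
        .collect()
}
```

Because every byte of the source ends up in exactly one element, `to_text(&build_tree(src, spans))` reproduces `src`, which is the same kind of check the round-trip tests perform on real rule nodes.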
Poem

A rabbit hopped through parser land,
Now rules are parsed just as planned!
With spans and nodes both green and neat,
And tests to make the round-trip sweet.
In docs and code, the rules now shine—
Another hop, another line!
🐇✨

@sourcery-ai (Bot, Contributor) left a comment

Hey @leynos - I've reviewed your changes - here's some feedback:

  • Consider refactoring the atom/literal and whitespace parser combinators into shared utilities to reduce duplication between rule parsing and existing relation/index logic.
  • You may want to extend the new Rule AST type with methods to directly extract the head atom and body literals for easier downstream processing.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider refactoring the atom/literal and whitespace parser combinators into shared utilities to reduce duplication between rule parsing and existing relation/index logic.
- You may want to extend the new Rule AST type with methods to directly extract the head atom and body literals for easier downstream processing.

## Individual Comments

### Comment 1
<location> `tests/parser.rs:534` </location>
<code_context>
+    "SystemAlert(\"System is now online.\")."
+}
+
+#[rstest]
+fn simple_rule_parsed(simple_rule: &str) {
+    let parsed = parse(simple_rule);
+    assert!(parsed.errors().is_empty());
</code_context>

<issue_to_address>
Missing tests for invalid or malformed rule declarations.

Please add tests to verify the parser correctly returns errors for invalid or malformed rule declarations, such as missing components or incorrect syntax.
</issue_to_address>

<suggested_fix>
<<<<<<< SEARCH
#[rstest]
fn simple_rule_parsed(simple_rule: &str) {
    let parsed = parse(simple_rule);
    assert!(parsed.errors().is_empty());
    let rules = parsed.root().rules();
    assert_eq!(rules.len(), 1);
    let Some(rule) = rules.first() else {
        panic!("rule missing");
    };
    assert_eq!(pretty_print(rule.syntax()), simple_rule);
}
=======
#[rstest]
fn simple_rule_parsed(simple_rule: &str) {
    let parsed = parse(simple_rule);
    assert!(parsed.errors().is_empty());
    let rules = parsed.root().rules();
    assert_eq!(rules.len(), 1);
    let Some(rule) = rules.first() else {
        panic!("rule missing");
    };
    assert_eq!(pretty_print(rule.syntax()), simple_rule);
}

#[test]
fn invalid_rule_missing_head() {
    let input = ":- User(user_id, username, _).";
    let parsed = parse(input);
    assert!(!parsed.errors().is_empty(), "Expected errors for missing head");
}

#[test]
fn invalid_rule_missing_body() {
    let input = "UserLogin(username, session_id) :- .";
    let parsed = parse(input);
    assert!(!parsed.errors().is_empty(), "Expected errors for missing body");
}

#[test]
fn invalid_rule_no_colon_dash() {
    let input = "UserLogin(username, session_id) User(user_id, username, _).";
    let parsed = parse(input);
    assert!(!parsed.errors().is_empty(), "Expected errors for missing ':-'");
}

#[test]
fn invalid_rule_missing_period() {
    let input = "UserLogin(username, session_id) :- User(user_id, username, _)";
    let parsed = parse(input);
    assert!(!parsed.errors().is_empty(), "Expected errors for missing period at end");
}

#[test]
fn invalid_rule_garbage() {
    let input = "This is not a rule!";
    let parsed = parse(input);
    assert!(!parsed.errors().is_empty(), "Expected errors for completely invalid input");
}
>>>>>>> REPLACE

</suggested_fix>

Comment thread tests/parser.rs
@coderabbitai (Bot, Contributor) left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
tests/parser.rs (1)

570-618: Comprehensive error case coverage.

These tests properly verify that the parser reports errors for various invalid rule forms, addressing the need for negative test cases. The coverage includes all major error scenarios.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cd74e65 and cde4dd4.

📒 Files selected for processing (2)
  • src/parser/mod.rs (11 hunks)
  • tests/parser.rs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
`**/*.rs`: Document public APIs using Rustdoc comments (`///`) so documentation ...

**/*.rs: Document public APIs using Rustdoc comments (///) so documentation can be generated with cargo doc.
Every module must begin with a module level (//!) comment explaining the module's purpose and utility.
Place function attributes after doc comments.
Do not use return in single-line functions.
Use predicate functions for conditional criteria with more than two branches.
Lints must not be silenced except as a last resort.
Lint rule suppressions must be tightly scoped and include a clear reason.
Prefer expect over allow.
Prefer .expect() over .unwrap().
Prefer immutable data and avoid unnecessary mut bindings.
Handle errors with the Result type instead of panicking where feasible.
Avoid unsafe code unless absolutely necessary and document any usage clearly.
Use explicit version ranges in Cargo.toml and keep dependencies up-to-date.
Use rstest fixtures for shared setup.
Replace duplicated tests with #[rstest(...)] parameterised cases.
Prefer mockall for mocks/stubs.
Clippy warnings MUST be disallowed.
Fix any warnings emitted during tests in the code itself rather than silencing them.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error consider using Arc to reduce the amount of data returned.
Write unit and behavioural tests for new functionality. Run both before and after making any change.

📄 Source: CodeRabbit Inference Engine (AGENTS.md)

List of files the instruction was applied to:

  • tests/parser.rs
  • src/parser/mod.rs
`**/*.rs`: * Seek to keep the cyclomatic complexity of functions no more than 12...

**/*.rs:

  • Seek to keep the cyclomatic complexity of functions no more than 12.
  • Adhere to single responsibility and CQRS
  • Place function attributes after doc comments.
  • Do not use return in single-line functions.
  • Move conditionals with >2 branches into a predicate function.
  • Avoid unsafe unless absolutely necessary.
  • Every module must begin with a //! doc comment that explains the module's purpose and utility.
  • Comments must use en-GB-oxendict spelling and grammar.
  • Lints must not be silenced except as a last resort.
    • #[allow] is forbidden.
    • Only narrowly scoped #[expect(lint, reason = "...")] is allowed.
    • No lint groups, no blanket or file-wide suppression.
    • Include FIXME: with link if a fix is expected.
  • Use rstest fixtures for shared setup and to avoid repetition between tests.
  • Replace duplicated tests with #[rstest(...)] parameterised cases.
  • Prefer mockall for mocks/stubs.
  • Prefer .expect() over .unwrap()
  • Ensure that any API or behavioural changes are reflected in the documentation in docs/
  • Ensure that any completed roadmap steps are recorded in the appropriate roadmap in docs/

⚙️ Source: CodeRabbit Configuration File

List of files the instruction was applied to:

  • tests/parser.rs
  • src/parser/mod.rs
🧬 Code Graph Analysis (1)
tests/parser.rs (1)
src/parser/mod.rs (10)
  • parse (151-172)
  • errors (140-142)
  • rules (954-960)
  • syntax (882-884)
  • syntax (972-974)
  • syntax (979-993)
  • syntax (1021-1023)
  • syntax (1098-1100)
  • syntax (1287-1289)
  • syntax (1387-1389)
🔇 Additional comments (8)
src/parser/mod.rs (6)

76-88: LGTM!

The ident() parser correctly handles identifiers with optional whitespace padding, following the established combinator pattern.


90-115: Well-structured atom parser implementation.

The parser correctly handles rule atoms with optional argument lists, properly consuming all tokens between parentheses including nested structures.
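
As an illustration of what consuming everything between balanced parentheses involves, here is a small hand-written sketch. It is not the combinator from the PR: `parse_atom` below is a hypothetical function that works on text rather than tokens, and it does not treat parentheses inside string literals specially.

```rust
/// Hypothetical atom reader: an identifier optionally followed by a
/// parenthesised argument list, with nested parentheses balanced.
/// Returns the atom text and the unconsumed remainder, or `None` if
/// there is no identifier or the parentheses never close.
pub fn parse_atom(input: &str) -> Option<(&str, &str)> {
    let trimmed = input.trim_start();

    // The name must start with a letter or underscore.
    let first = trimmed.chars().next()?;
    if !(first.is_alphabetic() || first == '_') {
        return None;
    }
    let name_len = trimmed
        .find(|c: char| !(c.is_alphanumeric() || c == '_'))
        .unwrap_or(trimmed.len());

    let rest = &trimmed[name_len..];
    if !rest.starts_with('(') {
        // Atom without an argument list.
        return Some((&trimmed[..name_len], rest));
    }

    // Track nesting depth until the opening parenthesis is closed.
    let mut depth = 0usize;
    for (i, c) in rest.char_indices() {
        match c {
            '(' => depth += 1,
            ')' => {
                depth -= 1;
                if depth == 0 {
                    let end = name_len + i + 1;
                    return Some((&trimmed[..end], &trimmed[end..]));
                }
            }
            _ => {}
        }
    }
    None // unbalanced parentheses
}
```

For example, `parse_atom("User(user_id, f(x), _) rest")` yields the atom `User(user_id, f(x), _)` with ` rest` left over, while an unclosed argument list yields `None`.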


252-274: Good refactoring to use the new ident() parser.

The replacement of inline whitespace handling with the ident() parser reduces code duplication whilst maintaining the same parsing behaviour.


602-626: Well-implemented rule declaration parser.

The parser correctly handles the Datalog rule syntax Head :- Body. with optional body, comma-separated literals, and proper whitespace handling.
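
The `Head :- Body.` shape, including the fact form with no body, can be illustrated with a coarse check. This sketch is only an approximation of the grammar: `RuleShapeError` and `check_rule_shape` are hypothetical names, the check looks at text rather than tokens, and it does not validate the atoms themselves or ignore `:-` inside string literals.

```rust
/// Hypothetical error type for the coarse shape check below.
#[derive(Debug, PartialEq, Eq)]
pub enum RuleShapeError {
    MissingPeriod,
    MissingHead,
    MissingBody,
}

/// Check that `src` looks like `Head.` or `Head :- Body.`.
pub fn check_rule_shape(src: &str) -> Result<(), RuleShapeError> {
    // Every rule or fact ends with a period.
    let statement = src
        .trim()
        .strip_suffix('.')
        .ok_or(RuleShapeError::MissingPeriod)?;

    // The body is optional: a statement without `:-` is a fact.
    let (head, body) = match statement.split_once(":-") {
        Some((head, body)) => (head, Some(body)),
        None => (statement, None),
    };

    if head.trim().is_empty() {
        return Err(RuleShapeError::MissingHead);
    }
    // If `:-` is present, at least one literal must follow it.
    if matches!(body, Some(b) if b.trim().is_empty()) {
        return Err(RuleShapeError::MissingBody);
    }
    Ok(())
}
```

A fact such as `SystemAlert("System is now online.").` passes because the missing `:-` simply means there is no body, whereas `Head :- .` is rejected for having an empty one.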


952-960: Consistent implementation of rule collection method.

The rules() method follows the established pattern for collecting typed AST nodes, properly filtering by N_RULE kind.


1378-1456: Robust implementation of Rule AST type.

The Rule struct and its methods correctly extract the head atom and body literals from the syntax tree, with proper handling of delimiters and whitespace.
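
To show why delimiter handling matters when pulling out the head and body, here is a text-based sketch. It assumes nothing about the real `Rule` methods: `RuleText`, `head`, `body_literals`, and `split_top_level` are hypothetical, and the real type walks syntax-tree children instead of slicing strings.

```rust
/// Hypothetical text-backed view of a single rule.
pub struct RuleText<'a> {
    text: &'a str,
}

impl<'a> RuleText<'a> {
    pub fn new(text: &'a str) -> Self {
        Self { text }
    }

    /// The rule without surrounding whitespace or the final period.
    fn statement(&self) -> &'a str {
        let trimmed = self.text.trim();
        trimmed.strip_suffix('.').unwrap_or(trimmed)
    }

    /// Everything before `:-`, or the whole statement for a fact.
    pub fn head(&self) -> &'a str {
        let statement = self.statement();
        statement
            .split_once(":-")
            .map_or(statement, |(head, _)| head)
            .trim()
    }

    /// Body literals split on top-level commas; empty for a fact.
    pub fn body_literals(&self) -> Vec<&'a str> {
        self.statement()
            .split_once(":-")
            .map_or_else(Vec::new, |(_, body)| split_top_level(body))
    }
}

/// Split on commas that are not nested inside parentheses.
fn split_top_level(body: &str) -> Vec<&str> {
    let mut parts = Vec::new();
    let mut depth = 0usize;
    let mut start = 0;

    for (i, c) in body.char_indices() {
        match c {
            '(' => depth += 1,
            ')' => depth = depth.saturating_sub(1),
            ',' if depth == 0 => {
                parts.push(body[start..i].trim());
                start = i + 1;
            }
            _ => {}
        }
    }
    let last = body[start..].trim();
    if !last.is_empty() {
        parts.push(last);
    }
    parts
}
```

The depth counter is what keeps the commas inside `User(user_id, username, _)` from being mistaken for literal separators; only a comma at depth zero ends a literal.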

tests/parser.rs (2)

519-533: Good test fixture coverage for rule variations.

The fixtures appropriately cover simple rules, multi-literal rules, and facts, providing a solid foundation for testing the rule parser.


534-568: Well-structured tests for valid rule parsing.

The tests properly verify successful parsing, correct rule count, and round-trip preservation, following the established testing patterns.

Comment thread src/parser/mod.rs Outdated
@leynos leynos merged commit bac8511 into main Jul 2, 2025
2 checks passed
@leynos leynos deleted the codex/implement-rule-parser-for-ddlog branch July 2, 2025 21:10
This was referenced Jul 3, 2025