Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Documentation on the rule engine #149

Closed
gskorokhod opened this issue Jan 26, 2024 · 9 comments
Closed

[RFC] Documentation on the rule engine #149

gskorokhod opened this issue Jan 26, 2024 · 9 comments
Assignees
Labels
area::conjure-oxide Related to conjure_oxide. kind::documentation Improvements or additions to documentation

Comments

@gskorokhod
Copy link
Contributor

gskorokhod commented Jan 26, 2024

Hi!

I have written some basic documentation on how we're planning to handle rules and rule application to help myself and other people who are just getting into the Rust side of the project and want to work on the rule engine.

Please comment any corrections / anything that I've missed in this issue!


Overview

  • Our starting point is the AST for the Essence code, in JSON format. It is taken from the output of conjure pretty.

  • This AST is parsed into a Model struct. The parsing logic is contained in conjure_oxide/src/parse.rs.

  • The Model struct contains:

    • the symbol table (variables)
    • the constraints that need to be satisfied (constraints) as a Vector<Expression>
  • The Expression type is an enum that represents an expression tree

  • The original expression trees are rewritten into a format that the different solvers can "understand" (e.g. boolean expressions in CNF for SAT solvers). We use a rule engine for that.

  • Conceptually, Rule structs have an application method that takes an expression and either produces a rewritten expression (if the rule is applicable) or an error (if it is not applicable)

  • To rewrite the expressions, we walk the expression tree, find rules that can be applied for each expression, and apply them until there are no more rules to apply.

  • Where multiple rules are applicable, we want:

    • A way to set a higher or lower priority for every given rule
    • An arbitrary function that "decides" which rule to apply
  • When a rule is applied to an Expression in the expression tree, this expression is rewritten and all its ancestor nodes are marked as "dirty" (i.e. we need to check them again to make sure that no new rules have appeared that can be applied)

  • Rules are stored in the RULES_DISTRIBUTED_SLICE object.

  • The #[register_rule] macro is used to register rules from a Rust function. We use it to define rules and add them to the rule table.

  • We call the rewrite function on our Model to recursively apply rules to expressions to simplify them, until there are no more rules to apply

  • After the rewrite is complete, we can convert the resulting model to the specific representation that is required by the solver (e.g. CNF for KissSAT)

The rewrite function

The rewrite function takes a model, gets rules from the RULES_DISTRIBUTED_SLICE, and recursively applies rules to the expression tree until no more rules can be applied

Algorithm

  1. Traverse the expression tree using depth-first search
  2. For every expression:
    • Loop over the list of all rules
    • Try to apply rules to the copy
    • If the Rule::application() function errors, the rule is not applicable
    • Otherwise, save the result of the Rule::application()
  3. Save a list of all rules that are applicable for the current expression
  4. If there is at least one such rule:
    • Use the select function to choose the rule to apply (this may be based on rule priority and other metadata)
    • Replace the original expression with this rule's application result
    • Go back to the root Expression and traverse the tree again (step 1)
  5. If there are no applicable rules, traverse the current expression's subexpressions (step 2)
  6. When there are no more expressions that have applicable rules, the algorithm is complete

Note: we mutate the expression tree in place instead of copying all of it, as this would be inefficient

Optimisation

  • When we visit an Expression in the expression tree, mark it as "clean"
  • When we apply a rule to an expression, mark it and all of its ancestors as "dirty"
  • Only try applying rules to a given node if it is "dirty", otherwise skip it

This works because if we have already found that no rules can be applied to an expression, this will still be the case for as long as this expression doesn't change (so we don't need to retry this check multiple times).
However, if its sub-expressions have been changed, we need to check whether any new rules can now be applied to the expression (and all its parents)

The Rule struct

The Rule struct represents a single rule in the rule engine

#[derive(Clone, Debug)]
pub struct Rule<'a> {
	pub name: &'a str,
	pub application: fn(&Expression) -> Result<Expression, RuleApplicationError>,

Fields

  • name - a human-readable name for the rule
  • application - a function that tries to apply the rule to a given expression, and either returns the rewritten expression or errors

Note

NB: the logic for checking whether a rule is applicable and actually applying it is the same, so, for efficiency reasons, we do not have separate methods for these tasks. Instead, we try to apply the rule right away, and save the result if the rule turns out to be applicable to the given expression.

The register_rule macro

The #[register_rule] macro is a procedural macro that decorates a Rust function, constructs a Rule struct from it, and adds the struct to the centralised rule table (RULES_DISTRIBUTE_SLICE)

Explanation

Essentially, the macro:

  • Takes in Rust tokens that represent the function (can be thought of as a part of the Rust program's AST)
  • Constructs a Rule struct and stores it in GEN_CONJURE_<function identifier>
  • Writes the original function in the rule's application field
  • Writes the function's identifier in the rule's name field
  • Appends this struct to the rule table

Example

#[register_rule]
fn my_rule(expr: &Expression) -> Result<Expression, RuleApplicationError> {
	// Rule application  logic here
}

Would produce:

Rule {
	name: "my_rule",
	application: (expr: &Expression) -> Result<Expression, RuleApplicationError> { ... }
}

And append it to RULES_DISTRIBUTED_SLICE

The centralised rule table (RULES_DISTRIBUTED_SLICE)

RULES_DISTRIBUTED_SLICE represents our rule table that contains all the Rule structs that have been defined across our project.

Explanation

RULES_DISTRIBUTED_SLICE is a Rust object that represents a region in memory, similar to an array.

  • We can initialise it from multiple different files (by adding Rule structs to it - either manually or by invoking the #[register_rule] macro.
  • At link time, all the data will be written into the same location in memory, and different files in out project can import and use this object as normal
  • This allows us to have a more decentralised structure for our project, e.g. putting rules for different solvers into different crates and then linking it all together when the project is built.
@gskorokhod gskorokhod added kind::documentation Improvements or additions to documentation area::conjure-oxide Related to conjure_oxide. labels Jan 26, 2024
@gskorokhod gskorokhod self-assigned this Jan 26, 2024
@gskorokhod
Copy link
Contributor Author

@niklasdewally
Copy link
Contributor

niklasdewally commented Jan 31, 2024

Idea from the meeting ( from @lixitrixi @gskorokhod):

We currently do:

constraints = Vec<Expression>

Why don't we do

constraints: Expression

and use the And Constraint.

@gskorokhod
Copy link
Contributor Author

gskorokhod commented Jan 31, 2024

Nik and Felix:
Don't need to detail the implementation of the rule registry, opaque interface.

@gskorokhod
Copy link
Contributor Author

The register_rule macro should also be opaque - move its documentation into the actual file (it's also an implementation detail)

@gskorokhod
Copy link
Contributor Author

Question to consider: what is our process for testing the rules?

@gskorokhod
Copy link
Contributor Author

Do integration tests currently do code coverage? (they should)

@gskorokhod
Copy link
Contributor Author

Integration tests are probably a way to test rules without writing separate tests for all of them

@niklasdewally
Copy link
Contributor

Question to consider: what is our process for testing the rules?

My thought is that the essence file integration tests + coverage would be enough?

(a very important and good question though. @ozgurakgun )

@ozgurakgun
Copy link
Contributor

agreed. it's very hard to test individual rules. and integration testing should contribute to the coverage stats.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area::conjure-oxide Related to conjure_oxide. kind::documentation Improvements or additions to documentation
Projects
None yet
Development

No branches or pull requests

3 participants