[RFC] Documentation on the rule engine #149

gskorokhod · 2024-01-26T12:43:32Z

Hi!

I have written some basic documentation on how we're planning to handle rules and rule application to help myself and other people who are just getting into the Rust side of the project and want to work on the rule engine.

Please comment any corrections / anything that I've missed in this issue!

Overview

Our starting point is the AST for the Essence code, in JSON format. It is taken from the output of conjure pretty.
This AST is parsed into a Model struct. The parsing logic is contained in conjure_oxide/src/parse.rs.
The Model struct contains:
- the symbol table (variables)
- the constraints that need to be satisfied (constraints) as a Vector<Expression>
The Expression type is an enum that represents an expression tree
The original expression trees are rewritten into a format that the different solvers can "understand" (e.g. boolean expressions in CNF for SAT solvers). We use a rule engine for that.
Conceptually, Rule structs have an application method that takes an expression and either produces a rewritten expression (if the rule is applicable) or an error (if it is not applicable)
To rewrite the expressions, we walk the expression tree, find rules that can be applied for each expression, and apply them until there are no more rules to apply.
Where multiple rules are applicable, we want:
- A way to set a higher or lower priority for every given rule
- An arbitrary function that "decides" which rule to apply
When a rule is applied to an Expression in the expression tree, this expression is rewritten and all its ancestor nodes are marked as "dirty" (i.e. we need to check them again to make sure that no new rules have appeared that can be applied)
Rules are stored in the RULES_DISTRIBUTED_SLICE object.
The #[register_rule] macro is used to register rules from a Rust function. We use it to define rules and add them to the rule table.
We call the rewrite function on our Model to recursively apply rules to expressions to simplify them, until there are no more rules to apply
After the rewrite is complete, we can convert the resulting model to the specific representation that is required by the solver (e.g. CNF for KissSAT)

The rewrite function

The rewrite function takes a model, gets rules from the RULES_DISTRIBUTED_SLICE, and recursively applies rules to the expression tree until no more rules can be applied

Algorithm

Traverse the expression tree using depth-first search
For every expression:
- Loop over the list of all rules
- Try to apply rules to the copy
- If the Rule::application() function errors, the rule is not applicable
- Otherwise, save the result of the Rule::application()
Save a list of all rules that are applicable for the current expression
If there is at least one such rule:
- Use the select function to choose the rule to apply (this may be based on rule priority and other metadata)
- Replace the original expression with this rule's application result
- Go back to the root Expression and traverse the tree again (step 1)
If there are no applicable rules, traverse the current expression's subexpressions (step 2)
When there are no more expressions that have applicable rules, the algorithm is complete

Note: we mutate the expression tree in place instead of copying all of it, as this would be inefficient

Optimisation

When we visit an Expression in the expression tree, mark it as "clean"
When we apply a rule to an expression, mark it and all of its ancestors as "dirty"
Only try applying rules to a given node if it is "dirty", otherwise skip it

This works because if we have already found that no rules can be applied to an expression, this will still be the case for as long as this expression doesn't change (so we don't need to retry this check multiple times).
However, if its sub-expressions have been changed, we need to check whether any new rules can now be applied to the expression (and all its parents)

The Rule struct

The Rule struct represents a single rule in the rule engine

#[derive(Clone, Debug)]
pub struct Rule<'a> {
	pub name: &'a str,
	pub application: fn(&Expression) -> Result<Expression, RuleApplicationError>,

Fields

name - a human-readable name for the rule
application - a function that tries to apply the rule to a given expression, and either returns the rewritten expression or errors

Note

NB: the logic for checking whether a rule is applicable and actually applying it is the same, so, for efficiency reasons, we do not have separate methods for these tasks. Instead, we try to apply the rule right away, and save the result if the rule turns out to be applicable to the given expression.

The register_rule macro

The #[register_rule] macro is a procedural macro that decorates a Rust function, constructs a Rule struct from it, and adds the struct to the centralised rule table (RULES_DISTRIBUTE_SLICE)

Explanation

Essentially, the macro:

Takes in Rust tokens that represent the function (can be thought of as a part of the Rust program's AST)
Constructs a Rule struct and stores it in GEN_CONJURE_<function identifier>
Writes the original function in the rule's application field
Writes the function's identifier in the rule's name field
Appends this struct to the rule table

Example

#[register_rule]
fn my_rule(expr: &Expression) -> Result<Expression, RuleApplicationError> {
	// Rule application  logic here
}

Would produce:

Rule {
	name: "my_rule",
	application: (expr: &Expression) -> Result<Expression, RuleApplicationError> { ... }
}

And append it to RULES_DISTRIBUTED_SLICE

The centralised rule table (RULES_DISTRIBUTED_SLICE)

RULES_DISTRIBUTED_SLICE represents our rule table that contains all the Rule structs that have been defined across our project.

Explanation

RULES_DISTRIBUTED_SLICE is a Rust object that represents a region in memory, similar to an array.

We can initialise it from multiple different files (by adding Rule structs to it - either manually or by invoking the #[register_rule] macro.
At link time, all the data will be written into the same location in memory, and different files in out project can import and use this object as normal
This allows us to have a more decentralised structure for our project, e.g. putting rules for different solvers into different crates and then linking it all together when the project is built.

The text was updated successfully, but these errors were encountered:

gskorokhod · 2024-01-26T19:05:49Z

@niklasdewally @Kieranoski702 @lixitrixi

niklasdewally · 2024-01-31T14:48:57Z

Idea from the meeting ( from @lixitrixi @gskorokhod):

We currently do:

constraints = Vec<Expression>

Why don't we do

constraints: Expression

and use the And Constraint.

gskorokhod · 2024-01-31T14:50:03Z

Nik and Felix:
Don't need to detail the implementation of the rule registry, opaque interface.

gskorokhod · 2024-01-31T14:52:49Z

The register_rule macro should also be opaque - move its documentation into the actual file (it's also an implementation detail)

gskorokhod · 2024-01-31T14:53:44Z

Question to consider: what is our process for testing the rules?

gskorokhod · 2024-01-31T14:55:00Z

Do integration tests currently do code coverage? (they should)

gskorokhod · 2024-01-31T14:55:35Z

Integration tests are probably a way to test rules without writing separate tests for all of them

niklasdewally · 2024-01-31T14:55:39Z

Question to consider: what is our process for testing the rules?

My thought is that the essence file integration tests + coverage would be enough?

(a very important and good question though. @ozgurakgun )

ozgurakgun · 2024-01-31T19:51:14Z

agreed. it's very hard to test individual rules. and integration testing should contribute to the coverage stats.

gskorokhod added kind::documentation Improvements or additions to documentation area::conjure-oxide Related to conjure_oxide. labels Jan 26, 2024

gskorokhod self-assigned this Jan 26, 2024

gskorokhod closed this as completed Feb 18, 2024

This was referenced Mar 31, 2024

Adding a way to see the parents of expressions - Help Needed! #282

Open

Rewriter Optimization #287

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[RFC] Documentation on the rule engine #149

[RFC] Documentation on the rule engine #149

gskorokhod commented Jan 26, 2024 •

edited

gskorokhod commented Jan 26, 2024

niklasdewally commented Jan 31, 2024 •

edited

gskorokhod commented Jan 31, 2024 •

edited by niklasdewally

gskorokhod commented Jan 31, 2024

gskorokhod commented Jan 31, 2024

gskorokhod commented Jan 31, 2024

gskorokhod commented Jan 31, 2024

niklasdewally commented Jan 31, 2024

ozgurakgun commented Jan 31, 2024

[RFC] Documentation on the rule engine #149

[RFC] Documentation on the rule engine #149

Comments

gskorokhod commented Jan 26, 2024 • edited

Overview

The rewrite function

Algorithm

Optimisation

The Rule struct

Fields

The register_rule macro

Explanation

Example

The centralised rule table (RULES_DISTRIBUTED_SLICE)

Explanation

gskorokhod commented Jan 26, 2024

niklasdewally commented Jan 31, 2024 • edited

gskorokhod commented Jan 31, 2024 • edited by niklasdewally

gskorokhod commented Jan 31, 2024

gskorokhod commented Jan 31, 2024

gskorokhod commented Jan 31, 2024

gskorokhod commented Jan 31, 2024

niklasdewally commented Jan 31, 2024

ozgurakgun commented Jan 31, 2024

gskorokhod commented Jan 26, 2024 •

edited

niklasdewally commented Jan 31, 2024 •

edited

gskorokhod commented Jan 31, 2024 •

edited by niklasdewally