Simpler unifiers for equations with fewer variables on one of the sides by robertkleffner · Pull Request #2 · flix/buwb

robertkleffner · 2022-01-07T01:00:52Z

Fixes #1

Goal

MGUs in Boolean unification are not unique, so an implementation should prefer to generate minimized and 'easy to understand' unifiers when possible. Boolean minimization techniques help in many scenarios, but no single technique outperforms the others in all cases. The change proposed by this PR handles a small but important special case: when one side of an equation is just a variable that does not occur in the other side of the equation.

Results

Without special case:

a = b || c || d || f || e
    yields [a --> ((b ∨ (c ∨ (d ∨ f))) ∨ e) ∨ (a ∧ ((b ∨ (c ∨ (d ∨ f))) ∨ e)); b --> b; ...]
c = a || b || d || f || e
    yields [
        a --> (((c ∨ (d ∨ (e ∨ f))) ∧ ¬((((b ∧ (c ∨ (d ∨ (e ∨ f)))) ∨ d) ∨ f) ∨ e)) ∨ ((¬c ∧ (¬d ∧ (¬e ∧ ¬f))) ∧ ((((b ∧ (c ∨ (d ∨ (e ∨ f)))) ∨ d) ∨ f) ∨ e))) ∨ (a ∧ (c ∨ (d ∨ (e ∨ f))));
        b --> b ∧ (c ∨ (d ∨ (e ∨ f)));
        c --> c ∨ (d ∨ (e ∨ f));
        d --> d;
        ...]
c = a || b || d
    yields [a --> (c ∨ d) ∧ (a ∨ (¬b ∧ ¬d)); b --> b ∧ (c ∨ d); c --> c ∨ d; d --> d]

With special case:

a = b || c || d || f || e
    yields [a --> ((b ∨ (c ∨ d)) ∨ e) ∨ f]
c = a || b || d || f || e
    yields [c --> ((a ∨ (b ∨ d)) ∨ f) ∨ e]
c = a || b || d
    yields [c --> a ∨ (b ∨ d)]

Partial proof sketch

Conjecture: Given a Boolean equation X = Eqn in which the variable X is not free in Eqn, [X --> Eqn] is a valid MGU for X = Eqn.

Let X be a Boolean variable, Eqn a Boolean formula in which X is not free, and let S be the substitution [X --> Eqn]. Because X is not free in Eqn, applying S to Eqn has no effect. Applying S to X yields Eqn, so applying S to X = Eqn yields Eqn = Eqn, a syntactic equality of Boolean formulas. This shows that S is a unifier; that it is most general still remains to be shown.
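As a sanity check rather than a proof, the claim can be tested by brute force on one small instance. The sketch below assumes the concrete equation x = b ∨ c (chosen here for illustration) and checks two things: that S = [x --> b ∨ c] is a unifier, and that S leaves every ground solution unchanged. The second property (often called "reproductive") is what one would expect of a most general unifier in this setting.

```python
from itertools import product

VARS = ("x", "b", "c")

def eqn(env):
    """The right-hand side b ∨ c, evaluated under an assignment."""
    return env["b"] or env["c"]

def apply_s(env):
    """Apply S = [x --> b ∨ c], then evaluate every variable under env."""
    return {"x": eqn(env), "b": env["b"], "c": env["c"]}

envs = [dict(zip(VARS, bits)) for bits in product([False, True], repeat=3)]

# 1. S is a unifier: after substitution both sides agree under every assignment.
is_unifier = all(apply_s(e)["x"] == eqn(apply_s(e)) for e in envs)

# 2. S is reproductive: every solution of x = b ∨ c is left unchanged by S.
solutions = [e for e in envs if e["x"] == eqn(e)]
is_reproductive = all(apply_s(e) == e for e in solutions)

print(is_unifier, is_reproductive, len(solutions))
```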

Possibilities

It is possible that the special case is applied too conservatively. Currently, given X = Eqn, the special case is only applied if X ∉ free(Eqn). Maybe there is a similar special case that applies when X ∈ free(Eqn); however, the presence of negations of X in Eqn makes that case more complicated.

robertkleffner · 2022-01-07T01:02:00Z

I do development in Gitpod, which explains the presence of .gitpod.yml. I can remove this if desired.

magnus-madsen · 2022-01-07T07:50:51Z

Cool! I'll take a look soon (a bit busy these days).

> I do development in Gitpod, which explains the presence of .gitpod.yml. I can remove this if desired.

That's fine.

magnus-madsen · 2022-01-07T07:51:52Z

@jacopol Do you want to take a look too?

magnus-madsen · 2022-01-07T07:53:25Z

(If this is sound, we should probably add it to Flix too, right around here: https://github.com/flix/flix/blob/master/main/src/ca/uwaterloo/flix/language/phase/unification/BoolUnification.scala#L40)

robertkleffner · 2022-01-08T20:01:07Z

I've made a generalization of the special case discussed earlier. It handles the previous case, and a few more complex ones. This one so far seems vastly more useful in my type inference work, and maybe points in an interesting direction for a syntactically driven method of Boolean unification in general.

Goal

The previous special case handled equations of the form X = Eqn where X ∉ free(Eqn). While this is a common case in type inference, where fresh variables are generated frequently, it was not much help when unifying more complex equations. A very common occurrence in my unification problems was an equation of the form x || y = a || b || c || d, as well as x && y = a && b && c && d. Depending on how one structures these equations syntactically, the resulting unifier can differ, but such equations can basically be treated as matching problems, with the smaller side (x || y) being substituted and the larger side (a || b || c || d) remaining untouched.

Any unification of two Boolean formulas where free(Eqn1) ∩ free(Eqn2) == {} has the potential to be seen as simple syntactic matching, with ||, &&, and ¬ seen as term constructors. If the syntactic matching process generates a substitution, that is used as the unifier. Otherwise, the algorithm continues onward to SVE like the last implementation.
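A minimal sketch of what syntactic matching with ||, &&, and ¬ as term constructors could look like, assuming a nested-tuple encoding with variables as strings. The function `match` and the encoding are illustrative assumptions, not the code on the PR branch, and the sketch says nothing about whether the resulting substitution is most general:

```python
def match(pattern, term, subst=None):
    """Syntactically match pattern against term, binding the pattern's variables.

    Returns a dict from pattern variables to subterms, or None on failure.
    """
    subst = dict(subst or {})
    if isinstance(pattern, str):                  # pattern variable
        if pattern in subst and subst[pattern] != term:
            return None                           # conflicting binding
        subst[pattern] = term
        return subst
    if isinstance(term, str) or pattern[0] != term[0] or len(pattern) != len(term):
        return None                               # constructor mismatch
    for p_arg, t_arg in zip(pattern[1:], term[1:]):
        subst = match(p_arg, t_arg, subst)
        if subst is None:
            return None
    return subst

small = ("or", "x", "y")
print(match(small, ("or", ("or", "a", "b"), "c")))   # left-nested larger side
print(match(small, ("or", "a", ("or", "b", "c"))))   # right-nested larger side
print(match(("and", "x", "y"), ("or", "a", "b")))    # no match: fall back to SVE
```

The first two calls bind x and y differently, which illustrates how the result depends on the way the larger side happens to be parenthesized.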

Results

The results of the previous implementation continue to apply, as a single variable will trivially match with any Boolean equation in which it is not free.

Additionally, we get nice unifiers for some more complex equations. Without the special case:

c || h  =  b || a || d || f
    yields [
        a --> (((c ∨ (d ∨ (f ∨ h))) ∧ ¬(((b ∧ (c ∨ (d ∨ (f ∨ h)))) ∨ d) ∨ f)) ∨ ((¬c ∧ (¬d ∧ (¬f ∧ ¬h))) ∧ (((b ∧ (c ∨ (d ∨ (f ∨ h)))) ∨ d) ∨ f))) ∨ (a ∧ (c ∨ (d ∨ (f ∨ h))));
        b --> b ∧ (c ∨ (d ∨ (f ∨ h)));
        c --> c ∨ (¬h ∧ (d ∨ f));
        d --> d;
        ...]
c || h || g  =  b || a || d || f || e
    yields [
        a --> huge (16 lines on my screen);
        b --> less huge (6 lines on my screen);
        c --> (((((g ∨ h) ∧ (¬d ∧ (¬e ∧ ¬f))) ∨ ((¬g ∧ ¬h) ∧ (d ∨ (e ∨ f)))) ∧ (¬g ∧ ¬h)) ∧ (¬g ∧ ¬h)) ∨ c;
        d --> d;
        ...]

With the special case:

c || h  =  b || a || d || f
    yields [c --> a ∨ (b ∨ d); h --> f]
c || h || g  =  b || a || d || f || e
    yields [c --> a ∨ (b ∨ d); h --> f; g --> e]

My inductive proof skills are rusty enough that proving this is sound is difficult for me at the moment. But it seems sound so far, provided the implementation of syntactic matching has no bugs.

Possibilities

As mentioned before, this seems like it could be extended into a syntactic method for Boolean unification. I haven't fully fleshed it out, but I'll give it a try and if it seems promising (and generates smaller unifiers) I'll revisit with a separate PR.

Thanks again for making this project available! It's been nice to have an area to focus solely on the Boolean aspect of type systems.

robertkleffner · 2022-01-09T00:07:20Z

It seems the new special case generates a unifier that looks most general, but actually isn't in a number of cases. I will adjust it to be slightly narrower so that it continues to handle the cases posted above, but with a correct most-general unifier.

magnus-madsen · 2022-01-09T10:24:11Z

This is a very cool idea! I will take a closer look tomorrow.

magnus-madsen · 2022-01-09T10:24:25Z

> It seems the new special case generates a unifier that looks most general, but actually isn't in a number of cases. I will adjust it to be slightly narrower so that it continues to handle the cases posted above, but with a correct most-general unifier.

Because of a bug in the implementation or a fundamental issue?

magnus-madsen · 2022-01-09T12:33:25Z

The key quote:

> Any unification of two Boolean formulas where free(Eqn1) ∩ free(Eqn2) == {} has the potential to be seen as simple syntactic matching, with ||, &&, and ¬ seen as term constructors. If the syntactic matching process generates a substitution, that is used as the unifier. Otherwise, the algorithm continues onward to SVE like the last implementation.

I think there are two questions here:

1. When the free variables are disjoint, is the syntactic unifier the most-general?
2. Does the syntactic unification procedure lead to smaller unifiers than SVE + simplifications?

The last question can be resolved experimentally. My guess is that - as the examples suggest - syntactic unification would lead to smaller formulas!

The most interesting question is then whether (1) is true. It feels like something that should be true, but I have not seen it mentioned in the literature, which is a bit puzzling. I am now playing with some examples to convince myself one way or the other.

One final question:

Syntactic unification can be improved if we have some "normal forms" / rewrite rules. For example, (a or b) or c does not syntactically unify with a or (b or c) but of course they are the same.
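One possible normal form for this, sketched under the assumption of a nested-tuple encoding with variables as strings: flatten nested chains of the same operator, drop duplicate operands, and sort the rest. The function `normalize` is hypothetical and only accounts for associativity, commutativity, and idempotence, not full Boolean equivalence.

```python
def normalize(f):
    """Flatten nested and/or chains and sort operands, so AC-equal formulas coincide."""
    if isinstance(f, str):
        return f
    op, args = f[0], [normalize(a) for a in f[1:]]
    if op == "not":
        return ("not", args[0])
    flat = []
    for a in args:
        # Splice in the operands of a nested node that uses the same operator.
        flat.extend(a[1:] if isinstance(a, tuple) and a[0] == op else [a])
    operands = sorted(set(flat), key=repr)        # set() also applies idempotence
    return operands[0] if len(operands) == 1 else (op, *operands)

left = ("or", ("or", "a", "b"), "c")     # (a or b) or c
right = ("or", "a", ("or", "b", "c"))    # a or (b or c)
print(normalize(left) == normalize(right))
```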

robertkleffner · 2022-01-09T17:49:51Z

In its current state I believe the PR can generate a less-general unifier. At least, I couldn't figure out how to construct the unifier generated by the online variant from the one generated on my branch:

¬a || b = c || d
    on main: [a --> (a ∧ b) ∨ (¬c ∧ ¬d); b --> b ∧ (c ∨ d); c --> c; d --> d]
    on PR:   [c --> ¬a; d --> b]

There's maybe something here but I think it needs work. I'm almost inclined to revert my last commit because the previous special case seems to be on far more solid ground.

For the smaller unifier generated by the current commit, it is clear that the substitution makes the equations equal, but the truth table is entirely different from the one generated by existing master. I'm not sure exactly how relevant that is, because different alpha-equal equations can have different truth tables even comparing on master alone:

running both the following on flix.dev/buwb

a && b && c = b && c && d && e
    yields [a --> (b ∧ (c ∧ (d ∧ e))) ∨ (a ∧ (¬b ∨ (¬c ∨ (d ∧ e)))); b --> b; c --> c; ...]
       and truth table of 5 variables with two R rows set to true (variable a can be false or true, all others must be true)
z && b && c = b && c && d && e
    yields [b --> b ∧ (¬c ∨ ((¬d ∨ (¬e ∨ z)) ∧ (¬z ∨ (d ∧ e)))); c --> c; d --> d; ...]
       and truth table of 5 variables with one R row set to true (all variables must be true)

So I think I need to understand a little more clearly what 'more general' means and implies in the context of Boolean equations, because it's not as intuitive as for syntactic unification in non-Boolean terms.

It seems intuitive that equations like the &&-row ones above (and their ||-row counterparts) at least should have compact and simple unifiers. But there's that difference between the two truth tables, and the differing result after applying the substitution to each equation. 1) generates b ∧ c ∧ d ∧ e = b ∧ c ∧ d ∧ e after minimization, whereas 2) generates b ∧ c ∧ d ∧ e ∧ z = b ∧ c ∧ d ∧ e ∧ z.

Any guidance on those differences? I'm looking through Term Rewriting and All That independently but any other insights are most welcome.

robertkleffner · 2022-01-09T19:26:40Z

Reverted after making a branch to maintain the experiment for now.

jacopol · 2022-01-10T12:06:19Z

I'm excited about this direction. We should indeed be able to solve simple standard unification problems efficiently.

However, what would happen to this example: a ∨ b = p ∨ p?
Syntactic unification says a=p and b=p, so they are equal.
However, semantic unification also allows p = a ∨ b, in which a and b don't have to be equal.
So the syntactic unifier is not sufficiently general in this case.
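The gap can be made concrete by brute force. The sketch below enumerates the ground solutions of a ∨ b = p ∨ p and checks which of them arise as instances of each unifier; the helper names are illustrative assumptions.

```python
from itertools import product

VARS = ("a", "b", "p")
envs = [dict(zip(VARS, bits)) for bits in product([False, True], repeat=3)]

# Ground solutions of a ∨ b = p ∨ p.
solutions = [e for e in envs if (e["a"] or e["b"]) == (e["p"] or e["p"])]

def syntactic(e):
    """Instance of the syntactic unifier [a --> p, b --> p] under assignment e."""
    return {"a": e["p"], "b": e["p"], "p": e["p"]}

def semantic(e):
    """Instance of the unifier [p --> a ∨ b] under assignment e."""
    return {"a": e["a"], "b": e["b"], "p": e["a"] or e["b"]}

def covered(unifier):
    """Solutions that can be obtained by instantiating the given unifier."""
    return [s for s in solutions if any(unifier(e) == s for e in envs)]

missed = [s for s in solutions if s not in covered(syntactic)]
print(len(solutions), len(covered(syntactic)), len(covered(semantic)))
print(missed)  # solutions where a and b differ
```

Every solution the syntactic unifier misses has a ≠ b, which matches the observation above.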

robertkleffner · 2022-01-10T19:35:48Z

@jacopol Yes, you're right. That scenario was due to my not understanding the full implications of 'more general' in the context of Boolean equations. After some time revisiting the definition and a few examples more carefully, I now see why the syntactic approach is difficult to pursue.

I had another idea for simpler unifiers for &&-row and ||-row limited equations, which yields a simple substitution that makes the equations the same as a Lowenheim-generated MGU would. However, after playing around with it I couldn't 100% figure out a way to show that the simple substitution was at least as general as the Lowenheim MGU.

Even worse, after looking at Term Rewriting and All That, the &&-row and ||-row style equations seem to be related to ACI-unification if one wants to restrict the substitution to only contain && or || terms. And because ACI-unification has worse properties IIRC (not unitary?), I have temporarily suspended looking into this form.

I appreciate both of you taking the time to look into this (thanks for roping in Jaco @magnus-madsen). I'll try to make a better formalism to show that the current special case for equations with a single variable on one side generates true most-general unifiers.

magnus-madsen · 2022-01-10T21:51:40Z

While the idea of syntactic unification may not work, both Jaco and I agreed that it is an interesting approach! So thanks for bringing it up; it led to some interesting discussion offline.

We also agree with the sentiment of focusing on unification of conjunctions (or disjunctions), although we don't have any insights yet.

(As a minor comment, while the syntactic approach does not give most general unifiers, we did discuss that there is probably no need for requiring the LHS and RHS to have disjoint variables because the syntactic "occurs check" already accounts for that scenario.)

One other thing, in case you did not consider it: you/we could choose to represent Boolean formulas with BDDs or in the Boolean ring (as a polynomial). Note that the latter has a unique representation! That's an alternative to Boolean minimization.
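To illustrate the uniqueness of the Boolean-ring representation, here is a small sketch that computes the polynomial (algebraic normal form, with xor as addition and conjunction as multiplication) from a truth table. The function `anf` and the representation of monomials as frozensets are assumptions made for this example.

```python
def anf(f, names):
    """Return the Boolean-ring polynomial of f as a set of monomials.

    Each monomial is a frozenset of variable names; the empty monomial is the constant 1.
    """
    n = len(names)
    # Truth table indexed by bit mask: bit i set means names[i] is true.
    coeffs = [bool(f(*[(mask >> i) & 1 == 1 for i in range(n)])) for mask in range(2 ** n)]
    # In-place Moebius transform over GF(2) turns values into monomial coefficients.
    for i in range(n):
        for mask in range(2 ** n):
            if mask & (1 << i):
                coeffs[mask] ^= coeffs[mask ^ (1 << i)]
    return {frozenset(names[i] for i in range(n) if mask & (1 << i))
            for mask in range(2 ** n) if coeffs[mask]}

names = ["a", "b", "c"]
small = anf(lambda a, b, c: b or c, names)
large = anf(lambda a, b, c: (b or c) or (a and (b or c)), names)
print(small == large)   # equivalent formulas give the same polynomial
print(sorted(sorted(m) for m in small))
```

Because equivalent formulas map to the same set of monomials, comparing polynomials sidesteps the question of which minimization heuristic to apply.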

A better unification procedure would of course still be the holy grail...

robertkleffner · 2022-01-11T18:22:42Z

Quick thought: according to Term Rewriting and All That, Lowenheim's algorithm for unification generates MGUs from any unifier for a Boolean equation. In the book they start with a ground substitution, which ends up leading to big, intimidating MGUs that are harder to minimize. What if, as a starting solution in Lowenheim's method, one were to use a syntactic unifier?
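A sketch of the idea, using a commonly stated form of Löwenheim's formula: for an equation written as t = 0 and any particular unifier τ, take σ(x) = x wherever t is already 0 and σ(x) = τ(x) wherever t is 1. The example equation a ∨ b = p ∨ p and the non-ground starting unifier [a --> p, b --> p] are choices made here for illustration, and the code works on truth assignments instead of building formulas.

```python
from itertools import product

VARS = ("a", "b", "p")
envs = [dict(zip(VARS, bits)) for bits in product([False, True], repeat=3)]

def t(env):
    """a ∨ b = p ∨ p written as t = 0, with t = (a ∨ b) xor (p ∨ p)."""
    return (env["a"] or env["b"]) != (env["p"] or env["p"])

def tau(env):
    """A particular, not most general, unifier: [a --> p, b --> p]."""
    return {"a": env["p"], "b": env["p"], "p": env["p"]}

def lowenheim(env):
    """sigma(x) = x where t is already 0, and tau(x) where t is 1."""
    return tau(env) if t(env) else dict(env)

solutions = [e for e in envs if not t(e)]
print(all(not t(lowenheim(e)) for e in envs))         # sigma is a unifier
print(all(lowenheim(s) == s for s in solutions))      # sigma keeps every solution
print(all(tau(s) == s for s in solutions))            # tau alone does not
```

Whether starting from a syntactic unifier actually leads to smaller formulas than starting from a ground one is not something this pointwise check can show; it only confirms that the construction accepts a non-ground starting unifier.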

I have a branch on my fork that implements Lowenheim's method. I'll give this idea a shot, and if it doesn't work, at least I'll have another PR adding an option to use Lowenheim's method in the workbench.

magnus-madsen · 2022-01-11T18:43:07Z

Do you have a good description / implementation of Lowenheim's method somewhere?

robertkleffner · 2022-01-11T21:23:05Z

I do! See PR #3. I can merge that with this PR and close that other PR, but I figured it would be nice to be able to review it independently. Lowenheim's method is surprisingly difficult to find any information about online.

robertkleffner · 2022-01-14T06:09:21Z

WRT the special case here, I may have made it unnecessary, so long as @jacopol's suggestion to start SVE with the single variable side is included.

Experimentally it seems adding complementarity and absorption simplification checks to mkOr and mkAnd handles all the cases I've tried the same as the special case would, i.e. a == b || c || d || e || f yields the expected (minimized) MGU straight from SVE without the special case. The exception is when the single variable side is not chosen first for elimination in SVE, e.g. z == b || c || d || e || f. So in combination with simplification, @jacopol's suggestion should get the same results as the special case even for large numbers of variables.
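For illustration, smart constructors with complement and absorption checks might look like the sketch below. The names `mk_or`, `mk_and`, and `mk_not` mirror the mkOr and mkAnd mentioned above, but the tuple encoding and the exact set of identities are assumptions, not the code in this PR.

```python
TRUE, FALSE = ("true",), ("false",)

def is_op(f, op):
    return isinstance(f, tuple) and f[0] == op

def mk_not(f):
    if f == TRUE: return FALSE
    if f == FALSE: return TRUE
    if is_op(f, "not"): return f[1]                   # double negation
    return ("not", f)

def mk_or(f, g):
    if f == TRUE or g == TRUE: return TRUE
    if f == FALSE: return g
    if g == FALSE: return f
    if f == g: return f                               # idempotence
    if f == mk_not(g): return TRUE                    # complement: x ∨ ¬x
    if is_op(g, "and") and f in g[1:]: return f       # absorption: x ∨ (x ∧ y)
    if is_op(f, "and") and g in f[1:]: return g
    return ("or", f, g)

def mk_and(f, g):
    if f == FALSE or g == FALSE: return FALSE
    if f == TRUE: return g
    if g == TRUE: return f
    if f == g: return f                               # idempotence
    if f == mk_not(g): return FALSE                   # complement: x ∧ ¬x
    if is_op(g, "or") and f in g[1:]: return f        # absorption: x ∧ (x ∨ y)
    if is_op(f, "or") and g in f[1:]: return g
    return ("and", f, g)

bc = mk_or("b", "c")
print(mk_or(bc, mk_and("a", bc)))   # shape of the unminimized 'a' binding shown earlier
print(mk_and(bc, mk_not(bc)))       # complement case
```

These checks are purely syntactic, so they only fire when the repeated subformula appears in exactly the same shape on both sides.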

It does cause the mkOr and mkAnd functions to contain more cases, similar to how Flix does currently IIRC. Interestingly, this may mean that Boolean unification in Flix already handles the special case! I'll try a few more examples but so far it even seems to handle the case when x ∈ free(rhs) quite well, an improvement upon the proposed change of this PR.

If it turns out well I might change this PR to simply implement a few more simplification laws in mkOr and mkAnd and move the single variable check to SVE as suggested. As an added benefit, those simplification steps would also apply in a greater number of scenarios than the current special case.

robertkleffner · 2022-01-15T00:30:26Z

No more special case

Implemented @jacopol's suggestion along with some Boolean identity simplifications in mkOr and mkAnd. The single free variable being eliminated first does 90% of the work. The remaining work is some cases for complement and absorption in mkOr and mkAnd. Added some more complex examples to this one, showing it's more general than the previous special case.
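The interaction between elimination order and simplification can be sketched with a small successive variable elimination (SVE) over nested tuples. Everything here is an illustrative assumption (the encoding, the function names, and the particular identities in the constructors), not the code in this PR.

```python
T, F = ("true",), ("false",)

def is_op(f, op):
    return isinstance(f, tuple) and f[0] == op

def mk_not(f):
    if f == T: return F
    if f == F: return T
    return f[1] if is_op(f, "not") else ("not", f)

def mk_and(f, g):
    if F in (f, g) or f == mk_not(g): return F        # constants and complement
    if f == T or f == g: return g
    if g == T: return f
    return ("and", f, g)

def mk_or(f, g):
    if T in (f, g) or f == mk_not(g): return T        # constants and complement
    if f == F or f == g: return g
    if g == F: return f
    if is_op(g, "and") and f in g[1:]: return f       # absorption
    return ("or", f, g)

def subst(f, s):
    """Apply substitution s (dict var -> formula), rebuilding with the smart constructors."""
    if isinstance(f, str): return s.get(f, f)
    if f in (T, F): return f
    if f[0] == "not": return mk_not(subst(f[1], s))
    build = mk_and if f[0] == "and" else mk_or
    return build(subst(f[1], s), subst(f[2], s))

def sve(t, order):
    """Solve t = false, eliminating order[0] first; order must list every variable of t."""
    if not order:
        return {} if t == F else None                 # a closed formula folds to T or F
    x, rest = order[0], order[1:]
    t0, t1 = subst(t, {x: F}), subst(t, {x: T})
    s = sve(mk_and(t0, t1), rest)
    if s is None:
        return None
    s = dict(s)
    s[x] = mk_or(subst(t0, s), mk_and(x, mk_not(subst(t1, s))))
    return s

def unify(lhs, rhs, order):
    # lhs = rhs  iff  (lhs ∧ ¬rhs) ∨ (¬lhs ∧ rhs) = false
    return sve(mk_or(mk_and(lhs, mk_not(rhs)), mk_and(mk_not(lhs), rhs)), order)

rhs = ("or", "b", "c")
print(unify("a", rhs, ["a", "b", "c"]))   # single variable eliminated first
print(unify("a", rhs, ["b", "c", "a"]))   # single variable eliminated last
```

In this sketch, eliminating the lone variable first lets the complement and absorption cases collapse the intermediate terms, while eliminating it last leaves larger bindings for the other variables.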

Example Result Comparisons

a = b || c || d || e || f || g
    CURRENT: [
        a --> (((((((((c ∨ (d ∨ (e ∨ f))) ∨ g) ∧ ¬((c ∨ (d ∨ (e ∨ f))) ∨ g)) ∨ b) ∨ c) ∨ d) ∨ e) ∨ f) ∨ g) ∨ (a ∧ (((((((((c ∨ (d ∨ (e ∨ f))) ∨ g) ∧ ¬((c ∨ (d ∨ (e ∨ f))) ∨ g)) ∨ b) ∨ c) ∨ d) ∨ e) ∨ f) ∨ g));
        b --> (((c ∨ (d ∨ (e ∨ f))) ∨ g) ∧ ¬((c ∨ (d ∨ (e ∨ f))) ∨ g)) ∨ b;
        c --> c;
        ...]
    THIS PR: [
        a --> (((b ∨ (c ∨ d)) ∨ e) ∨ f) ∨ g;
        b --> b;
        c --> c;
        ...]

z = b || c || d || e || f || g
    CURRENT: [
        b --> ((z ∧ ¬((((((((z ∧ ¬((z ∧ (d ∨ (e ∨ f))) ∨ (g ∧ z))) ∨ (¬z ∧ ((z ∧ (d ∨ (e ∨ f))) ∨ (g ∧ z)))) ∧ ¬z) ∨ (c ∧ z)) ∨ (d ∧ z)) ∨ (e ∧ z)) ∨ (f ∧ z)) ∨ (g ∧ z))) ∨ (¬z ∧ ((((((((z ∧ ¬((z ∧ (d ∨ (e ∨ f))) ∨ (g ∧ z))) ∨ (¬z ∧ ((z ∧ (d ∨ (e ∨ f))) ∨ (g ∧ z)))) ∧ ¬z) ∨ (c ∧ z)) ∨ (d ∧ z)) ∨ (e ∧ z)) ∨ (f ∧ z)) ∨ (g ∧ z)))) ∨ (b ∧ z);
        c --> (((z ∧ ¬((z ∧ (d ∨ (e ∨ f))) ∨ (g ∧ z))) ∨ (¬z ∧ ((z ∧ (d ∨ (e ∨ f))) ∨ (g ∧ z)))) ∧ ¬z) ∨ (c ∧ z);
        d --> d ∧ z;
        e --> e ∧ z;
        ...]
    THIS PR: [
        z --> (((b ∨ (c ∨ d)) ∨ e) ∨ f) ∨ g;
        b --> b;
        c --> c;
        d --> d;
        ...]

¬z = ¬b || c && d || ¬e
    CURRENT: [
        b --> z ∨ (b ∧ (¬e ∨ (z ∨ (c ∧ d))));
        c --> c ∧ (¬d ∨ ¬z);
        d --> d;
        e --> e ∨ z;
        z --> z;]
    THIS PR: [
        z --> ¬((¬b ∨ (c ∧ d)) ∨ ¬e);
        b --> b;
        c --> c;
        ...]

¬z = ¬b || z && d || ¬e
    CURRENT: [
        b --> z ∨ (b ∧ ¬e);
        d --> d ∧ ¬z;
        e --> e ∨ z;
        z --> z;]
    THIS PR: [
        z --> b ∧ (¬d ∧ e);
        b --> b ∧ (¬d ∨ ¬e);
        d --> d;
        e --> e;]

magnus-madsen · 2022-02-13T18:35:53Z

@robertkleffner Is there any way to reach you privately? Perhaps you could DM me on https://gitter.im/flix/Lobby ?

robertkleffner changed the title ~~Special case for simpler unifiers~~ Fixes #1 - Special case for simpler unifiers Jan 7, 2022

robertkleffner changed the title ~~Fixes #1 - Special case for simpler unifiers~~ Special case for simpler unifiers Jan 7, 2022

Implement Jaco's idea, with some Boolean identities that make the MGUs minimal in a lot of common cases.

83f267a

robertkleffner changed the title ~~Special case for simpler unifiers~~ Simpler unifiers for equations with only one variable free in at least one of the sides Jan 14, 2022

Make sure duplicate vars from both sides of equation don't get eliminated twice by SVE.

e90c4e7

robertkleffner added 4 commits January 15, 2022 00:35

Cleanup after looking at diff.

cc18208

Slight generalization of Jaco's idea. Easy minimization cases that work for all free variable counts. A couple additional identities.

fcdc1fb

Merge branch 'master' into master

4c8b372

Fix bug in making a minimization table accessor.

d4ed132

robertkleffner changed the title ~~Simpler unifiers for equations with only one variable free in at least one of the sides~~ Simpler unifiers for equations with fewer variables on one of the sides Jan 15, 2022

Merge branch 'flix:master' into master

578032c
