Datatypes RFC- Clarifying `union` vs `set` #309

oflatt · 2023-12-01T20:27:24Z

No description provided.

saulshanabrook

My understanding is that the current behavior would be clear if we didn't allow functions that return eqsorts (any sorts defined in the program) to have `merge` or `default`. `union` could be used on any two eqsort values, whereas `set` could also be used on functions that return primitives (all non-eqsorts). A `default` on a function that returns an eqsort could be rewritten as an initial `union`, so the real sticking point is how we allow defining `merge` attributes on functions that return eqsorts.

Is that your understanding as well?

If so, I would find it helpful to start by looking at the existing use cases for `merge` on such functions, mainly the eggcc example, to see how they shape the desired behavior.

I suggest this because it is still unclear to me, from first principles of the semantics, what behavior would be intuitively correct or obvious.

I do think I now have a handle on how these features currently interact, for instance from this example:

(sort Math)

(function i (i64) Math)
(function add (Math Math) Math)

(function f () Math :merge (add old new))
(set (f) (i 10))
;;(union (f (i 3)) (i 1))
(union (f) (i 11))
(set (f) (i 12))
(union (i 12) (f))
(set (f) (i 13))
;(union (f (i 3)) (i 10))

oflatt · 2023-12-01T22:09:32Z

Edit: read more closely

Yes, the main difficulty is the combination of a merge function with an eqsort as the output.
I propose making such functions distinct from datatypes, to avoid confusion and to give guarantees about datatypes.

oflatt · 2023-12-01T22:14:27Z

I think of functions like `f` from your example as storing one e-class as the output.
The function is not equal to this output in the equivalence relation; it stores it as the output, and the stored value might change due to the merge function.
So your visualization is confusing because it shows `f` in the same e-class as its output.
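
As a rough illustration of this mental model (a toy sketch in Python, not egglog's implementation), the function table can be kept separate from the union-find that holds the equivalence relation. The names `UnionFind`, `Table`, and `add` below are hypothetical and exist only for this example; the merge function mirrors `:merge (add old new)` from the example above.

```python
class UnionFind:
    """Equivalence relation over e-class ids."""

    def __init__(self):
        self.parent = []

    def make(self):
        # Allocate a fresh e-class id.
        self.parent.append(len(self.parent))
        return len(self.parent) - 1

    def find(self, x):
        # Follow parent pointers to the canonical id, compressing the path.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra
        return ra


class Table:
    """Hypothetical function table: each row stores ONE e-class id as output."""

    def __init__(self, uf, merge):
        self.uf = uf
        self.merge = merge
        self.rows = {}

    def set(self, args, new):
        # On a conflict, the merge function decides the stored output.
        if args in self.rows:
            old = self.uf.find(self.rows[args])
            new = self.merge(old, self.uf.find(new))
        self.rows[args] = self.uf.find(new)

    def get(self, args):
        return self.uf.find(self.rows[args])


uf = UnionFind()
i10, i12 = uf.make(), uf.make()
add_nodes = {}  # hash-consed `add` terms: (lhs, rhs) -> e-class id


def add(old, new):
    key = (old, new)
    if key not in add_nodes:
        add_nodes[key] = uf.make()
    return add_nodes[key]


f = Table(uf, merge=add)
f.set((), i10)  # f stores i10
f.set((), i12)  # conflict: f now stores add(i10, i12)
stored = f.get(())
```

In this sketch `set` never touches the union-find: `i10` and `i12` stay in different e-classes, and `f` only points at whatever e-class the merge function produced.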

saulshanabrook · 2023-12-01T22:55:54Z

> I think of functions like f from your example as storing one e-class as the output.
> It's not equal to this output in the equivalence relation- it's storing it as the output, and it might change due to the merge function.
> So your visualization is confusing because f is in the same eclass as its output

Yeah, I think that's where it's odd: it acts like a value when calling `set`, but with `union` or `extract` it currently acts just like an e-class:

(sort Math)

(function i (i64) Math)
(function add (Math Math) Math)

(function f () Math :cost 100 :merge (add old new))
(set (f) (i 10))
;;(union (f (i 3)) (i 1))
(union (f) (i 11))
(set (f) (i 12))
(union (i 12) (f))
(set (f) (i 13))
;(union (f (i 3)) (i 10))
(extract (f))
(union (f) (i 100))
(add (f) (f))
(extract (add (f) (f)))

(add (i 12) (i 13))
(add (i 100) (i 100))

That's why I was curious to learn more about how you are using this behavior in the eggcc example.

oflatt · 2023-12-04T18:50:54Z

Exactly, that's the confusing part. I'm proposing that we stop treating it the same as a datatype for `union` and `extract`.

In eggcc, we use an extraction table to store a term that we have extracted. The table is an analysis table, similar to lower-bound.

oflatt · 2023-12-04T18:54:03Z

For example, in that visualization I would show (f (i 100)) as a separate entry. It's not "equal" to (i 100); it just stores (i 100) as an output.

yihozhang · 2023-12-04T22:55:50Z

> This proposal allows creating functions that have a merge function of union.
> (function has-type (Math) Math :merge (union old new))
> These functions behave similarly to datatypes, but they never have their own id- they can only be set to a datatype.

I wonder if this creates issues. It is common practice for a function like `has-type` to first hold a placeholder id that later gets unioned with an actual "datatype" id. This is similar to unification variables in Prolog. I believe we should allow all functions whose output is an eqsort to create defaults.
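
To make the placeholder pattern concrete, here is a toy Python sketch (not egglog code). It assumes a hypothetical `has_type` table whose default is a fresh placeholder id, and a minimal union-find over string ids; all names are invented for illustration.

```python
parent = {}  # union-find over string ids


def fresh(name):
    # Register a new id that is initially its own representative.
    parent[name] = name
    return name


def find(x):
    while parent[x] != x:
        x = parent[x]
    return x


def union(a, b):
    # Make a's representative the canonical one.
    parent[find(b)] = find(a)


has_type = {}  # hypothetical table: expression -> type id


def lookup_has_type(expr):
    # Default behavior: create a placeholder id on first lookup.
    if expr not in has_type:
        has_type[expr] = fresh(f"?t{len(has_type)}")
    return find(has_type[expr])


placeholder = lookup_has_type("x")  # "?t0", not yet known to be any type
int_type = fresh("Int")
union(int_type, placeholder)        # later: the placeholder is resolved
resolved = lookup_has_type("x")
```

The placeholder plays the role of a unification variable: it has its own id at first and only later becomes equal to a concrete type id.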

yihozhang · 2023-12-04T23:00:48Z

On the other hand, the confusion in the eggcc extraction can be solved by having a different kind of datatype that is never unioned (like normal ADTs in a functional language): TermAndCost forms a lattice, and its values are not supposed to be unioned with each other. Unfortunately, right now it is impossible to describe a (define (smaller (pair e1 c1) (pair e2 c2)) (if (< c1 c2) ... ...)) primitive within egglog, because the language is not expressive enough.
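
For illustration only, the lattice-style merge described here could look like the following Python sketch. `TermAndCost`, `smaller`, `best`, and `set_best` are hypothetical names, and the branches elided as `...` in the pseudocode above are assumed to return the first pair when its cost is strictly lower and the second pair otherwise.

```python
from typing import NamedTuple


class TermAndCost(NamedTuple):
    term: str
    cost: int


def smaller(a: TermAndCost, b: TermAndCost) -> TermAndCost:
    # Assumed reading of `(if (< c1 c2) ... ...)`: keep the cheaper pair.
    return a if a.cost < b.cost else b


best = {}  # hypothetical analysis table: e-class name -> best pair so far


def set_best(eclass: str, candidate: TermAndCost) -> None:
    # Values are merged with `smaller`; they are never unioned.
    old = best.get(eclass)
    best[eclass] = candidate if old is None else smaller(old, candidate)


set_best("e1", TermAndCost("(add (i 1) (i 2))", 3))
set_best("e1", TermAndCost("(i 3)", 1))
set_best("e1", TermAndCost("(f)", 100))
```

Here the stored pair only ever moves down the cost ordering, which is what makes it behave like an analysis value rather than an e-class.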

saulshanabrook · 2023-12-04T23:32:27Z

> On the other hand, the confusion in the eggcc extraction can be solved by having a different kind of datatypes that never union (like normal ADTs in a functional language): TermAndCost forms a lattice and is not supposed to be unioned with each other

Yeah, I was thinking about this as well: whether there is a way to declare that terms built with specific constructors cannot be unioned with each other. For instance, in the equate-basic example, a way to disallow (union (Num 10) (Num 1)). It's the difference between looking at it by constructor/function and by sort/type.

oflatt · 2023-12-06T22:31:05Z

> I wonder if this creates issues. It is a common practice to have a function like has-type to first have a placeholder id that later gets unioned with an actual "datatype" id.

@yihozhang and I chatted about this: this proposal still allows specifying a :default for these functions.

I think adding terms and functional programming would certainly help, but it would also be a big change. Also, it's unclear how the functional programming would compose with egglog rules: what if I want to put an eq-able term inside a term? Or vice versa?

datatypes rfc

1c9e5e7

oflatt requested a review from a team as a code owner December 1, 2023 20:27

oflatt requested review from saulshanabrook and removed request for a team December 1, 2023 20:27

oflatt mentioned this pull request Dec 1, 2023

Clarifying Valid Types #298

Closed

saulshanabrook reviewed Dec 1, 2023


saulshanabrook mentioned this pull request Feb 26, 2024

Remove Defaults and Change Relations egraphs-good/egglog-python#125

Open
