WIP - Add variable relations #37

dkubb · 2013-07-21T08:17:38Z

This branch will change materialized relations to be mutable

* [ ] Add Axiom::Relation::Variable proxy
* [ ] Ready for review

mbj · 2013-07-21T13:44:53Z

@dkubb Can you give me a short summary why materialized relations need to be mutable? I know this was heavily discussed in channel but I think a summary is more helpful for me.

dkubb · 2013-07-21T17:48:15Z

@mbj the reason is that most of the datastore based relations work like this:

relation.insert([new_tuple])

# the new tuple is available from the same relation
relation.include?(new_tuple)  # => true

Yet the in-memory relations work like this:

relation.insert([new_tuple])

# the new tuple is not available
relation.include?(new_tuple)  # => false

# however...
new_relation = relation.insert([new_tuple])
new_relation.include?(new_tuple)  # => true

This change is an attempt to resolve the inconsistency and make the in-memory materialized relations work the same as the datastore backed relations.

dkubb · 2013-07-21T18:00:18Z

@solnic does my example above summarize the issue succinctly?

solnic · 2013-07-21T18:51:15Z

@dkubb yes this is perfect

mbj · 2013-07-21T18:52:00Z

@dkubb Thx for summary. I fully agree now.

dkubb · 2013-07-24T23:46:07Z

I've been experimenting with a few approaches to handling this and I've settled on making an Axiom::Relation::Variable object. It should wrap Axiom::Relation::Materialized or Axiom::Relation::Empty objects. The #insert, #update and #delete methods will mutate its own @relation ivar. The other methods should behave as normal. I may collapse things down a bit further even, but for the meantime this is the plan.

If anyone is interested I can go into more detail on why I did this rather than simply mutating a materialized relation. Atm I don't have much time to go into it, but I wanted to drop an update on this.

solnic · 2013-07-25T00:27:23Z

@dkubb I suspect it's because that's the simplest thing that can possibly work since it's an addition instead of a change, anyhow, worksforme™

snusnu · 2013-07-25T02:27:27Z

@dkubb while this surely looks fine to me, i'd love it if you could explain the rationale behind this a bit more in depth. Relation variables are a known concept in RA, and i'd like to hear about the reasoning behind this and how the concept is in line with the implementation.

dkubb · 2013-07-25T07:16:00Z

@snusnu ok yeah, I'll try to explain it but I haven't yet tried to describe it to anyone out loud so bear with me. And feel free to ask questions if I'm unclear on anything.

So to explain this I need to give a bit of backstory first (more than I did above), and then describe the problem I ran into, and then describe why I think this is a good solution.

Backstory

Last week I finished up a working version of axiom-memory-adapter. I presented the API examples, but when @solnic saw them he said it didn't match the understanding he and @snusnu had about how the relation API works. Specifically, in their tests with the existing adapters, any writes to the underlying data would be reflected in an existing base relation. So if you wrote to the db, then used #each on the relation the new tuple would be included in the relation.

However, in the axiom-memory-adapter things worked a bit differently. If you wrote to a relation, a new relation was returned that included the new tuple. Reading from the original relation would not include the new tuple. This meant that anytime you wanted to write to a relation, you would've had to do relation = relation.insert([new_tuple]).

I'm so glad @solnic called me out on it, because after thinking about it I don't actually know how it would even be possible to have datastore backed relations work like the memory adapter at all. I don't think it would be possible without doing something crazy and scoping relations by created_at dates or something.. and aside from being brittle, there's no guarantee every relation is going to have that kind of field. Plus that doesn't deal with deletes at all.

The most important thing for axiom is that the relations behave the same regardless of what the underlying datastore is. We cannot have relations from axiom-memory-adapter having a different interface than axiom-do-adapter relations. Sure, there will be performance differences between each adapter for specific operations, but the interface should not vary. If the interface varies, then we're only slightly better off than if we had custom, optimized interfaces for each datastore. Plus it makes it impossible to build tools to work with axiom, or on top of it, because you could not code to the interface. This made my approach a no-go.

Mutable Relations

One of my first thoughts when faced with this problem was whether I could make materialized relations mutable. This means when I do relation.insert([new_tuple]) and relation is materialized, reading from relation would see the new tuple immediately. This would give us the same behaviour as datastore backed relations in axiom-memory-adapter and allow the interface to be uniform across the board.

With a normal materialized relation, this should work great. I could just make it so the underlying tuples are written to. The tuples are a Set so they can be inserted into and deleted from with no issues. Any reads to the materialized relation would work just as we wanted.
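To make the idea concrete, here is a minimal sketch of an in-place mutable relation backed by a Set. ToyMaterialized is a hypothetical stand-in written for this illustration, not Axiom's actual class, and it ignores headers entirely:

```ruby
require 'set'

# Hypothetical stand-in for a materialized relation; not Axiom's actual class.
class ToyMaterialized
  include Enumerable

  def initialize(tuples = [])
    @tuples = Set.new(tuples)
  end

  # Mutates the underlying Set in place and returns self,
  # so every existing reference sees the new tuples immediately.
  def insert(new_tuples)
    @tuples.merge(new_tuples)
    self
  end

  # Removes the given tuples from the underlying Set in place.
  def delete(old_tuples)
    @tuples.subtract(old_tuples)
    self
  end

  def each(&block)
    return to_enum unless block
    @tuples.each(&block)
    self
  end
end

relation = ToyMaterialized.new([[1, 'a']])
relation.insert([[2, 'b']])
relation.include?([2, 'b'])  # => true
relation.delete([[1, 'a']])
relation.include?([1, 'a'])  # => false
```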

The problem comes with the second kind of relation we have called Axiom::Relation::Empty. It represents a relation that does not include any tuples. It could be the result of doing something like relation.restrict { false }, or constructing an empty relation with Relation::Empty.new(header).

More commonly it is the result of passing a relation through axiom-optimizer. If the optimizer examines an expression and determines it cannot return any tuples it will optimize it by replacing it with an Axiom::Relation::Empty instance. One simple example would be (relation - relation).optimize. The optimizer uses this to short-circuit other kinds of operations, for example consider:

other.join(relation - relation)

The inner relation statement will always be empty, regardless of what it contains, and joining against an empty relation also cannot return anything so the whole expression can be replaced with an Axiom::Relation::Empty instance. Now, imagine this nested deeper inside another expression. A single empty relation can short-circuit a whole branch of a query, greatly simplifying and optimizing the performance of a query.
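A rough sketch of that short-circuiting, using a toy expression tree. All names here (ToyBase, ToyEmpty, ToyDifference, ToyJoin, toy_optimize) are hypothetical and only illustrate the idea; axiom-optimizer's real rules are more involved:

```ruby
# Hypothetical toy expression nodes; names are illustrative, not Axiom's classes.
ToyBase       = Struct.new(:name)
ToyEmpty      = Class.new
ToyDifference = Struct.new(:left, :right)
ToyJoin       = Struct.new(:left, :right)

# Rewrites a tree bottom-up, replacing branches that cannot return tuples.
def toy_optimize(node)
  case node
  when ToyDifference
    left  = toy_optimize(node.left)
    right = toy_optimize(node.right)
    # r - r can never contain tuples; neither can (empty - anything)
    return ToyEmpty.new if left == right || left.is_a?(ToyEmpty)
    ToyDifference.new(left, right)
  when ToyJoin
    left  = toy_optimize(node.left)
    right = toy_optimize(node.right)
    # joining against an empty relation cannot return anything
    return ToyEmpty.new if left.is_a?(ToyEmpty) || right.is_a?(ToyEmpty)
    ToyJoin.new(left, right)
  else
    node
  end
end

relation = ToyBase.new(:users)
other    = ToyBase.new(:orders)

inner = ToyJoin.new(other, ToyDifference.new(relation, relation))
toy_optimize(inner).class  # => ToyEmpty

# The same empty branch nested one level deeper collapses the outer join too.
outer = ToyJoin.new(ToyBase.new(:items), inner)
toy_optimize(outer).class  # => ToyEmpty
```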

The actual problem comes when you consider how you write to this instance. Doing something like empty_relation.insert([new_tuple]) could be made to modify the internal state, however the optimizer would still match the class and use it to short-circuit a query. I suppose the optimizer could be made to examine empty relations to see if they are really empty, but I don't really like that. It means we have an object that is basically lying about what it is. It makes no sense to write to an Axiom::Relation::Empty instance and expect it to not be empty anymore. If I could somehow change the class, then maybe it could make more sense, but that's just odd.. mutable ancestry would be even worse than mutable state in an object. And yes, I did think about using Object#extend at runtime, and that's just batshit crazy so I won't go there ;)
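A small sketch of the "lying object" problem, assuming an optimizer rule that decides purely on the class. ToyWritableEmpty and toy_short_circuit? are hypothetical names made up for this example:

```ruby
require 'set'

# Hypothetical: an "empty" relation whose internal state has been made writable.
class ToyWritableEmpty
  def initialize
    @tuples = Set.new
  end

  # Mutates internal state even though the class claims to be empty.
  def insert(new_tuples)
    @tuples.merge(new_tuples)
    self
  end

  def include?(tuple)
    @tuples.include?(tuple)
  end
end

# Hypothetical optimizer rule that trusts the class rather than the contents.
def toy_short_circuit?(relation)
  relation.is_a?(ToyWritableEmpty)
end

empty = ToyWritableEmpty.new
empty.insert([[1, 'a']])
empty.include?([1, 'a'])   # => true, the object now holds a tuple
toy_short_circuit?(empty)  # => true, yet the optimizer would still drop it
```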

Variable relations

After thinking about this problem this afternoon, while cleaning up axiom-optimizer, I thought that maybe delegation might be an elegant answer. I can have something that delegates all the methods to a relation, and uses the return value of #insert, #update and #delete to update its own @relation internally. I think this will provide the behaviour we want with minimal overhead.
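Roughly, the delegation could look like the sketch below. ToyImmutableRelation and ToyVariable are hypothetical classes written for this illustration, not the planned Axiom::Relation::Variable; only #insert and #delete are shown, and #update would follow the same pattern:

```ruby
require 'set'

# Hypothetical immutable relation: insert/delete return a new object.
class ToyImmutableRelation
  include Enumerable

  def initialize(tuples = [])
    @tuples = Set.new(tuples)
  end

  def insert(new_tuples)
    self.class.new(@tuples | new_tuples)
  end

  def delete(old_tuples)
    self.class.new(@tuples - old_tuples)
  end

  def each(&block)
    return to_enum unless block
    @tuples.each(&block)
    self
  end
end

# Hypothetical variable wrapper: reads are delegated to the wrapped relation,
# writes rebind @relation to whatever the wrapped relation returns.
class ToyVariable
  include Enumerable

  def initialize(relation)
    @relation = relation
  end

  def insert(new_tuples)
    @relation = @relation.insert(new_tuples)
    self
  end

  def delete(old_tuples)
    @relation = @relation.delete(old_tuples)
    self
  end

  def each(&block)
    return to_enum unless block
    @relation.each(&block)
    self
  end
end

variable = ToyVariable.new(ToyImmutableRelation.new)
variable.insert([[1, 'a']])
variable.include?([1, 'a'])  # => true
```

The wrapped relation never changes, so anything that matches on its class keeps seeing an honest object; only the wrapper's reference moves.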

Interestingly enough this was basically the first solution I offered to @solnic in IRC. I thought it was more of a quick-fix or a hack, but now I realize it was probably the simplest and most elegant solution to the problem. It's funny how the first thing that came to mind (probably) ended up being what we needed.

Final thoughts

Aside from attempting to make materialized relations mutable I haven't actually attempted to implement the variable relations idea. I've only thought about this so I could end up being wrong. I have a few strategies to attempt to make this work and I'll report back either way with my findings, even if it doesn't work out so well.

dkubb · 2013-07-26T06:19:57Z

Since I'm adding variable relations, and no longer making materialized relations themselves mutable, and you can't change the branch name of a PR, I'm going to close this PR and open a new one.

I'll make sure to link to this discussion for reference.

ghost assigned dkubb Jul 21, 2013

Change materialized relations to be mutable

a5bb005

* This branch will change materialized relations to be mutable

dkubb closed this Jul 26, 2013

dkubb deleted the mutable-materialized-relations branch July 26, 2013 06:20
