Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

As of reach dependent on view commit history #4011

Closed
n8layman opened this issue Aug 4, 2022 · 18 comments · Fixed by dolthub/go-mysql-server#1159
Closed

As of reach dependent on view commit history #4011

n8layman opened this issue Aug 4, 2022 · 18 comments · Fixed by dolthub/go-mysql-server#1159

Comments

@n8layman
Copy link

n8layman commented Aug 4, 2022

We observed a strange problem where a view we had established to union three long running tables suddenly started always referring to HEAD when using as of. After some exploration we found out that this occurs when a view is dropped and then added back in. It appears that a newly created view can't use as of to reach further back into the commit history than it's own creation commit.

I also wonder if this is a more general issue with as of. For example, we've been dropping tables then immediately re-adding them back in without an intervening commit in order to handle changes to the schema. This was a cool trick but can dolt's as of handle that? Or will it stop at the last time a table was added?

reprex:

dolt sql -q "create table test (pk int, c1 int, primary key(pk))" 
dolt sql -q "insert into test values (1,2), (2,4), (3,6)"
dolt commit -am "create test table and add values"
dolt sql -q "select * from test" 

dolt sql -q "create view test_view as select * from test where c1 < 3"
dolt sql -q "select * from test_view" 

dolt sql -q "insert into test values (4,1), (5,3), (6,5)"
dolt commit -am "add some more values"

dolt sql -q "select * from test" 
dolt sql -q "select * from test_view as of 'HEAD'"
dolt sql -q "select * from test_view as of 'HEAD^'"

dolt sql -q "alter table test add column c2 int"
dolt commit -am "Added column c2"
dolt sql -q "select * from test_view as of 'HEAD'"
dolt sql -q "select * from test_view as of 'HEAD^'"
dolt sql -q "select * from test_view as of 'HEAD~2'"

dolt sql -q "drop view test_view"
dolt commit -am "drop test_view"

dolt sql -q "create view test_view as select * from test where c1 < 3"
dolt commit -am "re-add test_view"
dolt sql -q "select * from test_view as of 'HEAD'"
dolt sql -q "select * from test_view as of 'HEAD^'"
dolt sql -q "select * from test_view as of 'HEAD~2'" 
@timsehn
Copy link
Sponsor Contributor

timsehn commented Aug 4, 2022

@fulghum will scope and assign to someone. This is definitely a bug in as of.

@n8layman
Copy link
Author

n8layman commented Aug 4, 2022

Here's the doltr issue relating to this bug:
ecohealthalliance/doltr#53

@fulghum
Copy link
Contributor

fulghum commented Aug 4, 2022

Hey @n8layman, thanks for reporting this. I'll dig into this one this morning and see what I can find. I'll test with views and tables and keep this issue updated with what I find.

@fulghum
Copy link
Contributor

fulghum commented Aug 4, 2022

I ran through the repro steps above, as well trying some other ways, but I haven't been able to reproduce this issue yet. So far, the returned data looks like what I was expecting.

In your repro steps above, could you point out which line is returning the incorrect results?

@n8layman
Copy link
Author

n8layman commented Aug 5, 2022

Okay so what I thought was going on was not what was really going on. I'm working with doltr, the R client for dolt, and we first noticed the problem there. I thought my reprex was capturing the problem when I was just actually being dumb and failing to count commits properly. AS OF works fine for the reprex above using the DOLT cli. Sorry for the confusion!

However AS OF for views really wasn't working for our project when we tried the same thing using doltr. I went back and tried to reproduce the problem and was eventually able to get it. As it turns out the real problem may be something completely different. I think it was because our view contains a union of tables and in those cases AS OF is getting dropped instead of passed to all the different tables involved. In a test database with only 4 commits total I get the following.

misc % dolt sql -q "select * from union_view as of 'HEAD~40000'"
+----+----+------+
| pk | c1 | c2   |
+----+----+------+
| 1  | 2  | NULL |
| 2  | 4  | NULL |
| 3  | 6  | NULL |
| 4  | 1  | NULL |
| 5  | 3  | NULL |
| 6  | 5  | NULL |
+----+----+------+

Going backwards 40k commits with a history of only 4 should definitely yield an error! And it also explains why the view is defaulting to HEAD tables and why it worked for you and not for us. I think we just need to check out the commit and go from there instead of relying on AS OF for complicated multi-table views. Argh!!

@n8layman
Copy link
Author

n8layman commented Aug 5, 2022

Adding the following to the original reprex should illustrate what I mean:

dolt sql -q "create view union_view as select * from test union select * from test"
dolt commit -am "add union_view"
dolt sql -q "select * from union_view as of 'HEAD'"
dolt sql -q "select * from union_view as of 'HEAD~5'"
dolt sql -q "select * from union_view as of 'HEAD~200'"

@fulghum
Copy link
Contributor

fulghum commented Aug 5, 2022

Hey @n8layman, thanks for these extra details. You are totally right – there's something buggy happening here. I see it now with the unioned view. I'll start digging in and report back with what I find.

@n8layman
Copy link
Author

n8layman commented Aug 5, 2022

Thanks!

@n8layman
Copy link
Author

n8layman commented Aug 5, 2022

I'm testing a qualified hash approach now. So far it seems to produce the same problem. You can't go back further than the view itself which of course makes sense using this method since you're going back in time on the view definition as well as the table data. However trying something like:

misc % dolt sql -q 'select * from `misc/fjt259bjgj01jmkj3ifl8tjeptqfduv3`.union_view;'

still always seems to use the head commits of the tables the view is operating on if there is a union involved. At least I think that's what's going on?

Continuing with the previous reprex:

misc % dolt sql -q "insert into test values (10,1,1), (11,2,2), (12,3,3)"
Query OK, 3 rows affected

dolt commit -am "add more data"

misc % dolt log
commit roitd7klbo6m6clfuau6khvu9ogbd3aj (HEAD -> main)
Author: nate <layman@ecohealthalliance.org>
Date:  Fri Aug 05 10:56:36 -0700 2022

        add more data

commit fjt259bjgj01jmkj3ifl8tjeptqfduv3
Author: nate <layman@ecohealthalliance.org>
Date:  Thu Aug 04 21:08:41 -0700 2022

        add union_view
        
misc % dolt sql -q 'select * from `misc/roitd7klbo6m6clfuau6khvu9ogbd3aj`.union_view;'
+----+----+------+
| pk | c1 | c2   |
+----+----+------+
| 1  | 2  | NULL |
| 2  | 4  | NULL |
| 3  | 6  | NULL |
| 4  | 1  | NULL |
| 5  | 3  | NULL |
| 6  | 5  | NULL |
| 7  | 1  | 1    |
| 8  | 1  | 1    |
| 9  | 1  | 1    |
| 10 | 1  | 1    |
| 11 | 2  | 2    |
| 12 | 3  | 3    |
+----+----+------+

misc % dolt sql -q 'select * from `misc/fjt259bjgj01jmkj3ifl8tjeptqfduv3`.union_view;'
+----+----+------+
| pk | c1 | c2   |
+----+----+------+
| 1  | 2  | NULL |
| 2  | 4  | NULL |
| 3  | 6  | NULL |
| 4  | 1  | NULL |
| 5  | 3  | NULL |
| 6  | 5  | NULL |
| 7  | 1  | 1    |
| 8  | 1  | 1    |
| 9  | 1  | 1    |
| 10 | 1  | 1    |
| 11 | 2  | 2    |
| 12 | 3  | 3    |
+----+----+------+

@n8layman
Copy link
Author

n8layman commented Aug 5, 2022

Oh as an aside:

For me single quotes works here

misc % dolt sql -q 'select * from `misc/fjt259bjgj01jmkj3ifl8tjeptqfduv3`.union_view;' 

but double quotes doesn't

misc % dolt sql -q "select * from `misc/fjt259bjgj01jmkj3ifl8tjeptqfduv3`.union_view;"

Not sure why...

@fulghum
Copy link
Contributor

fulghum commented Aug 5, 2022

It looks like this is specific to union queries. The analyzer needs to set the as of information for all the tables involved in the view, but the union node is currently preventing that. I'm going to add a special case for union to allow the as of info to bet set on the tables involved and will keep poking around to see if there's anything else going on.

@n8layman
Copy link
Author

n8layman commented Aug 5, 2022

Do you think it would happen with joins as well?

@fulghum
Copy link
Contributor

fulghum commented Aug 5, 2022

From the code I'm looking at, joins should be processed just fine, but I'm going to check that out, too, and make sure we have automated tests for different types of views using as of.

@fulghum
Copy link
Contributor

fulghum commented Aug 8, 2022

Just a quick update... applying the AsOf expression to a view that uses a union statement is working now, but while testing, I found some related bugs with setting AsOf for tables in subqueries. I've got tests passing for those cases now, so will wrap up the PR and get it merged in before the next release of Dolt later this week.

@n8layman
Copy link
Author

n8layman commented Aug 8, 2022

Awesome. Thanks for looking in to it!

@fulghum
Copy link
Contributor

fulghum commented Aug 9, 2022

I got the fix merged into go-mysql-server and will get that new GMS version into Dolt tonight. After that, this fix will be all queued for the next Dolt release.

@fulghum
Copy link
Contributor

fulghum commented Aug 9, 2022

Hey @n8layman, we just released the fix for this issue in version 0.40.23 of Dolt.

Thanks again for taking the time to report this issue and giving us such helpful repro steps! Please don't hesitate to ping us on Discord or open more issues on GitHub if you hit any more snags at all with Dolt.

@n8layman
Copy link
Author

n8layman commented Aug 9, 2022

Thanks again for addressing it so fast!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants