Make table.filter.update atomic #3992
Comments
I think we should rewrite |
Er, ignore that comment, I forgot for a second that we don't have range writes even for whole tables. I think we should rewrite it to |
@mlucy could we also provide or do we already have an atomicity guarantee for |
|
Moving this to 2.1 after talking with @coffeemug about it. He pointed out that this could be perceived as "compare and set" failing, which would be bad to still have in the upcoming release. (@danielmewes, does that seem fine?) |
One disadvantage of this proposal is that it's less obvious which operations are atomic. Right now, there's a simple rule: |
I think the opposite is true. If I run I think the latter option (changing the implementation in these cases) is much better, because if we error there will be a huge class of queries that will break (and passing |
I think this proposal is still a bit half baked. If we make We cannot do it for all commands that return a selection unfortunately, since for example for The |
So basically I think we should have a pretty clear rule when we provide this guarantee and when we do not. An alternative might be to document the fact that |
Maybe we could add a new We could even error when applying |
Another option could be to not do this rewrite but instead to generate an error whenever a non-delete write is applied to a (Edit: @coffeemug had already mentioned this, and I agree that it's not a great solution.)
The query isn't executed atomically, the update is executed atomically. The conjunction of the update and filter is not executed atomically. I think that if we find the right way to phrase this, we could make it super simple for users to understand. Maybe we could say "the only thing that's executed atomically is the function you pass to a The problem with the alternative is that there isn't a clear rule. Suppose we make |
Ok, thinking about this a bit more, this looks pretty subtle to sneak into 2.1. I wouldn't be opposed to moving it into subsequent (and in fact, think we should do that), and fixing it later after we think it through more carefully. We should probably document this better -- I think that can go a long way. |
👍 on documenting this better and coming up with a consistent solution later. I'll open separate documentation issues tomorrow. |
I would rather we do this in 2.1 if we could. I think the current behavior is both bad and unintuitive. IMO we should make it so every term that can easily be made atomic is. Even if we leave the docs the way they are so that the official position is that I think it's OK to officially not guarantee something, but to make the database as safe as possible by default for people who don't think to read the official guarantees. (Think e.g. how GCC turns off strict aliasing by default.) I'd be happy to put off arguing about whether or not to guarantee certain things are atomic, and whether or not to introduce an atomic selection type, until 2.2, but I really don't see a reason not to make the default behavior safe for all selections where it's easy. |
👍 for fixing places where it's easy for now, and adding |
I'm not a fan of this, since it seems like it will make it harder for users to find out what the actual guarantees are during their testing / evaluation phase, but if you feel strongly that we should do this I will not veto it.
We should make sure that our additional guarantee isn't leaked, though. Consider:
table.filter(r.table("other").get(r.row("reference"))).update({counter: r.row("counter").add(1)})
The filter in this query is not atomic, but the update (incrementing the counter) is. This is a totally legitimate and probably not uncommon query. If we now apply the proposed rewrite rule, this query will fail, because the rewritten update function contains the filter predicate and is therefore no longer atomic. That would leak the fact that we're doing something magical underneath, and the error would probably be very confusing.
This, by the way, is also a general problem with the proposal. Since an update function can be either fully deterministic or fully non-deterministic, we would need a way to disable this kind of rewriting. Otherwise a large set of legitimate queries will become impossible to run.
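The leak described in this comment can be illustrated with a small model. This is only a sketch in plain Python, not RethinkDB code: `Term`, `branch_rewrite`, `run_update` and `NonAtomicError` are hypothetical names, and the model assumes the server rejects a non-atomic update function unless a `non_atomic` flag is passed.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Term:
    """Hypothetical stand-in for a query function plus its atomicity."""
    fn: Callable[[dict], object]
    atomic: bool


class NonAtomicError(Exception):
    """Raised when a non-atomic update runs without the non_atomic flag."""


def branch_rewrite(f: Term, u: Term) -> Term:
    # Model of the proposed rewrite: the filter predicate is moved into the
    # update function, so the result is atomic only if both parts are.
    return Term(fn=lambda row: u.fn(row) if f.fn(row) else {},
                atomic=f.atomic and u.atomic)


def run_update(row: dict, u: Term, non_atomic: bool = False) -> dict:
    # Assumed server rule: refuse non-atomic updates unless explicitly allowed.
    if not u.atomic and not non_atomic:
        raise NonAtomicError("update function is not atomic")
    return {**row, **u.fn(row)}


other = {"x": {"id": "x"}}  # plays the role of r.table("other")
f = Term(lambda row: other.get(row["reference"]) is not None, atomic=False)
u = Term(lambda row: {"counter": row["counter"] + 1}, atomic=True)
row = {"reference": "x", "counter": 1}

# Without the rewrite, the update on its own is accepted.
unrewritten = run_update(row, u)

# With the rewrite, the same user query is rejected.
try:
    run_update(row, branch_rewrite(f, u))
    rewrite_rejected = False
except NonAtomicError:
    rewrite_rejected = True

# Passing the flag makes it run again, at the cost of atomicity.
flagged = run_update(row, branch_rewrite(f, u), non_atomic=True)
```

In this model the user never wrote a non-atomic update function, yet the rewritten query fails until the flag is passed, which is the confusing error the comment warns about.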
I think that you should provide atomicity guarantees wherever possible, even if the API is not 100% consistent, just document the use cases that are atomic, users that care about this will understand it is a work in progress. I for one would benefit a lot from this and documentation listing what can be atomic and what can't. |
I usually worry about that a lot as well, but in this case I think it doesn't apply because most people writing I think we should leave questions about what to do with
I don't feel strongly enough that this is the right thing to do that I want to push ahead unilaterally. @coffeemug, what do you think? |
(Moving back to 2.1 since it looks like we might do some work on this for that release and I don't want the issue to get lost.) |
I don't have a strong opinion. I can see both points of view, and they both make sense to me. I wouldn't be opposed to making a subset of our queries behave under tighter semantics under the hood for now, and I wouldn't be opposed to holding off until we can get a more robust solution in place. Sorry, I'm not a good tie-breaker here. I suppose the fact that I don't feel strongly that we should tighten up the code now means I'm in favor of doing nothing (but I'm not opposed to tightening up the code either). |
I still don't like even just doing If we document this, it will become extremely dangerous for users since once they use anything that's non-atomic in the filter (currently using a geo predicate is enough), the atomicity guarantee will silently disappear. If on the other hand we don't document it and don't tell anyone about it, I think it would be better to just explain properly which exact operations are atomic and which are not. I think we can do a lot here with improving documentation. We would mention it in the |
Bumping this to 2.2 since it was apparently never settled. We should try to settle it during the next discussion period.
Hi all - I just saw a link in the consistency documentation (https://www.rethinkdb.com/docs/consistency/) pointing to this thread and was wondering if there was any new information regarding making the filter.update process atomic? Thanks! |
@danielmewes -- I made this mistake myself just recently when answering a Stack Overflow question. I really think we should try to do something better here.
I just ran into this again. Could we change |
@AtnNn I like that proposal. We also need to be somewhat careful in how we name and describe this, since a We need to make it clear that there's still not going to be any atomicity across documents, and that the only thing it does is that it turns the filter predicate into a CAS update. |
Queries such as
r.table(t).filter(f).update(u)
seem atomic because there is no error when the non_atomic flag is not set. However, it is possible for u to be applied when f doesn't hold (see rethinkdb/docs#679 (comment)).
A possible work-around is
r.table(t).filter(f).update((row) -> r.branch(f(row), u(row), {}))
This is a proposal for adding support in the server for making entire queries or subqueries atomic, by either guaranteeing to the user that queries like this one are atomic or letting the user specify that they should be without needing to duplicate any logic into the update function.
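The race and the work-around described above can be modelled in a few lines. This is a toy in-memory sketch in Python, not the server's implementation: `filter_then_update`, `guarded`, the `interleave` hook and the `stock` example are all hypothetical, and the hook simply stands in for a concurrent writer that runs between the filter step and the update step.

```python
def filter_then_update(table, f, u, interleave=None):
    """Evaluate f over all rows first, then apply u to each matched row.

    `interleave` is a hypothetical hook that simulates another client
    writing to the table after the filter ran but before the update.
    """
    matched = [key for key, row in table.items() if f(row)]
    if interleave is not None:
        interleave(table)
    for key in matched:
        # u is applied to the row as it is now, not as it was when filtered.
        table[key] = {**table[key], **u(table[key])}


def guarded(f, u):
    """The work-around: re-check f inside the update function itself."""
    return lambda row: u(row) if f(row) else {}


def has_stock(row):
    return row["stock"] > 0


def take_one(row):
    return {"stock": row["stock"] - 1}


def concurrent_writer(table):
    # Another client takes the last item between filter and update.
    table["a"] = {**table["a"], "stock": 0}


plain = {"a": {"stock": 1}}
filter_then_update(plain, has_stock, take_one, concurrent_writer)

safe = {"a": {"stock": 1}}
filter_then_update(safe, has_stock, guarded(has_stock, take_one),
                   concurrent_writer)

print(plain["a"]["stock"])  # -1: u was applied although f no longer held
print(safe["a"]["stock"])   # 0: the guarded update was a no-op
```

The guarded version only protects each document individually. It does not make the filter and update atomic across documents, and it requires duplicating the predicate inside the update function, which is the duplication this proposal aims to avoid.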