Clean up datastore emulator workarounds#585
Conversation
|
I am generally alright with this fix. Is there a reason to prefer adding the emulator or not to the context, and not to also lookup the os.environ in query? (just wondering if we need to store that context or not) To be clear, I think we could leave it as is, just curious if there is a reason to prefer one over the other. |
|
Yeah, actually I like the idea of doing the environ lookup in the query because I think it would make testing simpler. But I just had an additional idea: would it just make sense to fall back to the |
That might work (just looked at the code) 👍. I suppose the concern there might be other subtle things that might come along? I think for now I would prefer moving to the os.environ based approach, just to avoid bringing context and additional state into things. Once you have pushed that I can kick off CI :) |
I was more like thinking of performance issues, but on second thought, that really shouldn't be an issue. |
…n up related hacks
| more_results=_datastore_query.NOT_FINISHED, | ||
| skipped_results=1000, | ||
| entity_results=[], | ||
| end_cursor=b"dontlookatme", |
There was a problem hiding this comment.
Looking at this change it seemed reasonable, but I am a bit surprised to see test changes. Can you explain why?
There was a problem hiding this comment.
Short version: the buggy behavior of #575 & #584 was being guarded by tests.
Extended: in this PR I am basically restoring the behavior before #528 for the non-emulator case so I am more or less reverting the test changes of that PR as well.
More extended: these local line changes here and the other 1-2 line changes are there because of the following behavior was changed: between #528 and this PR, if an empty batch came from the datastore, then it was assumed that it's a datastore emulator and therefore skipped_cursor was used instead of end_cursor. I don't think that this is a good behavior and always the end_cursor was used before #528 so I think this should be a safe change.
And of course test_count_by_skipping_emulator had to be rewritten because the emulator behavior was changed entirely.
|
Thanks @gaborfeher. I would like if @chrisrossi or @andrewsg could weigh in as well. |
andrewsg
left a comment
There was a problem hiding this comment.
Looks great. Thank you very much!
This my proposed fix for: #584
The idea is to store a boolean to remember the type of the backend (emulator or real) during the whole life of the client. I am not super happy about this but I think it's still an improvement over the current way things work. There is already branching in the code based on the type of the backend, but it's basically a guesswork based on the behavior of the backend. And that seems to be fragile, just like my issues demonstrate. After this PR, the branching would be made explicit, based on the actual backend type.
An interesting side-effect of this change is that the change in line 181 actually also fixes #575 on my data, because the backend in my case doesn't send back an empty batch if the correct cursor is used in the query. However, based on #386 the backend might still send empty batches in some legit cases, and anyways, better safe than sorry, so I've kept the change in line 172.
Unfortunately this breaks some tests, I think mostly because
conftest.in_contextalways creates a non-emulator context, but in some cases we'd need an emulator context.So if you like this PR, please give me some pointers on how to parameterize
conftest.in_context, or just feel free to take it over.In any case, I think it would be a good idea to test the final PR on my data.