fix 347 #365
Conversation
#363 needs to be merged first |
@chicco785 could you please rebase on master, since #363 just got merged :-) |
* added code to group entities by servicepath when multiple notifications arrive at once (#208) (see the sketch after this list)
* add tests and integration tests (so we also have a test with Orion triggering such a case)
* fix notification tests to rely only on API calls (remove call to translator)
* changed error code to 400 when due to an invalid request
* upgraded Python to 3.8.5
* fix integration test to use the new docker image with entrypoint (#362)
* remove build (we can assume a build is available)
* remove volumes at the end of the test
* increase timesleep for one test
* improve code
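A hedged sketch of the grouping idea from the first bullet above, not the actual QL implementation; it assumes each notified entity carries the service path it arrived under (the key name is illustrative):

```python
# Hedged sketch, not the actual QL code: bucket incoming notification
# entities by FIWARE service path so each group lands in the right table.
from collections import defaultdict

def group_by_service_path(entities):
    groups = defaultdict(list)
    for entity in entities:
        # 'fiware_servicepath' key assumed here for illustration
        path = entity.get("fiware_servicepath", "/")
        groups[path].append(entity)
    return groups
```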
rebase completed |
@chicco785 so I've had a second thought about these delete ops. My naive interpretation was that delete should be some kind of maintenance operation you do to clean up tables after you realise you won't need them ever again, e.g. the devices that write to those tables got decommissioned. If that were the case, the current implementation would be good enough, since no data would typically be written to those tables while running a delete. But I don't think this is a realistic assumption anymore, and there could be cases where people want to delete data while the table is still being used. The QL spec is also a bit ambiguous about it, in that it says "delete all ..." on the one hand and then gives you parameters to select a subset of the data on the other.

Problem: concurrent inserts

With the current implementation we risk accidentally deleting rows that the user wanted to keep. In fact, we run the following operations outside of transaction boundaries, and possibly with an eventually consistent DB back-end:
Obviously anything can happen in between those steps; in particular, new rows could be inserted after we run step 2 but before we get to execute step 3. If that happens, those new rows will be lost.

Here's an example scenario. Say I've got a daily clean-up script to delete rows older than 24 hours. Because of network downtime (or whatever other reason) the devices whose entities get written to the table couldn't send new data in the last 24 hours, but they buffered the data and are ready to send it as soon as the network comes back up. (Or perhaps there's a message queue providing that kind of service.) The script starts running while those devices hammer the system with hundreds of records to insert. As you can imagine, we could easily lose hundreds or even thousands of records; think bulk insert.

Proposed solution

Change the SQL translator implementation so we only delete entities but not the table, e.g. along the lines of the sketch below.
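A minimal sketch of the idea, not the actual QL translator code; the entity_id and time_index column names follow QL conventions but are illustrative here:

```python
# Minimal sketch: build a parameterised DELETE that removes matching rows
# but never drops the table. Not the actual QL translator code.
def build_delete(table_name, entity_id=None, from_date=None, to_date=None):
    clauses, params = [], []
    if entity_id is not None:
        clauses.append("entity_id = ?")
        params.append(entity_id)
    if from_date is not None:
        clauses.append("time_index >= ?")
        params.append(from_date)
    if to_date is not None:
        clauses.append("time_index <= ?")
        params.append(to_date)
    where = " WHERE " + " AND ".join(clauses) if clauses else ""
    return "DELETE FROM {}{}".format(table_name, where), params
```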
This is consistent with the QL spec. Then we provide an additional endpoint to drop the table. I've got a patch I could push to this branch if we decide to go ahead with this solution.

Alternative solution

Leave things as they are but open an issue about it so the problem won't fly under our radar. |
Rather than an additional endpoint, we could have an optional parameter. Be aware that this may break several tests, because if the table stays and the next insert for the same attribute has a different format, it won't work, i.e. the insert will fail. |
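For illustration only, the optional-parameter variant mentioned above could be exercised like this; the dropTable name is hypothetical, not an agreed QL API:

```python
# Hypothetical illustration; 'dropTable' is a made-up parameter name,
# not part of the QL API. Host and entity name are illustrative.
import requests

base = "http://localhost:8668/v2/entities/Room1"

# delete matching rows only, keep the backing table
requests.delete(base)

# delete rows and also drop the backing table
requests.delete(base, params={"dropTable": "true"})
```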
cool, I'll push the patch I have in the meantime, which has an extra endpoint; then I'll get rid of it later. Doing this b/c I have other commits on top of that patch.
yep, the patch fixes the tests too. Have a look and see if there's anything to change. Like I said, later on I'll get rid of the new endpoint. Also I've fixed the delete for timescale (the other commits I was talking about earlier); if you agree, I think the easiest is for me to push those changes to this PR too and rename it to e.g. "Improved delete endpoints" or something like that. |
@c0c0n3 if that's only for your tests, it's ok, but if the plan is to have an option on the existing endpoint, I won't merge this to master |
@chicco785 yea, like I said earlier, I'll get rid of the new endpoint; it's just that I have some commits in between for timescale, so what I'm asking is this:
If not, we could have separate PRs for (1) and (2). But I think it'd be more overhead? |
yep
yep
|
…scale support is complete.
@chicco785 I think it's ready for review. To sum up:
|
fix #347
All tests assumed a null FIWARE service path, so the selective delete by 'path' was never performed and tables were always dropped, which is of course wrong behaviour.
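As a hedged illustration of the case the old tests missed, a selective delete scoped by a non-null service path might look like this (host, tenant and entity names are illustrative; the headers are the standard FIWARE ones):

```python
# Hedged sketch: delete scoped to a non-null service path. Host, tenant
# and entity names are illustrative; Fiware-Service / Fiware-ServicePath
# are the standard FIWARE headers.
import requests

headers = {"Fiware-Service": "mytenant",
           "Fiware-ServicePath": "/building1"}
r = requests.delete("http://localhost:8668/v2/entities/Room1",
                    headers=headers)
r.raise_for_status()  # only rows under /building1 should be gone
```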
What is strange, anyhow, is the fact that select count(*) in Crate returns values different from 0 until you commit.
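This looks like CrateDB's eventual consistency: deletes only become visible to queries after a refresh. As a hedged sketch (table name illustrative; REFRESH TABLE is a real CrateDB statement), a test can force visibility like so:

```python
# Hedged sketch: CrateDB makes deletes visible to queries only after a
# refresh, so COUNT(*) can lag behind a DELETE. Table name illustrative.
from crate import client

conn = client.connect("http://localhost:4200")
cursor = conn.cursor()
cursor.execute("DELETE FROM etroom WHERE entity_id = ?", ("Room1",))
cursor.execute("REFRESH TABLE etroom")  # force visibility of the delete
cursor.execute("SELECT COUNT(*) FROM etroom")
print(cursor.fetchone())  # now reflects the delete
```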