datastore_active is being rewritten due to race condition #3245
CKAN Version if known (or site URL)
Please describe the expected behaviour
When a file resource is saved to the FileStore, datapusher is notified about it, downloads the resource, processes the file, creates a new table in the DataStore and sets the datastore_active flag for that particular resource.
Please describe the actual behaviour
It's working as described, though when uploading multiple resources (e.g. through the API), there is a race condition probably in
What steps can be taken to reproduce the issue?
In the httpd conf file, the race condition can be "resolved" by forcing a single process and a single thread for CKAN.
A nice solution would be to make package_update atomic, but that needs a new error class (and API version) for conflicts that should be retried.
A less-nice solution is making sure datapusher only updates one resource at a time per dataset. Not a perfect solution, but it will be much less likely to fail.
A quick hack would be to sneak the datastore_active change past package_patch/package_update so it doesn't update all the other dataset metadata fields, then make sure to re-index the dataset.
@amercader package_patch calls package_show+package_update to change the values passed. When the datastore just needs to set datastore_active=True on a resource, that's a lot of extra work and a much larger chance of conflict with an action in progress.
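To make the conflict concrete, here is a minimal sketch of the read-modify-write pattern described above. It is not CKAN code: `make_db`, `fake_package_show`, `fake_package_update` and `set_flag_directly` are hypothetical in-memory stand-ins, and the interleaving is forced by hand so the lost update is reproducible.

```python
import copy


def make_db():
    # Hypothetical in-memory stand-in for one dataset with two resources.
    return {"resources": [
        {"id": "r1", "datastore_active": False},
        {"id": "r2", "datastore_active": False},
    ]}


def fake_package_show(db):
    # Stand-in for reading the whole dataset dict.
    return copy.deepcopy(db)


def fake_package_update(db, pkg_dict):
    # Stand-in for writing the whole dataset dict back.
    db["resources"] = copy.deepcopy(pkg_dict["resources"])


def with_flag(pkg_dict, res_id):
    # Set datastore_active on one resource in a private copy.
    for res in pkg_dict["resources"]:
        if res["id"] == res_id:
            res["datastore_active"] = True
    return pkg_dict


def set_flag_directly(db, res_id):
    # Targeted write: touch only the one field, nothing else.
    for res in db["resources"]:
        if res["id"] == res_id:
            res["datastore_active"] = True


def flags(db):
    return {res["id"]: res["datastore_active"] for res in db["resources"]}


def lost_update():
    db = make_db()
    a = fake_package_show(db)  # caller A reads
    b = fake_package_show(db)  # caller B reads before A has written
    fake_package_update(db, with_flag(a, "r1"))
    fake_package_update(db, with_flag(b, "r2"))  # overwrites A's change
    return flags(db)


def targeted_update():
    db = make_db()
    set_flag_directly(db, "r1")
    set_flag_directly(db, "r2")
    return flags(db)
```

In this sketch `lost_update()` leaves `r1` with `datastore_active` still `False`, because the second whole-dataset write was built from a stale copy, while `targeted_update()` leaves both flags set.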
Skipping all the validation and work done by package_update by changing the model directly would be a layering violation, but I could see the argument for it in this case, and it's unlikely to harm any real CKAN instances. We are already doing something like this in our bulk actions that set the state to public/private.
I might not get to this very quickly, so if someone wants to submit a PR, here is the code that bulk updates the dataset private flag without going through package_update: https://github.com/ckan/ckan/blob/master/ckan/logic/action/update.py#L1138-L1149
And the place we need the resource-updating code is: https://github.com/ckan/ckan/blob/master/ckanext/datastore/logic/action.py#L147-L153
While trying to reproduce the issue, I wrote a script that creates ~20 resources in different threads, and it turned out that only one resource ends up in the package (as a result of this issue). So I decided that the quickest fix won't work: it is a solution for the datastore only. Instead I implemented the simplest decorator for atomic actions, which uses a conditional lock (on the other hand, there is less chance of breaking anything). How about this?
(But I don't know how to test it... creating threads inside a CKAN test looks quite cumbersome...)
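For readers following along, a rough sketch of what a lock-based "atomic action" decorator could look like, together with a thread-based check. This is not the implementation proposed in this thread: `atomic`, `add_resource`, `PACKAGE` and `run` are hypothetical names, and the store is a plain in-memory dict. Note that a `threading.Lock` only serialises threads inside one process, so it would not help when CKAN runs in several processes.

```python
import functools
import threading
import time

_locks = {}                       # one lock per key (e.g. per dataset id)
_locks_guard = threading.Lock()   # protects the _locks dict itself


def atomic(key_func):
    """Serialise calls that share the same key (hypothetical helper)."""
    def decorator(action):
        @functools.wraps(action)
        def wrapper(*args, **kwargs):
            key = key_func(*args, **kwargs)
            with _locks_guard:
                lock = _locks.setdefault(key, threading.Lock())
            with lock:
                return action(*args, **kwargs)
        return wrapper
    return decorator


# Hypothetical in-memory stand-in for a dataset.
PACKAGE = {"resources": []}


@atomic(lambda package_id, resource: package_id)
def add_resource(package_id, resource):
    current = list(PACKAGE["resources"])          # read
    time.sleep(0.001)                             # widen the race window
    PACKAGE["resources"] = current + [resource]   # write back


def run(n=20):
    # Create n resources from n threads, as in the reproduction script.
    PACKAGE["resources"] = []
    threads = [
        threading.Thread(target=add_resource, args=("pkg", i))
        for i in range(n)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(PACKAGE["resources"])
```

With the decorator in place, `run(20)` should end up with all 20 resources; removing `@atomic(...)` from `add_resource` typically loses most of them, which mirrors the behaviour described above.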