Change the way we're handling dataset resource updates to changed time #2968
This PR changes how DKAN decides whether to add a "revision log" entry to a dataset when one of its resources changes. We currently get a lot of extra dataset revisions triggered by resource changes. Rather than trying to check whether the update happened during a harvest, this simply checks whether the resource revision is more than a minute older than the dataset's updated date. If not, the change is not worth recording.
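To make the threshold check concrete, here is a minimal, language-neutral sketch in Python. It is not the PR's actual code (DKAN itself is PHP). The function name `should_log_revision`, its parameters, and the constant `REVISION_THRESHOLD` are hypothetical, and the sketch assumes the comparison means "the resource changed more than one minute after the dataset was last updated".

```python
from datetime import datetime, timedelta

# Assumption: the description says "more than a minute", so the threshold
# is modeled as a strict one-minute gap. The name is hypothetical.
REVISION_THRESHOLD = timedelta(minutes=1)


def should_log_revision(
    resource_changed: datetime,
    dataset_modified: datetime,
    threshold: timedelta = REVISION_THRESHOLD,
) -> bool:
    """Return True when the resource change is far enough from the
    dataset's last update to be worth a revision log entry.

    Assumption: "far enough" means the resource changed strictly more
    than `threshold` after the dataset's updated date.
    """
    return resource_changed - dataset_modified > threshold


# Example: a dataset last updated at noon.
dataset_modified = datetime(2020, 1, 1, 12, 0, 0)

# Resource saved 30 seconds later (e.g. as part of the same automated
# save): no separate revision is recorded.
same_operation = should_log_revision(
    datetime(2020, 1, 1, 12, 0, 30), dataset_modified
)

# Resource saved five minutes later: treated as a distinct change.
later_change = should_log_revision(
    datetime(2020, 1, 1, 12, 5, 0), dataset_modified
)
```

Under these assumptions, `same_operation` is `False` and `later_change` is `True`. Comparing timestamps this way avoids needing to know whether a harvest is in progress, which is the point the description makes.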
This should provide more consistency and accuracy, and reduce extra revisions created through automated processes.
Finally, this removes the queue functionality and performs the update immediately. If this makes harvests take much longer to finish we can revisit, but as implemented, the queueing defeats the whole purpose of the revision logging! If the logging is queued and happens hours (or days) later, the updated timestamp will still be inaccurate, just for different reasons.
QA Steps
I'm not sure how to QA this beyond the automated tests. Let's discuss if there are concerns!