Harvest autoarchive #2368

Merged
abulte merged 16 commits into opendatateam:master from harvest-autoarchive on Jan 17, 2020

abulte
Contributor

@abulte abulte commented Dec 23, 2019

Fix datagouv/data.gouv.fr#157

This PR:

  • adds an "archived" badge to the datasets list in admin
  • adds an "archived" badge to the dataset view in admin
  • adds an autoarchive attribute to HarvestSource (defaults to True)
  • adds an autoarchive step to the harvest process: it archives datasets that are attached to the current harvester but no longer present on the remote platform
  • fixes a bug on the active field of the HarvestSource form
  • exposes the dataset.archived attribute in the API (read and write)
  • supports a HARVEST_AUTOARCHIVE_GRACE_DAYS setting: a dataset is archived only once its last harvest date is at least HARVEST_AUTOARCHIVE_GRACE_DAYS days in the past. This avoids archiving datasets that only temporarily disappear from the remote platform (because of a remote platform bug or a flaw in the harvesting logic)
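
To illustrate the autoarchive rule and the grace period described above, here is a minimal standalone sketch. It is not the code from this PR: the function name `datasets_to_archive`, the plain-dict data layout, and the default of 7 days are assumptions made for the example, and the comparison assumes "at least N days old" is the intended meaning of the setting.

```python
from datetime import datetime, timedelta

# Assumed default for illustration only; the real setting is
# HARVEST_AUTOARCHIVE_GRACE_DAYS and its actual default is not stated here.
GRACE_DAYS = 7


def datasets_to_archive(local, remote_ids, now, grace_days=GRACE_DAYS):
    """Return ids of datasets that should be archived.

    local      -- hypothetical mapping {dataset_id: last_harvest_datetime}
                  for datasets attached to the current harvest source
    remote_ids -- set of dataset ids seen on the remote platform in this run
    now        -- reference datetime for the current harvest run
    """
    # A dataset must have been last harvested on or before this cutoff.
    cutoff = now - timedelta(days=grace_days)
    return [
        dataset_id
        for dataset_id, last_harvest in local.items()
        # Missing remotely AND not harvested since the cutoff.
        if dataset_id not in remote_ids and last_harvest <= cutoff
    ]


now = datetime(2020, 1, 17)
local = {
    "a": datetime(2020, 1, 17),  # still present remotely
    "b": datetime(2020, 1, 1),   # missing remotely, beyond the grace period
    "d": datetime(2020, 1, 15),  # missing remotely, still within the grace period
}
result = datasets_to_archive(local, remote_ids={"a"}, now=now)
```

With these inputs only "b" is selected: "a" is still on the remote platform, and "d" disappeared too recently to be archived.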

TODO:

$ udata harvest run 5dfa334e1077e4b5aec33000
➢ Harvesting source "5dfa334e1077e4b5aec33000"
DEBUG: Initializing backend
DEBUG: Queued 10 items
DEBUG: Processing: wisconsin-polling-places
DEBUG: Processing: us-national-foreclosure-statistics-january-2012
DEBUG: Processing: gold-prices-london-1950-2008-monthly
DEBUG: Processing: afghanistan-election-districts
DEBUG: Processing: varicella-chickenpox-incidence-and-mortality-and-after-vaccine
DEBUG: Processing: us-tobacco-usage-statistics
DEBUG: Processing: state-workforce-generation-2011-2015
DEBUG: Processing: crime-data-ten-most-populous-cities-us
DEBUG: Processing: london-deprivation-index
DEBUG: Processing: florida-bike-lanes
DEBUG: Running autoarchive
DEBUG: Archiving dataset 5e0083cf1077e41d4a255e77: A archivér

@abulte abulte marked this pull request as ready for review January 15, 2020 08:27
@abulte abulte requested a review from quaxsze January 15, 2020 08:28
@abulte abulte merged commit 287af69 into opendatateam:master Jan 17, 2020
@abulte abulte deleted the harvest-autoarchive branch January 17, 2020 09:12
Successfully merging this pull request may close these issues.

Harvesters: handling of data deleted on the remote platform
2 participants