This repository has been archived by the owner on Dec 5, 2022. It is now read-only.

Calculate MD5 hash of each fetched page and ensure that the content has changed from day to day #388

Closed
jzohrab opened this issue Aug 9, 2020 · 1 comment
Labels
critical Critical issue duplicate This issue or pull request already exists enhancement New feature or request from-cds Transferred from https://github.com/covidatlas/coronadatascraper

Comments

@jzohrab
Contributor

jzohrab commented Aug 9, 2020

Original issue https://github.com/covidatlas/coronadatascraper/issues/159, transferred here on Thursday Mar 19, 2020 at 20:10 GMT


We should report when a fetched page's content has not changed at all since the previous day's fetch. This would catch errors like the NJ dataset moving to a new URL while the old one stays accessible.
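
A minimal sketch of that check, assuming Node.js and its built-in `crypto` module. The helper names `md5` and `checkForUpdate`, and the idea of storing the previous run's hash somewhere, are hypothetical and not taken from the scraper's codebase:

```javascript
const crypto = require('crypto');

// Hex MD5 digest of a string or Buffer.
function md5(content) {
  return crypto.createHash('md5').update(content).digest('hex');
}

// Hypothetical helper: compares the freshly fetched content with the hash
// recorded on the previous run. A null previousHash means "first fetch",
// which is treated as changed because there is nothing to compare against.
function checkForUpdate(previousHash, content) {
  const currentHash = md5(content);
  return {
    currentHash,
    changed: previousHash === null || previousHash !== currentHash,
  };
}

// Simulate two consecutive days returning identical content.
const day1 = checkForUpdate(null, '<html>cases: 10</html>');
const day2 = checkForUpdate(day1.currentHash, '<html>cases: 10</html>');
if (!day2.changed) {
  console.warn('Source content has not changed since the last fetch');
}
```

An unchanged hash is only a signal, not proof of a stale source: some pages legitimately stay the same for a day, and others embed timestamps that change the hash without any new data.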

@jzohrab jzohrab added critical Critical issue enhancement New feature or request from-cds Transferred from https://github.com/covidatlas/coronadatascraper labels Aug 9, 2020
@jzohrab
Contributor Author

jzohrab commented Aug 9, 2020

(Transferred comment)

Maybe filename should be: {md5 of url.substr(8)}-{md5 of contents.substr(8)}-{8601Z timestamp}.ext, wdyt?
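
One possible reading of that naming scheme, sketched in Node.js. It assumes `substr(8)` means "the first 8 hex characters of the MD5 digest" and that the timestamp is written in compact ISO 8601 UTC form to keep `:` out of filenames. Both are guesses, and `shortMd5` and `cacheFilename` are hypothetical names:

```javascript
const { createHash } = require('crypto');

// First 8 hex characters of the MD5 digest (assumed meaning of "substr(8)").
function shortMd5(input) {
  return createHash('md5').update(input).digest('hex').slice(0, 8);
}

// Hypothetical helper building {url hash}-{content hash}-{timestamp}.ext.
function cacheFilename(url, contents, ext, date = new Date()) {
  // toISOString() yields e.g. 2020-03-19T20:10:00.000Z; strip '-', ':' and
  // the milliseconds to get the compact form 20200319T201000Z.
  const timestamp = date
    .toISOString()
    .replace(/[-:]/g, '')
    .replace(/\.\d{3}Z$/, 'Z');
  return `${shortMd5(url)}-${shortMd5(contents)}-${timestamp}.${ext}`;
}

const example = cacheFilename(
  'https://example.com/data.csv', // placeholder URL, not a real source
  'county,cases\nA,10\n',
  'csv',
  new Date('2020-03-19T20:10:00Z')
);
console.log(example);
```

With this layout, two files for the same URL share the first segment, so an unchanged source shows up as consecutive files whose second segment is also identical.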

@jzohrab jzohrab added the duplicate This issue or pull request already exists label Aug 9, 2020
@jzohrab jzohrab closed this as completed Aug 9, 2020