This repository has been archived by the owner on Dec 5, 2022. It is now read-only.

Feature: Add "caveats" for scrapers #530

Open
jzohrab opened this issue Apr 7, 2020 · 1 comment

Comments

@jzohrab
Contributor

jzohrab commented Apr 7, 2020

Description

In some scrapers, we're making justifiable assumptions about how to interpret the data (e.g., covidatlas/coronadatascraper#572 - KOR quarantines). We could hardcode these caveats in the scrapers themselves, and perhaps include them in the source output, e.g.:

[
  {
    "county": "Los Angeles County",
    "state": "California",
    "country": "United States",
...
    "url": "http://www.publichealth.lacounty.gov/media/Coronavirus/",
    "cases": 0,
    "deaths": 0,
    "caveats": [
        "some_data_here"
    ],
...
  }
]
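
A minimal sketch of how a scraper might carry its hardcoded caveats into the output shown above. It is written in Python purely for illustration; `CAVEATS` and `scrape` are hypothetical names rather than the project's actual scraper interface, and the record values are the placeholders from the example.

```python
# Hypothetical scraper module: CAVEATS and scrape() are illustrative names,
# not the project's real scraper interface.
CAVEATS = ["some_data_here"]


def scrape():
    """Return location records, each carrying this scraper's caveats."""
    # A real scraper would fetch and parse the source here; these values
    # are the placeholders from the example output.
    record = {
        "county": "Los Angeles County",
        "state": "California",
        "country": "United States",
        "url": "http://www.publichealth.lacounty.gov/media/Coronavirus/",
        "cases": 0,
        "deaths": 0,
    }
    # Copy the list so callers cannot mutate the module-level constant.
    record["caveats"] = list(CAVEATS)
    return [record]
```

Keeping the caveats next to the scraping logic means an assumption and the code that relies on it would change together.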

Perhaps these assumptions could be rolled up to the higher levels:

    "caveats": [
        "LA, CA: some_data_here",
        "PA: penn. caveats here"
    ]
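
One possible way to do that roll-up, sketched in Python under some assumptions: `rollup_caveats` is a hypothetical helper, each record is a dict shaped like the output above, and the label is built from the full `county` and `state` values rather than the abbreviations used in the example.

```python
def rollup_caveats(records):
    """Collect per-location caveats into one flat, de-duplicated list."""
    rolled = []
    for record in records:
        # Assumed label format: whichever of county/state is present.
        parts = [record[key] for key in ("county", "state") if record.get(key)]
        label = ", ".join(parts)
        for caveat in record.get("caveats", []):
            entry = f"{label}: {caveat}" if label else caveat
            # Preserve first-seen order while skipping repeats.
            if entry not in rolled:
                rolled.append(entry)
    return rolled
```

Records without a `caveats` field contribute nothing, so scrapers that make no special assumptions would not need to change.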

Why do you need this feature or component?

Publicize assumptions

Notes

For testing/regression, I don't think we'd need to check the caveats field, as it might change over time. One sanity check would be enough.
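
A sanity check along those lines could validate only the shape of the field and never its text. This is a sketch; `check_caveats` is a hypothetical helper, and it assumes caveats are optional and stored as a list of non-empty strings.

```python
def check_caveats(record):
    """Return True if the optional 'caveats' field is well-formed."""
    # A missing field is fine: most records will have no caveats.
    caveats = record.get("caveats", [])
    # Check structure only, since the wording may change over time.
    return isinstance(caveats, list) and all(
        isinstance(caveat, str) and caveat.strip() for caveat in caveats
    )
```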

@shaperilio

Yes! This also came up in the discussion of the Panama scraper, because the Panama granularity level is akin to "boroughs" (smaller than cities) and we don't have any way to store that. So if we call them counties, that detail could go in a field like this.

@jzohrab jzohrab closed this as completed Aug 9, 2020
@jzohrab jzohrab reopened this Aug 9, 2020
@jzohrab jzohrab transferred this issue from covidatlas/coronadatascraper Aug 9, 2020

2 participants