This repository has been archived by the owner on Dec 13, 2022. It is now read-only.

Verify Quebec official testing dataset #70

Closed
jeanpaulrsoucy opened this issue Jan 31, 2021 · 1 comment

Comments

@jeanpaulrsoucy
Member

All official QC testing numbers have recently been added as an experimental dataset (22a3fc30feeded1a27457853f02f9acea8245d7): https://github.com/ccodwg/Covid19Canada/blob/master/official_datasets/qc/qc_testing_datasets_prov.csv

These reflect the data underlying figures 4.1, 4.2, and 4.3 on the INSPQ data page: https://www.inspq.qc.ca/covid-19/donnees

However, it is odd that figures 4.1 and 4.2 end on different dates (t-1 and t-2, respectively), despite apparently reporting the same metric. This should be investigated and the variable names adjusted if necessary. The methodology page could help, especially the sources section: https://www.inspq.qc.ca/covid-19/donnees/methodologie#sources
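
One way to check the end-date mismatch described above is to find the last date on which each series has a value. The following is a minimal sketch, not the repository's own tooling: the column names `fig_4_1_cases` and `fig_4_2_cases`, the `date` column, and the sample rows are all hypothetical, and the sketch assumes ISO-formatted dates (which sort correctly as strings) and that missing values are blank or `NA`.

```python
import csv
import io


def last_reported_date(csv_text, column, date_column="date"):
    """Return the latest date on which `column` has a non-missing value, or None."""
    last = None
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = (row.get(column) or "").strip()
        # Treat blank cells and "NA" as missing (an assumption about the file).
        if value and value.upper() != "NA":
            # ISO dates (YYYY-MM-DD) compare correctly as plain strings.
            if last is None or row[date_column] > last:
                last = row[date_column]
    return last


# Made-up rows for illustration only; column names are hypothetical.
sample = (
    "date,fig_4_1_cases,fig_4_2_cases\n"
    "2021-01-28,10,10\n"
    "2021-01-29,12,12\n"
    "2021-01-30,15,\n"
)

end_4_1 = last_reported_date(sample, "fig_4_1_cases")
end_4_2 = last_reported_date(sample, "fig_4_2_cases")
print(end_4_1, end_4_2)
```

In this made-up sample the first series ends one day later than the second, which mirrors the t-1 versus t-2 pattern noted above.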

This is part of the larger initiative of making official datasets available and compatible (#53).

@craigthusiast

Thanks for adding this QC data in! It is definitely odd that the latest reporting date in figure 4.1 is more recent (by a day) than figure 4.2 on the INSPQ page. At least the # of confirmed cases on a particular day in 4.1 matches the # of confirmed cases for that same day in 4.2.

I think you need to rename your columns representing figure 4.2:
"unique_people_tested" should be "admissible_tests"
"unique_people_tested_positive" should be "admissible_tests_positive"
"unique_people_tested_negative" should be "admissible_tests_negative"
"unique_people_tested_positivity_percent" should be "admissible_tests_positivity_percent"

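As a rough illustration of the renames suggested above, the header row of the CSV could be rewritten as follows. This is only a sketch: the old and new names come from the list above, while the sample rows and the `rename_header` helper are hypothetical. Names are matched exactly, so a column such as `cumulative_unique_people_tested` is left untouched.

```python
import csv
import io

# Renames proposed in the list above (old name -> new name).
RENAMES = {
    "unique_people_tested": "admissible_tests",
    "unique_people_tested_positive": "admissible_tests_positive",
    "unique_people_tested_negative": "admissible_tests_negative",
    "unique_people_tested_positivity_percent": "admissible_tests_positivity_percent",
}


def rename_header(csv_text, renames=RENAMES):
    """Return csv_text with header names replaced per `renames`; data rows are unchanged."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text
    # Exact-match lookup: names not listed in `renames` pass through as-is.
    rows[0] = [renames.get(name, name) for name in rows[0]]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Made-up rows for illustration only.
sample = (
    "date,unique_people_tested,unique_people_tested_positive,cumulative_unique_people_tested\n"
    "2021-01-29,100,10,5000\n"
)
renamed = rename_header(sample)
print(renamed)
```
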
I believe the daily "unique people tested" figure is the day-over-day difference in the "cumulative_unique_people_tested" column, which, for some reason that totally confounds me, is DRASTICALLY lower than the number of admissible tests or samples analyzed on the same day.
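
The day-over-day difference mentioned above can be sketched like this. The cumulative values are made up for illustration, and `daily_from_cumulative` is a hypothetical helper, not something from the repository.

```python
def daily_from_cumulative(cumulative):
    """Convert a cumulative series into daily values.

    The first day has no previous value to subtract, so it is reported as None.
    """
    if not cumulative:
        return []
    daily = [None]
    for prev, curr in zip(cumulative, cumulative[1:]):
        daily.append(curr - prev)
    return daily


# Made-up cumulative counts of unique people tested, one value per day.
cumulative_unique_people_tested = [1000, 1040, 1100, 1130]
daily_unique_people_tested = daily_from_cumulative(cumulative_unique_people_tested)
print(daily_unique_people_tested)  # [None, 40, 60, 30]
```

A negative difference would indicate that the cumulative series was revised downward, which is worth flagging rather than silently plotting.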

Unfortunately, their testing data is not reported for the most recent day on which they report confirmed cases, so my charts will only show daily new cases for the current reporting day (today-1), while testing data will only be displayed for the previous day (today-2) and earlier.

They calculate and publish their % positivity based on confirmed cases as a fraction of "admissible tests". Personally, I think it should be calculated based on "samples analyzed", but I'm no expert, and I guess they have their rationale for that laid out on their methodology page.
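
To make the difference between the two denominators concrete, here is a small sketch of the positivity calculation. All numbers are made up, and `positivity_percent` is a hypothetical helper; it only illustrates the arithmetic described above and is not INSPQ's actual methodology.

```python
def positivity_percent(positives, denominator):
    """Return positives as a percentage of denominator, or None if the denominator is zero or missing."""
    if not denominator:
        return None
    return 100.0 * positives / denominator


# Made-up figures for a single day, for illustration only.
confirmed_cases = 50
admissible_tests = 1000
samples_analyzed = 2000

by_admissible_tests = positivity_percent(confirmed_cases, admissible_tests)
by_samples_analyzed = positivity_percent(confirmed_cases, samples_analyzed)
print(by_admissible_tests, by_samples_analyzed)  # 5.0 2.5
```

Because samples analyzed is the larger number in this made-up example, using it as the denominator gives a lower positivity rate for the same number of confirmed cases.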
