Auto update the PEI dataset by downloading pdf and extracting data via Python #36

mc51 · 2022-01-21T08:53:50Z

First off: this is an important and useful app. Thanks for the effort!
I also had the idea of making it easier to compare tests, because I was frustrated by how hard it was to find and use the data from the Paul-Ehrlich-Institut (PEI). I've built an interactive website, https://corona.pw, that displays the original PEI data in a more convenient way. (It's also on GitHub.)

The slightly tricky part was automatically extracting the data from the .pdf file. It's a bit hacky, but as long as the structure doesn't change too much, it should keep working in the future. I guess that would be a good addition to this tool? You can check the Python code for that in the repo.
I'm a bit busy right now, which is why I've created an issue instead of a PR. But if I find some time, I'll gladly open a PR.
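
For the "auto update" half of this idea, one possible approach is to hash each download and only re-run the extraction when the content has actually changed. The following is a minimal sketch using only the standard library; `fetch_bytes`, `has_changed`, and the digest file are hypothetical names made up for illustration and are not taken from either repo.

```python
import hashlib
from pathlib import Path
from urllib.request import urlopen


def fetch_bytes(url: str, timeout: float = 30.0) -> bytes:
    """Download the file at `url` and return its raw bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def has_changed(content: bytes, digest_file: Path) -> bool:
    """Return True if `content` differs from the last recorded download.

    The SHA-256 digest of the previous download is stored in `digest_file`
    (a hypothetical bookkeeping file); it is rewritten whenever the
    content changes.
    """
    new_digest = hashlib.sha256(content).hexdigest()
    old_digest = digest_file.read_text().strip() if digest_file.exists() else None
    if new_digest == old_digest:
        return False
    digest_file.write_text(new_digest)
    return True
```

A scheduled job could then call `has_changed(fetch_bytes(pdf_url), Path("pei.sha256"))` and trigger the extraction step only when it returns `True`; `pdf_url` stands for whatever download link the tool ends up using.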


Marcono1234 · 2022-02-07T03:11:33Z

It looks like this data is also available as an Excel spreadsheet on https://www.pei.de/DE/newsroom/dossier/coronavirus/coronavirus-inhalt.html?nn=169730&cms_pos=8:

("Excel-Tabellen: Vergleichende Evaluierung der Sensitivität von SARS-CoV-2 Antigenschnelltests (Selbsttests + Schnelltests)"; direct download link)

It is a bit odd that they provide the download links on two separate pages, and that the other page only contains the PDF download link, with no links between the two pages. The Excel file might also not have existed for long: at the time of writing it has v=8 in its URL, whereas the PDF file has v=77 (assuming this is a version number; its value seems to be ignored for HTTP requests, though).

Edit: Looks like the maintainers are aware of it as well and have created src/data/xlsx2all.py to parse that Excel file.
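
If the `v` query parameter mentioned above really is a version number (an assumption, as noted), it could serve as a cheap hint that a file was updated. Here is a small sketch of reading it from a download link with the standard library; `url_version` is a hypothetical helper, and the URLs in the usage line are placeholders rather than the real PEI links.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def url_version(url: str) -> Optional[int]:
    """Return the integer value of the `v` query parameter, or None.

    Treating `v` as a version counter is an assumption; None is returned
    when the parameter is missing, empty, or not an integer.
    """
    values = parse_qs(urlsplit(url).query).get("v")
    if not values:
        return None
    try:
        return int(values[0])
    except ValueError:
        return None


# Placeholder URLs for illustration only.
print(url_version("https://example.org/tests.pdf?v=77"))
print(url_version("https://example.org/tests.xlsx?v=8"))
```

Since the server seems to ignore the value, comparing it against the last seen number is only a hint; comparing the downloaded bytes would be the more reliable check.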
