Package evaluation & first pass documentation #9

Closed
Tracked by #11
krivard opened this issue Sep 22, 2021 · 12 comments

@krivard
Contributor

krivard commented Sep 22, 2021

The goal of this task is to better understand whether this package does what we need it to, and permit other Delphi members to understand the same without having to read the source.

Delphi blog posts that include plots also include the code used to produce those plots. Reproduce as many of the plots as you can from the following blog posts, using this package instead of covidcast.

Use your newly acquired knowledge to flesh out the package documentation with the basics.

@krivard
Contributor Author

krivard commented Sep 22, 2021

This is slated for Jin-Hong, tag incoming

@jaydu1
Contributor

jaydu1 commented Sep 29, 2021

I have made some plots locally. Do I need to integrate the plotting functionality into the package?

Btw, the two blog posts use R, so I am instead following the covidcast documentation (https://cmu-delphi.github.io/covidcast/covidcast-py/html/plot_examples.html). I will come back to the posts after I finish this.

@krivard
Contributor Author

krivard commented Sep 29, 2021

We do not plan for the first release to include plotting. For these tests, write your own plotting code; for the documentation, focus on data fetching.

@krivard
Contributor Author

krivard commented Sep 29, 2021

(also, welcome!)

@jaydu1
Contributor

jaydu1 commented Sep 30, 2021

Hi Katie, is there any documentation or precise definition for the fields in the new covid metadata?
There are some columns of signal_df, such as value_label and high_values_are, that I could not find on the current Epidata API website.

@krivard
Contributor Author

krivard commented Oct 1, 2021

Not yet -- that's all documented as notes on the cell headers in the source spreadsheet. I'll give you read access.

@jaydu1
Contributor

jaydu1 commented Oct 3, 2021

Thanks. I've added the docs for metadata about available sources and signals. I have some other questions, though:

  1. For the detailed metadata about each signal, covidcast gets more information (e.g. last_update, min_lag) from the Epidata API than the current package (code) does. I am not sure whether this is by design, to stay compatible with other epidata, or whether something is missing.
  2. The current package returns the detailed metadata as a DataSignal object for each signal. Its data can be accessed via, for instance, DataSignal().signal. Do we need to add a function that converts a DataSignal to a DataFrame and concatenates the metadata of all signals, as covidcast does?
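
For illustration, one possible shape for the conversion asked about in (2), as a minimal sketch only. `SignalMeta` is a hypothetical stand-in for `DataSignal` (the real class and its fields may differ), and the sample values are made up. Each object is flattened into a plain dict; a list of such dicts is an input that `pandas.DataFrame(rows)` accepts.

```python
from dataclasses import asdict, dataclass


@dataclass
class SignalMeta:
    """Hypothetical stand-in for DataSignal; field names are assumptions."""
    source: str
    signal: str
    value_label: str
    high_values_are: str


def to_rows(signals):
    """Flatten metadata objects into one dict per signal."""
    return [asdict(s) for s in signals]


# Made-up sample metadata, not real sources or signals.
signals = [
    SignalMeta("src-a", "sig-1", "Percentage", "bad"),
    SignalMeta("src-a", "sig-2", "Count", "neutral"),
]
rows = to_rows(signals)
print(rows[0]["signal"])
```

Keeping the flattening step free of pandas means the per-signal objects stay usable on their own, and the table view is built only when someone asks for it.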

@krivard
Contributor Author

krivard commented Oct 4, 2021

(1) last_update and min_lag were omitted by mistake; this package is meant to be a drop-in/relatively-quick replacement for covidcast for all data fetching applications.

(2) it's possible DataSignal is experimental, or even if not, that it was designed to be an upgrade for folks using metadata as one huge dataframe that they're constantly deduplicating. Basically the metadata results have two use cases:

  • give me all the (source, signal) pairs so I can loop over them and do some kind of analytics on every covidcast dataset
  • give me the metadata stats on a particular (source, signal) pair so I can construct queries that make sense, set my axis limits, etc

so if DataSignal will handle only the latter, we should come up with a way to make the former possible. Whether that's a data frame or some other method, I don't feel super strongly either way
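
A rough sketch of the two use cases above, assuming the metadata is available as a flat list of rows with one row per (source, signal, geo_type). The row layout, field names, and values here are illustrative assumptions, not the package's actual output.

```python
# Illustrative metadata rows; names and numbers are made up.
meta_rows = [
    {"source": "src-a", "signal": "sig-1", "geo_type": "state",
     "min_value": 0.0, "max_value": 12.5},
    {"source": "src-a", "signal": "sig-1", "geo_type": "county",
     "min_value": 0.0, "max_value": 40.0},
    {"source": "src-b", "signal": "sig-9", "geo_type": "state",
     "min_value": 1.0, "max_value": 3.0},
]


def signal_pairs(rows):
    """Use case 1: unique (source, signal) pairs, in first-seen order."""
    return list(dict.fromkeys((r["source"], r["signal"]) for r in rows))


def value_range(rows, source, signal):
    """Use case 2: overall value range for one pair, e.g. for axis limits."""
    matches = [r for r in rows if (r["source"], r["signal"]) == (source, signal)]
    if not matches:
        raise KeyError((source, signal))
    lows = [r["min_value"] for r in matches]
    highs = [r["max_value"] for r in matches]
    return min(lows), max(highs)


for source, signal in signal_pairs(meta_rows):
    print(source, signal, value_range(meta_rows, source, signal))
```

The deduplication in `signal_pairs` is the step that callers of a single large metadata data frame otherwise end up repeating themselves.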

@jaydu1
Contributor

jaydu1 commented Oct 14, 2021

For (1), I found that the current package uses the 'covidcast/meta' endpoint while covidcast uses the 'covidcast_meta' endpoint. There are some differences between the two endpoints. If I understand correctly, we should use the latter so that this package can be a quick replacement for covidcast.
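
One way to check for this kind of gap is to diff the field names of a record from each endpoint. The sketch below uses two hand-written sample records (not real API responses, and the field sets are assumptions beyond the last_update and min_lag mentioned above) and reports which fields the second one lacks.

```python
def missing_fields(reference, candidate):
    """Return fields present in `reference` but absent from `candidate`, sorted."""
    return sorted(set(reference) - set(candidate))


# Hand-written sample records; real responses carry more fields.
legacy_record = {"signal": "sig-1", "last_update": 1633046400, "min_lag": 1}
new_record = {"signal": "sig-1", "value_label": "Percentage"}

print(missing_fields(legacy_record, new_record))
```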

For (2), the current package can provide functionality as you suggest. Looks like I only need to fix the first issue.

@lee14257

lee14257 commented May 24, 2022

#10 This is the PR for initial documentation and guide for the package.

Some next steps are:

  • Add a more extensive list of examples in the "Getting Started" section
    • Cover more API endpoints
    • Cover all data parsing use-cases (CSV, Classic, DF, etc.)
  • Test the EpiRange and Epiweek classes more thoroughly
  • Check for autocomplete functionality in covidcast
  • Check more thoroughly for bugs / errors

@dshemetov
Contributor

dshemetov commented May 30, 2024

@rzats Here is a task you can work on. Feel free to ping me here or on Slack if something is unclear! TL;DR: we need to build a basic query test file.

port R vignettes and examples over

So some of the examples we want to port are in docstrings in the R functions. Here is an example for the pvt_cdc endpoint. Endpoints marked with pvt_ are private and require a special key to access (different from the rate-limited API key); ask Brian for those. Here's a simpler example for a public endpoint.

So what we would like is to take those R queries and port them to this package. At this point, let's just write a single test file like test_epidata_calls.py (we can figure out later whether we can do a similar docstring example with Sphinx). Each test should make an equivalent call with the Python client and, for now, make sure that the output is non-trivial (as in non-empty; if some of them come back empty, talk to me and we should fix the query). Perhaps later we can try to snapshot the outputs.
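
A minimal sketch of what one entry in such a test file could look like. `fake_public_call` is a hypothetical placeholder that returns canned rows; in the real file it would be replaced by the Python client call equivalent to the R example, which is not shown here. Functions named `test_*` are the ones pytest collects.

```python
def is_nontrivial(result):
    """True if the call returned at least one row."""
    return result is not None and len(result) > 0


def fake_public_call():
    """Hypothetical stand-in for a ported R query; returns canned rows."""
    return [{"geo_value": "pa", "time_value": 20240101, "value": 1.0}]


def test_public_endpoint_returns_rows():
    # Swap fake_public_call for the real client call when porting.
    assert is_nontrivial(fake_public_call())


def test_empty_result_is_flagged():
    # An empty result should count as trivial, i.e. a query to fix.
    assert not is_nontrivial([])
```

Keeping the emptiness check in one helper makes it easy to tighten later, for example by comparing against snapshots.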

I expect that there may be errors along the way when trying to reproduce these (for various reasons, like function input data types or parsing issues, etc.). Record those as TODO items here.

@dshemetov dshemetov assigned dshemetov and rzats and unassigned dshemetov and jaydu1 May 30, 2024
@dshemetov
Contributor

dshemetov commented Jul 16, 2024

Initial package evaluation and first pass documentation considered done. Moving bigger picture items to #11


5 participants