-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multi signals #156
Multi signals #156
Conversation
- As written, latest_issue() would retain all attributes of the original data frame, including the row names. But the row names should change because of the filtering - Left the original column names in there for good measure too - Applied the same fix to earliest_issue()
- Operates on a list of covidcast_signal data frames, as provided by covidcast_signals(), for example - Can specify "wide" or "tall" format, with former as the default - For convenience, this can operate on a list of data frame with time-shifted values appended to their columns, as provided by append_shifts(), for example - Also, downstream helpers covidcast_longer(), covidcast_wider() are there if you change your mind - covidcast_wider() is not fully implemented: it won't work when time-shifted columns are present - Neither of the downstream helpers are documented - I'm not certain that having the dt column be dynamically present (only present when there are time-shifted columns) is the best solution; but it seems confusing to add it as the default, since there user wouldn't know it's coming from if he/she never called append_shifts() - I'm also not certain I like the following asymmetry: when you call append_shifts() you always get a wide result, but when you call aggregate_signals() you can choose between a wide or a tall result. If you wanted to get column-shifted values in tall form, then you'd have to do something silly, where you wrap the result in a list and then call aggregate_signals() with format = "tall" - On the other hand, I don't want to add yet another option to append_shifts(), because that feels like it's getting overly featurized, offering every possible feature in every possible case
- Thanks to Andrew Chin for helpful feedback - Rename append_shifts() to apply_shifts(): now you have to ask for dt = 0 in order to retain the original value column, else it is dropped. This is cleaner behavior and leads to easier logic downstream - Hide apply_shifts() from the documentation. Remains in NAMESPACE so technically a user can call it but it's not visible. This is probably best as it presents a simpler set of options to the end user, and hides the overlap and somewhat ugly asymmetry between apply_shifts() and aggregate_signals() - Change aggregate_signals() so that it takes an argument dt to do shifts; clarify in the documentation the three main use cases for this function - Fix up covidcast_wider() and covidcast_taller(), and vignette:
- Strike direction from the record: doc, covidcast_signal object, plotting functionality - Remove geo_type from the attributes of covidcast_signal object, since it already appears in the metadata attribute - Change existing dependencies to fetch geo_type from metadata - Change vignettes to use USAFacts rather than JHU CSSE cata - Reorder columns in covidcast_signal object in a more sensible way, and modify aggregation functions to preserve this ordering
- These were incompatible with how aggregate_signals() interprets dt, and I think the latter is more natural. Negative translates into a lag value, positive into a lead value - Update documentation, correlation vignette - Modify the intro-api talk to use the new dt_x argument properly, and it looks like nothing else needs to change (no other dependencies on covidcast_cor() with dt_x or dt_y being nonzero)
- Also fix a small bug to allow aggregate_signals() to operate directly on a data frame (not just a list) - Do a small bit of code refactoring and some doc updates
As mentioned in #10 the interface here looks good to me, though someone more familiar with R paradigms should have the final say. |
@chinandrew @capnrefsmmat Few more updates. First, I filled out the vignette so this branch should be good to go. Andrew approves but suggests somebody more familiar with R takes a look. To repeat this doesn't need to be merged until evalcast gets merged, which shouldn't be for a few days at least. If Alex you're too busy then maybe @JedGrabman can take a look. Second, a spotted a discrepancy in the way I defined To summarize:
|
That can go into NEWS.md, and we can bump the minor version of the package accordingly. Note that this branch doesn't contain #148, so there'll be a conflict if you do this. |
@capnrefsmmat @chinandrew Upon reflection there was really no reason for me to have branched this off of the evalcast branch. I was just working on something else that I needed the evalcast package for, and then I started implementing these multi signal functions because I also needed them. But my "real" commits starting with b42b353 have noting to do with evalcast. Since the evalcast branch is actually held up with more development needed, what's the best way to jut pick off the commits starting with b42b353 through the end, and move them to a separate branch, which is created off of main? Then we could just merge that new into main. |
We'd have to carefully cherry-pick, since there are changes to evalcast in this series of commits, namely in That's doable, although whether or not we do it, I think it still makes sense to merge evalcast and continue further development of it in new PRs targeting specific features/fixes. |
You should either be able to do a rebase like EDIT: looks like alex beat me to it. |
Thanks both of you! I guess I could try to cherry-pick but it might be easiest just to merge evalcast to main as it stands now (Alex I just left a message agreeing with your logic on #85). |
- Change vignettes references to URL links (since vignettes aren't built along with the package by default) - Use rlang::warn more consistently (whenever it might be useful) - Copyedit a few small spots including the DESCRIPTION
… into multi-signals
@capnrefsmmat Well I snuck in a few other really small light housekeeping things. See the details of my last commit. I managed to do it before you merge this into main (but apparently not before you merged main into this!). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noticed two issues while preparing to merge
Hi all, particularly capnrefsmmat @chinandrew, this finishes off the
covidcast_signals()
function, and provides related signal aggregation functionality.covidcast_signal
object, plotting functionality, etc.), which I suggested we should do in Derivative functionality #131.Because I branched this off of evalcast-review (which was branched off of evalcast), the commit log here is huge. My first "real" commit for this branch starts at b42b353. Once evalcast gets merged into main #85 (should happen within a couple days), the commit log here will look a lot cleaner, and we can merge this branch. chinandrew has already provided an informal (?) review, maybe he can take one more look.
┆Issue is synchronized with this Asana task by Unito