We want to test whether it's possible to chain group_by -> summarise -> group_by -> summarise (or, e.g., group_by -> summarise -> summarise). @jhofman will provide an example.
@sharlagelfand: There were too many observations in the bike data, so here's an artificial but hopefully still interesting one: take a few famous baseball players, compute their batting average for each year they played, noting the team they played for, and then look at their median batting average over their time with that team.
library(dplyr)
library(ggplot2)

plyr::baseball %>%
  # keep three well-known players
  filter(id == "ruthba01" | id == "cobbty01" | id == "hornsro01") %>%
  # batting average per player, team, and year
  group_by(id, team, year) %>%
  summarize(ba = h / ab) %>%
  # median batting average per player and team
  group_by(id, team) %>%
  summarize(median_ba = median(ba)) %>%
  ggplot(aes(x = id, y = median_ba, color = team)) +
  geom_point(position = position_dodge(width = 0.25)) +
  labs(x = "Player", y = "Median batting average over time with each team")
I don't love the styling of this plot, but perhaps it's enough to get started with?
Thanks @jhofman! This actually brings up another question: how to handle summary operations that combine multiple variables, e.g. ba = h / ab. Right now we don't have a way to show the distributions of two variables, or how the relationship between them derives a new variable. I'll create an issue for that, and see if we can come up with an example that just does multiple steps without running into the "derived from multiple variables" case for now.
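For reference, here is a minimal sketch of what a multi-step example without a derived variable might look like. It assumes dplyr >= 1.0 (for the `.groups` argument), and the `hits` table and its values are made up purely for illustration; they are not taken from `plyr::baseball`. It covers both chains from the issue description: the explicit group_by -> summarise -> group_by -> summarise, and group_by -> summarise -> summarise, which relies on the first summarise dropping only the last grouping variable.

```r
library(dplyr)

# Hypothetical toy data: hits recorded per player, team, and year
hits <- tibble(
  id   = c("a", "a", "a", "a", "b", "b"),
  team = c("X", "X", "Y", "Y", "X", "X"),
  year = c(1, 1, 1, 2, 1, 2),
  h    = c(10, 20, 30, 50, 5, 15)
)

# group_by -> summarise -> group_by -> summarise:
# total hits per (id, team, year), then median of those totals per (id, team)
explicit <- hits %>%
  group_by(id, team, year) %>%
  summarise(total_h = sum(h), .groups = "drop") %>%
  group_by(id, team) %>%
  summarise(median_h = median(total_h), .groups = "drop")

# group_by -> summarise -> summarise:
# "drop_last" leaves the result grouped by (id, team),
# so the second summarise needs no explicit group_by
implicit <- hits %>%
  group_by(id, team, year) %>%
  summarise(total_h = sum(h), .groups = "drop_last") %>%
  summarise(median_h = median(total_h), .groups = "drop")
```

Both `explicit` and `implicit` should contain the same three rows, one per player and team, and each step summarises a single variable, so the "derived from multiple variables" question does not come up.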