expand functions added #21
I'm a data analyst from Germany and love working with your packages, especially dplyr. While using the group_by %>% summarize chain, I discovered that groups are only created from values that exist in the data. If I group by date, for example, and have no data for a specific date, that date does not appear in the grouped result, so there is a "hole". I would like the missing dates to appear in the result as well.
I didn't find a solution, so I built a function for it based on expand.grid(). I thought it would be a nice addition to tidyr, so take a look at it; maybe you will like it (or at least the idea of expanding data).
You can expand from min:max and by unique ids (which is convenient if you grouped by many variables, like date and gender), and you can select as many expandable columns as you wish.
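To illustrate the idea, here is a minimal base R sketch of the min:max variant. It is not the function from this pull request: the name `expand_sketch`, its arguments, and the choice to fill holes with 0 are all assumptions made for the example, and it assumes the expandable columns are integers or dates with a step of 1.

```r
# Hypothetical helper for illustration only; NOT the expand() from this PR.
# Assumes the columns in `cols` are integer-like or Date and step by 1.
expand_sketch <- function(df, cols, fill = 0) {
  # Build the full min:max range for every expandable column.
  ranges <- lapply(df[cols], function(x) seq(min(x), max(x), by = 1))
  # All combinations of the expanded values.
  grid <- expand.grid(ranges, KEEP.OUT.ATTRS = FALSE, stringsAsFactors = FALSE)
  # Left join the original data onto the grid; new rows get NA.
  out <- merge(grid, df, by = cols, all.x = TRUE)
  # Replace the NA "holes" in the remaining columns with the fill value.
  for (v in setdiff(names(out), cols)) {
    out[[v]][is.na(out[[v]])] <- fill
  }
  # Order by the expanded columns and reset the row names.
  out <- out[do.call(order, unname(as.list(out[cols]))), , drop = FALSE]
  rownames(out) <- NULL
  out
}

# Made-up example data: no rows for 2014-01-03 and 2014-01-04.
counts <- data.frame(
  date = as.Date(c("2014-01-01", "2014-01-02", "2014-01-05")),
  first_time_payer = c(3, 1, 2)
)
expanded <- expand_sketch(counts, "date")
```

With the holes filled, a running total such as cumsum(expanded$first_time_payer) has one value per calendar day instead of skipping the days without data.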
Feel free to contact me.
Here is an example of how expand could be used in a dplyr chain:
firstpay <- data %>%
  group_by(UserId) %>%
  filter(paytime == min(paytime)) %>%     # Filter the data with the first payment per User
  group_by(date) %>%
  summarize(first_time_payer = n()) %>%   # Calculate the amount of first time payers per day
  ungroup %>%                             # Ungroup for cumsum operator
  expand("date") %>%                      # Expand by date, since there are days without first time payers
  mutate(total_payer = cumsum(first_time_payer))  # calculate the cumulated amount payers for every date point