Enhancements for *dply #188

Merged
merged 5 commits into from Jan 6, 2014

Conversation

Projects
None yet
2 participants
Contributor

krlmlr commented Jan 6, 2014

  • Generate .id column as factor in ldply(); fixes #137
  • New parameter .id for rdply() to specify name of index column (default: .n); fixes #142
  • New parameter .id for ldply() to specify name of index column (default: .id); fixes #140; fixes #107

Rebased and edited #177.

hadley added a commit that referenced this pull request Jan 6, 2014

@hadley hadley merged commit 67a818d into hadley:master Jan 6, 2014

1 check passed

default The Travis CI build passed
Details
Owner

hadley commented Jan 6, 2014

Thanks! What do you do think about something similar for adply where it would form the prefix for the column names? Not sure what happens currently.

@krlmlr krlmlr deleted the krlmlr:fix/140-142-id-137-column-as-factor branch Jan 6, 2014

@krlmlr krlmlr referenced this pull request Jan 6, 2014

Closed

.id parameter for adply? #191

Owner

hadley commented Jan 7, 2014

Unfortunately changing .id to a factor breaks ggplot2.

Contributor

krlmlr commented Jan 7, 2014

Could you please provide more detail? Which part of ggplot is broken?

I came up with this issue because I frequently have the following situation:

  1. A sequence of values given in a vector (or a list)
  2. This sequence results in a data frame after a call to ldply
  3. This data frame is used in a ggplot where the .id column is mapped to the facet label

Previously, this resulted in a sorting of the facets, but mostly I want to specify the sequence of facets once (in step 1). As a workaround, I have invented a function ofactor that works like factor but orders the levels in order of appearance (and not in sorted order as with factor). Using a factor for the .id column resolves this issue in an elegant way.

Owner

hadley commented Jan 7, 2014

Facetting, because it expects a character, and it's now a factor. I think it's a reasonable change, but plyr is used by so many packages that even changes that make sense may cause bugs in existing, and that's something I'd prefer to avoid if possible.

Contributor

krlmlr commented Jan 7, 2014

I have searched ggplot for .id, and didn't find anything. Do you mean user code that calls ggplot with the result of a ldply call? Could you give a failing example? I used to think that factors were always coercible to character...

Contributor

krlmlr commented Jan 8, 2014

A simple test reveals the error: http://rpubs.com/krlmlr/11768

Of course we could patch ggplot2, but perhaps the safest way is to ensure that a factor is returned only if the user requests it, say, by passing a value to the .id parameter. New default for the .id parameter could be NA to indicate that the old behavior is requested. ?

Owner

hadley commented Jan 9, 2014

That seems reasonable to me. Would you mind submitting a patch?

Contributor

krlmlr commented Jan 10, 2014

Done: #195

@krlmlr krlmlr restored the krlmlr:fix/140-142-id-137-column-as-factor branch Mar 13, 2014

@krlmlr krlmlr deleted the krlmlr:fix/140-142-id-137-column-as-factor branch Mar 13, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment