Closed
Description
The new ability of the *_join() commands to join on variables with different names in the two datasets currently requires that every join variable be named in the by argument, including those whose name is the same in both datasets. Intuitively, naming only the variables that actually have different names in the two datasets should be sufficient.
For example, the second case here:
> A <- data.frame(ID=c(1,1,2),strategy=c("a","b","b"),V1=c(45,34,23))
> B <- data.frame(Manager=c(2,3),strategy=c("b","a"),V2=c(11,22))
> left_join(A,B,by=c("ID"="Manager","strategy"="strategy"))
  ID strategy V1 V2
1  1        a 45 NA
2  1        b 34 NA
3  2        b 23 11
> left_join(A,B,by=c("ID"="Manager","strategy"))
Error: cannot join on columns 'strategy' x '' : index out of bounds
In this example I think it shouldn't be necessary for the user to write "strategy" twice when specifying that we should join on strategy.
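Until something like this is supported, one possible workaround is to fill in the missing names before passing the vector to the join. The sketch below is only an illustration of the requested behaviour: complete_by() is a hypothetical helper, not part of dplyr, and base R's merge() is used for the demonstration purely to keep the example free of package dependencies.

```r
# Hypothetical helper (not part of dplyr): treat every unnamed entry of `by`
# as "same column name in both tables" by using its value as its own name.
complete_by <- function(by) {
  nms <- names(by)
  # A completely unnamed vector has NULL names; normalise to empty strings.
  if (is.null(nms)) nms <- rep("", length(by))
  unnamed <- is.na(nms) | nms == ""
  nms[unnamed] <- by[unnamed]
  stats::setNames(by, nms)
}

A <- data.frame(ID = c(1, 1, 2), strategy = c("a", "b", "b"), V1 = c(45, 34, 23))
B <- data.frame(Manager = c(2, 3), strategy = c("b", "a"), V2 = c(11, 22))

# "strategy" is written only once; the helper expands it to strategy = "strategy".
by <- complete_by(c("ID" = "Manager", "strategy"))
print(by)

# Left join with base R: names(by) are the columns of A, the values those of B.
joined <- merge(A, B, by.x = names(by), by.y = unname(by), all.x = TRUE)
print(joined)
```

Since the completed vector has the same form as the first call above, it should also be usable as left_join(A, B, by = complete_by(c("ID" = "Manager", "strategy"))).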
Metadata
Labels
feature: a feature request or enhancement