Strange results from a non-equi join with multiple conditions #2275
Fixing #2360 also takes care of this. Please write back if not. Just issued the PR. Should be merged shortly, assuming tests pass. TODO: update the SO post linked by Frank.
This was brought up on SO.

The goal is to find out if each row in `DT1` has a match in `DT2` in the sense of `on=.(RANDOM_STRING, DATE >= START_DATE, DATE <= EXPIRY_DATE)`:

My usual approach is to do a join, counting matches with `.N` and `by=.EACHI`. However, the OP found that this fails here:

There is probably a way to come up with the correct result in a less slow way (foverlaps?), but my point is that I expect the `.N, by=.EACHI]$N > 0L` way to work. Is it failing thanks to a bug or am I mistaken in using it here?

I had trouble making a smaller example. Drop the `n` parameter by a factor of 10 and you'll see that the problem disappears. Stranger, the OP noticed that if you repeatedly run the `DT1[!(MATCHED), MATCHED := ... ]` line, it will keep making changes over many iterations. Also, the OP said they couldn't construct an example when the `on=` condition only contained one inequality.

EDIT: one faster way of coming up with the correct result, thanks to SO OP:
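
For reference (this is not the SO OP's faster approach), a minimal sketch of the `.N` with `by=.EACHI` idiom discussed in this report. The toy tables and their values are hypothetical, not the data from the SO post, and the sketch assumes a data.table version in which this issue is fixed:

```r
library(data.table)

# Hypothetical toy data; column names follow the report, values are made up.
DT1 <- data.table(
  RANDOM_STRING = c("a", "a", "b", "c"),
  DATE = as.Date(c("2017-01-05", "2017-03-01", "2017-01-10", "2017-01-01"))
)
DT2 <- data.table(
  RANDOM_STRING = c("a", "b", "b"),
  START_DATE  = as.Date(c("2017-01-01", "2017-01-01", "2017-01-05")),
  EXPIRY_DATE = as.Date(c("2017-01-31", "2017-01-31", "2017-01-15"))
)

# Non-equi join of DT2 (x) against DT1 (i). In on=, the left-hand name is
# the column of x and the right-hand name is the column of i.
# by=.EACHI returns one row per row of DT1; .N is the number of DT2 rows
# matched by that row (0 when nothing matches, with the default nomatch=NA).
counts <- DT2[DT1,
              on = .(RANDOM_STRING, START_DATE <= DATE, EXPIRY_DATE >= DATE),
              .N, by = .EACHI]

# A row of DT1 is considered matched if at least one DT2 row satisfied all
# three conditions.
DT1[, MATCHED := counts$N > 0L]
print(DT1)
```

On small inputs like this one, the result can be cross-checked by looping over the rows of `DT1` and testing the three conditions directly against `DT2`.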