Observed while answering this SO Question: https://stackoverflow.com/a/66041678/3576984
# NB: recall CJ output is keyed by default
DT1 = CJ(a = 1:3, b = 4:5, c = 6)
# the same values, but column two is also named a (and column three is named d)
DT2 = CJ(a = 1:3, a = 4:5, d = 6)
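As a quick sanity check on the setup (a sketch added for illustration, not part of the original observation), the duplicated column name and the default key on DT2 can be inspected directly:

```r
library(data.table)

DT2 = CJ(a = 1:3, a = 4:5, d = 6)

# column names as stored on the table; two columns share the name "a"
names(DT2)
# index of the first duplicated name (0 would mean no duplicates)
anyDuplicated(names(DT2))
# CJ keys its output by default, so a key should be reported here
haskey(DT2)
key(DT2)
```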
Observe how the result differs depending on whether DT2 is keyed:
# KEYED
DT1[DT2]
# a b c i.a
# 1: 1 1 6 4
# 2: 1 1 6 5
# 3: 2 2 6 4
# 4: 2 2 6 5
# 5: 3 3 6 4
# 6: 3 3 6 5
# UNKEYED
setkey(DT2, NULL)
DT1[DT2]
# a b c
# 1: 1 4 6
# 2: 1 5 6
# 3: 2 4 6
# 4: 2 5 6
# 5: 3 4 6
# 6: 3 5 6
Is there some reason the first (keyed) case should be the intended behavior?
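For comparison, here is a minimal sketch of one way to sidestep the ambiguity, assuming the duplicate name is not actually needed: give the second column a unique name (`a2` is a hypothetical name picked only for this example) and spell out the column pairing with `on=` rather than relying on keys. This does not explain the keyed result; it only avoids depending on it.

```r
library(data.table)

DT1 = CJ(a = 1:3, b = 4:5, c = 6)
# a2 is a hypothetical name, chosen only to avoid the duplicate "a"
DT2u = CJ(a = 1:3, a2 = 4:5, d = 6)

# state explicitly which column of x (left) pairs with which column of i (right)
res = DT1[DT2u, on = c(a = "a", b = "a2", c = "d")]
res
```

With the pairing written out, every row of `DT2u` should match exactly one row of `DT1`, whether or not either table is keyed.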
The verbose output suggests it starts doing the right thing, then gets tripped up later on:
DT1[DT2, verbose=TRUE]
# i.a has same type (integer) as x.a. No coercion needed.
# i.a has same type (integer) as x.b. No coercion needed.
# i.d has same type (double) as x.c. No coercion needed.
# on= matches existing key, using key
# Starting bmerge ...
# bmerge done in 0.000s elapsed (0.000s cpu)
# Constructing irows for '!byjoin || nqbyjoin' ... 0.000s elapsed (0.000s cpu)