Column-reordered table printed when subsetting #3306

Closed
renkun-ken opened this issue Jan 23, 2019 · 4 comments
Member

@renkun-ken renkun-ken commented Jan 23, 2019

I noticed strange behavior in print.data.table in the latest release. When a data.table has a large number of columns, printing a subset of it shows the columns in a different order, even though colnames() on the result still returns the correct order.

library(data.table)

dt <- data.table(id = 1:10000)
for (i in 1:300) {
  dt[, paste0("v", i) := rnorm(.N)]
}
dt[v100 >= 0]

(screenshot of the printed output, with columns shown out of order)

> colnames(dt[v100 >= 0])
  [1] "id"   "v1"   "v2"   "v3"   "v4"   "v5"   "v6"   "v7"   "v8"   "v9"   "v10"  "v11"  "v12"  "v13"  "v14"  "v15"  "v16"  "v17"  "v18"  "v19"  "v20"  "v21"  "v22"  "v23"  "v24" 
 [26] "v25"  "v26"  "v27"  "v28"  "v29"  "v30"  "v31"  "v32"  "v33"  "v34"  "v35"  "v36"  "v37"  "v38"  "v39"  "v40"  "v41"  "v42"  "v43"  "v44"  "v45"  "v46"  "v47"  "v48"  "v49" 
 [51] "v50"  "v51"  "v52"  "v53"  "v54"  "v55"  "v56"  "v57"  "v58"  "v59"  "v60"  "v61"  "v62"  "v63"  "v64"  "v65"  "v66"  "v67"  "v68"  "v69"  "v70"  "v71"  "v72"  "v73"  "v74" 
 [76] "v75"  "v76"  "v77"  "v78"  "v79"  "v80"  "v81"  "v82"  "v83"  "v84"  "v85"  "v86"  "v87"  "v88"  "v89"  "v90"  "v91"  "v92"  "v93"  "v94"  "v95"  "v96"  "v97"  "v98"  "v99" 
[101] "v100" "v101" "v102" "v103" "v104" "v105" "v106" "v107" "v108" "v109" "v110" "v111" "v112" "v113" "v114" "v115" "v116" "v117" "v118" "v119" "v120" "v121" "v122" "v123" "v124"
[126] "v125" "v126" "v127" "v128" "v129" "v130" "v131" "v132" "v133" "v134" "v135" "v136" "v137" "v138" "v139" "v140" "v141" "v142" "v143" "v144" "v145" "v146" "v147" "v148" "v149"
[151] "v150" "v151" "v152" "v153" "v154" "v155" "v156" "v157" "v158" "v159" "v160" "v161" "v162" "v163" "v164" "v165" "v166" "v167" "v168" "v169" "v170" "v171" "v172" "v173" "v174"
[176] "v175" "v176" "v177" "v178" "v179" "v180" "v181" "v182" "v183" "v184" "v185" "v186" "v187" "v188" "v189" "v190" "v191" "v192" "v193" "v194" "v195" "v196" "v197" "v198" "v199"
[201] "v200" "v201" "v202" "v203" "v204" "v205" "v206" "v207" "v208" "v209" "v210" "v211" "v212" "v213" "v214" "v215" "v216" "v217" "v218" "v219" "v220" "v221" "v222" "v223" "v224"
[226] "v225" "v226" "v227" "v228" "v229" "v230" "v231" "v232" "v233" "v234" "v235" "v236" "v237" "v238" "v239" "v240" "v241" "v242" "v243" "v244" "v245" "v246" "v247" "v248" "v249"
[251] "v250" "v251" "v252" "v253" "v254" "v255" "v256" "v257" "v258" "v259" "v260" "v261" "v262" "v263" "v264" "v265" "v266" "v267" "v268" "v269" "v270" "v271" "v272" "v273" "v274"
[276] "v275" "v276" "v277" "v278" "v279" "v280" "v281" "v282" "v283" "v284" "v285" "v286" "v287" "v288" "v289" "v290" "v291" "v292" "v293" "v294" "v295" "v296" "v297" "v298" "v299"
[301] "v300"
Member

@MichaelChirico MichaelChirico commented Jan 23, 2019

Confirmed: the column order is incorrect, though it is not the same incorrect order on my machine.

Member

@MichaelChirico MichaelChirico commented Jan 23, 2019

The root cause is rbind reordering columns:

dt[v100 >= 0, colnames(rbind(head(.SD, 5), tail(.SD, 5)))]
  [1] "id"   "v226" "v225" "v1"   "v2"   "v227" "v228" "v3"   "v4"   "v229" "v230" "v5"   "v6"   "v231" "v232" "v7"   "v8"   "v233"
 [19] "v234" "v9"   "v10"  "v235" "v236" "v11"  "v12"  "v237" "v238" "v13"  "v14"  "v239" "v240" "v15"  "v16"  "v241" "v242" "v17" 
 [37] "v18"  "v243" "v244" "v19"  "v20"  "v245" "v246" "v21"  "v22"  "v247" "v248" "v23"  "v24"  "v249" "v250" "v25"  "v26"  "v251"
 [55] "v252" "v27"  "v28"  "v253" "v254" "v29"  "v30"  "v255" "v256" "v31"  "v32"  "v257" "v258" "v33"  "v34"  "v259" "v260" "v35" 
 [73] "v36"  "v261" "v262" "v37"  "v38"  "v263" "v264" "v39"  "v40"  "v265" "v266" "v41"  "v42"  "v267" "v268" "v43"  "v44"  "v269"
 [91] "v270" "v45"  "v46"  "v271" "v272" "v47"  "v48"  "v273" "v274" "v49"  "v50"  "v275" "v276" "v51"  "v52"  "v277" "v278" "v53" 
[109] "v54"  "v279" "v280" "v55"  "v56"  "v281" "v282" "v57"  "v58"  "v283" "v284" "v59"  "v60"  "v285" "v286" "v61"  "v62"  "v287"
[127] "v288" "v63"  "v64"  "v289" "v290" "v65"  "v66"  "v291" "v292" "v67"  "v68"  "v293" "v294" "v69"  "v70"  "v295" "v296" "v71" 
[145] "v72"  "v297" "v298" "v73"  "v74"  "v299" "v300" "v75"  "v76"  "v150" "v149" "v151" "v152" "v77"  "v78"  "v153" "v154" "v79" 
[163] "v80"  "v155" "v156" "v81"  "v82"  "v157" "v158" "v83"  "v84"  "v159" "v160" "v85"  "v86"  "v161" "v162" "v87"  "v88"  "v163"
[181] "v164" "v89"  "v90"  "v165" "v166" "v91"  "v92"  "v167" "v168" "v93"  "v94"  "v169" "v170" "v95"  "v96"  "v171" "v172" "v97" 
[199] "v98"  "v173" "v174" "v99"  "v100" "v175" "v176" "v101" "v102" "v177" "v178" "v103" "v104" "v179" "v180" "v105" "v106" "v181"
[217] "v182" "v107" "v108" "v183" "v184" "v109" "v110" "v185" "v186" "v111" "v112" "v187" "v188" "v113" "v114" "v189" "v190" "v115"
[235] "v116" "v191" "v192" "v117" "v118" "v193" "v194" "v119" "v120" "v195" "v196" "v121" "v122" "v197" "v198" "v123" "v124" "v199"
[253] "v200" "v125" "v126" "v201" "v202" "v127" "v128" "v203" "v204" "v129" "v130" "v205" "v206" "v131" "v132" "v207" "v208" "v133"
[271] "v134" "v209" "v210" "v135" "v136" "v211" "v212" "v137" "v138" "v213" "v214" "v139" "v140" "v215" "v216" "v141" "v142" "v217"
[289] "v218" "v143" "v144" "v219" "v220" "v145" "v146" "v221" "v222" "v147" "v148" "v223" "v224"

This can be fixed by setting use.names = FALSE, but I'm not sure this is the intended behavior of rbind to begin with.
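
A minimal sketch of that workaround, assuming a table built the same way as in the original report (the row count is reduced here only so the example runs quickly). With use.names = FALSE, rbind binds columns by position rather than by name, so the result should keep the column order of the first table.

```r
library(data.table)

# Rebuild a wide table like the one in the report (fewer rows, for speed).
set.seed(1)
dt <- data.table(id = 1:100)
for (i in 1:300) {
  dt[, paste0("v", i) := rnorm(.N)]
}

sub <- dt[v100 >= 0]

# Bind the head and tail by position instead of matching on column names.
combined <- rbind(head(sub, 5), tail(sub, 5), use.names = FALSE)

# Compare the column order of the result with the source table.
identical(colnames(combined), colnames(dt))
```

This only illustrates the positional binding; whether print.data.table should use it internally is a separate question.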

Member

@MichaelChirico MichaelChirico commented Jan 23, 2019

Funnily enough, rbind works fine for a smaller number of columns; the reordering only kicks in for wider tables (which is probably why this hasn't come up before):

ncol = 30
DT = setDT(replicate(ncol, rnorm(3L), simplify = FALSE))
identical(colnames(rbind(DT[1], DT[3])), colnames(DT))
# [1] TRUE


ncol = 300
DT = setDT(replicate(ncol, rnorm(3L), simplify = FALSE))
identical(colnames(rbind(DT[1], DT[3])), colnames(DT))
# [1] FALSE
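
To narrow down where the behavior changes, the same check can be wrapped in a function and run over several widths. This is only a sketch: rbind_keeps_order is a hypothetical helper name, the widths are arbitrary, and the result depends on the installed data.table version, so no expected output is shown.

```r
library(data.table)

# Hypothetical helper: TRUE if rbind keeps the column order of an n-column table.
rbind_keeps_order <- function(n) {
  DT <- setDT(replicate(n, rnorm(3L), simplify = FALSE))
  identical(colnames(rbind(DT[1], DT[3])), colnames(DT))
}

# Arbitrary widths to probe; adjust as needed.
widths <- c(30L, 100L, 200L, 250L, 254L, 255L, 256L, 300L)
res <- vapply(widths, rbind_keeps_order, logical(1))

# One row per width, with a flag for whether the order was preserved.
data.table(ncol = widths, order_kept = res)
```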

@DrKodiak DrKodiak commented Feb 1, 2019

I was about to file a separate bug report when I came across this post.
rbind rearranges columns when there are more than 254 columns. This only happens with data.tables, not data.frames. Setting use.names to FALSE prevents this.


library(data.table)

a <- data.table(t(1:254))
b <- rbind(a,a)
all(colnames(b) == colnames(a))
# TRUE

a <- data.table(t(1:255))
b <- rbind(a,a)
all(colnames(b) == colnames(a))
# FALSE

a <- data.frame(t(1:255))
b <- rbind(a,a)
all(colnames(b) == colnames(a))
# TRUE

a <- data.table(t(1:255))
b <- rbind(a, a, use.names = FALSE)
all(colnames(b) == colnames(a))
# TRUE
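
Until a fix is released, a guard like the following can make a reordering fail loudly instead of passing unnoticed. It is a sketch, not part of data.table: rbind_same_order is a hypothetical wrapper that binds by position by default and stops if the result's column order differs from the first table's.

```r
library(data.table)

# Hypothetical wrapper: bind tables, then assert the column order is unchanged.
rbind_same_order <- function(..., use.names = FALSE) {
  tables <- list(...)
  out <- rbind(..., use.names = use.names)
  # Stop with an error if the order differs from the first input table.
  stopifnot(identical(colnames(out), colnames(tables[[1L]])))
  out
}

a <- data.table(t(1:255))
b <- rbind_same_order(a, a)
dim(b)
```

Calling it with use.names = TRUE on an affected version would raise an error at the stopifnot line rather than return a silently reordered table.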
jangorecki added a commit that referenced this issue Feb 4, 2019
@mattdowle mattdowle added this to the 1.12.2 milestone Feb 5, 2019