From c49a8ac91ffd38d18c2131dba3f71706645c27e9 Mon Sep 17 00:00:00 2001
From: Florian Kohrt
Date: Fri, 17 Mar 2023 07:17:02 +0100
Subject: [PATCH 1/2] Document data.tables with no columns

---
 vignettes/datatable-faq.Rmd | 1 +
 1 file changed, 1 insertion(+)

diff --git a/vignettes/datatable-faq.Rmd b/vignettes/datatable-faq.Rmd
index 4b0645e6b..5ae30bf1c 100644
--- a/vignettes/datatable-faq.Rmd
+++ b/vignettes/datatable-faq.Rmd
@@ -396,6 +396,7 @@ A key advantage of column vectors in R is that they are _ordered_, unlike SQL[^2
  - `check.names` is by default `TRUE` in `data.frame` but `FALSE` in data.table, for convenience.
  - `data.table` has always set `stringsAsFactors=FALSE` by default. In R 4.0.0 (Apr 2020), `data.frame`'s default was changed from `TRUE` to `FALSE` and there is no longer a difference in this regard; see [stringsAsFactors, Kurt Hornik, Feb 2020](https://developer.r-project.org/Blog/public/2020/02/16/stringsasfactors/).
  - Atomic vectors in `list` columns are collapsed when printed using `", "` in `data.frame`, but `","` in data.table with a trailing comma after the 6th item to avoid accidental printing of large embedded objects.
+ - Unlike data.frames a data.table cannot store rows with no columns: `nrow(DF[, 0])` returns the number of rows, while `nrow(DT[, 0])` always returns 0; but see issue [#2422](https://github.com/Rdatatable/data.table/issues/2422).
 
 In `[.data.frame` we very often set `drop = FALSE`. When we forget, bugs can arise in edge cases where single columns are selected and all of a sudden a vector is returned rather than a single column `data.frame`. In `[.data.table` we took the opportunity to make it consistent and dropped `drop`.
 
From 2dc8e8fbd1718f6c82a7bc7ba89f9e82fc88f9aa Mon Sep 17 00:00:00 2001
From: Florian Kohrt
Date: Sat, 25 Mar 2023 11:06:09 +0100
Subject: [PATCH 2/2] Mention that rows are children of columns

---
 vignettes/datatable-faq.Rmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/vignettes/datatable-faq.Rmd b/vignettes/datatable-faq.Rmd
index 5ae30bf1c..6013634f7 100644
--- a/vignettes/datatable-faq.Rmd
+++ b/vignettes/datatable-faq.Rmd
@@ -396,7 +396,7 @@ A key advantage of column vectors in R is that they are _ordered_, unlike SQL[^2
  - `check.names` is by default `TRUE` in `data.frame` but `FALSE` in data.table, for convenience.
  - `data.table` has always set `stringsAsFactors=FALSE` by default. In R 4.0.0 (Apr 2020), `data.frame`'s default was changed from `TRUE` to `FALSE` and there is no longer a difference in this regard; see [stringsAsFactors, Kurt Hornik, Feb 2020](https://developer.r-project.org/Blog/public/2020/02/16/stringsasfactors/).
  - Atomic vectors in `list` columns are collapsed when printed using `", "` in `data.frame`, but `","` in data.table with a trailing comma after the 6th item to avoid accidental printing of large embedded objects.
- - Unlike data.frames a data.table cannot store rows with no columns: `nrow(DF[, 0])` returns the number of rows, while `nrow(DT[, 0])` always returns 0; but see issue [#2422](https://github.com/Rdatatable/data.table/issues/2422).
+ - Unlike data.frames a data.table cannot store rows with no columns, as rows are considered to be the children of columns: `nrow(DF[, 0])` returns the number of rows, while `nrow(DT[, 0])` always returns 0; but see issue [#2422](https://github.com/Rdatatable/data.table/issues/2422).
 
 In `[.data.frame` we very often set `drop = FALSE`. When we forget, bugs can arise in edge cases where single columns are selected and all of a sudden a vector is returned rather than a single column `data.frame`. In `[.data.table` we took the opportunity to make it consistent and dropped `drop`.
 
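
Note: the difference described by the new FAQ entry can be checked with a few lines of R. The following is a minimal sketch, not part of the patch. The names `DF`, `DT`, `df_rows` and `dt_rows` are illustrative. The data.table half only runs if the package is installed, and its result may change if issue #2422 is ever resolved.

```r
# Base R: selecting zero columns from a data.frame keeps its rows.
DF <- data.frame(a = 1:3, b = letters[1:3])
df_rows <- nrow(DF[, 0])
print(df_rows)  # 3: a data frame with 0 columns and 3 rows

# data.table: according to the FAQ entry added by this patch, the same
# selection yields a data.table with no rows at all.
if (requireNamespace("data.table", quietly = TRUE)) {
  DT <- data.table::as.data.table(DF)
  dt_rows <- nrow(DT[, 0])
  print(dt_rows)  # the FAQ entry states this is 0
}
```

Code that needs the row count after dropping all columns should therefore read `nrow(DT)` before subsetting rather than afterwards.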