Closes #1536. ?truelength.Rd fixed that over allocation happens on additions alone.
arunsrinivasan committed Feb 17, 2016
1 parent 272331a commit d96881d
Showing 2 changed files with 3 additions and 1 deletion.
2 changes: 2 additions & 0 deletions README.md
@@ -139,6 +139,8 @@

13. `?shift.Rd` is fixed so that it does not get misconstrued to be in a time series sense. Closes [#1530](https://github.com/Rdatatable/data.table/issues/1530). Thanks to @pstoyanov.

+14. `?truelength.Rd` is fixed to reflect that over-allocation happens on data.tables loaded from disk only during column additions and not deletions, [#1536](https://github.com/Rdatatable/data.table/issues/1536). Thanks to @Roland and @rajkrpan.

### Changes in v1.9.6 (on CRAN 19 Sep 2015)

#### NEW FEATURES
2 changes: 1 addition & 1 deletion man/truelength.Rd
@@ -25,7 +25,7 @@ alloc.col(DT,
Please note : over allocation of the column pointer vector is not for efficiency per se. It's so that \code{:=} can add columns by reference without a shallow copy.
}
\value{
-\code{truelength(x)} returns the length of the vector allocated in memory. \code{length(x)} of those items are in use. Currently, it's just the list vector of column pointers that is over-allocated (i.e. \code{truelength(DT)}), not the column vectors themselves, which would in future allow fast row \code{insert()}. For tables loaded from disk however, \code{truelength} is 0 in \R 2.14.0 and random in \R <= 2.13.2; i.e., in both cases perhaps unexpected. \code{data.table} detects this state and over-allocates the loaded \code{data.table} when the next column addition or deletion occurs. All other operations on \code{data.table} (such as fast grouping and joins) do not need \code{truelength}.
+\code{truelength(x)} returns the length of the vector allocated in memory. \code{length(x)} of those items are in use. Currently, it's just the list vector of column pointers that is over-allocated (i.e. \code{truelength(DT)}), not the column vectors themselves, which would in future allow fast row \code{insert()}. For tables loaded from disk however, \code{truelength} is 0 in \R 2.14.0+ (and random in \R <= 2.13.2), which is perhaps unexpected. \code{data.table} detects this state and over-allocates the loaded \code{data.table} when the next column addition occurs. All other operations on \code{data.table} (such as fast grouping and joins) do not need \code{truelength}.
\code{alloc.col} \emph{reallocates} \code{DT} by reference. This may be useful for efficiency if you know you are about to going to add a lot of columns in a loop. It also returns the new \code{DT}, for convenience in compound queries.
}
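
The behaviour described in the corrected `\value` text can be checked with a short R session. This is only a sketch, not part of the commit: the table contents, the temporary file, and the variable names (`loaded`, `tl_loaded`, `tl_after_add`) are illustrative assumptions, and the exact over-allocated size depends on the `datatable.alloccol` option and the installed data.table version.

```r
library(data.table)

# A freshly created data.table has spare column slots already allocated.
DT <- data.table(a = 1:3, b = letters[1:3])
length(DT)       # number of columns in use
truelength(DT)   # number of allocated column slots

# Round-trip through disk (the temp file is just for illustration).
f <- tempfile(fileext = ".rds")
saveRDS(DT, f)
loaded <- readRDS(f)
tl_loaded <- truelength(loaded)   # documented above as 0 on R >= 2.14.0

# Adding a column by reference is the point at which data.table
# detects the loaded state and over-allocates.
loaded[, d := a * 2L]
tl_after_add <- truelength(loaded)

unlink(f)
```

Comparing `tl_loaded` with `tl_after_add` shows the over-allocation being restored by the column addition. Per the documentation change in this commit, deleting a column from a freshly loaded table should not be expected to do the same.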
