Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

confusing POSIXlt warning #2068

Closed
tdhock opened this issue Mar 20, 2017 · 8 comments · Fixed by #3776
Closed

confusing POSIXlt warning #2068

tdhock opened this issue Mar 20, 2017 · 8 comments · Fixed by #3776

Comments

@tdhock
Copy link
Member

tdhock commented Mar 20, 2017

Hey guys thanks for all your hard work making data.table great!

Can you please clarify the warning that is issued when strptime is used inside of the second argument of [.data.table? Right now it says "POSIXlt column type detected and converted to POSIXct" even in cases when strptime does not operate on columns at all.

For example this is a MRE which is potentially confusing:

> data.table(X=1)[, { strptime("1984-03-17", "%Y-%m-%d"); X}]
[1] 1
Warning message:
In strptime("1984-03-17", "%Y-%m-%d") :
  POSIXlt column type detected and converted to POSIXct. We do not recommend use of POSIXlt at all because it uses 40 bytes to store one date.
> devtools::session_info()
Session info -------------------------------------------------------------------
 setting  value                       
 version  R version 3.3.3 (2017-03-06)
 system   i686, linux-gnu             
 ui       X11                         
 language en_US                       
 collate  en_US.UTF-8                 
 tz       posixrules                  
 date     2017-03-20                  

Packages -----------------------------------------------------------------------
 package       * version    date       source                                  
 bitops          1.0-6      2013-08-17 CRAN (R 3.2.2)                          
 caTools         1.17.1     2014-09-10 CRAN (R 3.2.2)                          
 colorspace      1.3-0      2016-11-01 R-Forge (R 3.3.1)                       
 data.table    * 1.10.5     2017-03-20 Github (Rdatatable/data.table@ce00cb1)  
 devtools        1.12.0     2016-12-05 CRAN (R 3.3.3)                          
 digest          0.6.10     2016-08-02 CRAN (R 3.2.2)                          
 ggplot2       * 2.1.0      2016-11-09 Github (faizan-khan-iit/ggplot2@5fb99d0)
 gtable          0.2.0      2016-02-26 CRAN (R 3.2.2)                          
 lattice       * 0.20-34    2016-09-06 CRAN (R 3.3.3)                          
 memoise         1.0.0      2016-01-29 CRAN (R 3.2.2)                          
 munsell         0.4.3      2016-02-13 CRAN (R 3.2.2)                          
 namedCapture  * 2015.12.01 2015-12-21 Github (tdhock/namedCapture@0517592)    
 plyr            1.8.4      2016-06-08 CRAN (R 3.2.2)                          
 RColorBrewer  * 1.1-2      2014-12-07 CRAN (R 3.2.2)                          
 Rcpp            0.12.9     2017-01-14 cran (@0.12.9)                          
 RCurl         * 1.96-0     2016-08-07 local                                   
 requireGitHub * 2014.4.4   2016-08-13 local                                   
 RJSONIO       * 1.3-0      2014-07-28 CRAN (R 3.2.2)                          
 RSelenium     * 1.3.6      2016-11-09 Github (ropensci/RSelenium@22f06b9)     
 scales          0.4.1      2016-11-09 CRAN (R 3.3.1)                          
 withr           1.0.1      2016-02-04 CRAN (R 3.2.2)                          
 XML           * 3.99-0     2016-08-07 local                                   
> 

I'm not sure there should be a warning at all in this case.

@MichaelChirico
Copy link
Member

MichaelChirico commented Mar 20, 2017

See ?strptime Value section:

strptime turns character representations into an object of class "POSIXlt".

Maybe I'm not understanding what you expect?

@tdhock
Copy link
Member Author

tdhock commented Mar 21, 2017

Hey @MichaelChirico thanks for your response. I am aware that strptime returns POSIXlt, but I thought that it was strange that data.table warns about "column type" even when the strptime neither uses nor assigns to a column. for example data.table(X=1)[, strptime("1984-03-17", "%Y-%m-%d")] generates the warning. Maybe it should be "warning: every time strptime is used in the context of the second argument of [.data.table, it returns POSIXct instead of the usual POSIXlt"?

@MichaelChirico
Copy link
Member

Why not just

POSIXlt detected and converted to POSIXct. We do not recommend use of POSIXlt at all because it uses 40 bytes to store one date.

Though this is somewhat in contradiction to data.table::year:

function(x) as.POSIXlt(x)$year + 1900L

Which doesn't throw the warning:

data.table(X = 1)[ , year('2017-03-21')]
# [1] 2017

Though I think in the longer term the plan is to move all of these helper functions to C, perhaps by proxy by moving IDate/ITime to C.

@tdhock
Copy link
Member Author

tdhock commented Mar 21, 2017

on a related note, in the old version of data.table (before the strptime helper function) I was using

DT[, date := as.POSIXct(strptime(chr, format))]

and now we can use

DT[, date := strptime(chr, format)]

but it always raises that warning so to avoid it I am always writing

DT[, date := suppressWarnings(strptime(chr, format))]

which is kind of annoying... I wonder if you would consider just removing the warning altogether?

Another way to explain the issue is the following. I expect that the return value of strptime should stay POSIXlt when I am using it like this

data.table(X=1)[, {
  lots of code
  doing lots of things
  some.intermediate.variable.which.I.expect.is.POSIXlt <- strptime(chr, format)
  return something else
}]

but I expected that ANY POSIXlt (not just the ones that come from strptime) should be converted to POSIXct for assignment to a data.table column via := for example

DT[, date.POSIXct := someFunctionThatReturnsPOSIXlt()]

@MichaelChirico
Copy link
Member

When IDateTime is feature-complete, your pipeline will skip both POSIXct and POSIXlt. So I'm not sure how worthwhile it is to put much dev time into this.

Definitely need to keep the warning.

@tdhock
Copy link
Member Author

tdhock commented Mar 21, 2017

ok well don't worry about it then, it's not a big deal

@arvindpdmn
Copy link

I get the same warning for this line of code (R 3.5.2, data.table 1.12.0):
a[, date_of_birth := as.POSIXct(strptime(as.character(date_of_birth), "%Y-%m-%d"))]

@MichaelChirico
Copy link
Member

MichaelChirico commented Aug 18, 2019

OK so what's happening is that the evaluation environment of j has strptime overwritten locally:

data.table/R/data.table.R

Lines 1151 to 1154 in a8e926a

SDenv$strptime = function(x, ...) {
warning("POSIXlt column type detected and converted to POSIXct. We do not recommend use of POSIXlt at all because it uses 40 bytes to store one date. Use as.POSIXct to avoid this warning.")
as.POSIXct(base::strptime(x, ...))
}

From there, it doesn't discriminate on whether strptime is operating/producing a column. I don't think there's any easy fix to be more selective on this warning, but the message could be more helpful.

Note that AFAIK strptime can always be replaced by an as.POSIXct call (which wraps to as.POSIXlt-->strptime anyway), in which case j will be ignorant to strptime being called "under the hood" (since the call chain will end up at base::as.POSIXct and so base::strptime is used, not SDenv$strptime)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants