anydate transforms to previous day #5

Closed
twolodzko opened this Issue Sep 14, 2016 · 64 comments

@twolodzko
twolodzko commented Sep 14, 2016 edited

anydate transforms date to previous day, while anytime correctly transforms the dates:

> anydate(20150101)
[1] "2014-12-31"
> anydate("2015/01/01")
[1] "2014-12-31"
> anytime(20150101)
[1] "2015-01-01 CET"
> anytime("2015/01/01")
[1] "2015-01-01 CET"
@eddelbuettel
Owner
eddelbuettel commented Sep 14, 2016 edited

That seems to be a timezone error on your end. What does anytime:::getTZ() get you?

FWIW, this does not reproduce:

> anydate(20150101)
[1] "2015-01-01"
> anydate("2015/01/01")
[1] "2015-01-01"
> anytime(20150101)
[1] "2015-01-01 CST"
> anytime("2015/01/01")
[1] "2015-01-01 CST"
> anytime:::getTZ()
[1] "SystemV/CST6CDT"
> 

Please try with explicit tz=... arguments as well.

@eddelbuettel
Owner

Also:

R> anytime("2015/01/01", tz="CET")      
[1] "2015-01-01 07:00:00 CET"
R> anydate("2015/01/01", tz="CET")
[1] "2015-01-01"
R> 
@twolodzko
twolodzko commented Sep 14, 2016 edited

The previous example was from a Windows machine (will check tomorrow), but the same happens on Linux:

> anytime:::getTZ()
[1] "Poland"
> anytime::anydate("20150101")
[1] "2014-12-31"

and

> anytime::anydate("2015/01/01", tz="CET")
[1] "2014-12-31"
@eddelbuettel
Owner
eddelbuettel commented Sep 14, 2016 edited

Thanks for the follow-up. Maybe we check some more tomorrow.

Here anydate() does very little -- it just transforms an R object of type POSIXct to Date. The error, if any, will happen before that. Maybe you can do some more digging, comparing the POSIXct you get from anytime() with the one from strptime(). One can look at the components this way:

R> tstr <- "2016-01-01 01:02:03"
R> a <- unclass(as.POSIXlt(as.POSIXct(strptime(tstr, "%Y-%m-%d %H:%M:%S"))))
R> b <- unclass(as.POSIXlt(anytime(tstr)))
R> all.equal(a, b)
[1] TRUE
R> 

Do that locally and see if a and b differ. (I needed to go the extra step to POSIXct then POSIXlt to get the GMT offset in both.)

@tonyxv
tonyxv commented Sep 14, 2016

I also get the same issue, running on a Windows machine with RStudio in Sydney, Australia:

> anydate(20150101)
[1] "2014-12-31"
> anydate("2015/01/01")
[1] "2014-12-31"
> anytime(20150101)
[1] "2014-12-31 23:00:00 AEDT"
> anytime("2015/01/01")
[1] "2014-12-31 23:00:00 AEDT"
> anydate("2015/01/01", tz="CET")
[1] "2014-12-31"
> anydate("2015/01/01", tz="Australia/Sydney")
[1] "2014-12-31"
@eddelbuettel
Owner
eddelbuettel commented Sep 14, 2016 edited

Debugging help welcome. "Works here" as the saying goes. Below is from R 3.3.1 in a win7 VM. So works across OSs in my TZ.

R> library(anytime)
R> anydate(20150101)
[1] "2015-01-01"
R> anydate("2015/01/01")
[1] "2015-01-01"
R> anytime(20150101)
[1] "2015-01-01 CST"
R> anytime("2015/01/01")
[1] "2015-01-01 CST"
R> anydate("2015/01/01", tz="CET")
[1] "2015-01-01"
R> anydate("2015/01/01", tz="Australia/Sydney")
[1] "2015-01-01"
R> Sys.info()[1:3]
                     sysname                      release                      version 
                   "Windows"                      "7 x64" "build 7601, Service Pack 1" 
R> 
@eddelbuettel
Owner

I can however replicate it so that is a first step in the right direction:

edd@max:~/git/anytime(master)$ R --slave -e 'source("tests/simple.R", echo=TRUE)'

R> Sys.setenv(TZ = "Australia/Sydney")

R> library(anytime)

R> anydate(20150101)
[1] "2014-12-31"

R> anydate("2015/01/01")
[1] "2014-12-31"

R> anytime(20150101)
[1] "2015-01-01 AEDT"

R> anytime("2015/01/01")
[1] "2015-01-01 AEDT"

R> anydate("2015/01/01", tz = "CET")
[1] "2014-12-31"

R> anydate("2015/01/01", tz = "Australia/Sydney")
[1] "2014-12-31"
edd@max:~/git/anytime(master)$ 
@eddelbuettel
Owner

And I may have a fix. If we first pass to as.POSIXlt() and then to as.Date(), things seem to work:

edd@max:~/git/anytime(master)$ R --slave -e 'source("tests/simple.R", echo=TRUE)'

R> Sys.setenv(TZ = "Australia/Sydney")

R> library(anytime)

R> anydate(20150101)
[1] "2015-01-01"

R> anydate("2015/01/01")
[1] "2015-01-01"

R> anytime(20150101)
[1] "2015-01-01 AEDT"

R> anytime("2015/01/01")
[1] "2015-01-01 AEDT"

R> anydate("2015/01/01", tz = "CET")
[1] "2014-12-31"

R> anydate("2015/01/01", tz = "Australia/Sydney")
[1] "2015-01-01"

Try version 0.0.1.2 now in GitHub, or else wait for 0.0.2. More testing would be welcome.

And thanks a ton for this heads-up.
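
A note on why the detour through as.POSIXlt() can matter: a POSIXct value is a count of seconds since 1970-01-01 UTC, so the calendar date it maps to depends on which offset is applied before the seconds are cut into whole days. The sketch below is plain C++14 arithmetic, not anytime's code. The names Civil, floor_div, civil_from_days and date_at_offset are made up for illustration, and the fixed +11h offset (39600 s) is an assumption standing in for Sydney in January.

```cpp
// Illustration only: one instant, two calendar dates, depending on the
// UTC offset applied before truncating to whole days.
// None of these names come from anytime or R.

struct Civil {
    long long year;
    unsigned month;
    unsigned day;
};

// Floor division, so instants before 1970 round toward the earlier day.
constexpr long long floor_div(long long a, long long b) {
    return a / b - ((a % b != 0 && ((a < 0) != (b < 0))) ? 1 : 0);
}

// Days since 1970-01-01 -> proleptic Gregorian date, using the usual
// 400-year-era arithmetic (years are counted from March internally).
constexpr Civil civil_from_days(long long days) {
    const long long z = days + 719468;
    const long long era = (z >= 0 ? z : z - 146096) / 146097;
    const unsigned doe = static_cast<unsigned>(z - era * 146097);              // day of era
    const unsigned yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365; // year of era
    const unsigned doy = doe - (365 * yoe + yoe / 4 - yoe / 100);              // day of year
    const unsigned mp = (5 * doy + 2) / 153;                                   // month, March = 0
    const unsigned d = doy - (153 * mp + 2) / 5 + 1;
    const unsigned m = mp < 10 ? mp + 3 : mp - 9;
    const long long y = static_cast<long long>(yoe) + era * 400 + (m <= 2 ? 1 : 0);
    return Civil{y, m, d};
}

// Calendar date of `epoch_seconds` as seen at a fixed offset from UTC.
constexpr Civil date_at_offset(long long epoch_seconds, long long offset_seconds) {
    return civil_from_days(floor_div(epoch_seconds + offset_seconds, 86400));
}

// Midnight on 2015-01-01 at UTC+11, expressed as seconds since the epoch
// (1420070400 is 2015-01-01 00:00:00 UTC).
constexpr long long kSydneyMidnight = 1420070400LL - 39600LL;
```

Read at offset 0, kSydneyMidnight falls on 2014-12-31; read at +39600 s it falls on 2015-01-01. That is the same pattern as the before and after transcripts above, although the sketch does not claim to reproduce what R does internally.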

@eddelbuettel eddelbuettel added a commit that referenced this issue Sep 15, 2016
@eddelbuettel renamed
this was issue #5 not #4
cce9c8b
@tonyxv
tonyxv commented Sep 18, 2016

I've updated to the latest version 0.0.2. The problem still persists on my machine. Is there another TZ setting I need to set?

> install.packages("anytime")
--- Please select a CRAN mirror for use in this session ---
trying URL 'https://cran.ms.unimelb.edu.au/bin/windows/contrib/3.3/anytime_0.0.2.zip'
Content type 'application/zip' length 606552 bytes (592 KB)
downloaded 592 KB

package ‘anytime’ successfully unpacked and MD5 sums checked

The downloaded binary packages are in
        C:\Users\xxxx\AppData\Local\Temp\Rtmp8wVj6B\downloaded_packages
> Sys.setenv(TZ = "Australia/Sydney")
> library(anytime)
> anydate(20150101)
[1] "2014-12-31"
> anydate("2015/01/01")
[1] "2014-12-31"
> anytime(20150101)
[1] "2014-12-31 23:00:00 AEDT"
> anytime("2015/01/01")
[1] "2014-12-31 23:00:00 AEDT"
> anydate("2015/01/01", tz = "CET")
[1] "2014-12-31"
> anydate("2015/01/01", tz = "Australia/Sydney")
[1] "2014-12-31"
> sessionInfo(package=NULL)
R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252    LC_MONETARY=English_Australia.1252 LC_NUMERIC=C                      
[5] LC_TIME=English_Australia.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] anytime_0.0.2

loaded via a namespace (and not attached):
[1] tools_3.3.1 Rcpp_0.12.7
@eddelbuettel
Owner

I have no idea; this does not reproduce here:

R> Sys.setenv(TZ = "Australia/Sydney") # important: set this before loading anytime
R> library(anytime)                    # will memoize TZ
R> packageVersion("anytime")           # 0.0.2 on CRAN vs 0.0.2.1 on GH shouldn't matter
[1] ‘0.0.2.1’
R> anydate(20150101)
[1] "2015-01-01"
R> 

I fear that you are on your own until you have something I can reproduce.

@eddelbuettel eddelbuettel reopened this Sep 19, 2016
@eddelbuettel
Owner
eddelbuettel commented Sep 19, 2016 edited

Maybe it's a Windows issue, maybe it isn't. I wrote anytime mostly for anytime() and tossed in anydate() as a convenience. If that latter one does not work for you, so be it. I can't fix it from here, sadly.

@eddelbuettel
Owner

One way to help would be to examine all the subcomponents of unclass(as.POSIXlt(...)). Good luck.

@eddelbuettel
Owner

I just tried on windows7 and there too I am unable to replicate this:

R> Sys.setenv("TZ"="Australia/Sydney")
R> library(anytime)
R> packageVersion("anytime")
[1] ‘0.0.2’
R> anydate(20150101)
[1] "2015-01-01"
R> Sys.info()[1:3]
                 sysname                      release                      version     
               "Windows"                      "7 x64" "build 7601, Service Pack 1"     
R> Sys.time()
[1] "2016-09-20 03:27:04.65765 AEST"
R> 
@tonyxv
tonyxv commented Sep 19, 2016

Hi,
Thanks for looking into this.

> anytime("2015/01/01")
[1] "2014-12-31 23:00:00 AEDT"

I just had a thought: it looks like anytime() is returning a date/time that is 1 hour too early. I'm guessing it could have something to do with daylight saving time? I wonder if it will be fixed when we put clocks forward by 1 hour on 2-Oct-2016.

@eddelbuettel
Owner

Yes -- please try to look at unclass(as.POSIXlt(anytime("2015/01/01"))) and compare to what base R gets you. A field somewhere is probably not set. Also check what I do in this bit. I need that second term everywhere else -- but maybe it throws things off for you.

@tonyxv
tonyxv commented Sep 20, 2016

I get:

> unclass(as.POSIXlt(anytime("2015/01/01")))
$sec
[1] 0

$min
[1] 0

$hour
[1] 23

$mday
[1] 31

$mon
[1] 11

$year
[1] 114

$wday
[1] 3

$yday
[1] 364

$isdst
[1] 1

$zone
[1] "AEDT"

$gmtoff
[1] 39600

attr(,"tzone")
[1] "Australia/Sydney" "AEST"             "AEDT"            
> unclass(as.POSIXlt("2015/01/01"))
$sec
[1] 0

$min
[1] 0

$hour
[1] 0

$mday
[1] 1

$mon
[1] 0

$year
[1] 115

$wday
[1] 4

$yday
[1] 0

$isdst
[1] 1

$zone
[1] "AEDT"

$gmtoff
[1] NA

@eddelbuettel
Owner

Ok, thanks, that shows it as 'too late'. The day and hour have already shifted.

@eddelbuettel
Owner

But I just noticed something. On my Windows box I got (see above)

R> Sys.time()
[1] "2016-09-20 03:27:04.65765 AEST"
R> 

whereas you have AEDT. Only one of the two can be correct.

@tonyxv
tonyxv commented Sep 20, 2016

I'm seeing:

> Sys.time()
[1] "2016-09-20 11:14:35 AEST"
@eddelbuettel
Owner

Never mind, false alert. AEDT was for Jan 01.

In short, I have no relevant idea and cannot reproduce. Sorry.

@tonyxv
tonyxv commented Sep 20, 2016

Why is your anytime() a whole date:

R> anytime(20150101)
[1] "2015-01-01 AEDT"

whereas mine returns a date and time:

anytime(20150101)
[1] "2014-12-31 23:00:00 AEDT"
@eddelbuettel
Owner

Standard behaviour of format.POSIXct. Try

R> format( as.POSIXct( "2016-09-21 00:00:00" ))
[1] "2016-09-21"
R> 
@jason-turner2
jason-turner2 commented Oct 17, 2016 edited

Here in Sydney, Dec 31 happens during DST. If someone in Brisbane could test, that would be helpful too (no DST in Brisbane).

The result is a bit odd: "2014-12-31 23:00 AEST" is the same time as "2015-01-01 0:00 AEDT". In the example below, it looks like it's taken the numbers for AEST but stuck the "AEDT" symbol on the end (but we all know that looks can be deceiving).

$gmtoff is 11h, which is correct for AEDT.

Using anytime v 0.0.3, on R 3.3.1, x86_64, mingw32, on Windows 10:

> anydate(20150101)
[1] "2014-12-31"
> anytime(20150101)
[1] "2014-12-31 23:00:00 AEDT"

> unclass(as.POSIXlt(anytime("2015/01/01")))
$sec
[1] 0

$min
[1] 0

$hour
[1] 23

$mday
[1] 31

$mon
[1] 11

$year
[1] 114

$wday
[1] 3

$yday
[1] 364

$isdst
[1] 1

$zone
[1] "AEDT"

$gmtoff
[1] 39600

attr(,"tzone")
[1] "Australia/Sydney" "AEST"             "AEDT"            
@eddelbuettel
Owner

If only I knew where that hour offset creeps in. 😞

@jason-turner2

It gets a little weirder on Linux (Debian 7, R.version 3.1.1, anytime 0.0.3):

> anydate(20150101)
[1] "2014-12-31"
> anytime(20150101)
[1] "2014-12-31 13:00:00"
> unclass(as.POSIXlt(anytime("2015/01/01")))
$sec
[1] 0

$min
[1] 0

$hour
[1] 13

$mday
[1] 31

$mon
[1] 11

$year
[1] 114

$wday
[1] 3

$yday
[1] 364

$isdst
[1] 0

$zone
[1] ""

$gmtoff
[1] 0

attr(,"tzone")
[1] "NA" ""   ""

Checking the time and zone:

> Sys.time()
[1] "2016-10-17 12:47:32 AEDT"

Which is correct (at time of typing).

@eddelbuettel
Owner
eddelbuettel commented Oct 17, 2016 edited

anytime uses gettz to help it get a timezone as a fallback. What does it do for you?

R> library(anytime)
R> anytime(Sys.time(), tz="SystemV/CST6CDT")
[1] "2016-10-16 20:51:01.303017 CDT"
R> anytime(Sys.time(), tz="America/Chicago")
[1] "2016-10-16 20:51:10.63891 CDT"
R> anytime:::getTZ()
[1] "SystemV/CST6CDT"
R> 

I get the correct (nerdy) representation of Chicago-time as "CST6CDT".

Oh, and for Jan 1:

R> anytime(20150101)
[1] "2015-01-01 CST"
R> anydate(20150101)
[1] "2015-01-01"
R> 
@jason-turner2
jason-turner2 commented Oct 17, 2016 edited

On the Debian box:

anytime:::getTZ()
[1] NA

On the Win10 box:

> anytime:::getTZ()
[1] "Australia/Sydney"

Now, if I tell the Debian R session our time zone, I cannot reproduce the bug that first opened this issue:

> anytime(20150101, tz="Australia/Sydney")
[1] "2015-01-01 AEDT"
> anydate(20150101, tz="Australia/Sydney")
[1] "2015-01-01"

But on Windows, the bug is still there:

> anytime(20150101, tz="Australia/Sydney")
[1] "2014-12-31 23:00:00 AEDT"

Are there any underlying package versions I should check?

@eddelbuettel
Owner
eddelbuettel commented Oct 17, 2016 edited

Right. I think there are several issues here:

Linux

  • let us look into (micro)package gettz and see why you get NA. Oh, wait, maybe you don't have package gettz installed. Try
R> library(gettz)     # maybe install it first; it's to avoid the NA you saw
R> gettz::gettz()
[1] "America/Chicago"
R> 

and then, with some hope, we think the behaviour on Linux will be the same as when you tell it about Sydney.

Windows

We may have a subtle bug between R's use (on Windows) of its own TZ database, and what Boost date_time uses. That would require some digging and testing on Windows.

@jason-turner2

I installed gettz, and ran:

> library(gettz)
> gettz::gettz()
[1] "Australia/Sydney"

I re-started R, and ran

> library(gettz)
> library(anytime)
> anytime(20150101)
[1] "2014-12-31 13:00:00"
> gettz::gettz()
[1] "Australia/Sydney"
> anytime(20150101, tz=gettz())
[1] "2015-01-01 AEDT"
@eddelbuettel
Owner

Super helpful. What we have in the second set should not happen. There is a minimal amount of code in anytime's R/init.R -- we need to figure out why the (available) information from gettz is not used. Maybe !isTRUE(nzchar(tz)) is not right for you? Can you poke around?

@jason-turner2

Interesting.

> Sys.timezone()
[1] NA

> ! isTRUE(nzchar(Sys.timezone()))
[1] FALSE
@jason-turner2
jason-turner2 commented Oct 17, 2016 edited

Maybe this?

> tz <- Sys.timezone()
> (! isTRUE(nzchar(tz)) || is.na(tz))
[1] TRUE
>

because:

> nzchar(NA)
[1] TRUE
@eddelbuettel
Owner
eddelbuettel commented Oct 17, 2016 edited

Yes, I think that is better. I think in my case it was "" hence the test for zero length. For some reason you get NA so we better account for that.

I am just so happy that gettz does provide the coverage we need (ie finding Sydney for you).

@eddelbuettel
Owner

Yes, really my bad;

Value:

     ‘Sys.timezone’ returns an OS-specific character string, possibly
     ‘NA’ or an empty string (which on some OSes means ‘UTC’).  For the
     default ‘location = TRUE’ this will be a location such as
     ‘"Europe/London"’ if one can be ascertained.  For ‘location =
     FALSE’ this may be an abbreviation such as ‘"EST"’ or ‘"CEST"’ on
     Windows.

(from ?Sys.timezone)

@jason-turner2

Now, back to the Windows bug. I'm not familiar with the Boost date_time mentioned above. Where should I peek?

@eddelbuettel
Owner

Do you have g++ and Boost installed? I generally just start by working with five-line little programs with a single main().

The main pain here may be installing Boost. R can help -- the BH package we use for anytime has what we need. Rtools can give you g++.

@jason-turner2

I never spotted that:

...
 For ‘location = FALSE’ this may be an abbreviation such as ‘"EST"’ or ‘"CEST"’ on Windows.

Every day I learn that time zones are even messier than I believed yesterday. :-/

@eddelbuettel
Owner

Ditto, yet it ain't working:

R> Sys.timezone()
[1] "SystemV/CST6CDT"
R> Sys.timezone(location=TRUE)
[1] "SystemV/CST6CDT"
R> 
@jason-turner2

g++, yes.

Boost: I only have libboost-iostreams. If I install the full libboost, should I re-install anytime?

@eddelbuettel
Owner

Try installing anytime from source on Windows, and look at the actual commands. It will point -I.... into BH. We can use that from the command line and/or Rcpp.

@jason-turner2
jason-turner2 commented Oct 17, 2016 edited

The Windows box has BH installed (probably from binary), but no Boost I'm aware of apart from that.

I was able to successfully install anytime from source.

c:/Rtools/mingw_64/bin/g++  -I"C:/PROGRA~1/R/R-33~1.1/include" -DNDEBUG    -I"C:/Program Files/R/R-3.3.1/library/Rcpp/include" -I"C:/Program Files/R/R-3.3.1/library/BH/include" -I"d:/Compiler/gcc-4.9.3/local330/include"     -O2 -Wall  -mtune=core2 -c RcppExports.cpp -o RcppExports.o
c:/Rtools/mingw_64/bin/g++  -I"C:/PROGRA~1/R/R-33~1.1/include" -DNDEBUG    -I"C:/Program Files/R/R-3.3.1/library/Rcpp/include" -I"C:/Program Files/R/R-3.3.1/library/BH/include" -I"d:/Compiler/gcc-4.9.3/local330/include"     -O2 -Wall  -mtune=core2 -c anytime.cpp -o anytime.o
c:/Rtools/mingw_64/bin/g++ -shared -s -static-libgcc -o anytime.dll tmp.def RcppExports.o anytime.o -Ld:/Compiler/gcc-4.9.3/local330/lib/x64 -Ld:/Compiler/gcc-4.9.3/local330/lib -LC:/PROGRA~1/R/R-33~1.1/bin/x64 -lR
installing to C:/Users/J/Documents/R/win-library/3.3/anytime/libs/x64

Next step?

@jason-turner2

The bug is still there:

> library(anytime)
> anytime(20150101)
[1] "2014-12-31 23:00:00 AEDT"
@eddelbuettel
Owner

Yes, sure we haven't drilled yet. Your Boost-via-header-only-package BH is here: -I"C:/Program Files/R/R-3.3.1/library/BH/include"

You should now be able to do the equivalent of this (by adding -I with that directory for BH):

$ g++ -o /tmp/boost_date_parser boost_date_parser_cmdline.cpp
$ /tmp/boost_date_parser "2016-01-01 20:21:22"
ptime is 2016-Jan-01 20:21:22 -- seconds from epoch are 1451679682 from 2016-01-01 20:21:22
$ 

Code below.

// -*- mode: C++; c-indent-level: 4; c-basic-offset: 4; tab-width: 4 -*-

// cf http://stackoverflow.com/a/3787188/143305 and extended
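#include <cstdlib>      // exit
#include <ctime>        // std::time_t
#include <iostream>
#include <locale>
#include <sstream>      // std::istringstream
#include <string>
#include <boost/date_time.hpp>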

namespace bt = boost::posix_time;

const std::locale formats[] = {
    // std::locale(std::locale::classic(),new bt::time_input_facet("%x")),
    std::locale(std::locale::classic(),new bt::time_input_facet("%Y-%m-%d %H:%M:%S")),
    std::locale(std::locale::classic(),new bt::time_input_facet("%Y%m%d %H%M%S")),
    std::locale(std::locale::classic(),new bt::time_input_facet("%Y/%m/%d %H:%M:%S")),
    std::locale(std::locale::classic(),new bt::time_input_facet("%m/%d/%Y %H:%M:%S")),
    std::locale(std::locale::classic(),new bt::time_input_facet("%m-%d-%Y %H:%M:%S")),
    std::locale(std::locale::classic(),new bt::time_input_facet("%d.%m.%Y %H:%M:%S")),
    std::locale(std::locale::classic(),new bt::time_input_facet("%Y-%m-%d")),
    std::locale(std::locale::classic(),new bt::time_input_facet("%Y%m%d"))
};
const size_t formats_n = sizeof(formats)/sizeof(formats[0]);

std::time_t pt_to_time_t(const bt::ptime& pt) {
    bt::ptime timet_start(boost::gregorian::date(1970,1,1));
    bt::time_duration diff = pt - timet_start;
    return diff.ticks()/bt::time_duration::rep_type::ticks_per_second;
}

void seconds_from_epoch(const std::string& s) {
    bt::ptime pt, ptbase;
    for (size_t i=0; pt == ptbase && i < formats_n; ++i) {
        std::istringstream is(s);
        is.imbue(formats[i]);
        is >> pt;
    }
    if (pt == bt::ptime()) {
        std::cerr << "Parse error for " << s << '\n';
        exit(-1);
    }
    std::cout << "ptime is " << pt << " -- seconds from epoch are " 
              << pt_to_time_t(pt) << " from " << s << '\n';
}

int main(int argc, char *argv[]) {
    if (argc < 2) {
        std::cerr << "Usage: " << argv[0] << " arg1 [arg2...]\n";
        exit(-1);
    }
    for (int i=1; i<argc; i++) {
        seconds_from_epoch(argv[i]);
    }
    exit(0);
}
@eddelbuettel
Owner

It'll either be at this layer, or one step down in anytime when we convert to the numeric representation to feed into as.POSIXct(). We'll get there -- but end of a (long) day for me here.

You are being extremely helpful. Let's see if we can nail this thing over the next few days. Boost Date_time documentation is not for the faint of heart, but there are examples....

@jason-turner2

I'll be around for the next few days if you need some sleep. :)

@jason-turner2

Ok, here we go go go...

.\boost_date_parser_cmdline.exe "2016-01-01 20:21:22"
ptime is 2016-Jan-01 20:21:22 -- seconds from epoch are 1451679682 from 2016-01-01 20:21:22
@eddelbuettel
Owner

Very very good. Now if you poke into anytime's sources you will find the ptToDouble() function.

In it, I compute a dstadj. Try taking that out, and maybe we won't have the 'date comes back as one hour before midnight'. If so, the $64,000 question becomes when to adjust and when not to. It is a wee bit heuristic but it works over here...

@jason-turner2
jason-turner2 commented Oct 17, 2016 edited

Before we chase that: does boost_date_parser_cmdline.exe know about the local time zone, or is it assuming UTC? I ask because looking at that integer:

> as.POSIXct(1451679682 , origin=as.POSIXct('1970-01-01 0:00', tz='UTC'))
[1] "2016-01-02 07:21:22 AEDT"

Or is none of that relevant, because it's not yet relevant? :)

@eddelbuettel
Owner

It is set to work on localtime -- it is at the beginning of ptToDouble. (And you need the ) before tz='UTC')
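
The two readings in this exchange differ only by the offset applied to the same count of seconds. A small sketch, in C++14 arithmetic only, splits 1451679682 into a day index and a time of day. DayTime and split_at_offset are hypothetical names, and +39600 s is the AEDT offset reported earlier in the thread; this is not how Boost or R compute it.

```cpp
// Illustration only: split a seconds-since-1970 count into whole days and
// a time of day, as seen at a fixed offset from UTC.

struct DayTime {
    long long day_index;  // whole days since 1970-01-01
    int hour;
    int minute;
    int second;
};

constexpr DayTime split_at_offset(long long epoch_seconds, long long offset_seconds) {
    const long long shifted = epoch_seconds + offset_seconds;
    // Floor toward the earlier day so pre-1970 values are handled too.
    const long long day = shifted >= 0 ? shifted / 86400
                                       : -((-shifted + 86399) / 86400);
    const long long sod = shifted - day * 86400;  // seconds of day, 0..86399
    return DayTime{day,
                   static_cast<int>(sod / 3600),
                   static_cast<int>((sod % 3600) / 60),
                   static_cast<int>(sod % 60)};
}
```

At offset 0 the value lands on day 16801 (2016-01-01) at 20:21:22, which is the string that was parsed. At +39600 s it lands on day 16802 (2016-01-02) at 07:21:22, which is what as.POSIXct() printed above.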

@jason-turner2
jason-turner2 commented Oct 17, 2016 edited

I commented out in here:

  double totsec = tdiff.total_microseconds()/1.0e6, dstadj = 0;
    /*  <---- added by J
#if defined(_WIN32)
    if (totsec > 0) {           // on Windows, for dates before 1970-01-01: segfault
        dstadj = localAsTm->tm_isdst*60*60;
    }
#else
    dstadj = localAsTm->tm_isdst*60*60;
#endif
added by J ---> */
    return totsec - dstadj;

And that seems to remove one bug and add another. So now on to the $64,000 question: when and how to apply the DST correction under Windows Downunda. :)

> anytime(20150101)
[1] "2015-01-01 AEDT"
> anytime(20150601)
[1] "2015-05-31 23:00:00 AEST"
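
To make the adjustment being toggled here concrete, the sketch below mirrors only the final arithmetic of the snippet above (totsec - dstadj). adjusted_seconds and second_of_day are hypothetical stand-ins, not the real ptToDouble(), and the sketch says nothing about where the extra hour comes from on Windows.

```cpp
// Hypothetical stand-in for the last step shown above: subtract one hour
// per unit of the DST flag, otherwise return the value unchanged.
constexpr long long adjusted_seconds(long long totsec, int isdst) {
    return totsec - static_cast<long long>(isdst) * 60 * 60;
}

// Seconds into the day for a non-negative count, 0..86399.
constexpr long long second_of_day(long long totsec) {
    return totsec % 86400;
}
```

With the flag set, a value sitting exactly on midnight (for example 1420070400) moves to 82800 seconds into the previous day, i.e. 23:00. With the flag at zero it stays at midnight. Whether the flag should be applied at all is the open question in this thread.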
@eddelbuettel
Owner

Right. Boost and R seem to disagree about how to deal with Australia. Between them, one will be wrong.
Really not quite sure how to handle this... More thought needed.

@jason-turner2

When an idea hits, I'll be very happy to help where I can. Updates on this thread go to my Inbox. Good luck!

@eddelbuettel
Owner

If you have any ideas....

@jason-turner2

No ideas, but a question: is the BH package identical on Windows and Linux? i.e. nothing compiled. The Boost docs suggest it probably is. Just working out where to look.

@eddelbuettel
Owner
eddelbuettel commented Oct 17, 2016 edited

No, you just misunderstand how Boost and header-only libraries work -- see eg here on Wikipedia for a quick take.

There are LOTS of them with just headers. The parts of Boost which require linking are typically i/o related (eg for Boost Date_time you must link if you want to format strings rather than parse them).

All the OS-dependent stuff is dealt with via #define and #if the usual way.

@eddelbuettel
Owner

I wrote a little script that extracts all possible values for TZ from the zoneinfo file, and then parses a given string at the timezone, and in R formats to that timezone.

When I run that, all 611 values come out equal. I can email you the script--drop me a line at the usual email of edd@debian.org.

@eddelbuettel
Owner

The fact that it works for you on Linux is encouraging. This may all come down to a bug in the entry used on Windows. Now to find whether it is R, or Boost.

I'll follow-up with a little R/C++ hybrid making use of the 611 TZ values.

@eddelbuettel
Owner

I thought I had something but I don't.

Boost would need its own timezone db to do what I planned to do --> overkill.

@eddelbuettel
Owner

@jason-turner2 Any chance you could give the current master branch a spin, particularly the UTC parsing? That should give us a clue vis-a-vis the time offset as we can now pin things down to UTC-to-localtime-by-R-only.

@jason-turner2

@eddelbuettel , happy to do it. Might not be able today; will definitely get to it this week.

@eddelbuettel
Owner

Sounds good! We should keep each other honest and try to chase this one down.

Having utctime() and utcdate() should help a little. I suspect a Windows+Boost issue, but we are still far from "proof" on this.

@eddelbuettel
Owner

@jason-turner2 : I think I found something crazy.

First off, I had thought the tm_isdst field could only be zero or one; I just learned from help(DateTimeClasses) that minus one is also possible. So I poked around a little. Start, if you can, with the current master in GitHub and add the line

    Rcpp::Rcout << "FYI, isdst is " << localAsTm->tm_isdst << std::endl;

as the penultimate line in ptToDouble() in file src/anytime.cpp -- currently line 149 for me.
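
Because tm_isdst is tri-state in C (positive: DST in effect, zero: not in effect, negative: information not available), multiplying it by 3600 turns the "unknown" case into an hour added rather than subtracted. A defensive variant might look like the sketch below. dst_adjustment_seconds and naive_dst_adjustment_seconds are hypothetical helpers, not something in anytime, and treating "unknown" as "no adjustment" is a policy assumption rather than a known fix.

```cpp
// Hypothetical helper: map the tri-state tm_isdst flag to an adjustment.
//   > 0   DST in effect       -> one hour
//   == 0  DST not in effect   -> nothing
//   < 0   information absent  -> nothing (a policy choice, not a fact)
constexpr int dst_adjustment_seconds(int tm_isdst) {
    return tm_isdst > 0 ? 60 * 60 : 0;
}

// The direct multiplication, for comparison: -1 becomes -3600.
constexpr int naive_dst_adjustment_seconds(int tm_isdst) {
    return tm_isdst * 60 * 60;
}
```

The two functions agree for 0 and 1 and differ only for negative flags, so printing the flag as suggested above is a cheap way to see whether that case ever occurs on a given platform.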

Then some experiments:

Brisbane only ever gets AEST, never AEDT?

$ Rscript -e 'library(anytime); anytime:::setTZ("Australia/Brisbane"); anytime("2016-07-11"); print(anytime:::getTZ())'
FYI, isdst is 1
[1] "2016-07-11 15:00:00 AEST"
[1] "Australia/Brisbane"
$ Rscript -e 'library(anytime); anytime:::setTZ("Australia/Brisbane"); anytime("2016-01-11"); print(anytime:::getTZ())'
FYI, isdst is 0
[1] "2016-01-11 16:00:00 AEST"
[1] "Australia/Brisbane"
$

Sydney and Canberra get a two-hour difference?

$ Rscript -e 'library(anytime); anytime:::setTZ("Australia/Sydney"); anytime("2016-08-11 12:00:00"); print(anytime:::getTZ())'
FYI, isdst is 1
[1] "2016-08-12 03:00:00 AEST"
[1] "Australia/Sydney"
$ Rscript -e 'library(anytime); anytime:::setTZ("Australia/Sydney"); anytime("2016-02-11 12:00:00"); print(anytime:::getTZ())'
FYI, isdst is 0
[1] "2016-02-12 05:00:00 AEDT"
[1] "Australia/Sydney"
$ Rscript -e 'library(anytime); anytime:::setTZ("Australia/Canberra"); anytime("2016-02-11 12:00:00"); print(anytime:::getTZ())'
FYI, isdst is 0
[1] "2016-02-12 05:00:00 AEDT"
[1] "Australia/Canberra"
$ Rscript -e 'library(anytime); anytime:::setTZ("Australia/Canberra"); anytime("2016-08-11 12:00:00"); print(anytime:::getTZ())'
FYI, isdst is 1
[1] "2016-08-12 03:00:00 AEST"
[1] "Australia/Canberra"
$ 

All this was on Ubuntu 16.04.1; might be interesting on Windows too.

CCing @bobjansen who wants to help with testing. Six eyes probably beat four.

@jason-turner2

I did this:

  1. uninstalled anytime and closed the R session.
  2. pulled the latest from github and edited as requested. (look at me, mum! I'm a C++ hacker now!) :)
  3. installed from the command prompt. Started a fresh R session.
  4. ran the examples.

There is certainly some weirdness.

I added a line to print each result with the GMT offset, to keep from confusing myself. It didn't work; I'm confused.

Queensland (the state that contains Brisbane) never does DST, but the isdst flag shown during the test is correct for the states that do (e.g. NSW, Victoria, etc). Brisbane seems to subtract one hour or two. Very strange.

As you probably know, when Sydney has DST (like now), Brisbane is one hour behind Sydney (some less-than charitable Sydney residents say Brisbane is in fact fifty years plus one hour behind Sydney, but I'm not convinced this is true). During Standard Time hours, the clocks are the same in both cities.

In the results below, Sydney and Canberra both subtract an hour, DST or not. Also weird.

On Windows 10, R-3.3.1 (Bug in your hair), 64-bit.

> library(anytime) 
> 
> # Brisbane should not correct for DST ever: Queensland
> # never has DST - they think it fades the curtains.
> anytime:::setTZ("Australia/Brisbane") 
> anytime("2016-07-11")
FYI, isdst is 0
[1] "2016-07-10 23:00:00 AEST"
> format(anytime("2016-07-11"), "%Y-%m-%d %H:%M:%S %z")
FYI, isdst is 0
[1] "2016-07-10 23:00:00 +1000"
> print(anytime:::getTZ())
[1] "Australia/Brisbane"
> 
> # a DST-candidate date (if Qld did DST, which they don't,
> # because it would upset the cattle) 
> anytime("2016-01-11")
FYI, isdst is 1
[1] "2016-01-10 22:00:00 AEST"
> format(anytime("2016-01-11"), "%Y-%m-%d %H:%M:%S %z")
FYI, isdst is 1
[1] "2016-01-10 22:00:00 +1000"
> print(anytime:::getTZ())
[1] "Australia/Brisbane"
> 
> # Sydney, where this computer actually lives.
> # and does DST during the southern-hemisphere summer,
> # roughly October to April.
> anytime:::setTZ("Australia/Sydney") 
> anytime("2016-08-11 12:00:00") 
FYI, isdst is 0
[1] "2016-08-11 11:00:00 AEST"
> format(anytime("2016-08-11 12:00:00"), "%Y-%m-%d %H:%M:%S %z")
FYI, isdst is 0
[1] "2016-08-11 11:00:00 +1000"
> print(anytime:::getTZ())
[1] "Australia/Sydney"
> 
> anytime:::setTZ("Australia/Sydney")
> anytime("2016-02-11 12:00:00")
FYI, isdst is 1
[1] "2016-02-11 11:00:00 AEDT"
> format(anytime("2016-02-11 12:00:00"), "%Y-%m-%d %H:%M:%S %z")
FYI, isdst is 1
[1] "2016-02-11 11:00:00 +1100"
> print(anytime:::getTZ())
[1] "Australia/Sydney"
> 
> # Canberra. Should (I believe) behave identically to Sydney
> 
> anytime:::setTZ("Australia/Canberra")
> anytime("2016-02-11 12:00:00")
FYI, isdst is 1
[1] "2016-02-11 11:00:00 AEDT"
> format(anytime("2016-08-11 12:00:00"), "%Y-%m-%d %H:%M:%S %z")
FYI, isdst is 0
[1] "2016-08-11 11:00:00 +1000"
> print(anytime:::getTZ())
[1] "Australia/Canberra"
> 
> anytime:::setTZ("Australia/Canberra") 
> anytime("2016-08-11 12:00:00")
FYI, isdst is 0
[1] "2016-08-11 11:00:00 AEST"
> format(anytime("2016-02-11 12:00:00"), "%Y-%m-%d %H:%M:%S %z")
FYI, isdst is 1
[1] "2016-02-11 11:00:00 +1100"
> print(anytime:::getTZ())
[1] "Australia/Canberra"
@eddelbuettel
Owner

Thanks for running these.
