Year 2038 issues on win64 #2238

Closed
bagder opened this Issue Jan 15, 2018 · 11 comments

Comments

Projects
None yet
3 participants
@bagder
Owner

bagder commented Jan 15, 2018

  1. CURLINFO_FILETIME returns a time_t as a long, which thus can't return a date after 2038 on win64.

  2. CURLOPT_FILETIME takes a long argument, so libcurl on win64 cannot use dates after 2038 in time-based requests.

  3. There's code using long in the parsedate() function that will cap curl's ability to parse and return the correct value for date strings specifying dates in 2038 and later.

Most (all?) other 64 bit architectures have 64 bit longs so they're not similarly affected.

@monnerat

This comment has been minimized.

Show comment Hide comment
@monnerat

monnerat Jan 15, 2018

Collaborator

long is 32-bit on OS400 too...
OS400 is a (some kind of hardware) 64-bit architecture with 128-bit pointers. It supports long long for 64-bit integers.

Collaborator

monnerat commented Jan 15, 2018

long is 32-bit on OS400 too...
OS400 is a (some kind of hardware) 64-bit architecture with 128-bit pointers. It supports long long for 64-bit integers.

@bagder

This comment has been minimized.

Show comment Hide comment
@bagder

bagder Jan 15, 2018

Owner

Some notes:

Neither curl_easy_setopt() nor curl_easy_getinfo() currently have any type bit used for time_t values. Abuse the same one we use for curl_off_t could be tempting, but I'm convinced there are targets where time_t and curl_off_t differs in size so it wouldn't be a foolproof approach.

I think we'll be forced to introduce new type bits.

CURLINFO_FILETIME is documented to return -1 when unknown, which in an internal comment was an argument to not store the time as a time_t (which might be unsigned), but I think I'd rather prefer to have a separate boolean set TRUE/FALSE when we have the info or not.

Owner

bagder commented Jan 15, 2018

Some notes:

Neither curl_easy_setopt() nor curl_easy_getinfo() currently have any type bit used for time_t values. Abuse the same one we use for curl_off_t could be tempting, but I'm convinced there are targets where time_t and curl_off_t differs in size so it wouldn't be a foolproof approach.

I think we'll be forced to introduce new type bits.

CURLINFO_FILETIME is documented to return -1 when unknown, which in an internal comment was an argument to not store the time as a time_t (which might be unsigned), but I think I'd rather prefer to have a separate boolean set TRUE/FALSE when we have the info or not.

@monnerat

This comment has been minimized.

Show comment Hide comment
@monnerat

monnerat Jan 16, 2018

Collaborator

We could also have
#define CURL_TIME_UNKNOWN ((time_t) -1)

Collaborator

monnerat commented Jan 16, 2018

We could also have
#define CURL_TIME_UNKNOWN ((time_t) -1)

bagder added a commit that referenced this issue Jan 18, 2018

parsedate: fix date parsing for systems with 32 bit long
Make curl_getdate() handle dates before 1970 as well (returning negative
values).

Make test 517 test dates for 64 bit time_t.

This fixes bug (3) mentioned in #2238
@jay

This comment has been minimized.

Show comment Hide comment
@jay

jay Jan 19, 2018

Owner

#define CURL_TIME_UNKNOWN ((time_t) -1)

I think it's possible that some platforms can have multiple size time_t so for example libcurl can be built with time_t one size and the app can be built with it another size I think.

The best might be do it as CURLINFO_FILETIME_T and then curl_off_t which we can assume by 2038 will be 64 bits, and if it's not then long long wouldn't exist

Owner

jay commented Jan 19, 2018

#define CURL_TIME_UNKNOWN ((time_t) -1)

I think it's possible that some platforms can have multiple size time_t so for example libcurl can be built with time_t one size and the app can be built with it another size I think.

The best might be do it as CURLINFO_FILETIME_T and then curl_off_t which we can assume by 2038 will be 64 bits, and if it's not then long long wouldn't exist

@bagder

This comment has been minimized.

Show comment Hide comment
@bagder

bagder Jan 19, 2018

Owner

it's possible that some platforms can have multiple size time_t

Really? That would be totally weird. If that truly exists, I'm sure that's more of a mistake than a feature we need to take precautions to deal with.

My biggest concern with using -1 is that a time_t can actually legally hold (pretty much) all values already for datestamps. -1 equals the time one second before Jan 1st 1970. When we use -1 as an error code (which we do for curl_getdate) we actually need to make sure we don't return -1 for the valid date string that would return that value!

But as we've already gone with -1 in several places in our API we're stuck with that externally.

Owner

bagder commented Jan 19, 2018

it's possible that some platforms can have multiple size time_t

Really? That would be totally weird. If that truly exists, I'm sure that's more of a mistake than a feature we need to take precautions to deal with.

My biggest concern with using -1 is that a time_t can actually legally hold (pretty much) all values already for datestamps. -1 equals the time one second before Jan 1st 1970. When we use -1 as an error code (which we do for curl_getdate) we actually need to make sure we don't return -1 for the valid date string that would return that value!

But as we've already gone with -1 in several places in our API we're stuck with that externally.

@jay

This comment has been minimized.

Show comment Hide comment
@jay

jay Jan 19, 2018

Owner

I think I've seen it done somewhere like time64 or time32 typedef to timet. Or maybe I'm thinking of off_t

Owner

jay commented Jan 19, 2018

I think I've seen it done somewhere like time64 or time32 typedef to timet. Or maybe I'm thinking of off_t

@monnerat

This comment has been minimized.

Show comment Hide comment
@monnerat

monnerat Jan 19, 2018

Collaborator

-1 equals the time one second before Jan 1st 1970. When we use -1 as an error code (which we do for curl_getdate) we actually need to make sure we don't return -1 for the valid date string that would return that value!

Yes, this is true, but:

  • In Dec 31 1969 at 23:59:59 UTC, no OS was using this time convention, and converting dates when migrating old files should have lead to a situation where problems would (statistically) already have occurred: (signed) -1 is always that time, whatever time_t size is.
  • For an unsigned implementation, this means the very last second before counter overflows. This will be a minor problem compared to all others that this situation will cause for such an OS (remember Y2K bug!).
  • We are speaking about just a 1 second duration.

IMHO, a file can only have this time if it has been set explicitly. This cannot reasonably be a real time.

If we prefer to use the max value for all cases, we could have something equivalent to:

#if (time_t) -1 < (time_t) 0
#define CURL_TIME_UNKNOWN (((time_t) -1) >> 1)
#else
#define CURL_TIME_UNKNOWN ((time_t) -1)
#endif
Collaborator

monnerat commented Jan 19, 2018

-1 equals the time one second before Jan 1st 1970. When we use -1 as an error code (which we do for curl_getdate) we actually need to make sure we don't return -1 for the valid date string that would return that value!

Yes, this is true, but:

  • In Dec 31 1969 at 23:59:59 UTC, no OS was using this time convention, and converting dates when migrating old files should have lead to a situation where problems would (statistically) already have occurred: (signed) -1 is always that time, whatever time_t size is.
  • For an unsigned implementation, this means the very last second before counter overflows. This will be a minor problem compared to all others that this situation will cause for such an OS (remember Y2K bug!).
  • We are speaking about just a 1 second duration.

IMHO, a file can only have this time if it has been set explicitly. This cannot reasonably be a real time.

If we prefer to use the max value for all cases, we could have something equivalent to:

#if (time_t) -1 < (time_t) 0
#define CURL_TIME_UNKNOWN (((time_t) -1) >> 1)
#else
#define CURL_TIME_UNKNOWN ((time_t) -1)
#endif
@bagder

This comment has been minimized.

Show comment Hide comment
@bagder

bagder Jan 19, 2018

Owner

Right, but setting the date on files isn't totally unreasonable use-case. Perhaps you set the date of old scanned photos, or PDFs of old books or similar to the date of their origins.

My linux system has no problem with very old dates, but I see that Apache doesn't like to serve files from before 1970:

$ touch -d "1 jan 1804" $apacheroot/1804.html
$ ls -l $apacheroot/1804.html
-rw-r--r-- 1 daniel daniel 0 jan  1  1804 1804.html
$ curl -sI localhost/1804.html | grep ^Last
Last-Modified: Thu, 01 Jan 1970 00:00:00 GMT
Owner

bagder commented Jan 19, 2018

Right, but setting the date on files isn't totally unreasonable use-case. Perhaps you set the date of old scanned photos, or PDFs of old books or similar to the date of their origins.

My linux system has no problem with very old dates, but I see that Apache doesn't like to serve files from before 1970:

$ touch -d "1 jan 1804" $apacheroot/1804.html
$ ls -l $apacheroot/1804.html
-rw-r--r-- 1 daniel daniel 0 jan  1  1804 1804.html
$ curl -sI localhost/1804.html | grep ^Last
Last-Modified: Thu, 01 Jan 1970 00:00:00 GMT
@bagder

This comment has been minimized.

Show comment Hide comment
@bagder

bagder Jan 19, 2018

Owner

Negative values only work if time_t is signed though. I think need to adjust for that in #2250

Owner

bagder commented Jan 19, 2018

Negative values only work if time_t is signed though. I think need to adjust for that in #2250

@monnerat

This comment has been minimized.

Show comment Hide comment
@monnerat

monnerat Jan 19, 2018

Collaborator

If we admit to have a single CURL_TIME_UNKNOWN value out of 4G possible time values, I think the max time_t value suggestion above is the most reasonable (possibly biasing matching real file time by one if needed, as you did in #2250 but downwards), because it would represent the time 1 second before system's "end of the world". In addition, I hope systems providing 32-bit only time_t will not be supported anymore in 2038 !?!

Else you cannot avoid an external FALSE/TRUE indicator.

Collaborator

monnerat commented Jan 19, 2018

If we admit to have a single CURL_TIME_UNKNOWN value out of 4G possible time values, I think the max time_t value suggestion above is the most reasonable (possibly biasing matching real file time by one if needed, as you did in #2250 but downwards), because it would represent the time 1 second before system's "end of the world". In addition, I hope systems providing 32-bit only time_t will not be supported anymore in 2038 !?!

Else you cannot avoid an external FALSE/TRUE indicator.

bagder added a commit that referenced this issue Jan 19, 2018

parsedate: fix date parsing for systems with 32 bit long
Make curl_getdate() handle dates before 1970 as well (returning negative
values).

Make test 517 test dates for 64 bit time_t.

This fixes bug (3) mentioned in #2238

bagder added a commit that referenced this issue Jan 25, 2018

parsedate: fix date parsing for systems with 32 bit long
Make curl_getdate() handle dates before 1970 as well (returning negative
values).

Make test 517 test dates for 64 bit time_t.

This fixes bug (3) mentioned in #2238

Closes #2250
@bagder

This comment has been minimized.

Show comment Hide comment
@bagder

bagder Jan 25, 2018

Owner

CURLOPT_FILETIME actually takes a long set to 1 or 0 so that's not a problem.

CURLOPT_TIMEVALUE however takes a long for a time...

Owner

bagder commented Jan 25, 2018

CURLOPT_FILETIME actually takes a long set to 1 or 0 so that's not a problem.

CURLOPT_TIMEVALUE however takes a long for a time...

bagder added a commit that referenced this issue Jan 25, 2018

time: support > year 2038 time stamps for system with 32bit long
... with the introduction of CURLOPT_TIMEVALUE_LARGE and
CURLINFO_FILETIME_T.

Fixes #2238

bagder added a commit that referenced this issue Jan 26, 2018

time: support > year 2038 time stamps for system with 32bit long
... with the introduction of CURLOPT_TIMEVALUE_LARGE and
CURLINFO_FILETIME_T.

Fixes #2238

bagder added a commit that referenced this issue Jan 29, 2018

time: support > year 2038 time stamps for system with 32bit long
... with the introduction of CURLOPT_TIMEVALUE_LARGE and
CURLINFO_FILETIME_T.

Fixes #2238

@bagder bagder closed this in 8f69a9f Jan 30, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment