Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

64 bit time_t #17393

Closed
hoodmane opened this issue Jul 8, 2022 · 15 comments · Fixed by #17401
Closed

64 bit time_t #17393

hoodmane opened this issue Jul 8, 2022 · 15 comments · Fixed by #17401

Comments

@hoodmane
Copy link
Contributor

hoodmane commented Jul 8, 2022

As discussed in pyodide/pyodide#2841, 2038 is only 16 years from now so it would be great to be able to represent dates further into the future as unix timestamps. In interest of this, we would like to be able to opt into a 64 bit time_t (or have it as the default).

sbc100 added a commit that referenced this issue Jul 8, 2022
This brings us back in line with upstream musl.  The change to 32-bit
was only recently made in #16966. The reason we made this change was
made was because we had certain C library calls that were implemented in
JS that returned `time_t`.  Since returning 64-bit values from JS
functions is not always easy (we don't always have WASM_BIGINT
available) that simplest solution was to define `time_t` to 32-bit which
doesn't have issues at the JS boundary.

However, in the intervening time many of the `time_t`-returning function
have been moved into native code (See #16606 and #16439) with only two
remaining: _mktime_js and _timegm_js.  So this change redefines just
those two functions to return `int` while keeping `time_t` itself as
64-bit.

Fixes: #17393
sbc100 added a commit that referenced this issue Jul 8, 2022
This brings us back in line with upstream musl.  The change to 32-bit
was only recently made in #16966. The reason we made this change was
made was because we had certain C library calls that were implemented in
JS that returned `time_t`.  Since returning 64-bit values from JS
functions is not always easy (we don't always have WASM_BIGINT
available) that simplest solution was to define `time_t` to 32-bit which
doesn't have issues at the JS boundary.

However, in the intervening time many of the `time_t`-returning function
have been moved into native code (See #16606 and #16439) with only two
remaining: _mktime_js and _timegm_js.  So this change redefines just
those two functions to return `int` while keeping `time_t` itself as
64-bit.

Fixes: #17393
sbc100 added a commit that referenced this issue Jul 8, 2022
This brings us back in line with upstream musl.  The change to 32-bit
was only recently made in #16966. The reason we made this change was
made was because we had certain C library calls that were implemented in
JS that returned `time_t`.  Since returning 64-bit values from JS
functions is not always easy (we don't always have WASM_BIGINT
available) that simplest solution was to define `time_t` to 32-bit which
doesn't have issues at the JS boundary.

However, in the intervening time many of the `time_t`-returning function
have been moved into native code (See #16606 and #16439) with only two
remaining: _mktime_js and _timegm_js.  So this change redefines just
those two functions to return `int` while keeping `time_t` itself as
64-bit.

Fixes: #17393
sbc100 added a commit that referenced this issue Jul 8, 2022
This brings us back in line with upstream musl.  The change to 32-bit
was only recently made in #16966. The reason we made this change was
made was because we had certain C library calls that were implemented in
JS that returned `time_t`.  Since returning 64-bit values from JS
functions is not always easy (we don't always have WASM_BIGINT
available) that simplest solution was to define `time_t` to 32-bit which
doesn't have issues at the JS boundary.

However, in the intervening time many of the `time_t`-returning function
have been moved into native code (See #16606 and #16439) with only two
remaining: _mktime_js and _timegm_js.  So this change redefines just
those two functions to return `int` while keeping `time_t` itself as
64-bit.

Fixes: #17393
sbc100 added a commit that referenced this issue Jul 8, 2022
This brings us back in line with upstream musl.  The change to 32-bit
was only recently made in #16966. The reason we made this change was
made was because we had certain C library calls that were implemented in
JS that returned `time_t`.  Since returning 64-bit values from JS
functions is not always easy (we don't always have WASM_BIGINT
available) that simplest solution was to define `time_t` to 32-bit which
doesn't have issues at the JS boundary.

However, in the intervening time many of the `time_t`-returning function
have been moved into native code (See #16606 and #16439) with only two
remaining: _mktime_js and _timegm_js.  So this change redefines just
those two functions to return `int` while keeping `time_t` itself as
64-bit.

Fixes: #17393
@tiran
Copy link
Contributor

tiran commented Jul 17, 2022

I believe that the change broke some stat or utime() related APIs in Emscripten. CPython's test suite is failing on EMSDK 3.1.6 and TOT, e.g.

 ======================================================================
FAIL: test_utime (test.test_os.UtimeTests.test_utime)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/work/cpython-wasm-test/cpython-wasm-test/cpython/Lib/test/test_os.py", line 805, in test_utime
    self._test_utime(set_time)
  File "/home/runner/work/cpython-wasm-test/cpython-wasm-test/cpython/Lib/test/test_os.py", line 793, in _test_utime
    self.assertAlmostEqual(st.st_atime, atime_ns * 1e-9, delta=1e-6)
AssertionError: 1.1182583010295808e+17 != 1.002003 within 1e-06 delta (1.1182583010295808e+17 difference)

@sbc100
Copy link
Collaborator

sbc100 commented Jul 17, 2022

Ha, that's ironic/unfortunate given that its was python folks who wanted this changed. Any ideas why it would be failing?

@tiran
Copy link
Contributor

tiran commented Jul 17, 2022

Not yet, but I have a gut feeling that the utimensat syscall code should be changed to use bigint and 64bit ints.

__syscall_utimensat__sig: 'iippi',
__syscall_utimensat: function(dirfd, path, times, flags) {
path = SYSCALLS.getStr(path);
#if ASSERTIONS
assert(flags === 0);
#endif
path = SYSCALLS.calculateAt(dirfd, path, true);
if (!times) {
var atime = Date.now();
var mtime = atime;
} else {
var seconds = {{{ makeGetValue('times', C_STRUCTS.timespec.tv_sec, 'i32') }}};
var nanoseconds = {{{ makeGetValue('times', C_STRUCTS.timespec.tv_nsec, 'i32') }}};
atime = (seconds*1000) + (nanoseconds/(1000*1000));
times += {{{ C_STRUCTS.timespec.__size__ }}};
seconds = {{{ makeGetValue('times', C_STRUCTS.timespec.tv_sec, 'i32') }}};
nanoseconds = {{{ makeGetValue('times', C_STRUCTS.timespec.tv_nsec, 'i32') }}};
mtime = (seconds*1000) + (nanoseconds/(1000*1000));
}
FS.utime(path, atime, mtime);
return 0;
},

@tiran
Copy link
Contributor

tiran commented Jul 17, 2022

timespec struct definition depends on size of time_t

system/lib/libc/musl/include/alltypes.h.in:STRUCT timespec { time_t tv_sec; int :8*(sizeof(time_t)-sizeof(long))*(__BYTE_ORDER==4321); long tv_nsec; int :8*(sizeof(time_t)-sizeof(long))*(__BYTE_ORDER!=4321); };

@sbc100
Copy link
Collaborator

sbc100 commented Jul 18, 2022

Hmm.. it looks like the JS filesystem layer assumes timestamps are JS numbers, so we will not be able fully represent i64 times. Do you know if its reasonable for the value passed to utimens to be truncated such that value returns by stat that follows it looks some precision?

If not then I think I might just revert this change since 32-bit time_t is more compatible with the JS layer (which represents time as JS number (double) number of seconds): https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Date/getTime

@sbc100
Copy link
Collaborator

sbc100 commented Jul 18, 2022

Hmm.. looks like it may be ok to truncate: https://www.qnx.com/developers/docs/7.1/#com.qnx.doc.neutrino.lib_ref/topic/u/utimensat.html:

"The time that's actually recorded depends on the resolution in the filesystem's data structures. For example, no matter what you put in the tv_nsec field, the time that a FAT filesystem records uses a 2-second granularity. The recorded time is the greatest value supported by the filesystem that isn't greater than the specified time."

@sbc100
Copy link
Collaborator

sbc100 commented Jul 18, 2022

Still, I'm not sure it work the extra cost in code complexity and output side to have to deal with i64 here rather than i32.

@sbc100
Copy link
Collaborator

sbc100 commented Jul 18, 2022

Playing around with linux is looks like its currently capping tv_sec at 0x37fffffff (2446-05-10 15:38:53.000000000). Specifying anything bigger than this results in stat returning that value... so I guess its fine if we cap ours at 2^53-1.

@tiran
Copy link
Contributor

tiran commented Jul 18, 2022

>>> datetime.datetime.fromtimestamp(2**53 - 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: year 285428751 is out of range

Yeah, a cap at 2^53-1 looks fine to me, too! 👍

@sbc100
Copy link
Collaborator

sbc100 commented Jul 18, 2022

Hopefully #17459 will fix you failure

@hoodmane
Copy link
Contributor Author

Thanks @sbc100, @tiran!

@tiran
Copy link
Contributor

tiran commented Jul 18, 2022

I have created a GH repo with daily GHA to run smoke tests of EMSDK tip-of-tree with CPython 3.11 and main branch, https://github.com/tiran/cpython-wasm-test/actions

sbc100 added a commit that referenced this issue Jul 18, 2022
sbc100 added a commit that referenced this issue Jul 18, 2022
sbc100 added a commit that referenced this issue Jul 18, 2022
sbc100 added a commit that referenced this issue Jul 18, 2022
sbc100 added a commit that referenced this issue Jul 18, 2022
sbc100 added a commit that referenced this issue Jul 18, 2022
sbc100 added a commit that referenced this issue Jul 18, 2022
sbc100 added a commit that referenced this issue Jul 18, 2022
sbc100 added a commit that referenced this issue Jul 19, 2022
sbc100 added a commit that referenced this issue Jul 19, 2022
@tiran
Copy link
Contributor

tiran commented Jul 19, 2022

Thanks! The PR fixed most broken tests of CPython's test suite. The functions localtime_js and _gmtime_js need some attention, too. They still parse time_t as i32. Python expects that localtime() fails with EOVERFLOW when the input is out of supported range.

https://github.com/tiran/cpython-wasm-test/runs/7403750856

@tiran
Copy link
Contributor

tiran commented Jul 19, 2022

Fixed in #17471 . The PR also adds some tests for edge cases.

@tiran
Copy link
Contributor

tiran commented Jul 20, 2022

Good news! CPython's test suite is passing again with Emscripten tot-upstream.

xbcnn pushed a commit to xbcnn/emscripten that referenced this issue Jul 22, 2022
This brings us back in line with upstream musl.  The change to 32-bit
was only recently made in emscripten-core#16966. The reason we made this change was
made was because we had certain C library calls that were implemented in
JS that returned `time_t`.  Since returning 64-bit values from JS
functions is not always easy (we don't always have WASM_BIGINT
available) that simplest solution was to define `time_t` to 32-bit which
doesn't have issues at the JS boundary.

However, in the intervening time many of the `time_t`-returning function
have been moved into native code (See emscripten-core#16606 and emscripten-core#16439) with only two
remaining: _mktime_js and _timegm_js.  So this change redefines just
those two functions to return `int` while keeping `time_t` itself as
64-bit.

Fixes: emscripten-core#17393
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants