Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to 64 bit file and time APIs on GNU libc for 32bit systems #3175

Open
wants to merge 19 commits into
base: main
Choose a base branch
from

Conversation

snogge
Copy link
Contributor

@snogge snogge commented Mar 29, 2023

Use the 64 bit types and APIs included in GNU libc also for 32-bit systems. These are the types and APIs used when you compile your C code against GNU libc headers with the preprocessor options -D_FILE_OFFSET_BITS=64 -D_TIME_BITS=64.

This is required for 2038 compatible software built for 32-bit systems.

This PR is not done, I need some guidance with finishing it.

  1. Does it need some sort of configuration options to turn this on and off?
  2. There is a conflict for some of the stat functions where two signatures link against the same symbol. For C this is not a problem, but we probably need to make struct stat and struct stat64 be different names for the same struct. I don't know how to do that.
  3. It's possible that some types should be moved/split/joined in the file hierarchy. I'd appreciate some feedback on that, but I'm waiting on the answer to question 1 before moving forward.

I've done one commit for each modified type so it is easier to review the changes in isolation.

@rustbot
Copy link
Collaborator

rustbot commented Mar 29, 2023

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @JohnTitor (or someone else) soon.

Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (S-waiting-on-review and S-waiting-on-author) stays updated, invoking these commands when appropriate:

  • @rustbot author: the review is finished, PR author should check the comments and take action accordingly
  • @rustbot review: the author is ready for a review, this PR will be queued again in the reviewer's queue

@snogge
Copy link
Contributor Author

snogge commented Mar 29, 2023

Oops, left some debug/test commits in there. Removed now.

@snogge
Copy link
Contributor Author

snogge commented Apr 11, 2023

Hi @JohnTitor , do you have any feedback on this change?
Is there anything I can do to facilitate the review?
I'd appreciate any hints on solving question 2:

  1. There is a conflict for some of the stat functions where two signatures link against the same symbol. For C this is not a problem, but we probably need to make struct stat and struct stat64 be different names for the same struct. I don't know how to do that.

I'm not a Rust developer and not sure how I can create that type alias.

@bors
Copy link
Contributor

bors commented Apr 23, 2023

☔ The latest upstream changes (presumably #3218) made this pull request unmergeable. Please resolve the merge conflicts.

@snogge snogge force-pushed the time_bits=64 branch 7 times, most recently from 5dfc8ed to 7d2c859 Compare April 24, 2023 13:36
@JohnTitor
Copy link
Member

JohnTitor commented Apr 24, 2023

Thanks for the PR! But I think this is a huge breaking change and we have to discuss it before reviewing the changes (or, even submitting a PR). Could you open an issue instead?

@snogge
Copy link
Contributor Author

snogge commented Apr 25, 2023

Sure, I'll do that.

@snogge
Copy link
Contributor Author

snogge commented Apr 25, 2023

I'm working on updating this PR to handle problem 2 mentioned above and to pass tests on more platforms. Expect an update later this week.

@snogge snogge force-pushed the time_bits=64 branch 2 times, most recently from 5314484 to eb997cb Compare April 25, 2023 12:02
@snogge
Copy link
Contributor Author

snogge commented Apr 25, 2023

Sorry for the many pushes, that was meant only for my fork. Anyway, now the PR should handle the duplicate functions mentioned in problem 2, and also pass tests on more platforms.

@bors
Copy link
Contributor

bors commented May 6, 2023

☔ The latest upstream changes (presumably #3237) made this pull request unmergeable. Please resolve the merge conflicts.

@snogge
Copy link
Contributor Author

snogge commented Sep 7, 2023

Just rebased on main. Passes all tests in main.yml and bors.yml workflows.
Still not very clean though, feedback on code style would be welcome.

I considered adding a cargo feature to control this, but as that would be braking the ABI I guess I shouldn't?

@bors
Copy link
Contributor

bors commented Nov 12, 2023

☔ The latest upstream changes (presumably #3437) made this pull request unmergeable. Please resolve the merge conflicts.

@snogge
Copy link
Contributor Author

snogge commented May 8, 2024

I've reconstructed the PR to support an environment variable RUST_LIBC_TIME_BITS which can be set to 32 to force 32-bit time on 32-bit platforms. The default, based on the cargo environment variables is 64-bit time for those platforms.
I've also modified the CI workflows to test these variants.

#[cfg(gnu_time64_abi)]
__sec: ::c_ulong,
#[cfg(gnu_time64_abi)]
__usec: ::c_ulong,
pub type_: ::__u16,
Copy link

@plugwash plugwash May 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This breaks the evdev-rs crate with no obvious non-hacky way of fixing it (evdev-rs has other errors too, but they are errors I can deal with)

692s error[E0560]: struct input_event has no field named time
692s --> src/lib.rs:265:13
692s |
692s 265 | time: self.time.as_raw(),
692s | ^^^^ input_event does not have this field
692s |
692s = note: available fields are: __sec, __usec

The original C header deals with this using macros, which are defined differently for the non-time64 and time64 cases.
(non-time64)

#define input_event_sec time.tv_sec
#define input_event_usec time.tv_usec

(time64)

#define input_event_sec __sec
#define input_event_usec __usec

Not sure how this can be best dealt with in rust, the best I can think of would be to add get/set/get_mut methods.

I also suspect this structure definition is wrong on 32-bit linux systems with 64-bit time using C libraries other than glibc but that isn't directly related to this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for testing.
Does this definition of struct input_event work for evdev-rs - it will require code changes.

    pub struct input_event {
        #[cfg(any(target_pointer_width = "64", not(gnu_time64_abi)))]
        pub input_event_sec: ::time_t,
        #[cfg(all(target_pointer_width = "32", gnu_time64_abi))]
        pub input_event_sec: ::c_ulong,

        #[cfg(any(target_pointer_width = "64", not(gnu_time64_abi)))]
        pub input_event_usec: ::suseconds_t,
        #[cfg(all(target_pointer_width = "32", gnu_time64_abi))]
        pub input_event_usec: ::c_ulong,

        #[cfg(target_arch = "sparc64")]
        _pad1: ::c_int,

        pub type_: ::__u16,
        pub code: ::__u16,
        pub value: ::__s32,
    }

Copy link

@plugwash plugwash Jun 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Structurally this looks like a sensible compromise and one to which downstream code could be adapted fairly easily.

I still think the definition is probablly wrong on 32-bit linux systems with 64-bit time and C libraries other than glibc. Since afaict this is a kernel data structure I would not expect the field sizes to change with which C library is in use.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're probably right, I'll have to think about this some more.
But I'll update with this struct definition for now.

@plugwash
Copy link

plugwash commented Jun 4, 2024

After patching the filetime crate to build against the patched libc crate, i'm seeing a test failure. I guess there is some sort of problem with the file time APIs.

git clone https://github.com/plugwash/filetime
cd filetime
cargo test

---- tests::set_file_times_test stdout ----
thread 'tests::set_file_times_test' panicked at src/lib.rs:449:9:
assertion `left == right` failed: modification time should be updated
  left: FileTime { seconds: 1073741822, nanos: 0 }
 right: FileTime { seconds: 20000, nanos: 0 }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

@snogge
Copy link
Contributor Author

snogge commented Jun 7, 2024

@plugwash , you need to use the utimensat_time64 syscall in https://github.com/plugwash/filetime/blob/main/src/unix/linux.rs#L44
I haven't (yet) figured out why the call to set_file_times on https://github.com/plugwash/filetime/blob/main/src/lib.rs#L438 works, but something about the use of only the mtime argument causes the mtime to be set to the value of UTIME_OMIT.

Also, you might want to change to the futimens syscall. It uses only the file descriptor.

@plugwash
Copy link

plugwash commented Jun 8, 2024

Thanks a lot for the advice on the filetime issue.

I'm putting together a list of known fallout from the time64 update for rust in Debian along with the patches we are carrying to deal with it, it's possible I've missed some stuff. Some of these patches also make debian-specific assumptions, they should be regarded as pointers to stuff that needs fixing and potential fixes, not nessacerally patches that should be adopted as-is.

timespec now contains padding:

glibc has inconsitent definitions of suseconds_t and struct timeval:

input_event defition change:

explicit syscalls:

bindgen needs to pass time64 defines libclang:

issues related to bindgen:

snogge added 16 commits June 18, 2024 10:35
The gnu_time64_abi option indicates that a 32-bit arch should use
64-bit time_t and 64-bit off_t and simliar file related types.

32-bit platforms are identified by CARGO_CFG_TARGET_POINTER_WIDTH, but
x86_64-unknown-linux-gnux32 is a 64-bit platform and should not use
gnu_time64_abi even if it has 32-bit pointers.

riscv32 is a 32-bit platform but has always used 64-bit types for
time and file operations.  It should not use the gnu_time64_abi

The default - all relevant platforms should use 64-bit time - can be
overridden by setting the RUST_LIBC_TIME_BITS environment variable to
32.  It can also be set to 64 to force gnu_time64_abi, or default
which is the same as leaving it unset.  The last two options are
mainly useful for CI flows.
Set _TIME_BITS=64 and _FILE_OFFSET_BITS=64 preprocessor symbols for
the C test code as for building the code.
Set the basic types correctly for gnu_time64_abi (_TIME_BITS=64 and
_FILE_OFFSET_BITS=64).
Set the link names of relevant symbols to use be the same as when a C
program is built against GNU libc with -D_TIME_BITS=64 and
-D_FILE_OFFSET_BITS=64.
The actual values may be different on 32bit archs and glibc with
64-bit time.
Also add the F_[SG]ET*64 constants.
For 64 bit time on 32 bit linux glibc timeval.tv_usec is actually
__suseconds64_t (64 bits) while suseconds_t is still 32 bits.
Big-endian platforms wants 32 bits of padding before tv_nsec,
little-endian after.

GNU libc always uses long for tv_nsec.
@snogge snogge force-pushed the time_bits=64 branch 2 times, most recently from 299e6da to 473fcae Compare June 18, 2024 13:12
Do not use gnu/b32/time*.rs for riscv32 and sparc.  Add timex to
gnu/b32/(riscv32|sparc)/mod.rs instead.
Move the stat struct from gnu/b32/mod.rs to gnu/b32/time*.rs

For arm, mips, powerpc, riscv32, and x86, move stat64 to
time32.rs and add an stat64 = stat alias to time64.rs.

For sparc, do the same for statfs64 and statvfs64.
In Linux Tier1, run the tests for i686-unknown-linux-gnu with
RUST_LIBC_TIME_BITS set to 32, 64, and default.

In Linux Tier2, run the tests for arm-unknown-linux-gnueabihf and
powerpc-unknown-linux-gnu with RUST_LIBC_TIME_BITS set to 32, 64,
and default.  Use RUST_LIBC_TIME_BITS=defaults for the other
platforms.

In Build Channels Linux, build the relevant platforms with
RUST_LIBC_TIME_BITS unset and set to 32 and 64.
@snogge
Copy link
Contributor Author

snogge commented Jun 19, 2024

I'm putting together a list of known fallout from the time64 update for rust in Debian along with the patches we are carrying to deal with it, it's possible I've missed some stuff. Some of these patches also make debian-specific assumptions, they should be regarded as pointers to stuff that needs fixing and potential fixes, not nessacerally patches that should be adopted as-is.

This is great!

timespec now contains padding:

I don't see what we can do about this.

glibc has inconsitent definitions of suseconds_t and struct timeval:

I have no idea why glibc did not choose to redefine suseconds_t to a 64 bit type instead of using __suseconds64_t => int64_t.

* filetime - https://salsa.debian.org/rust-team/debcargo-conf/-/blob/master/src/filetime/debian/patches/suseconds_t-workaround.patch

* laurel - https://salsa.debian.org/rust-team/debcargo-conf/-/blob/master/src/laurel/debian/patches/use-timespec-new.patch

Not going via libc::timespec just looks better to me.

* libpulse-binding - https://salsa.debian.org/rust-team/debcargo-conf/-/blob/master/src/libpulse-binding/debian/patches/avoid-suseconds_t.patch

https://salsa.debian.org/rust-team/debcargo-conf/-/blob/master/src/libpulse-binding/debian/patches/avoid-suseconds_t.patch#L34
Using time_t here is easy but maybe a bit misleading. time_t is never involved in defining this member type in glibc.

* net2 - https://salsa.debian.org/rust-team/debcargo-conf/-/blob/master/src/net2/debian/patches/suseconds_t-workaround.patch

* nix - https://salsa.debian.org/rust-team/debcargo-conf/-/blob/master/src/nix/debian/patches/time_t_workarounds.diff

https://salsa.debian.org/rust-team/debcargo-conf/-/blob/master/src/nix/debian/patches/time_t_workarounds.diff#L66
Shouldn't that be i64? suseconds_t is a signed integer type.

* socket2 - https://salsa.debian.org/rust-team/debcargo-conf/-/blob/master/src/socket2/debian/patches/suseconds_t-workaround.patch

* unix-socket - https://salsa.debian.org/rust-team/debcargo-conf/-/blob/master/src/unix-socket/debian/patches/suseconds_t-workaround.patch

* vsock - https://salsa.debian.org/rust-team/debcargo-conf/-/blob/master/src/vsock/debian/patches/suseconds_t-workaround.patch

input_event defition change:

* evdev-rs - https://salsa.debian.org/rust-team/debcargo-conf/-/blob/master/src/evdev-rs/debian/patches/time64.patch

I've pushed the new input_event struct that looks like (except for conditionals and stuff)

    pub struct input_event {
        pub input_event_sec: ::time_t,
        pub input_event_sec: ::c_ulong,
    };

I think it is correct for all archs, I cannot figure out any where it shouldn't be.

explicit syscalls:

* filetime - https://salsa.debian.org/rust-team/debcargo-conf/-/blob/master/src/filetime/debian/patches/use-correct-syscall-on-time64.patch
  to be continued...........

Just use the libc::utimensat function instead? It will call the correct glibc wrapper around the correct syscall.
I think filetime calls a some other libc functions anyway.

bindgen needs to pass time64 defines libclang:

* bindgen - https://salsa.debian.org/rust-team/debcargo-conf/-/blob/master/src/bindgen/debian/patches/pass-time64-flags-to-libclang.patch

issues related to bindgen:

* selinux-sys - https://salsa.debian.org/rust-team/debcargo-conf/-/blob/master/src/selinux-sys/debian/patches/fix-bindgen.diff

@snogge
Copy link
Contributor Author

snogge commented Jun 19, 2024

input_event defition change:

* evdev-rs - https://salsa.debian.org/rust-team/debcargo-conf/-/blob/master/src/evdev-rs/debian/patches/time64.patch

I've pushed the new input_event struct that looks like (except for conditionals and stuff)

    pub struct input_event {
        pub input_event_sec: ::time_t,
        pub input_event_sec: ::c_ulong,
    };

I think it is correct for all archs, I cannot figure out any where it shouldn't be.

@plugwash, I reread your comment about this, and I now I can figure out where it wouldn't work. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants