Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add forbidden function casts RFC #3526

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

WaffleLapkin
Copy link
Member

This RFC proposes to forbid function -> integer as casts, since they have surprising aspects and are error prone.

Rendered

@WaffleLapkin WaffleLapkin added T-lang Relevant to the language team, which will review and decide on the RFC. A-function-pointers Proposals relating to function pointers. A-cast Proposals relating to explicit casts. labels Nov 12, 2023
Co-authored-by: Nilstrieb <48135649+Nilstrieb@users.noreply.github.com>
@Veykril Veykril added the A-edition-2024 Area: The 2024 edition label Nov 12, 2023
Co-authored-by: Jules Bertholet <jules.bertholet@gmail.com>
@programmerjake
Copy link
Member

programmerjake commented Nov 12, 2023

what happens when code pointers are bigger than data pointers? do you have to fall back on transmute to get the address, since casting to a data pointer will lose information?

an example is the 8086 Medium Memory Model, which has 16-bit data pointers and 32-bit code pointers.

@GoldsteinE
Copy link

GoldsteinE commented Nov 13, 2023

usize is guaranteed to be the same size as a data pointer, so seemingly if cast to a data pointer would truncate, cast to usize would too. Does Rust support any such platform?

https://doc.rust-lang.org/reference/type-layout.html#pointers-and-references-layout

@programmerjake
Copy link
Member

usize is guaranteed to be the same size as a data pointer, so seemingly if cast to a data pointer would truncate, cast to usize would too. Does Rust support any such platform?

I couldn't find any specific instances, but I know AVR also can have different code/data pointer sizes, and MSP430 too, just all the configurations I could find don't do that.

With rustc's GCC backend, there are a whole pile of weird and wonderful cpu architectures that rust could feasibly support in the future.

I know others have discussed different code/data pointer sizes before, in reference to dlsym-style dynamic linking and how POSIX therefore requires code/data pointers to be the same size and how not everything supports that.

@GoldsteinE
Copy link

Either way, if cast to *const () is a problem, a cast to usize also is (because they’re guaranteed to be the same size). A cast to a wider type could work, but “cast of function pointer to usize can truncate” is arguably an even bigger footgun.

Casting function pointer to an integer doesn’t seem to be a common operation either: you can’t do pointer arithmetics on it, so I think the only reason you would want this is some FFI / inline asm stuff? I think if you’re on an architecture where function pointers are larger than data pointers and you want their numeric values, you very much know what you’re doing and could write a transmute, and if you’re writing cross-platform code you would more often than not write as usize instead of as fnptrsize and get bit by truncation.

@GoldsteinE
Copy link

Side note: having something like sptr::OpaqueFnPtr in the standard library would solve the problem, because its .addr() could return an appropriately-sized integer on these architectures, causing a compilation error instead of runtime truncation.

@carbotaniuman
Copy link

I am ambivalent on this proposal as-is without something like addr on function items, as having to chain casts between a data pointer seems like we're making the actual usage worse, and may pose issues on other architectures. Having addr by contrast, is not only shorter, but allows makes it more clear what is being done.

@GoldsteinE
Copy link

may pose issues on other architectures

Could you elaborate on this, please? As I've written before, if cast to a data pointer truncates, cast to usize also does, so this doesn't seem to cause any new issues.

@programmerjake
Copy link
Member

As I've written before, if cast to a data pointer truncates, cast to usize also does, so this doesn't seem to cause any new issues.

if a function pointer doesn't fit in usize but does fit in some larger integer type, (e.g. u32 when usize is 16-bit), then disallowing function pointer to integer casts removes the obvious way to get a function's address by as casting, since casting to a data pointer and then to u32 looses information since a data pointer can only hold 16-bits when usize is 16-bit.

// assume 32-bit code pointers and 16-bit data pointers:
pub fn works(fptr: fn()) -> u32 {
    fptr as u32
}

pub fn broken(fptr: fn()) -> u32 {
    fptr as *const () as u32
}

@tmandry
Copy link
Member

tmandry commented Nov 15, 2023

Personally I would like to see us do something here. But I feel uneasy about recommending that function pointers be converted to data pointers, both for the reasons of size discrepancies that @programmerjake raised and because more generally, code and data could exist in different address spaces entirely.

@programmerjake
Copy link
Member

programmerjake commented Nov 15, 2023

I think we should add core::ffi::uintfptr and core::ffi::intfptr types (like C's uintptr_t and intptr_t, except for function pointers) and add a addr method to all function pointer types. Then I'd be fine removing function-pointer to integer casts in a new edition.

e.g.:

impl<A, R, const ABI: &'static str> extern ABI fn(..A) -> R { // all function pointer types (needs more impls for unsafe fn, etc.
    pub fn addr(self) -> core::ffi::uintfptr {
        unsafe { mem::transmute(self) }
    }
}

@Sky9x
Copy link

Sky9x commented Nov 17, 2023

Side note: having something like sptr::OpaqueFnPtr in the standard library would solve the problem, because its .addr() could return an appropriately-sized integer on these architectures, causing a compilation error instead of runtime truncation.

I like to use fn(!) for opaque fn ptrs because it isn't callable without a transmute and is generally less footgun than just unsafe fn()

@riking
Copy link

riking commented Nov 20, 2023

wasm is a well-known architecture where function pointers are in a different address space from data pointers (they're table indices) despite being the same size.

@GoldsteinE
Copy link

wasm is a well-known architecture where function pointers are in a different address space from data pointers (they're table indices) despite being the same size.

Is it relevant? Cast through a data pointer should still work I think?

@jonathanpallant
Copy link

jonathanpallant commented Nov 21, 2023

Either way, if cast to *const () is a problem, a cast to usize also is (because they’re guaranteed to be the same size)

CHERI is coming so I'm in favour of stopping conflating data pointers with function pointers with integers that can count a number of things in memory.

Making people use transmute seems reasonable, because it will complain if two things are not the same size, but do I like Fn::addr() -> core::ffi::uintfptr as well.

Also I spent nearly a decade programming the XAP2 processor (it's in almost every CSR Bluetooth chip) and it has entirely separate code and data address spaces. So, I'm sensitised to these kinds of issues because they bit me hard in C all the time.

@GoldsteinE
Copy link

Making people use transmute seems reasonable

Note that this requires fnptr to integer to fnptr transmutes to be valid, which is not obvious, cf. rust-lang/unsafe-code-guidelines#286, rust-lang/unsafe-code-guidelines#340

CHERI is coming so I'm in favour of stopping conflating data pointers with function pointers with integers that can count a number of things in memory.

Unfortunately, I don’t think we can stop conflating uintptr_t with size_t with ptrdiff_t without breaking stability guarantees, which I think means that Rust on CHERI must have 128-bit array indices and generally everything pointer-related must be 128 bits.

@WaffleLapkin
Copy link
Member Author

The concern about targets with data and function pointers being different is interesting. Note that I don't think difference in address spaces is significant — whatever the "blessed" way will be, we can just make sure we account for address spaces properly (going through *const () should not be a problem here, since we can just ptr::invalid).

For platforms with size_of::<fn()>() > size_of::<*const ()>() I have a few questions:

  1. Are there platforms like this that may be supported by Rust in the future (or are already supported)?
  2. Can we support this without making breaking changes?

Expanding on the second question, it seems like the assumption that size_of::<usize>() == size_of::<*const ()>() == size_of::<fn()>() is so ingrained in Rust, that we simply can't break it.

usize/ptr is guaranteed in usize docs (among with all the APIs) (emph. mine):

The pointer-sized unsigned integer type.

The size of this primitive is how many bytes it takes to reference any location in memory. For example, on a 32 bit target, this is 4 bytes and on a 64 bit target, this is 8 bytes.

Assumptions for fn() are not that public, but still we have a few examples in the docs

// (cut for brievety)
let bar_ptr: fn(i32) = not_bar_ptr; // force coercion to function pointer
assert_eq!(mem::size_of_val(&bar_ptr), mem::size_of::<usize>());
let fnptr: fn(i32) -> i32 = |x| x+2;
let fnptr_addr = fnptr as usize;
let fnptr = fnptr_addr as *const ();
let fnptr: fn(i32) -> i32 = unsafe { std::mem::transmute(fnptr) };
assert_eq!(fnptr(40), 42);

And also unstable APIs (FnPtr::addr) and also std code, for example fmt traits impls:
https://github.com/rust-lang/rust/blob/3acb261e214cd13ae54346af30eae5807501ec37/library/core/src/ptr/mod.rs#L1989 (this goes FnPtr -> *const () -> usize).

I'm not sure we have a way out of all of this, given that unsafe code is relatively likely to depend on stuff like this (my opinion).

Co-authored-by: Malo <57839069+MDLC01@users.noreply.github.com>
@traviscross
Copy link
Contributor

@rustbot labels -A-edition-2024 +A-maybe-next-edition

We discussed the timeline of edition items in the lang meeting today, and this didn't make the list, so it seems unlikely this will happen for Rust 2024. We'll keep this in mind for future editions. Thanks to everyone working on this and providing helpful feedback here.

@rustbot rustbot added the A-maybe-future-edition Changes of edition-relevance not targeted for a specific edition. label Apr 10, 2024
@rustbot rustbot removed the A-edition-2024 Area: The 2024 edition label Apr 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-cast Proposals relating to explicit casts. A-function-pointers Proposals relating to function pointers. A-maybe-future-edition Changes of edition-relevance not targeted for a specific edition. T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
Status: Idea
Development

Successfully merging this pull request may close these issues.

None yet