Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix undefined behaviour when loading control file #687

Merged

Conversation

syvb
Copy link
Contributor

@syvb syvb commented Sep 14, 2022

The type of __pgx_marker is extern "C" fn() -> eyre::Result<ControlFile>, but it was incorrectly called as extern "C" fn() -> SqlGraphEntity in cargo-pgx, which is undefined behavior since it's UB to assume that a valid value for an eyre::Result is also a valid value for a SqlGraphEntity. On my M1 macOS device the undefined behavior results in an unconditional Failed to get control file information panic. This PR fixes that.

The only reason this ever worked was because on some platforms eyre::Result<ControlFile> and SqlGraphEntity both can have a ControlFile after a 1 byte long prefix. And it happened to be the case that Result::Ok and SqlGraphEntity::ExtensionRoot both correpond to a prefix of 0.

I've also wrote a similar fix for the same issue 0.4, I can make a PR against master with that commit if you want to backport this to the 0.4.x release series.

@workingjubilee
Copy link
Member

There are... many fixes in 0.5.0 that will not appear in 0.4. This will be another one of them.

Comment on lines 393 to 395
let symbol: libloading::os::unix::Symbol<
unsafe extern "C" fn() -> eyre::Result<ControlFile>,
> = lib
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems unlikely a Result returning function like this is actually FFI-safe, extern "C" fn or no. Rust enums are usually not FFI-safe except in some qualified cases. This probably works but it is probably still UB, unfortunately. Maybe this should be using extern "Rust" fn?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That does look like UB to me. That issue also affects the other two external function calls in this file (to __pgx_sql_mappings and __pgx_internals*), which all return repr(Rust) enums.

I think even a extern "Rust" fn would still technically be UB since the compiler doesn't guarantee that the representation of an item is the same across compilation sessions (although in practice it usually is, at least when the target and compiler version are the same). Serializing data to a C array as it goes across the FFI boundary would remove that UB (although doing so would probably result in a minor hit to performance).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree the guarantees are very thin here, but a repr(Rust) enum should have a consistent result, in practice, with the same compiler (there are various reasons why this has to be true), and is less likely to be a bear (on performance or otherwise).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the PR to make the relevant functions extern "Rust". extern "Rust" is implied if nothing is specified but I wrote it out explicitly on the relevant functions for clarity.

I also documented that the cargo-pgx compiler should match the compiler used to compile crates that use it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turns out rustfmt doesn't like explicit extern "Rust", so I've made them implict. (rustfmt only changed the ones that weren't in a macro but I did all of them for consistency)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think #[rustfmt::skip] is better here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to use #[rustfmt::skip].

@syvb
Copy link
Contributor Author

syvb commented Sep 14, 2022

I was able to determine that the cause of this issue is a rustc change sometime between beta 1.64 and nightly 2022-09-11. I don't think having an aarch64 CPU actually caused the issue, but instead the LLVM 14 -> 15 bump might be responsible.

Edit to confirm: just tested on x86 Linux with the latest nightly and the same issue occurs

@syvb syvb force-pushed the 0.5_fix-controlfile-ub branch 2 times, most recently from 9de4455 to af09eb0 Compare September 14, 2022 21:02
Various functions used in schema generator were marked as extern "C"
despite not having FFI-safe types (and not being useful to call from C
code).
Copy link
Member

@workingjubilee workingjubilee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like everything is good to go! Thank you!

@workingjubilee workingjubilee merged commit bb393a1 into pgcentralfoundation:develop Sep 14, 2022
syvb added a commit to syvb/pgx that referenced this pull request Sep 15, 2022
…n#687)

* Fix undefined behaviour when loading control file
* Use extern "Rust" when function types not FFI-safe

Various functions used in schema generator were marked as extern "C"
despite not having FFI-safe types (and not being useful to call from C
code).

* Document requirement for compiler to match
bors bot added a commit to timescale/timescaledb-toolkit that referenced this pull request Sep 15, 2022
536: Document requirement to use same compiler for cargo-pgx and Toolkit r=Smittyvb a=Smittyvb

The same compiler must be used to compile cargo-pgx and Toolkit, or bad things (undefined behaviour) might happen. See pgcentralfoundation/pgrx#687 for why this is the case.

This PR updates the installation documentation in the README to indicate this. I split the command in two since you only need to reinstall cargo-pgx, there's no need to re-run `cargo pgx init`.

Co-authored-by: Smitty <smitty@timescale.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants