Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issue #1209 #1210

Merged
merged 12 commits into from Jul 13, 2023
Merged

Conversation

eeeebbbbrrrr
Copy link
Contributor

@eeeebbbbrrrr eeeebbbbrrrr commented Jul 13, 2023

Cleanup lifetimes such that a Spi::connect() block can't return things that should be borrowed from the lifetime of that SPI connection, which includes cursors (SpiCursor).

This causes the test from #1209 to not compile, which is the desired result, with:

error[E0515]: cannot return value referencing temporary value
   --> pgrx-tests/src/tests/spi_tests.rs:516:13
    |
516 |               c.open_cursor("select 'hello' from generate_series(1, 10000)", None)
    |               ^-------------------------------------------------------------------
    |               |
    |  _____________temporary value created here
    | |
517 | |                 .fetch(10000)
518 | |                 .unwrap()
    | |_________________________^ returns a value referencing data owned by the current function
    |
    = help: use `.collect()` to allocate the iterator

error[E0515]: cannot return value referencing function parameter `c`
   --> pgrx-tests/src/tests/spi_tests.rs:516:13
    |
516 |               c.open_cursor("select 'hello' from generate_series(1, 10000)", None)
    |               ^-------------------------------------------------------------------
    |               |
    |  _____________`c` is borrowed here
    | |
517 | |                 .fetch(10000)
518 | |                 .unwrap()
    | |_________________________^ returns a value referencing data owned by the current function
    |
    = help: use `.collect()` to allocate the iterator

error[E0515]: cannot return value referencing temporary value
   --> pgrx-tests/src/tests/spi_tests.rs:523:30
    |
523 |             Spi::connect(|c| c.open_cursor("select 1", None).fetch(1).unwrap());
    |                              -------------------------------^^^^^^^^^^^^^^^^^^
    |                              |
    |                              returns a value referencing data owned by the current function
    |                              temporary value created here
    |
    = help: use `.collect()` to allocate the iterator

error[E0515]: cannot return value referencing function parameter `c`
   --> pgrx-tests/src/tests/spi_tests.rs:523:30
    |
523 |             Spi::connect(|c| c.open_cursor("select 1", None).fetch(1).unwrap());
    |                              -------------------------------^^^^^^^^^^^^^^^^^^
    |                              |
    |                              returns a value referencing data owned by the current function
    |                              `c` is borrowed here
    |
    = help: use `.collect()` to allocate the iterator

I am not sure how to write a test with pgrx' test suite to ensure something doesn't compile. That'll be easy to do over in PL/Rust, but I feel like we should have some kind of regression test around this here. Thoughts @thomcc or @workingjubilee?

Cleanup lifetimes such that a `Spi::connect()` block can't returned things that should be borrowed from the lifetime of that SPI connection, which includes cursors (`SpiCursor`).
@workingjubilee
Copy link
Member

TIME FOR COMPILETEST

@workingjubilee
Copy link
Member

@eeeebbbbrrrr The library of choice to test what you want to test is https://docs.rs/ui_test/latest/ui_test/ and it enables what are called "UI tests" by rustc's testing infrastructure. I said "compiletest" because it is a revised version of https://github.com/rust-lang/rust/tree/1b3e68692592d71938df8e7fd8e53fbe5e7ef58c/src/tools/compiletest

@david-monroe
Copy link

david-monroe commented Jul 13, 2023

Unfortunately I don't think it is that simple.

  1. SpiHeapTupleData is also just a pointer to allocated data that is freed by SPI_finish.

  2. There is impl<'a> FromDatum for &'a str which I think allows you to get a &'static str pointing into temporary data via

        pub fn get<T: IntoDatum + FromDatum>(&self, ordinal: usize) -> Result<Option<T>> {

    FromDatum should likely have a lifetime parameter similar to Deserialize:

    impl<'de: 'a, 'a> Deserialize<'de> for &'a str {

You should do a thorough analysis of lifetimes in this crate.

@workingjubilee
Copy link
Member

We have indeed been doing that. I have been working on various possible solutions for actually guaranteeing type safety over FFI. Unfortunately it's very hard to get someone who both:

  • is very familiar with the "objective lifetimes" of Postgres
  • is very familiar with how Rust's lifetime system works

And it's hard to simply fuse two people together, so while Eric knows the former and I know the latter, we can't necessarily do a Vulcan mind meld and even if we did, there's a lot of code to review and fix.

As you might have noticed, approximately everything implements FromDatum and it is a bound for everything thus it is used everywhere so "just adding a lifetime parameter" probably is close to "rewrite the entire 34000 lines of pgrx, pgrx-macros, and pgrx-sql-entity-graph".

@eeeebbbbrrrr
Copy link
Contributor Author

Unfortunately I don't think it is that simple.

  1. SpiHeapTupleData is also just a pointer to allocated data that is freed by SPI_finish.

There's a follow-up commit here that addresses this.

@eeeebbbbrrrr
Copy link
Contributor Author

I want to spend a little more time analyzing this, but I think this PR is now catching the things.

It looks like there were two problems here:

  • not enough of the Spi types and functions have the proper (any?) lifetime annotations to ensure their instances or return values all associated with the surrounding Spi::connect() lifetime.

  • SpiHeapTupleDataEntry::value() is allocating the "FromDatum" version of the backing Datum in the wrong Postgres MemoryContext

Copy link
Member

@workingjubilee workingjubilee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-blocking, I think.

pgrx/src/spi.rs Show resolved Hide resolved
pgrx/src/spi.rs Outdated Show resolved Hide resolved
#[allow(dead_code)]
status_code: SpiOkCodes,
table: Option<*mut pg_sys::SPITupleTable>,
table: Option<&'conn mut pg_sys::SPITupleTable>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we later use this to access data allocated past the SPITubleTable, or does it alias with anything else? I notice it doesn't get a lot of action (directly) in our test infra, so I'm wondering if the new &'conn mut will break existing user code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Being an implementation detail, I'm unclear how it would break existing user code?

If that thing were to leak out to the public API tho, at least now it has a proper lifetime and might affect compilation, which we'd want, yeah?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a few details where internal changes can actually leak out due to Rust unfortunately inferring certain information that is otherwise fairly laborious to annotate on every single type ever, and I can never off-handedly remember exactly which things get affected by this. I think *mut T to &'a mut T is usually nonbreaking, however, it's &'a T to &'a mut T that reliably gotchas people.

This is sometimes referred to as the "perfect derive problem" or "semver hazards": https://github.com/rust-lang/lang-team/blob/master/design-meeting-minutes/2022-04-12-implied-bounds-and-perfect-derive.md

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jubilee, is there an action item here? I’m not really sure.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I don't think so, I'm just noting it as something to keep an eye on and to try to write more tests that exercise the more niche points of pgrx::spi.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, there was the question about aliasing as usual, and also about range of access, given that I'm not immediately sure if pg_sys::SPITubleTable is like ArrayType or not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Range of access doesn't really matter. Stacked borrowed forbids it, but treed-borrows allows &mut Header with trailing data, and I've been told that even though TB is likely to be replaced, that problem will certainly remain fixed in the replacement.

IOW, this only matters for running in miri (under default settings), which isn't a concern.

pgrx/src/spi.rs Show resolved Hide resolved
@workingjubilee
Copy link
Member

As you might have noticed, approximately everything implements FromDatum and it is a bound for everything thus it is used everywhere so "just adding a lifetime parameter" probably is close to "rewrite the entire 34000 lines of pgrx, pgrx-macros, and pgrx-sql-entity-graph".

Which, to be clear, is what is going to happen, effectively.

not enough of the Spi types and functions have the proper (any?) lifetime annotations to ensure their instances or return values all associated with the surrounding Spi::connect() lifetime.

I have some questions about those.

…` it provides to ensure there's no way to return anything from the closure that is tied to that lifetime.

Additionally, it does make sense for `impl Query for &PreparedStatement` to attach a lifetime to the PreparedStatement reference and ensure that the connection lifetime lives at least as long as it does.  In practice there shouldn't be a way for a PreparedStatement to escape a `Spi::connect(|| ...)` call anyways, but just in case...
…o return a `SpiTupleTable` when they shouldn't and technically no longer can.
@eeeebbbbrrrr
Copy link
Contributor Author

I want to spend a little more time analyzing this, but I think this PR is now catching the things.

It looks like there were two problems here:

  • not enough of the Spi types and functions have the proper (any?) lifetime annotations to ensure their instances or return values all associated with the surrounding Spi::connect() lifetime.
  • SpiHeapTupleDataEntry::value() is allocating the "FromDatum" version of the backing Datum in the wrong Postgres MemoryContext

To amend the first bullet point there... The SpiClient that Spi::connect() gives the closure needs to be an anonymous lifetime, not one dictated by the caller, which is what I first did in 67aad55.

This stuff is incredibly complex.

Our intention here was always that any Spi-related things created within the scope of Spi::connect() cannot escape that scope. The only thing that can are returned Datums, which we have special support for to allocate in the parent memory context. That's essentially point two above in that we were missing that in one place.

I want to think on this a bit more before we merge, but I'm feeling pretty good about it now.

pgrx-tests/src/tests/spi_tests.rs Outdated Show resolved Hide resolved
pgrx-tests/src/tests/spi_tests.rs Outdated Show resolved Hide resolved
#[allow(dead_code)]
status_code: SpiOkCodes,
table: Option<*mut pg_sys::SPITupleTable>,
table: Option<&'conn mut pg_sys::SPITupleTable>,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jubilee, is there an action item here? I’m not really sure.

pgrx/src/spi.rs Outdated Show resolved Hide resolved
@eeeebbbbrrrr eeeebbbbrrrr marked this pull request as ready for review July 13, 2023 22:13
Copy link
Member

@workingjubilee workingjubilee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is an improvement and should resolve the immediate problem.

Comment on lines +319 to +328
impl<'conn> Query<'conn> for &str {
type Arguments = Option<Vec<(PgOid, Option<pg_sys::Datum>)>>;
type Result = Result<SpiTupleTable>;
type Result = Result<SpiTupleTable<'conn>>;

/// # Panics
///
/// This function will panic if somehow the specified query contains a null byte.
fn execute(
self,
_client: &SpiClient,
_client: &SpiClient<'conn>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is making me realize that I don't understand why Query is its own trait instead of being some kind of extension trait on SpiClient, if not to specifically help with this kind of unsound pattern, because it breaks the core lifetime relationship and is implemented the same way in all(?) cases.

However, I am not going to think about that too hard right now as... well, it's a preexisting oddity.

@eeeebbbbrrrr eeeebbbbrrrr merged commit 8ec818f into pgcentralfoundation:develop Jul 13, 2023
0 of 8 checks passed
eeeebbbbrrrr added a commit that referenced this pull request Jul 14, 2023
* Fix issue #1209

Cleanup lifetimes such that a `Spi::connect()` block can't return things that should be borrowed from the lifetime of that SPI connection, which includes cursors (`SpiCursor`).

Specifically, we were able to return a `SpiTupleTable` from within the `Spi::connect(|c| ...)` closure, which is strictly wrong.  We were also able to return `PreparedStatement`s, which is also wrong.

We also ensure that datums are `FromDatum`'d in the proper memory context when retrieving from an `SpiHeapTupleEntry`.
eeeebbbbrrrr added a commit that referenced this pull request Sep 5, 2023
This is the final release of v0.10.0. Thanks everyone for the beta
testing, pull requests, issues, and patience.

As always, install `cargo-pgrx` with `cargo install cargo-pgrx --locked`
and update your extension Cargo.toml files to use the `0.10.0` pgrx
dependencies.

This release includes support for Postgres 16RC1. Support for the
previous betas has been removed. As such, a fresh `cargo pgrx init` is
required.

## What's Changed Since v0.10.0-beta.4

* Fix `GetMemoryChunkContext` port by @workingjubilee in
#1273
* Better error messages when `pg_config` isn't found. by @eeeebbbbrrrr
in #1271
* Make `PostgresHash` also need `Eq` by @workingjubilee in
#1264
* Memoize git hash and extension metadata by @levkk in
#1274
* move to pg16rc1 by @eeeebbbbrrrr in
#1276
* Fix bgworker template up to 0.10.0-beta.4 by @workingjubilee in
#1270

## New Contributors
* @levkk made their first contribution in
#1274

**Changelog**:
v0.10.0-beta.4...v0.10.0

---

v0.10.0's full set of changes throughout the entire beta period are:

* Postgres 16beta1 Support by @eeeebbbbrrrr in
#1169
* Support building against macOS universal binaries by @clowder in
#1166
* list specific versions in feature gates by @eeeebbbbrrrr in
#1175
* Fix bug with converting a `pg_sys::Datum` into a `pgrx::Date` by
@eeeebbbbrrrr in #1177
* Fix Arrays with leading nulls by @eeeebbbbrrrr in
#1180
* Disable hello_versioned_so test by @workingjubilee in
#1192
* doc: fix link broken by @yihong0618 in
#1181
* fcinfo: fix incorrect length set in unsafe code by @Sasasu in
#1190
* update to pg16beta2 support by @eeeebbbbrrrr in
#1188
* Array-walking is aligned by @workingjubilee in
#1191
* Implement PGRXSharedMemory for Deque by @feikesteenbergen in
#1170
* Include security labels header by @daamien in
#1189
* Fixes macos-11 tests by @BradyBonnette in
#1197
* Pgcentralfoundation updates again by @eeeebbbbrrrr in
#1200
* Update version to 0.10.0-beta.0 by @eeeebbbbrrrr in
#1201
* Testing help by @eeeebbbbrrrr in
#1203
* Type testability cleanup by @eeeebbbbrrrr in
#1204
* Try to smartly propagate fs errors by @workingjubilee in
#1186
* Fix issue #1209 by @eeeebbbbrrrr in
#1210
* Type roundtrip tests by @eeeebbbbrrrr in
#1185
* Update version to 0.10.0-beta.1 by @eeeebbbbrrrr in
#1213
* Add a workaround for the pg16/homebrew/icu4c situation by @thomcc in
#1206
* Add security policy by @johnrballard in
#1207
* `AnyNumeric` is no longer backed by Postgres-allocated memory by
@eeeebbbbrrrr in #1216
* Modularize pgrx::spi by @workingjubilee in
#1219
* Stop SpiClient soundness from regressing by @workingjubilee in
#1214
* Add foreign table headers by @workingjubilee in
#1226
* Modularize the interior of pgrx-pg-sys by @workingjubilee in
#1227
* Initial valgrind support by @thomcc in
#1218
* Add support for handling SIGINT and SIGCHLD from bgworker by @JelteF
in #1229
* Ignores UI tests for MUSL environments by @BradyBonnette in
#1235
* Add a env flag that can be set to skip `#[pg_test]`-generated tests.
by @thomcc in #1239
* Fix issue #1076: Properly handle dependency graph of `Result<T, _>` by
@eeeebbbbrrrr in #1241
* Cleanup the error when cargo-pgrx version doesn't match Cargo.toml by
@eeeebbbbrrrr in #1240
* Add operator and cache related api by @VoVAllen in
#1242
* Addresses cargo-pgrx error reporting by @BradyBonnette in
#1238
* Update version to 0.10.0-beta.2 by @eeeebbbbrrrr in
#1244
* Bump cargo-metadata and clap-cargo by @thomcc in
#1246
* Derive Clone for Inet by @JelteF in
#1251
* Correct docs for datetime `From` impls by @workingjubilee in
#1253
* Only enable line tables for profile.dev by @thomcc in
#1249
* Remove references to master branch by @thomcc in
#1243
* Ensure bindgen gets all the `cppflags` it needs (on macOS, anyway) by
@thomcc in #1247
* update for pg16beta3 support by @eeeebbbbrrrr in
#1254
* Update version to 0.10.0-beta.3 by @eeeebbbbrrrr in
#1255
* Add proptest support by @workingjubilee in
#1258
* Misc reformatting and typo fixes by @workingjubilee in
#1260
* spi: simplify (optimize?) Datum preparation by @vrmiguel in
#1256
* Assume commutation when deriving PostgresEq by @workingjubilee in
#1261
* Demand Ord for PostgresOrd by @workingjubilee in
#1262
* Fix pgrx install causing postgresql coredump by @Sasasu in
#1263
* Update version to 0.10.0-beta.4 by @workingjubilee in
#1267

## New Contributors
* @clowder made their first contribution in
#1166
* @yihong0618 made their first contribution in
#1181
* @Sasasu made their first contribution in
#1190
* @daamien made their first contribution in
#1189
* @johnrballard made their first contribution in
#1207
* @VoVAllen made their first contribution in
#1242
* @vrmiguel made their first contribution in
#1256


**Full Changelog**:
v0.9.8...v0.10.0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants