[tracking issue] Switch to upstream PGX #177

eeeebbbbrrrr · 2021-07-13T21:56:17Z

Good work and congrats on the release! 🎉

I noticed that y'all are using a fork of pgx that @JLockerman is maintaining.

I looked at the commits and they mostly look reasonable. We'd appreciate some PRs so we can get your changes merged upstream. Then y'all won't need to ask your users install a version of pgx that's not compatible with upstream.

Thanks and let me know how we can help.

The text was updated successfully, but these errors were encountered:

JLockerman · 2021-07-14T17:17:14Z

Thank you!

We would very much like to be on stock upstream, and I have plans to open PRs for the remaining stuff we're relying on. I've just been a bit too overwhelmed recently to translating our stuff into things that are actually worth upstreaming 😓

eeeebbbbrrrr · 2021-07-14T23:56:49Z

We're working on a new type resolution system that is replacing cargo-pgx' schema.rs.

Long story short -- it builds a bin/sql-generator.rs file based on how things in your extension code are annotated. And that builds a dependency graph and spits out nicely formatted and commented SQL in the proper order for CREATE EXTENSION to succeed. It's already working against ZDB, which is quite complex.

@Hoverbear is the one doing this work and might drop by and talk a little more about it.

It can even output graphviz .dot files so you can visualize your schema dependencies.

Then we've got plans for auto-generating extension schema upgrade scripts, and then, the cool thing is we're gonna actually parse your extension_sql!() and get them to tie into the dependency graph. extension_sql!() can do that now as opaque blocks, which is killer by itself.

I said all this because looking through your fork, I couldn't quite figure out the varlena_type! macro. I suspect we'll be breaking that so we need to figure out a proper solution.

But please, send us PRs. Happy to merge and publish to crates.io ASAP. I don't want a situation where we've fractured the pgx community and then users get confused by pgx-proper and pgx-tsdb.

Hoverbear · 2021-07-14T23:59:12Z

@JLockerman Oh my gosh another test case for the new code! I'd love to chat with you about this @JLockerman ! Are you on our discord? Feel free to ping me and we can discuss! I wanna make sure the new generator can handle the features you need, too!

JLockerman · 2021-07-16T15:59:39Z

@eeeebbbbrrrr oh that is very interesting!

varlena_type! is a bit of a hack to generate CREATE TYPE in more cases. As I understand the current schema generator searches for annotations in the rust code, such as #[derive(PgType)], and outputs the corresponding CREATE in the SQL. The issue I ran into is that for the way we create types the #[derive(PgType)] ends up being hidden in a place where the SQL codegen cannot find it (it's behind like 2 layers of macros). What varlena_type!(Foo) does is serve as a marker telling the schema generator "pretend you say #[derive(PgType)] on a type Foo" to get around that.

@Hoverbear I'm on discord, but not regularly, I'll stop by to discuss, and try to be around more often.

eeeebbbbrrrr · 2021-07-16T16:24:14Z

@eeeebbbbrrrr oh that is very interesting!

It's incredibly impressive, and I'm about to explain one reason why...

varlena_type! is a bit of a hack to generate CREATE TYPE in more cases. ... ...

I got cha. Now it makes sense that there's no pgx code behind it.

The new schema generator doesn't read the source .rs files at all. It instead ties into all the existing proc macros to generate the metadata necessary to build the schema.

This means it'll JustWork, regardless of the fancy Rust mechanisms you use to declare a struct and add a #[derive(PostgresType)] to it. So I'm pretty sure we won't need your varlena_type! macro at all going forward.

JLockerman · 2021-07-19T16:43:47Z

The new schema generator doesn't read the source .rs files at all. It instead ties into all the existing proc macros to generate the metadata necessary to build the schema.

@eeeebbbbrrrr

I was wondering if that was the plan 😄

Have you had any odd interaction with incremental compilation? IIUC when and if proc-macros are run is non-deterministic in that regime, which I could see causing issues for schema generation...

eeeebbbbrrrr · 2021-07-19T17:37:39Z

which I could see causing issues for schema generation...

not when you build a proper dependency graph from the metadata. ;)

JLockerman · 2021-07-20T17:47:44Z

Back on the topic of bringing timescale back in-sync with upstream; looking through our branches I a bunch of simple commits:

JLockerman/pgx@b19143a which adds an optimization pg_getarg() for types that can be instantiated without looking up their typeoid. I think it's in a good state as-is so I opened Don't look up the type oid for types that never use it pgcentralfoundation/pgrx#168 to upstream it.
JLockerman/pgx@fd865c3 adds the varlena_type!() macro which we discussed above. Do you guys want me to upstream it while the new schema generator is in development, or just wait until that is done and it's unneeded.
JLockerman/pgx@87e0460 with some panic!() fixes that were superseded by upstream.
JLockerman/pgx@cb32277 changes cargo pgx test to work a little more like cargo test; it only executes tests in (children of) the current directory, and no longer passes --no-default-features. This is mainly a quality-of-life change; if you have default features, or a multi-crate workspace, it makes running tests a bit easier, but it does mean that if you use cargo test directly, you generally need to pass in --features pg13 pg_test. I'm happy to clean it up open a PR if that's what you guys want, but also happy to just drop it.

Then we have the tricky ones: the code around forwarding rust-alloc to palloc.

JLockerman/pgx@429eada switches the CurrrentMemoryContext to TopMemoryContext when setting up the panic handler in order to ensure that the handler is still allocated whenever it's called. This should be pretty benign on it's own as it doesn't really do anything, but on the other hand it doesn't really do anything.
Then there's some very incomplete code conditionally disabling setjump guards (it's so bad I hesitate to link to it, but JLockerman/pgx@6f8ef73^...c824824). Promscale is very interested in being able to do this, since @cevian measured around 30% overhead to leaving them in, and with malloc forwarded to palloc this should work for their use case.

I think my preferred solution to these would be to add a feature flag to pgx that switches the allocator to use palloc and disables the guard, and I'd be happy to open a PR that adds it if you guys are willing to have that upstream, but I realize this isn't exactly the central use case for pgx, and I'm happy to hear other idea if you have a way you'd prefer to unify (or not unify) these.

eeeebbbbrrrr · 2021-07-20T18:04:41Z

We'll take 'em all except for the varlena_type! macro. The work @Hoverbear is doing is going to remove the need for that.

About switching "the CurrrentMemoryContext to TopMemoryContext when setting up the panic handler" -- I saw that commit and didn't really understand why it's necessary. Could you explain a little more? Postgres should take care of this for us, no? Regardless, I don't see why this would need to be feature gated.

Regarding the setjmp guard -- That definitely needs to be behind a feature flag. I don't believe it's the RightThing to do in general, but I absolutely see the use cases for it. That, combined with forwarding Rust's allocator to palloc/pfree could cause some very unexpected things for users. Such as Drop won't run and threads will cause segfaults. There's probably more too.

The changeset for that is kinda funny. I'd probably rather handle doing all of this work myself.

JLockerman · 2021-07-20T18:22:17Z

About switching "the CurrrentMemoryContext to TopMemoryContext when setting up the panic handler" -- I saw that commit and didn't really understand why it's necessary. Could you explain a little more? Postgres should take care of this for us, no? Regardless, I don't see why this would need to be feature gated.

This is me being unsure of what context register_pg_guard_panic_handler() is run in. The Box::new() in register_pg_guard_panic_handler() needs to be run using malloc() because if we use a shorter-term memory context the panic handler might be deallocated by the time we panic. Since I'm unsure which memory contexts we can be in at the point this function is called, I figured it'd be safer to do this in TopMemoryContext unconditionally.

Regarding the setjmp guard -- That definitely needs to be behind a feature flag. I don't believe it's the RightThing to do in general, but I absolutely see the use cases for it. That, combined with forwarding Rust's allocator to palloc/pfree could cause some very unexpected things for users. Such as Drop won't run and threads will cause segfaults. There's probably more too.

Yeah, I have to admit I'm not thrilled about it either, to the extent that this extension still compiles with the guards enabled. I think only people who should do it are people who really really care about performance and are willing to sacrifice on safety to get it; Promscale for instance found that the hit was bad enough that they'd rather write their extension in C rather than accept the performance hit, and that's something I'd prefer to prevent 😊.

The changeset for that is kinda funny. I'd probably rather handle doing all of this work myself.

Do you think something like

Add a if cfg!() to the guard rewriter that removes the setjmp
under the same #[cfg()] set the rust allocator to a palloc allocator that errors if called from the main thread
add memory context switches for init and iterator allocation in the SRF shim

would work? If so I can get a start on it (unless you're talking about my current changeset, which is entirely horrible, and was a one-off for benchmarking)

JLockerman · 2021-07-20T18:43:02Z

Thinking about this a little more, there may a simpler fix for our setjmp guard issue, I don't think we really care about removing the guard from internal calls to postgres functions right now, only for #[pg_extern], so with pgcentralfoundation/pgrx#168 #[pg_extern(no_guard)] (or however that's spelled) should work. (going to need to ask @cevian for his benchmarks)

eeeebbbbrrrr · 2021-07-20T19:13:04Z

This is me being unsure of what context register_pg_guard_panic_handler() is run in. The Box::new() in register_pg_guard_panic_handler() needs to be run using malloc() because if we use a shorter-term memory context the panic handler might be deallocated by the time we panic. Since I'm unsure which memory contexts we can be in at the point this function is called, I figured it'd be safer to do this in TopMemoryContext unconditionally.

I see. Yeah, that makes sense. register_pg_guard_panic_handler() only gets called once anyways.

Yeah, I have to admit I'm not thrilled about it either, to the extent that this extension still compiles with the guards enabled. I think only people who should do it are people who really really care about performance and are willing to sacrifice on safety to get it; Promscale for instance found that the hit was bad enough that they'd rather write their extension in C rather than accept the performance hit, and that's something I'd prefer to prevent 😊.

Agreed, especially with the last point. Rust is where it's at.

Do you think something like...

Not sure. Haven't thought about it enough yet.

Thinking about this a little more, there may a simpler fix for our setjmp guard issue, I don't think we really care about removing the guard from internal calls to postgres functions right now, only for #[pg_extern], so with pgcentralfoundation/pgrx#168 #[pg_extern(no_guard)] (or however that's spelled) should work.

Did you try profiling with the no_guard (or however that's spelled)? That would be the ideal solution for this part of things.

JLockerman · 2021-07-20T19:27:41Z

I opened pgcentralfoundation/pgrx#169 with the changes to cargo pgx test. It ended up being a little different than the version I had before:

I left in the --no-default-features flag; it looks like cfg-aware schema generation code I wrote is not aware of default features so it is more correct to pass in all the features you want to activate.
I added an --all flag to recover the current behavior.

JLockerman · 2021-07-20T19:42:10Z

Did you try profiling with the no_guard (or however that's spelled)? That would be the ideal solution for this part of things.

I'm waiting to hear back from @cevian who did the original benchmarking. Looking through the code it never calls any postgres functions except palloc()/repalloc()/pfree() through the rust allocator, so I'm optimistic. (In the worst case I wonder if it's possible for a more targeted intervention that just at those 🤔)

JLockerman · 2021-10-11T14:50:20Z

going to try and test the 0.2.0 beta next week and see if we can make the toolkit work with it

eeeebbbbrrrr · 2021-10-11T16:30:18Z

Wait for beta2, which I hope will happen tomorrow. Day 3 of fighting a terrible cold. My head is gonna explode

JLockerman · 2021-10-12T15:11:20Z

👍 will wait
Hope you feel better!

JLockerman · 2021-10-29T16:32:01Z

To all those following along: I'm in the process of getting the Toolkit to work on the 0.2.0 beta. It's through type checking, and now I'm getting it to work with the new schema generator. I hope to get to the stage where I have a PR open next week.

JLockerman · 2021-11-05T18:13:33Z

A snapshot of my progress to getting the Toolkit working on stock 0.2 can be found at PR #295. I've got it to generate a schema, but the schema is not yet correct.

JLockerman · 2021-11-10T17:10:26Z

We've updated to baseline 0.2 🎉

eeeebbbbrrrr added the feature-request And area of analysis that could be made easier label Jul 13, 2021

JLockerman changed the title ~~PRs for @JLockerman's pgx fork~~ Switch to upstream PGX Oct 29, 2021

JLockerman added the In Development Features that are being actively worked on, but are not yet merged label Oct 29, 2021

JLockerman changed the title ~~Switch to upstream PGX~~ [tracking issue] Switch to upstream PGX Oct 29, 2021

JLockerman mentioned this issue Nov 9, 2021

Update pgx to 0.2 #295

Merged

bors bot closed this as completed in 3e9beb1 Nov 10, 2021

bors bot closed this as completed in #295 Nov 10, 2021

JLockerman removed the In Development Features that are being actively worked on, but are not yet merged label Nov 10, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[tracking issue] Switch to upstream PGX #177

[tracking issue] Switch to upstream PGX #177

eeeebbbbrrrr commented Jul 13, 2021

JLockerman commented Jul 14, 2021

eeeebbbbrrrr commented Jul 14, 2021

Hoverbear commented Jul 14, 2021

JLockerman commented Jul 16, 2021

eeeebbbbrrrr commented Jul 16, 2021

JLockerman commented Jul 19, 2021

eeeebbbbrrrr commented Jul 19, 2021

JLockerman commented Jul 20, 2021

eeeebbbbrrrr commented Jul 20, 2021

JLockerman commented Jul 20, 2021

JLockerman commented Jul 20, 2021 •

edited

eeeebbbbrrrr commented Jul 20, 2021

JLockerman commented Jul 20, 2021 •

edited

JLockerman commented Jul 20, 2021

JLockerman commented Oct 11, 2021

eeeebbbbrrrr commented Oct 11, 2021

JLockerman commented Oct 12, 2021

JLockerman commented Oct 29, 2021

JLockerman commented Nov 5, 2021

JLockerman commented Nov 10, 2021

[tracking issue] Switch to upstream PGX #177

[tracking issue] Switch to upstream PGX #177

Comments

eeeebbbbrrrr commented Jul 13, 2021

JLockerman commented Jul 14, 2021

eeeebbbbrrrr commented Jul 14, 2021

Hoverbear commented Jul 14, 2021

JLockerman commented Jul 16, 2021

eeeebbbbrrrr commented Jul 16, 2021

JLockerman commented Jul 19, 2021

eeeebbbbrrrr commented Jul 19, 2021

JLockerman commented Jul 20, 2021

eeeebbbbrrrr commented Jul 20, 2021

JLockerman commented Jul 20, 2021

JLockerman commented Jul 20, 2021 • edited

eeeebbbbrrrr commented Jul 20, 2021

JLockerman commented Jul 20, 2021 • edited

JLockerman commented Jul 20, 2021

JLockerman commented Oct 11, 2021

eeeebbbbrrrr commented Oct 11, 2021

JLockerman commented Oct 12, 2021

JLockerman commented Oct 29, 2021

JLockerman commented Nov 5, 2021

JLockerman commented Nov 10, 2021

JLockerman commented Jul 20, 2021 •

edited

JLockerman commented Jul 20, 2021 •

edited