Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement miniserde-based deserialization #523

Merged
merged 1 commit into from
Apr 10, 2024

Conversation

mzeitlin11
Copy link
Collaborator

@mzeitlin11 mzeitlin11 commented Mar 24, 2024

Built on #452, only the last commit is new. Very much a first pass, still some questions to answer. Initial results look promising and hopeful that profiling will reveal some other low-hanging fruit to further improve compile time.

For some quite unscientific timings, a clean release build of examples/endpoints goes from ~4m to 50s with min-ser enabled. More importantly though, the actual time to build just the binary goes from 75s to 7s, so incremental builds for code depending on async-stripe should be much faster.

Stripped binary size of the examples/endpoints binary also went from ~70MB to ~20MB. There's likely room for further binary size improvement since a fat LTO build shrunk this further to ~13 MB.

Feature Flags

The most important question to answer is how to expose the option of using miniserde. This PR takes the cautious approach of adding a min-ser feature which skips implementing serde::Deserialize and instead always implements miniserde::Deserialize where necessary for requests. The advantage with this approach is that min-ser could left as an "experimental" feature flag, making an initial release less breaking by requiring explicitly opting into min-ser. There are some annoyances with this approach, though

  • Cargo features are additive, but min-ser is not additive - it removes functionality, so using it can be unintuitive (and might lead to similar feature incompatibility pains as the current runtime* features.
  • Additional code complexity - see for example the added StripeDeserialize trait, added to abstract over the fact that a request might need either serde or miniserde to deserialize. Also, some types become feature flag dependent, e.g. a field might be miniserde::Value or serde_json::Value, which can't be shown easily in docs. (And we have to add a bunch of cfg blocks)

The other alternative would be to always use miniserde for library functionality (deserializing requests, webhooks, etc.). Then an additional additive serde-deserialize flag could be added to ask for serde::Deserialize to be derived on all types (useful for testing purposes). This would solve the complexity issues above at the cost of making miniserde impossible to opt out of.

Implementation Details

As seen by the diff, this adds a bunch of generated code :) . This new code is essentially using our codegen mechanism to mimic what the miniserde derive is already doing to allow deserialization cases that miniserde cannot support:

Expandable<T>:

  • We manually derive Deserialize for T so that we can publicly expose the underlying Builder type so that Expandable<T> can use the same underlying implementation as T.

"Union of objects" types where the type is determined with the "object" field.

  • This is implemented similarly to how serde(tag = "x") works - the data gets deserialized into an untyped JSON representation, which can then be converted to the correct variant using the "object" key. miniserde provides a convenient miniserde::Value for this purpose, but we then need to also generate impls for Value -> T for each T that we need to deserialize.

"Deleted or not" types where the type is determined by whether there is a deleted: true field.

  • This is implemented pretty much identically to the "object" case above, just discriminating between variants using the "deleted" boolean instead.

Copy link

codecov bot commented Mar 24, 2024

Codecov Report

Attention: Patch coverage is 7.51193% with 6587 lines in your changes are missing coverage. Please review.

Project coverage is 7.38%. Comparing base (7686a0b) to head (7f9d426).

❗ Current head 7f9d426 differs from pull request most recent head f568feb. Consider uploading reports for the commit f568feb to get more accurate results

Files Patch % Lines
...d/stripe_checkout/src/checkout_session/requests.rs 0.00% 410 Missing ⚠️
generated/stripe_billing/src/invoice/requests.rs 0.00% 353 Missing ⚠️
...erated/stripe_billing/src/subscription/requests.rs 0.00% 283 Missing ⚠️
...ated/stripe_checkout/src/checkout_session/types.rs 63.09% 131 Missing ⚠️
...stripe_billing/src/billing_portal_session/types.rs 0.00% 90 Missing ⚠️
...heckout/src/checkout_acss_debit_mandate_options.rs 0.00% 90 Missing ⚠️
..._billing/src/billing_portal_configuration/types.rs 0.00% 88 Missing ⚠️
.../src/checkout_acss_debit_payment_method_options.rs 0.00% 87 Missing ⚠️
...out/src/checkout_session_payment_method_options.rs 50.32% 77 Missing ⚠️
...rc/payment_pages_checkout_session_custom_fields.rs 0.00% 77 Missing ⚠️
... and 102 more
Additional details and impacted files
@@           Coverage Diff            @@
##            next    #523      +/-   ##
========================================
+ Coverage   5.55%   7.38%   +1.82%     
========================================
  Files        932     934       +2     
  Lines      38837   97550   +58713     
========================================
+ Hits        2159    7205    +5046     
- Misses     36678   90345   +53667     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mzeitlin11 mzeitlin11 marked this pull request as draft March 25, 2024 04:18
@arlyon
Copy link
Owner

arlyon commented Apr 4, 2024

OK. Flagging is a hard one.

I can see a world where people would want to store the stripe structs verbatim so having serde would be useful but I am also tempted to say lets just fire it off without serde for now and wait for someone to complain. I think as far as the lib is concerned we should only speak miniserde, and then, as you said, you can add a flag to get all that compile time hell back if you'd like it. It wouldn't be too hard to have the clients speak both miniserde and serde so I think we could end up at some point just letting the user pick one or both (or neither) but lets aim for miniserde for this change.

The motivation here is that I'd like to see if I can do some graph analysis and split these crates up also. If we can do individual clients and split the apis into a few crates that would be lovely since you could then opt-in to serde for a subset of the API which may be even more helpful still. We just need to be careful here so I will call that a stretch goal. The compile time improvements we have seen so far are more more more than enough. 13MB is 'large' but not obscene. It is very common for web servers to hit that just with tokio and a framework so I don't think it is an issue.

@arlyon
Copy link
Owner

arlyon commented Apr 4, 2024

Implementation details all look solid to me.

@arlyon
Copy link
Owner

arlyon commented Apr 4, 2024

BTW I sent you a maintainer invite. Master needs a PR but you can push / merge freely to next

@mzeitlin11
Copy link
Collaborator Author

BTW I sent you a maintainer invite. Master needs a PR but you can push / merge freely to next

Thanks, have accepted!

@mzeitlin11
Copy link
Collaborator Author

I can see a world where people would want to store the stripe structs verbatim so having serde would be useful but I am also tempted to say lets just fire it off without serde for now and wait for someone to complain. I think as far as the lib is concerned we should only speak miniserde, and then, as you said, you can add a flag to get all that compile time hell back if you'd like it.

I like this approach! I think in that case this would benefit from a bunch more testing to minimize the chance these miniserde changes cause regressions (which are particularly annoying because miniserde purposefully reports no errors. But if we generate requests properly, this should not be an issue!). It would be great to find a way to automate some deserialization tests (probably using the Stripe OpenAPI fixtures, but needs more investigation).

@mzeitlin11
Copy link
Collaborator Author

The motivation here is that I'd like to see if I can do some graph analysis and split these crates up also. If we can do individual clients and split the apis into a few crates that would be lovely since you could then opt-in to serde for a subset of the API which may be even more helpful still.

That would be wonderful - there's some basic graph analysis implemented to help infer crates, but could certainly be fleshed out. The main annoyance I saw was that there's one huge connected component (which right now gets shoved into stripe_shared to avoid circular dependencies, but is the limiting factor on deserialization-related compile time)

@mzeitlin11 mzeitlin11 force-pushed the miniserde_exp branch 2 times, most recently from d8adc6e to 7f9d426 Compare April 7, 2024 02:20
@mzeitlin11
Copy link
Collaborator Author

Rebased and implemented the alternative mentioned above about speaking miniserde and a feature flag for enabling full serde de/serialization

@mzeitlin11 mzeitlin11 marked this pull request as ready for review April 7, 2024 23:25
@mzeitlin11
Copy link
Collaborator Author

mzeitlin11 commented Apr 7, 2024

Added some basic testing using the OpenAPI provided fixtures in the last commit. There are still some tweaks I'd like to make (for example, see the ugly addition of single variant enums to ensure the "object" key is properly serialized, only necessary so we better match stripe when serializing for testing purposes), but that don't need to be part of this pr.

@mzeitlin11 mzeitlin11 force-pushed the miniserde_exp branch 2 times, most recently from 42cf51c to 4cc2678 Compare April 8, 2024 00:46
@mzeitlin11 mzeitlin11 changed the title [Experiment] Miniserde Implement miniserde-based deserialization Apr 8, 2024
@mzeitlin11
Copy link
Collaborator Author

After using this, thought it might be better to split the serde feature into serialize and deserialize (done in latest commit). Think it makes sense because these features are mostly useful for testing contexts - deserialize is the big compile time offender, and avoidable by using miniserde in tests instead. So the flexibility of just enabling the lighter weight serialize on its own seems useful.

Copy link
Owner

@arlyon arlyon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy with this now!

Add fixture testing and serialize object keys

Fix object keys in tests

Add object field for serialization without adding field

Split serde gate into serialize and deserialize
@mzeitlin11 mzeitlin11 merged commit 020065b into arlyon:next Apr 10, 2024
18 checks passed
@mzeitlin11 mzeitlin11 deleted the miniserde_exp branch April 10, 2024 23:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

None yet

2 participants