Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: [offline] implement one-file-per-query #1770

Closed
wants to merge 2 commits into from

Conversation

abonander
Copy link
Collaborator

@abonander abonander commented Mar 30, 2022

continuation of/supercedes #1183
closes #570
closes #1005

cc @jplatte

TODO:

  • choose a binary format
    • We want the most efficient deserialization since it'll be compiled in debug mode.
    • Less churn in Git diffs if files are not textual (maybe cover this with .gitattributes?)
  • update documentation
  • tests?

@abonander abonander added this to the 0.6.0 milestone Mar 30, 2022
@abonander abonander force-pushed the ab/one-file-per-query branch 4 times, most recently from 0408707 to 891cd45 Compare March 30, 2022 23:53
@jplatte
Copy link
Contributor

jplatte commented Mar 31, 2022

Title is not accurate (and wasn't on my PR for a while 😅), since the prepare command is still there (and I think that is the right thing).

Also, since this touches the whole cargo-metadata thingy, it may be a good idea to fix #1706 along the way (see latest comment for how it can likely be solved).

@abonander
Copy link
Collaborator Author

The title was automatically generated from your commit message, which I cherry-picked to keep correct attribution on the parts of your code that were useful.

@abonander
Copy link
Collaborator Author

Re: the performance of the query macros, I've been testing this branch along with a couple that switch the serialization from serde_json to bincode and ciborium (CBOR) and the performance is near identical on a project with ~100 query macro invocations.

Adding the following to the Cargo.toml at the workspace root improves performance by roughly 10-20%:

[profile.dev.build-override]
opt-level = 3
debug = false
debug-assertions = false
incremental = false
overflow-checks = false

The caching implemented by @LovecraftianHorror in #1684 was actually the biggest performance win for offline mode as it cut the runtime of the macros by roughly a factor of 2-4 over v0.5.10, and is comparable in performance to this branch and its siblings.

SQLx 0.5.11 with the caching of sqlx-data.json and optimizations enabled as above is actually faster than any of my branches by about 20%.

However, the total runtime of the macro expansion pass as given by -Z time-passes is still dwarfed by the time spent in codegen and LLVM, which doesn't change. The project takes between 11-13 seconds to compile in debug mode across all configurations.

While we probably still want this change for its ergonomic benefits, efforts in optimizing the compile time of SQLx are probably better spent in optimizing the code we generate and not necessarily how we generate it.

@abonander
Copy link
Collaborator Author

Here's a flamegraph of the compilation with v0.5.11 without optimizations, code in libsqlx_macros counts for only 1.62% of the total time:
image

@abonander abonander changed the title WIP: [offline] Remove sqlx-data.json and sqlx prepare command WIP: [offline] implement one-file-per-query Apr 1, 2022
@abonander abonander force-pushed the ab/one-file-per-query branch 10 times, most recently from 82385f4 to 4c2ef77 Compare April 8, 2022 20:26
jplatte and others added 2 commits April 14, 2022 16:22
Query data is now stored in .sqlx/{query_hash}.json directly by the macro
invocations, rather than first writing to target/sqlx/{input_span_hash}.json
and then collecting those into sqlx-data.json separately.
@cycraig
Copy link
Contributor

cycraig commented Aug 29, 2022

👋 @abonander Do you still intend to revisit this PR when you have time or would you be open to someone taking over? I see this is quite a long-standing issue.

There seem to be only a few tasks outstanding:

I expect merging main will probably fix one or two issues regarding workspaces.

Or am I underestimating the work left, is there something I missed?

@jplatte
Copy link
Contributor

jplatte commented Aug 30, 2022

Regarding .sqlx at the workspace level vs .sqlx at the package level, I think the latter makes more sense as that makes it possible to publish crates containing sqlx queries with the metadata cache included.

@cycraig
Copy link
Contributor

cycraig commented Sep 1, 2022

Well, it took longer than expected but I continued the branch in my fork. It's cleaned up and pretty much finished apart from documentation. If there's any interest, you can view the changes here (didn't want to open a PR without permission):
main...cycraig:sqlx:feat/one-file-per-query

I kept pretty close to the current PR (e.g. kept the offline in-memory caching implementation rather than adapting the version of it from main, both would work fine but not sure if there's a performance difference yet).

What did I change then? Pretty much what was listed above and some more:

  • Fixed merge conflicts with main.
  • Implemented sqlx prepare --check which was a TODO.
  • Replaced CargoMetadata with the Metadata from main.
  • Fixed several bugs:
    • prepare in a sub-package trying to use the workspace .sqlx folder and failing, fixed with the new SQLX_OFFLINE_DIR variable (which was the intention I believe). It now outputs .sqlx at the package level unless --workspace is used.
    • IO/OS race conditions with the new file renaming/moving strategy when saving offline query data to disk.
    • CARGO_TARGET_DIR problems when compiling a sub-package of a workspace with multiple targets (weird).

I manually tested cargo sqlx prepare and cargo sqlx prepare --check in a project with a:

  • Single crate.
  • Virtual workspace with three crates.
  • Workspace with three crates and a top-level crate.

Which confirmed the behaviour should be the same as before in terms of which queries are included or not compared to sqlx-data.json.

Take a look if you have the chance and let me know if it's worth finishing this off (just documentation and testing/benchmarking, unless you see anything that needs changing, like the in-memory caching) or pursuing a different direction, such as starting from scratch?

@abonander abonander modified the milestones: 0.6.0, 0.7.0 Sep 3, 2022
@abonander
Copy link
Collaborator Author

I'm working on a large refactor for 0.7.0 and we want to land this as part of that release. @cycraig if you'd open a new PR based against 0.7-dev I'll be glad to look at it.

@abonander
Copy link
Collaborator Author

Completed in #2363

@abonander abonander closed this Mar 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cargo sqlx prepare with MSSQL Move to one-file-per-query for sqlx-cli prepare
3 participants