WIP: [offline] implement one-file-per-query #1770

abonander · 2022-03-30T21:20:47Z

continuation of/supersedes #1183
closes #570
closes #1005

TODO:

- choose a binary format
  - We want the most efficient deserialization since it'll be compiled in debug mode.
  - Less churn in Git diffs if files are not textual (maybe cover this with .gitattributes?)
- update documentation
- tests?

jplatte · 2022-03-31T09:55:44Z

Title is not accurate (and wasn't on my PR for a while 😅), since the prepare command is still there (and I think that is the right thing).

Also, since this touches the whole cargo-metadata thingy, it may be a good idea to fix #1706 along the way (see latest comment for how it can likely be solved).

abonander · 2022-03-31T20:51:56Z

The title was automatically generated from your commit message, which I cherry-picked to keep correct attribution on the parts of your code that were useful.

abonander · 2022-03-31T22:41:12Z

Re: the performance of the query macros, I've been testing this branch along with a couple of others that switch the serialization from serde_json to bincode and ciborium (CBOR), and the performance is nearly identical on a project with ~100 query macro invocations.

Adding the following to the Cargo.toml at the workspace root improves performance by roughly 10-20%:

[profile.dev.build-override]
opt-level = 3
debug = false
debug-assertions = false
incremental = false
overflow-checks = false

The caching implemented by @LovecraftianHorror in #1684 was actually the biggest performance win for offline mode as it cut the runtime of the macros by roughly a factor of 2-4 over v0.5.10, and is comparable in performance to this branch and its siblings.
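For illustration only, caching of this kind can be sketched as a process-wide map from file path to file contents, so that repeated macro invocations in the same compiler process read each file from disk once. This is not the code from #1684; `CACHE` and `read_cached` are hypothetical names, and the sketch assumes Rust 1.70 or later for `std::sync::OnceLock`.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::sync::{Mutex, OnceLock};

// Hypothetical process-wide cache: file path -> file contents.
static CACHE: OnceLock<Mutex<HashMap<PathBuf, String>>> = OnceLock::new();

/// Return the contents of `path`, hitting the disk only on the first call
/// for that path. Later changes to the file are deliberately not observed.
fn read_cached(path: &Path) -> io::Result<String> {
    let cache = CACHE.get_or_init(|| Mutex::new(HashMap::new()));
    let mut guard = cache.lock().unwrap();

    if let Some(hit) = guard.get(path) {
        return Ok(hit.clone());
    }

    // Cache miss: read from disk and remember the result.
    let contents = fs::read_to_string(path)?;
    guard.insert(path.to_path_buf(), contents.clone());
    Ok(contents)
}
```

A real cache would likely store the parsed data instead of the raw string, and might invalidate an entry when the file's modification time changes.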

SQLx 0.5.11 with the caching of sqlx-data.json and optimizations enabled as above is actually faster than any of my branches by about 20%.

However, the total runtime of the macro expansion pass as given by -Z time-passes is still dwarfed by the time spent in codegen and LLVM, which doesn't change. The project takes between 11-13 seconds to compile in debug mode across all configurations.

While we probably still want this change for its ergonomic benefits, efforts in optimizing the compile time of SQLx are probably better spent in optimizing the code we generate and not necessarily how we generate it.

abonander · 2022-03-31T23:45:04Z

Here's a flamegraph of the compilation with v0.5.11 without optimizations; code in libsqlx_macros accounts for only 1.62% of the total time.

Query data is now stored in .sqlx/{query_hash}.json directly by the macro invocations, rather than first writing to target/sqlx/{input_span_hash}.json and then collecting those into sqlx-data.json separately.
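As a rough sketch of the `.sqlx/{query_hash}.json` layout, the path for a query can be derived from a hash of its text. The helper names below are hypothetical, and `DefaultHasher` is only a stand-in chosen to keep the example dependency-free. Its output is not guaranteed to be stable across Rust releases, so a real implementation would want a hash with a specified output, such as SHA-256.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::path::{Path, PathBuf};

/// Hypothetical helper: map the query text to a `{query_hash}.json` file name.
/// `DefaultHasher` is an assumption made for this sketch, not the PR's choice.
fn query_file_name(query: &str) -> String {
    let mut hasher = DefaultHasher::new();
    query.hash(&mut hasher);
    // Zero-padded hex keeps every file name the same length.
    format!("{:016x}.json", hasher.finish())
}

/// Hypothetical helper: full path of the query's data file inside the
/// offline directory (for example `.sqlx`).
fn query_data_path(offline_dir: &Path, query: &str) -> PathBuf {
    offline_dir.join(query_file_name(query))
}
```

Because the file name depends only on the query text, identical queries share one file, and editing a query produces a new file instead of rewriting a shared one.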

cycraig · 2022-08-29T21:38:46Z

👋 @abonander Do you still intend to revisit this PR when you have time or would you be open to someone taking over? I see this is quite a long-standing issue.

There seem to be only a few tasks outstanding:

- Address TODO in prepare.rs to implement --check.
- Improve support for workspace cases (currently --workspace does not generate queries for sub-crates when the workspace root is also a crate, and queries are not generated for some sub-crates when in a virtual workspace, not sure why yet).
- Fix an empty .sqlx folder being generated when run in a sub-crate in a workspace, causing an error that the root .sqlx folder is missing (if it doesn't happen to exist already). Is it intended that all queries in a workspace should be stored in the root .sqlx?
- Fix merge conflicts with main (replace cargo.rs with metadata.rs etc.).
- Update documentation:
  - https://github.com/launchbadge/sqlx/blob/main/sqlx-cli/README.md#enable-building-in-offline-mode-with-query
  - https://docs.rs/sqlx/latest/sqlx/macro.query.html#offline-mode-requires-the-offline-feature
  - Update comments, doc-comments and error messages mentioning offline mode/sqlx-data.json.

I expect merging main will probably fix one or two issues regarding workspaces.

Or am I underestimating the work left, is there something I missed?

jplatte · 2022-08-30T04:30:28Z

Regarding .sqlx at the workspace level vs .sqlx at the package level, I think the latter makes more sense as that makes it possible to publish crates containing sqlx queries with the metadata cache included.

cycraig · 2022-09-01T21:56:49Z

Well, it took longer than expected but I continued the branch in my fork. It's cleaned up and pretty much finished apart from documentation. If there's any interest, you can view the changes here (didn't want to open a PR without permission):
main...cycraig:sqlx:feat/one-file-per-query

I kept pretty close to the current PR (e.g. kept the offline in-memory caching implementation rather than adapting the version of it from main, both would work fine but not sure if there's a performance difference yet).

What did I change then? Pretty much what was listed above and some more:

- Fixed merge conflicts with main.
- Implemented sqlx prepare --check which was a TODO.
- Replaced CargoMetadata with the Metadata from main.
- Fixed several bugs:
  - prepare in a sub-package trying to use the workspace .sqlx folder and failing, fixed with the new SQLX_OFFLINE_DIR variable (which was the intention I believe). It now outputs .sqlx at the package level unless --workspace is used.
  - IO/OS race conditions with the new file renaming/moving strategy when saving offline query data to disk.
  - CARGO_TARGET_DIR problems when compiling a sub-package of a workspace with multiple targets (weird).
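For readers unfamiliar with the renaming strategy mentioned in the bug list above, a minimal sketch of the general technique follows: write to a temporary file next to the destination, then rename it into place, so a concurrent reader sees either the old file or the complete new one. `write_atomically` and the temp-file naming are hypothetical and not taken from the branch. The sketch assumes the temp file and destination are on the same filesystem, where a rename is atomic on POSIX systems.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::Path;
use std::process;

/// Hypothetical helper: replace `dest` with `contents` via a temp file and
/// a rename, so a half-written file is never visible under the final name.
fn write_atomically(dest: &Path, contents: &[u8]) -> io::Result<()> {
    // Fall back to the current directory for bare file names.
    let dir = dest
        .parent()
        .filter(|p| !p.as_os_str().is_empty())
        .unwrap_or_else(|| Path::new("."));
    fs::create_dir_all(dir)?;

    let file_name = dest.file_name().ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "destination has no file name")
    })?;

    // Assumption: including the process ID keeps concurrent compiler
    // processes from sharing a temp file.
    let mut tmp_name = file_name.to_os_string();
    tmp_name.push(format!(".{}.tmp", process::id()));
    let tmp = dir.join(tmp_name);

    let mut file = fs::File::create(&tmp)?;
    file.write_all(contents)?;
    file.sync_all()?; // flush to disk before the rename publishes the file
    drop(file);

    // `fs::rename` replaces the destination if it already exists.
    fs::rename(&tmp, dest)
}
```

If two processes write the same query at once, the last rename wins. That is harmless when both produce the same data for the same query.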

I manually tested cargo sqlx prepare and cargo sqlx prepare --check in a project with a:

- Single crate.
- Virtual workspace with three crates.
- Workspace with three crates and a top-level crate.

This confirmed that the behaviour should be the same as before, in terms of which queries are included, compared to sqlx-data.json.

Take a look if you have the chance and let me know if it's worth finishing this off (just documentation and testing/benchmarking, unless you see anything that needs changing, like the in-memory caching) or pursuing a different direction, such as starting from scratch?

abonander · 2022-09-15T23:59:03Z

I'm working on a large refactor for 0.7.0 and we want to land this as part of that release. @cycraig if you'd open a new PR based against 0.7-dev I'll be glad to look at it.

abonander · 2023-03-03T00:06:16Z

Completed in #2363

abonander added this to the 0.6.0 milestone Mar 30, 2022

abonander force-pushed the ab/one-file-per-query branch 4 times, most recently from 0408707 to 891cd45 Compare March 30, 2022 23:53

abonander changed the title ~~WIP: [offline] Remove sqlx-data.json and sqlx prepare command~~ WIP: [offline] implement one-file-per-query Apr 1, 2022

abonander force-pushed the ab/one-file-per-query branch 10 times, most recently from 82385f4 to 4c2ef77 Compare April 8, 2022 20:26

abonander mentioned this pull request Apr 11, 2022

Overhauling cargo sqlx prepare --merged #1793

Closed

jplatte and others added 2 commits April 14, 2022 16:22

feat(macros): move to one-file-per-query for offline mode

1fa2381

Query data is now stored in .sqlx/{query_hash}.json directly by the macro invocations, rather than first writing to target/sqlx/{input_span_hash}.json and then collecting those into sqlx-data.json separately.

chore: test macros' offline mode in CI

b532eb3

abonander force-pushed the ab/one-file-per-query branch from 4c2ef77 to b532eb3 Compare April 14, 2022 23:22

abonander mentioned this pull request Aug 25, 2022

Fix prepare race condition in workspaces #2069

Merged

abonander modified the milestones: 0.6.0, 0.7.0 Sep 3, 2022

cycraig mentioned this pull request Sep 21, 2022

[offline] Change prepare to one-file-per-query #2110

Closed

2 tasks

abonander force-pushed the main branch from eebfeeb to 6cf15b0 Compare February 21, 2023 22:06

cycraig mentioned this pull request Feb 21, 2023

[offline] Change prepare to one-file-per-query #2363

Merged

abonander force-pushed the main branch from 6cf15b0 to eade49c Compare February 21, 2023 22:56

abonander closed this Mar 3, 2023
