Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: New Architecture for sqlx::query!() Macros #1598

Open
abonander opened this issue Dec 28, 2021 · 16 comments
Open

Proposal: New Architecture for sqlx::query!() Macros #1598

abonander opened this issue Dec 28, 2021 · 16 comments

Comments

@abonander
Copy link
Collaborator

abonander commented Dec 28, 2021

Status Quo

Currently, the default mode of operation for the sqlx::query!() family of macros is to connect to a running database server (or open that database in-process for SQLite) as specified by the DATABASE_URL environment variable, and then use database-specific commands to parse and analyze the passed query, to get information on bind parameters and output columns and generate code based on that information.

As an option, the macros can also operate in an "offline" mode where instead of connecting to a database server, they read a sqlx-data.json file created by cargo sqlx prepare which contains the same data and use that to generate code.

The implementation is relatively straightforward and functions well enough, but has some drawbacks:

Compiling a project with a lot of query!() macro invocations is painfully slow.

Since each macro invocation involves opening and working with a TCP connection to a database server from within an asynchronous runtime, the query macros execute a lot more code than your typical procedural macro. This is compounded by the fact that procedural macros are exclusively compiled in debug mode which means no optimizations (which async code relies pretty heavily on to be fast).

I have tried playing around with setting rustflags in a config.toml to force sqlx-macros to compile in release mode but it didn't really help and seemed to actually worsen compile times if it did anything at all; the most noticeable thing that it did was clobber all the cached compiler artifacts for the whole dependency tree, requiring a full recompile.

Macro invocations are currently executed serially which further compounds the problem.

Currently as a mitigation we only start a single runtime and then cache it in a lazy_static!(); since proc macros are executed in a separate process that is kept running for the duration of the compiler, this saves some overhead per invocation. I've been meaning to see if caching a Pool of connections saves any overhead as well. I imagine it would, at least somewhat. We could cache analysis results as well, but that would only help for queries that are copy-pasted a lot.

Many people find the idea of procedural macros opening arbitrary TCP connections to be highly unpalatable.

Oftentimes when someone comes across SQLx and starts reading about the query macros, their first reaction is something along the lines of, "wow, the compiler actually lets you do that? That's disgusting!"

Fair enough, the current procedural macro environment is extremely laissez-faire and there have been calls to lock down what procedural macros (and build scripts) can actually do, thanks to sensationalized proof-of-concept projects showing how a proc macro executed by a transitive dependency could steal sensitive information like SSH keys and the like.

Personally, I think the threat is kind of overblown, or at least rather moot, considering a malicious crate in your dependency tree could accomplish all kinds of nastiness already, just at runtime instead of compile time. I fear that a big push to sandbox procedural macros and build scripts could lure people into a false sense of security and make them feel like they don't need to audit their dependencies. But, I digress.

The query!() macros don't play that well with IDEs.

This situation has improved drastically compared to when we first introduced SQLx, to the point where IntelliJ-Rust and Rust Analyzer can both provide code completion for sqlx::query!(). However, it can still take some time to get results, in part thanks to point 1 here, and the macros aren't "pure" in the sense that their output could change from run to run based on schema changes, without the compiler or IDEs being able to pick up on it. They should be working from cached output where possible, and be able to invalidate that cache in a way that IDE plugins can learn about.

Offline mode is finicky to use and it's easy to forget to run cargo sqlx prepare before committing.

Offline mode isn't easily discoverable (in part thanks to just how dense the documentation on sqlx::query!() has gotten... we really need to move that stuff to an MDBook or something). It's also easy to forget to run cargo sqlx prepare before committing, which often leads to really confusing CI errors.

In the past, what I've done is add cargo sqlx prepare && git add sqlx-data.json to .git/hooks/precommit to make sure I don't forget, but this makes committing take forever thanks to point 1.

The macros don't have a way to ensure that the database is up-to-date with migrations.

This is somewhat on purpose as we don't want the macros automatically executing migrations which may end up breaking the user's local development database and/or clobbering manually generated test data. Still, it does get annoying.

Proposal

Instead of having the macros connect directly to the database server at compile time, they could communicate with a daemon process that the user starts and leaves running while working on their project.

I'm imagining this daemon process as another subcommand of sqlx-cli, perhaps cargo sqlx watch or something like that. It would watch target/sqlx which the macros would write their query strings to as files. I considered using a crate like interprocess to let the macros and daemon talk directly, but I think communicating with the filesystem is better because that's a medium that the compiler natively understands as I'll elaborate on in a second.

Since a cargo installed binary is built in release mode by default, the latency of the actual request to parse and analyze the query should improve quite a bit.

The macros could submit their request to the daemon and then return some dummy code (or even a temporary error, like Waiting for query analysis. Is `cargo sqlx watch` running?), telling the compiler to watch target/sqlx for the daemon to write the result of processing the query. Then the compiler could move on to the next macro invocation instead of blocking on the completion of the first one. This would let the macros essentially run in parallel, which should significantly reduce overall compile times.

Assuming #570 is implemented as part of this rearchitecting, the daemon could also automatically update entries in .sqlx/ for successfully parsed queries so that cargo sqlx prepare doesn't have to be manually run before committing.

This would also be more amenable to sandboxing as we would only require write access to target/ and read access to the project, which wouldn't be doing anything more than what build scripts are already expected to do, and any sandboxing scheme that is implemented would be required to support. Support for Windows named pipes or Unix domain sockets would probably be right out in this case unless we specifically asked for it, and that would be an uphill battle.

The daemon could also periodically re-run the analysis of queries to pick up changes to the database schema.

@mehcode suggested that the daemon could automatically migrate the database if necessary, so no other manual setup would be needed. It could even spin up a temporary database so migrations could be automatically tested and synchronized with the query macros without breaking a local development database.

@abonander
Copy link
Collaborator Author

I'm wondering if there's a way that we can support incremental adoption of cargo sqlx watch into people's workflows. It is yet another process that needs to be running to compile your code, and some might find that annoying regardless of the potential benefits.

Perhaps cargo sqlx watch could create target/sqlx/watch.lock or something as a temporary file and the macros could just check if that file exists or not to know if cargo sqlx watch is running. If the file doesn't exist, they can fall back to their current behavior.

@jplatte
Copy link
Contributor

jplatte commented Dec 28, 2021

If #1183 could be finished and the workspace situation improved (see discussion in #1223), I think that would solve my own pain points. I'm not convinced a separate daemon would make working with SQLx easier at all.

For context: The database part of the service I work on is updated infrequently enough that an offline-first workflow works well (.env contains SQLX_OFFLINE=true and the macros only connect to the DB as part of cargo sqlx prepare).

@abonander
Copy link
Collaborator Author

@jplatte hmm, then as a compromise, maybe the macros could use the cached data first if it exists and the migrations haven't changed. Then only the modified macro invocation will actually have to talk to the database.

It's certainly worth trying as a first step at least.

@bennofs
Copy link

bennofs commented Jan 8, 2022

I have thought a lot about how to handle these compile-time query checking features (not in rust, but also in other languages). Needing access to a database of the same kind as the production base during builds is almost always hard and leads to complex build system. This is why I want to suggest an alternate approach:

  • during compile time, only record assertions that need to hold on the DB for queries to be valid
  • provide a simple method to execute these assertions against any database after compilation

So in the end, you generate a check_compat(connection) method that the compiled program can use to check whether the DB satisfies all the things it expects directly during startup and fail fast if it does not. You could also provide a CLI command that does the same. To me, this has the following advantages:

  • no need to access the DB during compile time
  • also catches deployment issues, such as migrations that haven't been applied, since the compat check can be run against the real DB
  • still fails fast: after application startup, no query should fail due to trivial syntactic/query issues
  • you can test that old versions of apps are still compatible with the DB after running certain migrations (necessary if you do rolling deploys where different versions of the code are accessing the same DB), simply by running the check_compat function of the old version. This does not require a rebuilt of the old version.

Is there any use case where getting the error directly at application startup (or by running the checker function directly via cli after build or even as an automatic post-build step) is significantly worse than getting them during build? The main motivation for compile-time query checking from my experience is that you want to avoid situations where you first have to trigger a certain code path to see the error, which leads to long cycle times and is inconvenient. But if you can check all queries directly at application startup or with a CLI command, that solves the same issue without the disadvantages of compile-time network access.

@jplatte
Copy link
Contributor

jplatte commented Feb 1, 2022

during compile time, only record assertions that need to hold on the DB for queries to be valid

This is quite simply not feasible. SQLx checks the queries directly against the database (or against cached metadata previously obtained from the database) because the alternative is parsing / interpreting the query string, which could be using many, many different syntax constructs and features depending on the DB and even how it is set up.

As a little thought experiment, consider this simple query: select my_function(). It's super basic, but at the same time it's impossible to know what type it will return for software other than the actual database you would use it with.

@pythoneer
Copy link

pythoneer commented Feb 2, 2022

Not to derail the conversation topic to badly but i see a use case in the mechanics that @bennofs proposed with check_compat(connection) but i also agree with @jplatte that this is not really the way to go. I just imagined (and its probably the best to create a new issue for this) a way of having both. I am not aware if the current cli-tool or any migrate-style macro handles this but it could be helpful if you can take a sqlx-data.json created in the CI-Job or on your dev machine to the deployment target and have the cli-tool use the sqlx-data.json you brought with you and "compare" it against the live database and/or have a migrate-like macro to check the brought with you sqlx-data.json against the live database before starting? This would give you more confidence if you can't build against the live database (which isn't a good idea anyway i think).

Edit:
maybe just generating a second sqlx-data.json on the deployment machine and just diffing it is enough – idk how "stable" the sqlx-data.json output is.

@jplatte
Copy link
Contributor

jplatte commented Feb 2, 2022

@pythoneer It sounds like you are describing cargo sqlx prepare --check, which is already a thing and useful also as a CI job to make sure any new migrations are accounted for in the code (and the query cache).

@bennofs
Copy link

bennofs commented Feb 2, 2022

As a little thought experiment, consider this simple query: select my_function(). It's super basic, but at the same time it's impossible to know what type it will return for software other than the actual database you would use it with.

Hmm, couldn't you specify the return type explicitly in the code? Are there many other things apart from return types and argument types that you'd need to know about the query? Ideally you'd be able to record the (parametrized) query, return and argument types and then check against the database that those types are convertible to/from the the actual query parameters/return value.

@jplatte
Copy link
Contributor

jplatte commented Feb 2, 2022

Hmm, couldn't you specify the return type explicitly in the code?

In theory yes, but that would be a lot of unnecessary extra boilerplate.

Are there many other things apart from return types and argument types that you'd need to know about the query?

Yes, a bunch of things. And you would have to be able to fully parse the query to get that information, which may be reasonable for a simple DB, but SQLx supports multiple very complex databases. Even "just" a self-made parser for PostgreSQL's SQL flavor would be a huge project.

@bennofs
Copy link

bennofs commented Feb 2, 2022

I definitely agree that parsing queries should be avoided. As far as I can see, with query_as!, the rust types of output/input are known (parameters must be known because they are rust values, and output type is determined by the named struct). But yes, specifying the type for the query! macro could become annoying, although personally I'd take the tradeoff of having to specify types explicitly in my code vs needing to keep files depending on db schema up to date.

Anyway, this is not core of sqlx so I may just try to implement that myself as a separate crate, and see how well it works :)

@abonander
Copy link
Collaborator Author

As far as I can see, with query_as!, the rust types of output/input are known (parameters must be known because they are rust values, and output type is determined by the named struct)

The processing is still driven by the metadata returned from the database. Proc macros can't introspect any code that isn't in their input, so all query_as!() does it take the name of the struct and emit a literal value for it the same way query!() does, but without generating an ad-hoc struct first.

It will happily emit fields that aren't in the struct or omit ones that are if they aren't in the query, and it's up to the compiler to sanity check that.

@loafofpiecrust
Copy link

loafofpiecrust commented Feb 9, 2022

At least considering Postgres, couldn't this tool use an exported schema using pg_dump -s database or a custom JSON format that isn't dependent on specific queries? This way, you could run cargo sqlx prepare only once after all migrations are run on the DB, then the file it generates can be used to validate any query. Then I don't have to generate whenever I change queries. Would that be too slow or complex? I like some of these ideas around automating testing for migrations w/o breaking current database, but I'm honestly not sure if I love running another daemon for project development.

@abonander
Copy link
Collaborator Author

@loafofpiecrust have a look at this FAQ answer: https://github.com/launchbadge/sqlx/blob/master/FAQ.md#why-cant-sqlx-just-look-at-my-database-schemamigrations-and-parse-the-sql-itself

It's a bit snarky but the suggestion comes up a lot and so it gets a bit annoying to respond to.

@loafofpiecrust
Copy link

Yup @abonander sorry for the noise.

@andoriyu
Copy link

Also related, I find using query! cumbersome sometimes:

  • I run my migrations from my binary, to use query!() I need to run migrations with sqlx-cli first during development. It's not the end of the world, but makes me dread working on schema related things.
  • It locks you into using a single backend per crate.
    • I either need to introduce additional crates, one per backend
    • Misusing features make them mutually exclusive, this solves issue only partially - still can't SQLite and PostgreSQL both using query!()

I stopped using it due to that, but then I added a crate dependency that uses it, and it's even worse - it is misusing features and provides offline support only for PostgreSQL.

I think you're on the right track that macro opening any kind of network connections is pretty wild design. Is there any way to do something like:

  • if query can be done with sqlx-data.json do that
  • if not in release mode and query can't be done with offline support - just do it the same way regular query works.
    • emit a warning or whatever
  • if in release mode and query can't be done with sqlx-data.json bailout.
    • I don't mind checking that sqlx-data.json is up-to-date during CI
  • This obviously depends on sqlx-data.json to support multiple backends at the same time.

I'd rather not daemon:

  • On my local machine during test/check...maybe? It does complicate on-boarding a little, but oh well.
  • For sure don't want to run in my "lab" (Sometimes I run whole stack in container where code is rebuilt on change)

Sorry for the wall of text. sqlx is a great crate, you are all doing a great job.

@tyrelr
Copy link
Contributor

tyrelr commented May 5, 2023

This is an interesting idea, and is a much bigger idea than my thought of having a sqlx describe command to inspect the results of a query.

I do see significant potential advantages to separating the query analysis from the rust compilation:

  • It provides a familiar workflow for anyone used to code-gen tools (xsd, swagger, proto, etc.)
    • Might enable outright code-gen in the future, allowing sqlx type-checking without macros
  • Introduces an offline-cache of the results for all queries:
    • very likely to speed up compilation
    • partial progress towards an optional 'query sanity check' method at runtime (describe query at runtime; compare result to compile time result; return errors for differences)
  • Could provide better access to sqlite-explain logs (proc macro swallows stdout/stderr output, which is where the sqllite-explain logging goes for 99.9% of uses).
    • Possibly better formatting options (I develop sqlx sqlite analysis changes using a fork of sqlx which formats the logs as graphviz dot format, because plain text execution traces get hard to visualize fast)

The potential disadvantages I see are:

  • compilation instructions for projects using sqlx would diverge from typical rust instructions (you either need to run a tool, or launch the daemon)
  • It might be tricky for the daemon to decide when to re-describe queries in the face of schema migrations (assuming it's cherry-picking what to describe)
    • the built-in migration implementation might provide a trigger, but using the built-in migrations is optional
    • some databases might have DDL timestamps/logs which could provide a trigger
  • Is there a way to signal to rust-analyzer that a 'describe result has changed'? If describe is asynchronous, will rust-analyzer react when some dummy/result file is updated with the actual results?

Relying on something resembling 'offline' mode, with an external tool to do the sql query analysis, seems like a better default than the current default.

For how I use sqlx, I would still prefer if cargo run/check was capable of handling everything, even if it was opt-in. But even if that was functionality was lost, sqlx would STILL be better than only finding out about a typo after the application tries to run the query.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants