Query pipelining #408
Comments
I'm wary of *implicit* query batching. In tokio-postgres, batching happens automatically if multiple futures from query executions are polled concurrently. Supporting query pipelining is fairly simple at the protocol level; nearly all major sync and async clients in other languages support it (in Rust, it seems only tokio-postgres does). It's not the priority currently, as we tend to write API servers and execute at most 2-3 *complex* queries per request. With that said, I *do* want to see this supported in SQLx in an explicit form once we figure out the details of how that form looks. What follows is what I'm currently thinking, so it's available for discussion.

// Batch or Pipeline or ?
let b = Batch::new(&mut conn);
let q1 = sqlx::query("SELECT 1").fetch_one(&b);
let q2 = sqlx::query("SELECT 2").fetch_all(&b);
let q3 = sqlx::query("SELECT 3").fetch_optional(&b);
// run all queries in a try_join_all
// this would entirely be optional sugar over using try_join_all on all the queries
b.join().await?;
// these now return immediately
let v1 = q1.await?;
let v2 = q2.await?;
let v3 = q3.now_or_never().unwrap()?;
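For reference, a sketch of the desugared form the "optional sugar" remark points at, assuming the query futures can be polled directly (note that `futures::try_join!` copes with the three different output types, which `try_join_all` would not):

// poll all three queries concurrently instead of calling b.join()
let (v1, v2, v3) = futures::try_join!(q1, q2, q3)?;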
I'm also not looking for any kind of implicit solution; I think we're in complete agreement here. Instead of a new object `Batch`, could we make it work with the existing `Transaction`? Simultaneous queries against a pool are already executed concurrently by the pooling logic.
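For concreteness, a minimal sketch of that existing behavior, assuming a Postgres pool: each future checks out its own connection, so the queries run concurrently, though over separate connections rather than pipelined on one.

use sqlx::PgPool;

async fn concurrent_via_pool(pool: &PgPool) -> Result<(), sqlx::Error> {
    // each fetch checks out its own connection from the pool
    let q1 = sqlx::query("SELECT 1").fetch_one(pool);
    let q2 = sqlx::query("SELECT 2").fetch_one(pool);
    let (_r1, _r2) = futures::try_join!(q1, q2)?;
    Ok(())
}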
@mehcode My suggestion:

// Batch insert
// futures::stream::StreamExt
let (s, r) = Batch::new(&mut conn).split();
let item_stream = stream::iter(items.into_iter().map(|i| sqlx::query("INSERT ... RETURNING").bind(i)));
let send_all = s.send_all(&mut item_stream);
let result = r.map(|cursor| cursor.next()).collect();
send_all.join(result).await

Or, for an infinite pipeline:

let (s, r) = Batch::new(&mut conn).split();
// forward an infinite stream of input to the DB
let stream_in = socket.map(|data| sqlx::query("...").bind(data)).forward(s);
// forward an infinite stream of output to the socket
let stream_out = r.map(|cursor| rows_to_data(cursor)).forward(socket);
stream_in.join(stream_out).await
I tend to use `futures::stream::StreamExt::for_each_concurrent` a lot when I need to batch any async work. Would this `Batch` type work with that API? tokio-postgres's implicit approach should.
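For the record, a sketch of how `for_each_concurrent` composes with a pool today (the table and column names are placeholders); an explicit `Batch`/`Pipeline` would presumably need to fit the same shape:

use futures::stream::{self, StreamExt};
use sqlx::PgPool;

async fn insert_all(pool: &PgPool, items: Vec<i64>) {
    stream::iter(items)
        .for_each_concurrent(5, |item| async move {
            // each concurrent task draws its own connection from the pool
            let _ = sqlx::query("INSERT INTO t (v) VALUES ($1)")
                .bind(item)
                .execute(pool)
                .await;
        })
        .await;
}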
We can make pipeline creation generic, similar to how transaction creation is, so you can start a new pipeline from either a pool or a connection.
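A sketch of that genericity, mirroring how `Transaction` is started through sqlx's `Acquire` trait; the `Pipeline` type itself is hypothetical:

use sqlx::{Acquire, Postgres};

async fn with_pipeline<'a, A>(source: A) -> Result<(), sqlx::Error>
where
    A: Acquire<'a, Database = Postgres>,
{
    // works for both `&Pool` and `&mut PgConnection`
    let mut conn = source.acquire().await?;
    // let p = Pipeline::new(&mut *conn); // hypothetical constructor
    Ok(())
}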
In my idea of how this would be implemented, there is no difference in behavior depending on the order you await the queries. The batch runs them all at once, and the returned handles are thin wrappers over oneshot channels.
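A minimal sketch of that wrapper idea, with hypothetical types: the pipeline keeps the senders and completes them as results arrive, while each handle just awaits its receiver, so the order the handles are awaited in doesn't matter.

use futures::channel::oneshot;

// hypothetical: what `fetch_one(&p)` could hand back
struct QueryResult<T> {
    rx: oneshot::Receiver<T>,
}

impl<T> QueryResult<T> {
    async fn get(self) -> Result<T, oneshot::Canceled> {
        // resolves as soon as the pipeline delivers this query's result,
        // regardless of which handle is awaited first
        self.rx.await
    }
}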
It may not seem like it, but we actually need something similar to this for batched inserts. With the same API I proposed, this is possible:

// I'm now leaning towards `Pipeline` as a type name over `Batch`
let p = Pipeline::new(&mut conn);
let mut r = Vec::new();
for _ in 0..10 {
    r.push(query("INSERT INTO ...").bind(10).bind(20).fetch_one(&p));
}
p.join().await?;
for cursor in r {
    let row = cursor.await?; // result from fetch_one
    // do something with that row
}

Please note that, just like the queries themselves, nothing touches the database until the pipeline is joined:

let p = Pipeline::new(&mut conn);
for _ in 0..10 {
    r.push(query("INSERT INTO ...").bind(10).bind(20).fetch_one(&p));
}
p.join().await?;

The examples I've shown so far are "prepare a ton of queries, then execute them all in one pipeline". There is another use case: "set up a pipeline, let me shove queries at it, and keep them executing".

// let p = conn.pipeline();
let p = Pipeline::new(&mut conn);
// spawns us in the background now so we can execute queries while more are added
p.spawn();
for _ in 0..10 {
    p.execute("INSERT INTO ..");
}
p.join().await?;

@Cocalus We could do something like:

let p = conn.pipeline(None); // no limit
let p = conn.pipeline(5); // max 5 queries at once
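Both call shapes can share a single signature via the same trick `for_each_concurrent` uses; a sketch with hypothetical names:

use sqlx::PgConnection;

struct Pipeline<'c> {
    conn: &'c mut PgConnection,
    limit: Option<usize>, // None = unbounded
}

// `pipeline(None)` and `pipeline(5)` both satisfy `impl Into<Option<usize>>`
fn pipeline(conn: &mut PgConnection, limit: impl Into<Option<usize>>) -> Pipeline<'_> {
    Pipeline { conn, limit: limit.into() }
}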
@mehcode What would be the type-safety story for this? Could something like this work?

let q1 = sqlx::query("SELECT 1").fetch_one(&b);
let q2 = sqlx::query("SELECT 2").fetch_all(&b);
let (res1, res2) = conn.pipeline((q1, q2)).await?;

I think that to have type safety, this `pipeline` method would have to be implemented over tuples of queries. Either way, this feels less ergonomic than the `Pipeline` API proposed above.
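One way the tuple version could be made type-safe is an impl per arity; a sketch, all names hypothetical:

use std::future::Future;

// implemented for (A,), (A, B), (A, B, C), ... usually via a macro
trait PipelineSet {
    type Output;
}

impl<A: Future, B: Future> PipelineSet for (A, B) {
    type Output = (A::Output, B::Output);
}

// `conn.pipeline(set)` could then return a future resolving to
// `<S as PipelineSet>::Output`, preserving each query's result type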
+1
Imho, when I used …
I've made an implementation attempt and was invited to this discussion. The implemented pipeline runs queries in a single transaction. The main motivation was to avoid running explicit transactions against CockroachDB and instead rely on its mechanism of automatic transaction retries, which handles serialization failures of concurrent implicit transactions. My current need, running several related INSERTs transactionally, is covered by `pipeline.execute()` alone, but I also provide a `fetch_pipeline` method. The raw streaming API is definitely not easy or user-friendly, but it doesn't set any limits on how query results are processed.
Guys, I've extended this discussion to explicit transaction pipelines and an API that doesn't return stale data: see #2082.
Currently, query execution functions require either a shared borrow of a connection pool or a mutable borrow of a `Transaction`. This makes it impossible to prepare multiple queries and execute them in a pipelined fashion within a single transaction. tokio-postgres has this, and it's an impressive feature.
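A sketch of the limitation described above: both query futures need the same `&mut` transaction, so the borrow checker rejects holding them concurrently.

use sqlx::{PgPool, Postgres, Transaction};

async fn demo(pool: &PgPool) -> Result<(), sqlx::Error> {
    let mut tx: Transaction<'_, Postgres> = pool.begin().await?;
    let f1 = sqlx::query("SELECT 1").fetch_one(&mut *tx);
    // ERROR: cannot borrow `tx` as mutable more than once at a time
    // let f2 = sqlx::query("SELECT 2").fetch_one(&mut *tx);
    let _row = f1.await?;
    tx.commit().await?;
    Ok(())
}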