Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Discussion] Dynamic Query Builder Design #291

Closed
abonander opened this issue Apr 29, 2020 · 31 comments · Fixed by #1790
Closed

[Discussion] Dynamic Query Builder Design #291

abonander opened this issue Apr 29, 2020 · 31 comments · Fixed by #1790
Labels
E-medium enhancement New feature or request help wanted Extra attention is needed proposal

Comments

@abonander
Copy link
Collaborator

abonander commented Apr 29, 2020

We have received a lot of requests, and have a significant need ourselves, for a dynamic query builder in SQLx that can generate and execute valid SQL with the appropriate bind parameters for a given database backend at runtime.

This is to do things like:

  • Dynamically add columns from the query based on what data is needed
  • Dynamically add joins to the query from other tables
  • Dynamically add where clauses based on user input
  • Generate INSERT INTO foo(...) VALUES(...) statements with a variable number of records in the VALUES expression, taking into account the SQL flavor's max number of bind parameters per query (which we can build a more specialized API for on top of this)

MVP

I'm thinking the minimum viable product for this would be something like the following:

pub struct QueryBuilder<DB> { ... }

impl<DB> QueryBuilder<DB> {
    /// Create a builder with the initial SQL literal to append to
    pub fn new(init: impl Into<String>) -> Self { ... }

    /// Push more SQL code
    pub fn push(&mut self, sql: impl Display) -> Self { ... }

    /// Push a bind parameter to the SQL and add encode its value
    /// RFC: should we be taking values here, or bind once the query is built?
    pub fn push_bind(&mut self, value: impl Encode<DB>) -> Self { ... }

    /// Append a comma separated list with the closure being repeatedly called to add the values between commas until it returns `false`.
    ///
    /// This seems a bit higher level but it would be really useful for generating `VALUES` lists with binds
    pub fn comma_sep(&mut self, impl FnMut(&mut QueryBuilder<DB>) -> bool) -> Self { ... }

    /// Finalize the query so it can be executed
    /// 
    /// I'm thinking this clones the query so the builder can be reused
    pub fn build(&mut self) -> sqlx::Query<DB, 'static> { ... }

    /// Alternatively, we have `.execute()`, `.fetch()`, `.fetch_all()`, etc. on this struct
}

Thoughts to the Future

  • Some sort of macro that can do interpolations and dynamic repetitions like quote::quote!()
    • Perhaps still somehow checking the query at compile time
    • The catch here is that the syntax would have to be tokenizable by Rust's lexer and I think there's a few things about SQL which violate that (single-quote strings, for one)
  • A dedicated API for generating efficient INSERT INTO statements
    • Perhaps a proc-macro that generates an insertion builder for a given table with specific columns and return values

Non-Goals

  • We don't want to go overboard and implement a fully fledged query DSL yet, that's a discussion for another day

cc @mehcode @izik1 @danielakhterov

@connec
Copy link

connec commented May 1, 2020

I guess that would result in usage like the following to perform a multi-row insert:

// E.g.
let records: Vec<(Uuid, i32)> = vec![...];

let records_iter = records.iter();
let query = QueryBuilder::new("INSERT INTO users VALUES ")
    .comma_sep(move |builder| {
        if let Some(record) = records_iter.next() {
            builder.push("(");
            builder.push_bind(record.0);
            builder.push(", ");
            builder.push_bind(record.1);
            builder.push(")");
            true
        } else {
            false
        }
    })
    .build();


let result = query.execute(&conn).await?;
assert_eq!(result, 2);

This would be a decent starting point, though the inner building would be a bit painful for large structs.

@foldu
Copy link
Contributor

foldu commented May 1, 2020

Prisma has a working SQL builder: https://github.com/prisma/quaint

@mehcode
Copy link
Member

mehcode commented May 1, 2020

Note that this interface would be aimed at being used by quaint. Not at replacing Quaint.


We want to reach for low-level while still doing what we can with our db-specific knowledge ( placeholders, ID quoting, etc. ).

@abonander
Copy link
Collaborator Author

This would be a decent starting point, though the inner building would be a bit painful for large structs.

Maybe instead we have .separated(char) -> Separated where Separated also has .push() and .push_bind() but appends the separator between calls or something.

@elertan
Copy link

elertan commented Aug 16, 2020

Any progress on this? I feel like this is very much a necessary feature for real world applications.
I am currently really in need of some dynamic query building system, specifically dynamically building up a where clause based on user input.
I don't know how to do this in a temporary fashion either, that does not fall back to copy pasting logic with different parameters...

@elertan
Copy link

elertan commented Sep 10, 2020

Any update on this?

@nkconnor
Copy link
Contributor

/// I'm thinking this clones the query so the builder can be reused
pub fn build(&mut self) -> sqlx::Query<DB, 'static> { ... }

Wouldn't that by virtue of the builder also clone the arguments? It seems like that would be a bit heavy. Alternatively, if the design had a placeholder type param, we could clone the query "plan" and later bind the args.

@abonander
Copy link
Collaborator Author

Wouldn't that by virtue of the builder also clone the arguments?

Hmm, no, the arguments would be wrapped in Option internally and it would probably panic if called more than once. Really though, the only reason to take &mut self here is so that it can be called at the end of a method chain.

@nkconnor
Copy link
Contributor

nkconnor commented Sep 11, 2020 via email

@elertan
Copy link

elertan commented Sep 15, 2020

I wouldn't mind trying to implement a basic query builder similar to the one you had suggested @abonander . Should I give it a shot?

@abonander
Copy link
Collaborator Author

@nkconnor .push_bind() both pushes a bind parameter expression to the query string and also adds the bind value to the arguments, so it recreating the arguments doesn't make much sense. I'm thinking the query builder is going to be a one-and-done kinda thing. Although maybe instead of building the Arguments concurrently to the query string, we just have the user call .bind() on the Query returned by build(), so theoretically it could be used more than once, but I think that's going to be very error-prone.

@elertan please do! We unfortunately don't have much time to dedicate to SQLx these days but we're trying to get back on it.

@shiftrtech
Copy link

Hi, anyone can advise me about how to make a massive database ingestion (> 200k rows) with SQLX, there is a bulk operation builder or somethig like that?, thanks

@imbolc
Copy link
Contributor

imbolc commented Nov 2, 2020

@shiftrtech from my experience wrapping bunch of separate inserts into a transaction gives roughly the same speed

@shiftrtech
Copy link

inserts

Can you give me some basic example?

@imbolc
Copy link
Contributor

imbolc commented Nov 2, 2020

let mut tx = db.begin().await?;
for row in your_data {
    sqlx::query!("INSERT ...").execute(&mut tx).await?;
}
tx.commit().await?;

@anitnilay20
Copy link

anitnilay20 commented Feb 9, 2021

@abonander @elertan Hey, did you guys started working on this? If not I can take this up.

@elertan
Copy link

elertan commented Feb 9, 2021

By all means go at it, I honestly kind of left this in the dust for quite a while, but would be great to have this feature implemented.
@anitnilay20

@anitnilay20
Copy link

@elertan Awesome.. will start on it right away and will keep this thread updated.

@trusch
Copy link

trusch commented May 5, 2021

Is there anything that can be used by now? All I can find are some incomplete snippets... I really like sqlx, but I'm at a point where I need to construct a filter based on user input (the filters for a list endpoint) and I don't see a way of doing this except writing my own query builder which assembles the sql string and the arguments. Is there a way that I'm just overseeing?

@imbolc
Copy link
Contributor

imbolc commented May 5, 2021

@trusch I ended up just building the filter on the db level. As you can see in the following example some parameters are optional and you can check if they're exist on postgres level.

async fn fetch_items(
    db: &PgPool,
    start: DateTime<FixedOffset>,
    finish: DateTime<FixedOffset>,
    company: Option<i32>,
    purpose: Option<&str>,
    from_id: Option<i32>,
    limit: i64,
) -> Result<Vec<Transaction>> {
    Ok(sqlx::query_as!(
        Transaction,
        r#"
            SELECT
                -- total number of rows before limit applied
                count(*) OVER() as full_count,
                t.id,
                c.id as company_id,
                c.name as company_name,
                t.type as kind,
                CAST(t.amount as DOUBLE PRECISION),
                CAST(t.aws_amount as DOUBLE PRECISION),
                CAST(t.current_balance as DOUBLE PRECISION),
                t.user_id,
                -- hotfix for a sqlx bug - it doesn't see the field as nullable
                u.full_name as "user_name?",
                t.extra,
                t.created,
                t.charge_for,
                t.software
            FROM billing_transaction t
            JOIN app_company c on c.id = t.account_id
            LEFT JOIN user_user u on u.id = t.user_id
            WHERE t.created >= $1
              AND t.created < $2
              AND ($3::int IS null OR t.account_id = $3)
              AND ($4::text IS null OR t.extra -> 'computer' -> 'type' ->> 'purpose' = $4::text)
              AND ($5::int IS null OR t.id < $4::int)
            ORDER BY t.created DESC
            LIMIT $6
            "#,
        start,
        finish,
        company,
        purpose,
        from_id,
        limit,
    )
    .fetch_all(db)
    .await?)
}

@trusch
Copy link

trusch commented May 5, 2021

@imbolc Thanks for the hint! Unfortunatly my usecase is even more dynamic since I want to filter on a jsonb field and the structure of it is only known at runtime. So I really get something like a tuple of fieldname, operation and value and need to translate that to something like props->>'fieldName' = 'value'.

@mehcode
Copy link
Member

mehcode commented May 5, 2021

Dynamic behavior like that is simply not something that SQLx tries to target. Even with this proposal we aren't going to be adding a SQL builder that lets you fluently build up a query. This is a minimal proposal to make it easier for a library to wrap SQLx to provide such an API.

The library ormx may work for you. You can add optional clauses to your query.

Another library choice could be any SQL builder. Something like https://github.com/perdumonocle/sql-builder that you then pass the resulting SQL into SQLx.

@red010182
Copy link

red010182 commented Feb 8, 2022

  • Dynamically add where clauses based on user input

How about use a static query to deal with dynamic user input?

// assume age is an optional variable from user input
// age: Option<i32> 

let users = sqlx::query_as!(User, r#"
    SELECT * From "User" WHERE (age = $1 OR 0 = $2 );
  "#,
  age.unwrap_or_default(),
  age.unwrap_or_default()
).fetch_all(&ctx.db).await.ok().unwrap_or_default();

pros:

  • use static query to handle dynamic user input, compile-time validation still works.

cons:

  • need to hardcode the default value in query
  • tedious unwrap_or_default() but may be solved by macro (by default call unwrap_or_default for all optional variables)

@treydempsey
Copy link

I can second https://github.com/perdumonocle/sql-builder . It does what's being asked here and integrates nicely with sqlx.

@0xdeafbeef
Copy link
Contributor

0xdeafbeef commented Feb 8, 2022

I can second https://github.com/perdumonocle/sql-builder . It does what's being asked here and integrates nicely with sqlx.

Is it sql injections free?

@elertan
Copy link

elertan commented Feb 8, 2022

I can second https://github.com/perdumonocle/sql-builder . It does what's being asked here and integrates nicely with sqlx.

I disagree. SqlBuilder is nice and simple, but it lacks some required APIs to be able to build any query with it. It does include preventive measures such as 'quote' against sql injection and the like.

I recently stumbled upon SeaQuery which is a little more complicated but is more flexible than SqlBuilder is.
But it might be better for sqlx to not focus on dynamic query building after all and let that be the work of another crate.
Even though there should be more documentation for these very common use cases.

@dragonnn
Copy link
Contributor

dragonnn commented Feb 8, 2022

I recently too start using SeaQuery and really like it, and the best thing is that you can construct a SeaQuery connection from a sqlx pool, so you can with easy have a shared sqlx pool for static queries and using the same pool for SeaQuery too.

@marcoslopes
Copy link

marcoslopes commented Mar 30, 2022

Hey I hope I don't derail the focus on this issue and I am happy to create this somewhere else, but have you guys ever thought about a a declarative api as well, in addition to the "fluent" query builder proposal ?
One of the most pleasant api's I have worked with was JDBI, on both links above, someone have already gave it a crack here but it does not seem fully fledged, something that the sqlx community might be better equipped to keep momentum on.
Thanks heaps for all the work! ❤️

@longfellowone
Copy link

Anyone given this a try? Looks simple

https://docs.rs/quaint/latest/quaint/#using-the-ast-module

@mike-lloyd03
Copy link

Can someone help me understand how push_bind is intended to work?

If I have this:

fn get_all(pool: &PgPool, sort_by = Option<String>) -> ... {
let mut query: QueryBuilder<Postgres> = QueryBuilder::new(
    r#"SELECT
        id,
        name
    FROM thing_users"#
    )

    if let Some(s) = sort_by {
        query.push("ORDER BY ").push_bind(s);
    };
}

query.build_query_as::<User>().fetch_all(pool).await

The ORDER BY field never seems to make a difference. The data returned is always ordered the same. I'm also having trouble trying to use this to generate a WHERE clause and am having similar results. Obviously, I can do something like query.push("ORDER BY ").push(s) but that's basically just blind SQL string concatenation which I'd like to avoid.

I think I'm fundamentally misunderstanding how push_bind works.

@abonander
Copy link
Collaborator Author

If s is some string "foo" then what you'll get out of that is

SELECT
    id,
    name
FROM thing_users ORDER BY $1

with $1 bound to the string 'foo', which is meaningless in an ORDER BY expression because the comparison for every row will essentially be 'foo' <= 'foo', and thus the ordering will be whatever order that Postgres produces the rows in.

If you're trying to order by the column foo you need to use query.push(s) to push the contents of the string to the query and you'll get

SELECT
    id,
    name
FROM thing_users ORDER BY foo

Of course, if sort_by is from, say, the query segment of a request URL then you're essentially injecting an attacker-controlled string into your SQL. That's what the warning on .push() pointing to .push_bind() is about. However, since you're actually trying to modify the SQL here then you should use .push() while checking that s is only one of the valid column names to sort by.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
E-medium enhancement New feature or request help wanted Extra attention is needed proposal
Projects
None yet
Development

Successfully merging a pull request may close this issue.