Skip to content

glue_sql should allow whitespace between asterisk and closing bracket #218

@pnacht

Description

@pnacht

I often write long SQL queries with some parameters spread throughout. This is handled beautifully by glue_sql, especially due to its asterisk operator which allows us to transform vectors into comma-separated lists.

However, glue's benefit of allowing code and string constants to be elegantly blended can also be a bit of a downside when dealing with large queries, since it becomes hard to tell at a glance what is a parameter (a single word surrounded by brackets vanishes when in the middle of a query with a dozen joins.

The inelegance (not to mention security risks) of using paste() at least has the benefit of making it painfully clear what is constant and what is variable, especially in an IDE like RStudio with different color-coding for strings and variables.

Something I've found helpful is adding spaces between the word and the brackets: WHERE x = { x } instead of WHERE x = {x}. This was inspired by the tidyverse style guide's recommended practice for the embracing operator {{ x }}.

However, this doesn't work neatly with the asterisk operator:

x <- 1:3
con <- dbConnect(RSQLite::SQLite(), ":memory:")

glue::glue_sql(.con = con, "{ x* }")
#> Error in parse(text = text, keep.source = FALSE): <text>:2:0: unexpected end of input
#> 1:  x* 
#>    ^
glue::glue_sql(.con = con, "{ x *}")
#> <SQL> 1, 2, 3

Created on 2021-05-17 by the reprex package (v2.0.0)

The asterisk must be the character immediately preceding the closing bracket. But I'd argue that { x* } is more natural than { x *}.

At a glance, this could be solved with a small change to glue:::sql_quote_transformer, which includes the following test:

should_collapse <- grepl("[*]$", text)

This could be trivially modified into

should_collapse <- grepl("[*] *$", text)

thereby allowing any number of trailing spaces. (Maybe even use \w to allow any sort of white space?)

I'm no expert in regex performance, but I'm guessing this would come at almost no performance cost.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions