Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request: get evaluated code from SQL chunks (and others) #2131

Closed
3 tasks done
dpprdan opened this issue May 24, 2022 · 3 comments
Closed
3 tasks done

Feature request: get evaluated code from SQL chunks (and others) #2131

dpprdan opened this issue May 24, 2022 · 3 comments

Comments

@dpprdan
Copy link

@dpprdan dpprdan commented May 24, 2022

@cderv asked me to open this issue over at RStudio Community

I would like to get the code from a SQL chunk in rmarkdown, with the variables (?myvar) and parameters (@myparam) evaluated. So, when I have a chunk like this

```{sql, connection=con, output.var="iris_subset"}
SELECT * FROM iris WHERE Species = ?species
```

I would like to get this SQL string back (i.e. the string should contain 'setosa', not ?species):

SELECT * FROM iris WHERE Species = 'setosa'

In other words, I'd like to see the SQL command that would actually be sent to the DB (preferably without sending it, though this could be optional).

This would be helpful when debugging SQL code that has variables/parameters, because one could see for example, whether the variables are quoted correctly (depending on the requirement as strings or as DB objects, e.g. columns) or contain the expected values.

I suspect this to apply to other engines as well. For example, I've tried the {glue}'s glue_sql engine as an alternative. I do get the evaluated SQL code with that engine when the document is knitted successfully, but that does not help when debugging, because the document does not knit successfully. (See the following on the differences in the SQL code from the sql and the glue_sql engines).

Sidenote: Different behaviour of vars in `sql` vs `glue_sql` chunks

The following rmarkdown document

---
title: "Untitled"
output: github_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(DBI)
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
mtcars$model <- rownames(mtcars)
DBI::dbWriteTable(con, "mtcars", mtcars)
```

```{r}
var <- "mpg"
tbl <- "mtcars"
num <- 150
```

```{sql, connection = con, output.var="mt_sql"}
SELECT `model`, `hp`, ?var
FROM ?tbl
WHERE ?tbl.`hp` > ?num
```

```{glue_sql, connection = con, output.var="mt_glue_sql"}
SELECT `model`, `hp`, {`var`}
FROM {`tbl`}
WHERE {`tbl`}.`hp` > {num}
```

renders as:

Untitled

var <- "mpg"
tbl <- "mtcars"
num <- 150
SELECT `model`, `hp`, ?var
FROM ?tbl
WHERE ?tbl.`hp` > ?num
SELECT `model`, `hp`, `mpg`
FROM `mtcars`
WHERE `mtcars`.`hp` > 150

Note that the sql chunk in the output still has ?var, ?tbl and ?num, whereas in the glue_sql chunk it's mpg, mtcars (both quoted) and 150, respectively. I don't know if that is a bug or a feature - let me know in case I should open a separate issue for that.

Sidenote: `glue_sql` chunks might be slightly easier to debug than `sql` at the moment

In general, glue_sql is somewhat easier to debug, because I could copy+paste the code from the chunk into the glue_sql() function and the the SQL code back. In practice, I might have to deal with quoted object names in the SQL code and therefore escape all double or single quotation marks.


By filing an issue to this repo, I promise that

  • I have fully read the issue guide at https://yihui.org/issue/.
  • I have provided the necessary information about my issue.
    • If I'm asking a question, I have already asked it on Stack Overflow or RStudio Community, waited for at least 24 hours, and included a link to my question there.
    • If I'm filing a bug report, I have included a minimal, self-contained, and reproducible example, and have also included xfun::session_info('knitr'). I have upgraded all my packages to their latest versions (e.g., R, RStudio, and R packages), and also tried the development version: remotes::install_github('yihui/knitr').
    • If I have posted the same issue elsewhere, I have also mentioned it in this issue.
  • I have learned the Github Markdown syntax, and formatted my issue correctly.

I understand that my issue may be closed if I don't fulfill my promises.

@cderv
Copy link
Collaborator

@cderv cderv commented May 24, 2022

What is the odd ! I was just looking into that and was answering on the community.

In fact, this has been done since the beginning (#1357) and you can set sql.show_interpolated = TRUE on the chunk

---
title: "Get SQL with evaluated variables from SQL chunks"
output: html_document
---
  
In order to provide a reproducible example, we use dplyr and dbplyr with `tbl_memdb` to load a table into a memory db.

```{r}
library(dplyr)
library(dbplyr)
iris_db <- tbl_memdb(iris)
# we isolate the connection object 
con <- src_memdb()$con
species <- "setosa"
```

We check that it works with SQL chunk to filter data into an R object

```{sql, connection=con, output.var="iris_subset", sql.show_interpolated = TRUE}
SELECT * FROM iris WHERE Species = ?species
```

```{r}
head(iris_subset)
```

image

Is this what you are looking for ?

@dpprdan
Copy link
Author

@dpprdan dpprdan commented May 24, 2022

Yes it is, thank you!!

One needs to knit() the document. If I don't want to send the SQL chunk to the DB when debugging, I can use the eval=FALSE chunk option (or error=TRUE if I want to ignore potential SQL errors).

Example

My initial sql / glue_sql example with sql.show_interpolated=TRUE and eval=FALSE. Note that I am using a wrong table name here (mtcarz), so this wouldn't render without eval=FALSE.

---
title: "Get SQL with evaluated variables from SQL chunks"
output: github_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(DBI)
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
mtcars$model <- rownames(mtcars)
DBI::dbWriteTable(con, "mtcars", mtcars)
```

```{r}
var <- "mpg"
tbl <- "mtcarz"
num <- 150
```

Use `sql.show_interpolated=TRUE` to evaluate variables in `sql` chunks in the output and `eval=FALSE` to not send the query to the DB (e.g. when debugging). 

```{sql connection=con, output.var="mt_sql", sql.show_interpolated=TRUE, eval=FALSE}
SELECT `model`, `hp`, ?var
FROM ?tbl
WHERE ?tbl.`hp` > ?num
```

`glue_sql` chunks always show SQL code with evaluated variables, so no need for an `sql.show_interpolated=TRUE` equivalent. 

```{glue_sql, connection = con, output.var="mt_glue_sql", eval=FALSE}
SELECT `model`, `hp`, {`var`}
FROM {`tbl`}
WHERE {`tbl`}.`hp` > {num}
```

Get SQL with evaluated variables from SQL chunks

var <- "mpg"
tbl <- "mtcarz"
num <- 150

Use sql.show_interpolated=TRUE to evaluate variables in sql chunks
in the output and eval=FALSE to not send the query to the DB
(e.g. when debugging).

SELECT `model`, `hp`, 'mpg'
FROM 'mtcarz'
WHERE 'mtcarz'.`hp` > 150

glue_sql chunks always show SQL code with evaluated variables, so no
need for an sql.show_interpolated=TRUE equivalent.

SELECT `model`, `hp`, `mpg`
FROM `mtcarz`
WHERE `mtcarz`.`hp` > 150

@dpprdan dpprdan closed this as completed May 24, 2022
@cderv
Copy link
Collaborator

@cderv cderv commented May 24, 2022

One needs to knit() the document.

Yes you need to knit the document as this is a knitr feature. So it means the interpolation is done during knitting. Using a chunk engine feature requires knitting.

You can also use standard R code chunk and use DBI package directly instead of the sql chunk engine if you prefer to have interactive execution available.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants