Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dbListObjects() does not list empty schemata #388

Open
dpprdan opened this issue Dec 30, 2021 · 4 comments
Open

dbListObjects() does not list empty schemata #388

dpprdan opened this issue Dec 30, 2021 · 4 comments

Comments

@dpprdan
Copy link
Contributor

dpprdan commented Dec 30, 2021

dbListObjects() does not list empty schemata, i.e. a schema without a table.

library(RPostgres)
con <- postgresDefault()
dbExecute(con, paste0('CREATE SCHEMA "Robert"'))
#> [1] 0
dbListObjects(con)
#>                              table is_prefix
#> 1 <Id> schema = information_schema      TRUE
#> 2         <Id> schema = pg_catalog      TRUE

The schema is listed in information_schema.schemata or pg_catalog.pg_namespace (the (empty) public schema is also not listed by dbListObjects() BTW)

dbGetQuery(con, 'SELECT schema_name FROM information_schema.schemata;')
#>          schema_name
#> 1           pg_toast
#> 2         pg_catalog
#> 3             public
#> 4 information_schema
#> 5             Robert
dbGetQuery(con, 'SELECT * FROM pg_catalog.pg_namespace;')
#>     oid            nspname nspowner                              nspacl
#> 1    99           pg_toast       10                                <NA>
#> 2    11         pg_catalog       10  {postgres=UC/postgres,=U/postgres}
#> 3  2200             public       10 {postgres=UC/postgres,=UC/postgres}
#> 4 13111 information_schema       10  {postgres=UC/postgres,=U/postgres}
#> 5 16417             Robert       10                                <NA>

The schema is also listed by dbListObjects() after adding a table

table_qid <- dbQuoteIdentifier(con, Id(schema = 'Robert', table = "mtcars"))
dbWriteTable(con, table_qid, mtcars)
dbListObjects(con)
#>                              table is_prefix
#> 1 <Id> schema = information_schema      TRUE
#> 2             <Id> schema = Robert      TRUE
#> 3         <Id> schema = pg_catalog      TRUE

# clean-up
dbExecute(con, paste0('DROP TABLE ', table_qid))
#> [1] 0
dbExecute(con, paste0('DROP SCHEMA "Robert"'))
#> [1] 0
dbDisconnect(con)

The reason for this is that dbListObjects() uses information_schema.tables to look up tables and schemata, but naturally the information_schema.tables view does not list schemata if they do not contain a table.

Session info
sessioninfo::session_info()
#> - Session info ---------------------------------------------------------------
#>  setting  value
#>  version  R version 4.1.2 (2021-11-01)
#>  os       Windows 10 x64 (build 19043)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language en
#>  collate  German_Germany.1252
#>  ctype    German_Germany.1252
#>  tz       Europe/Berlin
#>  date     2021-12-30
#>  pandoc   2.16.2 @ C:/PROGRA~1/Pandoc/ (via rmarkdown)
#> 
#> - Packages -------------------------------------------------------------------
#>  package     * version date (UTC) lib source
#>  backports     1.4.1   2021-12-13 [1] CRAN (R 4.1.2)
#>  bit           4.0.4   2020-08-04 [1] CRAN (R 4.1.0)
#>  bit64         4.0.5   2020-08-30 [1] CRAN (R 4.1.0)
#>  blob          1.2.2   2021-07-23 [1] CRAN (R 4.1.0)
#>  cli           3.1.0   2021-10-27 [1] CRAN (R 4.1.1)
#>  crayon        1.4.2   2021-10-29 [1] CRAN (R 4.1.1)
#>  DBI           1.1.2   2021-12-20 [1] CRAN (R 4.1.2)
#>  digest        0.6.29  2021-12-01 [1] CRAN (R 4.1.2)
#>  ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.1.0)
#>  evaluate      0.14    2019-05-28 [1] CRAN (R 4.1.0)
#>  fansi         0.5.0   2021-05-25 [1] CRAN (R 4.1.0)
#>  fastmap       1.1.0   2021-01-25 [1] CRAN (R 4.1.0)
#>  fs            1.5.2   2021-12-08 [1] CRAN (R 4.1.2)
#>  generics      0.1.1   2021-10-25 [1] CRAN (R 4.1.1)
#>  glue          1.6.0   2021-12-17 [1] CRAN (R 4.1.2)
#>  highr         0.9     2021-04-16 [1] CRAN (R 4.1.0)
#>  hms           1.1.1   2021-09-26 [1] CRAN (R 4.1.1)
#>  htmltools     0.5.2   2021-08-25 [1] CRAN (R 4.1.1)
#>  knitr         1.37    2021-12-16 [1] CRAN (R 4.1.2)
#>  lifecycle     1.0.1   2021-09-24 [1] CRAN (R 4.1.1)
#>  lubridate     1.8.0   2021-10-07 [1] CRAN (R 4.1.1)
#>  magrittr      2.0.1   2020-11-17 [1] CRAN (R 4.1.0)
#>  pillar        1.6.4   2021-10-18 [1] CRAN (R 4.1.1)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.1.0)
#>  purrr         0.3.4   2020-04-17 [1] CRAN (R 4.1.0)
#>  R.cache       0.15.0  2021-04-30 [1] CRAN (R 4.1.0)
#>  R.methodsS3   1.8.1   2020-08-26 [1] CRAN (R 4.1.0)
#>  R.oo          1.24.0  2020-08-26 [1] CRAN (R 4.1.0)
#>  R.utils       2.11.0  2021-09-26 [1] CRAN (R 4.1.1)
#>  Rcpp          1.0.7   2021-07-07 [1] CRAN (R 4.1.0)
#>  reprex        2.0.1   2021-08-05 [1] CRAN (R 4.1.0)
#>  rlang         0.4.12  2021-10-18 [1] CRAN (R 4.1.1)
#>  rmarkdown     2.11    2021-09-14 [1] CRAN (R 4.1.1)
#>  RPostgres   * 1.4.3   2021-12-20 [1] CRAN (R 4.1.2)
#>  rstudioapi    0.13    2020-11-12 [1] CRAN (R 4.1.0)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.1.2)
#>  stringi       1.7.6   2021-11-29 [1] CRAN (R 4.1.2)
#>  stringr       1.4.0   2019-02-10 [1] CRAN (R 4.1.0)
#>  styler        1.6.2   2021-09-23 [1] CRAN (R 4.1.1)
#>  tibble        3.1.6   2021-11-07 [1] CRAN (R 4.1.2)
#>  utf8          1.2.2   2021-07-24 [1] CRAN (R 4.1.0)
#>  vctrs         0.3.8   2021-04-29 [1] CRAN (R 4.1.0)
#>  withr         2.4.3   2021-11-30 [1] CRAN (R 4.1.2)
#>  xfun          0.29    2021-12-14 [1] CRAN (R 4.1.2)
#>  yaml          2.2.1   2020-02-01 [1] CRAN (R 4.1.0)
#> 
#>  [1] C:/Users/Daniel.AK-HAMBURG/Documents/R/win-library/4.1
#>  [2] C:/Program Files/R/R-4.1.2/library
#> 
#> ------------------------------------------------------------------------------
@krlmlr
Copy link
Member

krlmlr commented Jan 3, 2022

Thanks. I wonder if it's worth fixing here.

In dm I'm working on an abstraction of INFORMATION_SCHEMA that works across data sources: cynkra/dm#517. I need to sit down and put the bits and pieces together. Feedback welcome.

@dpprdan
Copy link
Contributor Author

dpprdan commented Jan 3, 2022

I wonder if it's worth fixing here.

Agree, the practical relevance is rather limited (though not entirely absent e.g. like in my reprex: creating a schema, then checking whether it's there). Maybe I can tackle this together with #251 / #261.

In dm I'm working on an abstraction of INFORMATION_SCHEMA that works across data sources: cynkra/dm#517. I need to sit down and put the bits and pieces together. Feedback welcome.

This certainly looks interesting, but I'd have to dig in deeper to add something meaningful and cannot promise if/when that'll happen.

I'll just state the (now) obvious, that INFORMATION_SCHEMA may not contain engine-specific extensions (e.g. materialized views in postgres, cf. #251), though I am not sure in how relevant this is for what you have in mind there.

@dpprdan
Copy link
Contributor Author

dpprdan commented Nov 24, 2022

Note to self: We need to filter pg_toast* and pg_temp* schemata, (should we want to address this).

Details From an "empty" DB after running the RPostgres tests (i.e. adding and removing some tables).
> dbListObjects(pg_con)
                             table is_prefix
1 <Id> schema = information_schema      TRUE
2         <Id> schema = pg_catalog      TRUE

> dbGetQuery(pg_con, 'SELECT * FROM pg_catalog.pg_namespace;')
     oid            nspname nspowner                                                        nspacl
1     99           pg_toast       10                                                          <NA>
2     11         pg_catalog       10                            {postgres=UC/postgres,=U/postgres}
3   2200             public     6171 {pg_database_owner=UC/pg_database_owner,=U/pg_database_owner}
4  13186 information_schema       10                            {postgres=UC/postgres,=U/postgres}
5  16388          pg_temp_4       10                                                          <NA>
6  16389    pg_toast_temp_4       10                                                          <NA>
7  16395          pg_temp_5       10                                                          <NA>
8  16396    pg_toast_temp_5       10                                                          <NA>
9  18767         pg_temp_10       10                                                          <NA>
10 18768   pg_toast_temp_10       10                                                          <NA>
11 18824          pg_temp_7       10                                                          <NA>
12 18825    pg_toast_temp_7       10                                                          <NA>
13 20812          pg_temp_9       10                                                          <NA>
14 20813    pg_toast_temp_9       10                                                          <NA>
15 27016          pg_temp_3       10                                                          <NA>
16 27017    pg_toast_temp_3       10                                                          <NA>
17 27024             Robert       10                                                          <NA>
> dbGetQuery(pg_con, 'SELECT schema_name FROM information_schema.schemata;')
          schema_name
1              public
2              Robert
3     pg_toast_temp_3
4           pg_temp_3
5     pg_toast_temp_9
6           pg_temp_9
7     pg_toast_temp_7
8           pg_temp_7
9    pg_toast_temp_10
10         pg_temp_10
11    pg_toast_temp_5
12          pg_temp_5
13    pg_toast_temp_4
14          pg_temp_4
15 information_schema
16         pg_catalog
17           pg_toast

@dpprdan
Copy link
Contributor Author

dpprdan commented Nov 27, 2022

Then again, we may need to list the pg_temp_* schema(ta?) that contain user-created temporary tables (dbListObjects() does currently does).

We cannot tell from information_schema.schemata which one that is (AFAIK). So we'd probably need to do something like this:

SELECT schema_name AS schema, NULL::varchar AS table 
FROM information_schema.schemata
WHERE schema_name NOT LIKE 'pg_toast%' AND schema_name NOT LIKE 'pg_temp%'

UNION

SELECT DISTINCT table_schema AS schema, NULL::varchar AS table 
FROM (SELECT table_schema, table_name FROM information_schema.tables) as schema_query

When using the system catalogs, we can get the temp schema with also with:

SELECT nspname
FROM   pg_namespace
WHERE  oid = pg_my_temp_schema();

The question remains, whether returning the temp schema is really necessary/useful, see https://dba.stackexchange.com/a/76516.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants