Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: dbListTables() lists materialized views #414

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

dpprdan
Copy link
Contributor

@dpprdan dpprdan commented Dec 15, 2022

This PR extends #413 and adds the queries of the system catalogs in order for dbList*() and dbExistsTable() to return information about materialized views in PostgreSQL. For background see #413 and #251.

In order to make the use of the system catalogs vis-a-vis information_schema.tables optional, I added a system_catalogs argument to dbConnect() (as suggested in #261). I set the default to TRUE (i.e. for regular PqConnections).

Redshift connections do not have the (explicit) argument, because there is (probably) no necessary extra information in it's system catalog views (i.e. above what's in information_schema). Since Redshift connections inherit from PqConnection, system_catalogs is set to FALSE for them.

I've added tests for materialized views, i.e. whether dbList*() and dbExistsTable() return information about matviews on the search path and on a custom schema.

closes #251

@dpprdan dpprdan marked this pull request as ready for review December 21, 2022 16:00
@krlmlr
Copy link
Member

krlmlr commented Mar 16, 2023

This now has conflicts.

@dpprdan dpprdan changed the title list materialized views in Postgres feat: list materialized views in Postgres Mar 16, 2023
Copy link
Contributor

aviator-app bot commented Nov 9, 2023

Current Aviator status

Aviator will automatically update this comment as the status of the PR changes.
Comment /aviator refresh to force Aviator to re-examine your PR (or learn about other /aviator commands).

This pull request is currently open (not queued).

How to merge

To merge this PR, comment /aviator merge or add the mergequeue label.


See the real-time status of this PR on the Aviator webapp.
Use the Aviator Chrome Extension to see the status of your PR within GitHub.

@krlmlr krlmlr force-pushed the feat/list_matviews branch 3 times, most recently from 0524d53 to 15b3cef Compare April 1, 2024 14:53
@krlmlr krlmlr changed the title feat: list materialized views in Postgres feat: dbListTables() lists materialized views Apr 1, 2024
Copy link
Member

@krlmlr krlmlr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I now disentangled and merged most preparatory PRs. Should we take another stab here?

@@ -14,11 +14,11 @@ dbListObjects_PqConnection_ANY <- function(conn, prefix = NULL, ...) {
}
query <- paste0(
"SELECT ", null_varchar, " AS schema, table_name AS table FROM ( \n",
list_tables(conn = conn, order_by = "table_type, table_name"),
list_tables(conn = conn, order_by = "table_type, table_name"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

styler won't like it. Maybe use braces or parens to enforce indent here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with not having an extra indent here and prefer that to extra braces or parentheses.

If I run styler::style_package() with tidyverse_style() it'll style a lot more files than those relevant for this PR. So I'll just run styler on the latter and leave the others for a separate PR?!

if (exists_table(conn, id)) {
# we know from exists_table() that id@name["table"] exists
# and the user has access priviledges
tname_str <- stats::na.omit(id@name[c("schema", "table")])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Id() now has unnamed components only. How to deal with this here?

Copy link
Contributor Author

@dpprdan dpprdan Apr 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have more places where we rely on named Id()s, I am afraid. Also in main, which are not caught by current tests. 😬
I can take a stab at r-dbi/DBItest#367 if you want? Either way, I'll try to amend the tests for matviews first and work my way up from there:

https://github.com/dpprdan/RPostgres/blob/a3dfc02dff54b84e944efe2b68c9e4e8cc447fd8/tests/testthat/test-list_matviews.R#L35-L40

Which name variants are (supposed to be) supported now?

Relatedly: Is r-dbi/DBItest#340 relevant for RPostgres?

@@ -60,7 +67,8 @@
dbConnect_PqDriver <- function(drv, dbname = NULL,
host = NULL, port = NULL, password = NULL, user = NULL, service = NULL, ...,
bigint = c("integer64", "integer", "numeric", "character"),
check_interrupts = FALSE, timezone = "UTC", timezone_out = NULL) {
check_interrupts = FALSE, system_catalogs = TRUE,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new argument name is not very clear, perhaps use something like meta = c("system", "information_schema") ?

I'm not in love with meta either, can you think of a better name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

meta seems quite broad and therefore not very clear as well, so how about meta_schema?

If we'd want to stick with a boolean, it could also be information_schema, but it's probably better to use a character argument to keep it extendable?

(Just documenting some related info:)
PostgreSQL calls the tables with the system information System Catalogs. There are also System Views, however, like pg_matviews. Both the Catalogs and the Views reside in the pg_catalog schema.

Redshift also has Materialized Views (out of scope for this PR). Their meta info resides (mostly) in ST_MV_INFO, one of the "system tables and views". The "System catalog tables", which seem to be legacy tables from the PostgreSQL v8 times, are a part of those as well.

Long story short: meta_schema = c("system", "information_schema") seems to be the most fitting argument to specify which schema should be queried for information on tables, (materialized) views and columns.

@dpprdan
Copy link
Contributor Author

dpprdan commented Apr 3, 2024

Thanks. I now disentangled and merged most preparatory PRs. Should we take another stab here?

🥳 I'll look into it, but again, it might take a couple of days until I'll get around to it.

Full history: 699813b

refactor `dbListTables()` with `list_tables()`, now orders result by `table_type` and `table_name`
refactor `dbExistsTable()` with `list_tables()`
refactor `dbListObjects()` with `list_tables()`
merge `find_table()` code into `list_fields()`
`find_table()` isn't used anywhere else anymore (e.g. `exists_table()`)
simplify the "get current_schemas() as table" code
pass full `id` to `list_fields()`
align `dbExistsTable()` with `dbListFields()`
add some comments and whitespace
simplify `where_schema` in `list_tables()`
align `where_table` with `where_schema` in `list_tables()`
add `system_catalogs` argument to `dbConnect()`
add materialized view tests
`list_tables()`: query system catalogs if available
`list_fields()`: query system catalogs if available
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

materialized views in dbListTables()
2 participants