Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix snowflake sometimes does not find tables #37620

Closed
wants to merge 9 commits into from

Conversation

qnkhuat
Copy link
Contributor

@qnkhuat qnkhuat commented Jan 12, 2024

Fixes #38135

Fix https://discourse.metabase.com/t/snowflake-broken-in-0-48-1/63849

We changed to use getTables JDBC method to sync snowflake in a recent PR
because I thought using JDBC will be more reliable and it'll be consistent across the drivers.

But it turns out it's not; Snowflake driver has a history of being buggy like this, and it's biting us again. In fact, we have this comment from 3 years ago,

;; using the JDBC `.getTables` method seems to be pretty buggy -- it works sometimes but other times randomly
, but unfortunately, it was removed by accident I think.

I added back a comment with the word "IMPORTANT" to it, hopefully it won't get deleted this time.

I can't reproduce the bug and so is Luiggi.

@qnkhuat qnkhuat requested a review from camsaul as a code owner January 12, 2024 02:25
@qnkhuat qnkhuat requested a review from a team January 12, 2024 02:25
@metabase-bot metabase-bot bot added the .Team/BackendComponents also known as BEC label Jan 12, 2024
@qnkhuat qnkhuat added the backport Automatically create PR on current release branch on merge label Jan 12, 2024
Copy link

replay-io bot commented Jan 12, 2024

StatusIn Progress ↗︎ 51 / 52
Commit9e41f88
Results
2225 Passed

@qnkhuat qnkhuat added this to the 0.48.4 milestone Jan 12, 2024
@calherries
Copy link
Contributor

In theory this should fix the bug, but doesn't this imply we should eliminate usage of getTables everywhere it is used with snowflake e.g. here? This PR only changes one place where getTables is used.

           ;; IMPORTANT: using the JDBC `.getTables` method is buggy -- it works sometimes but other times randomly
           ;; we have tried to use it but people reported that some schemas disappear on their instances.
           ;; See https://discourse.metabase.com/t/snowflake-broken-in-0-48-1/63849

(first (jdbc/query {:connection conn}
(show-command-sql
"DYNAMIC TABLES"
:database db-name-or-nil :schema schema-name :like table-name))))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the implemenation of show-command-sql, providing :database and :schema should be mutually exclusive with each other. If a :schema is provided, then :database should not be provided and vice versa. If both are provided it would compile to SHOW DYNAMIC TABLES IN DATABASE <database> SCHEMA <schema> which is invalid. I think that should be in the docstring or preconditions of show-command-sql too, to prevent misuse.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updatd to make it works for both cases

@qnkhuat qnkhuat mentioned this pull request Jan 16, 2024
Copy link
Contributor

@calherries calherries left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

@qnkhuat qnkhuat enabled auto-merge (squash) January 17, 2024 01:04
@qnkhuat qnkhuat disabled auto-merge January 17, 2024 03:06
@qnkhuat
Copy link
Contributor Author

qnkhuat commented Jan 17, 2024

On second thought, let's see if this actually solve people's problem in https://discourse.metabase.com/t/snowflake-broken-in-0-48-1/63849

@paoliniluis would you bind sending them the latest build?

@paoliniluis
Copy link
Contributor

sure! thanks @qnkhuat!

@paoliniluis
Copy link
Contributor

@qnkhuat when I test with the release branch I get schemas and tables as "show objects..." returns fine and then the "select true as ..." query also returns fine.

When I test this branch, the "show objects..." returns fine but the "select true as..." queries fail with "Cannot perform SELECT. This session does not have a current database. Call 'USE DATABASE', or use a qualified name."

same user, same permissions. The user and password is saved on our password manager as metabasesuccess

Copy link
Contributor

@paoliniluis paoliniluis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just tested and does not return active tables

@@ -406,28 +405,62 @@
(sql-jdbc/query driver database {:select [:*]
:from [[(qp.store/with-metadata-provider (u/the-id database)
(sql.qp/->honeysql driver table))]]}))
(defn- show-command-sql
;; IMPORTANT: prefer using this command over using the JDBC `.getTables` method because the jdbc is buggy
;; it works sometimes but other times randomly.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
;; it works sometimes but other times randomly.
;; it works sometimes but other times randomly does not.

;; we have tried to use it but people reported that some schemas disappear on their instances.
;; See https://discourse.metabase.com/t/snowflake-broken-in-0-48-1/63849
[object-type & {:keys [like database schema]}]
;; in case the name is escaped for schema using escape-name-for-metadata, we undo that just to be sure
Copy link
Member

@camsaul camsaul Jan 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if you really need to unescape stuff here, since you're doing a custom implementation maybe you can just not escape it in the first place? It's always better to not escape stuff in the first place than to escape it and unescape it, less moving parts and less ways to accidentally do something wrong

;; it works sometimes but other times randomly.
;; we have tried to use it but people reported that some schemas disappear on their instances.
;; See https://discourse.metabase.com/t/snowflake-broken-in-0-48-1/63849
[object-type & {:keys [like database schema]}]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making this an mu/defn and giving object-type and :enum schema would make it a lot clearer for people using this what kinds of things it supports. A docstring with an example usage would be helpful too

@@ -420,3 +420,15 @@
result-comment (second (re-find #"-- (\{.*\})" result-query))
result-map (json/read-str result-comment)]
(is (= expected-map result-map))))))

(deftest show-command-sql-test
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(deftest show-command-sql-test
(deftest ^:parallel show-command-sql-test

Copy link
Member

@camsaul camsaul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks really good to me, but I think we need to be super careful around escaping stuff and keep in mind that Snowflake identifiers are allowed to contain wacky characters https://docs.snowflake.com/en/sql-reference/identifiers-syntax#double-quoted-identifiers

I would be more comfortable if we just made sure we never called escape-name-for-metadata in the first place (which should be easy enough since the one place that uses it is in the default implementation of describe-database that we're replacing, and then made sure we're properly escaping quotes in that show-command-sql function.

;; we have tried to use it but people reported that some schemas disappear on their instances.
;; See https://discourse.metabase.com/t/snowflake-broken-in-0-48-1/63849
[object-type & {:keys [like database schema]}]
;; in case the name is escaped for schema using escape-name-for-metadata, we undo that just to be sure
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems dangerous because it seems like technically a Snowflake identifier can contain slashes and underscores, like it's totally valid to have a table named My table\_2. It seems like things are going to break in that situation.

https://docs.snowflake.com/en/sql-reference/identifiers-syntax

" "
(cond-> [(format "SHOW %s" object-type)]
(some? like)
(conj (format "LIKE '%s'" like))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably need to escape like, because AFAIK Snowflake table names are allowed to contain single quotes https://docs.snowflake.com/en/sql-reference/identifiers-syntax#double-quoted-identifiers

If you name your table something with a quote mark in it you're asking for trouble, but we should try to be careful and not explode even if people do dumb things.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually for that matter basically all of the identifiers here... are we sure that Snowflake doesn't allow double quotes in identifier names? Do we need to be escaping those?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh actually the dox do say you can create a Table with double quotes in its name

To use the double quote character inside a quoted identifier, use two quotes. For example:

create table "quote""andunquote""" ...

creates a table named:

quote"andunquote"

(and database schema)
(conj (format "%s" (str/join "." (remove nil? [(quote-and-unescape database) (quote-and-unescape schema)]))))

(and (some? database) (nil? schema))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(and (some? database) (nil? schema))
database

You're just checking basic truthiness in the previous condition, not (and (some? database) (some? schema)), so it's weird and risky to do the related checks using different logic. No need to make this more complicated than it needs to be

(conj "IN")

(and database schema)
(conj (format "%s" (str/join "." (remove nil? [(quote-and-unescape database) (quote-and-unescape schema)]))))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(conj (format "%s" (str/join "." (remove nil? [(quote-and-unescape database) (quote-and-unescape schema)]))))
(conj (str/join "." (remove nil? [(quote-and-unescape database) (quote-and-unescape schema)])))

(format "%s") doesn't really do anything

(and (some? database) (nil? schema))
(conj (format "DATABASE %s" (quote-and-unescape database)))

(and (nil? database) (some? schema))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(and (nil? database) (some? schema))
schema

@paoliniluis
Copy link
Contributor

@camsaul what do you think about sending schema.table instead of database.schema.table when we do the simple-select-probe-query on sync?

@qnkhuat
Copy link
Contributor Author

qnkhuat commented Jan 25, 2024

Thanks for the review, but I'll wait to see if this will solve people's issues before proceeding further.

@paoliniluis
Copy link
Contributor

#38135

@camsaul camsaul removed this from the 0.48.4 milestone Jan 26, 2024
@qnkhuat
Copy link
Contributor Author

qnkhuat commented Jan 26, 2024

Closing in favor of https://github.com/metabase/metabase/pull/38160/files

@qnkhuat qnkhuat closed this Jan 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport Automatically create PR on current release branch on merge .Team/BackendComponents also known as BEC
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Metabase won't see schemas other than public or the default namespace of the user in snowflake
4 participants