Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[sql lab] option to disable cross schema search #4551

Merged
merged 2 commits into from
Mar 9, 2018

Conversation

mistercrunch
Copy link
Member

This is killing our metastore as people type it emits large
all-table-dump as they hit the keystroke. It never returns as it times
out and hammers the poor metastore.

Also some improvements around the disabling the table select on the left
panel and having the table name not be sticky.

@mistercrunch mistercrunch changed the title [sql lab] disable cross schema search [sql lab] option to disable cross schema search Mar 6, 2018
@codecov-io
Copy link

codecov-io commented Mar 7, 2018

Codecov Report

Merging #4551 into master will decrease coverage by 0.01%.
The diff coverage is 88.88%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4551      +/-   ##
==========================================
- Coverage   71.17%   71.16%   -0.02%     
==========================================
  Files         187      187              
  Lines       14809    14824      +15     
  Branches     1085     1086       +1     
==========================================
+ Hits        10540    10549       +9     
- Misses       4266     4272       +6     
  Partials        3        3
Impacted Files Coverage Δ
superset/views/core.py 71.2% <ø> (ø) ⬆️
...assets/javascripts/SqlLab/components/SqlEditor.jsx 74.46% <ø> (ø) ⬆️
superset/models/core.py 86.7% <100%> (-0.62%) ⬇️
...javascripts/SqlLab/components/SqlEditorLeftBar.jsx 92.45% <83.33%> (-0.55%) ⬇️
superset/sql_lab.py 74.21% <0%> (-0.13%) ⬇️
superset/config.py 92.18% <0%> (+0.06%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c85eea3...e78d788. Read the comment docs.

@@ -256,6 +258,10 @@ class DatabaseView(SupersetModelView, DeleteMixin, YamlExportMixin): # noqa
'If Hive and hive.server2.enable.doAs is enabled, will run the queries as '
'service account, but impersonate the currently logged on user '
'via hive.server2.proxy.user property.'),
'allow_multi_schema_metadata_fetch': _(
'Allow SQL Lab to fetch a list of all table and all views across '
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: list of all tables

@hughhhh
Copy link
Member

hughhhh commented Mar 9, 2018

🚢

This is killing our metastore as people type it emits large
all-table-dump as they hit the keystroke. It never returns as it times
out and hammers the poor metastore.

Also some improvements around the disabling the table select on the left
panel and having the table name not be sticky.
@mistercrunch mistercrunch merged commit d522292 into apache:master Mar 9, 2018
@mistercrunch mistercrunch deleted the disable_xschema_search branch March 9, 2018 22:54
mistercrunch added a commit to lyft/incubator-superset that referenced this pull request Mar 9, 2018
* [sql lab] disable cross schema search

This is killing our metastore as people type it emits large
all-table-dump as they hit the keystroke. It never returns as it times
out and hammers the poor metastore.

Also some improvements around the disabling the table select on the left
panel and having the table name not be sticky.

* typo

(cherry picked from commit d522292)
john-bodley pushed a commit to john-bodley/superset that referenced this pull request Mar 27, 2018
* [sql lab] disable cross schema search

This is killing our metastore as people type it emits large
all-table-dump as they hit the keystroke. It never returns as it times
out and hammers the poor metastore.

Also some improvements around the disabling the table select on the left
panel and having the table name not be sticky.

* typo

(cherry picked from commit d522292)
@john-bodley
Copy link
Member

@mistercrunch it seems after the Alembic upgrade the allow_multi_schema_metadata_fetch is NULL rather than True (which ensures backwards compatibility). I think you need to include logic of the form,

from sqlalchemy.sql import table, column

dbs = table('dbs', column('allow_multi_schema_metadata_fetch'))       
op.execute(dbs.update().values(allow_multi_schema_metadata_fetch=True))  

@mistercrunch
Copy link
Member Author

@john-bodley do you think I should write a db migration script to handle this?

@john-bodley
Copy link
Member

Because it’s a fairly recent migration how do you feel about augmenting the existing migration? Alternatively you could make the default in the code False (thus making it a breaking change) which evaluates the same as None (NULL).

I think for us, having this disabled is preferred.

@john-bodley
Copy link
Member

john-bodley commented Mar 29, 2018

@mistercrunch my other concern is if one sets allow_multi_schema_metadata_fetch to False or None then in essence you're disabling the caching logic provided by the update_datasources_cache command which caches the tables associated with every schema per here.

@john-bodley
Copy link
Member

Sorry @mistercrunch I'm mistaken, the update_datasource_cache and the table caching logic is only used if allow_multi_schema_metadata_fetch is True, it calls the /tables API endpoint with schema as NULL.

@mistercrunch
Copy link
Member Author

mistercrunch commented Apr 2, 2018

If caching is off it's pretty brutal on the poor database as it hits the endpoint multiple times as users type. A confused user trying to get this to work could results in dozens of stuck calls that scan the metadata tables. On large databases it's not great.

Given that, False is a better default here, I was just trying to avoid breaking changes.

Also if you haven't run that migration, I suggest you schedule downtime for this one. It locked up on our side...

@john-bodley
Copy link
Member

@mistercrunch this migration didn't cause a lock on our end. I wonder whether there's also merit in caching when someone chooses a schema. I think there's plan for us to analyze the Presto logs to see the frequency of SHOW SCHEMAS and SHOW TABLES queries and whether there is reason to alleviate additional load on the database.

michellethomas pushed a commit to michellethomas/panoramix that referenced this pull request May 24, 2018
* [sql lab] disable cross schema search

This is killing our metastore as people type it emits large
all-table-dump as they hit the keystroke. It never returns as it times
out and hammers the poor metastore.

Also some improvements around the disabling the table select on the left
panel and having the table name not be sticky.

* typo
wenchma pushed a commit to wenchma/incubator-superset that referenced this pull request Nov 16, 2018
* [sql lab] disable cross schema search

This is killing our metastore as people type it emits large
all-table-dump as they hit the keystroke. It never returns as it times
out and hammers the poor metastore.

Also some improvements around the disabling the table select on the left
panel and having the table name not be sticky.

* typo
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.24.0 labels Feb 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.24.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants