[get_df] Adding support for multi-statement SQL #5060

Merged


john-bodley
Member

I'm working on an Airbnb project which leverages a custom SQLAlchemy dialect and executes multiple statements, as it needs to create a series of temporary views before issuing the final select.

Many engines don't support multiple statements, i.e., one cannot provide a single string containing multiple statements separated by the ; character. The only viable option therefore seems to be to mutate get_df to execute each statement separately using the same connection. Only the results from the last statement are fetched.
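
For readers who want to see the idea in isolation, here is a minimal sketch of it. It is not the code from this PR: it uses the standard-library `sqlite3` module instead of a SQLAlchemy engine, and the helper names `split_statements` and `run_multi_statement` are hypothetical. The split on `;` is deliberately naive and assumes no statement contains a semicolon inside a string literal or comment.

```python
import sqlite3


def split_statements(sql):
    """Naively split a SQL string on ';' and drop empty fragments.

    Assumption: no ';' appears inside string literals or comments.
    """
    return [s.strip() for s in sql.strip().split(";") if s.strip()]


def run_multi_statement(conn, sql):
    """Run every statement on the same connection; fetch only the last result."""
    statements = split_statements(sql)
    if not statements:
        raise ValueError("no SQL statements to execute")

    cursor = conn.cursor()
    # Setup statements (e.g. temporary views) are executed for their side effects.
    for statement in statements[:-1]:
        cursor.execute(statement)

    # Only the final statement's rows are returned to the caller.
    cursor.execute(statements[-1])
    columns = [d[0] for d in cursor.description] if cursor.description else []
    return columns, cursor.fetchall()


# Temporary views only exist on the connection that created them,
# which is why every statement must share one connection.
conn = sqlite3.connect(":memory:")
columns, rows = run_multi_statement(
    conn,
    """
    CREATE TEMP VIEW small AS SELECT 1 AS n UNION ALL SELECT 2;
    CREATE TEMP VIEW doubled AS SELECT n * 2 AS n2 FROM small;
    SELECT n2 FROM doubled ORDER BY n2;
    """,
)
```

In this sketch the two `CREATE TEMP VIEW` statements produce no result set, and the caller only ever sees the rows of the final `SELECT`.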

@mistercrunch I'm not sure how you feel about this change, as it does not directly benefit any existing Superset use cases. I'm definitely open to other suggestions, though I sense this is fairly benign.

to: @GabeLoins @michellethomas @mistercrunch

@codecov-io

Codecov Report

Merging #5060 into master will increase coverage by <.01%.
The diff coverage is 100%.


@@            Coverage Diff             @@
##           master    #5060      +/-   ##
==========================================
+ Coverage   77.52%   77.52%   +<.01%     
==========================================
  Files          44       44              
  Lines        8707     8709       +2     
==========================================
+ Hits         6750     6752       +2     
  Misses       1957     1957

| Impacted Files | Coverage Δ |
| --- | --- |
| superset/models/core.py | 86.52% <100%> (+0.04%) ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 4c44223...794f4ae.

Member

@mistercrunch mistercrunch left a comment


We're also looking at allowing multiple statements in SQL Lab soon, and will probably do this using sqlparse, as any SQL parsing is somewhat brittle, especially when allowing free-form SQL.
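
To illustrate the brittleness being described, here is a small sketch (an illustration only, not code from Superset). The helper `naive_split` is hypothetical. The `sqlparse.split` call is the library's statement splitter; the import is guarded so the snippet still runs if `sqlparse` is not installed.

```python
def naive_split(sql):
    """Split on every ';', ignoring SQL quoting rules entirely."""
    return [s.strip() for s in sql.split(";") if s.strip()]


# Two statements, but the first contains a ';' inside a string literal.
sql = "SELECT 'a;b' AS s; SELECT 1 AS n"

# The naive approach cuts the string literal in half and yields three fragments.
naive = naive_split(sql)

try:
    import sqlparse  # third-party; optional for this sketch
except ImportError:
    sqlparse = None

# A tokenizing splitter is aware of string literals when finding statement ends.
parsed = list(sqlparse.split(sql)) if sqlparse is not None else None
```

Here `naive` contains three fragments, none of which is a valid statement on its own, which is the kind of failure that motivates a real parser for free-form SQL.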

We'll also probably at that time remove the reliance on pd.read_sql_query as it doesn't allow us the level of control we need for SQL Lab, and we'd like to have a common SQL execution path for both SQL Lab and Explore.

@betodealmeida FYI

@john-bodley john-bodley merged commit 17d6464 into apache:master May 23, 2018
@john-bodley john-bodley deleted the john-bodley-get-df-multi-statement branch May 23, 2018 18:40
john-bodley added a commit that referenced this pull request May 24, 2018
timifasubaa pushed a commit to timifasubaa/incubator-superset that referenced this pull request May 31, 2018
timifasubaa pushed a commit to airbnb/superset-fork that referenced this pull request Jul 25, 2018
wenchma pushed a commit to wenchma/incubator-superset that referenced this pull request Nov 16, 2018
@mistercrunch mistercrunch added 🏷️ bot (a label used by `supersetbot` to keep track of which PRs were auto-tagged with release labels) and 🚢 0.26.0 labels Feb 27, 2024