[get_df] Updating multi-statement logic #5517

Merged

@john-bodley john-bodley commented Jul 29, 2018

This PR fixes an issue we've been running into with multi-statements in Presto, which is asynchronous. When using the cursor associated with a raw connection to execute multiple statements, i.e.,

cursor = engine.raw_connection().cursor()
cursor.execute(...)
cursor.execute(...)

Presto wouldn't poll to check that the first statement had finished before executing the second. Some DB-API implementations have a nextset method (though this isn't universal), but from reading through code snippets it seems the best way to ensure that multi-statements execute sequentially for all engine types is to simply call fetchall on the cursor, which waits until the statement is done, i.e., here's the logic for Presto.
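The fetchall approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: it uses the standard-library sqlite3 driver in place of Presto so that it runs anywhere, and the helper names run_statements and demo are hypothetical.

```python
import sqlite3
from contextlib import closing


def run_statements(cursor, statements):
    """Execute statements in order and return the rows of the last one.

    Hypothetical helper: calling fetchall() after each intermediate
    statement drains its result, which for an asynchronous driver means
    waiting until that statement has finished before the next one starts.
    """
    if not statements:
        return []
    for statement in statements[:-1]:
        cursor.execute(statement)
        cursor.fetchall()  # block until this statement is done
    cursor.execute(statements[-1])
    return cursor.fetchall()


def demo():
    # sqlite3 is synchronous; it stands in for Presto here only so the
    # sketch is runnable without an external database.
    with closing(sqlite3.connect(":memory:")) as conn:
        with closing(conn.cursor()) as cursor:
            return run_statements(cursor, [
                "CREATE TABLE t (x INTEGER)",
                "INSERT INTO t VALUES (1), (2)",
                "SELECT SUM(x) FROM t",
            ])
```

Only the final statement's rows are returned; the intermediate fetchall calls exist purely to force each statement to complete in order.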

The other option is to call db_engine_spec.handle_cursor after every statement (which is what SQL Lab does), which polls when necessary; however, this logic is very much configured for SQL Lab queries.

Note that when using the sqlalchemy.engine.base.Connection (without a cursor) the following works. I believe the reason is that execute returns a result set, which enforces synchronicity:

conn = engine.connect()
conn.execute(...)
conn.execute(...)

However, this approach isn't viable as we need to use the cursor to execute queries via db_engine_spec.execute.

to: @graceguo-supercat @michellethomas @mistercrunch @timifasubaa

@codecov-io

Codecov Report

Merging #5517 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #5517      +/-   ##
==========================================
+ Coverage   63.28%   63.28%   +<.01%     
==========================================
  Files         349      349              
  Lines       22121    22123       +2     
  Branches     2457     2457              
==========================================
+ Hits        13999    14001       +2     
  Misses       8108     8108              
  Partials       14       14
Impacted Files | Coverage Δ
superset/models/core.py | 87% <100%> (+0.04%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 3b6cafc...d3a1de6.

@michellethomas michellethomas left a comment

lgtm

@john-bodley john-bodley merged commit e7d0512 into apache:master Jul 31, 2018
@john-bodley john-bodley deleted the john-bodley-get-df-multistatement branch July 31, 2018 21:52
john-bodley added a commit to john-bodley/superset that referenced this pull request Jul 31, 2018
wenchma pushed a commit to wenchma/incubator-superset that referenced this pull request Nov 16, 2018
@villebro villebro mentioned this pull request May 15, 2019
@mistercrunch mistercrunch added the 🏷️ bot and 🚢 0.28.0 labels Feb 27, 2024