Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL Lab queries can't be stopped #17011

Open
2 of 3 tasks
ValentinC-BR opened this issue Oct 7, 2021 · 25 comments
Open
2 of 3 tasks

SQL Lab queries can't be stopped #17011

ValentinC-BR opened this issue Oct 7, 2021 · 25 comments
Assignees
Labels
#bug Bug report good first issue Good first issues for new contributors need:community-volunteer Requires a community volunteer sqllab Namespace | Anything related to the SQL Lab

Comments

@ValentinC-BR
Copy link

When a SQL query is running is SQL Lab, it can't be stopped.

How to reproduce the bug

  1. Go to SQL Lab
  2. Write a query and run it
  3. Click on STOP
  4. (Nothing happens)

Expected results

The query should be stopped

Actual results

The query is still running

Screenshots

(Imagine a query running in the SQL Lab...)

Environment

(please complete the following information):

  • browser type and version: Google Chrome Version 94.0.4606.71 (Oficial build) (x86_64)
  • superset version: 1.3.0
  • python version: 3.7.9
  • node.js version: doesn't apply, I run on Kubernetes, using gunicorn as server
  • source : AWS Athena
  • any feature flags active: /

Checklist

Make sure to follow these steps before submitting your issue - thank you!

  • I have checked the superset logs for python stacktraces and included it here as text if there are any.
  • I have reproduced the issue with at least the latest released version of superset.
  • I have checked the issue tracker for the same issue and I haven't found one similar.

Additional context

The queries could be stopped in Superset v1.1

@ValentinC-BR ValentinC-BR added the #bug Bug report label Oct 7, 2021
@junlincc junlincc added the sqllab Namespace | Anything related to the SQL Lab label Oct 8, 2021
@eschutho
Copy link
Member

Hi @ValentinC-BR, when you say nothing happens when you hit the stop button, if you look in your network tab, is your browser making the call to the stop_query endpoint and if so, what is the response?

@ValentinC-BR
Copy link
Author

I actually see it in the network tab but after 2 sec of "pending" status, I get a 500 error :
image

@eschutho
Copy link
Member

eschutho commented Oct 13, 2021

Ok, interesting. Do you have access to your logs to see what the error was? We were able to reproduce with Athena, and are looking more into this.

@quenchua
Copy link

I have noticed this as well.
Basically, the query gets successfully stopped as far as Superset is concerned.
But, if you check recent queries in the Athena UI, the query is still running.

@stale
Copy link

stale bot commented Apr 17, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. For admin, please label this issue .pinned to prevent stale bot from closing the issue.

@stale stale bot added the inactive Inactive for >= 30 days label Apr 17, 2022
@ValentinC-BR
Copy link
Author

Was it fixed in the latest release ?

@stale stale bot removed the inactive Inactive for >= 30 days label Apr 20, 2022
@mdeshmu
Copy link
Contributor

mdeshmu commented Jun 24, 2022

@eschutho I have also observed same behavior with Athena. I was not able to stop one of my Athena queries. On clicking stop button i get error "Failed at stopping query."
Timer doesn't stop in SQL Lab
image
I queried Superset meta database to find out query id.
image (1)
It stopped when i ran this SQL on Superset backend database.
UPDATE query SET status = 'failed' WHERE id = 97;

Here are few logs:
2022-06-21 14:13:32,655:INFO:backoff:Backing off stop_query(...) for 0.3s (superset.exceptions.SupersetCancelQueryException: Could not cancel query)
2022-06-21 14:13:32,948:INFO:backoff:Backing off stop_query(...) for 0.1s (superset.exceptions.SupersetCancelQueryException: Could not cancel query)
2022-06-21 14:13:33,092:INFO:backoff:Backing off stop_query(...) for 0.6s (superset.exceptions.SupersetCancelQueryException: Could not cancel query)
2022-06-21 14:13:33,668:INFO:backoff:Backing off stop_query(...) for 1.0s (superset.exceptions.SupersetCancelQueryException: Could not cancel query)
2022-06-21 14:13:34,649:ERROR:backoff:Giving up stop_query(...) after 5 tries (superset.exceptions.SupersetCancelQueryException: Could not cancel query)
x.x.x.x - - [21/Jun/2022:14:13:34 +0000] "POST /superset/stop_query/ HTTP/1.1" 422 35 "https://datainsights-dev.com/superset/sqllab/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.0.0 Safari/537.36"

@eschutho
Copy link
Member

Cancel query has only been implemented for mysql, postgres, and snowflake. Thus the error that you're seeing on Athena. It's a fairly simple integration needed where you would run that sql statement that you ran directly above. Here's an example for Mysql if someone is interested in contributing this fix: https://github.com/preset-io/superset/blob/02032ee8a478ab9578cf6e13a7645a8b4106be58/superset/db_engine_specs/mysql.py#L240

@eschutho eschutho added good first issue Good first issues for new contributors need:community-volunteer Requires a community volunteer labels Jun 24, 2022
@mdeshmu
Copy link
Contributor

mdeshmu commented Jun 27, 2022

@eschutho I ran query against Superset metadata databases's query table.
I believe we will have to run the cancel query against Athena to stop the actual query execution.
As per my understanding, Cancel query implementations for mysql, postgres, and snowflake run cancellation against respective database engines and not against Superset's metatdata database. Cancelling the Athena query is done via stop_query_execution which requires QueryExecutionId. PyAthena has a cancel method for this purpose.
We will have to store QueryExecutionId somewhere in Superset's metadata database so that it can be fetched later via get_cancel_query_id. I have raised this on pyathena github project to check if there is an easy way to QueryExecutionId.

@eschutho
Copy link
Member

@mdeshmu correct on all points. Ok, let's see if they have any answers for that lookup. Also I found this if it helps: https://github.com/laughingman7743/PyAthena/blob/master/pyathena/cursor.py#L65

@mdeshmu
Copy link
Contributor

mdeshmu commented Jun 29, 2022

@eschutho In case of Athena, we will have to get QueryExecutionId after running sql statements. So this code won't help us since it is run before the query execution. Looking at this reply, I understand that the cursor.execute method blocks until the query execution is finished. So probably we need some kind of multi-threading to capture QueryExecutionId.

@eschutho
Copy link
Member

eschutho commented Jun 29, 2022

Correct. The cancel query action is executed here so it will run in a separate thread. If you know how to get the QueryExecutionId, we'll need to add to the Athena db_engine_spec these two methods: get_cancel_query_id and cancel_query

@eschutho
Copy link
Member

Here's an example of a recent PR to add the cancel functionality to Redshift for reference: #16326

@mdeshmu
Copy link
Contributor

mdeshmu commented Jun 29, 2022

@eschutho What i meant is, To capture QueryExecutionId, we would need a separate thread utilizing the same cursor that is used for running sql statements from SQL Lab but current location of invocation of get_cancel_query_id wouldn't give us QueryExecutionId because its invoked before execution of SQL statements.

Even if we move this after execution of SQL statements then also its of no use as cursor would be blocked until the query execution is finished and get_cancel_query_id would be called after query execution has finished, hence of no use.

Please correct me if I am wrong.

@eschutho
Copy link
Member

eschutho commented Jul 6, 2022

So this is the abbreviated code for the whole flow of running a query to stopping it. It should work with Athena as long as we build out the two required methods cancel_query and get_cancel_query_id for Athena in the db engine spec:

with closing(engine.raw_connection()) as conn:
        # get cursor
        cursor = conn.cursor() 

        # fetch cancel query id from  cursor
        cancel_query_id = db_engine_spec.get_cancel_query_id(cursor, query)

       # save cursor in metadata on query model
         if cancel_query_id is not None:
            query.set_extra_json_key(cancel_query_key, cancel_query_id)
            session.commit()

       ...
       # execute statement (thread is blocked)
                result_set = execute_sql_statement(
                    statement,
                    query,
                    session,
                    cursor,
                    log_params,
                    apply_ctas,
                )

(separate thread after clicking the "STOP" button)

    @expose("/stop_query/", methods=["POST"])
    ...
    # fetch cancel_query_key from query model metadata
    cancel_query_id = query.extra.get(cancel_query_key)

    .....
    # pass the cancel_query_id to the db_engine_spec and cancel the query
            return query.database.db_engine_spec.cancel_query(
                cursor, query, cancel_query_id
            )

@mdeshmu
Copy link
Contributor

mdeshmu commented Jul 7, 2022

@eschutho In Athena, query id is not available immediately after initiation of cursor.

Athena

with closing(engine.raw_connection()) as conn:
        # get cursor
        cursor = conn.cursor() 

        # fetch cancel query id from  cursor
        cancel_query_id = db_engine_spec.get_cancel_query_id(cursor, query)  --> THIS WONT RETURN ANYTHING in case of Athena

       # save cursor in metadata on query model
         if cancel_query_id is not None:
            query.set_extra_json_key(cancel_query_key, cancel_query_id)
            session.commit()

       ...
       # execute statement (thread is blocked)
                result_set = execute_sql_statement(
                    statement,
                    query,
                    session,
                    cursor,
                    log_params,
                    apply_ctas,
                )                               --> WE WILL NEED A PARALLEL THREAD HERE TO call get_cancel_query_id here.

@jplanckeel
Copy link
Contributor

In pyAthena in version 1.11 we can cancel a query in cursor. Actually superset use pyAthena 1.10

@mdeshmu
Copy link
Contributor

mdeshmu commented Sep 1, 2022

@jplanckeel Superset works with latest PyAthena version. It seems you are referring to query cancellation on KeyboardInterrupt. Can you explain how it can be applied in this case?

@eschutho
Copy link
Member

eschutho commented Sep 7, 2022

Thanks for starting a PR @jplanckeel. We'll review or give feedback when ready.

@rusackas
Copy link
Member

Hey all... I know that some DBs made progress in this area, but I'm not sure about Athena. Since the PR is not moving forward, and this thread has been silent for about a year and a half, I wondered if y'all think we should keep this open, or let it go. In particular, since it's arguably a feature request as much as a bug ¯\_(ツ)_/¯

@rumbin
Copy link
Contributor

rumbin commented Feb 12, 2024

I'm not directly affected any longer, as we moved to Snowflake.
But I can confirm that this issue is still present in Superset 3.0.3.

I would definitely not call it a feature, since users expect the STOP button in SQLLab to work.
If Superset does not provide this functionality, the STOP button should at least be invisible or grayed out.

@ajunior
Copy link

ajunior commented Feb 15, 2024

Hey folks! Any news about this issue? It's still happening here with Athena.
I'm manually updating the query status to failed as a workaround, as mentioned by @mdeshmu almost two years ago.

@rusackas
Copy link
Member

@ajunior if you (or anyone reading this) want to pick up the abandoned PR, you may be able to fix it. We'd appreciate the contribution!

@vyasmanmohan
Copy link

Possible to assign me? I will have a look. @rusackas

@rusackas
Copy link
Member

@vyasmanmohan thanks for stepping up! I've assigned the issue to you, though I can't assign the PR. If you want me to assign that, too, you need to comment on that thread as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
#bug Bug report good first issue Good first issues for new contributors need:community-volunteer Requires a community volunteer sqllab Namespace | Anything related to the SQL Lab
Projects
None yet
Development

No branches or pull requests

11 participants