Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: trino thread app missing full context #29981

Conversation

dpgaspar
Copy link
Member

@dpgaspar dpgaspar commented Aug 21, 2024

SUMMARY

Fixes the issue where Trino SQL execution was missing Flask's g values in the context.
Flask's requests and g are local to the request thread or greenlet, by creating a new thread we loose this context.

This PR passes a copy of g into the thread, g is then recreated inside the thread, this g will be local to the thread but will contain all the necessary context of the current flask request

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

Copy link

codecov bot commented Aug 21, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.64%. Comparing base (76d897e) to head (6ed299d).
Report is 1094 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff             @@
##           master   #29981       +/-   ##
===========================================
+ Coverage   60.48%   83.64%   +23.16%     
===========================================
  Files        1931      528     -1403     
  Lines       76236    38150    -38086     
  Branches     8568        0     -8568     
===========================================
- Hits        46114    31912    -14202     
+ Misses      28017     6238    -21779     
+ Partials     2105        0     -2105     
Flag Coverage Δ
hive 48.95% <33.33%> (-0.22%) ⬇️
javascript ?
mysql 76.64% <33.33%> (?)
postgres 76.73% <33.33%> (?)
presto 53.50% <33.33%> (-0.31%) ⬇️
python 83.64% <100.00%> (+20.16%) ⬆️
sqlite 76.22% <33.33%> (?)
unit 60.31% <100.00%> (+2.68%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@dpgaspar dpgaspar marked this pull request as ready for review August 21, 2024 12:04
@dosubot dosubot bot added data:connect Namespace | Anything related to db connections / integrations data:connect:trino Related to Trino labels Aug 21, 2024
# When you spawn a new thread, it does not inherit the contexts
# from the parent thread,
# meaning the g object and other context-bound variables are not
# accessible
with app.app_context():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dpgaspar I did a quick GPT search.

from flask import current_app
import threading

def my_thread():
    with current_app.app_context():
        # Access Flask context within the thread
        app = current_app._get_current_object()
        # Do something with the app

# Create and start the thread
thread = threading.Thread(target=my_thread)
thread.start()

Would something like this work for our use cases?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems basically similar, except I'm passing the current_app into the thread, also we need to recreate g inside the thread

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we use copy_current_request_context?

@copy_current_request_context
def worker():
    # Access Flask's g object here
    # Your code goes here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only the request context is copied according to: pallets/flask#3306

Copy link
Contributor

@joaoferrao joaoferrao Aug 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed not including @ copy_current_request_context makes the thread's downstream calls url_for function calls fail with the screenshot here #27631 (comment)

I found this out in the context adapting #27631 for Trino. My understanding is that, when using OAuth2 config, the class OAuth2ClientConfigSchema will be called in a separated thread and has this missing information.

@dpgaspar
Copy link
Member Author

Thank you for the review @michael-s-molina

@sadpandajoe sadpandajoe added the v4.1 Label added by the release manager to track PRs to be included in the 4.1 branch label Aug 22, 2024
@dpgaspar dpgaspar merged commit 4d821f4 into apache:master Aug 22, 2024
55 of 68 checks passed
@dpgaspar dpgaspar deleted the danielgaspar/sc-82301/it-s-not-possible-to-use-trino-db-connections branch August 22, 2024 17:01
sadpandajoe pushed a commit that referenced this pull request Aug 22, 2024
@github-actions github-actions bot added 🍒 4.1.0 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels labels Nov 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels data:connect:trino Related to Trino data:connect Namespace | Anything related to db connections / integrations size/M v4.1 Label added by the release manager to track PRs to be included in the 4.1 branch 🍒 4.1.0 🍒 4.1.1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants