This issue was moved to a discussion.

[Report] Reports not working with OIDC auth #14330

Closed
3 tasks done
dusatvoj opened this issue Apr 24, 2021 · 31 comments
Labels
install:config Installation - Configuration settings question & help wanted Use Github discussions instead

Comments

@dusatvoj

Can't send report emails; they fail with the error "Report Schedule sellenium user not found".

Expected results

Can send Report emails

Actual results

Almost nothing in the logs, except the error "Report Schedule sellenium user not found" in the report's action log.

Screenshots

image

How to reproduce the bug

Set up something like this:

SCREENSHOT_LOCATE_WAIT = 100
SCREENSHOT_LOAD_WAIT = 600

ENABLE_ALERTS = True
FEATURE_FLAGS = {
    'ALERT_REPORTS': True
}

WEBDRIVER_TYPE = "chrome"
WEBDRIVER_OPTION_ARGS = [
    "--force-device-scale-factor=2.0",
    "--high-dpi-support=2.0",
    "--headless",
    "--disable-gpu",
    "--disable-dev-shm-usage",
    "--no-sandbox",
    "--disable-setuid-sandbox",
    "--disable-extensions",
]

# This is for internal use, you can keep http
WEBDRIVER_BASEURL="http://localhost:8088"
# This is the link sent to the recipient, change to your domain eg. https://superset.mydomain.com
WEBDRIVER_BASEURL_USER_FRIENDLY="https://<BASE_URL>"

in the config, but no emails are sent (or even attempted).

Environment

(please complete the following information):

  • superset version: Superset 1.1.0
  • python version: Python 3.8.7
  • node.js version: v12.21.0

Checklist

Make sure to follow these steps before submitting your issue - thank you!

  • I have checked the superset logs for python stacktraces and included it here as text if there are any.
  • I have reproduced the issue with at least the latest released version of superset.
  • I have checked the issue tracker for the same issue and I haven't found one similar.

Additional context

Apr 24 21:00:00 node6 celery[1401583]: Report state: Report Schedule sellenium user not found
Apr 24 21:00:00 node6 celery[1401583]: [2021-04-24 21:00:00,239: INFO/ForkPoolWorker-1] Report state: Report Schedule sellenium user not found

(every hour)

@dusatvoj dusatvoj added the #bug Bug report label Apr 24, 2021
@junlincc junlincc changed the title Report Schedule sellenium user not found [Report] Schedule sellenium user not found Apr 28, 2021
@junlincc junlincc added validation:required A committer should validate the issue and removed #bug Bug report labels Apr 28, 2021
@junlincc
Member

cc @nytai

@willbarrett
Member

cc @dpgaspar

@dpgaspar
Member

@dusatvoj

What's the value of THUMBNAIL_SELENIUM_USER? Also make sure that user exists in the database.
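
For anyone following along, a minimal sketch of what that setting could look like in superset_config.py. The username report_bot is a placeholder invented for illustration, not something from this thread; whichever name is used, the account has to exist (and, as a later comment shows, be active) in Superset's database.

```python
# superset_config.py (fragment)
# "report_bot" is a hypothetical username used only for illustration.
# The Celery worker generates session cookies for this user, so the
# account must exist in Superset's database and must not be deactivated.
THUMBNAIL_SELENIUM_USER = "report_bot"
```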

@dusatvoj
Author

What? I can't see this variable in the docs 🤔 🙃

@dpgaspar
Member

Just double checked it is there: https://superset.apache.org/docs/installation/alerts-reports

Were you able to make it work?

@dusatvoj
Author

It's listed under "Old Reports feature (version 0.38 and below)", but it's still relevant? Weird.

Anyway, I've tried setting it, but now there's another error: "Report Schedule execution failed when generating a screenshot."

Apr 29 09:01:45 node6 celery[1496765]: Selenium timed out requesting url http://localhost:8088/superset/dashboard/7/
Apr 29 09:01:45 node6 celery[1496765]: [2021-04-29 09:01:45,886: ERROR/ForkPoolWorker-1] Selenium timed out requesting url http://localhost:8088/superset/dashboard/7/
Apr 29 09:01:46 node6 celery[1496765]: Report state: Report Schedule execution failed when generating a screenshot.
Apr 29 09:01:46 node6 celery[1496765]: [2021-04-29 09:01:46,060: INFO/ForkPoolWorker-1] Report state: Report Schedule execution failed when generating a screenshot.

Maybe it's caused by the fact that we are using OIDC login (https://stackoverflow.com/questions/54010314/using-keycloakopenid-connect-with-apache-superset ... we use this solution because the one in your docs did not work either, same as this feature).

# curl http://localhost:8088/superset/dashboard/7/
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>Redirecting...</title>
<h1>Redirecting...</h1>
<p>You should be redirected automatically to target URL: <a href="/login/?next=http%3A%2F%2Flocalhost%3A8088%2Fsuperset%2Fdashboard%2F7%2F">/login/?next=http%3A%2F%2Flocalhost%3A8088%2Fsuperset%2Fdashboard%2F7%2F</a>.  If not click the link.
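
The same check as the curl call above can be sketched in Python with only the standard library. This is an illustrative diagnostic, not part of Superset; the function names are made up here, and the URL in the usage section is the one from this thread. http.client does not follow redirects, so the status and Location header of the first response are visible directly.

```python
import http.client
from urllib.parse import urlparse


def fetch_without_redirects(url, timeout=10):
    """Request `url` once and return (status, Location header) without following redirects."""
    parts = urlparse(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    try:
        conn.request("GET", target)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def looks_like_login_redirect(status, location):
    """True if the response is a redirect whose target contains /login."""
    return status in (301, 302, 303, 307, 308) and bool(location) and "/login" in location


if __name__ == "__main__":
    status, location = fetch_without_redirects("http://localhost:8088/superset/dashboard/7/")
    print(status, location, looks_like_login_redirect(status, location))
```

An unauthenticated request is expected to be redirected to the login page, so a redirect here only confirms that the dashboard requires a session; it does not by itself show that the worker's cookies are being rejected.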

@dpgaspar
Member

@dusatvoj good point regarding the docs, we'll fix it.

Are you using Docker? Did you execute the curl from inside the container?

@dusatvoj
Author

dusatvoj commented Apr 29, 2021

No, I have a dedicated VM for Superset, and I executed curl on the VM.

@dpgaspar
Member

It should not be OIDC, since you're still using session cookies; that's what the worker generates for THUMBNAIL_SELENIUM_USER. One more question/confirmation: I'm assuming your VM runs the worker and the web app on the same instance?

@dusatvoj
Author

yes, on the same machine

@dpgaspar
Member

OK, do you have geckodriver and Firefox, or chromedriver and Chrome, installed?
Also, do you have any frontend logs from before the worker timeout? If it's a login failure, you should see an HTTP 302 for http://localhost:8088/superset/dashboard/7/

@dusatvoj
Author

I have installed chromedriver and the firefox-esr package on Debian 10. I've tried both engines with no luck.
No, but I've found a log in the journal...

May 11 12:00:00 node6 celery[1817263]: [2021-05-11 12:00:00,282: INFO/ForkPoolWorker-1] Init selenium driver
May 11 12:00:00 node6 celery[1817263]: Failed at generating thumbnail [Errno 13] Permission denied: 'geckodriver.log'
May 11 12:00:00 node6 celery[1817263]: [2021-05-11 12:00:00,282: ERROR/ForkPoolWorker-1] Failed at generating thumbnail [Errno 13] Permission denied: 'geckodriver.log'
May 11 12:00:00 node6 celery[1817263]: Report state: Report Schedule execution failed when generating a screenshot.
May 11 12:00:00 node6 celery[1817263]: [2021-05-11 12:00:00,314: INFO/ForkPoolWorker-1] Report state: Report Schedule execution failed when generating a screenshot.

(I've switched experimentally to firefox-esr)
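
The path in the error is relative ('geckodriver.log'), which suggests the log file is being created in the worker's current working directory; later comments in this thread trace the problem to the service's WorkingDirectory. A small sketch, assuming it is run as the same user and from the same directory as the Celery worker, to check whether that location is writable. The helper name is made up for illustration.

```python
import os


def can_write_log(directory, filename="geckodriver.log"):
    """Return True if `filename` could be created or appended to in `directory`."""
    path = os.path.join(directory, filename)
    if os.path.exists(path):
        # An existing log file must itself be writable.
        return os.access(path, os.W_OK)
    # Otherwise the directory must exist and allow creating entries in it.
    return os.path.isdir(directory) and os.access(directory, os.W_OK | os.X_OK)


if __name__ == "__main__":
    cwd = os.getcwd()
    print(cwd, "writable for geckodriver.log:", can_write_log(cwd))
```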

@dpgaspar
Member

Failed at generating thumbnail [Errno 13] Permission denied: 'geckodriver.log'

@dpgaspar dpgaspar added install:config Installation - Configuration settings question & help wanted Use Github discussions instead and removed validation:required A committer should validate the issue labels May 11, 2021
@dusatvoj
Author

What should be configured in superset_config.py or another config file? I want to log everything to stdout/stderr (to the journal).

@dusatvoj
Author

@dpgaspar I've found a solution. The issue was a bad WorkingDirectory, but now there's another issue with generating screenshots 😓.
Selenium times out with both Firefox and Chrome. I don't know how to debug it, but I think the issue is with our OAuth2 solution (https://stackoverflow.com/questions/54010314/using-keycloakopenid-connect-with-apache-superset), which is the only working OAuth2 SSO solution I've found. 🙃

@dusatvoj
Author

Here's the related log from celery:

May 18 19:02:09 node6 celery[5345]: Selenium timed out requesting url http://localhost:8088/superset/dashboard/7/
May 18 19:02:09 node6 celery[5345]: [2021-05-18 19:02:09,007: ERROR/ForkPoolWorker-1] Selenium timed out requesting url http://localhost:8088/superset/dashboard/7/
May 18 19:02:09 node6 celery[5345]: Report state: Report Schedule execution failed when generating a screenshot.
May 18 19:02:09 node6 celery[5345]: [2021-05-18 19:02:09,596: INFO/ForkPoolWorker-1] Report state: Report Schedule execution failed when generating a screenshot.

@dpgaspar
Member

@dusatvoj if you log in to your server/container, are you able to launch firefox and geckodriver?

@dusatvoj
Author

superset@node6:~$ firefox
Error: no DISPLAY environment variable specified
superset@node6:~$ geckodriver
1621422469338	geckodriver	INFO	Listening on 127.0.0.1:4444
^C
superset@node6:~$ 

It looks like geckodriver works fine 🤔

@dpgaspar
Member

dpgaspar commented May 19, 2021

@dusatvoj

I'm troubleshooting Firefox right now. It seems that everything works just fine as the root user, but when switching to a less privileged user, e.g. superset, you get: Failed at generating thumbnail [Errno 13] Permission denied: 'geckodriver.log'

Solved that one with:

WEBDRIVER_CONFIGURATION={
    "service_log_path": "/dev/null"
}

Did you do this to solve one of your first problems?

Regarding the timeout for http://localhost:8088/superset/dashboard/7/, do you see any logs on the frontend/app?

@dusatvoj
Copy link
Author

I solved the permissions issue by changing WorkingDirectory (in the systemd service file) to the user's $HOME, but /dev/null looks good too :D
I've changed the URL to the frontend (Apache proxy), and it looks like it's stuck on the Keycloak login page...

Headless browser is redirected to /login ...

[19/May/2021:18:46:01 +0200] "GET /login/ HTTP/1.1" 302 6863 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/90.0.4430.212 Safari/537.36"

and after that it's redirected to Keycloak instance (OIDC provider) ...

[19/May/2021:18:46:01 +0200] "GET /auth/realms/<REALM>/openid-connect/auth?...

... and again and again (/login -> Keycloak instance -> nothing). But this type of authentication is the only working solution for user-friendly OIDC login, and we want to use reports too 😕 😕 😕

@dusatvoj dusatvoj changed the title [Report] Schedule sellenium user not found [Report] Reports not working with OIDC auth May 19, 2021
@dpgaspar
Member

@dusatvoj

I see, so it may be because of: https://github.com/apache/superset/blob/master/superset/utils/machine_auth.py#L53
By default the test request goes to /login; if it redirects immediately to OpenID, that could be your problem. You can write your own MachineAuthProvider and set it via the MACHINE_AUTH_PROVIDER_CLASS config key.

@Asturias-sam

Asturias-sam commented Jun 28, 2021

@dpgaspar
I have a similar problem

Selenium timed out requesting url http://<IP>/superset/dashboard/9/
Report Schedule execution failed when generating a screenshot.

I am running it in a virtual environment without Docker, and we have LDAP enabled. I have the THUMBNAIL_SELENIUM_USER property configured. Any suggestions?

@CountRedClaw

@dpgaspar
Is it a good workaround to replace
driver.get(headless_url("/login/"))
with
driver.get(headless_url("/nonexistent_url"))
in order to avoid redirection to the identity provider, and therefore set the cookies on the proper domain?
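
To illustrate why the first URL matters, here is a toy model, not Superset or Selenium code. FakeDriver, the idp.example.com host, and the cookie value are all hypothetical. It assumes, as I understand browser drivers to behave, that a cookie added without an explicit domain is attached to whatever page is currently loaded.

```python
from urllib.parse import urlparse


class FakeDriver:
    """Toy stand-in for a Selenium driver (hypothetical, for illustration only)."""

    def __init__(self, redirects):
        self.redirects = redirects  # requested URL -> URL the browser ends up on
        self.current_url = None
        self.cookies = {}           # host -> {cookie name: value}

    def get(self, url):
        # Follow a configured redirect, otherwise stay on the requested URL.
        self.current_url = self.redirects.get(url, url)

    def add_cookie(self, cookie):
        # Cookies without an explicit domain attach to the current page's host.
        host = urlparse(self.current_url).netloc
        self.cookies.setdefault(host, {})[cookie["name"]] = cookie["value"]


def authenticate(driver, first_url, cookies):
    """Visit first_url, then inject the session cookies."""
    driver.get(first_url)
    for name, value in cookies.items():
        driver.add_cookie({"name": name, "value": value})
    return driver


BASE = "http://localhost:8088"
REDIRECTS = {BASE + "/login/": "https://idp.example.com/auth"}  # assumed IdP host
SESSION = {"session": "abc123"}  # placeholder cookie value

via_login = authenticate(FakeDriver(REDIRECTS), BASE + "/login/", SESSION)
via_missing = authenticate(FakeDriver(REDIRECTS), BASE + "/nonexistent_url", SESSION)

print(sorted(via_login.cookies))    # ['idp.example.com']
print(sorted(via_missing.cookies))  # ['localhost:8088']
```

In this model, starting from /login/ leaves the session cookie on the identity provider's host, while starting from a URL that is not redirected leaves it on the Superset host, which is the effect the proposed workaround is after.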

@dusatvoj
Author

@CountRedClaw good workaround, but not enough 😕 I still can't generate the screenshot because of the cookie 🤔 Do you have an idea how to set it properly?

@heul

heul commented Jul 23, 2021

@dpgaspar, thanks for the hint. It might even suffice to override the webdriver auth function (defined by WEBDRIVER_AUTH_FUNC). We use Azure as our OAuth provider; any /login request is redirected there, and then Selenium times out. It worked perfectly once we added this to our superset_config.py:

from superset.utils.urls import headless_url
from superset.utils.machine_auth import MachineAuthProvider

def auth_driver(driver, user):
    # Setting cookies requires doing a request first, but /login is redirected to oauth provider, and stuck there.
    driver.get(headless_url("/doesnotexist"))

    cookies = MachineAuthProvider.get_auth_cookies(user)

    for cookie_name, cookie_val in cookies.items():
        driver.add_cookie(dict(name=cookie_name, value=cookie_val))

    return driver

WEBDRIVER_AUTH_FUNC = auth_driver

@jensenity

What's the solution for this?

@dusatvoj
Author

dusatvoj commented Dec 26, 2021

There's just the workaround with a nonexistent URL.

@vivekpradhan

vivekpradhan commented Dec 28, 2021

I am facing the same issue. Can someone tell me what to add to the Superset config to enable TRACE-level logging for geckodriver, so that I can see why I am getting this error:

[2021-12-28 10:10:42,315: WARNING/ForkPoolWorker-1] Selenium timed out requesting url https://example.com/superset/dashboard/81/?standalone=3
Traceback (most recent call last):
  File "/app/superset/utils/webdriver.py", line 125, in get_screenshot
    EC.presence_of_element_located((By.CLASS_NAME, element_name))
  File "/usr/local/lib/python3.7/site-packages/selenium/webdriver/support/wait.py", line 80, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message: 

Report state: Report Schedule execution failed when generating a screenshot.
[2021-12-28 10:10:46,115: INFO/ForkPoolWorker-1] Report state: Report Schedule execution failed when generating a screenshot.

@vivekpradhan

I found the logs listing HTTP requests in the main Superset app container. The chart is not loading; the API requests return status 401.

127.0.0.1 - - [28/Dec/2021:15:09:09 +0000] "GET /static/assets/5053927271b77cc901e5.chunk.js HTTP/1.1" 200 1258 "http://localhost:8080/superset/dashboard/81/?standalone=3" "Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0"
127.0.0.1 - - [28/Dec/2021:15:09:09 +0000] "GET /api/v1/dashboard/81/datasets HTTP/1.1" 401 39 "http://localhost:8080/superset/dashboard/81/?standalone=3" "Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0"
127.0.0.1 - - [28/Dec/2021:15:09:09 +0000] "GET /api/v1/dashboard/81/charts HTTP/1.1" 401 39 "http://localhost:8080/superset/dashboard/81/?standalone=3" "Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0"
127.0.0.1 - - [28/Dec/2021:15:09:09 +0000] "GET /api/v1/dashboard/81 HTTP/1.1" 401 39 "http://localhost:8080/superset/dashboard/81/?standalone=3" "Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0"

The default authenticate_webdriver function is not working. I also tried something similar to what @heul suggested. Can you tell me what WEBDRIVER_AUTH_FUNC should be so that my user can log in to Superset? Our Superset uses Google OAuth.

@dusatvoj
Author

I'm using self-hosted Keycloak and the workaround from @heul's comment. It works as expected.

@vivekpradhan

So @heul's solution now works for me. The problem was that I had not set THUMBNAIL_SELENIUM_USER. The default "Admin" was being used, but I had deactivated that account.

@apache apache locked and limited conversation to collaborators Feb 2, 2022
@geido geido converted this issue into discussion #18289 Feb 2, 2022
