
Save and Export (nbconvert) generates 403 with error "'_xsrf' argument missing from GET". #16040

Closed
mikedarcy opened this issue Mar 21, 2024 · 14 comments
Labels: bug, status:Needs Info, status:Needs Triage

Comments

@mikedarcy

mikedarcy commented Mar 21, 2024

Description

I've got JupyterLab 4.1.5 and JupyterHub 4.1.0 set up reverse-proxied behind an Apache HTTPD. It has a bindUrl of 'http://127.0.0.1:8000/jupyter/' and everything is working as expected, except for Save and Export, which always returns a 403 with "'_xsrf' argument missing from GET". The generated nbconvert URL indeed does not include the _xsrf parameter, e.g.:

https://my-host.org/jupyter/user/me/nbconvert/markdown/test.ipynb?download=true

Reproduce

Python 3.10.13 virtual env/Amazon Linux 2/Apache HTTPD
I am using an OAuthenticator but it also happens with PAM.

Output of pip freeze

alembic==1.13.1
anyio==4.3.0
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.3.0
asttokens==2.4.1
async-generator==1.10
async-lru==2.0.4
attrs==23.2.0
Babel==2.14.0
beautifulsoup4==4.12.3
bleach==6.1.0
certifi==2024.2.2
certipy==0.1.3
cffi==1.16.0
charset-normalizer==3.3.2
comm==0.2.2
cryptography==42.0.5
debugpy==1.8.1
decorator==5.1.1
defusedxml==0.7.1
exceptiongroup==1.2.0
executing==2.0.1
fastjsonschema==2.19.1
fqdn==1.5.1
greenlet==3.0.3
h11==0.14.0
httpcore==1.0.4
httpx==0.27.0
idna==3.6
ipykernel==6.29.3
ipython==8.22.2
ipywidgets==8.1.2
isoduration==20.11.0
jedi==0.19.1
Jinja2==3.1.3
json5==0.9.24
jsonpointer==2.4
jsonschema==4.21.1
jsonschema-specifications==2023.12.1
jupyter-events==0.10.0
jupyter-lsp==2.2.4
jupyter-telemetry==0.1.0
jupyter_client==8.6.1
jupyter_core==5.7.2
jupyter_server==2.13.0
jupyter_server_terminals==0.5.3
jupyterhub==4.1.0
jupyterlab==4.1.5
jupyterlab_pygments==0.3.0
jupyterlab_server==2.25.4
jupyterlab_widgets==3.0.10
Mako==1.3.2
MarkupSafe==2.1.5
matplotlib-inline==0.1.6
mistune==3.0.2
nbclient==0.10.0
nbconvert==7.16.2
nbformat==5.10.3
nest-asyncio==1.6.0
notebook_shim==0.2.4
oauthenticator==16.3.0
oauthlib==3.2.2
overrides==7.7.0
packaging==24.0
pamela==1.1.0
pandocfilters==1.5.1
parso==0.8.3
pexpect==4.9.0
platformdirs==4.2.0
prometheus_client==0.20.0
prompt-toolkit==3.0.43
psutil==5.9.8
ptyprocess==0.7.0
pure-eval==0.2.2
pycparser==2.21
Pygments==2.17.2
PyJWT==2.8.0
pyOpenSSL==24.1.0
python-dateutil==2.9.0.post0
python-json-logger==2.0.7
PyYAML==6.0.1
pyzmq==25.1.2
referencing==0.34.0
requests==2.31.0
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rpds-py==0.18.0
ruamel.yaml==0.18.6
ruamel.yaml.clib==0.2.8
Send2Trash==1.8.2
six==1.16.0
sniffio==1.3.1
soupsieve==2.5
SQLAlchemy==2.0.28
stack-data==0.6.3
terminado==0.18.1
tinycss2==1.2.1
tomli==2.0.1
tornado==6.4
traitlets==5.14.2
types-python-dateutil==2.9.0.20240316
typing_extensions==4.10.0
uri-template==1.3.0
urllib3==2.2.1
wcwidth==0.2.13
webcolors==1.13
webencodings==0.5.1
websocket-client==1.7.0
widgetsnbextension==4.0.10

Relevant snippet from ssl.conf:

  # Use RewriteEngine to handle WebSocket connection upgrades
  RewriteEngine On
  RewriteCond %{HTTP:Connection} Upgrade [NC]
  RewriteCond %{HTTP:Upgrade} websocket [NC]
  RewriteRule /jupyter/(.*) ws://127.0.0.1:8000/jupyter/$1 [P,L]
  RewriteRule /jupyter/(.*) http://127.0.0.1:8000/jupyter/$1 [P,L]

  RedirectMatch "^/$" "/jupyter/"
  AllowEncodedSlashes On

  <Location /jupyter>
    # preserve Host header to avoid cross-origin problems
    ProxyPreserveHost on
    # proxy to JupyterHub
    ProxyPass http://127.0.0.1:8000/jupyter
    ProxyPassReverse  http://127.0.0.1:8000/jupyter
    RequestHeader     set "X-Forwarded-Proto" expr=%{REQUEST_SCHEME}
  </Location>
1. Go to any notebook.
2. Click File -> Save and Export Notebook As -> any format.
3. See the error in the new page that opens:
403 : Forbidden
The error was:

'_xsrf' argument missing from GET

Expected behavior

In a previous 3.7.x setup, this all worked fine. From what I have learned, that is because XSRF support was not present in that version. In any case, I would expect this to work just like Download, which does include the _xsrf query parameter when you download the notebook.

Incidentally, when I add the _xsrf parameter to the URL manually and GET it in my browser, it works:

https://my-host.org/jupyter/user/me/nbconvert/markdown/test_tensorflow.ipynb?download=true&_xsrf=MnwxOjB8MTA6MTcxMTA2MTk4MHw1Ol94c3JmfDEzMjpNRGt3WWpWaVlXWTJOalV6TkRreE4yRm1aVFprTTJSbE1HSXhNbVpqTWpRNk1HTTNaakF6TW1Sa01EQm1AswAsWVRjM1pEVmhZekl6TURnMlptSmhNMlUyTm1NNE9UVmtOamswWlRoaFpETTNNMkkzWXprMlpqUXpNamd4T1dGaE1RPT18ZWZmZmM5NjI1MmVjY2Y0MjZjMDdjZTViMzVmZmE4Y2YwMDM4OThlM2YyZjk2M2JhMTExNjEyMjM0OWMzNTI1MQ

I looked at some of the JavaScript in the console, and I saw something like this for getDownloadURL:

                let i = "";
                try {
                    i = document.cookie
                } catch (o) {}
                const s = i.match("\\b_xsrf=([^;]*)\\b");
                if (s) {
                    const e = new URL(n);
                    e.searchParams.append("_xsrf", s[1]);
                    n = e.toString()
                }
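
Unminified, that block just reads the _xsrf cookie and appends it to the generated download URL. Here is a readable rendering of the same logic (the variable names are mine, not the original source; url stands for the minified n above):

  // Read the _xsrf cookie, if accessible, and append it as a query parameter.
  let cookies = '';
  try {
    cookies = document.cookie;
  } catch (error) {
    // document.cookie can throw in sandboxed contexts; fall back to no token.
  }
  const match = cookies.match('\\b_xsrf=([^;]*)\\b');
  if (match) {
    const downloadUrl = new URL(url);
    downloadUrl.searchParams.append('_xsrf', match[1]);
    url = downloadUrl.toString();
  }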

But, I did not see anything like that for getNBConvertURL:

  export function getNBConvertURL({
    path,
    format,
    download
  }: {
    path: string;
    format: string;
    download: boolean;
  }): string {
    const notebookPath = URLExt.encodeParts(path);
    const url = URLExt.join(getBaseUrl(), 'nbconvert', format, notebookPath);
    if (download) {
      return url + '?download=true';
    }
    return url;
  }
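
Just to illustrate the asymmetry, here is a rough sketch of what an _xsrf-aware variant could look like, reusing the same cookie-reading approach as getDownloadURL (the name getNBConvertURLWithXsrf is mine, and this is only a sketch, not a proposed patch or the actual upstream fix):

  export function getNBConvertURLWithXsrf({
    path,
    format,
    download
  }: {
    path: string;
    format: string;
    download: boolean;
  }): string {
    const notebookPath = URLExt.encodeParts(path);
    let url = URLExt.join(getBaseUrl(), 'nbconvert', format, notebookPath);
    if (download) {
      url += '?download=true';
      // Hypothetical addition: mirror getDownloadURL and append the _xsrf cookie value.
      let cookies = '';
      try {
        cookies = document.cookie;
      } catch (error) {
        // No cookie access means no token to append.
      }
      const match = cookies.match('\\b_xsrf=([^;]*)\\b');
      if (match) {
        url += `&_xsrf=${encodeURIComponent(match[1])}`;
      }
    }
    return url;
  }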

I'm pretty new to JupyterHub, so I don't know whether this is all intended and works in some way I don't understand, or whether it's as simple as the difference between the two functions above.

Context

  • Operating System and version: Amazon Linux 2
  • Browser and version: Chrome Version 122.0.6261.129
  • JupyterLab version: 4.1.5; JupyterHub version: 4.1.0
@mikedarcy mikedarcy added the bug label Mar 21, 2024

welcome bot commented Mar 21, 2024

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! 🤗

If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template as it helps other community members to contribute more effectively.
You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! 👋

Welcome to the Jupyter community! 🎉

@jupyterlab-probot jupyterlab-probot bot added the status:Needs Triage label Mar 21, 2024
@JasonWeill
Contributor

@mikedarcy Thank you for opening this issue! Could you please try upgrading JupyterHub to 4.1.1 or newer and see whether this issue recurs? You can see an option to disable this check in the JupyterHub changelog: https://jupyterhub.readthedocs.io/en/stable/reference/changelog.html#id3 (thanks @krassowski)

@mikedarcy
Author

Thanks for looking into this. I have upgraded JupyterHub to 4.1.3 and it is running fine.

Mar 26 19:17:52 compute.<redacted>.org jupyterhub[17090]: [I 2024-03-26 19:17:52.500 JupyterHub app:2885] Running JupyterHub version 4.1.3

I added c.ServerApp.disable_check_xsrf=True to my config and I do see it working for some requests:

Mar 26 19:19:26 compute.<redacted>.org jupyterhub[17090]: [W 2024-03-26 19:19:26.397 ServerApp] Skipping XSRF check for insecure request GET /jupyter/user/mdarcy/api/kernels/3a6ef36b-b79a-4a13-b15d-e5035e28

But unfortunately nbconvert is still failing with the 403 due to the missing _xsrf param:

Mar 26 19:20:24 compute.<redacted>.org jupyterhub[17090]: [W 2024-03-26 19:20:24.200 ServerApp] 403 GET /jupyter/user/mdarcy/nbconvert/markdown/test_tensorflow.ipynb?download=true (mdarcy@127.0.0.1) 1.52ms

Questions:

I am curious about the security implications of globally bypassing the XSRF checks just for "Save and Export", which I thought was core out-of-the-box functionality. Wouldn't it be safer to keep the checks enabled and just fix the nbconvert URLs to include the parameter? I don't understand why the notebook "Download" URL includes the _xsrf parameter whereas the "Save and Export" URL (which also effectively "downloads" something) doesn't.

Anyway, as I mentioned I am new to JupyterLab/JupyterHub and may not fully understand some of the technical subtleties at work here. Thanks for the support!

@yelban

yelban commented Mar 28, 2024

After adding
c.Spawner.args = ['--ServerApp.disable_check_xsrf=True']
to jupyterhub_config.py, Save and Export Notebook works for downloading again.

@mikedarcy
Author

Thanks @yelban, I can confirm that this config line does work for me. Yet I am still concerned about having to disable XSRF checking to make this standard feature work properly. As a temporary workaround it is not a problem; however, this seems like a bug to me, since if the _xsrf parameter is simply added to the generated nbconvert URL (which I tested manually), it works with XSRF checks enabled. Am I missing something? @JasonWeill Any thoughts?

@minrk
Contributor

minrk commented Mar 29, 2024

It would be reasonable to include _xsrf in this URL, which would fix it. The change is in JupyterHub 4.1, which increased the strictness of some XSRF checks. It is not meant to apply to regular navigation requests like this one, though, and that ought to be fixed by jupyterhub/jupyterhub#4759.

@krassowski
Member

Can folks confirm that JupyterHub 4.1.4 fixed the issue?

@jrdnbradford

@krassowski confirming that jupyterhub 4.1.4 resolved this issue for the-littlest-jupyterhub 1.0.0.

@yelban

yelban commented Mar 30, 2024

It still doesn't work for me here.

@jrdnbradford

@krassowski while jupyterhub 4.1.4 seems to have resolved this issue for the-littlest-jupyterhub 1.0.0, I've received the same 403 error on the K8s hub after upgrading.

@jrdnbradford

jrdnbradford commented Mar 30, 2024

@yelban and @krassowski I think I found the issue. After updating the k8s-hub image and helm chart to 3.3.6 to get jupyterhub 4.1.4, my hub UI showed 4.1.4 but a pip list in my newly built JupyterLab user environment revealed jupyterhub 4.1.3. After updating my JupyterLab docker image, nbconvert worked as expected on the K8s hub:

RUN pip install --force-reinstall --no-cache-dir \
    jupyterhub==4.1.4

The lab docker image just needs to catch up.

@mikedarcy
Author

I can also confirm that jupyterhub 4.1.4 fixes the issue for me. Thanks!

@yelban

yelban commented Mar 31, 2024

I can also confirm that jupyterhub 4.1.4 fixes the issue. Thanks!

@jrdnbradford Thank you for your guidance. I just discovered that everything is back to normal after updating to the quay.io/jupyter/minimal-notebook image that was updated 18 hours ago. There is no need to disable the '_xsrf' check specifically, and exporting and downloading now work without any issues. Cheers!

The updated minimal-notebook image contains:
jupyterhub 4.1.4 pyh31011fe_0 conda-forge
jupyterlab 4.1.5 pyhd8ed1ab_0 conda-forge

@JasonWeill
Contributor

Closing because this is fixed with JupyterHub 4.1.4. Thank you all for your contributions!
