infinite failure loop - emacs locked up #750

Closed
sam-s opened this issue Oct 20, 2020 · 10 comments
Labels
readme Please follow the README in reporting bugs

Comments

@sam-s
Contributor

sam-s commented Oct 20, 2020

I tried to connect to a Jupyter server that works fine with Chrome, and I got the notebooklist correctly.
However, when I clicked on [Open], I got an infinite error loop that I cannot get out of.
Here is a screenshot:
image

C-g and C-] have no effect.
The echo area is blinking with "cannot send message to a closed web server"

@dickmao
Collaborator

dickmao commented Oct 20, 2020

That emacs has dutifully trapped the error and issued a backtrace, but cannot yield to user input, suggests circumstances unique to your configuration. Perhaps one day you will follow the README and try to reproduce the bug with -Q. Until then, even if I could magically will a login into braze.com's private cloud, the problem looks squarely within websocket.el's bailiwick.

@dickmao dickmao closed this as completed Oct 20, 2020
@dickmao dickmao added the readme Please follow the README in reporting bugs label Oct 20, 2020
@sam-s
Contributor Author

sam-s commented Oct 20, 2020

This is easily reproducible with emacs -Q and (setq debug-on-error t).
Without debug-on-error, Emacs merely hangs for a while with "cannot send message to a closed web server" and then asks: Buffer " *stream buffer*" has a running process, kill it?
Then I see

...
ein: [info] WS action [(wrong-type-argument ein:$websocket nil)] on-close (wss://data-science.k8s.region-001.p-use-1.braze.com:443/api/kernels/36fc6b15-f366-4b81-8856-6029272ef998/channels?session_id=ba51c70d-9aad-493a-933b-41b8c07f9d27)
ein: [info] WS action [(websocket-received-error-http-response 403)] on-open (wss://data-science.k8s.region-001.p-use-1.braze.com:443/api/kernels/36fc6b15-f366-4b81-8856-6029272ef998/channels?session_id=ba51c70d-9aad-493a-933b-41b8c07f9d27)
ein: [info] WS action [(wrong-type-argument ein:$websocket nil)] on-open (wss://data-science.k8s.region-001.p-use-1.braze.com:443/api/kernels/36fc6b15-f366-4b81-8856-6029272ef998/channels?session_id=ba51c70d-9aad-493a-933b-41b8c07f9d27)
ein: [info] WS action [(json-readtable-error 79)] on-message (wss://data-science.k8s.region-001.p-use-1.braze.com:443/api/kernels/36fc6b15-f366-4b81-8856-6029272ef998/channels?session_id=ba51c70d-9aad-493a-933b-41b8c07f9d27)
ein: [info] WS action [(json-unknown-keyword ter)] on-message (wss://data-science.k8s.region-001.p-use-1.braze.com:443/api/kernels/36fc6b15-f366-4b81-8856-6029272ef998/channels?session_id=ba51c70d-9aad-493a-933b-41b8c07f9d27)
....
error in process filter: file-remote-p: Variable binding depth exceeds max-specpdl-size
error in process filter: Variable binding depth exceeds max-specpdl-size

in *Messages*.
Note that the abundance (thousands!) of ein messages appears to contradict your claim that websocket is to blame.

I do get the notebook buffer which is asking for C-x C-r, which reproduces the error.

@dickmao dickmao reopened this Oct 20, 2020
@dickmao
Collaborator

dickmao commented Oct 21, 2020

> Note that the abundance (thousands!) of ein messages appears to contradict your claim that websocket is to blame.

Because you turned off debug-on-error, ein is doing its level best to keep up with websocket's refusal to let the connection die.

Reactivate debug-on-error, and get the backtrace. Now figure out why

> C-g and C-] have no effect.
> The echo area is blinking with "cannot send message to a closed web server"

This was at the outset my confusion. EIN could not possibly be blamed for this.

Either way your k8s cluster is killing websocket's entreaties with extreme prejudice, without so much as a handshake. It's not clear if your company is running a military-grade jupyter instance or the k8s networking is nixing non-browser connections out of hand. If your waning interest can sustain it, I'd peck around with curl and various User-Agent settings.

To mitigate the worrisome wrong-type-argument ein:$websocket nil diagnostics, I committed fbc1d99, but by all rights that change alone would not forestall the infinite loop you're seeing.

@sam-s
Contributor Author

sam-s commented Oct 21, 2020

> Because you turned off debug-on-error, ein is doing its level best to keep up with websocket's refusal to let the connection die.

I am not sure what you mean here; I routinely run emacs with debug-on-error, and that is causing the infinite loop.

> Reactivate debug-on-error, and get the backtrace.

How do I do that?
The image of the backtrace in the original message is all I can get; Emacs is not responding.

> Now figure out why
>
> > C-g and C-] have no effect.
> > The echo area is blinking with "cannot send message to a closed web server"

The "no effect" means that C-] causes *Backtrace* to momentarily disappear and then it reappears again immediately.

> Either way your k8s cluster is killing websocket's entreaties with extreme prejudice, without so much as a handshake.

how does it know websocket from chrome?
note that the list of notebooks is created just fine.

> It's not clear if your company is running a military-grade jupyter instance or

the jupyter is quite ordinary; as I said it works fine with the browser and I even have some control over how it is invoked.

> the k8s networking is nixing non-browser connections out of hand. If your waning interest can sustain it, I'd peck around with curl and various User-Agent settings.

I would appreciate a more specific/detailed instruction.
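
One way to make the curl suggestion concrete is to send the WebSocket upgrade request by hand and compare the status line across User-Agent values. The following is a minimal Python sketch, not something taken from EIN or websocket.el: `build_handshake` and `probe` are hypothetical helper names, and the host in the usage note is a placeholder. It only reports whether the server answers 101 or something else (such as 403) for a given set of headers.

```python
import base64
import os
import socket
import ssl


def build_handshake(host, path, user_agent, extra_headers=None):
    """Return the raw bytes of a WebSocket upgrade request (RFC 6455)."""
    # The key is 16 random bytes, base64-encoded, as the RFC requires.
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        f"User-Agent: {user_agent}",
    ]
    # Optional extra headers (hypothetical use: cookies or auth headers).
    for name, value in (extra_headers or {}).items():
        lines.append(f"{name}: {value}")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def probe(host, port, path, user_agent, extra_headers=None, timeout=10):
    """Send one upgrade request over TLS and return the HTTP status line."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(build_handshake(host, path, user_agent, extra_headers))
            reply = tls.recv(4096)
    # Only the first line of the response matters for this comparison.
    return reply.split(b"\r\n", 1)[0].decode("latin-1")
```

Calling `probe("jupyter.example.com", 443, "/api/kernels/<kernel-id>/channels", "Mozilla/5.0")` and then again with a non-browser User-Agent would show whether that header changes the outcome. If both calls come back with 403, the User-Agent is probably not the deciding factor.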

@dickmao
Collaborator

dickmao commented Oct 21, 2020

> note that the list of notebooks is created just fine.

Ah, that is true.

You might have to C-u C-M-x in websocket-send and then h on https://github.com/ahyatt/emacs-websocket/blob/main/websocket.el#L562 and see where it leads.

open-network-stream in websocket.el does create a new process, so I suppose it's theoretically possible that process is going berserk, preventing emacs's main thread from regaining control. It's doubtful though. I don't recall ever seeing emacs capitulate the backtrace and then freeze.

This is about the extent of my predictive powers without having boots on the ground. It's up to you to come up with an MRE that doesn't involve logging into your company's intranet. Given two hours I'm sure I could nail it down to something websocket is doing.
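
A possible starting point for such an MRE is a local stub that refuses every request with a 403 and an HTML body, which is the response the `*Messages*` output points to. The sketch below uses only the Python standard library; `start_stub`, `status_of`, and the body text are hypothetical, and whether pointing `websocket-open` at `ws://127.0.0.1:<port>/...` actually reproduces the loop is untested.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder HTML; the real server's error page is not known.
BODY = b"<html><head><meta http-equiv=\"x\"></head><body>403</body></html>"


class Refuse(BaseHTTPRequestHandler):
    """Answer every GET, including WebSocket upgrades, with 403 and HTML."""

    def do_GET(self):
        self.send_response(403)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()
        self.wfile.write(BODY)

    def log_message(self, fmt, *args):
        pass  # keep the console quiet


def start_stub(port=0):
    """Start the stub on localhost; port 0 lets the OS pick a free port."""
    server = HTTPServer(("127.0.0.1", port), Refuse)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def status_of(server, path="/api/kernels/test/channels"):
    """Send an upgrade-style GET to the stub and return (status, body)."""
    port = server.server_address[1]
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request("GET", path, headers={"Upgrade": "websocket",
                                       "Connection": "Upgrade"})
    response = conn.getresponse()
    result = (response.status, response.read())
    conn.close()
    return result
```

`status_of(start_stub())` is a quick sanity check that the stub answers 403 before any Emacs session is pointed at it.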

@dickmao dickmao closed this as completed Oct 21, 2020
@sam-s
Contributor Author

sam-s commented Oct 22, 2020

Before I enter websocket-send, I get ein: [info] Worksheet Test.ipynb is ready in *Messages*
The frame is #s(websocket-frame :opcode close :payload nil :length nil :completep t)
and I get the "unescapable beeping loop" in line 560.

image

C-] results in a brief message error in process filter: Quit and then the beeping resumes.

(if you want, we can get on zoom/chat/phone and you will drive the interactive debugging - I did this 20 years ago with RMS, it was fun ;-)

@sam-s
Contributor Author

sam-s commented Oct 22, 2020

In websocket-ensure-connected:
(websocket-conn websocket) ==>
#<process websocket to wss://data-science.k8s.region-001.p-use-1.braze.com:443/api/kernels/36fc6b15-f366-4b81-8856-6029272ef998/channels?session_id=ba51c70d-9aad-493a-933b-41b8c07f9d27>
(process-status (websocket-conn websocket)) ==> closed
and the lockdown in websocket-open on websocket-ensure-handshake:

image

@sam-s
Contributor Author

sam-s commented Oct 30, 2020

here is the full loop in *Messages*:

error in process filter: websocket-send: Cannot send message to a closed websocket: #s(websocket-frame pong "c6417ed01bdc0ae3ef32ae4894fd03\">
    <meta http-equiv=\"" nil t)
error in process filter: Cannot send message to a closed websocket: #s(websocket-frame pong "c6417ed01bdc0ae3ef32ae4894fd03\">
    <meta http-equiv=\"" nil t)
ein: [error] ein:start-single-websocket: on-close no client data for wss://data-science.k8s.region-001.p-use-1.braze.com:443/api/kernels/2e23266f-ec11-48b2-af60-97218ca34498/channels?session_id=afaef748-b620-4407-9d0f-854ff3a17fa7.
ein: [info] WS action [(websocket-received-error-http-response 403)] on-open (wss://data-science.k8s.region-001.p-use-1.braze.com:443/api/kernels/2e23266f-ec11-48b2-af60-97218ca34498/channels?session_id=afaef748-b620-4407-9d0f-854ff3a17fa7)
ein: [error] ein:start-single-websocket: on-open no client data for wss://data-science.k8s.region-001.p-use-1.braze.com:443/api/kernels/2e23266f-ec11-48b2-af60-97218ca34498/channels?session_id=afaef748-b620-4407-9d0f-854ff3a17fa7.
ein: [info] WS action [(json-readtable-error 79)] on-message (wss://data-science.k8s.region-001.p-use-1.braze.com:443/api/kernels/2e23266f-ec11-48b2-af60-97218ca34498/channels?session_id=afaef748-b620-4407-9d0f-854ff3a17fa7)
ein: [info] WS action [(json-unknown-keyword ter)] on-message (wss://data-science.k8s.region-001.p-use-1.braze.com:443/api/kernels/2e23266f-ec11-48b2-af60-97218ca34498/channels?session_id=afaef748-b620-4407-9d0f-854ff3a17fa7)
ein: [error] ein:start-single-websocket: on-close no client data for wss://data-science.k8s.region-001.p-use-1.braze.com:443/api/kernels/2e23266f-ec11-48b2-af60-97218ca34498/channels?session_id=afaef748-b620-4407-9d0f-854ff3a17fa7.
error in process filter: websocket-send: Cannot send message to a closed websocket: #s(websocket-frame pong "c6417ed01bdc0ae3ef32ae4894fd03\">
    <meta http-equiv=\"" nil t)
error in process filter: Cannot send message to a closed websocket: #s(websocket-frame pong "c6417ed01bdc0ae3ef32ae4894fd03\">
    <meta http-equiv=\"" nil t)
ein: [error] ein:start-single-websocket: on-close no client data for wss://data-science.k8s.region-001.p-use-1.braze.com:443/api/kernels/2e23266f-ec11-48b2-af60-97218ca34498/channels?session_id=afaef748-b620-4407-9d0f-854ff3a17fa7.
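
The pong frames in this log carry HTML as their payload, which hints that the body of the 403 response is being read as if it were WebSocket frames. This is an inference from the log, not something confirmed in the thread. The Python sketch below decodes the two-byte frame header defined in RFC 6455 to show how ordinary ASCII text can look like a ping; `decode_header` is a hypothetical helper and does not mirror websocket.el's parser.

```python
# Opcode values from RFC 6455, section 5.2.
OPCODES = {
    0x0: "continuation",
    0x1: "text",
    0x2: "binary",
    0x8: "close",
    0x9: "ping",
    0xA: "pong",
}


def decode_header(first, second):
    """Decode the first two bytes of a WebSocket frame."""
    return {
        "fin": bool(first & 0x80),            # final-fragment flag
        "opcode": OPCODES.get(first & 0x0F, "reserved"),
        "masked": bool(second & 0x80),        # servers send unmasked frames
        "length": second & 0x7F,              # 126/127 mean extended length
    }


# The ASCII characters "9" and "7" read as a frame header:
header = decode_header(ord("9"), ord("7"))
print(header)
```

The characters `9` and `7` decode to an unmasked ping with a 55-byte payload. The payload echoed in the pong above appears to be 55 characters long, counting the line break as one character and the indentation as four spaces, so it is at least plausible that a hex string in the HTML page was read as a ping header. A parser that checks the reserved bits would reject this header, since `9` has two of them set.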

@sam-s
Contributor Author

sam-s commented Nov 3, 2020

the server logs show these messages:

[W 19:09:53.201 NotebookApp] Couldn't authenticate WebSocket connection
[W 19:09:53.202 NotebookApp] 403 GET /api/kernels/605f64ec-25cf-43b0-a08f-547a26565ce5/channels?session_id=e2923aa5-cde2-452b-ba31-91a6a0cf8989 (100.110.89.202) 1.58ms referer=None
[W 19:09:53.374 NotebookApp] Couldn't authenticate WebSocket connection
[W 19:09:53.375 NotebookApp] 403 GET /api/kernels/605f64ec-25cf-43b0-a08f-547a26565ce5/channels?session_id=e2923aa5-cde2-452b-ba31-91a6a0cf8989 (100.126.83.207) 1.55ms referer=None
[W 19:09:53.522 NotebookApp] Couldn't authenticate WebSocket connection
[W 19:09:53.523 NotebookApp] 403 GET /api/kernels/605f64ec-25cf-43b0-a08f-547a26565ce5/channels?session_id=e2923aa5-cde2-452b-ba31-91a6a0cf8989 (100.126.83.207) 2.08ms referer=None
[W 19:09:53.670 NotebookApp] Couldn't authenticate WebSocket connection
[W 19:09:53.671 NotebookApp] 403 GET /api/kernels/605f64ec-25cf-43b0-a08f-547a26565ce5/channels?session_id=e2923aa5-cde2-452b-ba31-91a6a0cf8989 (100.126.83.207) 1.64ms referer=None
[W 19:09:53.837 NotebookApp] Couldn't authenticate WebSocket connection
[W 19:09:53.838 NotebookApp] 403 GET /api/kernels/605f64ec-25cf-43b0-a08f-547a26565ce5/channels?session_id=e2923aa5-cde2-452b-ba31-91a6a0cf8989 (100.110.89.202) 1.56ms referer=None

apparently ein (or websocket?) needs to send the auth token with each request.
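
Jupyter Notebook accepts its token either as a `token` URL parameter or in an `Authorization: token <token>` header. A minimal Python sketch of both forms follows; `add_token` and `auth_header` are hypothetical helper names, the URL and token in the usage note are placeholders, and nothing here shows how EIN or websocket.el build their requests.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_token(ws_url, token):
    """Return ws_url with a token query parameter, keeping other parameters."""
    parts = urlsplit(ws_url)
    # Drop any existing token so the URL never carries two of them.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "token"]
    query.append(("token", token))
    return urlunsplit(parts._replace(query=urlencode(query)))


def auth_header(token):
    """Return the header form of the same credential."""
    return {"Authorization": f"token {token}"}
```

For example, `add_token("wss://host.example:443/api/kernels/abc/channels?session_id=s1", "t0k")` yields the same URL with `&token=t0k` appended after `session_id=s1`. If the server sits behind a proxy that authenticates with cookies instead, neither form may be sufficient.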

@sam-s
Contributor Author

sam-s commented Nov 11, 2020

according to @ahyatt in ahyatt/emacs-websocket#75, while the infinite loop is probably websocket's problem (and he said that he would look into it), the 403 HTTP error is likely to be ein's problem.
