REPL periodically freezes Emacs #344

Closed
DarrenN opened this issue Nov 24, 2018 · 27 comments
@DarrenN

DarrenN commented Nov 24, 2018

Background

I wanted to start this issue so I can begin gathering more information.

Periodically, when I have the Racket REPL open in racket-mode, Emacs will freeze and peg a core at 100%. I cannot determine when it will happen. Sometimes I will step away from my laptop for a moment and come back and it's hung; other times I will be in the middle of editing code and it will freeze. I'm not 100% sure it's related to the REPL, but it only happens when I have the REPL open. I have to force quit Emacs via Activity Monitor when this happens. Sometimes it gets really bad and I revert to running code on the CLI instead of using the REPL.

The code I'm working on isn't doing anything fancy with threads that would block the REPL (and even when I do, I can break out with C-c).

What other information can I grab? When it does lock I can sample the running process from OSX's Activity Monitor.

Debug information

[Screenshot, 2018-11-24 09:48:00]

((emacs-version "26.1")
 (emacs-uptime "16 hours, 32 minutes, 48 seconds")
 (system-type darwin)
 (major-mode racket-mode)
 (racket--source-dir "/Users/kyushu/.emacs.d/elpa/racket-mode-20180827.1303/")
 (racket-program "racket")
 (racket-memory-limit 2048)
 (racket-error-context medium)
 (racket-history-filter-regexp "\\`\\s *\\S ?\\S ?\\s *\\'")
 (racket-images-inline t)
 (racket-images-keep-last 100)
 (racket-images-system-viewer "display")
 (racket-pretty-print t)
 (racket-indent-curly-as-sequence t)
 (racket-indent-sequence-depth 0)
 (racket-pretty-lambda nil)
 (racket-smart-open-bracket-enable nil))
(enabled-minor-modes
 (async-bytecomp-package-mode)
 (auto-composition-mode)
 (auto-compression-mode)
 (auto-encryption-mode)
 (auto-fill-mode)
 (auto-revert-mode)
 (auto-save-mode)
 (blink-cursor-mode)
 (column-number-mode)
 (company-mode)
 (delete-selection-mode)
 (diff-auto-refine-mode)
 (eldoc-mode)
 (electric-indent-mode)
 (file-name-shadow-mode)
 (font-lock-mode)
 (global-company-mode)
 (global-eldoc-mode)
 (global-font-lock-mode)
 (global-git-commit-mode)
 (global-linum-mode)
 (global-magit-file-mode)
 (helm-mode)
 (hl-line-mode)
 (hs-minor-mode)
 (line-number-mode)
 (linum-mode)
 (magit-auto-revert-mode)
 (magit-file-mode)
 (menu-bar-mode)
 (mouse-wheel-mode)
 (paredit-mode)
 (shell-dirtrack-mode)
 (show-paren-mode)
 (spaceline-helm-mode)
 (tooltip-mode)
 (transient-mark-mode)
 (whitespace-mode))
(disabled-minor-modes
 (abbrev-mode)
 (auto-fill-function)
 (auto-image-file-mode)
 (auto-revert-tail-mode)
 (auto-save-visited-mode)
 (buffer-read-only)
 (cl-old-struct-compat-mode)
 (company-search-mode)
 (compilation-in-progress)
 (compilation-minor-mode)
 (compilation-shell-minor-mode)
 (completion-in-region-mode)
 (defining-kbd-macro)
 (diff-minor-mode)
 (dired-hide-details-mode)
 (display-time-mode)
 (edebug-mode)
 (electric-layout-mode)
 (electric-pair-mode)
 (electric-quote-mode)
 (flycheck-mode)
 (git-commit-mode)
 (global-auto-revert-mode)
 (global-flycheck-mode)
 (global-hl-line-mode)
 (global-prettify-symbols-mode)
 (global-visual-line-mode)
 (global-whitespace-mode)
 (global-whitespace-newline-mode)
 (helm--minor-mode)
 (helm--remap-mouse-mode)
 (helm-autoresize-mode)
 (helm-ff--delete-async-modeline-mode)
 (helm-migemo-mode)
 (helm-popup-tip-mode)
 (horizontal-scroll-bar-mode)
 (html-autoview-mode)
 (ido-everywhere)
 (isearch-mode)
 (jit-lock-debug-mode)
 (js2-highlight-unused-variables-mode)
 (js2-minor-mode)
 (js2-refactor-mode)
 (magit-blame-mode)
 (magit-blame-read-only-mode)
 (magit-blob-mode)
 (magit-popup-help-mode)
 (magit-wip-after-apply-mode)
 (magit-wip-after-save-local-mode)
 (magit-wip-after-save-mode)
 (magit-wip-before-change-mode)
 (mail-abbrevs-mode)
 (mml-mode)
 (multiple-cursors-mode)
 (next-error-follow-minor-mode)
 (overwrite-mode)
 (paragraph-indent-minor-mode)
 (prettify-symbols-mode)
 (racket-check-syntax-mode)
 (rectangle-mark-mode)
 (server-mode)
 (sgml-electric-tag-pair-mode)
 (sh-electric-here-document-mode)
 (shell-command-with-editor-mode)
 (show-smartparens-global-mode)
 (show-smartparens-mode)
 (size-indication-mode)
 (smartparens-global-mode)
 (smartparens-global-strict-mode)
 (smartparens-mode)
 (smartparens-strict-mode)
 (smerge-mode)
 (spaceline-info-mode)
 (temp-buffer-resize-mode)
 (tool-bar-mode)
 (unify-8859-on-decoding-mode)
 (unify-8859-on-encoding-mode)
 (url-handler-mode)
 (use-hard-newlines)
 (view-mode)
 (visible-mode)
 (visual-line-mode)
 (which-function-mode)
 (whitespace-newline-mode)
 (window-divider-mode)
 (winner-mode)
 (with-editor-mode)
 (xref-etags-mode)
 (yas-global-mode)
 (yas-minor-mode))
@greghendershott
Owner

One quick thought I have would be to try with the latest from MELPA. (racket--source-dir "/Users/kyushu/.emacs.d/elpa/racket-mode-20180827.1303/") suggests you have commit ec35502 from Aug 27. There have been more commits, some of which have changed how Emacs talks to the back end Racket process.

On the one hand that's the general area I would expect could cause the kind of "Emacs freeze" you describe.

On the other hand, none were intended to fix that problem. (I haven't seen that myself, and I leave racket-repl-mode open for days/weeks at a time. I'm sure you're seeing it! I'm just saying I didn't know this was a problem so didn't try to fix it.)

On the third hand, maybe one change did so inadvertently.

I'll mull this over but in the meantime if it's not too inconvenient please try the latest?

Note: It might be slightly inconvenient because you might need to restart Emacs. (Due to issue #327 changing the back end startup "API". And due to Emacs package updates not necessarily "refreshing" already-loaded Emacs code.)

@greghendershott
Owner

In terms of gathering more data:

  • Seeing CPU or other stats for the emacs and racket processes might be helpful.

  • Normally I'd suggest enabling emacs debug-on-error and giving me the resulting info. Unfortunately I don't know good things to do when emacs is frozen.

@DarrenN
Author

DarrenN commented Nov 25, 2018

Ah good call on updating racket-mode, forgot that I hadn't in a while on this laptop, currently on:

 (racket--el-source-dir "/Users/kyushu/.emacs.d/elpa/racket-mode-20181117.229/")
 (racket--rkt-source-dir "/Users/kyushu/.emacs.d/elpa/racket-mode-20181117.229/racket/")

Will take it for a spin and report back.

@greghendershott
Owner

Hopefully a spin not a spinlock. kick snare cymbal

@alex-hhh

I noticed a similar problem, but running on Windows: sometimes, when the laptop comes out of sleep, the Racket process for the REPL uses 100% CPU. Emacs is however responsive and I can just kill the REPL buffer and start fresh. This does not happen every time the laptop is resumed from sleep, so I don't actually know what the trigger is.

I had this problem for several Racket versions and several racket-mode versions (I update regularly) -- it is something I learned to live with, as it is not a major inconvenience.

When this happens, I sometimes get a message in the REPL about the event space being unexpectedly closed, so I suspect it has something to do with the GUI libraries. Unfortunately I don't use DrRacket often enough to know if it suffers from the same problem...

@DarrenN
Author

DarrenN commented Dec 6, 2018

Okay, following up. For the most part updating racket-mode to latest has helped a lot. Ran into this today. Opened a scratch buffer and set it to racket-mode, with no running REPL. Typed in some code and then Emacs froze with the following in the messages:

Checking Racket version ...
Starting racket to run /Users/daz/.emacs.d/elpa/racket-mode-20181117.229/racket/run.rkt ...
Still trying to connect to racket-command process on port 55555 ... [468004 times]
Company: An error occurred in auto-begin
Company: backend company-capf error "Could not connect to racket-command process on port 55555" with args (candidates #la)
user-error: Minibuffer window is not active
Connected to racket-command<1> process on port 55555 after 1 attempt(s)
Connected to racket-command<2> process on port 55555 after 1 attempt(s)
Connected to racket-command process on port 55555 after 2 attempt(s)

Emacs eventually recovered after hanging for about a minute. Unclear if the problem was with trying to connect to Racket or some interplay with company-mode and racket-mode.

greghendershott pushed a commit that referenced this issue Jan 16, 2019
I noticed when testing on Windows, that sometimes the connection would
not succeed, and furthermore get stuck in a state where
racket--cmd-connecting-p was left non-nil.

This commit fixed that for me.

Although this might have some bearing on issue #344 and issue #348, I
don't have any reason to think this fixes those.
@greghendershott
Owner

Thanks! I can reproduce this specific situation with both conditions:

  • Using the *scratch* buffer -- or any buffer not associated with a file.
  • The *Racket REPL* buffer doesn't exist with a live process.

There is code that avoids starting the Racket REPL solely for completion candidates (instead it defaults to names we font-lock). However it's getting fooled when the edited buffer lacks any file, and there is no file associated with the REPL; both are nil, and Emacs Lisp silently says (string-equal nil nil) is t rather than raising an error. So the code thinks "oh the buffer has a file and it is live in the REPL, let's start sending commands for completion". Exactly opposite.
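
The behavior described above can be checked in isolation. `string-equal` accepts symbols and compares their print names, so nil is treated as the string "nil". The following is only a minimal sketch: `my/same-file-p` is a hypothetical helper invented for illustration, not racket-mode's actual code or the fix that was made.

```elisp
;; `string-equal' accepts symbols and uses their print names,
;; so nil is compared as the string "nil".
(string-equal nil nil)     ; non-nil: "nil" equals "nil"
(string-equal nil "nil")   ; also non-nil, for the same reason

;; Hypothetical guard (not from racket-mode): report a match only
;; when both arguments really are strings.
(defun my/same-file-p (a b)
  "Return non-nil only if A and B are both strings and are equal."
  (and (stringp a) (stringp b) (string-equal a b)))

(my/same-file-p nil nil)                        ; nil: neither is a string
(my/same-file-p "/tmp/foo.rkt" "/tmp/foo.rkt")  ; t
```

With a guard of this shape, two buffers that both lack a file are no longer mistaken for "the same file".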

That is a simple fix, which I just made locally, but haven't yet pushed.


I'm a little freaked out by this part of your output:

Connected to racket-command<1> process on port 55555 after 1 attempt(s)
Connected to racket-command<2> process on port 55555 after 1 attempt(s)
Connected to racket-command process on port 55555 after 2 attempt(s)

There is no way that three racket command processes ought to be created. So I need to investigate that.

@DarrenN
Author

DarrenN commented Jan 17, 2019

Thanks for the follow up @greghendershott

I ran into this yesterday at work trying to help a colleague debug something. Got the Connected to racket-command process on port 55555 after 1 attempt(s) and it hung for a couple of minutes.

Looking forward to the release with the fix, thanks again!

@greghendershott
Owner

You're welcome. Caveat: This fix is very specific to the situation where you go to some buffer like *scratch* that isn't associated with any file and M-x racket-mode, and there isn't any REPL, and you start typing and company-mode kicks in.

So if you discover some other failure mode(s), (a) sorry! and (b) please let me know.

@octplane

octplane commented Feb 3, 2019

Hello, stumbled on this bug. I have version racket-mode-20181206.329 installed and I can crash the whole thing by:

  • start emacs in a Racket project
  • open a rkt file
  • jump to definition of an object using the "jump to definition" call (SPC-m-g-g in my case)
  • I then have Emacs hanging and trying to connect to Racket for what seems to be too long for me...

As a side note, I'm a big emacs newbie and currently use Spacemacs 0.200.13@26.1.

@11111000000

Same: "Still trying to connect to racket-command process on port 55555" on company completion (disabling company-mode prevents this).

@11111000000

Please reopen this, I use the latest racket-mode from git!

@greghendershott
Owner

@11111000000 FWIW that sounds closer to #359 i.e. company-mode related. Admittedly, that's also closed, and apparently the commit closing that isn't helping you now.

@octplane Possibly same in your case. IIRC company-mode would be one of the (many) things enabled by spacemacs by default.

Also related is #349 from mid-January tagged help-wanted, but I've been super-busy since then, apparently like everyone else. 😄

I will try to dig into this soon. If I don't have time to figure it out for real, maybe I could at least put a Band-Aid on like setting the company-mode popup delay to the value that means "never popup automatically, I press a special keybinding when I want you".
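
As a rough illustration of that Band-Aid from the user's side: in company-mode, setting `company-idle-delay` to nil disables idle (automatic) completion, and `company-complete` can then be invoked by hand. The key chosen below is an arbitrary assumption for the example, and this is a user-config sketch, not something racket-mode itself sets.

```elisp
(with-eval-after-load 'company
  ;; nil means "no idle completion": the popup never appears on its own.
  (setq company-idle-delay nil)
  ;; Assumed key binding, chosen only for this example.
  (define-key company-mode-map (kbd "C-c /") #'company-complete))
```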

@greghendershott
Owner

greghendershott commented Apr 3, 2019

Also see #348 -- and could you try this to see if it helps you, too? Please let me know?

@greghendershott
Owner

p.s. I don't think you need to blow away your entire packages folder! Instead this:

Update

Be aware that an Emacs package update doesn't necessarily fully update Emacs' state. An example symptom is an "invalid function" error message. You might need to restart Emacs. In some cases, you might even need to:

  1. Uninstall racket-mode
  2. Exit and restart Emacs
  3. Install racket-mode

@greghendershott
Owner

@11111000000 If that "deep update" process doesn't help, can you please give a step-by-step recipe so I can reproduce? Also M-x racket-bug-report will let me see some Emacs vars that might have an impact.

I ask because I just tried this:

  1. Start Emacs
  2. M-x global-company-mode to enable it (I have it disabled by default).
  3. C-x C-f /tmp/foo.rkt. New, blank .rkt file.
  4. Type "def" and pause.

Result: Company pops up with a list of completion candidates. (Pick one, all is well.)

Note that ever since commit b0296d9 in July 2018, in the above steps it is not even necessary for the racket-run to happen automatically. We just suggest font-lock keywords as candidates. So it wouldn't show the "Still trying to connect to racket-command process on port 55555" message you mentioned.

You must be doing something at least slightly different? If so, please tell me.

Meanwhile I'll try some other steps to see if I can stumble across it before you can tell me....

@greghendershott
Owner

p.s. My company-mode config in init.el is:

(use-package company
  :ensure t
  :defer t
  :diminish company-mode
  ;; The following for testing racket-mode/issues/318
  :config
  (setq company-idle-delay 0.1
        company-minimum-prefix-length 2
        company-tooltip-align-annotations t
        company-show-numbers t
        company-require-match nil))

How about you?

Also can you give me the M-x racket-bug-report?

Thanks.

@alex-hhh

alex-hhh commented Apr 5, 2019

I just had this problem happen to me while trying to run a simple program:

#lang racket
(define x 1)

When running the above using racket-run (F5), with a REPL which was open and responsive, I got the "Still trying to connect..." message and had to break with C-g. This happened repeatedly: the REPL was responsive (I could type things into it and get responses), but running the program caused the "Still trying to connect..." errors. I tried to debug racket--cmd-connect-finish using Edebug, but the problem disappeared and I could no longer reproduce it. I did not restart Emacs during this time.

Also, I am not using company-mode and don't have it installed.

I suspect there might be a race condition somewhere, but not sure where. If I manage to reproduce it again, I will attempt to debug it further...

@greghendershott
Owner

greghendershott commented Apr 5, 2019

@alex-hhh That jogs my memory: I was using racket-mode heavily on Windows about a month ago, and ISTR this happened once. But not thereafter. 😞 At the time I was trying to fix some other issues on Windows, and forgot.

The proximate cause might be that racket--cmd-connecting-p gets stuck with a t value. e.g. If this happens again, and you try M-: and enter (setq racket--cmd-connecting-p nil) it might then behave fine. But even if that's true, the real question is how it gets stuck in that state.

greghendershott pushed a commit that referenced this issue Apr 6, 2019
I noticed when testing on Windows, that sometimes the connection would
not succeed, and furthermore get stuck in a state where
racket--cmd-connecting-p was left non-nil.

This commit fixed that for me.

Although this might have some bearing on issue #344 and issue #348, I
don't have any reason to think this fixes those.

-----------------------------------------------------------------

PROVENANCE: Today I noticed commit 98bb9c7 and commit cedf4ba linked
to from the above issues. But I do not have either commit in any of my
local repos, not even according to `git fsck --lost-found`. WAT.

I suspect what happened was that I made a topic branch and pushed a
commit, which is why GitHub got it (and apparently will keep it
indefinitely because of the link from comments). But then I deleted
the branch without merging. Or possibly that commit was rebased away
before merging the branch.

Probably this happened on a new Windows laptop. In my defense,
mid-January I was moving between Mac and Windows, as part of an effort
to improve racket-mode support on the latter. And then, a Linux laptop
got added to the mix, as well.

In any case I'm going to recover and use this commit, now. The bit
above the horizontal line is the commit message.
greghendershott pushed a commit that referenced this issue Apr 7, 2019
Goal: Fix any remnants of issue #344.

- Simplify code paths by relying on sentinels to do cleanup
  when the process is deleted or closed.

- Analyze the remaining paths to ensure that racket--cmd-connecting-p
  is set t during a series of connection attempts, and, set back to
  nil after such a series has completed (whether succeeded, failed, or
  C-g by the user).

- Break out the guts of racket--cmd-connect-start to a helper
  racket--cmd-connect-attempt. This makes it clearer that the former
  does checks needed at the start of the series, and the latter is
  called at intervals via run-at-time to "run in the background".

- Rather than setting racket--cmd-proc back to nil, leave it set to
  the last value, and use things like `process-status` to determine if
  it is 'open. Provide a little helper for this: racket--cmd-open-p.

- Add more (and hopefully better) comments and doc strings.
@DarrenN
Copy link
Author

DarrenN commented Apr 7, 2019

@alex-hhh thanks for the tip on C-g to break out of the REPL. This happened to me yesterday and I was able to break out and keep going without needing to shut down Emacs. Luckily I haven't had this happen too often lately.

@greghendershott
Owner

@DarrenN I just pushed a commit 72a54ad (to a topic branch, not yet to master) adding a reminder.


I've been spending a lot of time this weekend trying to understand and eliminate this issue. (I'd find this easier to do with Racket's concurrency primitives and tools, but I'm doing the best I can with Emacs Lisp.)

e.g. commit 82a381d tries to simplify the code paths, and make sure all handle things correctly.

And commit 7020209 tries to narrow it down even more.

I've done a little testing on Windows -- which is the only place I've experienced this heisenbug, even rarely -- and also on Linux, including putting intentional timing delays in certain places to try to flush it out more readily. Obviously not going to make any predictions about the success, yet.

@greghendershott
Owner

Still working on this. I'm starting to think that the root problem is that, when something needs to send a command, currently we try to helpfully and automagically start the REPL backend and connect to its command server.

This creates code paths like:

  • "Oh, maybe the REPL is still starting up, and we're still trying to connect to its command server -- so maybe I should wait here for it to finish." i.e. where some people are needing to C-g.

  • "I'm confused whether such startup is already underway, so I start it again, with possibly hilarious results"

  • etc.

I'm starting to think the best thing to do is just issue an error: Tell the user, "Yeah, sorry, can't do that. You need to start the REPL first." Or sometimes, "Yeah, sorry, although the REPL is live, the command server doesn't seem to be. Looks like you'll need to restart the REPL."

That might flush out some new set of bugs or annoyances -- but I think they'll be deterministic bugs and annoyances.

So I'll work on that as I have time...

@greghendershott
Owner

Found more hours to work on this. Same vein as above. WIP commits on a topic branch command-server-startup ICYI.

@alex-hhh

Hi @greghendershott,

I had a brief look at the code, as it is on the master branch. In racket--cmd-connect-start, the callback to make-network-process handles all unknown events by calling racket--cmd-disconnect -- but if this happens, racket--cmd-connect-finish will loop forever. As far as I can tell from the code, there will be no indication that the connection attempt has been aborted.

Also, given that this issue cannot be reproduced, perhaps it would be better to just update the code to add some more logging, and ask the users who encounter this issue to report the contents of their *Messages* buffer. I would suggest the following logging:

  • in racket--cmd-connect-start, add a log message that an unknown event has been received by the make-network-process callback, including the name of the event.
  • in the racket--cmd-connect-finish:
    • Add an assert that racket--cmd-connecting-p is t when the function is called
    • The while loop just waits for the variable racket--cmd-proc to become non-nil. Inside the loop, I would add code to look for a process named "racket-command.*", using process-list and process-name, and if found report its status, and if not found report that as well. This way we'll know if there is a connection being attempted.
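
A minimal sketch of the process-lookup logging suggested in the last bullet, using only the built-in `process-list`, `process-name`, and `process-status`. The function name is hypothetical and the code is not part of racket-mode; it only illustrates what such a diagnostic might look like.

```elisp
;; Hypothetical diagnostic helper; not part of racket-mode.
(defun my/racket-command-process-report ()
  "Log the status of each process whose name starts with \"racket-command\".
Return the number of matching processes."
  (let ((count 0))
    (dolist (proc (process-list))
      (when (string-match-p "\\`racket-command" (process-name proc))
        (setq count (1+ count))
        (message "racket-command process %s has status %s"
                 (process-name proc) (process-status proc))))
    (when (zerop count)
      (message "No racket-command process found"))
    count))
```

Calling this from inside the wait loop would leave a trail in *Messages* showing whether a connection attempt was actually in flight.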

@greghendershott
Owner

greghendershott commented Apr 10, 2019

Hi @alex-hhh, thank you for taking a look at this. I think you're right that's one problem with the master code.

Another problem is that anyone or anything could delete the racket-command process-buffer, and therefore also the process, and the code will be confused.

What I've been working on goes a little further. I have about 21 commits where I've been simplifying and changing things one step at a time, and leaving breadcrumbs.

A quick summary:

  • Get rid of racket--cmd-connecting-p. In hindsight I think it's a smell -- having some redundant flag be mutated, when instead we could query the process-status and have the process sentinel handle all the cases properly. If it doesn't exist, it can't get out of sync.

  • Remove some "ensure X" flavor code. Either X is live and can be used, or isn't and must be created. Make that explicit.

  • If we need to send a command, and the command server isn't available, immediately error. (Although, I did add back the "waiting for command server" behavior as an option that is off by default.)

  • Wait for Racket and our run.rkt backend to load, before first attempting to connect to the command server. Although I think that's n/a for this issue, it's part of the simplification.
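
The first and third points can be sketched roughly as follows. All names here are hypothetical stand-ins (the real code uses racket--cmd-proc and racket--cmd-open-p), and the string sent on the wire is a placeholder, not racket-mode's actual protocol.

```elisp
;; Hypothetical names; a sketch of "query the process, don't keep a flag".
(defvar my/cmd-proc nil
  "The most recent command-server network process, or nil if none yet.")

(defun my/cmd-open-p ()
  "Return non-nil if `my/cmd-proc' is a process whose status is `open'."
  (and (processp my/cmd-proc)
       (eq (process-status my/cmd-proc) 'open)))

(defun my/cmd-send (command)
  "Send COMMAND, or signal a `user-error' at once if the server is unavailable."
  (unless (my/cmd-open-p)
    (user-error "Command server not available; (re)start the REPL"))
  ;; Placeholder wire format, assumed for illustration only.
  (process-send-string my/cmd-proc (format "%S\n" command)))
```

Because the only state is the process object itself, there is no separate flag that can drift out of sync with it.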

@11111000000

11111000000 commented Apr 10, 2019

@greghendershott Now, there is no issue here for me.
Since I switched my NixOS to an unstable branch, where M-x version says: GNU Emacs 26.1 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.5) of 2019-03-30, and also completely reinstalled the MELPA packages, this error is gone! Now I am actively developing an application in Racket every day, and everything works fine. I was still waiting for the error to reproduce so I could dive into debugging it, but it is gone. By the way, I want to say thank you for such a wonderful mode for Emacs! Now it is much better than Geiser!

greghendershott pushed a commit that referenced this issue Apr 15, 2019
greghendershott pushed a commit that referenced this issue Apr 15, 2019
greghendershott pushed a commit that referenced this issue Apr 16, 2019
greghendershott pushed a commit that referenced this issue Apr 16, 2019
greghendershott pushed a commit that referenced this issue Apr 16, 2019
People continue to report symptoms like issue #344. In some cases it
turned out people needed to update to the latest racket-mode. However
at least a couple users on very recent versions reported still seeing
the problem intermittently.

So:

- Simplify code paths by relying on sentinels to do cleanup when the
  process is deleted or closed.

- Rename racket--repl-ensure-buffer-and-process to racket--repl-start,
  and, it is an error to call it when (racket--repl-live-p) is true.

- Rather than setting racket--cmd-proc back to nil, leave it set to
  the last value, and use things like `process-status` to determine if
  it is 'open. Provide a little helper for this: racket--cmd-open-p.

- Wait to start command connect; eliminate racket--cmd-connecting-p
  state variable.

  Until we believe the Racket backend has finished starting up and the
  TCP server would be listening, there is really no point in starting
  to attempt to connect. Ideally we could know this from a process
  sentinel, but there is no appropriate event for a comint buffer
  process. Instead use a comint output filter function to detect first
  output, e.g. the Welcome to Racket banner. This uses run-at-time to
  schedule racket--cmd-connect-attempt (should not do directly from a
  filter function, IIUC). As a result, there is no longer any
  racket--cmd-connect-start function. Now, we only try to connect to
  the command server when we first start the REPL process. If the
  command connection dies thereafter, but the REPL is still alive?
  Tough beans, users will need to restart the REPL process.

  Also as a result, I convinced myself that the
  racket--cmd-connecting-p flag/guard is no longer necessary. Which is
  good, because it is tricky to make all the code paths manage that
  properly. I _think_ I finally got that right in a WIP commit. But
  it's fragile. Some other change, someday, could break this again.
  Better users get a "can't talk to command server" error, and need to
  restart the REPL, than have heisenbugs and reports thereof.

  Note: As before, racket--cmd-connect-attempt will run-at-time itself
        to retry: We wait until the TCP command server could
        _possibly_ be ready -- but it's not necessarily ready yet.
        Although it will probably succeed on the first try, in fact
        I've seen it occasionally take two attempts while testing this
        on a laptop/OS that does the startup very quickly.

- racket-repl-exit: Use comint-kill-subjob when command server dead

- Add defcustom racket-command-startup.

  - When nil (default): We do not try to start the REPL or the
    command server automatically; instead we give the user an error
    message explaining they need to (re)start racket-repl-mode.

  - When a positive number, we do try to start things automatically
    and we wait that number of seconds for the command server to
    connect. While waiting, we remind the user they can C-g to quit
    waiting.
@greghendershott
Owner

@11111000000 Glad to hear that.

@DarrenN and @alex-hhh If you still encounter this, you might want to try upgrading to use my latest commit 1e46dc5, which is a squash of the 20+ commits I worked on over the past week, and just merged to master. As usual, it will take an hour or two for MELPA to update to reflect that.

This adds a defcustom, racket-command-startup. It defaults to nil, which is super conservative. If you do something that needs to send a command, but the REPL and command server aren't both up, you get an error message telling you to (re)start the REPL. OR, if you set it to something like 10 or 15, that's equivalent to the old behavior -- except I think you won't encounter the old problems. And if you do, the "waiting to connect" prompt reminds you that you can C-g to quit waiting. So, it's kind of "defense in depth".
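
For reference, the two settings described above might look like this in an init file. The value 10 is just the example figure mentioned in the comment, not a recommendation.

```elisp
;; Default, conservative behavior: never start the REPL or command
;; server implicitly; show an error asking the user to (re)start the REPL.
(setq racket-command-startup nil)

;; Alternative: start things automatically and wait up to 10 seconds
;; for the command server to connect (C-g quits the wait).
;; (setq racket-command-startup 10)
```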
