Skip to content
This repository was archived by the owner on Jun 30, 2025. It is now read-only.

Conversation

@juliangruber
Copy link
Member

There are some issues with the zinnia loop end logic. See for example these logs from core-fly:

Mar 26 15:03:07 core-fly [spark]  ⇣ checking for updates
Mar 26 15:03:08 core-fly [voyager]  ⇣ checking for updates
Mar 26 15:03:08 core-fly [spark]  ⇣ checking for updates
Mar 26 15:03:08 core-fly [voyager]  ⇣ checking for updates
Mar 26 15:03:08 core-fly [voyager]  ✓ no update available
Mar 26 15:03:08 core-fly [spark]  ✓ no update available
Mar 26 15:03:08 core-fly [spark]  ✓ no update available
Mar 26 15:03:08 core-fly [voyager]  ✓ no update available

We can see that 2 update loops were running at once, which means that in a previous run() invocation the loop should have been aborted, but wasn't.

This PR fixes this by only using one method: controller.abort() for aborting, and controller.signal.aborted for checking if it was aborted. Previously, shouldRestart.get() was used in conjunction with the AbortController, and in some places only one was checked or set.

Should fix issues like https://filecoinproject.slack.com/archives/C03S6LXSRB8/p1711639505087089. This has also been observed on core-fly:

Mar 26 15:13:09 core-fly [3/26/2024, 2:13:09 PM] INFO  Updated Zinnia module source code, restarting...

(no logs following)

@juliangruber juliangruber requested a review from bajtos April 2, 2024 12:35
@juliangruber juliangruber changed the title Fix abort errors by simplifying state Fix main loop not ending properly Apr 2, 2024
waitForStdioClose({ childProcesses, controller })
])

if (shouldRestart.get()) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could remove this condition because here we know we always want to restart

Copy link
Member

@bajtos bajtos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice cleanup!

:shipit:

@juliangruber juliangruber merged commit 78b1cd7 into main Apr 2, 2024
@juliangruber juliangruber deleted the fix/abort-handling branch April 2, 2024 13:06
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants