Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix fault in Octoscreen causing issues with our Octoprint enabled printers #1325

Open
jackie1050 opened this issue Nov 25, 2019 · 7 comments

Comments

@jackie1050
Copy link
Contributor

@jackie1050 jackie1050 commented Nov 25, 2019

An error in the Octoprint? software means that the printer screens are displaying an error message (the error is due to opening threads and not closing them).
Apparently the CR-10S can be used if turned off and on again and the others need to be stopped and started in the app.(@mattcroughan is going to make some temporary labels explaining what to do in the short term).

@MatthewCroughan MatthewCroughan changed the title Fix threading fault in Octoprint software for Creality printers Fix fault in Octoscreen causing issues with our Octoprint enabled printers Nov 25, 2019
@MatthewCroughan

This comment has been minimized.

Copy link

@MatthewCroughan MatthewCroughan commented Nov 25, 2019

This issue is referenced here, already being worked on. It's not a fault with octoprint, but with Octoscreen.

Z-Bolt/OctoScreen#56

@MatthewCroughan

This comment has been minimized.

Copy link

@MatthewCroughan MatthewCroughan commented Nov 25, 2019

Octoscreen tells Octoprint to automatically connect to the printer every 5 seconds or so. This causes a memory leak to occur as the threads that do this are not being closed. This then means the connections are still open to Octoprint's api, thus running Octoprint out of memory eventually, since it cannot handle connections from that many clients. Octoscreen is written in golang, which I don't understand fully yet, though am working on thanks to this issue. Maybe @paulfurley could take a quick look at this?

I believe the problem is in the threading logic in common.go:

https://github.com/Z-Bolt/OctoScreen/blob/fc33e3a0f986cef3105fbada95870bcdb9b54b61/ui/common.go#L163

func (t *BackgroundTask) execute() {
	_, err := glib.IdleAdd(t.task)
	if err != nil {
		log.Fatal("IdleAdd() failed:", err)
	}
}

Or from the function verifyConnection() in ui.go:

https://github.com/Z-Bolt/OctoScreen/blob/fc33e3a0f986cef3105fbada95870bcdb9b54b61/ui/ui.go#L121

func (ui *UI) verifyConnection() {

	ui.sdNotify("WATCHDOG=1")

	newUiState := "splash"
	splashMessage := "Loading..."

	s, err := (&octoprint.ConnectionRequest{}).Do(ui.Printer)
	if err == nil {
		ui.State = s.Current.State
		switch {
		case s.Current.State.IsOperational():
			newUiState = "idle"
		case s.Current.State.IsPrinting():
			newUiState = "printing"
		case s.Current.State.IsError():
			fallthrough
		case s.Current.State.IsOffline():
			if err := (&octoprint.ConnectRequest{}).Do(ui.Printer); err != nil {
				newUiState = "splash"
				splashMessage = fmt.Sprintf("Error connecting to printer: %s", err)
			}
		case s.Current.State.IsConnecting():
			splashMessage = string(s.Current.State)
		}
	} else {
		if time.Since(ui.t) > errMercyPeriod {
			splashMessage = ui.errToUser(err)
		}

		newUiState = "splash"
		Logger.Debugf("Unexpected error: %s", err)
	}

	defer func() { ui.UIState = newUiState }()

	ui.s.Label.SetText(splashMessage)

	if newUiState == ui.UIState {
		return
	}

	switch newUiState {
	case "idle":
		Logger.Info("Printer is ready")
		ui.Add(IdleStatusPanel(ui))
	case "printing":
		Logger.Info("Printing a job")
		ui.Add(PrintStatusPanel(ui))
	case "splash":
		ui.Add(ui.s)
	}
}

Importantly, verifyConnection() is called by ui.update:
https://github.com/Z-Bolt/OctoScreen/blob/fc33e3a0f986cef3105fbada95870bcdb9b54b61/ui/ui.go#L80

	ui.b = NewBackgroundTask(time.Second*2, ui.update)

This spawns a new background process every two seconds for this, which is always ran regardless of if one is still in the background. I just don't know enough about golang to have this not occur.

Z-Bolt/OctoScreen#56

@ajlennon

This comment has been minimized.

Copy link
Contributor

@ajlennon ajlennon commented Nov 25, 2019

So you think this is a problem in the underlying Golang framework? We have a chap coming up at the weekend who might also be able to advise if so

@MatthewCroughan

This comment has been minimized.

Copy link

@MatthewCroughan MatthewCroughan commented Nov 25, 2019

@ajlennon The problem could be fixed in Octoprint by having it close connections on clients that have been connected for too long a period of time, or by catching the MemoryError exception in Tornado and at the very least crashing, if not clearing up old client connections. foosel/OctoPrint#3343

In Go, as far as I can tell, common.go is written by the author of a particular thing to act as a toolset. So the threading logic is either copy and pasted from somewhere, or improperly written by the author if this is a problem. Although, maybe the thinking is that verifyConnection() is supposed to close the thread after it's ran? Or at the very least check if a thread is open and close if if so. I just don't know enough to know where the problem is in that regard. I wouldn't call it a problem with the framework without knowing for sure.

Would like to see someone with more knowledge fix the issue so I can learn from them, so whoever is coming in, let me know when they are so I can be around!

@ajlennon

This comment has been minimized.

Copy link
Contributor

@ajlennon ajlennon commented Nov 25, 2019

Connections should time out.

@MatthewCroughan

This comment has been minimized.

Copy link

@MatthewCroughan MatthewCroughan commented Nov 25, 2019

@ajlennon I haven't observed connections timing out. Octoprint's log doesn't suggest there is any timing out occuring, although it does tell me that when haproxy fails to serve a client, for example when connecting to the web interface. The REST api that Octoprint implements may not for some reason time clients out. I don't know enough about how APIs are implemented to know if that's the case, but all I observe is that threads are allowed to endlessly accumulate in Octoscreen, along with the number of outgoing TCP connections until Octoprint's usage of Tornado crashes with MemoryError. That exception is caught instead of crashing, which means nothing is capable of restarting Octoprint after the fact, such as systemd or docker. That has been noted in the Octoprint issue.

@MatthewCroughan

This comment has been minimized.

Copy link

@MatthewCroughan MatthewCroughan commented Nov 25, 2019

There is also this issue from 2018 which goes into some more detail about the same sort of thing I discovered.

mcuadros/OctoPrint-TFT#15

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.