Test sending big files (>4MB) to Wanhao #4

Closed
peteruithoven opened this issue Apr 1, 2016 · 21 comments

@peteruithoven
Member

Perform the same test as #3 but with a Wanhao printer.

@olijf

olijf commented Apr 1, 2016

During the heating process a lot of gcode is sent at once. Due to this the buffer size drops to 1580 just before printing starts. Once the printer has started printing the gcode buffer goes up and a lot of space is freed.

@olijf

olijf commented Apr 1, 2016

Print 1 0.10.10-c

The Wanhao stopped extruding and showed Heater ERROR 4# on the display. I stopped the print, but the Wanhao printer hung during the stopping procedure.
Included are the logs.
Firmware version: latest beta 0.10.10-c

Will test again to see if the Wanhao works again.
Wanhao log.zip

@olijf

olijf commented Apr 1, 2016

print 2 0.10.10-c

On the second occasion the print was successful. Only once it dropped to 1280 again during preheating.

Wanhao Print 2.zip

@olijf

olijf commented Apr 4, 2016

print 3 using tablet 0.10.10-c

Wanhao tablet print 1.zip

The Wanhao print from my tablet stopped suddenly. I used ADB to remotely log the web console through the chrome://inspect page. Investigating the RAM usage does not yield much, except that it went below the 2000KB threshold 5 times. I am currently testing whether the Ultimaker hangs as well.

@peteruithoven
Member Author

peteruithoven commented May 2, 2016

With release 0.10.10-d and 0.10.10-e we should try this again. Probably 3 times to be sure.
Since Doodle3D/print3d#44 seems like an issue we can't fix in the short term, let's exclude it by using a USB hub for these tests.

@olijf

olijf commented May 2, 2016

0.10.10-e Nexus 9 print 1

Print Successful
0.10.10-e test 1 nexus 9 wanhao.zip

@olijf

olijf commented May 2, 2016

0.10.10-e Nexus 9 print 2

Stopped the print because filament was stuck
After stopping the print I got a lot of disconnect errors.
0.10.10-e test 2 nexus 9 wanhao.zip

@olijf

olijf commented May 2, 2016

0.10.10-e Nexus 9 print 3

Got a lot of AJAX disconnect errors during preheating. Did not send the doodle correctly to the printer. Printing stopped once the print buffer was empty.
0.10.10-e test 3 nexus 9 wanhao.zip

@olijf

olijf commented May 2, 2016

0.10.10-e Nexus 9 print 4

Same: the doodle did not print.
0.10.10-e test 4 nexus 9 wanhao.zip

@olijf

olijf commented May 2, 2016

0.10.10-e Nexus 9 print 5

After a full reset of the wifibox (including firstboot) I performed the same test but this time the print was also unsuccessful.
0.10.10-e test 5 nexus 9 wanhao.zip

Also, I checked the running processes of the other tests. It seems that uhttpd is forked many times but the forks never get shut down correctly. This is something I also discovered during my stress testing of the wifibox. I am still investigating why this happens. Increasing max_requests in the uhttpd config seems to fix this for now.
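
A minimal sketch of how the leftover uhttpd processes could be counted from a captured `ps` listing, so the number can be compared between tests. The sample listing and the helper name `count_processes` are made up for illustration, and the column layout assumes BusyBox-style `ps` output.

```python
def count_processes(ps_output: str, name: str) -> int:
    """Count rows of a `ps` listing whose command is `name`."""
    count = 0
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 4)  # assumed columns: PID USER VSZ STAT COMMAND
        if len(fields) < 5:
            continue
        executable = fields[4].split()[0]
        # Match "uhttpd" as well as "/usr/sbin/uhttpd".
        if executable.rsplit("/", 1)[-1] == name:
            count += 1
    return count


# Hypothetical listing, not taken from the attached logs.
SAMPLE_PS = """\
  PID USER       VSZ STAT COMMAND
    1 root      1500 S    /sbin/procd
  812 root      1716 S    /usr/sbin/uhttpd -f -h /www -p 0.0.0.0:80
  955 root      1716 S    /usr/sbin/uhttpd -f -h /www -p 0.0.0.0:80
  960 root      1364 S    /bin/ash
"""

print(count_processes(SAMPLE_PS, "uhttpd"))
```

Running this on a listing saved before and after each print would show whether the number of uhttpd processes keeps growing.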

@woutgg

woutgg commented May 3, 2016

In all four failed tests, printer/print requests start failing sooner or later (as early as the 18th chunk up until the 70th). Sometimes one or two more trickle through but no more. When this happens, status/info requests also start failing almost all the time (~95%?).
Only in the second failed test were errors other than AJAX failures logged (mainly net::ERR_CONNECTION_REFUSED).

Except in the first failed test, the last chunk to arrive intact at the server is also the last one for which the client got an 'ok' back, contrary to what was observed in Doodle3D/doodle3d-client#304.

Even though the client keeps sending both print and status requests, wifibox.log only logs receiving status requests after things start failing. Why could this be? Or am I missing something?
Also, the status requests are logged in groups of 4 within one second, then nothing for 10-30 seconds, and this pattern repeats.

Miscellaneous remarks:

  • At least appending the last received chunk from firmware->server is done in no more than 1 second, so no issues there.
  • Requests were coming from multiple IPs (213, 124 and 182), probably tablet/PC/phone?
  • The syslogs do not seem to contain anything out of the ordinary.

So what could be the matter here? Something blocking the uhttpd/Lua process?
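
To make the "4 requests within a second, then 10-30 seconds of nothing" pattern easier to spot, something like the sketch below could list the silent periods. It assumes the request times have already been parsed into `datetime` objects; the timestamps and the name `find_gaps` are invented for illustration and do not come from wifibox.log.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, min_gap_seconds=10.0):
    """Return (previous, next, gap_seconds) for consecutive requests
    that are at least `min_gap_seconds` apart."""
    gaps = []
    ordered = sorted(timestamps)
    for prev, nxt in zip(ordered, ordered[1:]):
        gap = (nxt - prev).total_seconds()
        if gap >= min_gap_seconds:
            gaps.append((prev, nxt, gap))
    return gaps


# Hypothetical request times: two bursts of four with a silent period between.
base = datetime(2016, 5, 2, 14, 0, 0)
offsets = [0.0, 0.2, 0.5, 0.8, 15.0, 15.3, 15.6, 15.9]
stamps = [base + timedelta(seconds=s) for s in offsets]

for prev, nxt, gap in find_gaps(stamps):
    print(f"{gap:.1f}s without requests after {prev.time()}")
```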

@woutgg

woutgg commented May 3, 2016

One thing that might come in handy are timestamps in the web console log, so we could inspect intervals between messages as well as map those logs onto the firmware/print3d logs to connect what is happening when.
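
Roughly what I have in mind, as a sketch: once both logs carry timestamps they can be interleaved into one timeline. The `HH:MM:SS.mmm message` line format, the sample lines and the function names are assumptions for illustration, not the actual log formats.

```python
import heapq
from datetime import datetime


def parse(line, source):
    """Split an assumed 'HH:MM:SS.mmm message' line into (time, source, message)."""
    stamp, _, message = line.partition(" ")
    return (datetime.strptime(stamp, "%H:%M:%S.%f"), source, message)


def merge_logs(console_lines, server_lines):
    """Interleave two already time-ordered logs into a single timeline."""
    console = (parse(line, "console") for line in console_lines)
    server = (parse(line, "server") for line in server_lines)
    return list(heapq.merge(console, server))


# Hypothetical log lines.
console_log = [
    "14:00:01.250 POST printer/print chunk 18",
    "14:00:03.900 AJAX error on printer/print",
]
server_log = [
    "14:00:01.300 printer/print received",
    "14:00:02.000 info/status received",
]

for when, source, message in merge_logs(console_log, server_log):
    print(when.time(), source, message)
```

`heapq.merge` expects each input to be sorted already, which holds for logs written in chronological order.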

@peteruithoven
Member Author

> In all four failed tests, printer/print requests start failing sooner or later (as early as the 18th chunk up until the 70th).

The Wanhao driver does take up more resources because of the translation process. Is there any indicator this might be the cause? Did you check the memory-usage-over-time log (wanhao.log)?

> Even though the client keeps sending both print and status requests, wifibox.log only logs receiving status requests after things start failing. Why could this be? Or am I missing something?

How did you see this? I see info/status requests from even before seeing /printer/print requests in the rotated logs.

> Also, the status requests are logged in groups of 4 within one second, then nothing for 10-30 seconds, and this pattern repeats.

Seems like the requests time out. If I understood Olaf correctly, it could be that one request takes up so much time that all the requests waiting for it to finish time out as well.

@olijf Could you explain the multiple IPs? If there were multiple devices listening, this would have been much more of a stress test than I intended it to be.
Could you also check whether it's possible to save the console logs with timestamps? I understand there are settings available:
http://stackoverflow.com/questions/12008120/console-log-timestamps-in-chrome
https://developers.google.com/web/tools/chrome-devtools/debug/console/console-ui

@olijf

olijf commented May 4, 2016

I was only actively using my tablet. Sometimes I checked the info/status from my PC, but I do not think this would have much of an impact. I presume someone else had the connect.doodle3d.com page open in his browser (at least that accounts for one more).

@peteruithoven
Member Author

I think an important question is whether we made it perform worse with the developments in the develop branch. Therefore I'd like to do 2 tests with less "stress".

@olaf, could you do 2 more tests?

  • From pc
  • Checking that there is only one client
  • Including the timestamps in the web console (that's possible right?)
  • With usb hub
  • I'm assuming having the Wanhao in the workshop instead of in the office doesn't have too much of an impact.

@woutgg

woutgg commented May 4, 2016

> The Wanhao driver does take up more resources because of the translation process. Is there any indicator this might be the cause? Did you check the memory-usage-over-time log (wanhao.log)?

Do you mean process in the OS sense? The GPX code is compiled into the print server and it only converts as much as it needs each time (here), so the translation should not take up any noticeable extra resources.

Overview of the memory logs:

  • In the successful print, free memory starts out at 7.5MB, then mostly ranges between 3~4MB, with an occasional drop to 2.7 at the lowest (over about 7.5 minutes in total).
  • Print 2 has no memlog.
  • Print 3 shows memory steadily dropping from 10.1MB to 3.5MB over the course of 6 minutes.
  • Print 4 (over around 7.5 mins) starts out at 7.4MB, then mostly ranges around 4~5MB, but with drops to 2MB and even 1.5MB near the end.
  • Print 5 (over around 7 mins) starts out at 7.2MB, then gradually drops to around 1.9MB.

Could we be dealing with a memory leak, is some other process taking up memory or is this normal behaviour? Even so, according to the syslog the OOM killer never triggered.
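
For comparison, a quick back-of-the-envelope calculation of the average drop per minute for the two prints where memory declines steadily, using only the approximate start/end values and durations listed above (so the results are rough). The function name is just for illustration.

```python
def average_drop_mb_per_min(start_mb, end_mb, minutes):
    """Average decrease in free memory per minute over a print."""
    return (start_mb - end_mb) / minutes


# (start MB, end MB, duration in minutes), rounded figures from the overview.
runs = {
    "print 3": (10.1, 3.5, 6.0),
    "print 5": (7.2, 1.9, 7.0),
}

for name, (start, end, minutes) in runs.items():
    rate = average_drop_mb_per_min(start, end, minutes)
    print(f"{name}: {rate:.2f} MB/min")
```

That works out to roughly 1.1MB per minute for print 3 and about 0.76MB per minute for print 5. An average like this cannot tell a leak from normal buffering; it only gives a number to compare against future runs.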

> How did you see this? I see info/status requests from even before seeing /printer/print requests in the rotated logs.

Sorry my sentence was ambiguous, I meant that it logs both types of requests up until the point where things start failing. After that moment it does not log printer/print requests anymore but still logs info/status requests - even though the client keeps sending both.

> Seems like the requests time out. If I understood Olaf correctly, it could be that one request takes up so much time that all the requests waiting for it to finish time out as well.

Indeed. iirc this also happened during the beginning of the project.

@olijf

olijf commented May 6, 2016

Print from PC 0.10.10-e

Here are the results of another test; I am unsure whether the print was successful. I got a lot of AJAX errors.

I can do another test on Monday with the chrome time stamps enabled.
0.10.10-e print 1 pc.zip

@woutgg

woutgg commented May 7, 2016

Trying to reproduce the AJAX timeouts, I ran several large prints on the Wanhao. I cancelled all of them after some time, since the previously observed issues all occurred quite soon - each ran until the buffer had been full for at least about a minute.
The printer was located in the workshop, and the wifibox was the same one used in the earlier failed tests, with a newly installed 0.10.10-e image.
The first tests were on my computer, about 3 or 4 times, and no errors occurred.
Attempting the same again, this time from the Galaxy Tab tablet, no timeouts occurred either.

@olijf

olijf commented May 10, 2016

Very interesting. I have one plausible theory: on my tablet the timeout issues arose after one successful print. It is possible that the first print happened on a freshly booted wifibox. Maybe this happens because I did not reboot the wifibox in between? (I'm unsure whether I did.)

On another note: during the daytime the WiFi network is quite busy (lots of devices connected, lots of people, etc.), so maybe that influences the wireless signal?
The TP-Link MR3020 should be able to reach 150Mbps, but when I run iwinfo I often get speeds far below that (in client mode): I have seen speeds of 5.5Mbps, and 60~70Mbps at most.

@woutgg

woutgg commented May 10, 2016

If it had to do with (not) rebooting, something must have changed because I did not reboot the wifibox in between prints.
Could there have been any difference in the files/OS on the box? I mean, that the issue was accidentally 'fixed' by me installing a fresh image?

In case you/we are going to do more tests, it might indeed be interesting to also track the speed & quality iwinfo reports and see if there is any correlation.
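
A rough sketch of how those values could be pulled out of captured iwinfo output for logging alongside each test. The sample text is a hypothetical capture whose layout I am assuming from memory, and `parse_iwinfo` is an illustrative name; fields that are missing or reported as unknown come back as `None`.

```python
import re


def parse_iwinfo(text):
    """Extract bit rate (MBit/s), signal (dBm) and link quality from iwinfo-style text."""
    rate = re.search(r"Bit Rate:\s*([\d.]+)\s*MBit/s", text)
    signal = re.search(r"Signal:\s*(-?\d+)\s*dBm", text)
    quality = re.search(r"Link Quality:\s*(\d+)/(\d+)", text)
    return {
        "bit_rate_mbit": float(rate.group(1)) if rate else None,
        "signal_dbm": int(signal.group(1)) if signal else None,
        "quality": (int(quality.group(1)), int(quality.group(2))) if quality else None,
    }


# Hypothetical capture; layout and values are assumptions.
SAMPLE_IWINFO = """\
wlan0     ESSID: "example"
          Mode: Client  Channel: 6 (2.437 GHz)
          Tx-Power: 18 dBm  Link Quality: 39/70
          Signal: -71 dBm  Noise: -95 dBm
          Bit Rate: 5.5 MBit/s
"""

print(parse_iwinfo(SAMPLE_IWINFO))
```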

@peteruithoven
Member Author

We've released 0.10.10, so I'm closing this test.
