Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Random error - Rs232 SerialPortError: Resource temporarily unavailable #1183

Closed
1 task done
newkind opened this issue Feb 4, 2021 · 22 comments
Closed
1 task done

Comments

@newkind
Copy link

newkind commented Feb 4, 2021

  • I confirm that this is an issue rather than a question.

Bug report

First of all I'm aware I'm using CoreElec and you most likely will tell me to go to the CE forums and ask there, but I tried and unfortunatelly folks there weren't able to help me. Here's the link to my previous topic in CE forums: https://discourse.coreelec.org/t/amlogicgrabb-stops-working-after-couple-hours-hyperion-ng/13091/11?u=newkind

So I'm using a WS2812B leds connected to the Arduino Uno clone and the issue is that randomly after random amount of time the Hyperion.ng stops working because of the Rs232 error. This Hyperion.ng addon build is based on the latest Hyperion.ng 2.0.0-alpha.9

Here's the part of the log:

2021-02-03T09:56:52.173 hyperiond HYPERION     : <DEBUG> PriorityMuxer.cpp:270:clearInput() | Removed source priority 250
2021-02-03T09:56:52.173 hyperiond HYPERION     : <DEBUG> PriorityMuxer.cpp:351:setCurrentTime() | Set visible priority to 255
2021-02-03T09:56:52.174 hyperiond HYPERION     : <DEBUG> Hyperion.cpp:559:handlePriorityChangedLedDevice() | priority[255], previousPriority[250]
2021-02-03T09:56:52.174 hyperiond HYPERION     : <DEBUG> Hyperion.cpp:562:handlePriorityChangedLedDevice() | No source left -> switch LED-Device off
2021-02-03T09:56:52.175 hyperiond COMPONENTREG : <DEBUG> ComponentRegister.cpp:36:setNewComponentState() | Framegrabber: disabled
2021-02-03T09:56:52.177 hyperiond AmLogic      : <DEBUG> GrabberWrapper.cpp:58:stop() | Grabber stop()
2021-02-04T00:37:52.156 hyperiond LEDDEVICE    : <DEBUG> ProviderRs232.cpp:95:close() | Close UART: ttyUSB0
2021-02-04T00:37:52.156 hyperiond LEDDEVICE    : <ERROR> Device disabled, device 'adalight' signals error: 'Rs232 SerialPortError: Resource temporarily unavailable'
2021-02-04T00:37:52.157 hyperiond COMPONENTREG : <DEBUG> ComponentRegister.cpp:36:setNewComponentState() | LED device: disabled

Link to the full log: https://pastebin.com/raw/ZwyptMHK

When the error happens, doing killall hyperiond and waiting couple seconds for the daemon to restart fixes the issue for the next X hours until it happens again.

This happens randomly and can happen ie. after an hour or 20 hours. There's no general rule for that. I could set the CRON to restart hyperiond every couple hours but that is certainly not a fix for this issue.

I'll gladly test all the patches or code that would fix this.

Thank you!

Steps to reproduce

None, just wait couple hours.

What is expected?

Hyperion.ng works just fine and in case of such error - restarts the daemon automatically.

What is actually happening?

Error Device disabled, device 'adalight' signals error: 'Rs232 SerialPortError: Resource temporarily unavailable is thrown and the daemon never restarts keeping the device dead until the whole box is restarted or the killall hyperiond command is used.

System

Hyperion Server:

  • Build: coreelec-9.2 (Portisch-de94399e92/d7336cbe8c-1611911223)
  • Build time: Jan 31 2021 10:29:42
  • Git Remote: https://github.com/CoreELEC/CoreELEC.git
  • Version: 2.0.0-alpha.9
  • UI Lang: en (BrowserLang: pl)
  • UI Access: default
  • Avail Capt: v4l2,framebuffer,amlogic,qt
  • Database: read/write

Hyperion Server OS:

  • Distribution: CoreELEC (official): 9.2.5
  • Architecture: arm64
  • CPU Model: Amlogic S912 rev a
  • CPU Hardware: Amlogic
  • Kernel: linux (3.14.29 (WS: 32))
  • Qt Version: 5.13.0
  • Python Version: 3.7.3
  • Browser: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:86.0) Gecko/20100101 Firefox/86.0
@Lord-Grey
Copy link
Collaborator

@newkind Try setting the baud rate to 115200.

An Arduino Uno cannot do 500000.

Does that work?

@newkind
Copy link
Author

newkind commented Feb 6, 2021

I tried different baudrates under 500000 but the leds were always slow and choppy - especially at 115200. The 500000 looks really nice and is very fluid and smooth.

@Lord-Grey I'll try going with 460800 - this one is also smooth and fluid. This is the one I used in the past on my different setup with Arduino Uno and I didn't had such issues there.

@Lord-Grey
Copy link
Collaborator

The older code was slower and had more delays. That is why it might have been working before.
The real Android Uno can technically only do 115200.
In case your Uno like model can do higher baud-rate, I would suggest you increase the latch-time.
The "Temporarily Available" error is an indication that the board cannot cope with the data volume sent in the configured timeframe.

If it is slow you might want also to check, if you grabbing too quickly or with a too high resolution...

@newkind
Copy link
Author

newkind commented Feb 6, 2021

Thank you very much for the suggestions! I'll test everything again and let you know the results.

@Lord-Grey
Copy link
Collaborator

@newkind Any update?

@newkind
Copy link
Author

newkind commented Feb 23, 2021

@Lord-Grey I'm sorry but last days has been crazy (we got a new puppy) so I wasn't able to do any testing, but today I'll be doing all changes to the configuration and the arduino sketch. It might take couple hours to see if the Hyperion.ng still breaks.

Please give me couple more hours and I'll surely get back to you with an update!

@Lord-Grey
Copy link
Collaborator

@newkind No worries. Take your time! I only wanted to check, if the issues still persist....

@newkind
Copy link
Author

newkind commented Feb 26, 2021

@Lord-Grey Sorry it took so long to get back to you.

So I did play with the configuration and Arduino sketches today and my findings are:

  1. I wasn't able to achieve the smooth results on baud rate 115200 - I played with the capture width and height, latch time, refresh rate
  2. On baud rate 500 000 - everythings super smooth on latch time 30ms (my current configuration), but anything above that value ie. 40ms makes the animation look choppy
  3. I decreased my capture width from 80px to 45px and will see if this will help the issue, but I doubt that especially that the error happens even when Hyperion is not working currently

@Lord-Grey "The "Temporarily Available" error is an indication that the board cannot cope with the data volume sent in the configured timeframe."

I agree that this could be the case, but if it would - shouldn't the error happen only when Hyperion is actually working and controlling leds actively? The issue I'm facing happens overtime and 99% during night when no one is watching anything and leds are turned off, so the device just sits idle and waits until I'll start playing anything. It just behaves like it would lose the usb power or something like that and cannot reconnect properly.

I'm also doing another test with a different Uno R3 device. I noticed that the one I used currently was not based on Atmel chip and reported on ttyUSB0. The new one has Atmel chip and connects as ttyACM0 - we'll see if this will help in any way.

@newkind
Copy link
Author

newkind commented Feb 28, 2021

Update: So changing to the Atmel powered Uno R3 didn't help. However I noticed one more thing - the Atmel version of Uno R3 allows me to set baudrate to 460800 - something that non-Atmel version didn't allow, so I'm going to try it. The non Atmel version allowed me only 115200 or 500000, 460800 didn't worked for some reason.

I'll come with more updates later.

@newkind
Copy link
Author

newkind commented Feb 28, 2021

So it looks like this didn't change anything - I even think that the device is crashing much faster when using ttyACM0.

Latest log: https://pastebin.com/raw/L4pPQPNN

This time it took only about 5 minutes in idle to prevent device from working.

Right now I'll switch the baudrate to 115200 (just for testing as it's unusable) to see if it'll still crash. I'll come back with new log.

@newkind
Copy link
Author

newkind commented Mar 1, 2021

One more update - setting baudrate to 115200 also doesn't solve it.

Here is the new log: https://paste.kodi.tv/eduxarocev.kodi

Relevant part:

2021-03-01T14:11:29.336 hyperiond HYPERION     : <DEBUG> PriorityMuxer.cpp:161:registerInput() | Register new input 'System/GRABBER' with priority 250 as inactive
2021-03-01T14:11:29.336 hyperiond COMPONENTREG : <DEBUG> ComponentRegister.cpp:36:setNewComponentState() | Framegrabber: enabled
2021-03-01T14:11:29.336 hyperiond AmLogic      : <DEBUG> GrabberWrapper.cpp:48:start() | Grabber start()
2021-03-01T14:11:29.390 hyperiond HYPERION     : <DEBUG> PriorityMuxer.cpp:251:setInputImage() | Priority 250 is now active
2021-03-01T14:11:29.390 hyperiond HYPERION     : <DEBUG> PriorityMuxer.cpp:351:setCurrentTime() | Set visible priority to 250
2021-03-01T14:11:29.390 hyperiond HYPERION     : <DEBUG> Hyperion.cpp:559:handlePriorityChangedLedDevice() | priority[250], previousPriority[255]
2021-03-01T14:11:29.390 hyperiond HYPERION     : <DEBUG> Hyperion.cpp:569:handlePriorityChangedLedDevice() | new source available -> switch LED-Device on
2021-03-01T14:11:29.399 hyperiond LEDDEVICE    : <DEBUG> ProviderRs232.cpp:95:close() | Close UART: ttyACM0
2021-03-01T14:11:29.399 hyperiond LEDDEVICE    : <ERROR> Device disabled, device 'adalight' signals error: 'Rs232 SerialPortError: Resource temporarily unavailable'
2021-03-01T14:11:29.400 hyperiond COMPONENTREG : <DEBUG> ComponentRegister.cpp:36:setNewComponentState() | LED device: disabled

So it's either my device - Mecool M8S L or there's something with Hyperion.ng as this happens on two completely different Arduino devices.

I have ordered today two Arduino Micro Leonardo to hopefully solve this once for all. I'm using it on another device on my main TV and that one works perfectly fine, so if it'll be having issues on my secondary TV, then I'll know for sure that the Mecool M8S L device is to blame.

@Lord-Grey
Copy link
Collaborator

@newkind Just to see, if something is in the system log, could you do me an extract for the 28.02.2021 and 01.03.2021, please?

journalctl --since "2021-02-28" --until "2021-02-28 21:00"
journalctl --since "2021-03-01" --until "2021-03-01 17:00"

@Lord-Grey
Copy link
Collaborator

As you seem to use CoreElec, do you use the Suspend/Resume feature?

@newkind
Copy link
Author

newkind commented Mar 1, 2021

@Lord-Grey Both of CoreElec devices are constantly up/awake - I do not put any of these devices under suspend/sleep.

I'm not sure if I do something wrong but both of these commands output only ~ :

-- Logs begin at Mon 2021-03-01 19:09:13 CET, end at Mon 2021-03-01 21:53:08 CET. --

~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~

@Lord-Grey
Copy link
Collaborator

@newkind Could you do a

journalctl --since "2021-02-28" --until now

or alternatively, if the error occurs

journalctl --since "2 hours ago"

@newkind
Copy link
Author

newkind commented Mar 2, 2021

@Lord-Grey Sure! Here's the log: http://ix.io/2RnY

@Lord-Grey
Copy link
Collaborator

@newkind Thank you. At minimum there seems to be some indication in the Log:

Mar 01 22:12:48 CoreELEC-M8S kernel: hub 1-0:1.0: port 1 disabled by hub (EMI?), re-enabling...
Mar 01 22:12:48 CoreELEC-M8S kernel: usb 1-1: USB disconnect, device number 2
Mar 01 22:12:48 CoreELEC-M8S kernel: usb 1-1: new full-speed USB device number 4 using xhci-hcd
Mar 01 22:12:48 CoreELEC-M8S kernel: usb 1-1: New USB device found, idVendor=2341, idProduct=0043
Mar 01 22:12:48 CoreELEC-M8S kernel: usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=220
Mar 01 22:12:48 CoreELEC-M8S kernel: usb 1-1: Manufacturer: Arduino (www.arduino.cc)
Mar 01 22:12:48 CoreELEC-M8S kernel: usb 1-1: SerialNumber: 55735323935351B002A0
Mar 01 22:12:48 CoreELEC-M8S kernel: usb 1-1: ep 0x82 - rounding interval to 1024 microframes, ep desc says 2040 microframes
Mar 01 22:12:48 CoreELEC-M8S kernel: cdc_acm 1-1:1.0: ttyACM1: USB ACM device
Mar 01 22:12:48 CoreELEC-M8S sh[2891]: 2021-03-01T22:12:48.829 hyperiond LEDDEVICE    : <ERROR> Device disabled, device 'adalight' signals error: 'Rs232 SerialPortError: Resource temporarily unavailable'
Mar 01 22:12:48 CoreELEC-M8S hyperiond[2901]: Device disabled, device 'adalight' signals error: 'Rs232 SerialPortError: Resource temporarily unavailable'

@newkind
Copy link
Author

newkind commented Mar 2, 2021

@Lord-Grey So as suspected the device turns off the usb hub internally right? If that's the case then I'll take this topic to the CE forums and ask for advice there.

@Lord-Grey
Copy link
Collaborator

@newkind In addition, I am going to check, if I just could ignore the error and do the next write or if I need to reopen the USB device.... to make the hyperion more robust for such a scenario....

@newkind
Copy link
Author

newkind commented Mar 2, 2021

That would be awesome! Thank you!

@newkind
Copy link
Author

newkind commented Mar 3, 2021

@Lord-Grey Just coming back to you with another update. I took the kernel issue to the CoreElec forums and vpeter checked that this kernel code has such description:

EM interference sometimes causes badly shielded USB devices to be shutdown by
the hub, this hack enables them again. Works at least with mouse driver.

My arduino is in a separate plastic case about 30cm from the CE device so I ruled that out as a source of interference.

After that I replaced the usb cable that I have been using and I haven't noticed any crashes since then (almost 24h)!
Even though the cable I was using works, it's most likely so badly shielded that it caused the EM interference and thus linux kernel was disabling the usb port on the internal hub.

That was... well... unexpected!

I still need a day or two to make sure that the crash won't happen again, but so far it looks really good!

One more time thank you for your support and finding out the kernel error report! :)

@newkind
Copy link
Author

newkind commented Mar 4, 2021

Everything still works fine, so I can confirm that the issue was the poor cable shielding causing interference and linux kernel disabling the usb internal hub port because of the EM interference.

One more time thank you @Lord-Grey for your support.

@newkind newkind closed this as completed Mar 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants