This repository has been archived by the owner on Feb 19, 2024. It is now read-only.

Matrix boot stability #90

Open
zoe-codez opened this issue May 6, 2023 · 2 comments
Labels
🏚️ bug Something isn't working. Attach project label also 📯 help wanted Stuff that would benefit from extra collaboration 🚥 rgb-matrix @digital-alchemy/pi-matrix, pi-matrix-client, rgb-matrix

Comments

zoe-codez commented May 6, 2023

Issue

The Pi matrix code has a habit of being crash prone around boot. This appears to originate from lower-level libraries throwing errors. These errors seem to come in 2 types, and either one will happily show up in back-to-back runs with no obvious changes to the situation:

  • corrupted size vs. prev_size
  • free(): invalid pointer

Investigation notes

  • Not all builds act the same. Some will never stay up for more than a split second, others never seem to error
  • There may be some non-obvious factors at work, such as device temperature, or something indirectly related to uptime
    • Had a single build go from unusable to stable after a reboot
  • So far, all attempts to trap the error within node have failed
  • Didn't get anything obvious with https://www.npmjs.com/package/segfault-handler, though it may have been used incorrectly

Faulty bindings

Since there are GitHub issues describing the same thing happening to other people using rgb-led-matrix, and the issue is not super consistent here, the cause is probably something this repo is doing. Ex: bad arguments passed into a method somewhere, or the order of operations in application wiring (in the domain of this repo at least)

Building a custom version using latest hzeller/rpi-rgb-led-matrix didn't meaningfully change anything.

Open questions

  • What specific call is causing the error?

If it happens as a result of font loading, or the basic LedMatrix constructor, life is gonna suck. There might be some way to detect that situation and do some sort of magic to try to break the boot loop?

  • Does the current / previous state of the matrix matter?

Unclear how this would affect anything yet, since widget rendering involves explicit clear calls.
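On the boot loop detection idea from the first question: since the crash skips the shutdown hooks, one option is to record a timestamp at every boot attempt and only clear the list once the process has stayed up for a while. Whatever is left in the list is a boot that never became healthy. A minimal sketch of the decision logic, assuming that approach; all names here (`shouldBackOff`, `backOffDelayMs`, the option fields) are hypothetical and not part of this repo, and persisting the timestamps to disk is left out:

```typescript
// Hypothetical boot guard: all names are illustrative, not from this repo.
interface BootGuardOptions {
  windowMs: number; // how far back to look for failed boot attempts
  maxCrashes: number; // failed attempts tolerated inside that window
}

// Keep only the unconfirmed boot attempts that fall inside the window.
function recentCrashes(timestamps: number[], now: number, windowMs: number): number[] {
  return timestamps.filter(t => t <= now && now - t <= windowMs);
}

// True when enough recent boots failed that startup should be delayed.
function shouldBackOff(timestamps: number[], now: number, options: BootGuardOptions): boolean {
  return recentCrashes(timestamps, now, options.windowMs).length >= options.maxCrashes;
}

// Exponential delay (1s, 2s, 4s, ...) capped at capMs.
function backOffDelayMs(crashCount: number, baseMs = 1000, capMs = 60000): number {
  return Math.min(capMs, baseMs * 2 ** Math.max(0, crashCount - 1));
}
```

This only slows the loop down and makes it visible. It does not say which call is crashing.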

Extra Solutions being investigated

Error trapping at lower levels

Needs research / help: N-API & C++ isn't really my thing

Working within node to debug this results in a lot of random noise in observations. The error seems to bubble up through the N-API layer and crash the app directly, without running any of the shutdown hooks.

#89 asks for the creation of more locally controlled bindings. It may be possible to fix (preferable) / trap / bubble up the error differently from there
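Both messages (`corrupted size vs. prev_size` and `free(): invalid pointer`) are glibc heap-corruption checks, which end in `abort()`. That would explain why nothing inside the node process gets a chance to react. Until the bindings can be fixed, one way to at least observe the crash is to run the matrix code as a child process and inspect how it exited from the parent. A rough sketch, assuming the matrix entry point can be launched as its own process; `classifyExit` and `runSupervised` are made-up names:

```typescript
import { spawn } from "node:child_process";

type ExitKind = "clean" | "native-abort" | "segfault" | "error";

// Map the child's exit code / signal onto a coarse category.
function classifyExit(code: number | null, signal: string | null): ExitKind {
  if (signal === "SIGABRT") return "native-abort"; // abort(), e.g. glibc heap checks
  if (signal === "SIGSEGV") return "segfault";
  if (code === 0) return "clean";
  return "error";
}

// Run the (hypothetical) matrix entry point as a child, so the parent
// survives a native crash and can log or restart.
function runSupervised(command: string, args: string[]): Promise<ExitKind> {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args, { stdio: "inherit" });
    child.once("error", reject); // the child could not be started at all
    child.once("exit", (code, signal) => resolve(classifyExit(code, signal)));
  });
}
```

Usage would look something like `runSupervised(process.execPath, ["path/to/matrix-entry.js"])`, where the path is a placeholder. If the child is started through a shell instead, an abort usually shows up as exit code 134 rather than as a signal.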

@zoe-codez zoe-codez added 🏚️ bug Something isn't working. Attach project label also 📯 help wanted Stuff that would benefit from extra collaboration 🚥 rgb-matrix @digital-alchemy/pi-matrix, pi-matrix-client, rgb-matrix labels May 6, 2023

zoe-codez commented May 7, 2023


zoe-codez commented May 11, 2023

Render calls could probably also be throttled. Currently, widgets are repeatedly being re-rendered with no actual changes to the displayed content. Excess screen refreshes are likely not helping here.

At least, it's adding to the overall process load on the device, and generating extra heat.
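A possible shape for that throttling, as a sketch only: skip a render when the widget payload is identical to the last one sent, and enforce a minimum gap between renders. `createRenderGate` is a hypothetical helper, and it assumes the widget list can be serialized with `JSON.stringify`:

```typescript
// Hypothetical gate: returns a function that says whether a render should be sent.
function createRenderGate(minIntervalMs: number) {
  let lastKey: string | undefined;
  let lastRenderAt = -Infinity;

  return function shouldRender(widgets: unknown, now: number = Date.now()): boolean {
    const key = JSON.stringify(widgets);
    if (key === lastKey) return false; // same content as last render, skip
    if (now - lastRenderAt < minIntervalMs) return false; // too soon, skip for now
    lastKey = key;
    lastRenderAt = now;
    return true;
  };
}
```

A change that arrives inside the minimum interval is dropped rather than queued, so the caller would need to ask again on the next tick for it to reach the display.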
