Debugging random slow CPU with rPI4 and 384x256 #1177

Closed
marcmerlin opened this issue Oct 13, 2020 · 5 comments

@marcmerlin
Contributor

marcmerlin commented Oct 13, 2020

I've been struggling with this one for a while. I don't think it's a problem with the library, because the display refresh seems solid: https://www.youtube.com/watch?v=GUp3OilGPe8
However, my code randomly hangs and then resumes. I'm pretty sure the cause is not my code but CPU throttling, although I couldn't find anything obvious.
The code is single-threaded: it generates a frame buffer, then calls show() to copy it to this driver's canvas:

void FastLED_RPIRGBPanel_GFX::show() {
    Framebuffer_GFX::showfps();
    // Copy every pixel of the frame buffer to the driver's canvas.
    for (uint16_t y = 0; y < _fbh; y++) {
        for (uint16_t x = 0; x < _fbw; x++) {
            CRGB pixel = _fb[y*matrixWidth + x];
            uint8_t r = pixel.r;
            uint8_t g = pixel.g;
            uint8_t b = pixel.b;
            _canvas->SetPixel(x, y, r, g, b);
        }
    }
}

If you look at this video: https://www.youtube.com/watch?v=GUp3OilGPe8 , there are interesting bits:

  1. The refresh seems constant enough, but the animation stalls, and around the 12s mark it stalls so badly that you can see the function above copying the canvas in what looks like almost 0.5s.

I've put a heat sink on the CPU and GPU, but maybe it's not good enough. I'm trying to find out whether the cores are being heavily throttled for some unknown reason (0.5s to run the loop above is extremely slow).
The same code works flawlessly on rPI3 with a smaller display (192x160).
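One way to confirm where the time goes is to time the copy loop itself with a monotonic clock and log only the slow frames. The following is a minimal sketch under stated assumptions, not the library's code: `Pixel`, `FrameTimer`, `time_ms` and `copy_frame` are hypothetical names, `copy_frame` is a stand-in that sums pixels instead of calling `_canvas->SetPixel()`, and the slow-frame threshold is arbitrary.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical stand-in for the CRGB pixel type used in show().
struct Pixel { uint8_t r, g, b; };

// Collects per-frame durations and counts frames slower than a threshold.
struct FrameTimer {
    double slow_threshold_ms;   // assumed cutoff, pick whatever suits the display
    double worst_ms = 0.0;
    unsigned frames = 0;
    unsigned slow_frames = 0;

    explicit FrameTimer(double threshold_ms) : slow_threshold_ms(threshold_ms) {}

    // Record one frame duration and log it if it exceeds the threshold.
    void record(double ms) {
        ++frames;
        if (ms > worst_ms) worst_ms = ms;
        if (ms > slow_threshold_ms) {
            ++slow_frames;
            std::fprintf(stderr, "slow frame %u: %.2f ms\n", frames, ms);
        }
    }
};

// Run a callable and return its wall-clock duration in milliseconds.
template <typename F>
double time_ms(F&& fn) {
    auto start = std::chrono::steady_clock::now();
    fn();
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}

// Stand-in for the copy loop: walks a width x height frame buffer.
// A real version would call _canvas->SetPixel(x, y, r, g, b) here.
uint64_t copy_frame(const std::vector<Pixel>& fb, uint16_t width, uint16_t height) {
    uint64_t checksum = 0;
    for (uint16_t y = 0; y < height; y++) {
        for (uint16_t x = 0; x < width; x++) {
            const Pixel& p = fb[static_cast<std::size_t>(y) * width + x];
            checksum += p.r + p.g + p.b;
        }
    }
    return checksum;
}
```

Wrapping the body of show() as `timer.record(time_ms([&] { /* copy loop */ }));` would show whether the copy really takes close to 0.5s during a stall, or whether the time is lost elsewhere, for example while the next frame is being generated.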

I've spent some time on this but have come up empty so far, and was wondering if anyone has ideas about what I should look at. I am running Raspbian, which I know is not ideal, but the hangs seem to happen with certain patterns and not others, so they seem to depend on what the code is computing and not on some random time interval from an external cron job or daemon.

@marcmerlin
Contributor Author

The same display with the same everything is running on the right in this demo: https://youtu.be/PbuB-QE-WjQ?t=103 , so the hangs are hard to pin down. They happen mostly in some demos and not others, but not consistently.

@marcmerlin
Contributor Author

When the display hangs, I see:

top - 20:05:04 up 28 days,  6:43,  3 users,  load average: 1.96, 1.77, 1.59 
Tasks: 115 total,   2 running, 113 sleeping,   0 stopped,   0 zombie 
%Cpu(s): 41.2 us,  0.5 sy,  0.0 ni, 58.1 id,  0.0 wa,  0.0 hi,  0.3 si,  0.0 st
MiB Mem :   1939.4 total,   1567.8 free,    106.3 used,    265.3 buff/cache
MiB Swap:    100.0 total,    100.0 free,      0.0 used.   1735.2 avail Mem 
 
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                  
 5548 daemon    20   0   27416   6544   3364 R 165.6   0.3  51136:35 Table_Mark_Este     

I assume that because the driver runs in its own thread, I should be able to use 2 cores, or 200% CPU.
I just don't know if 165% means my code is at 100% and the driver at 65%, or some other combination.
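The 165% figure is the sum over all threads of the process. Running top with `-H` (for example `top -H -p 5548`) lists each thread separately. The same numbers can also be read from `/proc/<pid>/task/<tid>/stat`, where fields 14 and 15 are the user and kernel CPU ticks of that thread. Below is a rough sketch of that parsing, assuming a Linux `/proc` layout; `ThreadCpu`, `parse_stat_line` and `read_thread_cpu` are hypothetical helper names, not part of the library.

```cpp
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>

// CPU time of one thread in clock ticks
// (divide by sysconf(_SC_CLK_TCK) to get seconds).
struct ThreadCpu {
    std::string name;
    unsigned long utime = 0;  // ticks spent in user mode
    unsigned long stime = 0;  // ticks spent in kernel mode
    bool ok = false;          // true only if both fields were found
};

// Parse one line in the format of /proc/<pid>/task/<tid>/stat.
ThreadCpu parse_stat_line(const std::string& line) {
    ThreadCpu out;
    // The thread name sits in parentheses and may contain spaces or ')',
    // so search for the last closing parenthesis.
    std::size_t open = line.find('(');
    std::size_t close = line.rfind(')');
    if (open == std::string::npos || close == std::string::npos || close < open) {
        return out;
    }
    out.name = line.substr(open + 1, close - open - 1);

    // After the name: state is field 3, utime is field 14, stime is field 15.
    std::istringstream rest(line.substr(close + 1));
    std::string field;
    for (int n = 3; rest >> field; ++n) {
        if (n == 14) out.utime = std::stoul(field);
        if (n == 15) {
            out.stime = std::stoul(field);
            out.ok = true;
            break;
        }
    }
    return out;
}

// Read the stat line of one thread; ok stays false if the file is missing.
ThreadCpu read_thread_cpu(int pid, int tid) {
    std::ifstream f("/proc/" + std::to_string(pid) + "/task/" +
                    std::to_string(tid) + "/stat");
    std::string line;
    if (!std::getline(f, line)) return ThreadCpu{};
    return parse_stat_line(line);
}
```

Sampling each thread twice, a second apart, and comparing the tick deltas would show whether the animation thread or the driver's refresh thread is the one pinned at 100% when the display hangs.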

I'm not too sure how to see if the cores are being thermally throttled, but it looks like they're not:

root@rPi4b:~# cpufreq-info   |grep GHz
  hardware limits: 600 MHz - 1.50 GHz
  available frequency steps: 600 MHz, 750 MHz, 1000 MHz, 1.50 GHz
  current policy: frequency should be within 600 MHz and 1.50 GHz.
  current CPU frequency is 1.50 GHz (asserted by call to hardware).
  cpufreq stats: 600 MHz:0.87%, 750 MHz:0.00%, 1000 MHz:0.00%, 1.50 GHz:99.12%  (453)
  hardware limits: 600 MHz - 1.50 GHz
  available frequency steps: 600 MHz, 750 MHz, 1000 MHz, 1.50 GHz
  current policy: frequency should be within 600 MHz and 1.50 GHz.
  current CPU frequency is 1.50 GHz (asserted by call to hardware).
  cpufreq stats: 600 MHz:0.87%, 750 MHz:0.00%, 1000 MHz:0.00%, 1.50 GHz:99.12%  (453)
  hardware limits: 600 MHz - 1.50 GHz
  available frequency steps: 600 MHz, 750 MHz, 1000 MHz, 1.50 GHz
  current policy: frequency should be within 600 MHz and 1.50 GHz.
  current CPU frequency is 1.50 GHz (asserted by call to hardware).
  cpufreq stats: 600 MHz:0.87%, 750 MHz:0.00%, 1000 MHz:0.00%, 1.50 GHz:99.12%  (453)
  hardware limits: 600 MHz - 1.50 GHz
  available frequency steps: 600 MHz, 750 MHz, 1000 MHz, 1.50 GHz
  current policy: frequency should be within 600 MHz and 1.50 GHz.
  current CPU frequency is 1.50 GHz (asserted by call to hardware).
  cpufreq stats: 600 MHz:0.87%, 750 MHz:0.00%, 1000 MHz:0.00%, 1.50 GHz:99.12%  (453)
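cpufreq-info only shows the kernel's view of the clock. On a Raspberry Pi the firmware keeps its own throttling state, which `vcgencmd get_throttled` prints as a bitmask such as `throttled=0x0`. A small sketch for decoding that value follows, assuming vcgencmd is installed and prints that format; `decode_throttled` and `parse_throttled` are hypothetical helper names.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Decode the bitmask printed by `vcgencmd get_throttled`.
// Low bits describe the current state, bits 16-19 what has happened since boot.
std::vector<std::string> decode_throttled(uint32_t flags) {
    struct Bit { int bit; const char* meaning; };
    static const Bit kBits[] = {
        {0,  "under-voltage detected now"},
        {1,  "ARM frequency capped now"},
        {2,  "currently throttled"},
        {3,  "soft temperature limit active now"},
        {16, "under-voltage has occurred since boot"},
        {17, "ARM frequency capping has occurred since boot"},
        {18, "throttling has occurred since boot"},
        {19, "soft temperature limit has occurred since boot"},
    };
    std::vector<std::string> out;
    for (const Bit& b : kBits) {
        if (flags & (1u << b.bit)) out.push_back(b.meaning);
    }
    return out;
}

// Extract the hex value from a line such as "throttled=0x50005".
// Returns false if no hex number is found.
bool parse_throttled(const std::string& text, uint32_t& flags) {
    std::size_t pos = text.find("0x");
    if (pos == std::string::npos) return false;
    try {
        flags = static_cast<uint32_t>(std::stoul(text.substr(pos), nullptr, 16));
    } catch (...) {
        return false;
    }
    return true;
}
```

A result of `0x0` would rule out firmware throttling and under-voltage since boot. Any of bits 16 to 19 being set would mean it has happened at some point, even if the clock reads 1.50 GHz at the moment the command is run.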

@marcmerlin
Contributor Author

Dave, I'm not even sure what to say. Of course the Pi is arm64...
The animation works fine at other times, with the same complexity.
Tearing is normal, but taking 0.5s to 1s to copy a single frame is not.

Also, as I said, no, I don't think it's a problem with the library.

Please don't reply off topic further, or I'll just ignore the replies.

@marcmerlin
Contributor Author

ok, I'll admit that you gave me a good laugh, but that is the last time too. Plonk.

@marcmerlin
Contributor Author

The support group has finally been created, so this has now been moved to
https://rpi-rgb-led-matrix.discourse.group/t/debugging-random-slow-cpu-with-rpi4-and-384x256/13
