[Bounty: $170] XMB always stops displaying images with low-power/memory (rpi, Switch, Classic, others) #6747

Open
markwkidd opened this issue May 9, 2018 · 75 comments · Fixed by #7487

Comments

@markwkidd
Contributor

markwkidd commented May 9, 2018

Bounty contributions make a difference - please donate at this link today to support this fix

Since at least March 2016, users of Lakka on low-powered systems -- and specifically of Raspberry Pi systems -- have experienced slowdowns in the display of thumbnails and dynamic backgrounds. These slowdowns lead to a crash or tipping point after which RetroArch no longer displays images at all.

@OGWillikers provided a video which shows the issue: https://www.youtube.com/watch?v=0to3Kznk3Z4
They add:

> Don't have to scroll fast to cause it. I've done it by just being indecisive and taking a long time scrolling one by one.

This issue has been confirmed by @lollo78, @kivutar, @OGWillikers, @jcreznor, @brnrdbrk, @leokendall, @Lomig, @flapjackfiasco, and @MopheusDG. It's real!

To my knowledge, it has only been reported within Lakka, but as far as I know no one has tested this on a non-Lakka low-power system.

@lollo78 provided a playlist that demonstrates the issue (note that you don't need the ROMs to test this, just the playlist and the corresponding thumbnails)

They also describe how to reproduce the issue:

> Try to lateral scroll my playlist holding down your finger (don't lift it or do a single click) on left or right keys of the controller. You will notice a nasty lag (seems that xmb freezes for a while).
>
> Now, activate boxarts view and try to scroll down in a platform (like n64). Also here, holding down your finger and continue to scroll down like an infinite loop, sooner or later (depends on system used: on Rpi it happens first) the images disappear and you will see only a black square.
> Sometimes also the icons and menu disappears.
> The only way is restart Retroarch.

This is the third iteration of this Issue posting. The first has been lost to time during the LakkaOE->LakkaLE repo migration. The second iteration can be found here: #2791

@markwkidd
Contributor Author

markwkidd commented May 9, 2018

Anyone who is interested in contributing to a bounty to fix this issue can do so at this address: https://www.bountysource.com/issues/58129765-xmb-always-stops-displaying-images-on-low-power-systems-incl-raspberry-pi

For my part, I have put $5 into the bounty to get it started

Edit: @Ntemis I'm tagging you as this is a concern for Lakka

@Ntemis

Ntemis commented May 9, 2018

@kivutar should be tagged on this, as he loves fixing these kinds of issues. Also, I don't own a Raspberry Pi, so I can't be of much help on this.

@ShockwaveTheFallen

I'll help out however I can. I have a test bench with a Pi 3B+ on it right now, and a high-end Lakka unit (relatively speaking: a Gigabyte Brix i3). Let me test it out right now and see if the problem replicates itself.

@ShockwaveTheFallen

PC does not display that problem. I'll create a list with ROMs on the Pi 3B+ and see what it does, and see if there's anything I can do.

@ShockwaveTheFallen

As I like to say when I discover problems like this and I suspect something along these lines...

"You gotsa memory leak somewhere, bud." @kivutar @Ntemis @markwkidd

My extremely educated guess: you're running out of real estate. Check your stack pointer!

Usual caveat: this is just a guess on my part. I've been wrong plenty of times before (and I'm hoping I'm wrong now, to be honest, but since it locks up over time, I'm leaning towards not). Let me do some digging and I'll post the pastebin/log of the details. This might take a while... unto the breach with Arch!

@undeadindustries

I clicked on the bounty link to contribute and it was dead. I'd surely donate to get this issue fixed asap.

@inactive123
Contributor

Bountysource appears to be having issues today. Give it some time, nothing we can do about it anyway.

@undeadindustries

I added $100 to the bounty.

@inactive123
Contributor

Awesome @undeadindustries ! I'll retweet this.

@undeadindustries

If this gets fixed pretty quickly, I'll probably start going bounty crazy ;-)

@undeadindustries

Bounty is up to $170!!!

@nayslayer
Contributor

nayslayer commented Oct 24, 2018

@markwkidd @twinaphex
I believe I've identified the problem, though I lack the expertise to actually fix it. If you have an OpenGL wizard on the team, you may want to show them this comment. Until then, I believe a workaround is to use a Vulkan or D3D video driver, if the target device supports it. The RPi doesn't, unfortunately.

The memory leak actually happens at a texture_unload/texture_load pair in xmb.c. Going deeper, I found that the texture memory allocated in the gl.c OpenGL driver doesn't get deallocated when the texture is deleted. This can be confirmed by commenting out the glTexImage2D line: the memory stops leaking, though you'll obviously get a corrupt image. I've ruled out the possibility of a texture id (or 'name' in OpenGL's terms) mismatch: the texture reference actually gets deallocated correctly, while its data doesn't.

Therefore I conclude that this happens due to an obstinate GPU driver, likely provoked by a threading issue or hackish rendering code. A driver not deallocating its memory right away isn't unexpected; the fact that it never deallocates it is. I've also tried the classic remedies of turning off multithreading and randomly inserting glFlush/glFinish calls in the code. Nothing works, so I'm abandoning the issue for now.

Update: this isn't a threading issue, since texture (de)allocations happen on the same thread as buffer swaps. Digging through the history didn't help either: the bug was already there when thumbnails were introduced somewhere around v1.3.4, and back then it was even more severe, leaking tens of megabytes at a time. There was a (failed) attempt to fix this bug in 2016; see the xmb.c blame log. The only idea for now is to reuse existing texture names. This sort of works, but it's an ugly GL-only hack.
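
For illustration only, here is a toy model of why reusing a texture name would sidestep the leak as hypothesized in this comment. It is a pure-Python simulation, not RetroArch or OpenGL code: `FakeDriver`, `scroll_with_delete`, `scroll_with_reuse`, and the thumbnail sizes are all made up, and the driver's "never release storage on delete" behaviour is the assumption under test, not an established fact (the follow-up comments revise the diagnosis).

```python
class FakeDriver:
    """Hypothetical stand-in for a GL driver that leaks storage on delete."""

    def __init__(self):
        self.next_name = 1
        self.storage = {}   # texture name -> bytes currently attached to it
        self.leaked = 0     # bytes the modeled driver never gave back

    def gen_texture(self):
        name = self.next_name
        self.next_name += 1
        self.storage[name] = 0
        return name

    def tex_image(self, name, size):
        # Re-specifying an existing texture replaces its storage in place.
        self.storage[name] = size

    def delete_texture(self, name):
        # Modeled misbehaviour: the name goes away, the data does not.
        self.leaked += self.storage.pop(name)

    def memory_in_use(self):
        return self.leaked + sum(self.storage.values())


def scroll_with_delete(driver, thumbnails):
    """Delete and recreate a texture for every thumbnail shown."""
    name = None
    for size in thumbnails:
        if name is not None:
            driver.delete_texture(name)
        name = driver.gen_texture()
        driver.tex_image(name, size)
    return driver.memory_in_use()


def scroll_with_reuse(driver, thumbnails):
    """Keep one texture name and upload each new thumbnail into it."""
    name = driver.gen_texture()
    for size in thumbnails:
        driver.tex_image(name, size)
    return driver.memory_in_use()


thumbs = [256 * 1024] * 40  # 40 thumbnails of 256 KiB each (made-up numbers)
print(scroll_with_delete(FakeDriver(), thumbs))  # 10485760: grows per thumbnail
print(scroll_with_reuse(FakeDriver(), thumbs))   # 262144: one thumbnail's worth
```

In this model, memory use under delete-and-recreate grows linearly with the number of thumbnails scrolled past, while reuse stays flat. That matches the symptom described, but it only shows the workaround is consistent with the hypothesis, not that the hypothesis is right.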

@nayslayer
Contributor

My apologies, the above analysis may be invalid. It seems I missed a race condition in the same texture_unload/texture_load code that I had initially ruled out. I'll probably make a fix later today.

@undeadindustries

@nayslayer , meaning that you think you found the issue in xmb.c and you'll be able to fix it?

@nayslayer
Contributor

That's right, and I'm working on it atm.

@undeadindustries

That's great!

@nayslayer
Contributor

@undeadindustries
Okay, now I don't know whether the issue I've found was actually the one that bugs you, but I've fixed some definite resource leaks.

And it actually had nothing to do with threading or driver issues - it was a plain, boring logic error. I've been excited for nothing :(

@undeadindustries

@nayslayer Ahhh. So did that fix it though? Let me know, I can test on my B+.

Thanks again for tackling this!

@nayslayer
Contributor

Well, memory stopped leaking on my Linux desktop, so it's highly likely.

@undeadindustries

Nice!!! Do you know what release this will be part of? I'll update my Lakka when it's released and let you know.

Thanks again. If you fixed this, you are my hero.

@hizzlekizzle
Contributor

@undeadindustries it should make its way into Lakka nightlies soon after it's merged.

@jdca

jdca commented Nov 4, 2018

No nightly yet. If I can't test this before Nov 8, I have to reject. When is the next nightly planned? Or is there a simple way to test this some other way?

@markwkidd
Contributor Author

@jdca this code is already present in RetroArch nightlies -- are you talking about Lakka nightlies? My understanding is that @kivutar is not planning on updating the RetroArch version in Lakka until another bug (relating to color shifts on certain ARM chips) has been fixed.

In other words, it could be days, weeks, or months until this shows up in a Lakka nightly. I don't think that will be a fair standard for accepting or rejecting the claim in this case.

Are you using an rpi? Post your hardware platform and there will probably be some way to get the latest RetroArch onto it.

@jdca

jdca commented Nov 5, 2018

Make me an RPi 3 B+ image or tell me how to test this on my Raspberry Pi. On my PC this is not a big issue, as I can go through 10,000 thumbnails before my PC hangs; on my Raspberry Pi it's more like 40 thumbnails.

@markwkidd
Contributor Author

I've not set up RetroArch on an rpi as standalone software, but there are guides I'm finding on google. I think I have come close to building Lakka before myself but I honestly haven't figured it out yet.

I don't want to send you down a rabbit hole since I haven't followed this guide (and I'm hoping someone else will post here with direct experience) but this one looks credible on the surface at least:

Setting up RetroArch on a Raspberry Pi https://gist.github.com/AlexMax/32e5d038a66ce57253e740ea75736805

I'll also ask in the #lakkatv channel on the libretro Discord server

@natinusala
Contributor

I will make a new Lakka image with the fix so that you can try it. I'll get back to you with a transfer.sh link.

@daliaetnano
Contributor

Hi @essm1988
Don't worry. @natinusala also tested and he gave me everything I need.
I am currently looking at the dump.
Thanks

@essm1988

@daliaetnano

OK. I will try to learn to use nxlink for the next tests; I think I've found out how to use it. I am very interested in getting this issue fixed.

@natinusala
Contributor

Using nxlink is really simple: just run `nxlink retroarch_switch.nro -s` and it will send the homebrew to your Switch and redirect the logs to the terminal. Make sure to press Y in hbmenu to enter the net loader.

@essm1988

Thank you very much, I will try

@essm1988

It worked from CMD, not from the MSYS2 terminal. Thanks.

daliaetnano added a commit to daliaetnano/RetroArch that referenced this issue Mar 2, 2019
@daliaetnano
Contributor

Hi @essm1988 and @natinusala
I have updated my fix for #6747 to make the unload of texture more reliable.
If the video thread is not initialized, the texture is unloaded in the main thread (like before my fix).
The fix is in the same branch https://github.com/daliaetnano/RetroArch/tree/fix-6747-black-bug-switch
If one of you could test it on switch when you have time that would be great.
Thanks
Nano
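
For readers following along, the fallback described in this comment (unload on the video thread when it exists, otherwise on the calling thread) can be sketched roughly as below. This is a toy model in Python, not the code from the branch linked above: `VideoThread`, `unload_texture`, the `textures` set, and the choice to block until the job has run are all assumptions made for illustration.

```python
import queue
import threading


class VideoThread:
    """Hypothetical worker that owns the graphics context."""

    def __init__(self):
        self.commands = queue.Queue()
        self.thread = None

    def start(self):
        self.thread = threading.Thread(target=self._run, name="video", daemon=True)
        self.thread.start()

    def is_initialized(self):
        return self.thread is not None and self.thread.is_alive()

    def _run(self):
        while True:
            job = self.commands.get()
            if job is None:  # sentinel: shut down the worker
                break
            job()

    def stop(self):
        if self.is_initialized():
            self.commands.put(None)
            self.thread.join()
        self.thread = None


def unload_texture(video, textures, tex_id, log):
    """Free tex_id on the video thread if there is one, else right here."""

    def do_unload():
        textures.discard(tex_id)
        log.append((tex_id, threading.current_thread().name))

    if video.is_initialized():
        done = threading.Event()

        def job():
            do_unload()
            done.set()

        video.commands.put(job)
        done.wait()   # assumption: wait until the video thread has run the job
    else:
        do_unload()   # fallback: unload on the calling thread


textures = {1, 2}
log = []
video = VideoThread()
unload_texture(video, textures, 1, log)  # no video thread yet: runs on caller
video.start()
unload_texture(video, textures, 2, log)  # dispatched to the "video" thread
video.stop()
print(log)
```

The log records which thread performed each unload, which makes it easy to check that the fallback path is only taken while the video thread is absent.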

@undeadindustries

Last update was 3 months ago for someone to test. Has anyone tested yet? I'm between setups right now or I would!!

@essm1988

essm1988 commented Jul 9, 2019

@daliaetnano

Sorry for the late reply.

I tested it for 5 minutes. I think the issue is fixed, but I will test it for a few days to confirm.

Thank you very much for your hard work.

@essm1988

@daliaetnano

The issue is not fixed: the thumbnails turn black and then RetroArch crashes.

I will provide a log file maybe after two days.

@daliaetnano
Contributor

Hi @essm1988
Even if the issue is not fixed on Switch, it doesn't crash immediately which was the purpose of my last fix.
I will have a look at the log file.
Thanks for the test.

@jdca

jdca commented Jul 16, 2019 via email

@essm1988

I will provide the log file but I have to test it again with clean install.

@essm1988

Hi @daliaetnano

It crashed again; however, I have provided the log file, crash report, and debug ELF:

https://www.dropbox.com/s/rmgpz9y9agxdtsm/crash.rar?dl=0

thank you very much for your hard work

daliaetnano added a commit to daliaetnano/RetroArch that referenced this issue Jul 21, 2019
@daliaetnano
Contributor

Thank you very much @essm1988 for the log.
I looked at it: it crashes immediately, and the crash does not seem to be linked to my changes.
I am disappointed because I don't know whether the problem comes from my last change or not.
It crashes when calling libdrm_nouveau.
My branch is quite old, so I will rebase my code to bring it up to date.

@inactive123
Contributor

Hi there, was this issue ever resolved?

@gouchi
Member

gouchi commented Jul 23, 2021

Can you reproduce the issue with thumbnails whose resolution is less than or equal to 1920x1080?

Source
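
For anyone who wants to check their own set before answering, a small script along these lines can list thumbnails above that resolution. It is a sketch under assumptions: it assumes the thumbnails are PNG files, reads only the IHDR header rather than decoding the image, and the helper names `png_size` and `oversized_thumbnails` are invented here.

```python
import struct
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(path):
    """Return (width, height) from a PNG's IHDR chunk, or None if not a PNG."""
    with open(path, "rb") as f:
        header = f.read(24)
    # Layout: 8-byte signature, 4-byte chunk length, b"IHDR", width, height.
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        return None
    return struct.unpack(">II", header[16:24])


def oversized_thumbnails(root, max_w=1920, max_h=1080):
    """Yield (path, width, height) for PNGs larger than max_w x max_h."""
    for path in sorted(Path(root).rglob("*.png")):
        size = png_size(path)
        if size and (size[0] > max_w or size[1] > max_h):
            yield path, size[0], size[1]
```

Pointing `oversized_thumbnails` at the thumbnails directory (its location depends on the installation) and iterating over the result gives the files worth resizing or removing before retesting.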

@sonninnos
Collaborator

Thumbnails can be disabled and backgrounds can be disabled, and have been for a long time, so is this really still an issue?

@jdca

jdca commented Sep 30, 2023

> Thumbnails can be disabled and backgrounds can be disabled, and have been for a long time, so is this really still an issue?

It is an issue if you want to use thumbnails and backgrounds.

It's nice for your eyes, I always use thumbnails.

But it seems to work great with ozone so I switched anyways.

@sonninnos
Collaborator

sonninnos commented Sep 30, 2023

Hah, I wasn't awake enough then since I read the title as if someone wants to stop displaying images because they slow the machine down.. as in a feature request.

All menu drivers share a lot of code regarding images, so it shouldn't be impossible to make it behave as well as Ozone.
