Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better Random: Perfect shuffle of systems, games and screensaver items #757

Merged
merged 1 commit into from Jul 22, 2021

Conversation

Gemba
Copy link

@Gemba Gemba commented Jun 16, 2021

Used in Random (X-Button function) and in screensaver random screenshots, videos, custom images.

  • Use of uniform RNG (PCG) with more random seeding than time(NULL)
  • Shuffles systems (aso.) like a card deck and picks top system until empty, then shuffles again
  • Different shuffle on every ES start and every re-shuffle
  • Fixes the flaws of the current random implementation: Real 1/N chance for an element in a set of N
  • Minor refactorings

Relates to findings stated here: https://retropie.org.uk/forum/post/261120

Tested entirely on a Rpi4.

Looks large but on ES side it is mainly SystemData and SystemScreenSaver the larger part is PCG source tree.

@pjft
Copy link
Collaborator

pjft commented Jun 16, 2021

Ok, so, first of all, this change rocks.

A few notes:

  • I love the intellectual intent in wanting to solve this. +1 for that! I read the forum post, and I commend you in digging into this :)
  • I'm not thrilled about adding a whole new dependency to the project just for this. This is a personal opinion, though, and shouldn't on its own, be binding. If others are ok, I won't have an issue, but I doubt that'll be the case.
  • Finally, even if we were to adopt PCG, any chance we could add it as pugixml, which adds the source repository as a dependency, rather than adding the entire source tree to our project? Or are there concerns with doing it that way?

I'd appreciate thoughts from others, but other than the fact that we're adding a new dependency to the project (which is, once again, a personal opinion), I like the goal of the change.

Let's wait on others, but thanks once again for sending this over.

@tomaz82
Copy link
Collaborator

tomaz82 commented Jun 18, 2021

The usage of a really large thirdparty dependency means it's a no from me. I'm sure the reason the forum posters gets the same game twice in just 3-4 randoms means the issue must be elsewhere, not the actual random() implementation, but even if it is then I'm sure we can come up with a much better solution than a large dependency (and yes I have such a solution if needed)

@Gemba
Copy link
Author

Gemba commented Jun 19, 2021

The usage of a really large thirdparty dependency means it's a no from me. [...]

Ok. My bad I did not clean up the source tree (only pcgcpp/include is needed).
Still IMHO it is not large in contrast to other 3rd party libs:

$ ls -la lib*
-rw-r--r-- 1 pi pi 1787372 Jun 19 10:06 libes-core.a
-rw-r--r-- 1 pi pi   59096 Jun 19 10:05 libnanosvg.a
-rw-r--r-- 1 pi pi    1816 Jun 19 10:05 libpcgcpp.a
-rw-r--r-- 1 pi pi  210416 Jun 19 10:06 libpugixml.a

@Gemba
Copy link
Author

Gemba commented Jun 19, 2021

@pjft thanks for the appreciation.

* Finally, even if we were to adopt PCG, any chance we could add it as pugixml, which adds the source repository as a dependency, rather than adding the entire source tree to our project? Or are there concerns with doing it that way?

Indepentently if PCG will be used or not. No concerns, I just did not make it up so far. The PCG cpp github is here so it should be a no brainer to integrate as external like nano/pugi.

@tomaz82
Copy link
Collaborator

tomaz82 commented Jun 19, 2021

It's not really fair to compare it against other included external dependencies given what they do versus what this does.
I'm gonna read through your changes and try to see what is actually a change of logic in ES, and what is a change of random.

Just as a sidenote, I ran some tests yesterday and on 256 random calls (using my own method which is just 10 lines of code ) on a value between 0-3000 to match that forum post, I only got one duplicate, and that was on call 6 and 61, so 55 rands inbetween. I repeated this test multiple times and came to similar numbers every time.

So if the forumposter constantly got the same game show up in just 2-3 calls then either ES does something really weird, or rand() is really broken on the pi.

@Gemba
Copy link
Author

Gemba commented Jun 19, 2021

I'm sure the reason the forum posters gets the same game twice in just 3-4 randoms means the issue must be elsewhere, not the actual random() implementation

Leaving possible user misconfigs aside one still can get more than once g out of N (with g being the game and N the set of games to pick from) for the same reason you can roll twice (or more) 6 on a perfect 6-sided-die before having seen all other digits than 6 as result. This is why I choosed to implement a card deck like approach (instead the current implemented pick-and-return), so one will see every g exactly once in N takes.

What will happen if you have a poor random source? Chances are that you will get the same shuffle more often than others (and a identical shuffle is easier to detect as some games showing up more often than others). This in turn will give you a evidence that your random source should be reconsidered. With the proposed implementation one can be very sure that ES is no longer the source of showing g more than once in a random selection of N. Which in turn means there will be no more discussions like OMMIW / OMMIWN (on my machine it works [not]). Then it is clearly a user's misconfig.

@tomaz82
Copy link
Collaborator

tomaz82 commented Jun 19, 2021

Yeah I can see now that you build a randomized list on startup, and just iterate over that until it's empty and then generate a new one?

I'm quite sure that can be accomplished without the use of pcg.

@Gemba
Copy link
Author

Gemba commented Jun 21, 2021

PCG is the best in class so why settle for anything less?

If the maintainers really want to go without PCG: Mediocre choice could be Mersenne Twister (part of C++11 <random>). However I am skeptical if the seeding works well on a Rpi (no RTC, almost constant bootup time), which can be a source of repetition:

Seedings that only differ slightly take a long time to diverge from each other; seeding must be done carefully to avoid bad states.

(Source: https://www.pcg-random.org/other-rngs.html#negative-qualities)
I am also concerned about the bad states it can enter (see also same source).

Last but not least MT uses +2KiBy RAM, in contrast PCG uses only a very, very small fraction of it and the whole library of PCG is ~4KiBy.

@tomaz82
Copy link
Collaborator

tomaz82 commented Jun 21, 2021

PCG is the best in class so why settle for anything less?

If the maintainers really want to go without PCG: Mediocre choice could be Mersenne Twister (part of C++11 <random>). However I am skeptical if the seeding works well on a Rpi (no RTC, almost constant bootup time), which can be a source of repetition:

Seedings that only differ slightly take a long time to diverge from each other; seeding must be done carefully to avoid bad states.

(Source: https://www.pcg-random.org/other-rngs.html#negative-qualities)
I am also concerned about the bad states it can enter (see also same source).

Last but not least MT uses +2KiBy RAM, in contrast PCG uses only a very, very small fraction of it and the whole library of PCG is ~4KiBy.

It's actually very simple, it's not a question about what is best in class when it comes to just randomizing a gamelist. It's about not adding another dependency when it's not needed, regarding seeding it can easily be remedied by setting the initial seed using nanoseconds since boot rather than seconds. Using what's already there combined with "your deck of cards" approach is my proposed solution, In other words, what you have done except without using PCG and just use std::random_shuffle combined with a nanosecond seed sent to srand.

You get the same end result with way less code.

@Gemba
Copy link
Author

Gemba commented Jul 1, 2021

I have chosen RANLUX48 (included in <random>) by Martin Lüscher which also counts as high quality RNG: It passes the BigCrush TestU01 suite (takes approx 60hrs on a Rpi4) and has a memory footprint of ~120 bytes. However, it is relatively slow (only noticable when you need very high count of random numbers e.g. in simulations) in comparision to other RNGs, but for the uses in ES it does not have a impact.

Seeding works well with a 32 bit number of std::random_device which uses /dev/hwrng (Rpi) and for the Windows build MS communicates that std::random_device is even cryptographically secure.
(If not present the rng-tools package should be installed for the use of /dev/hwrng).

@tomaz82
Copy link
Collaborator

tomaz82 commented Jul 8, 2021

First off, I appreciate all the work put into this, but I still think you are overcomplicating it way too much.
I understand your comment about PCG and "PCG is the best in class so why settle for anything less".

It's very simple, being the best doesn't mean it's the best for THIS. You wouldn't put Ferrari brakes on a Toyota Prius.

After some research I have found an interesting way to seed srand that is random on the 3 systems I've tested it on (Windows 10, Ubuntu 20.0 and Pi3) and that is simple to malloc 1 byte and feed the address of that memory to srand.

		void* mem = malloc(1);
		free(mem);
		srand((unsigned int)(time(NULL) + (size_t)mem));

The reason this works is due to something called address-space–layout randomization (ASLR).
Basically it means that each time an application is launched it's actually put at different places in memory.
Now I can not guarantee what algorithm is used for the ASLR, but my Pi3 at least did show different numbers each time ES was launched.

I'm sure RANLUX48 is great but we're still just randomizing a deck of cards, it will be random, it will show each game just once until all games have been looped over. The quality of the rand algorithm doesn't matter. So this can easily be achieved without using at all, and the shuffling of the cards can be simplified.

@tomaz82
Copy link
Collaborator

tomaz82 commented Jul 8, 2021

After even more investigation I found out that they plan to deprecate rand(), and apparently random_shuffle was depracated in C++17 so ranlux48 seems to be a good alternative.

tomaz82@9bb203b

This is my suggestion tho for a more simplistic implementation. (only tested on Windows 10 so far) and it' still needs the screensaver portion rewritten.

There is no need for multiple instances of ranlux48, and this can also be reused by the screensaver code, also I know that on a 64bit system this approach uses twice the memory as just storing integers, but this is also way faster, both in producing the shuffled vector, but also when actually retrieving the system/game.

And a quick calculation, on a game collection of 50k games (which is insane) this only adds up to about 390kb of memory (compared to 195kb when storing integers), so that's an increased footprint of 195kb which we can easily handle.

…evice for seed

- Use C++11 built-in ranlux48 and random_device for seeding
- Shuffles systems (aso.) like a card deck and picks top system until empty, then shuffles again
- Fixes the flaws of the current random implementation: Real 1/N chance for an element in a set of N
- Minor refactorings
@Gemba
Copy link
Author

Gemba commented Jul 13, 2021

Thanks for the optimizations, agreed no need to keep multiple instances of the URNG.
(Did not realize that SystemData.h is already included in SystemScreenSaver.cpp).
If you don't mind I integrated and pushed forward most of your suggestions.

Except these two:

In the getRandomSystem() it is important to skip "Retropie Configuration": I guess it is not what the user expects to land on this carousel item when jumping randomly between systems and it would also create a regression against the current upstream. TL;DR: I used the std::copy_if() to sift the "Retropie Configuration" entry and keep code compact.

The ASLR is a nice way to use as seed source. My two thoughts here: The main platforms ES is running at do provide a proper random_device (which does not fall back to a PRNG, as the C+11 std allows) and for a solid software we should not rely on a feature (ASLR) that can be disabled during compile time and/or system wide, although it must be done deliberately. TL;DR: I did not apply it.

@tomaz82
Copy link
Collaborator

tomaz82 commented Jul 15, 2021

Thanks for the optimizations, agreed no need to keep multiple instances of the URNG.
(Did not realize that SystemData.h is already included in SystemScreenSaver.cpp).
If you don't mind I integrated and pushed forward most of your suggestions.

Except these two:

In the getRandomSystem() it is important to skip "Retropie Configuration": I guess it is not what the user expects to land on this carousel item when jumping randomly between systems and it would also create a regression against the current upstream. TL;DR: I used the std::copy_if() to sift the "Retropie Configuration" entry and keep code compact.

The ASLR is a nice way to use as seed source. My two thoughts here: The main platforms ES is running at do provide a proper random_device (which does not fall back to a PRNG, as the C+11 std allows) and for a solid software we should not rely on a feature (ASLR) that can be disabled during compile time and/or system wide, although it must be done deliberately. TL;DR: I did not apply it.

Yeah I missed the RetroPie entry.
Also the ASLR was reverted in my latest commit since as I mentioned, rand() will be deprecated in the next C++ version anyway.

@Gemba
Copy link
Author

Gemba commented Jul 19, 2021

@tomaz82 seems we got aligned at this. Nice. Thanks for being straight but also helpful.

IMHO let this rest a few more days for inquiries by others before the maintainers may decide to merge.

@pjft
Copy link
Collaborator

pjft commented Jul 19, 2021

I have no objection to the merge - I just re-read the code, you did a fair bit of refactoring of the code for the screensaver, but you seem to be keeping all the functionality. Well done!

I cannot test the code right now as I do not have access to a Pi at this moment. That being said, if there are no further objections, I'm happy to merge.

Thanks both for going through this! If no further feedback by end of week, I'll merge then.

@pjft pjft merged commit c612f23 into RetroPie:master Jul 22, 2021
@pjft
Copy link
Collaborator

pjft commented Jul 22, 2021

Thanks for sending this over.

@Gemba Gemba deleted the fb_perfect_shuffle branch February 18, 2023 11:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants