add many nes hacks and translations #619
Please don't merge this yet; I have a better version of the script almost working that handles multiple patches in the descriptions. The script is also handy for finding updates to patches you have... |
OK, this can be reviewed now, I guess. The rest is a matter of quoting this, separating that. Right now the script supports multiple links in a 'version' file and adds info about all of the patches. The algorithm in the script is complicated, but the basic idea is:
And the typical output is:
I'll open a SNES one too, since it's so easy to parse if you have the version files. |
There is a large number of entries here... Could we just bring in the top 100 or so? The larger we make the database, the slower scanning will be. |
A large number of ROMs have been translated over the decades. I'm against cutting it, and I wouldn't know how to prioritize even if I wanted to. Note that these are only the ROMs I have (I didn't find a secret romhacking dat), so there are even more floating around. Conservatively, a lot more. Besides, these are NES and SNES ROMs. What actually destroys the scanner is CD platforms without serial detection (Saturn etc). |
To be honest, my previous comment (on the other PR) was sarcastic... There are too many hacks/translations, and even adding them all won't guarantee that everyone ends up with the same CRC, since NES headers can have extra bits (even in the needed first 7 bytes) that affect the CRC. For example, mirroring can be set to horizontal or vertical for mapper-controlled games. RetroArch should support skipping headers for NES/FDS when scanning/calculating the CRC. |
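To illustrate the header point, here is a minimal Python sketch (not RetroArch's scanner code). It assumes the standard 16-byte iNES header starting with the magic bytes `NES\x1a`, with the mirroring flag in bit 0 of byte 6. The helper names and the toy payload are hypothetical, and trainers or other variants are ignored.

```python
import zlib

INES_MAGIC = b"NES\x1a"   # first four bytes of an iNES header
INES_HEADER_SIZE = 16     # the iNES header is 16 bytes long


def crc32_whole(data: bytes) -> int:
    """CRC32 of the complete file, header included."""
    return zlib.crc32(data) & 0xFFFFFFFF


def crc32_headerless(data: bytes) -> int:
    """CRC32 with a leading iNES header skipped, if one is present."""
    if data[:4] == INES_MAGIC:
        data = data[INES_HEADER_SIZE:]
    return zlib.crc32(data) & 0xFFFFFFFF


def make_rom(flags6: int, payload: bytes) -> bytes:
    """Hypothetical helper: build a toy headered file for this demo only."""
    # magic (4) + PRG count, CHR count, flags 6 (3) + padding (9) = 16 bytes
    header = INES_MAGIC + bytes([1, 1, flags6]) + bytes(9)
    return header + payload


payload = bytes(range(256)) * 4          # stand-in for PRG/CHR data
rom_horizontal = make_rom(0x00, payload)  # mirroring bit cleared
rom_vertical = make_rom(0x01, payload)    # mirroring bit set

# Same game data, different header bit: whole-file CRCs differ,
# header-skipping CRCs agree.
print(crc32_whole(rom_horizontal) == crc32_whole(rom_vertical))
print(crc32_headerless(rom_horizontal) == crc32_headerless(rom_vertical))
```

With whole-file checksums, each header variant of the same patched game would need its own database entry; a header-skipping checksum would collapse them into one.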
I'd also REALLY like libretro/RetroArch#2033 as an alternative solution around this. Just allow people to add the hacks without even needing them in the database. |
The whole point of having the database is having the metadata. Cutting off a large amount of data that users use because 'no, this is too much' is already what RA does badly. Adding games without the scanner is nice to speculate about, but it hasn't happened for years, and even if it did, it wouldn't give the user experience this does. It's not like this data is particularly badly behaved. As long as you have the hacks, the script and one metadata file with the version and the romhacking URL, you can regenerate these. I don't have anything against skipping headers, but you should first do that work in the RA scanner if you want it to happen. It's per platform too. |
The issue is scanning performance... The larger we make the database, the slower scanning gets. |
In that case, it doesn't sound like a big problem to make it a separate db that you opt into by downloading it from the downloader. But I'm still not convinced by the performance argument, simply because what really hurts there is scanning discs and CDs that are unzipped and thus need their crc32 calculated. For small files like these, the CRC would have to be calculated anyway, whether or not the files were in the database (except if they're zipped). Might as well make the work useful if you're going to take the performance hit from that. |
Let me make my point a little clearer: people will already have these files in their ROM folders, and when they ask RA to 'scan here', it will have no choice but to scan them. If they aren't found, it just makes the scan (infinitesimally) longer, because RA has to calculate the CRC and search the whole database just to fail.
In fact, that's one of the ways you could help speed up the scanner. Order the database by file size when it is built and binary search for the size of the ROM being scanned. Only calculate the CRC of the file if there is at least one entry with the exact same byte size, then search upwards and downwards from the first size match until there is a CRC match or the size changes. I don't know how effective this would be at filtering out unnecessary CRC calculations, but my guess is that it's at least 'ok' for non-database files of nonstandard sizes (i.e. your PS1 etc. redumps, if they weren't in the database) but not for files of 'standard' size (your Nintendo 'pseudo random' filled Wii discs: all 4.5 GB files have to calculate the CRC and search that section, which is obviously bad, so you use serials). For ROMs there isn't much benefit because they all have standard sizes, but it could still limit the search to their own section of the database, even with a single database.
Extension filtering is much more effective at avoiding random scan delays, though. It's unfortunate that these files have the same extension as what the scanner is searching for. Adding them might actually speed it up (very slightly, because most of the cost is the CRC calculation, as I said). |
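The size-first lookup proposed above could be sketched roughly like this in Python. It is an illustration only, not how RetroArch's scanner or database format actually works. `Entry`, `SizeIndex` and the lazy `load` callback are hypothetical names, and `bisect_left`/`bisect_right` stand in for "search upwards and downwards from the first size match".

```python
import bisect
import zlib
from typing import Callable, Iterable, List, NamedTuple, Optional


class Entry(NamedTuple):
    size: int   # file size in bytes
    crc: int    # CRC32 of the whole file
    name: str


class SizeIndex:
    """Database entries kept sorted by file size for binary search."""

    def __init__(self, entries: Iterable[Entry]) -> None:
        self._entries: List[Entry] = sorted(entries, key=lambda e: e.size)
        self._sizes = [e.size for e in self._entries]

    def candidates(self, size: int) -> List[Entry]:
        # The run of entries sharing this exact size is contiguous.
        lo = bisect.bisect_left(self._sizes, size)
        hi = bisect.bisect_right(self._sizes, size)
        return self._entries[lo:hi]

    def lookup(self, size: int, load: Callable[[], bytes]) -> Optional[Entry]:
        same_size = self.candidates(size)
        if not same_size:
            return None  # no size match: the file is never read or hashed
        crc = zlib.crc32(load()) & 0xFFFFFFFF
        for entry in same_size:
            if entry.crc == crc:
                return entry
        return None


db = SizeIndex([
    Entry(3, zlib.crc32(b"abc"), "A"),
    Entry(3, zlib.crc32(b"xyz"), "X"),
    Entry(5, zlib.crc32(b"hello"), "H"),
])
hit = db.lookup(3, lambda: b"xyz")    # size matches, CRC is computed
miss = db.lookup(4, lambda: b"abcd")  # size not in db, CRC is skipped
```

In a real scanner the size would come from the filesystem (for example `os.path.getsize`), so a file whose size matches nothing is rejected without being read at all.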
I appreciate these additions but I think there are two issues....
The other PRs are looking great though. I'll go through and do some reviews. |
Mind reducing the number of entries?
|
8d965ca to bb6d3e5
Same as the closed PR.
I verified that RA checksums don't skip the header like no-intro dats do, so the checksums in the output rom() entries are of the whole patched file.
This was made by having a CLI version of flips in the same dir as the script (built with make CLI=1), having a no-intro (XML format) dat for NES to pass in (to use the no-intro names when possible instead of the names on the romhacking page), and using this script:
Edit: script updated (twice):
https://gist.github.com/i30817/c5332248f46113fcb4ca03081f7673f2
with the following arguments:
./makedat.py NES/ nes -d no-intro-nes.dat
This verifies the games against the no-intro dat: if it finds a game there, it uses that name, and if it doesn't, it warns and uses the romhacking.net page name. In effect this only 'warned' on translations of unlicensed games (since I had no hacks for unlicensed games).
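The name fallback described above might look something like the following Python sketch. This is not the code from the gist. It assumes the dat is a Logiqx-style XML file with `<game name="...">` elements containing `<rom crc="...">` children, and that the lookup key is a CRC32 hex string; how that CRC is computed is left out. The function names `load_dat_names` and `pick_name` are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Tuple


def load_dat_names(xml_text: str) -> Dict[str, str]:
    """Map upper-case CRC32 hex strings to game names from an XML dat."""
    names: Dict[str, str] = {}
    root = ET.fromstring(xml_text)
    for game in root.iter("game"):
        for rom in game.iter("rom"):
            crc = rom.get("crc")
            if crc:
                names[crc.upper()] = game.get("name", "")
    return names


def pick_name(crc: str, dat_names: Dict[str, str], page_name: str) -> Tuple[str, bool]:
    """Return (name, warned): the dat name if known, else the page name."""
    name = dat_names.get(crc.upper())
    if name is not None:
        return name, False
    return page_name, True  # caller should print a warning


# Toy dat with a made-up entry, for illustration only.
SAMPLE_DAT = """
<datafile>
  <game name="Example Game (USA)">
    <rom name="Example Game (USA).nes" size="16" crc="deadbeef"/>
  </game>
</datafile>
"""

dat_names = load_dat_names(SAMPLE_DAT)
found = pick_name("DEADBEEF", dat_names, "Example Game page title")
fallback = pick_name("00000000", dat_names, "Unlicensed Game page title")
```

Normalizing the CRC to upper case on both sides avoids misses when the dat and the caller disagree on hex casing.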
Hacks that aren't translations always use the name on the romhacking page, but their extended description might use the dat name if it's found, because those are more likely to match the local ROM name than the romhacking entry. If it's not found, the romhacking 'secondary' title is still used. I don't have an example of this because I haven't seen a hack of an unlicensed game that isn't a translation in my collection.
A warning: although this calculates the right checksum values in the case where you combined two hacks into a single softpatch file, you'll end up with 'only' the first hack's link in the readme and the name of that hack.
On these files, the only one that had that problem was
Castlevania 3: Improved Controls + Localization / Title Screen Redone [Hack by NaOH and ShadowOne333]
which I edited manually. I also didn't add romhacks for which flips said the checksum failed when applying the patch, so some hacks won't be here (besides the ones that I never downloaded anyway). One example is Castlevania: Chorus of Mysteries, which is supposed to apply to some janky ROM that doesn't seem to be on the net anymore.