
Fuzzy Logic

stoj edited this page Mar 30, 2023 · 4 revisions

What Is It?

Fuzzy logic is an approach where files are matched against the database entries based on "degrees of truth" rather than the typical binary "true or false".


Why Do We Need It?

Unlike the MAME world, where ROM files (and, to a lesser extent, other resources such as flyers) are named consistently and uniquely identified via CRC/hash, the virtual pinball world is a lot more "fluid".

Virtual pinball files are rarely named in a consistent manner. To a large extent, this reflects the artistic nature of virtual pinball, with authors expressing their personal preferences when creating, updating, and modding their content.

For example, some of the more common file name variations are:

  • author names prefixed or postfixed
  • version numbers suffixed
  • inconsistent (or missing entirely) manufacturer and/or year details
  • speling mistakes ;)
  • inconsistent usage of symbols (dashes, spaces, exclamation marks, etc.)
  • etc

Manually matching newly downloaded content to your existing content then becomes a very tedious and time-consuming task.


How Does It Work?

This is a rapidly changing/improving area, so ultimately the code is the source of truth here.

In brief, various techniques are employed to match files against database entries based on "degrees of truth", including:

  • Ignoring white space
  • Removing superfluous words and characters, e.g. "the", "_", "-", etc.
  • Converting text, e.g. roman numerals to numbers
  • Partial word matches
  • Matching against table OR content descriptions
  • Allowing for manufacturer years that differ by +/- a few years
  • Feeding the match criteria into a scoring system
  • Pixie dust
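
As a rough illustration of how a few of these techniques could combine, here is a minimal Python sketch. It is not the project's actual implementation (the code remains the source of truth). The function names, the noise-word list, the roman numeral table, the weights, and the threshold are all assumptions made up for this example.

```python
import re
from difflib import SequenceMatcher

# Hypothetical lists, chosen for illustration only.
NOISE_WORDS = {"the", "a", "and"}
ROMAN = {"ii": "2", "iii": "3", "iv": "4"}


def normalize(text):
    """Lower-case, turn symbols into spaces, drop noise words,
    convert a few roman numerals, then remove all white space."""
    words = re.sub(r"[^a-z0-9]+", " ", text.lower()).split()
    words = [ROMAN.get(w, w) for w in words if w not in NOISE_WORDS]
    return "".join(words)


def parse_name(file_name):
    """Return (normalized title, year or None) for a file name."""
    stem = re.sub(r"\.[A-Za-z0-9]{2,5}$", "", file_name)  # drop the extension
    year_match = re.search(r"\b(19|20)\d{2}\b", stem)
    year = int(year_match.group(0)) if year_match else None
    title = stem.split("(")[0]  # keep only the text before the first bracket
    return normalize(title), year


def score(file_name, db_name, db_year):
    """Return a score from 0 to 100; higher means a more likely match.
    The weights (80/60/50/20/10) are arbitrary illustrative values."""
    name, year = parse_name(file_name)
    target = normalize(db_name)
    if not name or not target:
        return 0
    if name == target:
        points = 80  # exact match after normalization
    elif name in target or target in name:
        points = 60  # partial match
    else:
        points = int(50 * SequenceMatcher(None, name, target).ratio())
    if year is not None and db_year is not None:
        diff = abs(year - db_year)
        if diff == 0:
            points += 20
        elif diff <= 2:  # tolerate a year that is slightly off
            points += 10
    return points


def best_match(file_name, database, threshold=60):
    """Return the best scoring database name, or None if below threshold."""
    scored = [(score(file_name, name, year), name) for name, year in database]
    points, name = max(scored, default=(0, None))
    return name if points >= threshold else None


database = [
    ("Attack from Mars", 1995),
    ("Medieval Madness", 1997),
    ("Addams Family", 1992),
]
print(score("The Addams Family (Bally 1992) v1.2 by author.vpx", "Addams Family", 1992))  # 100
print(score("Attack from Mars (Bally 1996).vpx", "Attack from Mars", 1995))  # 90
print(score("terminator-II_judgment day 1991.vpx", "Terminator 2: Judgment Day", 1991))  # 80
print(best_match("The Addams Family (Bally 1992) v1.2 by author.vpx", database))  # Addams Family
```

The sketch scores every candidate rather than returning a yes/no answer, so a file with a slightly wrong year or extra text in its name can still be matched, while anything scoring below the (assumed) threshold is left unmatched for manual review.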

At the time of writing, approximately 90% of files are correctly(!) matched by the fuzzy logic. Whilst there is still some scope to improve the algorithms for the remaining 10%, doing so becomes increasingly difficult.