Suggestion: Rename the repository from MNIST to something else #1
Comments
Thanks for reporting. I will consult the company's legal team on Monday and check.
Even if it's legal, using the name MNIST for something that is not related to the NIST database seems confusing.
As long as the legal team is alright with it, I think that including MNIST in the title is actually pretty great and not confusing: even though the dataset is totally different from MNIST, it is related, since it's a drop-in replacement for MNIST. It sets out to fix many issues in the original MNIST handwritten digits dataset and serves the exact same purpose as the original.
Good idea, @hanxiao. Otherwise, great job with this benchmark and thank you for sharing it! I'm already running TPOT on your benchmark to see how it works. @robbiebarrat, technically MNIST was called the "MNIST database of handwritten digits". The handwritten digits database is not called MNIST, though I think that fact was lost in time. A more appropriate shorthand name would be "digits" or something similar. Some cheeky commenters have suggested "fashioNISTa" as a possible name for this new benchmark.
or just call it fashion-digits...
@rhiever Oooh, I didn't know that the handwritten dataset wasn't actually referred to as MNIST... Oops... I think fashioNISTa would be a great name, though!
Some updates in this thread:
I will close this issue soon.
To be clear: I raised this issue not because of legal concerns, but because (as discussed above) naming it with MNIST in the name doesn't actually make sense. The "NIST" in MNIST refers to the institution that originally provided the digits dataset, not to the dataset itself.
The original NIST digits database is now called "NIST Special Database 19"; at the time Yann LeCun created MNIST, it was SD 3 and 1. Someone could have made a modified version of the NIST fingerprint or mugshot datasets and named them "MNIST prints" and "MNIST mugs" with just as much accuracy. But nobody did, so the shorthand is globally unique in the world where it is meaningful (AI research dataset names), though not specific: if you want to be pedantic about it, you have to tack on the "handwritten digits" part. NIST acknowledges as much by having an official NIST version of MNIST called EMNIST: https://www.nist.gov/itl/iad/image-group/emnist-dataset In the horse-and-buggy days "dashboard" meant one thing, then something else in the age of the auto, and yet something else in the information age. So too MNIST in this case is (IMHO) an equally understandable moniker: it does not mean what it technically is supposed to mean, but it captures a gestalt that everyone immediately grasps.
OK. I still believe that it's a poor decision to continue misusing the MNIST name in this manner, but ultimately it's not my repo so I'll close the issue.
Someone commented about this issue on Reddit (pasted below) and I think you should seriously consider changing the name of the benchmark to something else while it's still early on.