RFE: Add MIME classification of all files to packages #1096
Comments
Oh and just FWIW, the more I look at the rpmfcTokens table, the more tempted I am to axe all that and replace it with nice and simple MIME data...
pmatilai added a commit to pmatilai/rpm that referenced this issue on Mar 5, 2020
File magic strings are unreliable and largely unusable for anything but human consumption, MIME types are far more meaningful for classifying file types. Populate RPMTAG_FILECLASS (or rather, CLASSDICT) with MIME type instead, and add types for all files and not just our strange hardcoded list. Remove now redundant cruft. Fixes: rpm-software-management#1096
pmatilai moved this from To do to In progress in Use MIME types in favor of "magic" strings on Mar 5, 2020
pmatilai added a commit to pmatilai/rpm that referenced this issue on Mar 11, 2020
Add new tags, rpmfiles APIs and other infra to support storing and querying file MIME types. Store MIME type for all files, stop adding rather arbitrarily filtered file "class" data as this is bloated and relatively useless data, remove related cruft. Fixes: rpm-software-management#1096
pmatilai added a commit to pmatilai/rpm that referenced this issue on Mar 20, 2020
Add new tags, rpmfiles APIs and other infra to support storing and querying file MIME types. Store MIME type for all files, stop adding rather arbitrarily filtered file "class" data as this is bloated and relatively useless data, remove related cruft. Fixes: rpm-software-management#1096
pmatilai moved this from In progress to To do in Use MIME types in favor of "magic" strings on Mar 7, 2023
Actually, v6 is the place where we can and should flick this particular switch. The libmagic strings in headers make no sense, but for v4 dropping them is a compat break.
Rpm has added a libmagic file classification string for a bunch of hardcoded file types to headers since v4.2 or thereabouts. This data has the potential to be useful for all sorts of purposes, but libmagic strings being such volatile beasts means using them is difficult at best, and there's no way to translate those strings to MIME, which is what most users would expect and prefer. Also, the data is not stored for all files, which reduces its usability greatly.
We should add MIME type for all files into the headers. Unlike the magic strings, this is standard data and also compresses well using a simple dictionary approach.
The biggest open question to me is whether we can reuse the FILECLASS tag set for this purpose or not. That data is increasingly bloaty because of the increasing uniqueness of libmagic strings (buildid hashes, image sizes etc.) that don't compress into a dictionary, and if there are no users that actually care about this data, or at least couldn't just as (or more) easily use MIME instead...
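To illustrate the dictionary point, here is a minimal Python sketch of the general technique: store each unique string once and keep only an index per file. It is not rpm's code; `dict_encode` and the sample file data are hypothetical, made up purely for illustration.

```python
def dict_encode(values):
    """Dictionary-encode a list of strings.

    Returns (dictionary, indexes): the unique values in first-seen order,
    plus one integer index into that dictionary per input value.
    """
    dictionary = []
    positions = {}
    indexes = []
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indexes.append(positions[value])
    return dictionary, indexes


# Hypothetical per-file data for a small package (illustrative values only).
mime_types = [
    "application/x-sharedlib",
    "application/x-sharedlib",
    "text/plain",
    "text/plain",
    "image/png",
    "image/png",
]
magic_strings = [
    "ELF 64-bit LSB shared object, x86-64, BuildID[sha1]=aaaa, stripped",
    "ELF 64-bit LSB shared object, x86-64, BuildID[sha1]=bbbb, stripped",
    "ASCII text",
    "ASCII text",
    "PNG image data, 16 x 16, 8-bit/color RGBA, non-interlaced",
    "PNG image data, 32 x 32, 8-bit/color RGBA, non-interlaced",
]

mime_dict, mime_idx = dict_encode(mime_types)
magic_dict, magic_idx = dict_encode(magic_strings)

print(len(mime_dict), mime_idx)
print(len(magic_dict), magic_idx)
```

With this made-up input, the MIME dictionary holds three entries for six files, while the magic-string dictionary holds five, because the build IDs and image sizes make almost every string unique. The gap would only widen on a real package with many binaries or images.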
AFAIK few things look at FILECLASS, simply because it's so erratic. IIRC rpmlint (or something similar) does, but it would be better served by MIME type.
Thoughts?