Skip to content
This repository has been archived by the owner on Mar 26, 2021. It is now read-only.

Externalise source data #3

Merged
merged 17 commits into from
Mar 25, 2021
Merged

Externalise source data #3

merged 17 commits into from
Mar 25, 2021

Conversation

jellybob
Copy link
Owner

@jellybob jellybob commented Mar 24, 2021

See #2 for full details of what's going on here.

To ensure compliance with the upstream license on the Freedesktop.org Shared Mime Types database this PR removes it from the gem, and instead requires the user to provide one themselves. Most Linux distributions probably have a copy of this already, and it can probably be found in the location being checked in ext/mimemagic/extconf.rb, so we default to that. Otherwise an environment variable will need to be set to point it in the right direction. This path also gets baked into the installed version of the gem via a Gemfile pretending to be a native extension.

minad and others added 6 commits March 24, 2021 17:15
This allows it to be called at runtime.
This is the first step in removing the dependency on distributing
freedesktop.og.xml as part of the gem.
Currently looks at the value of `FREEDESKTOP_MIME_TYPES_PATH` and
in `/usr/share/mime/packages/freedesktop.org.xml`, which is the
path you would expect to find that file at on a typical Linux
system.

# Unknown if this test failure is expected. Commenting out for now.
#
# assert_equal 'text/html', MimeMagic.by_path('/adsjkfa/kajsdfkadsf/kajsdfjasdf.html').to_s

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is failing:

  Actual: "application/xhtml+xml"Minitest::Assertion: Expected: "text/html"
  Actual: "application/xhtml+xml"```

It seems like this should be "text/html"
https://stackoverflow.com/questions/6788934/what-is-the-difference-between-serving-a-page-as-text-xml-and-application-xhtml

But this older post makes it seem like `"application/xhtml+xml"` is correct https://www.w3.org/2003/01/xhtml-mimetype/


# Unknown if this test failure is expected. Commenting out for now.
#
# assert_equal 'text/html', MimeMagic.by_extension('.html').to_s

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DacotahHarvey
Copy link

Consider adding TODO statements that reference an issue for the

Unknown if this test failure is expected. Commenting out for now

sections. It will greatly help the longevity of the project to have a reminder that there are outstanding questions for @minad

Because we're required to build a C extension in order to do so
(don't ask, its a long story), use that C extension to make the
path provided at build time available at run time.
@yorickpeterse
Copy link

yorickpeterse commented Mar 24, 2021

@jellybob IIRC you don't have to build an actual C extension. Years ago I wrote a bit about this here. Unless things have changed, you should be able to use a Rakefile instead. This way you probably don't need any actual C code. I hope this info is of any use.

@pboling
Copy link

pboling commented Mar 24, 2021

As for Mac OS compatibility I am investigating the Homebrew path. The main prerequisite of shared-mime-info is glib2, which is actually just glib in Homebrew.

$ brew install glib
==> Downloading https://homebrew.bintray.com/bottles/libffi-3.3_3.big_sur.bottle.tar.gz
######################################################################## 100.0%
==> Downloading https://homebrew.bintray.com/bottles/sqlite-3.35.2.big_sur.bottle.tar.gz
==> Downloading from https://d29vzk4ow07wi7.cloudfront.net/db0fa1be882b6184dc9fa4ba8ca5aefbd9c6a1441b
######################################################################## 100.0%
==> Downloading https://homebrew.bintray.com/bottles/glib-2.68.0.big_sur.bottle.tar.gz
==> Downloading from https://d29vzk4ow07wi7.cloudfront.net/7d671e3104d1a3e8d620ef99b4a1c9b237362e60e8
######################################################################## 100.0%
... etc

The install for shared-mime-info itself is:

$ brew upgrade shared-mime-info
brew install shared-mime-info
==> Downloading https://homebrew.bintray.com/bottles/libffi-3.3_3.big_sur.bottle.tar.gz
Already downloaded: /Users/pboling/Library/Caches/Homebrew/downloads/60b45c0f23d19cde24cfc8e6834288901010f39f7733d9b3312e759a58229193--libffi-3.3_3.big_sur.bottle.tar.gz
==> Downloading https://homebrew.bintray.com/bottles/sqlite-3.35.2.big_sur.bottle.tar.gz
Already downloaded: /Users/pboling/Library/Caches/Homebrew/downloads/0ece5a897ef993b09cf8fdab929a7f2392b1057799cae7c89aaba8ca507e243a--sqlite-3.35.2.big_sur.bottle.tar.gz
==> Downloading https://homebrew.bintray.com/bottles/glib-2.68.0.big_sur.bottle.tar.gz
Already downloaded: /Users/pboling/Library/Caches/Homebrew/downloads/34a93ed75ca6c68aa1dc16c348870beef563edd005994b4c0ccb88b8e3628fa0--glib-2.68.0.big_sur.bottle.tar.gz
==> Downloading https://homebrew.bintray.com/bottles/gnu-getopt-2.36.2.big_sur.bottle.tar.gz
######################################################################## 100.0%
==> Downloading https://homebrew.bintray.com/bottles/shared-mime-info-2.1.big_sur.bottle.tar.gz
==> Downloading from https://d29vzk4ow07wi7.cloudfront.net/4857d9f38c0f3cbf23984d60c4ec6280d84b457123
######################################################################## 100.0%
... etc

The source file for a MacOS BigSur install of shared-mime-info-2.1 is usr/local/Cellar/shared-mime-info/2.1/share/shared-mime-info/packages/freedesktop.org.xml:

$ ls -la /usr/local/Cellar/shared-mime-info/2.1/share/shared-mime-info/packages/freedesktop.org.xml
-rw-r--r--  1 pboling  staff  2375737 Mar 24 16:18 /usr/local/Cellar/shared-mime-info/2.1/share/shared-mime-info/packages/freedesktop.org.xml

@pboling
Copy link

pboling commented Mar 24, 2021

@jellybob I'd like to be able to configure the ENV variable via bundle config, which is what we do for nokogiri, mysql2 and many other gems.

bundle config set --global build.mimemagic --with-freedesktop-mime-types-path=/usr/local/Cellar/shared-mime-info/2.1/share/shared-mime-info/packages/freedesktop.org.xml

Which would result in this:

$ bundle config

Settings are listed in order of priority. The top value will be used.
build.libxml-ruby
Set for the current user (/Users/pboling/.bundle/config): "--with-xml2-config=/usr/local/opt/libxml2/bin/xml2-config --with-xml2-dir=/usr/local/opt/libxml2 --with-xml2-lib=/usr/local/opt/libxml2/lib --with-xml2-include=/usr/local/opt/libxml2/include"

build.mysql2
Set for the current user (/Users/pboling/.bundle/config): "--with-opt-dir=/usr/local/opt/openssl@1.1"

gem.test
Set for the current user (/Users/pboling/.bundle/config): "rspec"

gem.mit
Set for the current user (/Users/pboling/.bundle/config): true

build.mimemagic
Set for the current user (/Users/pboling/.bundle/config): "--with-freedesktop-mime-types-path=/usr/local/Cellar/shared-mime-info/2.1/share/shared-mime-info/packages/freedesktop.org.xml"

See the last lines! Then we just need to utilize the bundle config from within the gem...

this bit of the config lets me build nokogiri:

build.libxml-ruby
Set for the current user (/Users/pboling/.bundle/config): "--with-xml2-config=/usr/local/opt/libxml2/bin/xml2-config --with-xml2-dir=/usr/local/opt/libxml2 --with-xml2-lib=/usr/local/opt/libxml2/lib --with-xml2-include=/usr/local/opt/libxml2/include"

this bit of the config lets me build mysql2:

build.mysql2
Set for the current user (/Users/pboling/.bundle/config): "--with-opt-dir=/usr/local/opt/openssl@1.1"

My understanding is that the --with* bits get passed on to the gems when they are being built, such that it is equivalent to, e.g.:

gem install nokogiri --with-xml2-config=/usr/local/opt/libxml2/bin/xml2-config --with-xml2-dir=/usr/local/opt/libxml2 --with-xml2-lib=/usr/local/opt/libxml2/lib --with-xml2-include=/usr/local/opt/libxml2/include

And we'd just need it to work the same way here.

@elebow
Copy link

elebow commented Mar 24, 2021

Here's a branch demonstrating how to use extconf.rb without actually building any C. You can cherry-pick this and build on it if you find it useful.

externalise-source-data...elebow:externalise-source-data-no-c

This also makes some other changes:

  1. Download the file at install-time, but decide which file to open at runtime. This lets the user supply ENV["FREEDESKTOP_MIME_TYPES_PATH"] after the gem has been installed, or change locations without reinstalling the gem.

  2. Download the file only if the user requests it with --with-download-freedesktop-database.

  3. Refactor the way we check possible paths for maintainability—we can add more possible paths to the array in MimeMagic.database_path.

README.md Outdated Show resolved Hide resolved
@jellybob
Copy link
Owner Author

@jellybob IIRC you don't have to build an actual C extension. Years ago I wrote a bit about this here. Unless things have changed, you should be able to use a Rakefile instead. This way you probably don't need any actual C code. I hope this info is of any use.

🎉 Thanks for this - just about to push a version that does this without a C extension now.

README.md Outdated Show resolved Hide resolved
@jellybob
Copy link
Owner Author

@elebow thanks for that - I'd actually implemented the version without an extension before seeing your comment, so didn't make use of that. I'd definitely be interested in a PR to support downloading the file if a version that can be used in that manner can be found though.

@jellybob
Copy link
Owner Author

@pboling a PR to support this would be welcome, but at least for now I think I'm going to go with the current implementation for the sake of getting something shipped, and build unblocked, even if that is in an initially suboptimal way.

Copy link

@pboling pboling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MacOS install instructions unclear

README.md Outdated
it's probably available via your package manager, and will probably be in the location it's being looked for
when the gem is installed.

macOS users can install via Homebrew with `brew install shared-mime-info`.
Copy link

@pboling pboling Mar 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does install the package, but the gem still fails to install, and supplying the path with FREEDESKTOP_MIME_TYPES_PATH is still required. The way this is worded, that isn't clear.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, can you give it another try? A brew install, followed by just doing a gem install with a locally built version of the gem worked fine on my machine just now.

Copy link

@pboling pboling Mar 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jellybob Sorry I wasn't clear, and my suggestion isn't that great either. My comment was on the subtle implications derived from the text layout in the readme as it is in this PR. It might imply that you only need to set FREEDESKTOP_MIME_TYPES_PATH iff you can't do either of those things:

If for whatever reason you can't do either of those things, you'll need to obtain a copy from the Internet. Set the environment variable FREEDESKTOP_MIME_TYPES_PATH before installing the gem in order to point the build at that location.

By juxtaposing the two sentences without any additional context it can easily be misunderstood.

The following works!

brew install shared-mime-info
export FREEDESKTOP_MIME_TYPES_PATH=/usr/local/Cellar/shared-mime-info/2.1/share/shared-mime-info/packages/freedesktop.org.xml
bundle install

ext/mimemagic/Rakefile Outdated Show resolved Hide resolved
pboling referenced this pull request in UKGovernmentBEIS/beis-report-official-development-assistance Mar 25, 2021
All versions prior to 0.3.6 have been pulled by the developer, presumably due to a security vulnerability. [1]

[1] Conversation on the repo mimemagicrb/mimemagic#98
@mvz
Copy link

mvz commented Mar 25, 2021

How about rebasing this on the 0.3.5 version to avoid

  • Re-licensing mimemagic's own code back from GPL-2
  • Breakage due to removal of mimemagic/overlay

I tried this and there were some merge conflicts but nothing too serious. See this branch: https://github.com/Docbldr/mimemagic/tree/externalise-source-data-rebased

@jellybob
Copy link
Owner Author

@mvz I'm not overly concerned about rebasing from 0.3.5 in terms of licensing - we've been talking to the copyright holder who raised the initial issue, and there's agreement there that the approach being taken resolves their complaint, and no issues with reverting the license to MIT. On the other point around mimemagic/overlay, I'm honestly not familiar enough with this code to know what the implications are. I'm going to merge the changes I've made into master on this repository as I'm reasonably happy with them, so feel free to open a PR to reintroduce that.

@jellybob jellybob merged commit b14939c into master Mar 25, 2021
@jellybob jellybob deleted the externalise-source-data branch March 25, 2021 11:28
@Deradon
Copy link

Deradon commented Mar 25, 2021

When releasing this as a 0.3.x version will this possibly break on windows machines if you're creating a new rails project?
Rails still has a soft-dependency on 0.3.x.
So I'ld personally like to at least bump the minor version so the rails maintainers can explicitly decide to go w/ the approach here.

@jellybob
Copy link
Owner Author

Yes, it will potentially break creating a new Rails project on Windows, but that feels like a better option than quietly imposing GPL 2 licensing on every new Rails project which is the current situation. We're also looking at yanking 0.3.6 because of those licensing implications, so in practice whatever happens here new Rails projects on Windows are going to be broken to some degree.

@mvz
Copy link

mvz commented Mar 25, 2021

@jellybob Thanks for all your work in resolving this! I've opened a pull request to reintroduce overlay.rb to improve compatibility with older 0.3.x versions.

@pboling
Copy link

pboling commented Mar 25, 2021

I agree with yanking 0.3.6. I am sure that many people updated to it to make things work without understanding or noticing the licensing changes. I have personally seen MIT-licensed open source projects that did this. For their benefit and safety, as well as that of the closed source apps we can't see, we should yank 0.3.6.

@enriclluelles
Copy link

@jellybob Thanks so much for your hard work on this and ensuring that the gem can be used with the file loaded at runtime and no behavioral changes!

Much appreciated

@pboling
Copy link

pboling commented Mar 25, 2021

I would probably even yank 0.4.0, TBH. I wouldn't expect a drastic license change in anything but a major version or a name change, and would prefer a name change. The GPL version should be renamed to mimemagic-gpl 1.0.0

@jellybob
Copy link
Owner Author

Yup - 0.3.6 and 0.4.0 have both been yanked.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.