Skip to content
This repository has been archived by the owner on Aug 26, 2023. It is now read-only.

Matching libxml2 headers with nokogiri #71

Closed
stevecheckoway opened this issue Aug 15, 2018 · 6 comments
Closed

Matching libxml2 headers with nokogiri #71

stevecheckoway opened this issue Aug 15, 2018 · 6 comments
Assignees
Milestone

Comments

@stevecheckoway
Copy link
Collaborator

Build configurations

Nokogiri has (for our purposes) two supported configurations

  1. Compile with an embedded libxml2 (the default and most supported option)
  2. Compile with system libraries

Nokogumbo has two supported configurations

  1. Compile with libxml2 (and nokogiri) headers (this should be the most efficient option)
  2. Compile without the headers present and only use the nokogiri API to construct the DOM tree

Issue

There's no guarantee that the version of libxml2 that nokogumbo builds against in configuration 1 matches the version of libxml2 that nokogiri builds against.

The reason is that ext/nokogumboc/extconf.rb checks for the presence of a system libxml2 (via have_library('xml2', 'xmlNewDoc')) and then searches for nokogiri.h. If nokogiri is built in its default configuration, then there's no guarantee that the system library matches.

Desired goal

I think the best approach is for extconf.rb to perform the following actions.

  1. If nokogiri was compiled with an embedded libxml2 and if the headers it needs (in this case libxml/tree.h and nokogiri.h) are available from the nokogiri gem, then build nokogumbo configuration 1.
  2. If nokogiri was compiled with the system libxml2 and the system library headers (and nokogiri.h) are available, then build nokogumbo in configuration 1.
  3. Otherwise, build nokogumbo in configuration 2.

Blocking issue

I believe that for nokogiri 1.7, we can achieve this using this extconf.rb. Unfortunately, for 1.8, this approach doesn't work because the built libxml2, along with its headers, are deleted after the extension is built. See sparklemotion/nokogiri#1325 for more discussion on this.

One possible way forward (and I haven't tested this because I don't like this solution) would be to test if the libxml2 symbols are available in the nokogiri .so/.bundle and if so, then download the corresponding version of libxml2 and extract the headers.

It would be better if nokogiri either didn't delete the built libraries or first installed all of its headers into the extension directory.

@stevecheckoway stevecheckoway added this to the 2.0 milestone Aug 15, 2018
@flavorjones
Copy link
Collaborator

Hi,

Please feel free to tag me on issues related to the Nokogiri interop. I generally follow nokogumbo but it would help to be explicit and tag me when discussion is needed.

I'm happy to discuss ways to make compilation and integration easier for nokogumbo. I wasn't particularly fond of removing header files and libraries from the installed nokogiri directory hierarchy, so if there are compelling reasons to change that behavior I'm open to it (potentially with a nokogiri installation flag that nokogumbo would document).

Tagging @larskanis who was instrumental in the original series of commits to clean up post-installation for his thoughts.

@rubys
Copy link
Owner

rubys commented Aug 15, 2018

Ideally, installing nokogumbo -- with or without nokogiri previously being installed -- would be as easy as adding a singe line to a Gemfile and then running bundle install. Perhaps it would make sense to have a separate nokogiri-headers gem that would provide access to the necessary headers and libraries?

@stevecheckoway
Copy link
Collaborator Author

@rubys Over at sparklemotion/nokogiri#1325 I suggested a way that this might work. Essentially nokogiri would install its headers and, if system libraries aren't used, the headers for libxml2 and libxslt.

If that (or a similar approach) seems reasonable to @flavorjones and the other nokogiri maintainers, then I think we should be able to modify nokogumbo's extconf.rb to check for the headers in nokogiri's extension dir.

@rubys
Copy link
Owner

rubys commented Aug 15, 2018

Cool. Just a bit of history: I originally developed this for my personal use of Mac OS/X and Ubuntu. Others found it useful, and tried it on other platforms. Gentoo has some unique requirements and would from time to time propose patches that would break other platforms (most commonly, Mac OS/X). What you propose may end of fixing this... or once again break Gentoo further.

I'm not certain where I'm going with this, other than to say that testing on those three platforms is ideal. It might mean that there needs to be a new set of instructions written up for Gentoo users. See https://forums.gentoo.org/viewtopic-p-8234666.html?sid=79e6655c46226b3440fd9eca8655133e for an example of what they currently need to do.

@stevecheckoway
Copy link
Collaborator Author

I'll investigate testing on Gentoo. This SO answer suggests using Docker with travis-ci for testing. I'm going to add a new issue to track that.

@stevecheckoway
Copy link
Collaborator Author

Somewhat painful work around in #86. I believe I was wrong about Nokogiri 1.7 being different than 1.8. I think it's simply that the build libraries are left in place if there's a .git directory and prior to #86, we were including the Nokogiri extconf.rb which was rebuilding the libraries and then leaving them in place due to the .git directory.

If sparklemotion/nokogiri#1788 (or similar) gets merged, then we can simplify our extconf.rb. But for now, I think #86 is our least-bad option.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants