-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
libxml 2.10+ compatibility for <dot.> and <foo:colon> angle-bracket syntax #102
Comments
Ah, namespaces, great. Wouldn't be surprised if this is just the new libxml version being more correct about namespace handling; some of our test code throws around "foo"s. I'll have a look at it. |
Any progress on this? |
Ping? The Fedora 40 mass rebuild is expected in January, and without a fix that will be the second consecutive mass rebuild with an unsuccessful build for rubygem-ronn-ng, which puts its continued presence in Fedora at risk, as well as the various packages which depend on it to build. |
Okay, made some progress. I can now reliably reproduce this error under both Ruby 2.7.4 and 3.1.4, the versions in the latest Debian, and got my dev env working regularly. The trick was giving up on the system ruby and using rbenv and user-installed Rubies. And it's also reproducing on a Mac with a Homebrew-installed libxml2.12. Think I'll be able to get this sorted out soon now. |
Okay, I think I've got a partial diagnosis here. I'm not sure this is actually due to libxml 2.10 exclusively, or per se. I think it may be a larger issue, also involving the switch from rdiscount to Kramdown as the Markdown processing library, or changes between versions in Kramdown, or the use of GitHub Flavored Markdown. I think: The newly-failing test due to that output change where the "foo:" disappears is not the only problem. The current expected output in that test, where it expects
That means that That ronn-format.7 description doesn't say anything about special behavior for
It doesn't say why dots and colons are an exception here. But I'll bet that's for compatibility with Markdown's special I suspect that the "namespace" stripping happens either during the call to This stuff is done by the Next steps: Figuring out exactly where the namespace stripping is happening, and try to revert it. I'm thinking maybe the If I can't get that to work, I'm inclined to just mark the new output as the expected correct output, since a For the longer term, should probably define more thoroughly and formally what the behavior is for these |
Oh. My. Gosh. I think I've been reading the test error message backwards this whole time.
That means the test's expected output definition currently has I think the new output that includes the |
…-bracket syntax (#102) A couple years ago, for special markdown-link-like things like "<foo:bar>", the text up to the colon was being stripped, maybe due to namespace handling. Now, it's being preserved. Maybe due to libxml 2.x changes, maybe due to Kramdown/Nokogiri changes or how we're now using them instead of rdiscount and hpricot. I think the new "foo:"-preserving behavior is more correct, both in terms of HTML/XML processing an Ronn's definition of special angle-bracket "<word>" syntax. So I'm marking the new behavior as expected, and not doing any behavioral code changes. This should get the tests passing again under recent Linux and macOS releases.
… be preserved (#102) A couple years ago, for special markdown-link-like things like "<foo:bar>", the text up to the colon was being stripped, maybe due to namespace handling. Now, it's being preserved. Maybe due to libxml 2.x changes, maybe due to Kramdown/Nokogiri changes or how we're now using them instead of rdiscount and hpricot. I think the new "foo:"-preserving behavior is more correct, both in terms of HTML/XML processing an Ronn's definition of special angle-bracket "<word>" syntax. So I'm marking the new behavior as expected, and not doing any behavioral code changes. This should get the tests passing again under recent Linux and macOS releases.
…efixes to be preserved (#102) A couple years ago, for special markdown-link-like things like "<foo:bar>", the text up to the colon was being stripped, maybe due to namespace handling. Now, it's being preserved. Maybe due to libxml 2.x changes, maybe due to Kramdown/Nokogiri changes or how we're now using them instead of rdiscount and hpricot. I think the new "foo:"-preserving behavior is more correct, both in terms of HTML/XML processing an Ronn's definition of special angle-bracket "<word>" syntax. So I'm marking the new behavior as expected, and not doing any behavioral code changes. This should get the tests passing again under recent Linux and macOS releases.
…efixes to be preserved (#102) A couple years ago, for special markdown-link-like things like "<foo:bar>", the text up to the colon was being stripped, maybe due to namespace handling. Now, it's being preserved. Maybe due to libxml 2.x changes, maybe due to Kramdown/Nokogiri changes or how we're now using them instead of rdiscount and hpricot. I think the new "foo:"-preserving behavior is more correct, both in terms of HTML/XML processing an Ronn's definition of special angle-bracket "<word>" syntax. So I'm marking the new behavior as expected, and not doing any behavioral code changes. This should get the tests passing again under recent Linux and macOS releases.
…efixes to be preserved (#102) A couple years ago, for special markdown-link-like things like "<foo:bar>", the text up to the colon was being stripped, maybe due to namespace handling. Now, it's being preserved. Maybe due to libxml 2.x changes, maybe due to Kramdown/Nokogiri changes or how we're now using them instead of rdiscount and hpricot. I think the new "foo:"-preserving behavior is more correct, both in terms of HTML/XML processing an Ronn's definition of special angle-bracket "<word>" syntax. So I'm marking the new behavior as expected, and not doing any behavioral code changes. This should get the tests passing again under recent Linux and macOS releases.
…efixes to be preserved (#102) A couple years ago, for special markdown-link-like things like "<foo:bar>", the text up to the colon was being stripped, maybe due to namespace handling. Now, it's being preserved. Maybe due to libxml 2.x changes, maybe due to Kramdown/Nokogiri changes or how we're now using them instead of rdiscount and hpricot. I think the new "foo:"-preserving behavior is more correct, both in terms of HTML/XML processing an Ronn's definition of special angle-bracket "<word>" syntax. So I'm marking the new behavior as expected, and not doing any behavioral code changes. This should get the tests passing again under recent Linux and macOS releases.
…efixes to be preserved (#102) A couple years ago, for special markdown-link-like things like "<foo:bar>", the text up to the colon was being stripped, maybe due to namespace handling. Now, it's being preserved. Maybe due to libxml 2.x changes, maybe due to Kramdown/Nokogiri changes or how we're now using them instead of rdiscount and hpricot. I think the new "foo:"-preserving behavior is more correct, both in terms of HTML/XML processing an Ronn's definition of special angle-bracket "<word>" syntax. So I'm marking the new behavior as expected, and not doing any behavioral code changes. This should get the tests passing again under recent Linux and macOS releases.
…efixes to be preserved (#102) A couple years ago, for special markdown-link-like things like "<foo:bar>", the text up to the colon was being stripped, maybe due to namespace handling. Now, it's being preserved. Maybe due to libxml 2.x changes, maybe due to Kramdown/Nokogiri changes or how we're now using them instead of rdiscount and hpricot. I think the new "foo:"-preserving behavior is more correct, both in terms of HTML/XML processing an Ronn's definition of special angle-bracket "<word>" syntax. So I'm marking the new behavior as expected, and not doing any behavioral code changes. This should get the tests passing again under recent Linux and macOS releases.
…efixes to be preserved (#102) A couple years ago, for special markdown-link-like things like "<foo:bar>", the text up to the colon was being stripped, maybe due to namespace handling. Now, it's being preserved. Maybe due to libxml 2.x changes, maybe due to Kramdown/Nokogiri changes or how we're now using them instead of rdiscount and hpricot. I think the new "foo:"-preserving behavior is more correct, both in terms of HTML/XML processing an Ronn's definition of special angle-bracket "<word>" syntax. So I'm marking the new behavior as expected, and not doing any behavioral code changes. This should get the tests passing again under recent Linux and macOS releases.
…efixes to be preserved (#102) A couple years ago, for special markdown-link-like things like "<foo:bar>", the text up to the colon was being stripped, maybe due to namespace handling. Now, it's being preserved. Maybe due to libxml 2.x changes, maybe due to Kramdown/Nokogiri changes or how we're now using them instead of rdiscount and hpricot. I think the new "foo:"-preserving behavior is more correct, both in terms of HTML/XML processing an Ronn's definition of special angle-bracket "<word>" syntax. So I'm marking the new behavior as expected, and not doing any behavioral code changes. This should get the tests passing again under recent Linux and macOS releases.
Yeah, I think the right fix is to assume we'll be running under the newer libxml 2.10+ and recent kramdown/Nokogiri, mark the current output as correct in the test, and ship it. And then look deeper in to what the ". or : in tag contents" behavior should be for a future release. I've checked in the fix on
Dunno if I should bump to using Ruby 3.x yet; I'm not sure exactly how ruby dependencies in Homebrew and Fedora work. I'll have a look at their package definitions. If you want to test this now from the repo, it's on |
Might have found a more-precise cause: yeah, it's nokogiri and libxml2, and maybe I'm seeing somewhat different behavior in my testing than you're seeing in the built Fedora one because I'm using the pre-built nokogiri and its vendored libxml2 library even when I'm testing ronn-ng on Fedora. Nokogiri ships ("vendors") its own libxml2 like you say. In my testing, this behavior change happened with nokogiri 1.14.3: earlier versions <= 1.14.2 have the old behavior that strips the "foo:" prefix, and 1.14.3+ preserve the "foo:" prefix. In my testing, this doesn't seme to depend what system libxml2 is installed on my box or brew environment; simply changing the nokogiri version in the The nokogiri 1.14.3 release notes say that version 1.14.3 upgraded its vendored libxml2 from 2.10.3 to 2.10.4. (For security reasons, it says.) And hey, look: the libxml 2.10.4 release notes talk about a behavior change related to namespaces in HTML elements, which looks like what we saw here. Though if it was 2.10.3 hitting Fedora that caused the change there, why is the nokogiri 1.14.2 gem that uses 2.10.3 showing the old behavior in my test environment?
And that seems consistent with what you say here:
That nokogiri 1.14.3 transition is maybe when the RubyGems version of nokogiri picked up the libxml2 transition that caused the Fedora build problem, and that's when it started showing up in my own testing, because I just test using gems and rbenv and don't go through the whole RPM package build process and take a dependency on the nokogiri RPM. I wonder if there's an easy way for me to set up the Ronn-NG repo so I could run it from the source but using the nokogiri and other dependencies from the system-installed RPM packages instead of gems I pick up in my dev Ruby environments using bunder and gem? Anyway, after this further reading and experimentation, I'm pretty confident that the new behavior of retaining the "foo:" is correct-er (as far as Ronn-NG is concerned), and updating the test fixture and gemspec dependency versions is the right fix. I merged the previous test fix to main, and altered the gemspec dep for nokogiri in b4deb42 (#108). I think we're about ready to cut an 0.10.1 release that will build the Fedora RPM successfully. |
…arget environments Allowing a broader range of versions for dependency gems will let a given ronn-ng version work in more environments, particularly distros that have fixed versions of the dependency gems in their packages, and systems running older Ruby versions. Added comments to the gemspec to explain why the version restrictions are there. I left them all open-ended on the upper bound because I'm not aware of any dep versions that break things, and I'll just assume back-compatibility until I know otherwise for specific gems. The nokogiri minimum version is 1.14.3, because that's required for the behavior we want for HTML tag names with ":" in them. See #102. I updated Gemfile.lock instead of removing it, but I no longer know if that's significant or good in deployment. Right now I'm mostly considering it an indication of the exact versions I tested it most heavily with. I don't expect downstream packagers to use or respect Gemfile.lock.
I've published a Ronn-NG 0.10.1.pre4 prerelease to GitHub and RubyGems that I think will fix this and get the Fedora RPM build working again (but am not sure, bc I haven't tested RPM building itself). |
This sounds plausible to me. Please note that on Fedora, we run CI called Koschei, which can catch issues like this caused by library updates, so I am pretty sure this started to happen with update libxml2.
On Fedora, you can either use the system version of Nokogiri:
With Bundler, you would also need to use something like Or you should be able to use Nokogiri installed from RubyGems like this:
Basically it is needed to install all the build dependencies and later 1) tell nokogiri configure script to use them 2) use the |
Here is the PR fixing the Fedora RPM. It links to Fedora CI - scratch build, where you could eventually get to the build.log. There can be seen that 1) the patch was applied 2) the test is passing. Please note there is applied custom patch instead yours, but we will use either the official patch or wait for official stable release. We don't do pre-releases often, because sometimes happens that there is some pre-release of new major version, which never materializes into official version. This likely won't be the case, but the patch is easier anyway :) Also, thx a lot for looking into this! BTW the bad news is that there is now also different fork of Ronn, which was recently started to be used by Bundler :( |
The good news is that they are likely to suffer this issue, so ronn-ng is ahead with the fix ;) |
Now I see the #87, so you are aware, sorry 🙈 |
Sounds likely to me. I barely know what I'm doing here with the basic Ruby and Gems stuff, and have basically no experience with RPM packaging or how Fedora ships nokogiri. BTW, thank you for a good bug report here. It was specific, readable, and had references, and I think you got the cause right in the first post. It just took me a while to get up to speed on this, because I didn't really know enough about RPMs, Fedora releases, and the nokogiri/libxml2 interaction to really understand what you were talking about until I spent some time looking around the code.
Ah, that might be what I need here. I'll look in to it.
Glad to see we agree on the approach here. (I think.) I think I'll have the time over the next couple days to pull this in to the official upstream release, and we can save the work/divergence of a patch at the RPM level.
You're welcome! Sorry it's taken so long. I'm working at a startup for my day job and it got really busy this year, and I caught Covid a few months ago and it really sucked and gobbled up my free time.
A new challenger has appeared! :) That fork only has a few changes in it, and I've already pulled them in to Ronn-NG here. (#87) Once this 0.10.1 release goes out, I'll be at feature parity and maybe there'll be no need for another fork. |
The fix here is merged to main, and I've tested it as well as I can at the moment. Setting this issue to Closed. Will ping you when it rolls out in a release. |
…refixes to be preserved (#102) A couple years ago, for special markdown-link-like things like "<foo:bar>", the text up to the colon was being stripped, maybe due to namespace handling. Now, it's being preserved. Maybe due to libxml 2.x changes, maybe due to Kramdown/Nokogiri changes or how we're now using them instead of rdiscount and hpricot. I think the new "foo:"-preserving behavior is more correct, both in terms of HTML/XML processing an Ronn's definition of special angle-bracket "<word>" syntax. So I'm marking the new behavior as expected, and not doing any behavioral code changes. This should get the tests passing again under recent Linux and macOS releases.
…target environments Allowing a broader range of versions for dependency gems will let a given ronn-ng version work in more environments, particularly distros that have fixed versions of the dependency gems in their packages, and systems running older Ruby versions. Added comments to the gemspec to explain why the version restrictions are there. I left them all open-ended on the upper bound because I'm not aware of any dep versions that break things, and I'll just assume back-compatibility until I know otherwise for specific gems. The nokogiri minimum version is 1.14.3, because that's required for the behavior we want for HTML tag names with ":" in them. See #102. I updated Gemfile.lock instead of removing it, but I no longer know if that's significant or good in deployment. Right now I'm mostly considering it an indication of the exact versions I tested it most heavily with. I don't expect downstream packagers to use or respect Gemfile.lock.
Aha! Think I figured out how to test properly on Fedora in its RPM-supplied environment: skip the rbenv and Tested that on Fedora 39 just now, and it passes:
I've published this as 0.10.1.pre6 to both GitHub and RubyGems. Think this is the release candidate. https://github.com/apjanke/ronn-ng/releases/tag/v0.10.1.pre6 Gonna sleep on this overnight and if no other problems occur to me, publish it as the real 0.10.1 release. (Don't think there's much to be gained by waiting further, since I don't have any users doing testing from the repo |
Yes, that is how we make the package. You could also use Bundler, but it is necessary to have the dependencies installed and run There is also possibility to use https://packit.dev/, but I have not real experience how to setup it and I don't think that upstreams need to bother with PRMs. |
And this is not RPM issue. We are just sometimes ahead of the game and Nokogiri will eventually move to more recent libxml |
Could that get Nokogiri built with its non-default options which have it use the system-supplied libxml2 instead of its vendored libxml2? I couldn't see a way to do that, since bundle uses the Gemfile/gemspec, and those declare their dependencies in terms of just gem name and version; I didn't see way to specify build options for the dependency gems. But again, I'm no Ruby expert. That packit thing looks interesting, but I'm going to pass on it, because I'm not confident that I'd be able to get it to correctly reproduce the Fedora-supplied RPMs for ronn's dependencies, and the only thing I care about here is making sure ronn-ng works against the official RPM-supplied r"ubygem-*" RPMs. I can just build VMs for each of the Fedora releases to get those for use as test beds. I added some notes about testing Ronn-NG on Fedora here in dd2ceeb, and will expand on that as I get better with it. I'm feeling good about this Ronn-NG 0.10.1 build, and will go ahead and publish the real release for it today. |
Actually, it might depend on a presence of Gemfile.lock. I suspect there might be stored architecture, which might make the difference. If you drop it, then it gets what it has available.
Yeah, right. No worries
Nice 🥳 |
Ronn-NG 0.10.1 is published!
I think this will work for y'all with current Fedora releases. Please let me know if there's any problems! |
Hi, @voxik, just circling back on this to see if it worked out. Is ronn-ng v0.10.1 working out okay in the new Fedora builds? On the Fedora rubygem-ronn-ng packaging page (https://packages.fedoraproject.org/pkgs/rubygem-ronn-ng/rubygem-ronn-ng/), I see that Rawhide now has an "0.10.1-2.fc40" package built under "Stable", which looks like a good sign? |
Thank you for asking. Yeah, the 0.10.1 have been just in Rawhide (the development version of Fedora) so fart, therefore it have not reached masses yet. But as far as I can say, the package itself builds fine and I am not aware it would cause troubles elsewhere in packages using Ronn-NG to generate their manpages. I am quite confident there won't be any further issues. Appreciate your support 👍 |
Great! Swing on back if anything does crop up. |
It seems that since libxml 2.10.3 has landed in Fedora, we observe the following test error ronn-ng test suite:
Please note that this is not issue with pre-built nokogiri installed from rubygems.org, because that likely bundles older libxml2
The text was updated successfully, but these errors were encountered: