Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.Sign up
Possibe to remove nokogiri dependency? #100
Thanks for asking this question. I think we can get to a decision point pretty quickly, but first I'd like to provide some context.
Nokogiri, through either libxml2 or xerces/nekohtml, does a decent job of validating and fixing broken markup. Building a valid DOM is actually quite valuable for the sanitization function that Loofah is performing for Rails.
Historically, most XSS (and other HTML/js) attack vectors have intentionally used invalid or broken markup to sneak through sanitization. Some historical examples are here:
which are taken directly from the
We run these tests in Loofah's test suite to ensure that we're adequately protecting sites; though you'll notice that Loofah itself contains no logic to actually detect or handle invalid markup -- Loofah relies entirely on Nokogiri (and therefore libxml2 and xerces) to catch and remove-or-fix these cases.
All of this is to say that, if you wanted to experiment with porting Loofah to use Oga or Ox, as long Loofah's test suite and Rails's sanitization tests all pass, then we can have a more serious conversation about whether to replace Nokogiri entirely, or else have it use a pluggable parser.
Does all that make sense?
Replacing Nokogiri with X depends on how closely Loofah is tied into the API of Nokogiri. Oga's API is quite different in a bunch of places so I doubt a simple find-replace would suffice. For those interested there's a guide for migrating from Nokogiri to Oga which can be found at http://code.yorickpeterse.com/oga/latest/file.migrating_from_nokogiri.html.
@YorickPeterse the point I was trying to make above is that this may not be as simple as a mechanical transformation from one backend's API to another's.
The behavior of the parser is actually important to the function of sanitization, and so I'm happy to have a conversation about using other backends if we know the test suite will pass.
@flavorjones Sure, a working test suite is absolutely vital. Also, if the choice is ever made to switch from Nokogiri to X a set of benchmarks would be useful to ensure performance isn't suddenly 9000 times worse. Either way if any help/info is ever needed for Oga feel free to ping me on GitHub or the Twitters.
@mikbe, thanks for your comment.
I've actually participated in dozens of threads and issues for literally hundreds of hours of my life over the last few years, trying to diagnose a wide variety of issues related to Nokogiri working well on OSX.
I don't remember seeing your name in any of those threads. Are you familiar with the underlying challenges that you seem to think are so simple to fix? If so, why haven't you shared your diagnoses and participating in fixing the issues?
If not, then I don't understand why you'd make a comment like this. I see ruby-talk email from you in my inbox from 2012 and 2011 that appear to be from a kind, thoughtful, empathetic individual. Why did you take time out of your day to make this rude, disrespectful, and (worst of all) contentless comment, exactly? You know there are humans who read these comments, yes? And that those people are not obligated to provide services to you?
Don't bother responding. I'm going lock this ticket. Though I would certainly welcome your help in the future, should you decide to participate constructively in the community.