Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Debian looking to backport parser changes for CVE-2021-37714 #1627

Closed
utkarsh2102 opened this issue Sep 6, 2021 · 5 comments
Closed

Debian looking to backport parser changes for CVE-2021-37714 #1627

utkarsh2102 opened this issue Sep 6, 2021 · 5 comments
Labels
discussion Discussion for a new feature, or other change proposal

Comments

@utkarsh2102
Copy link

Hello,

Thank you for your work on jsoup. However, since we have CVE-2021-37714 (which is fixed in the latest release), I'd like to backport the fixes to older releases (in Debian). To do that, I need to know the relevant commits that are sufficient to be backported for fixing the mentioned CVEs. On a quick look, it wasn't clear which ones are they, could you please point me to them? TIA! \o/

@jhy
Copy link
Owner

jhy commented Sep 6, 2021

Hi,

Personally, I am quite wary of the idea of trying to backport these changes to the old version of jsoup that Debian publishes. There have been many changes to the parser and related classes since that release, and I don't know how cleanly the patches will apply, and whether that could then still be considered a 1.10.x version. It would be some hybrid fork between 1.10 and 1.14.2 and is in my view likely to behave differently (including creating different parse trees) than any published version of jsoup.

My strong recommendation is that users upgrade to 1.14.2, and get into a position to stay up to date with the current version.

Here are the details on the changes:

I believe that I included the "Fuzz" label on each issue, which links to the relevant commit:

Closed issues with Fuzz label: https://github.com/jhy/jsoup/issues?q=label%3Afuzz+is%3Aclosed

You would need to pick up at least the changes in the parser from 1.14.1 and 1.14.2. Given that Debian is running such an old fork of jsoup (1.10.2, from Jan 2017) there are probably other changes in the parser and related interacting classes that will be required.

Take a look at these files:
https://github.com/jhy/jsoup/blob/master/src/main/java/org/jsoup/parser/HtmlTreeBuilder.java
https://github.com/jhy/jsoup/blob/master/src/main/java/org/jsoup/parser/HtmlTreeBuilderState.java
History: https://github.com/jhy/jsoup/commits/master/src/main/java/org/jsoup/parser
Test history: https://github.com/jhy/jsoup/tree/master/src/test/java/org/jsoup/parser

Specific tests:
https://github.com/jhy/jsoup/blob/master/src/test/java/org/jsoup/integration/FuzzFixesTest.java
https://github.com/jhy/jsoup/blob/master/src/test/java/org/jsoup/integration/FuzzFixesIT.java
(Should be enough to identify that any infinite loops are detected that were fixed with the 1.14.2 release, but won't tell us if there are other instances in the Debian fork that were not identified - as these tests resources only include failures the fuzzer detected in the HEAD version it was running against. If you do create this fork I would suggest also setting up similar fuzzing infrastructure to build confidence)

Let me know if I can help detail out anything else!

@jhy jhy added the discussion Discussion for a new feature, or other change proposal label Sep 6, 2021
@jhy jhy changed the title Fix for CVE-2021-37714 Debian looking to backport parser changes for CVE-2021-37714 Sep 6, 2021
@utkarsh2102
Copy link
Author

Hello @jhy,

Thanks a bunch for your detailed reply! 🎉

Personally, I am quite wary of the idea of trying to backport these changes to the old version of jsoup that Debian publishes. There have been many changes to the parser and related classes since that release, and I don't know how cleanly the patches will apply, and whether that could then still be considered a 1.10.x version. It would be some hybrid fork between 1.10 and 1.14.2 and is in my view likely to behave differently (including creating different parse trees) than any published version of jsoup.

Aha, I see! (more below)

My strong recommendation is that users upgrade to 1.14.2, and get into a position to stay up to date with the current version.

We'll indeed move to 1.14.2 in the newer releases of Debian, however upgrading to the latest version in the older releases (buster, stretch, jessie) isn't really a viable option, I am afraid. :(

Here are the details on the changes:
I believe that I included the "Fuzz" label on each issue, which links to the relevant commit:
Closed issues with Fuzz label: https://github.com/jhy/jsoup/issues?q=label%3Afuzz+is%3Aclosed

Oh wow, that is an awful lot! Are these all, more or less, affiliated to CVE-2021-37714?

You would need to pick up at least the changes in the parser from 1.14.1 and 1.14.2. Given that Debian is running such an old fork of jsoup (1.10.2, from Jan 2017) there are probably other changes in the parser and related interacting classes that will be required.

Hah, I was also looking forward to backporting this to v1.8.1. But now as I see it, that's a bit too ambitious, I guess? Is it even possible to go back so far w/o causing regressions, et al?

Let me know if I can help detail out anything else!

Thank you so much! 🎉

@jhy
Copy link
Owner

jhy commented Sep 6, 2021

Oh wow, that is an awful lot! Are these all, more or less, affiliated to CVE-2021-37714?

Yes.

Hah, I was also looking forward to backporting this to v1.8.1. But now as I see it, that's a bit too ambitious, I guess? Is it even possible to go back so far w/o causing regressions, et al?

The logical structure of the parser is pretty similar between 1.8 and 1.10. So I'm not sure how much of a bigger piece of work it would be. (The parser structure changed largely in 1.6.) The object layout of the nodes package changed in 1.12 which I expect you would need to accommodate.

While you're in there, maybe you could fix the build so that the test results are not just ignored?
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=893212
https://salsa.debian.org/java-team/jsoup/-/blame/master/debian/maven.properties

Seeing that made me lose a lot of trust in Debian's QC practices.

@jhy
Copy link
Owner

jhy commented Sep 13, 2021

(Closing, but please feel free to reopen if there's any more Qs or discussion points. Thanks!)

@jhy jhy closed this as completed Sep 13, 2021
@utkarsh2102
Copy link
Author

Thank you, @jhy. I'll try to ping the person to update the Debian package in the -devel release and might as well open the bug to do that.

Thanks for your help, shall let you know if we want more information! \o/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion Discussion for a new feature, or other change proposal
Projects
None yet
Development

No branches or pull requests

2 participants