Remove jericho-html dependency #952

Open
HelloKAzhao wants to merge 2 commits into apache:master from HelloKAzhao:remove_unused_pkg

HelloKAzhao commented May 18, 2026

Replace jericho-html with jsoup due to licensing incompatibility

What changes were proposed in this pull request?

This PR proposes to replace the jericho-html library with jsoup in the [module-name] module.

Why are the changes needed?

  1. License Compliance: The jericho-html library is distributed under the Eclipse Public License (EPL) / GNU Lesser General Public License (LGPL), which is incompatible with the Apache License 2.0 used by Apache Ranger.
  2. Adopting a Standard Alternative: jsoup is a modern, robust, and widely-used Java HTML parser distributed under the permissive MIT License, making it fully compliant with our project's licensing requirements.
  3. Improved Maintainability: jsoup offers a more intuitive DOM traversal API (similar to jQuery) and is actively maintained, which will improve the long-term maintainability of the codebase.
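
As a rough illustration of the swap described above, the dependency change in the affected module's pom.xml might look like the fragment below. This is a sketch, not the actual diff: the group and artifact IDs are the coordinates both libraries publish under on Maven Central, while `${jericho.version}` and `${jsoup.version}` are hypothetical property names assumed for illustration, and the affected module is not named in this description.

```xml
<!-- Removed: jericho-html (version property name is an assumption)
<dependency>
    <groupId>net.htmlparser.jericho</groupId>
    <artifactId>jericho-html</artifactId>
    <version>${jericho.version}</version>
</dependency>
-->

<!-- Added: jsoup (MIT License); ${jsoup.version} is a hypothetical property -->
<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>${jsoup.version}</version>
</dependency>
```

Defining the version once as a property in the parent pom, if the project follows that convention, keeps the module pom free of hard-coded version numbers.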

Does this PR introduce any user-facing change?

No. This is an internal dependency replacement. The functional behavior of HTML parsing remains consistent.

How was this patch tested?

  1. Verified that the project builds successfully with mvn clean package -DskipTests.
  2. Updated the existing unit tests to reflect the API changes from jericho-html to jsoup.
  3. Ran the affected module's tests to ensure HTML parsing logic works correctly with the new library:
    mvn test -pl [module-name]
