Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

decodeHtml4 & decodeHtml5 in one step #18

Closed
vanniktech opened this issue Aug 17, 2023 · 4 comments · Fixed by #20
Closed

decodeHtml4 & decodeHtml5 in one step #18

vanniktech opened this issue Aug 17, 2023 · 4 comments · Fixed by #20

Comments

@vanniktech
Copy link

decodeHtml5 isn't a set which also contains the decodeHtml4 entities. I'd like to invoke both. I know that I can call them one after another however, but I'm guessing performance wise a single call would be better. Since it'd only need to loop once instead of twice over the string. Also NumericEntityDecoder() is used twice which is unnecessary.

Esentially, I want an aggregator like this:

AggregateTranslator(
    LookupTranslator(EntityMaps.HTML4Decode),
    LookupTranslator(EntityMaps.HTML5Decode),
    NumericEntityDecoder(),
)

Side note: I also found decodeHtml confusing as at first I thought this is decode4+5 but it's just 5, in which case, I'd go for removing that method to make the API smaller & conciser.

@MohamedRejeb
Copy link
Owner

decodeHtml5 already contains all HTML4 entities so you can just use decodeHtml5 and you will cover all the entities.
True decodeHtml is confusing, maybe I'll deprecate it and suggest replacing it with decodeHtml5. I added it to make it clear that this is going to cover any HTML entity (4+5) so people don't get confused

@vanniktech
Copy link
Author

It does not. The following test case fails for HTML5:

  @Test fun decode() {
    assertEquals("Å", KsoupEntities.decodeHtml4("Å"))
    assertEquals("Å", KsoupEntities.decodeHtml5("Å"))
  }

@MohamedRejeb
Copy link
Owner

It's a bug. I'll check this

@MohamedRejeb
Copy link
Owner

I'll release a new version soon containing the fix for this issue and the other issues as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants