add *_with_custom_entities versions of all `unescape_*` methods #261

Merged 4 commits on Feb 10, 2021

Contributor

@pchampin pchampin commented Feb 8, 2021

These versions take an additional parameter:
a hashmap containing custom entity definitions.
A typical use case is for the user to parse those entity definitions
from the DOCTYPE.
An example is provided in examples/custom_entities.rs.
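
To make the idea of a caller-supplied entity table concrete, here is a minimal standalone sketch. It is not the code from this PR and does not use quick-xml's API: the function name `expand_custom_entities`, the `String`-keyed map, and the `String` error type are all assumptions chosen for illustration, and predefined entities such as `&amp;` are deliberately not handled.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not quick-xml's API): expands `&name;` references
/// using only the caller-supplied table. Unknown names and unterminated
/// references are reported as errors.
fn expand_custom_entities(
    input: &str,
    custom_entities: &HashMap<String, String>,
) -> Result<String, String> {
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(amp) = rest.find('&') {
        // Copy everything before the reference unchanged.
        out.push_str(&rest[..amp]);
        let after = &rest[amp + 1..];
        let semi = after
            .find(';')
            .ok_or_else(|| "unterminated entity reference".to_string())?;
        let name = &after[..semi];
        let value = custom_entities
            .get(name)
            .ok_or_else(|| format!("unknown entity: {}", name))?;
        out.push_str(value);
        rest = &after[semi + 1..];
    }
    // Copy the tail that contains no further references.
    out.push_str(rest);
    Ok(out)
}
```

In this sketch the caller would fill the map, for instance from `<!ENTITY ...>` declarations found in the DOCTYPE, and pass it alongside the text to unescape.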

Owner

@tafia tafia left a comment

Thanks!
I've made a few comments. Also, I feel like there is a lot of repeated code. Not a big deal, but if you find a way to make it a little more DRY, that would be much appreciated!

src/escapei.rs Outdated
parse_hexadecimal(&bytes[2..])
} else if bytes.starts_with(b"#") {
parse_decimal(&bytes[1..])
bytes if bytes[0] == b'#' => {
Owner

bytes if bytes.starts_with(b"#")

(to handle the case where bytes is empty, where bytes[0] would panic)
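
The empty-slice concern in this comment can be shown with a small standalone sketch. The helper name `is_numeric_reference` is hypothetical and is not part of the PR; it only isolates the guard condition under discussion.

```rust
/// Hypothetical helper: reports whether an entity body (the bytes between
/// `&` and `;`) looks like a numeric character reference such as `#38`.
fn is_numeric_reference(bytes: &[u8]) -> bool {
    // `bytes[0] == b'#'` would panic on an empty slice (e.g. from `&;`).
    // `starts_with` takes a slice, hence `b"#"` rather than `b'#'`,
    // and simply returns false when `bytes` is empty.
    bytes.starts_with(b"#")
}
```

An equivalent panic-free check is `bytes.first() == Some(&b'#')`.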

Contributor Author

absolutely!

src/escapei.rs Outdated
parse_hexadecimal(&bytes[2..])
} else if bytes.starts_with(b"#") {
parse_decimal(&bytes[1..])
bytes if bytes[0] == b'#' => {
Owner

same comment

let decoded = reader.decode(&*self.value);
let unescaped =
    unescape_with(decoded.as_bytes(), custom_entities).map_err(Error::EscapeError)?;
String::from_utf8(unescaped.into_owned()).map_err(|e| Error::Utf8(e.utf8_error()))
Owner

Are we sure it'll be utf8? Shouldn't we use the reader.decode instead?

Contributor Author

I lean towards keeping things simple: what about requiring that custom_entities contains UTF-8?

Owner

Alright!
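
The consequence of the agreement above (custom entity values must be UTF-8) can be sketched in isolation. The function name `into_utf8` is a hypothetical stand-in for the final conversion step in the snippet under review, not an API of the crate: the unescaped bytes only become a `String` if every substituted value was valid UTF-8, and otherwise the caller gets an error rather than a panic.

```rust
/// Hypothetical stand-in for the final conversion step discussed above:
/// the unescaped bytes are turned into a `String`, which succeeds only
/// when the whole buffer, including substituted entity values, is UTF-8.
fn into_utf8(unescaped: Vec<u8>) -> Result<String, std::str::Utf8Error> {
    String::from_utf8(unescaped).map_err(|e| e.utf8_error())
}
```

A value stored in another encoding, such as a lone Latin-1 byte `0xE9`, would surface here as a `Utf8Error`.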

Contributor Author

pchampin commented Feb 9, 2021

About making the code more DRY: yes, it would be better, but I wanted to test the idea with you before refactoring... I'll make that change (and address your comments above).

Owner

@tafia tafia left a comment

This is great, thanks!

@tafia tafia merged commit 4c85384 into tafia:master Feb 10, 2021
@Mingun Mingun mentioned this pull request May 21, 2022