Fix broken image downloads and content parsing issues #51
As was already known, v0.0.17 was broken because images weren't being downloaded.

The main reason is that, in an earlier commit of #35, the whole `content.data` was passed through XML entities encoding. This (1) encoded, for instance, all the less-than/greater-than characters surrounding tags (e.g. `&lt;p&gt;...`), so the `cheerio` loading/parsing did not work as intended, which (2) rendered all the subsequent processing code (images and such) useless. Since proper XHTML5 entities encoding of the content data should be the responsibility of lib users, this encoding was removed, which made image parsing and downloading work again.

Finally, the `content.data` was being output as HTML because cheerio's `$.xml()` returns proper HTML even when loading an encoded string. (¯\_(ツ)_/¯, sense: makes none)

Moreover, several other issues with content data parsing were fixed, namely the removal of the `ignoreWhitespace` option on `cheerio` too, which fixes the issue experienced in #38 and renders that solution obsolete.

Summing up, the list of fixes goes like:
- `<br></br>`, thus making EPUB validation fail (4237a4b)
- Removal of the `ignoreWhitespace` option on `cheerio` loading; fixes the issue from #38 ("Use raw data (rather than entities parsed version); required for new line preservation"), renders it obsolete and reverts it (c968d40)
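
To illustrate the entities-encoding problem described above, here is a minimal, dependency-free sketch. It is not the library's actual code: `encodeXmlEntities` and `findImageSources` are hypothetical helpers, the example URL is made up, and a regular expression stands in for the `cheerio` lookup only so the snippet runs on its own. It shows why encoding the whole `content.data` before parsing leaves no tags for the image step to find.

```javascript
// Hypothetical stand-in for the XML entities encoding that was applied
// to the whole content.data (assumption: not the library's real helper).
function encodeXmlEntities(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Simplified stand-in for the image discovery step. The library uses
// cheerio for this; a regex is used here only to avoid a dependency.
function findImageSources(markup) {
  const sources = [];
  const imgTag = /<img\b[^>]*\bsrc="([^"]*)"/g;
  let match;
  while ((match = imgTag.exec(markup)) !== null) {
    sources.push(match[1]);
  }
  return sources;
}

// Made-up chapter content with a single image.
const data = '<p>Cover: <img src="https://example.com/cover.png" /></p>';
const encoded = encodeXmlEntities(data);

console.log(findImageSources(data));    // one source: the cover URL
console.log(findImageSources(encoded)); // empty: "<img" became "&lt;img"
```

Passing the raw markup through, and leaving entity encoding of the text content to the caller, keeps the tags visible to whatever parser runs next.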