Cannot comment out XML start tag for files beginning with UTF-8 BOM (ef bb bf) #36
I would need to see the epub file in question. Also, what happens when you try to use the converted kepub?
A sample epub and the kepub generated from it are available at https://github.com/andychlin/test. When I put both files on my Kobo Forma, the text from the epub is displayed, but the Kobo Forma cannot display the text from the kepub.
pgaskin added a commit to pgaskin/net that referenced this issue on Jan 12, 2020
This option treats the UTF-8 BOM, if present, as whitespace to prevent moving comments into the body element. This is mainly intended for use with RenderOptionAllowXMLDeclarations to prevent the XML declaration being moved into the body element (which is invalid). See pgaskin/kepubify#36 for an example of this.
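To illustrate the idea in the commit message above, here is a minimal Go sketch of a check that skips a leading UTF-8 BOM the same way it skips whitespace before looking for the XML declaration. The helper name `startsWithXMLDecl` and its `bomAsWhitespace` flag are hypothetical and are not part of the pgaskin/net fork or kepubify; the actual option lives in the HTML parser and may work quite differently.

```go
package main

import (
	"bytes"
	"fmt"
)

// startsWithXMLDecl is a hypothetical helper (not the real parser option).
// It reports whether data begins with an XML declaration ("<?xml"),
// ignoring leading ASCII whitespace. When bomAsWhitespace is true, a
// leading UTF-8 BOM (ef bb bf) is skipped as if it were whitespace too.
func startsWithXMLDecl(data []byte, bomAsWhitespace bool) bool {
	if bomAsWhitespace {
		data = bytes.TrimPrefix(data, []byte{0xEF, 0xBB, 0xBF})
	}
	data = bytes.TrimLeft(data, " \t\r\n")
	return bytes.HasPrefix(data, []byte("<?xml"))
}

func main() {
	// A document that begins with a UTF-8 BOM followed by an XML declaration.
	doc := []byte("\xef\xbb\xbf<?xml version=\"1.0\"?><html></html>")

	// Without the flag, the BOM hides the declaration from the check.
	fmt.Println(startsWithXMLDecl(doc, false))
	// With the flag, the BOM is skipped and the declaration is found.
	fmt.Println(startsWithXMLDecl(doc, true))
}
```

The flag mirrors the opt-in nature of the option described in the commit message: callers that do not ask for BOM handling see the old behavior.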
pgaskin added a commit that referenced this issue on Jan 14, 2020
- Improved robustness
  - More is implemented directly in the HTML parser and renderer (see my fork of x/net/html)
  - Better support for XHTML and HTML5 (rather than using a bunch of workarounds)
  - No more regexps for modifying HTML
- Better smart punctuation
  - More punctuation supported
  - More robust (won't apply to everything unconditionally)
  - Now off by default
- Faster and more efficient (15-30% faster, 50-70% less memory)
  - Fewer memory allocations and copies due to use of readers and writers rather than storing the entire file in memory multiple times
  - Stack-based span adding algorithm (rather than recursive, which has more runtime and memory overhead)
  - Use byte arrays or runes rather than strings where possible
  - Better parallel processing of content files
  - Eliminated memory, goroutine, and file descriptor leaks
- Cleaner and better code
  - Easier to extend
  - More stable API
  - More complete unit tests
- More accurate sentence splitting and segment numbering (checked against 3 recent free books)
  - Better match Kobo's behavior by preserving, but not wrapping (in a koboSpan), TextNodes with only whitespace. Previous versions of kepubify collapsed them to a single space, which still works, but is less efficient and slightly different from what Kobo does (although it renders the same).
  - Fixed some edge cases where the segment counter could be incorrectly incremented.
  - Also increment paragraph counter for tables (this case was missing before).
  - Don't increment paragraph counter if spans were added (i.e. an empty or only whitespace paragraph element) (this case was missing before).
- Smaller binary size
- Also run tests on Windows

closes #47, fixes #45, fixes #35
better fix for #36, #29, #28, #26, #21, #14, #10, #5, and #2
pgaskin added a commit that referenced this issue on Jun 11, 2021
This option treats the UTF-8 BOM, if present, as whitespace to prevent moving comments into the body element. This is mainly intended for use with RenderOptionAllowXMLDeclarations to prevent the XML declaration being moved into the body element (which is invalid). See #36 for an example of this.
Version: v2.3.2 on Windows 10
I found that kepubify cannot comment out the XML start tag in files that begin with a UTF-8 BOM (ef bb bf). It worked correctly after I removed the UTF-8 BOM from the files.
Another strange thing is that changing the string "utf-8" to uppercase also makes it work.
In other words, kepubify cannot comment out the start tag when the encoding is written as "utf-8", but it does comment it out when the encoding is written as "UTF-8", even if the file begins with a UTF-8 BOM.
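The workaround described above (removing the BOM before conversion) can be sketched in Go as follows. This is only an illustration of stripping the three BOM bytes from a file's contents; `stripBOM` is a hypothetical helper name, not something kepubify provides, and the sample input is made up.

```go
package main

import (
	"bytes"
	"fmt"
)

// utf8BOM is the three-byte UTF-8 byte order mark (ef bb bf).
var utf8BOM = []byte{0xEF, 0xBB, 0xBF}

// stripBOM is a hypothetical helper: it returns data without a leading
// UTF-8 BOM. Input that does not start with a BOM is returned unchanged.
func stripBOM(data []byte) []byte {
	return bytes.TrimPrefix(data, utf8BOM)
}

func main() {
	// Made-up content file that starts with a BOM.
	in := []byte("\xef\xbb\xbf<?xml version=\"1.0\"?><html></html>")
	out := stripBOM(in)

	fmt.Printf("had BOM: %v\n", bytes.HasPrefix(in, utf8BOM))
	fmt.Printf("has BOM after stripping: %v\n", bytes.HasPrefix(out, utf8BOM))
}
```

Applying such a helper to each content file inside the epub before conversion would correspond to the manual BOM removal reported above.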