You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I'm rewriting the HTML manipulation code to fix the root cause of quite a few of the recent bugs including #45, #36, #29, #28, #25, #21, and #2. These issues were caused due to goquery (and thus kepubify) internally using golang.org/x/net/html which is a HTML5 library. The parsing was fine for nearly all books, but it wasn't tolerant of self-closing non-void elements (as the spec says it should, but it's not really a useful thing to do), which many XML generators generate when an element doesn't contain any children. The code generation was also usually fine as it generated valid XML (it did things optional for HTML5 but mandatory for XHTML: putting the closing /> on void elements, having an empty ="" on boolean attributes, using only a few named entities, etc), but it caused a few issues with not escaping NBSPs, and messing up the XML declarations found in XHTML for EPUB2 books.
In addition, these changes will improve the performance and memory usage of kepubify (there will be a lot less string allocations and copying).
I'm rewriting the HTML manipulation code to fix the root cause of quite a few of the recent bugs including #45, #36, #29, #28, #25, #21, and #2. These issues were caused due to goquery (and thus kepubify) internally using
golang.org/x/net/html
which is a HTML5 library. The parsing was fine for nearly all books, but it wasn't tolerant of self-closing non-void elements (as the spec says it should, but it's not really a useful thing to do), which many XML generators generate when an element doesn't contain any children. The code generation was also usually fine as it generated valid XML (it did things optional for HTML5 but mandatory for XHTML: putting the closing/>
on void elements, having an empty=""
on boolean attributes, using only a few named entities, etc), but it caused a few issues with not escaping NBSPs, and messing up the XML declarations found in XHTML for EPUB2 books.In addition, these changes will improve the performance and memory usage of kepubify (there will be a lot less string allocations and copying).
The majority of the XHTML/XML/HTML5 fixes has been done in my fork of
golang.org/x/net/html
: https://github.com/geek1011/net/commits?author=geek1011.The text was updated successfully, but these errors were encountered: