{{ article.content }} gets redundant <html><body> tags #1
kernc referenced this issue Dec 22, 2014: Duplicate <html><body> in output generated using HTML file as source #984 (Closed)
I just tried verifying your problem and I am not able to reproduce:
Interesting. This is what I see:
and is surely responsible for the invalid resulting HTML. That was with BeautifulSoup 4.1.0 as well as (just upgraded) 4.3.2.
Confirming your observations with Python 3.
This might be more of an issue like https://medium.com/@as_w/beware-beautiful-soup-and-lxml-f2fa442daf99 than Python 2 vs. Python 3. I pushed an update, d2c48c0, which specifies html.parser as the parser for BeautifulSoup. Can you check whether this resolves the issue?
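For anyone landing here later, a minimal sketch of what naming the parser explicitly looks like. This is not the code from d2c48c0; the function name `render_fragment` is made up for illustration. It assumes the `bs4` package is installed. When no parser is named, BeautifulSoup picks the best one available, so installing lxml can change the output.

```python
from bs4 import BeautifulSoup


def render_fragment(html_fragment: str) -> str:
    """Parse an HTML fragment and serialize it back to a string.

    Hypothetical helper for illustration only. Naming "html.parser"
    explicitly means the result does not depend on whether lxml or
    html5lib happen to be installed.
    """
    soup = BeautifulSoup(html_fragment, "html.parser")
    return str(soup)


if __name__ == "__main__":
    # html.parser does not wrap a bare fragment in <html><body>.
    print(render_fragment("<p>Hello</p>"))
```

With the lxml parser the same fragment would come back wrapped in `<html><body>...</body></html>`, which matches the symptom reported in this issue.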
It does, thanks.
kernc closed this Dec 22, 2014
Thanks for reporting. I was not aware that BeautifulSoup changes its default parser depending on which packages are installed.
kernc commented Dec 22, 2014
After this plugin runs, each Content.content has a <html><body> prefix and a </body></html> suffix, both inconveniently added by BeautifulSoup, resulting in this issue: getpelican/pelican#984
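If changing the parser is not an option, one possible workaround is to strip the wrapper afterwards. The sketch below uses only the standard library. `strip_html_body_wrapper` is a hypothetical helper, not part of the plugin, Pelican, or BeautifulSoup, and it assumes the wrapper is exactly a bare `<html><body>` ... `</body></html>` pair with no attributes.

```python
import re

# Matches content that is entirely wrapped in a bare <html><body> pair.
# Assumption: the wrapper tags carry no attributes, as in the report above.
_WRAPPER = re.compile(
    r"\A\s*<html>\s*<body>(.*)</body>\s*</html>\s*\Z",
    re.DOTALL | re.IGNORECASE,
)


def strip_html_body_wrapper(content: str) -> str:
    """Return content without an enclosing <html><body> wrapper, if present.

    Hypothetical helper for illustration; content that is not wrapped
    is returned unchanged.
    """
    match = _WRAPPER.match(content)
    return match.group(1) if match else content


if __name__ == "__main__":
    print(strip_html_body_wrapper("<html><body><p>Hi</p></body></html>"))
```

This only hides the symptom. Fixing the parser choice at the source is probably the cleaner route.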