bug 1453796: fix generation of HTML when filtering noincludes #4737
Looking forward to this landing; it will let me finish up a few things. Great job!
I dug in to see when this might have broken, and I can't identify a library update that would have changed this behavior. I tested a few configurations going back 2-3 years, and the issue persists.
Every year, I find something that makes me wish XHTML won. There's a few more instances where
👍 I was able to reproduce the issue locally, and this fixed it. I think more work needs to be done for other instances of .html, but I'm happy to do that in a new PR.
kuma/wiki/content.py
Outdated
# to be closed in the opening tag. For example, without "method='html'"
# the output would be "<iframe/>" instead of the correct
# "<iframe></iframe>".
return doc.html(method='html')
If we decide to fix all the .html() calls, this might be done as a utility function such as to_html, rather than adding a 5-line comment to every call.
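A minimal sketch of what such a helper might look like. The name to_html comes from the comment above; its placement and docstring are assumptions, and the only pyquery call used is the doc.html(method='html') form already present in this PR.

```python
def to_html(doc):
    """Serialize a PyQuery document using HTML rather than XML rules.

    With method='html', empty elements such as <iframe> keep their
    explicit closing tag instead of being collapsed to "<iframe/>".
    """
    return doc.html(method='html')
```

A caller such as filter_out_noinclude could then end with "return to_html(doc)", so the explanation lives in one place instead of being repeated at every call site.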
@@ -2710,6 +2710,7 @@ def test_raw_include_option(self):
<dd>Type: <em>integer</em></dd>
<dd>Przykłady 例 예제 示例</dd>
</dl>
<p><iframe></iframe></p>
I was going to give feedback that the <iframe> test should be in its own test, but 1) I don't want to make you convert to pytest as well, and 2) pyquery will return None if you just serialize an empty <iframe>.
👍
Codecov Report
@@ Coverage Diff @@
## master #4737 +/- ##
=======================================
Coverage 95.83% 95.83%
=======================================
Files 270 270
Lines 24613 24613
Branches 1745 1745
=======================================
Hits 23589 23589
Misses 814 814
Partials 210 210
@jwhitlock This is ready for another review. Here's what I changed:
I should add that all new/modified tests were run with and without the fix to ensure that they detected the bug.
The additional code looks good as well. Thanks @escattone
This PR fixes Bugzilla #1453796. The bug occurs when the include parameter is used during document requests, for example when using the page macro. The include parameter triggers a removal of all elements with class="noinclude", which is done using pyquery, after which it re-generates the filtered HTML. During the re-generation of the HTML, any empty HTML elements (e.g., <iframe ...></iframe>) are output as self-closing elements (e.g., <iframe ... />), which in the case of iframe, is illegal. When loaded, the self-closing iframe seems to be interpreted as simply an opening tag, and so swallows all subsequent HTML.

For example, https://developer.mozilla.org/en-US/docs/Web/CSS/perspective?raw=1&macros=1&section=Setting_perspective works fine, but when the include=1 is added, the failure appears (https://developer.mozilla.org/en-US/docs/Web/CSS/perspective?raw=1&macros=1&include=1&section=Setting_perspective).

The fix is to call doc.html(method='html') instead of doc.html() in the kuma.wiki.content.filter_out_noinclude function (see http://pyquery.readthedocs.io/en/latest/api.html#pyquery.pyquery.PyQuery.html).

This does not appear to be due to a pyquery package version update. I'm not sure why and when this bug appeared. Could it be that the iframe elements used to be populated, but then changed to being empty?
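
The difference between the two serializations can be reproduced outside Kuma. The sketch below is only an illustration: it uses the standard library's xml.etree.ElementTree as a stand-in for pyquery (an assumption, chosen so the example runs without extra packages), and both the sample markup and the filter_noinclude helper are hypothetical, not the Kuma implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: one "noinclude" element and one empty iframe.
SOURCE = (
    '<div>'
    '<p class="noinclude">Hidden when included</p>'
    '<p><iframe src="sample.html"></iframe></p>'
    '<p>After the frame</p>'
    '</div>'
)


def filter_noinclude(markup, method):
    """Drop elements with class="noinclude", then re-serialize the tree.

    `method` is passed to ElementTree.tostring: "xml" collapses empty
    elements into the self-closing form, "html" keeps the closing tag.
    """
    root = ET.fromstring(markup)
    # Snapshot the elements first so removals do not disturb the traversal.
    for parent in list(root.iter()):
        for child in list(parent):
            if "noinclude" in child.get("class", "").split():
                parent.remove(child)
    return ET.tostring(root, encoding="unicode", method=method)


as_xml = filter_noinclude(SOURCE, "xml")    # empty iframe becomes self-closing
as_html = filter_noinclude(SOURCE, "html")  # empty iframe keeps </iframe>
print(as_xml)
print(as_html)
```

Only the serialization method differs between the two calls, which mirrors the one-argument change made in this PR.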