Ignore <p> tags in table rows
gpanders committed Apr 19, 2021
1 parent 4592133 commit 855c1a0
Showing 4 changed files with 30 additions and 0 deletions.
1 change: 1 addition & 0 deletions ChangeLog.rst
@@ -6,6 +6,7 @@ UNRELEASED
* Add support for Python 3.9.
* Fix extra line breaks inside html link text (between '[' and ']')
* Fix #344: indent ``<ul>`` inside ``<ol>`` three spaces instead of two to comply with CommonMark, GFM, etc.
* Feature #198: Ignore ``<p>`` tags inside table rows

2020.1.16
=========
2 changes: 2 additions & 0 deletions html2text/__init__.py
@@ -365,6 +365,8 @@ def handle_tag(
                    self.soft_br()
            elif self.astack:
                pass
            elif self.split_next_td:
                pass
            else:
                self.p()
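
The effect of the added branch can be illustrated outside the library. The sketch below is not html2text's code: it is a toy converter built on Python's standard `html.parser`, and `TinyTableConverter`, `in_row`, `ignore_p_in_rows`, and `convert` are hypothetical names chosen for illustration. Here `in_row` loosely stands in for `split_next_td`, which is an assumption about that attribute's role based on the commit title.

```python
from html.parser import HTMLParser


class TinyTableConverter(HTMLParser):
    """Toy HTML-to-text converter; NOT html2text's actual implementation."""

    def __init__(self, ignore_p_in_rows=True):
        super().__init__()
        # Hypothetical switch so old and new behaviour can be compared.
        self.ignore_p_in_rows = ignore_p_in_rows
        self.in_row = False      # assumed stand-in for html2text's row state
        self.first_cell = True
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.in_row = True
            self.first_cell = True
        elif tag in ("td", "th"):
            # Separate cells with " | ", as in the expected Markdown table.
            if not self.first_cell:
                self.out.append(" | ")
            self.first_cell = False
        elif tag == "p":
            self.paragraph_break()

    def handle_endtag(self, tag):
        if tag == "tr":
            self.in_row = False
            self.out.append("\n")
        elif tag == "p":
            self.paragraph_break()

    def paragraph_break(self):
        # Mirrors the new `elif ...: pass` branch: inside a table row,
        # a <p> tag produces no paragraph break at all.
        if self.in_row and self.ignore_p_in_rows:
            return
        self.out.append("\n\n")

    def handle_data(self, data):
        # Whitespace-only text between tags is dropped in this toy version.
        if data.strip():
            self.out.append(data.strip())


def convert(html, ignore_p_in_rows=True):
    parser = TinyTableConverter(ignore_p_in_rows)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


row = "<table><tr><td><p>Content 1</p></td><td><p>2</p></td></tr></table>"
fixed = convert(row)                            # row stays on one line
broken = convert(row, ignore_p_in_rows=False)   # blank lines split the row
print(repr(fixed))
print(repr(broken))
```

With the flag off, each `<p>` injects a blank line and the row is split across several lines, which is the kind of output the commit is meant to avoid. With the flag on, the cells stay on a single ` | `-separated line.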

12 changes: 12 additions & 0 deletions test/no_p_in_table.html
@@ -0,0 +1,12 @@
<!DOCTYPE html> <html>
<head lang="en"> <meta charset="UTF-8"> <title></title> </head>
<body> <h1>This is a test document</h1> With some text, <code>code</code>, <b>bolds</b> and <i>italics</i>. <h2>This is second header</h2> <p style="display: none">Displaynone text</p>
<table>
<tr> <th>Header 1</th> <th>Header 2</th> <th>Header 3</th> </tr>
<tr> <td><p>Content 1</p></td> <td><p>2</p></td> <td><img src="http://lorempixel.com/200/200" alt="200"/> Image!</td> </tr>
<tr> <td><p>Content 1 longer</p></td> <td><p>Content 2</p></td> <td><p>blah</p></td> </tr>
<tr> <td><p>Content </p></td> <td><p>Content 2</p></td> <td><p>blah</p></td> </tr>
<tr> <td><p>t </p></td> <td><p>Content 2</p></td> <td><p>blah blah blah</p></td> </tr>
</table>

</body> </html>
15 changes: 15 additions & 0 deletions test/no_p_in_table.md
@@ -0,0 +1,15 @@
# This is a test document

With some text, `code`, **bolds** and _italics_.

## This is second header

Displaynone text

Header 1 | Header 2 | Header 3
---|---|---
Content 1 | 2 | ![200](http://lorempixel.com/200/200) Image!
Content 1 longer | Content 2 | blah
Content | Content 2 | blah
t | Content 2 | blah blah blah
