Add ignore blocks for text generation #244

Merged
merged 1 commit into from May 17, 2015
5 changes: 5 additions & 0 deletions lib/premailer/html_to_plain_text.rb
@@ -10,6 +10,11 @@ module HtmlToPlainText
def convert_to_text(html, line_length = 65, from_charset = 'UTF-8')
txt = html

# strip blocks marked as ignored for the text version.
# Useful for removing headers and footers that aren't
# needed in the text version
txt.gsub!(/<!-- start text\/html -->.*?<!-- end text\/html -->/m, '')

# replace images with their alt attributes
# for img tags with "" for attribute quotes
# with or without closing tag
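The substitution added above can be tried on its own, outside Premailer. The following is a minimal sketch, not the library's API: `IGNORE_BLOCK` and `strip_ignored_blocks` are hypothetical names introduced here for illustration, and only the regular expression is taken from the diff.

```ruby
# Hypothetical constant and helper, introduced for illustration only.
# The pattern itself is copied from the line added in this pull request.
IGNORE_BLOCK = /<!-- start text\/html -->.*?<!-- end text\/html -->/m

# Returns a copy of +html+ with every ignored block removed.
# The /m flag lets "." match newlines, so a block may span several lines;
# the lazy ".*?" stops at the first end marker, so separate blocks are
# removed one by one and the content between them is kept.
def strip_ignored_blocks(html)
  html.gsub(IGNORE_BLOCK, '')
end

html = <<~HTML
  <p>test</p>
  <!-- start text/html -->
  <img src="logo.png" alt="logo">
  <!-- end text/html -->
  <p>text</p>
HTML

stripped = strip_ignored_blocks(html)
puts stripped
```

The sketch uses the non-mutating `gsub`, so the caller's string is left untouched. In the diff, `txt = html` is followed by `txt.gsub!`, which appears to modify the string passed in by the caller, since both names refer to the same object.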
36 changes: 24 additions & 12 deletions test/test_html_to_plain_text.rb
@@ -69,19 +69,31 @@ def test_lists
assert_plaintext "* item 1\n* item 2", "<li class='123'>item 1</li> <li>item 2</li>\n"
assert_plaintext "* item 1\n* item 2\n* item 3", "<li>item 1</li> \t\n <li>item 2</li> <li> item 3</li>\n"
end

def test_stripping_html
assert_plaintext 'test text', "<p class=\"123'45 , att\" att=tester>test <span class='te\"st'>text</span>\n"
end

def test_stripping_ignored_blocks
html = <<END_HTML
<p>test</p>
<!-- start text/html -->
<img src="logo.png" alt="logo">
<!-- end text/html -->
<p>text</p>
END_HTML
premailer = Premailer.new(html, :with_html_string => true)
assert_match(/test\n\ntext/, premailer.to_plain_text)
end

def test_paragraphs_and_breaks
assert_plaintext "Test text\n\nTest text", "<p>Test text</p><p>Test text</p>"
assert_plaintext "Test text\n\nTest text", "\n<p>Test text</p>\n\n\n\t<p>Test text</p>\n"
assert_plaintext "Test text\nTest text", "\n<p>Test text<br/>Test text</p>\n"
assert_plaintext "Test text\nTest text", "\n<p>Test text<br> \tTest text<br></p>\n"
assert_plaintext "Test text\n\nTest text", "Test text<br><BR />Test text"
end

def test_headings
assert_plaintext "****\nTest\n****", "<h1>Test</h1>"
assert_plaintext "****\nTest\n****", "\t<h1>\nTest</h1> "
@@ -90,7 +102,7 @@ def test_headings
assert_plaintext "----\nTest\n----", "<h2>Test</h2>"
assert_plaintext "Test\n----", "<h3> <span class='a'>Test </span></h3>"
end

def test_wrapping_lines
raw = ''
100.times { raw += 'test ' }
@@ -103,7 +115,7 @@ def test_wrapping_lines
end

def test_img_alt_tags
# ensure html img tags that aren't self-closed are parsed,
# along with accepting both '' and "" as attribute quotes

# <img alt="" />
@@ -119,7 +131,7 @@ def test_img_alt_tags
def test_links
# basic
assert_plaintext 'Link ( http://example.com/ )', '<a href="http://example.com/">Link</a>'

# nested html
assert_plaintext 'Link ( http://example.com/ )', '<a href="http://example.com/"><span class="a">Link</span></a>'

@@ -131,38 +143,38 @@ def test_links

# complex link
assert_plaintext 'Link ( http://example.com:80/~user?aaa=bb&c=d,e,f#foo )', '<a href="http://example.com:80/~user?aaa=bb&amp;c=d,e,f#foo">Link</a>'

# attributes
assert_plaintext 'Link ( http://example.com/ )', '<a title=\'title\' href="http://example.com/">Link</a>'

# spacing
assert_plaintext 'Link ( http://example.com/ )', '<a href=" http://example.com/ "> Link </a>'

# multiple
assert_plaintext 'Link A ( http://example.com/a/ ) Link B ( http://example.com/b/ )', '<a href="http://example.com/a/">Link A</a> <a href="http://example.com/b/">Link B</a>'

# merge links
assert_plaintext 'Link ( %%LINK%% )', '<a href="%%LINK%%">Link</a>'
assert_plaintext 'Link ( [LINK] )', '<a href="[LINK]">Link</a>'
assert_plaintext 'Link ( {LINK} )', '<a href="{LINK}">Link</a>'

# unsubscribe
assert_plaintext 'Link ( [[!unsubscribe]] )', '<a href="[[!unsubscribe]]">Link</a>'

# empty link gets dropped, and shouldn't run forever
assert_plaintext(("This is some more text\n\n" * 14 + "This is some more text"), "<a href=\"test\"></a>#{"\n<p>This is some more text</p>" * 15}")
end

# see https://github.com/alexdunae/premailer/issues/72
def test_multiple_links_per_line
assert_plaintext 'This is link1 ( http://www.google.com ) and link2 ( http://www.google.com ) is next.',
'<p>This is <a href="http://www.google.com" >link1</a> and <a href="http://www.google.com" >link2 </a> is next.</p>',
nil, 10000
end

# see https://github.com/alexdunae/premailer/issues/72
def test_links_within_headings
assert_plaintext "****************************\nTest ( http://example.com/ )\n****************************",
"<h1><a href='http://example.com/'>Test</a></h1>"
end
