Skip to content

soundasleep/html2text

master
Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?
Code

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
src
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

example workflow Total Downloads

html2text is a very simple script that uses DOM methods to convert HTML into a format similar to what would be rendered by a browser - perfect for places where you need a quick text representation. For example:

<html>
<title>Ignored Title</title>
<body>
  <h1>Hello, World!</h1>

  <p>This is some e-mail content.
  Even though it has whitespace and newlines, the e-mail converter
  will handle it correctly.

  <p>Even mismatched tags.</p>

  <div>A div</div>
  <div>Another div</div>
  <div>A div<div>within a div</div></div>

  <a href="http://foo.com">A link</a>

</body>
</html>

Will be converted into:

Hello, World!

This is some e-mail content. Even though it has whitespace and newlines, the e-mail converter will handle it correctly.

Even mismatched tags.

A div
Another div
A div
within a div

[A link](http://foo.com)

See the original blog post or the related StackOverflow answer.

Installing

You can use Composer to add the package to your project:

{
  "require": {
    "soundasleep/html2text": "~1.1"
  }
}

And then use it quite simply:

$text = \Soundasleep\Html2Text::convert($html);

You can also include the supplied html2text.php and use $text = convert_html_to_text($html); instead.

Options

Option Default Description
ignore_errors false Set to true to ignore any XML parsing errors.
drop_links false Set to true to not render links as [http://foo.com](My Link), but rather just My Link.
char_set 'auto' Specify a specific character set. Pass multiple character sets (comma separated) to detect encoding, default is ASCII,UTF-8

Pass along options as a second argument to convert, for example:

$options = array(
  'ignore_errors' => true,
  // other options go here
);
$text = \Soundasleep\Html2Text::convert($html, $options);

Tests

Some very basic tests are provided in the tests/ directory. Run them with composer install && vendor/bin/phpunit.

Troubleshooting

Class 'DOMDocument' not found

You need to install the PHP XML extension for your PHP version. e.g. apt-get install php7.4-xml

License

html2text is licensed under MIT, making it suitable for both Eclipse and GPL projects.

Other versions

Also see html2text_ruby, a Ruby implementation.