Skip to content

kentfredric/HTML-FormatText-WithLinks-AndTables

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

27 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

NAME

HTML::FormatText::WithLinks::AndTables - Converts HTML to Text with tables intact

SYNOPSIS

use HTML::FormatText::WithLinks::AndTables;

my $text = HTML::FormatText::WithLinks::AndTables->convert($html);

Or optionally...

my $conf = { # same as HTML::FormatText excepting below
    cellpadding   => 2,  # defaults to 1
    no_rowspacing => 1,  # bool, suppress vertical space between table rows
};

my $text = HTML::FormatText::WithLinks::AndTables->convert($html, $conf);

DESCRIPTION

This module was inspired by HTML::FormatText::WithLinks which has proven to be a useful `lynx -dump` work-alike. However one frustration was that no other HTML converters I came across had the ability to deal affectively with HTML <TABLE>s. This module can in a rudimentary sense do so. The aim was to provide facility to take a simple HTML based email template, and to also convert it to text with the <TABLE> structure intact for inclusion as "multipart/alternative" content. Further, it will preserve both the formatting specified by the <TD> tag's "align" attribute, and will also preserve multiline text inside of a <TD> element provided it is broken using <BR/> tags.

EXPORT

None by default.

METHODS

convert

EXAMPLE

Given the HTML below ...

<HTML><BODY>
<TABLE>
    <TR>
        <TD ALIGN="right">Name:</TD>
        <TD>Mr. Foo Bar</TD>
    </TR>
    <TR>
        <TD ALIGN="right">Address:</TD>
        <TD>
            #1-276 Quux Lane,     <BR/>
            Schenectady, NY, USA, <BR/>
            12345
        </TD>
    </TR>
    <TR>
        <TD ALIGN="right">Email:</TD>
        <TD><a href="mailto:foo@bar.baz">foo@bar.baz</a></TD>
    </TR>
</TABLE>
</BODY></HTML>

... the (default) return value of convert() will be as follows.

Name:  Mr. Foo Bar

    Address:  #1-276 Quux Lane,
       Schenectady, NY, USA,
       12345

      Email:  [1]foo@bar.baz



       1. mailto:foo@bar.baz

SEE ALSO

HTML::FormatText::WithLinks
HTML::TreeBuilder

CAVEATS

* <TH> elements are treated identically to <TD> elements

* It assumes a fixed width font for display of resulting text.

* It doesn't work well on nested <TABLE>s or other nested blocks within <TABLE>s.

AUTHOR

Shaun Fryer, <pause.cpan.org at sourcery.ca> (author emeritus)

Dale Evans, <daleevans at cpan.org> (current maintainer)

BUGS

Please report any bugs or feature requests to bug-html-formattext-withlinks-andtables at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=HTML-FormatText-WithLinks-AndTables. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the perldoc command.

perldoc HTML::FormatText::WithLinks::AndTables

You can also look for information at:

ACKNOWLEDGEMENTS

Everybody. :) http://en.wikipedia.org/wiki/Standing_on_the_shoulders_of_giants

COPYRIGHT & LICENSE

Copyright 2008 Shaun Fryer, all rights reserved.

Copyright 2015 Dale Evans, all rights reserved

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

About

Perl module converts HTML to Text with tables intact.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Perl 100.0%