<script> and <iframe> tags should be returned as-is #45

Open
Quantisan opened this issue Jul 29, 2012 · 6 comments

Comments

@Quantisan

No description provided.

@bitboxer

👍 for this one.

@mcepl

mcepl commented Apr 9, 2014

-1 from me ... html2text IMHO should be kept to the minimum. If you need anything more complicated, go and pre-/post-process its input/output.

@bitboxer

bitboxer commented Apr 9, 2014

The problem is that if you want to convert HTML from WordPress to Jekyll Markdown, you want to preserve script and iframe tags. Otherwise they are lost in the conversion. You could write a parser that replaces them with a marker string and then swaps the marker back after the conversion, but it would be much nicer, and less error-prone, if this lib had an option for it.
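For reference, the marker-string workaround described above could look roughly like the following. This is only a sketch under some assumptions: `convert_preserving_tags` and the marker format are hypothetical names invented for illustration, `convert` stands for any HTML-to-text callable, and the regex is a simplification that does not cope with nested or malformed markup.

```python
import re
import uuid

# Matches whole <script>...</script> and <iframe>...</iframe> elements.
# Assumption: a regex is good enough here; it is not a real HTML parser.
_PRESERVE_RE = re.compile(
    r"<(script|iframe)\b[^>]*>.*?</\1\s*>",
    re.IGNORECASE | re.DOTALL,
)


def convert_preserving_tags(html, convert):
    """Run convert(html) while keeping script/iframe elements verbatim.

    `convert` is any callable that takes an HTML string and returns text.
    """
    saved = {}
    # Purely alphanumeric markers, so a Markdown converter has nothing
    # to escape; the random part makes clashes with real text unlikely.
    prefix = "HTMLKEEP" + uuid.uuid4().hex

    def stash(match):
        # The trailing "E" keeps marker 1 from being a prefix of marker 10.
        marker = f"{prefix}N{len(saved)}E"
        saved[marker] = match.group(0)
        return marker

    # Pre-process: swap each preserved element for its marker.
    protected = _PRESERVE_RE.sub(stash, html)
    text = convert(protected)
    # Post-process: put the original elements back in place of the markers.
    for marker, original in saved.items():
        text = text.replace(marker, original)
    return text
```

With html2text installed, something like `convert_preserving_tags(html, html2text.html2text)` would be the intended use. Whether the markers always survive the conversion untouched (line wrapping, for example) has not been verified here, which is part of why this approach feels fragile.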

@mcepl

mcepl commented Apr 9, 2014

What in the world is the point of storing iframes in Jekyll? Anyway, some escaping of HTML elements ('<' => &lt;) should be sufficient, shouldn't it? That's what I meant by pre-/post-processing.

@bitboxer

bitboxer commented Apr 9, 2014

What is the point? Maybe I just want to preserve YouTube iframes when converting my blog 😉. Escaping the HTML elements is really bad and very error-prone. Why do all these ugly workarounds when html2text can do this easily?

@Alir3z4

Alir3z4 commented Apr 9, 2014

Currently html2text does everything in one place. I guess @mcepl is right about pre-/post-processing. We need to implement that kind of functionality so that others can control this behavior and do whatever they want without touching html2text directly and making things messy.

Of course we could accept a list of tags that should not be removed and add an option to html2text for it, but all of that would make it as ugly as possible.

So in the end, my vote is -1 for this issue.

pombredanne pushed a commit to pombredanne/html2text that referenced this issue Oct 10, 2015
…andard-input-when-running-under-python-3

Fix aaronsw#45 does not accept standard input when running under python 3
Fixes aaronsw#45

Thanks to:
* Mark Blakeney @bulletmark
* @djr7C4
* @willemw12