Figure out what to do with HTML #3

Closed
swsnr opened this issue Jan 7, 2018 · 12 comments
Labels: enhancement, help wanted

Comments

@swsnr
Owner

swsnr commented Jan 7, 2018

No description provided.

@swsnr added the help wanted label Jan 16, 2018
@PerBothner

As mentioned in my comment on issue #12, DomTerm has an escape sequence for handling inline HTML, which makes this issue trivial.

@swsnr
Owner Author

swsnr commented Apr 24, 2018

@PerBothner Thanks, that's very cool, but we'll still need to figure out how to support other terminal emulators. At least, we must limit the special DomTerm escape codes to DomTerm and not use them on other emulators.

Is there a way to detect whether a program runs in DomTerm? Some environment variable perhaps?

@PerBothner

As I mentioned in issue #12, the quick-and-dirty (and probably good enough) way is to check for a DOMTERM environment variable.

For a more robust test, see this discussion.
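The quick-and-dirty check could be sketched in Rust roughly as follows. This is only an illustration: the function names are hypothetical, and treating any non-empty `DOMTERM` value as a match is an assumption, since the thread does not say what the variable contains.

```rust
use std::ffi::OsStr;

/// Decides whether a `DOMTERM` value suggests we are running in DomTerm.
/// Assumption: any non-empty value counts; the real format is not
/// described in this thread.
fn looks_like_domterm(domterm: Option<&OsStr>) -> bool {
    domterm.map_or(false, |value| !value.is_empty())
}

/// Reads `DOMTERM` from the process environment and applies the check.
fn running_in_domterm() -> bool {
    looks_like_domterm(std::env::var_os("DOMTERM").as_deref())
}
```

A renderer would call `running_in_domterm()` once at startup and emit the DomTerm-specific escape codes only when it returns `true`. Keeping the decision in a function that takes the value as an argument makes it testable without touching the real environment.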

@swsnr
Owner Author

swsnr commented Apr 24, 2018

@PerBothner Thanks, I read your comment there right after I wrote the answer 🙈

@swsnr added enhancement and removed enhancement labels Jan 6, 2019
@rien333

rien333 commented Apr 14, 2019

This feature would be handy for parsing out comments, as most Markdown viewers/converters do (e.g. pandoc). If you're interested in filtering out HTML comments specifically, I could open a separate issue though.

@geoff-nixon

Render it with w3m?

@swsnr
Owner Author

swsnr commented Jul 29, 2019

No.

@brandonkal

brandonkal commented Mar 20, 2020

I'm not sure what the aversion is to using w3m -dump. The w3m browser isn't even 2 MB and is available by default on many Linux distros. The Homebrew w3m doesn't have image support as far as I can tell though...

But if you don't want to render any html, it would be nice to at least handle the block below:

<p align="center">
  <img height="180" width="210" src="https://user-images.githubusercontent.com/1631044/61989571-aae27580-afff-11e9-8f8a-c9768ed7a6b8.png">
</p>

Centered images are often the only HTML in markdown documents such as the README.md of a project. Blocks like the above are quite common in GitHub markdown files even though the align attribute has been deprecated since 1997.

Then just print any other HTML as normal.
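A minimal sketch of that special case in Rust, assuming plain string matching is good enough for a first attempt. The helper names `attribute` and `centered_image_src` are made up for illustration and are not part of any existing code; a real implementation would want a proper HTML parser.

```rust
/// Extracts the value of `name="..."` from the text of a single tag.
/// Limitation: this is naive substring matching, so `src` would also
/// match the tail of an attribute such as `data-src`.
fn attribute<'a>(tag: &'a str, name: &str) -> Option<&'a str> {
    let needle = format!("{name}=\"");
    let start = tag.find(&needle)? + needle.len();
    let end = tag[start..].find('"')? + start;
    Some(&tag[start..end])
}

/// Returns the image URL if `html` is a `<p align="center">` block that
/// contains an `<img>` tag. It does not check that the image is the
/// only content of the paragraph.
fn centered_image_src(html: &str) -> Option<&str> {
    let block = html.trim();
    if !(block.starts_with("<p") && block.ends_with("</p>")) {
        return None;
    }
    // Inspect only the opening `<p ...` tag for the align attribute.
    let p_end = block.find('>')?;
    if attribute(&block[..p_end], "align") != Some("center") {
        return None;
    }
    let img_start = block.find("<img")?;
    let img_end = block[img_start..].find('>')? + img_start;
    attribute(&block[img_start..img_end], "src")
}
```

When `centered_image_src` returns `Some(url)`, the block could be rendered like a regular Markdown image; when it returns `None`, the HTML would be printed unchanged, as suggested above.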

@swsnr
Owner Author

swsnr commented Mar 20, 2020

I don't think w3m would work well, so I won't do it myself. You say it doesn't have image support, so it wouldn't even fix the one case you're interested in, would it?

But go ahead and make a pull request with a prototype implementation and we can try it and see whether it's a viable solution.

In any case it won't happen if you don't do it; that's why there's a "help wanted" label 🤷‍♀️

@thecaralice

> I'm not sure what the aversion is to using w3m -dump. The w3m browser isn't even 2 MB and is available by default on many Linux distros. The Homebrew w3m doesn't have image support as far as I can tell though...
>
> But if you don't want to render any html, it would be nice to at least handle the block below:
>
> <p align="center">
>   <img height="180" width="210" src="https://user-images.githubusercontent.com/1631044/61989571-aae27580-afff-11e9-8f8a-c9768ed7a6b8.png">
> </p>
>
> Centered images are often the only HTML in markdown documents such as the README.md of a project. Blocks like the above are quite common in GitHub markdown files even though the align attribute has been deprecated since 1997.
>
> Then just print any other HTML as normal.

w3m's image display works only under Xorg, and not with every terminal emulator.

@norman-abramovitz
Contributor

I am wondering if conceptually using something like html2text or html2markdown would make some sense?

Yes, we would need to rewrite the code in Rust, but would the concept work: identify the HTML block, feed it to the converter, and then feed the output back into your Markdown formatter?

We can start small by processing some simple constructs at first.
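The concept could be sketched like this in Rust. Everything here is hypothetical: `Block`, `strip_tags`, and `render` are illustrative names, and `strip_tags` is only a toy stand-in for a real html2text-style converter (it drops tags and collapses whitespace, nothing more).

```rust
/// A document split into the two kinds of blocks the idea distinguishes.
enum Block {
    Markdown(String),
    Html(String),
}

/// Toy converter: removes everything between `<` and `>` and collapses
/// runs of whitespace. It does not decode entities or handle comments.
fn strip_tags(html: &str) -> String {
    let mut text = String::new();
    let mut in_tag = false;
    for ch in html.chars() {
        match ch {
            '<' => in_tag = true,
            '>' => in_tag = false,
            _ if !in_tag => text.push(ch),
            _ => {}
        }
    }
    text.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// Converts each HTML block first, then hands every block to the
/// caller-supplied Markdown formatter and joins the results.
fn render(blocks: &[Block], render_markdown: impl Fn(&str) -> String) -> String {
    blocks
        .iter()
        .map(|block| match block {
            Block::Markdown(source) => render_markdown(source),
            Block::Html(source) => render_markdown(&strip_tags(source)),
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```

Passing the formatter in as a closure keeps the converter step separate from the existing rendering code, so simple constructs can be handled first and the converter swapped out later.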

@swsnr
Owner Author

swsnr commented Oct 18, 2023

Closing this, because I don't intend to do anything about HTML any time soon.

Pull requests are welcome.

@swsnr closed this as not planned Oct 18, 2023