HTML <picture> Element not returned as image link from srcset #395

Open
contendaClara opened this issue Oct 14, 2022 · 0 comments

When an image is declared only through the srcset attribute of a <source> inside a <picture> HTML element, no image link appears in the Markdown output. I expect a link to be returned, just as it would be if the URL were in the src attribute of an <img> HTML element.

Code snippet example:

import html2text

html = """
<section>
    <h1>Poorly drawn lines comics</h1>
    <picture>
        <source
            sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px"
            srcset=" https://pbs.twimg.com/media/FbVo3fiUcAAYytB?format=jpg&name=small 640w,
                https://pbs.twimg.com/media/FbVo3fiUcAAYytB?format=jpg&name=medium 828w, 
                https://pbs.twimg.com/media/FbVo3fiUcAAYytB?format=jpg&name=large 1400w" />
        <img alt="" />
    </picture>
    <p>
        This is one of my most favorite recent comics. Comes in print too. I want it for my home.
    </p>
</section>
"""
md = html2text.html2text(html)
print(md)

Actual Output:

# Poorly drawn lines comics

This is one of my most favorite recent comics. Comes in print too. I want it
for my home.

Expected Output:

  • includes an image link (I have no preference as to which srcset candidate is used)
  • same result as if the URL were given on an <img> HTML element

# Poorly drawn lines comics

![](https://pbs.twimg.com/media/FbVo3fiUcAAYytB?format=jpg&name=small)

This is one of my most favorite recent comics. Comes in print too. I want it
for my home.

Environment:

  • html2text version (html2text --version): 2020.1.16
  • Python version (python --version): 3.9.13
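
To illustrate the behavior I have in mind, here is a minimal sketch, using only the standard library, of how the first srcset candidate of each <picture> could be picked out. The names first_srcset_url and PictureSrcCollector are hypothetical and not part of html2text, and the comma split is a simplification that assumes the URLs themselves contain no commas.

```python
from html.parser import HTMLParser


def first_srcset_url(srcset):
    """Return the URL of the first candidate in a srcset value, or None."""
    # Naive split: assumes the URLs themselves contain no commas.
    for candidate in srcset.split(","):
        parts = candidate.split()
        if parts:
            # parts[1:] would hold the width or density descriptor, e.g. "640w".
            return parts[0]
    return None


class PictureSrcCollector(HTMLParser):
    """Collect one URL per <picture>, taken from its first <source srcset>."""

    def __init__(self):
        super().__init__()
        self.urls = []
        self._in_picture = False
        self._found = False

    def handle_starttag(self, tag, attrs):
        if tag == "picture":
            self._in_picture, self._found = True, False
        elif tag == "source" and self._in_picture and not self._found:
            url = first_srcset_url(dict(attrs).get("srcset") or "")
            if url:
                self.urls.append(url)
                self._found = True

    def handle_endtag(self, tag):
        if tag == "picture":
            self._in_picture = False


collector = PictureSrcCollector()
collector.feed('<picture><source srcset="a.jpg 640w, b.jpg 828w" /><img alt="" /></picture>')
# Render each collected URL the way an <img src> would appear in Markdown.
print(["![]({})".format(url) for url in collector.urls])
```

The collected URLs could then be written into the src attribute of the corresponding <img> before the HTML is passed to html2text.html2text, or the same first-candidate rule could be applied inside the library when an <img> within a <picture> has no src.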