Spaces inside and around links are concatenated #58
Comments
I don't think this is a big bug. I'll leave this issue here for now, but it shouldn't be considered a bug. Half-spaces: in Persian-based languages such as Arabic, Farsi, Dari, etc., they are a must. Words like می باشد and میباشد are examples. |
I do not understand the issue. Is the preservation of spaces the bug in question? |
I had to work around this bug in another project, as the text version was showing two spaces where the HTML version (in an email) was showing only one space, as intended. As an additional argument, when the spaces are on one side of the link tag, html2text outputs only one space:
Just like browsers do: Note that the space in the second example is part of the link (in Chrome). I don't know how non-Latin-based languages do this formatting. @theSage21 the problem is that there are two spaces rendered where there should be (in my opinion and as browsers indicate) only one. |
A browser renders one space irrespective of how many spaces there are between two words.
shows that the spaces do not matter and only one space is rendered even though the html is evidently not the same. @aykevl My point being that html2text preserves information from the original html and translates that to text. edit |
Not a |
When there's a space before a link and before the link's content, both are preserved:
Browsers strip the text inside the link:
This is html2text installed from pip, on Debian jessie.
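For illustration, the browser behaviour described above can be approximated with the Python standard library alone. This is only a minimal sketch, not html2text's implementation: `TextCollector` and `rendered_text` are hypothetical names, and the collapsing rule (runs of ASCII whitespace become one space, even across an inline tag boundary) is a simplification of what browsers actually do.

```python
import re
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: gathers text nodes and ignores all tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def rendered_text(html):
    """Approximate the text a browser shows for inline HTML.

    Only ASCII whitespace is collapsed, so characters such as the
    zero-width non-joiner (U+200C) are left untouched.
    """
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    raw = "".join(collector.chunks)
    # A space before the link and a space at the start of the link
    # text end up adjacent in `raw`, so they collapse into one.
    return re.sub(r"[ \t\r\n\f]+", " ", raw).strip()


if __name__ == "__main__":
    print(rendered_text('foo <a href="http://example.com"> bar</a>'))
```

Because the tag boundary is ignored before collapsing, a space on either side of the opening link tag yields a single space in the result, which is the behaviour this issue asks for.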