HTML entities get out of anchor text #31
Comments
szepeviktor: I expected |
szepeviktor it's because Well, this needs to be fixed: text should be UTF-8 and encoded too.
or just simply:
I guess we need to encode the text before feeding it to |
szepeviktor: Yes. Only a one-byte HTML entity will get outside the anchor, not a UTF-8 character.
I guess it should be handled by html2text, I mean encoding the input to Feel free to patch and make it the default.
Changed the call order of handle_charref and handle_entityref. The error occurred because the first element of the link title was a charref, so handle_charref was called instead of handle_data, which is where the '[' is inserted.
This only happens when the link text begins with a char reference. This is because the first call is to Fixed in #77
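For anyone following along, here is a minimal sketch of the ordering problem described above. It is not html2text's actual code: the `LinkSketch` class, its `fixed` flag, and the `render` helper are hypothetical names made up for illustration, built on the standard library's `html.parser.HTMLParser`. The idea is only that the opening `[` is emitted lazily from `handle_data`, so a char reference at the start of the link text slips out before the bracket unless it is routed through the same path.

```python
from html import unescape
from html.parser import HTMLParser


class LinkSketch(HTMLParser):
    """Toy HTML-to-Markdown link converter (hypothetical, not html2text)."""

    def __init__(self, fixed):
        # convert_charrefs=False so that handle_charref/handle_entityref fire.
        super().__init__(convert_charrefs=False)
        self.fixed = fixed
        self.out = []
        self.href = ""
        self.pending_bracket = False  # True until the '[' has been written

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.href = dict(attrs).get("href") or ""
            self.pending_bracket = True

    def handle_endtag(self, tag):
        if tag == "a":
            if self.pending_bracket:  # empty link text: still open the bracket
                self.out.append("[")
                self.pending_bracket = False
            self.out.append("](%s)" % self.href)

    def handle_data(self, data):
        # The '[' is only inserted here, on the first piece of link text.
        if self.pending_bracket:
            self.out.append("[")
            self.pending_bracket = False
        self.out.append(data)

    def handle_charref(self, name):
        self._emit_reference(unescape("&#%s;" % name))

    def handle_entityref(self, name):
        self._emit_reference(unescape("&%s;" % name))

    def _emit_reference(self, text):
        if self.fixed:
            self.handle_data(text)  # goes through the '[' insertion
        else:
            self.out.append(text)   # bypasses handle_data: lands before '['


def render(html, fixed):
    parser = LinkSketch(fixed)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


sample = '<a href="http://example.com">&#187; next</a>'
print(render(sample, fixed=False))
print(render(sample, fixed=True))
```

With `fixed=False` the reference is written before the bracket, and with `fixed=True` it ends up inside the anchor text. The actual change in #77 may be structured differently; this only illustrates why the first handler called matters.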
The issue is solved in #77. Can we close this? |
Please wait till I get home, and confirm. |
@szepeviktor sure sure. it is morning again and I have to sleep. See you on the other side of the sun. 😄 |
There is a problem:
Shouldn't |
@szepeviktor The command line works with ASCII by default, so they are being converted to ASCII equivalents. As of now there is no command-line option for
Thank you. |
@szepeviktor:
another strange behaviour