Hi @leighhalliday, great to know that you are using Floki! :)
This is an important issue, but it's not easy to solve. Floki uses the mochiweb library to parse HTML, so the problem is in that library. It's a great parser, but it has some issues like this one. Unfortunately, I don't have much control over the parser.
I'm planning to write and release a completely new HTML parser in Elixir at some point this year. But for now I can't help you :/.
I'm closing the issue for now, but I will revisit it in the future when I have that parser.
Thanks!
Example html:
<p>I want <a href="http://www.google.com">link to</a> <strong>article</strong></p>
When run through floki:
[{"p", [],
["I want ", {"a", [{"href", "http://www.google.com"}], ["link to"]},
{"strong", [], ["article"]}]}]
Maybe this is the intended result... which is perfectly fine. I was hoping to use floki for a weird use-case... basically run HTML through floki, walk the floki response, remove tags/attributes that aren't allowed, and then re-create the HTML. Essentially, I'd be using floki to help me cleanse user-submitted HTML by removing tags that aren't allowed and/or malicious JavaScript attributes (XSS attacks).
Because of how I wanted to use it, the space between </a> and <strong> is actually important, otherwise "link to" and "article" will be touching each other.
Feel free to close the issue if it was intended.
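To make the use-case concrete, here is a rough sketch of the kind of cleanup pass I have in mind. It works on a hand-written tree in the same `{tag, attrs, children}` shape as the output above and makes no Floki calls. The module name `SanitizeSketch`, the whitelists, and the `http(s)`-only rule for `href` are all my own assumptions for illustration, and this is not a complete XSS defense. Note the `" "` text node between the link and `<strong>`: I added it by hand, and it is exactly the node that goes missing in the parsed result.

```elixir
defmodule SanitizeSketch do
  # Hypothetical whitelists; adjust to your own policy.
  @allowed_tags ["p", "a", "strong", "em"]
  @allowed_attrs %{"a" => ["href"]}

  # Walk a list of nodes and keep only whitelisted elements and attributes.
  def scrub(nodes) when is_list(nodes), do: Enum.flat_map(nodes, &scrub_node/1)

  # Text nodes pass through unchanged (they are escaped on output).
  defp scrub_node(text) when is_binary(text), do: [text]

  defp scrub_node({tag, attrs, children}) do
    if tag in @allowed_tags do
      keep = Map.get(@allowed_attrs, tag, [])

      safe_attrs =
        Enum.filter(attrs, fn {name, value} ->
          name in keep and allowed_value?(name, value)
        end)

      [{tag, safe_attrs, scrub(children)}]
    else
      # Disallowed elements are dropped together with their children.
      []
    end
  end

  # Anything else (comments, doctype, ...) is dropped.
  defp scrub_node(_other), do: []

  # Assumption: only absolute http(s) links are acceptable in href.
  defp allowed_value?("href", value),
    do: String.starts_with?(value, ["http://", "https://"])

  defp allowed_value?(_name, _value), do: true

  # Turn the scrubbed tree back into an HTML string.
  def to_html(nodes) when is_list(nodes), do: Enum.map_join(nodes, &node_to_html/1)

  defp node_to_html(text) when is_binary(text), do: escape(text)

  defp node_to_html({tag, attrs, children}) do
    attr_str = Enum.map_join(attrs, fn {k, v} -> " #{k}=\"#{escape(v)}\"" end)
    "<#{tag}#{attr_str}>#{to_html(children)}</#{tag}>"
  end

  defp escape(text) do
    text
    |> String.replace("&", "&amp;")
    |> String.replace("<", "&lt;")
    |> String.replace(">", "&gt;")
    |> String.replace("\"", "&quot;")
  end
end

tree = [
  {"p", [],
   [
     "I want ",
     {"a", [{"href", "http://www.google.com"}, {"onclick", "steal()"}], ["link to"]},
     " ",
     {"strong", [], ["article"]},
     {"script", [], ["alert(1)"]}
   ]}
]

tree |> SanitizeSketch.scrub() |> SanitizeSketch.to_html() |> IO.puts()
# prints: <p>I want <a href="http://www.google.com">link to</a> <strong>article</strong></p>
```

With the whitespace node present, the round trip gives back the original markup minus the `onclick` attribute and the `<script>` element. Without it, the output would read "link toarticle", which is why the dropped space matters for this use-case.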