Retrieving modified HTML #66
Comments
Hi! Or I can implement serialization as required by the specification. Something like a function that will return
I would, of course, be very happy if you implemented a full serialization algorithm. It would save me a lot of work traversing the whole tree to rebuild every element and its attributes.
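For context, the manual traversal mentioned above could look roughly like the sketch below. It is only an illustration under stated assumptions: `node_t`, `outbuf_t`, `out_append` and `serialize_node` are hypothetical names, not part of myhtml, and the sketch skips everything a specification-compliant serializer must handle (escaping, void elements, comments, doctype, multiple attributes).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical minimal node type for illustration; NOT myhtml's node struct. */
typedef struct node {
    const char *tag;            /* element name, or NULL for a text node */
    const char *attr_name;      /* single optional attribute (NULL if none) */
    const char *attr_value;
    const char *text;           /* text content, used only when tag == NULL */
    struct node *first_child;
    struct node *next_sibling;
} node_t;

/* Small fixed-size output buffer, enough for a demonstration. */
typedef struct {
    char data[256];
    size_t len;
} outbuf_t;

/* Append a NUL-terminated string; aborts if the buffer would overflow. */
static void out_append(outbuf_t *out, const char *s) {
    size_t n = strlen(s);
    assert(out->len + n < sizeof out->data);
    memcpy(out->data + out->len, s, n + 1);  /* copies the terminator too */
    out->len += n;
}

/* Depth-first walk: emit the start tag, then all children, then the end tag. */
static void serialize_node(const node_t *n, outbuf_t *out) {
    const node_t *child;

    if (n->tag == NULL) {       /* text node: emit as-is (no escaping here) */
        out_append(out, n->text);
        return;
    }

    out_append(out, "<");
    out_append(out, n->tag);
    if (n->attr_name != NULL) {
        out_append(out, " ");
        out_append(out, n->attr_name);
        out_append(out, "=\"");
        out_append(out, n->attr_value);
        out_append(out, "\"");
    }
    out_append(out, ">");

    for (child = n->first_child; child != NULL; child = child->next_sibling) {
        serialize_node(child, out);
    }

    out_append(out, "</");
    out_append(out, n->tag);
    out_append(out, ">");
}
```

Calling `serialize_node` on the root of a hand-built tree fills `out.data` with the markup for that subtree. A serializer built into the library avoids re-implementing this walk, and its many edge cases, in every application.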
OK, I will create it.
Done. See the example. However, I have not had time to test it. Please test the serialization function.
It seems to work well when serializing HTML that is already valid. When you parse something invalid and then serialize it, the library fixes it so that the output is valid (which is good!), but not all of the elements that appear in the output can be found using myhtml_get_nodes_by_tag_id. Observe the following - obviously incorrect - HTML code:
This gets serialized to the following output:
As you can see, the invalid code tag has caused the tag to be sort of duplicated. Only the first of these tags is findable using myhtml_get_nodes_by_tag_id.
Fixed!
Perhaps I'm being very silly, but I cannot find a way to retrieve the modified HTML from the tree. If I print the tree using myhtml_tree_print_node_children, I can see all the modifications, but that output is of course not the HTML I want.
The only way I found to get HTML out again was using myhtml_tree_incoming_buffer_first, but this gets me the unmodified HTML, which is not useful to me.
What is the correct way of doing this?