XHTMLish output #56

Closed
thorn0 opened this issue Feb 19, 2015 · 9 comments

Comments

@thorn0
Contributor

thorn0 commented Feb 19, 2015

I need to convert a document to a format that is both HTML and XML at the same time. That is, markup like <select/> is not allowed and should be <select></select> instead, but markup like <img src="..." /> is required. It would be nice if AngleSharp could provide some control over the process of DOM serialization.
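To make the requested rule concrete, here is a minimal Python sketch of such a serialization. It is not AngleSharp code: `Element`, `to_xhtmlish` and `VOID_ELEMENTS` are hypothetical names invented for illustration, and the sketch assumes that only void elements get the `<name ... />` form while every other element gets an explicit end tag.

```python
from dataclasses import dataclass, field
from html import escape

# Void elements as listed in the HTML standard: they never have content
# or an end tag, so they are the only ones written as <name ... />.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
                 "link", "meta", "source", "track", "wbr"}


@dataclass
class Element:
    """Hypothetical minimal DOM node (not an AngleSharp type)."""
    name: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)  # Element or str (text)


def to_xhtmlish(node) -> str:
    """Serialize a node so the result reads as both HTML and XML."""
    if isinstance(node, str):
        # Text node: escape &, < and > only.
        return escape(node, quote=False)
    attrs = "".join(f' {k}="{escape(v)}"' for k, v in node.attrs.items())
    if node.name in VOID_ELEMENTS:
        # Void element: self-closing form, children are ignored.
        return f"<{node.name}{attrs} />"
    inner = "".join(to_xhtmlish(child) for child in node.children)
    # Non-void element: always an explicit end tag, even when empty.
    return f"<{node.name}{attrs}>{inner}</{node.name}>"
```

With this sketch an empty `select` becomes `<select></select>` and an `img` keeps the `<img src="..." />` form.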

Html Agility Pack has something like this. At least, it can output XHTML.

@FlorianRappl
Contributor

Well, the DOM serialization is specified (hence fixed). However, we could do one of two things. Either you call a method that takes options and performs the serialization (this is somewhat flexible, but not as much as I would like), or there will be a visitor-like pattern, where the user chooses which serialization to use and how to deal with it.

@FlorianRappl
Contributor

I now provide another ToHtml method that takes any instance implementing IMarkupFormatter. Right now there is just a single class implementing this interface, HtmlMarkupFormatter. I will probably also provide an XHtmlMarkupFormatter and/or an XmlMarkupFormatter, but you can also use your own formatter if the existing ones don't fit your needs.

My guess is that this approach is flexible enough to fit all needs associated with node serialization. Let me know if you think something is still missing.
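The idea of a serializer that takes a pluggable formatter can be sketched in Python as follows. This is an illustration of the pattern only: the method names `open_tag`, `close_tag` and `text` are assumptions, since the members of IMarkupFormatter are not listed in this thread, and `Element` is a hypothetical stand-in for a DOM node.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class Element:
    """Hypothetical minimal DOM node (not an AngleSharp type)."""
    name: str
    children: list = field(default_factory=list)  # Element or str (text)


class MarkupFormatter(Protocol):
    """Assumed shape of a formatter; the real interface may differ."""
    def open_tag(self, element: Element) -> str: ...
    def close_tag(self, element: Element) -> str: ...
    def text(self, value: str) -> str: ...


class HtmlFormatter:
    """Plain formatter: writes tags and text as they are."""
    def open_tag(self, element): return f"<{element.name}>"
    def close_tag(self, element): return f"</{element.name}>"
    def text(self, value): return value


class TextOnlyFormatter:
    """Custom formatter: drops all tags and keeps only the text."""
    def open_tag(self, element): return ""
    def close_tag(self, element): return ""
    def text(self, value): return value


def to_html(node, formatter: MarkupFormatter) -> str:
    """Walk the tree and let the formatter decide what each part looks like."""
    if isinstance(node, str):
        return formatter.text(node)
    inner = "".join(to_html(child, formatter) for child in node.children)
    return formatter.open_tag(node) + inner + formatter.close_tag(node)
```

The tree walk stays fixed while the output is decided entirely by the object passed in, which is what makes user-supplied formatters possible.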

@thorn0
Contributor Author

thorn0 commented Feb 20, 2015

Thank you, looks great. There are cases where the InnerHtml of an element needs to be formatted. Would you consider adding ToHtml to INodeList to make it possible to write: element.ChildNodes.ToHtml(formatter)?

@FlorianRappl
Contributor

Yes!

FlorianRappl added a commit that referenced this issue Feb 20, 2015
@thorn0
Contributor Author

thorn0 commented Feb 20, 2015

It'd be easier to inherit from AngleSharp.Html.HtmlMarkupFormatter if its methods were virtual and weren't declared as an explicit interface implementation. I submitted #57.

@FlorianRappl
Contributor

That is all on purpose. Please FCoI (favor composition over inheritance). The class itself has no data members and is only a functional construct. It should be treated as such. The class is also only visible so that you can actually use its methods.
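Reading FCoI as "favor composition over inheritance", the suggestion would be to wrap the existing formatter rather than derive from it. A minimal Python sketch of that idea follows; the class and method names are hypothetical and do not mirror AngleSharp's actual members.

```python
class HtmlFormatter:
    """Stand-in for a stateless, built-in formatter (hypothetical)."""
    def open_tag(self, name: str) -> str:
        return f"<{name}>"

    def close_tag(self, name: str) -> str:
        return f"</{name}>"

    def text(self, value: str) -> str:
        return value


class UppercaseTextFormatter:
    """Holds another formatter and delegates to it instead of subclassing.

    Only text() changes behaviour; everything else is forwarded as is.
    """
    def __init__(self, inner):
        self._inner = inner

    def open_tag(self, name: str) -> str:
        return self._inner.open_tag(name)

    def close_tag(self, name: str) -> str:
        return self._inner.close_tag(name)

    def text(self, value: str) -> str:
        return self._inner.text(value).upper()


formatter = UppercaseTextFormatter(HtmlFormatter())
result = formatter.open_tag("p") + formatter.text("hi") + formatter.close_tag("p")
```

Because the wrapper only relies on the public methods of the inner formatter, those methods do not need to be virtual.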

@thorn0
Contributor Author

thorn0 commented Feb 20, 2015

Got it. However, I have trouble writing a formatter implementation for the following requirements. Line breaks should be inserted before and after each block element (looking just at its name: p, div, etc.), but two consecutive line breaks must never be inserted.

For example:

<div>123</div>456<div>789</div><div>abc</div><ul><li>aa</li><li>bb</li></ul>

should be converted to

<div>123</div>
456
<div>789</div>
<div>abc</div>
<ul>
<li>aa</li>
<li>bb</li>
</ul>

TinyMCE formats HTML this way. The demo on the main page shows formatted HTML if you click _Tools -> Source code_. Just in case, here is their implementation.
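One way to meet these requirements is to track whether the output already ends with a line break and to emit a break only when it does not. The Python sketch below illustrates that approach under stated assumptions: it is neither TinyMCE's nor AngleSharp's implementation, `Element` and `format_blocks` are hypothetical names, the block element set is an illustrative subset, and text is written without escaping.

```python
from dataclasses import dataclass, field

# Illustrative subset; a real list of block elements would be longer.
BLOCK_ELEMENTS = {"div", "p", "ul", "ol", "li"}


@dataclass
class Element:
    """Hypothetical minimal DOM node (not an AngleSharp type)."""
    name: str
    children: list = field(default_factory=list)  # Element or str (text)


def format_blocks(nodes) -> str:
    out = []

    def line_break():
        # Break only if something was written and it does not already
        # end with a break, so two consecutive breaks never occur.
        if out and not out[-1].endswith("\n"):
            out.append("\n")

    def visit(node):
        if isinstance(node, str):
            if node:
                out.append(node)  # text is not escaped in this sketch
            return
        is_block = node.name in BLOCK_ELEMENTS
        if is_block:
            line_break()  # break before the block's start tag
        out.append(f"<{node.name}>")
        for child in node.children:
            visit(child)
        out.append(f"</{node.name}>")
        if is_block:
            line_break()  # break after the block's end tag

    for node in nodes:
        visit(node)
    # Drop the break left after the last block element.
    return "".join(out).rstrip("\n")


# The example from this comment, built by hand as a node list.
doc = [
    Element("div", ["123"]),
    "456",
    Element("div", ["789"]),
    Element("div", ["abc"]),
    Element("ul", [Element("li", ["aa"]), Element("li", ["bb"])]),
]
formatted = format_blocks(doc)
```

For `doc`, `formatted` matches the eight-line result shown above. Note that the sketch writes the end tag after visiting the children, which the break logic depends on.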

@thorn0
Contributor Author

thorn0 commented Feb 20, 2015

I thought I had implemented it, but then it turned out that CloseTag for an element is called before the serialization of the children. :)

@FlorianRappl
Contributor

Yep, I call CloseTag before serializing the children. For me that was alright, but to allow the scenario you are describing, it should be called in logical order. I will fix this.
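The difference between the two call orders can be illustrated with a small Python sketch. It is not AngleSharp's serializer: the two functions and the recording formatter are hypothetical, and they only show why a formatter that keeps state (such as "did the output just end with a line break?") needs the close call to come after the children.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """Hypothetical minimal DOM node (not an AngleSharp type)."""
    name: str
    children: list = field(default_factory=list)  # Element or str (text)


class RecordingFormatter:
    """Formatter that logs the order in which its methods are called."""
    def __init__(self):
        self.calls = []

    def open_tag(self, el):
        self.calls.append(f"open {el.name}")
        return f"<{el.name}>"

    def close_tag(self, el):
        self.calls.append(f"close {el.name}")
        return f"</{el.name}>"

    def text(self, value):
        self.calls.append(f"text {value}")
        return value


def serialize_close_first(node, fmt):
    """Asks for the end tag before visiting the children."""
    if isinstance(node, str):
        return fmt.text(node)
    opening = fmt.open_tag(node)
    closing = fmt.close_tag(node)  # called too early for a stateful formatter
    inner = "".join(serialize_close_first(c, fmt) for c in node.children)
    return opening + inner + closing


def serialize_logical(node, fmt):
    """Calls the formatter in document order: open, children, close."""
    if isinstance(node, str):
        return fmt.text(node)
    opening = fmt.open_tag(node)
    inner = "".join(serialize_logical(c, fmt) for c in node.children)
    closing = fmt.close_tag(node)
    return opening + inner + closing
```

Both functions return the same string for a stateless formatter; only the recorded call order differs, which is why the issue stays hidden until a formatter depends on what was written before.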

FlorianRappl added a commit that referenced this issue Feb 20, 2015
lahma pushed a commit to lahma/AngleSharp that referenced this issue Apr 22, 2021