Given that granary doesn't seem to need to parse the HTML anywhere, I totally get it if this is WON'T FIX. And since the microformats parser returns the newlines as they are, the parser also seems to be the wrong place to handle this(?)
After further reading, it seems that if the feed generation at the end of the process could know whether the source text is HTML or plain text, this could be solved by keeping HTML unmodified. But the ActivityStreams 1 format does not support keeping that distinction? Changing this seems more realistic, but still quite a bit of effort.
Just want to point out that I may have some problems with my own HTML/newline handling. Right now my HTML has newlines but no <br> tags, and I use CSS to get the whitespace to show up right. That means any consumers treating it as HTML will not see the newlines, since literal newlines in HTML are not significant. I think I'm going to need to update how my site handles newlines in general.
for my own notes: @aaronpk may be right above about his HTML in general, but for this specific case, the offending content is indeed inside <pre>s, which granary could still theoretically detect and preserve.
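A minimal sketch of what "detect and preserve" could look like, assuming a regex is good enough to find <pre> elements. The helper name `collapse_newlines_outside_pre` is hypothetical and not part of granary, and a real implementation would probably want an actual HTML parser instead:

```python
import re

# Matches a whole <pre>...</pre> element, case-insensitively, across lines.
# Assumption: <pre> elements are not nested and are well-formed.
PRE_BLOCK = re.compile(r'(<pre\b[^>]*>.*?</pre>)', re.IGNORECASE | re.DOTALL)


def collapse_newlines_outside_pre(html):
    """Replace newlines with spaces everywhere except inside <pre> elements."""
    # Because the pattern has one capturing group, split() returns the
    # <pre> blocks at the odd indices and the surrounding text at the even ones.
    parts = PRE_BLOCK.split(html)
    out = []
    for i, part in enumerate(parts):
        if i % 2 == 1:
            out.append(part)  # <pre> block: keep byte for byte
        else:
            out.append(re.sub(r'\s*\n\s*', ' ', part))
    return ''.join(out)
```

Newlines in ordinary markup are collapsed because they are not significant there, while anything inside <pre> passes through untouched.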
Some more thoughts, both assuming keeping AS1 as the central format:
Since AS1 generally assumes HTML for content, plain text properties could be turned to HTML on the input conversion in a way that transparently converts back on text-only outputs.
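To illustrate the round trip, here is a rough sketch using only the standard library. The function names `text_to_html` and `html_to_text` are made up for this example, and the reverse conversion is only expected to work on HTML that `text_to_html` itself produced:

```python
import html


def text_to_html(text):
    """Escape plain text and make its newlines visible as <br> tags."""
    return html.escape(text, quote=False).replace('\n', '<br>\n')


def html_to_text(markup):
    """Undo text_to_html(): drop the <br> tags, then unescape entities."""
    return html.unescape(markup.replace('<br>\n', '\n'))
```

Usage, under the same assumptions:

```python
original = 'a < b\nc & d'
as_html = text_to_html(original)   # 'a &lt; b<br>\nc &amp; d'
assert html_to_text(as_html) == original
```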
Several Python templating libraries have the concept of a special string type for HTML (e.g. the Markup class from MarkupSafe, which Jinja2 uses) that does not get escaped on output, so the object itself could know whether it contains HTML or not.
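The idea can be sketched without any third-party dependency. MarkupSafe marks safe strings with an `__html__` method; the `HtmlString` class and `render` function below are hypothetical stand-ins that follow the same convention, not the library's actual code:

```python
import html


class HtmlString(str):
    """A str subclass whose type signals 'this is already HTML'."""

    def __html__(self):
        return self


def render(value):
    """Escape plain strings, but pass HTML-marked strings through as-is."""
    if hasattr(value, '__html__'):
        return value.__html__()
    return html.escape(value)
```

With this, `render('a<b')` gives `'a&lt;b'`, while `render(HtmlString('<b>x</b>'))` returns the markup unchanged. The distinction travels with the value rather than in a separate field.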
thanks again for the ideas @sknebel. i handled this by adding a custom content_is_html property to AS1 when we generate it from HTML, and then use that to determine whether to strip newlines. we'll see what else breaks.
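for illustration only, the flag approach might look roughly like this. only the content_is_html property name comes from the comment above; the helper names and the exact plain-text handling are assumptions for this sketch, not granary's actual code:

```python
import html


def object_from_html(content):
    """Build a minimal AS1-style object and record that it came from HTML."""
    return {'objectType': 'note', 'content': content, 'content_is_html': True}


def object_from_text(content):
    """Build a minimal AS1-style object from plain text (no flag set)."""
    return {'objectType': 'note', 'content': content}


def content_for_feed(obj):
    """Leave HTML untouched; only rewrite newlines for plain text."""
    content = obj.get('content', '')
    if obj.get('content_is_html'):
        return content
    # Assumed plain-text handling: escape, then make newlines visible.
    return html.escape(content, quote=False).replace('\n', '<br>\n')
```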