Output from expand_urls causes invalid Atom feeds #198

Closed
mikl opened this issue Oct 3, 2011 · 6 comments

Comments

@mikl
Contributor

mikl commented Oct 3, 2011

In atom.xml, the following line is used to escape and output the content:

    <content type="html">{{ post.content | expand_urls: site.url | xml_escape }}</content>

However, the output from expand_urls is not escaped, so if you have a code block in your post, you will get something like this:

    &lt;p&gt;Mine looks like this:&lt;/p&gt;

    &lt;p&gt;<div class='bogus-wrapper'><notextile><figure class='code'><figcaption><span> (solr.xml)</span> <a href='/downloads/code/solr.xml'>download</a></figcaption>
     <div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
    <span class='line-number'>2</span>
    <span class='line-number'>3</span>
    <span class='line-number'>4</span>
    <span class='line-number'>5</span>

i.e., with escaped and unescaped HTML mixed together.

As the spec for text type decrees, an element with type="html" must only contain escaped HTML. It is also allowed to set type="xhtml" and then have unescaped markup instead, but then the whole thing must be unescaped and valid XHTML.

So, long story short, using code blocks or similar tags yields invalid Atom feeds.
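To illustrate what a valid type="html" payload looks like, here is a minimal Ruby sketch. It is not the Octopress code: `atom_html_content` is a hypothetical helper, and the sample post body is made up. The point is only that escaping runs once, over the whole rendered string, as the very last step, so no literal markup can survive.

```ruby
require 'cgi/escape'

# Hypothetical helper (not part of Octopress): build a type="html" content
# element. Escaping is applied to the complete rendered string in one pass.
def atom_html_content(rendered_html)
  %(<content type="html">#{CGI.escapeHTML(rendered_html)}</content>)
end

# Made-up post body: a paragraph followed by a code-block wrapper.
post = "<p>Mine looks like this:</p>\n<div class='bogus-wrapper'><pre>1</pre></div>"
puts atom_html_content(post)
```

If any filter that emits markup runs after the escaping step, the mixed output shown above is the likely result.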

@imathis
Owner

imathis commented Oct 3, 2011

Interesting. What do you think is the best fix here?

@fhemberger
Contributor

How about wrapping all content in CDATA tags instead of escaping them by hand?
If you use type="xhtml" you obviously have to wrap the content in a div giving the correct XML namespace: http://www.xml.com/pub/a/2005/12/07/handling-atom-text-and-content-constructs.html
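For the type="xhtml" variant, a rough Ruby sketch of the wrapper could look like this. `atom_xhtml_content` and `XHTML_NS` are hypothetical names used for illustration only; the helper does no validation, so it assumes the fragment passed in is already well-formed XHTML.

```ruby
# Namespace that the wrapping div has to declare for type="xhtml" content.
XHTML_NS = 'http://www.w3.org/1999/xhtml'

# Hypothetical helper: wrap an already well-formed XHTML fragment in the
# namespaced div. The fragment is inserted as-is, without escaping.
def atom_xhtml_content(xhtml_fragment)
  %(<content type="xhtml"><div xmlns="#{XHTML_NS}">#{xhtml_fragment}</div></content>)
end

puts atom_xhtml_content('<p>Mine looks like this:</p>')
```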

Also, the <!-- more --> tag is converted to <!– more –>, which throws warnings for the feed, see http://validator.w3.org/feed/

I'm using code blocks in my feed and it's validated correctly.

@mikl
Contributor Author

mikl commented Oct 4, 2011

The CDATA solution could be better, but we need to be aware that CDATA doesn't nest. We would need to escape CDATA end markers (]]>) in the rendered content, so that a blog post containing them doesn't end the CDATA envelope prematurely :)
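One common way to handle the end marker is to split every `]]>` across two adjacent CDATA sections, so the parser reassembles the original text. A minimal Ruby sketch, assuming hypothetical helper names (`cdata_escape` and `cdata_wrap` here are illustrative, not existing Octopress filters):

```ruby
# Split each "]]>" so that "]]" closes one CDATA section and ">" opens
# the next. A conforming XML parser joins the pieces back together.
def cdata_escape(text)
  text.gsub(']]>', ']]]]><![CDATA[>')
end

# Wrap rendered content in a single CDATA envelope after neutralising
# any end markers it contains.
def cdata_wrap(text)
  "<![CDATA[#{cdata_escape(text)}]]>"
end

puts cdata_wrap('a ]]> b')
```

Content without the marker passes through unchanged apart from the envelope, so this could be applied to every post unconditionally.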

@fhemberger
Contributor

Perhaps we should add handling for this CDATA end tag to xml_escape.

@imathis
Owner

imathis commented Oct 4, 2011

@fhemberger or @mikl think you can manage a pull request for this? It sounds like you guys have a better idea on this than I do.

@fhemberger
Contributor

@imathis Aye, I'll have a look later on ...

briansimmons pushed a commit to briansimmons/octopress that referenced this issue Aug 20, 2013

3 participants