This repository has been archived by the owner on Nov 29, 2017. It is now read-only.

Working, official-unofficial Atom feed code #7

Closed
3 tasks done
ndarville opened this issue Apr 2, 2014 · 48 comments

Comments

@ndarville

I currently use this feed code for my blog. Unfortunately, it doesn’t seem to work in readers (ndarville/ndarville.github.io#11).

Finding the right feed code is a terrible experience I wouldn’t wish upon my worst enemy, and I recall how long it took me to get it to work for the static Blogofile CMS.

Can we establish a working official-unofficial Jekyll feed code that people can use, and which can be improved upon in one central place, should any new bugs be found?

Decisions

  • {{ site.baseurl }} vs. {{ site.url }} w/r/t SSL
  • xml_escape vs. CDATA
  • Filename: /atom/index.xml, atom.xml, feed.atom, etc.
@parkr
Member

parkr commented Apr 2, 2014

This is a fantastic idea! @benbalter or @imathis might have an idea. I'd be down to write a plugin which handles this for us.

@budparr
Contributor

budparr commented Apr 2, 2014

I think @mdo has a good example of a feed: https://github.com/poole/poole/blob/master/atom.xml

This works for me, though much of it would vary from site to site:


<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

 <title>{{ site.sitename }}</title>
 <link href="{{ site.baseurl }}/atom.xml" rel="self"/>
 <link href="{{ site.baseurl }}/"/>
 <updated>{{ site.time | date_to_xmlschema }}</updated>
 <id>{{ site.baseurl }}/</id>
 <author>
   <name>{{ site.author }}</name>
   {% if  site.author.email  %}<email>{{ site.author.email }}</email>{% endif %}
 </author>

 {% for post in site.posts %}
 <entry>
   <title>{{ post.title }}</title>
   <link href="{{ site.baseurl }}{{ post.url }}"/>
   <updated>{{ post.date | date_to_xmlschema }}</updated>
   <id>{{ site.baseurl }}{{ post.id }}</id>
   <content type="html">{{ post.content | xml_escape }}</content>
 </entry>
 {% endfor %}

</feed>

@ndarville
Author

Is there some kind of feed validation available? My method of seeing whether a handful of feed readers will eat my XML probably isn’t the smartest way of testing, especially if something breaks further down the road.

@budparr
Contributor

budparr commented Apr 2, 2014

This: http://feedvalidator.org/
But beware - it will throw warnings based on content.
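
For a quick local check between runs of the online validator, something like the following Python sketch might help. It only confirms that the file is well-formed XML and that the feed and each entry carry the `id`, `title` and `updated` elements that RFC 4287 requires; it is not a substitute for feedvalidator.org, and the `check_feed` helper name is made up for illustration.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"
REQUIRED = ("id", "title", "updated")  # required by RFC 4287 on both feed and entry


def check_feed(xml_text):
    """Return a list of problems found; an empty list means the basic checks passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # Typical cause: an unescaped "&" or "<" in a title or in the content.
        return [f"not well-formed XML: {exc}"]

    if root.tag != ATOM_NS + "feed":
        return [f"root element is {root.tag!r}, expected an Atom <feed>"]

    problems = []
    for name in REQUIRED:
        if root.find(ATOM_NS + name) is None:
            problems.append(f"feed is missing <{name}>")
    for index, entry in enumerate(root.findall(ATOM_NS + "entry")):
        for name in REQUIRED:
            if entry.find(ATOM_NS + name) is None:
                problems.append(f"entry {index} is missing <{name}>")
    return problems
```

Running it over the generated feed file after each build would at least catch the case where one post breaks the whole file.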

@ndarville
Author

I implemented mdo’s script, but I still get some errors: http://feedvalidator.org/check.cgi?url=https%3A%2F%2Fndarville.github.io%2Ffeed%2Findex.xml.

It looks like feed readers still won’t eat it as well, so maybe we need something beyond mdo’s script. Maybe wrapping it in CDATA will help. I’ll try to look into it tomorrow.

@ndarville
Author

Adding this change (an xml_escape filter) fixed it.

So if we add that part, we can canonize mdo’s as the unofficial-official Jekyll feed.

@budparr
Contributor

budparr commented Apr 6, 2014

Except I think it should be {{ site.baseurl }} rather than just {{ site.url }}; see http://jekyllrb.com/docs/upgrading/#baseurl

@ndarville
Author

Hmm. I am using a hardcoded, absolute URL to ensure I direct users to the SSL version of my website—although I understand this is something specific to my set-up. Won’t {{ site.baseurl }} just redirect to /feed/index.xml?

Either way, the change is to add an xml_escape filter. Which URL is better for SSL nuts is a separate discussion, but we should still figure it out here.

- [ ] Decide on {{ site.baseurl }} vs. {{ site.url }} w/r/t SSL

@mscharley

For what it's worth, here's my implementation of an atom feed with only one warning using the validator:

---
---
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>{{ site.title }}</title>
  <link href="http://{{ site.domain }}/"/>
  <link href="http://{{ site.domain }}/atom.xml" rel="self"/>
  <updated>{{ site.time | date_to_xmlschema }}</updated>
  <id>http://{{ site.domain }}/</id>
  <author>
    <name>{{ site.author.name }}</name>
    <email>{{ site.author.email }}</email>
  </author>

  {% for post in site.posts %}
  <entry>
    <title>{{ post.title }}</title>
    <link href="http://{{ site.domain }}{{ post.url }}?utm_source=atom&amp;utm_medium=rss&amp;utm_campaign=atom"/>
    <updated>{{ post.date | date_to_xmlschema }}</updated>
    <id>http://{{ site.domain }}{{ post.id }}</id>
    <content type="html">{{ post.content | xml_escape }}</content>
  </entry>
  {% endfor %}
</feed>

This is combined with a few fairly straightforward settings in _config.yml:

title: Matthew Scharley
domain: matt.scharley.me
author:
  name: Matthew Scharley
  email: matt.scharley@gmail.com

http://feedvalidator.org/check.cgi?url=http%3A%2F%2Fmatt.scharley.me%2Fatom.xml

@mscharley

After doing some testing, it seems CDATA is equivalent to xml_escape, which makes sense now that I think about it. That said, I've taken to using xml_escape for the <title> and CDATA for the <content>, as CDATA removes the need to escape to XML entities, which makes the content both shorter (and hence a smaller file) and easier to read in the raw XML.
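
The equivalence can be illustrated outside Jekyll with a short Python sketch. It uses the standard library's `escape` as a stand-in for an xml_escape-style filter (an assumption; Jekyll's own filter is not reproduced here), and the post content is a made-up example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

post_html = '<p>Tom & Jerry say "hi"</p>'  # hypothetical post content

# Option 1: entity-escape the markup (what an xml_escape-style filter does).
escaped = f'<content type="html">{escape(post_html)}</content>'

# Option 2: wrap the raw markup in a CDATA section.
cdata = f'<content type="html"><![CDATA[{post_html}]]></content>'

# A parser recovers the same character data from both forms.
text_from_escaped = ET.fromstring(escaped).text
text_from_cdata = ET.fromstring(cdata).text
```

Both `text_from_escaped` and `text_from_cdata` come back as the original markup, so a feed reader that parses the XML should see no difference between the two approaches.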

No opinion on baseurl vs url. baseurl seems useless at the moment as you can't include it if it is /, which is the default and the case 90% of the time.

@budparr
Contributor

budparr commented Apr 16, 2014

I don't find baseurl useless. I set my production baseurl in the config and then flag my local builds with --baseurl "", so setting both url and baseurl would be redundant, given that baseurl has meaning and url, to my knowledge, does not.

@mscharley

An empty baseurl is actually a good idea, but it's not what's reflected in the documentation which recommends doing what you do and setting it to / locally. In that case, it's probably a good idea.

@mscharley

That also still doesn't take care of the default being / though, so it would break on default setups.

@troyswanson
Member

I've transitioned to using GitHub Pages Metadata instead of using something like baseurl in the configuration file.

Here's an example:

<link rel="icon" href="{{ site.github.url }}/img/favicon-16x16.png" sizes="16x16" type="image/png">
<link rel="icon" href="{{ site.github.url }}/img/favicon-32x32.png" sizes="32x32" type="image/png">
<link rel="stylesheet" href="{{ site.github.url }}/css/cornerstone.css">

@budparr
Contributor

budparr commented Apr 16, 2014

Very cool, @troyswanson

@doktorbro
Member

According to RFC 4287, you can use the file extension atom for Atom feeds:

<link rel="self"
  type="application/atom+xml"
  href="http://example.org/feed.atom"/>

@mscharley

Yep, it's an official mime type

@ndarville
Author

Just to get an idea of where we are, I added some tasks to the top post for the remaining things to hash out, before we decide on a canonical Atom file.

Feel free to add to the list.

@mscharley

My 2c.

atom.xml.

xml_escape. CDATA will yield slightly smaller file sizes due to less escaping, but xml_escape will yield more reliability.
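
One possible reason for a reliability difference (an assumption; the comment does not say which failure it has in mind) is that a CDATA section ends at the first `]]>`, so post content that happens to contain that sequence breaks a naive wrapper. The Python sketch below compares a naive CDATA wrapper, a wrapper that splits the terminator, and plain entity escaping. All helper names and the sample post body are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def wrap_cdata_naive(text):
    # Breaks if the text itself contains "]]>".
    return f"<content><![CDATA[{text}]]></content>"


def wrap_cdata_safe(text):
    # Split every "]]>" so that no single CDATA section contains the terminator.
    return wrap_cdata_naive(text.replace("]]>", "]]]]><![CDATA[>"))


def wrap_escaped(text):
    # Entity escaping has no terminator to collide with.
    return f"<content>{escape(text)}</content>"


def parses_back(xml_text, original):
    """True if the XML is well-formed and yields the original text."""
    try:
        return ET.fromstring(xml_text).text == original
    except ET.ParseError:
        return False


tricky = "<pre>if (a[b[0]]>1) { run(); }</pre>"  # hypothetical post body containing "]]>"
```

With this input, only the naive CDATA wrapper fails to round-trip; the split-CDATA and escaped versions both parse back to the original text.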

@budparr
Contributor

budparr commented Apr 22, 2014

@ndarville I think the first issue is solved by @troyswanson's comment about using GitHub Pages Metadata (site.github.url). I think the use of one of the two (site.url, etc.) may be user preference, whereas using GH metadata ensures it won't break. SSL might be an issue, but I think that's an edge case.

@mscharley

The issue with GitHub Pages Metadata is that it relies on people deploying to GitHub Pages, which is far from the only way that people are deploying Jekyll websites.

@budparr
Contributor

budparr commented Apr 22, 2014

Of course, good point.

@mscharley

Out of curiosity, what's the argument for using HTTPS with Jekyll anyway? There's nothing private in a static site, unless the entire site is, but that would be the minority of uses, I think. Perhaps we're overthinking it?

Of course, there's always another option that I just thought of: scheme-relative URLs, e.g. //github.com/jekyll/jekyll, which means use either HTTP or HTTPS depending on how the original document was fetched.
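
How a scheme-relative reference resolves can be shown with Python's standard `urljoin`, which follows the generic URL resolution rules. The example.org feed address is a placeholder, and whether every feed reader resolves such references the same way is not something this thread establishes.

```python
from urllib.parse import urljoin

relative = "//github.com/jekyll/jekyll"  # scheme-relative reference from the comment

# The scheme is inherited from the document the reference appears in.
over_http = urljoin("http://example.org/feed.atom", relative)
over_https = urljoin("https://example.org/feed.atom", relative)
```

Here `over_http` keeps the `http:` scheme and `over_https` keeps `https:`, while the host and path come from the reference itself.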

@doktorbro
Member

From what I see here, site.url doesn’t exist by default. Jekyll exposes site.baseurl only. The documentation lies about it. We can use conditions to find the base:

<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

{% if site.url %}
  {% assign base = site.url %}
{% elsif site.github.url %}
  {% assign base = site.github.url %}
{% else %}
  {% assign base = '' %}
{% endif %}

<id>{{ base }}{{ page.url }}</id>

</feed>

If one doesn’t use Github, she sets the url in the config file to http://example.org (no trailing slash).

🔓
My votes:

  • site.url
  • xml_escape
  • feed.atom (Github uses the atom extension too)

🔒
Correction:

  • site.github.url if site.url isn't set
  • xml_escape
  • feed.atom

@ndarville
Author

I like your idea of using conditionals. It means people also don’t have to alter their Atom code according to their set-up for most use cases.

@parkr
Member

parkr commented Apr 22, 2014

Why not use site.github.url if site.url isn't set and site.github.url is available?
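
As a rough illustration of that fallback order, here is a Python sketch that models the site configuration as a plain dict (an assumption made for the example; `pick_base` is a hypothetical helper, not part of Jekyll). It deliberately mirrors Liquid's truthiness rule, where only nil and false are falsy, so an empty string in the config still counts as set.

```python
def pick_base(site):
    """Mirror a Liquid if/elsif/else: the first configured value wins, else ''."""
    # Liquid treats only nil and false as falsy, so an empty string counts as "set".
    url = site.get("url")
    if url is not None and url is not False:
        return url
    github = site.get("github") or {}
    github_url = github.get("url")
    if github_url is not None and github_url is not False:
        return github_url
    return ""
```

One consequence worth noting: a config with `url: ""` would select the empty string and never reach the GitHub metadata, so leaving the key out entirely is the safer way to opt into the fallback.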

@albertogg
Member

I vote for:

  • site.github.url if site.url isn't set.
  • xml_escape
  • feed.atom

Same as @penibelst, but with @parkr's conditional.

@doktorbro
Member

@parkr This is exactly what I want. Thank you @albertogg for putting everything together.

I second @albertogg’s selection.

@budparr
Contributor

budparr commented Apr 23, 2014

I'm still not clear on why anyone would use site.url instead of site.baseurl. Can anyone enlighten me?

@ndarville
Author

I’m down with @albertogg and @parkr as well.

@albertogg
Member

@budparr From my understanding, site.baseurl is the part of the URL path where the Jekyll site starts, almost always just /. In an Atom feed we need the entire URL, so for that reason alone site.url is the proper variable, as it contains the whole domain, e.g. example.com.
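
To make the division of labour concrete, here is a small Python sketch of how the two values could combine into the absolute links a feed needs. The `absolute_url` helper and all sample values are hypothetical; the slash handling is one way to sidestep the "/" versus empty-string baseurl question raised earlier in the thread.

```python
def absolute_url(url, baseurl, path):
    """Join site.url, site.baseurl and a post path without doubling slashes."""
    prefix = baseurl.strip("/")  # "/" and "" both become ""
    parts = [url.rstrip("/")]    # the domain part, e.g. http://example.com
    if prefix:
        parts.append(prefix)     # the subpath the site is served from, if any
    parts.append(path.lstrip("/"))
    return "/".join(parts)
```

For example, `absolute_url("http://example.com", "/blog", "/2014/04/02/hello.html")` produces a link under `/blog`, while a baseurl of `/` or an empty string produces the same link at the domain root.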

@budparr
Contributor

budparr commented Apr 23, 2014

Thanks, @albertogg - I see now (and I found Andrew Shell's blog post clarifying the docs). I've been making baseurl do double duty, when its application should be more narrow than that. Appreciate it.

@albertogg
Member

No problem @budparr.

@ndarville
Author

Sounds like we’re all agreed.

How about we close this when, but only when, it’s made it into the Jekyll codebase?

@parkr
Member

parkr commented Apr 23, 2014

Where would you propose it goes in the Jekyll codebase? I'd rather release this as a plugin – trying to keep Jekyll slim 💃

@ndarville
Author

Ah, whatever’s fine with me, as long as it’s one canonical thing to point people to.

@ndarville
Author

Was this what we decided on?

---
layout: none
---
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
{% if site.url %}
    {% assign base = site.url %}
{% elsif site.github.url %}
    {% assign base = site.github.url %}
{% endif %}
    <title>{{ site.name }}</title>
    <link href="{{ base }}/feed.atom" rel="self"/>
    <link href="{{ base }}/"/>
    <updated>{{ site.time | date_to_xmlschema }}</updated>
    <id>{{ base }}/</id>
    <author>
        <name>{{ site.author }}</name>
        {% if site.author.email  %}<email>{{ site.author.email }}</email>{% endif %}
    </author>

    {% for post in site.posts %}
    <entry>
        <title>{{ post.title | xml_escape }}</title>
        <link href="{{ base }}{{ post.url }}"/>
        <updated>{{ post.date | date_to_xmlschema }}</updated>
        <id>{{ base }}{{ post.id }}</id>
        <content type="html">{{ post.content | xml_escape }}</content>
    </entry>
    {% endfor %}
</feed>

Three things about this:

  • I wasn’t sure how you wanted the conditional to be precisely, so look that one over.
  • I use four-space indentation. I know people have killed for less, so feel free to change it to whichever format you prefer.
  • I left {{ site.author }} mandatory as something to be defined in _config.yml. So it’s not entirely plug and play that way.

@doktorbro
Member

@ndarville I add:

  • subtitle
  • page.url instead of hardcoded url
  • site.author.url
  • customizable limit for posts
  • author of posts for multiple authors support

---
layout: none
limit: 10
---
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
{% if site.url %}
    {% assign base = site.url %}
{% elsif site.github.url %}
    {% assign base = site.github.url %}
{% endif %}
    <title>{{ site.name | xml_escape }}</title>
    <subtitle>{{ site.description | xml_escape }}</subtitle>
    <link href="{{ base }}{{ page.url }}" rel="self"/>
    <link href="{{ base }}/" rel="alternate" type="text/html"/>
    <updated>{{ site.time | date_to_xmlschema }}</updated>
    <id>{{ base }}/</id>
    <author>
        <name>{{ site.author.name }}</name>
        <email>{{ site.author.email }}</email>
        <uri>{{ site.author.url }}</uri>
    </author>

    {% for post in site.posts limit: page.limit %}
    <entry>
        <title>{{ post.title | xml_escape }}</title>
        <link href="{{ base }}{{ post.url }}"/>
        <updated>{{ post.date | date_to_xmlschema }}</updated>
        <id>{{ base }}{{ post.id }}</id>
        <author>
            <name>{{ post.author.name }}</name>
            <email>{{ post.author.email }}</email>
            <uri>{{ post.author.url }}</uri>
        </author>
        <content type="html">{{ post.content | xml_escape }}</content>
    </entry>
    {% endfor %}
</feed>

@sondr3

sondr3 commented Apr 25, 2014

Is this one going to end up in the jekyll-sitemap plugin or as a default template in the install of Jekyll?

@ndarville
Author

@sondr3 I think putting it in the sitemap plug-in would be too confusing. Sounds like @parkr will set it up as an independent plug-in.

@troyswanson
Member

Looks like the jekyll-sitemap plugin is now whitelisted on GitHub Pages via github/pages-gem#62.

@ndarville
Author

What happens when you open the link to my feed.atom in your browser? Chrome and Safari don’t seem to take kindly to it on OS X.

Pre-blink Opera handles it, fwiw. :P

@doktorbro
Member

@ndarville My Firefox v29 on Ubuntu shows a regular feed. It works here.

@ndarville
Author

Looks like it’s working here now as well. Awesome.

@bcomnes

bcomnes commented May 19, 2014

Don't forget to add PubSubHubbub to your feed! I wrote up some notes if you are curious.

@stve

stve commented Jun 18, 2014

I took a stab at building out the library based around the templates and discussion above. Its implementation is nearly identical to that of jekyll-sitemap. It didn't seem like there was consensus on a filename/extension. Feedback welcome! https://github.com/spagalloco/jekyll-atom

@parkr Was this something you envisioned living under the @jekyll org? I haven't published to rubygems or anything as I wasn't sure what you or the rest of the core team had in mind.

Based on your comments above, I had a glimmer of hope that such a plugin offered enough value to be whitelisted so that it could be used by GitHub Pages.

@ndarville
Author

Nice work.

I’m currently doing a site project, and it’s clear that there will always be special use cases, such as post language and translations, as well as different types and numbers of authors on a post. The ideal basis is probably to optimize it for a single-language, single-author blog and have people flesh it out themselves, or with the help of a wiki with guides for supporting different variations.

@parkr
Member

parkr commented Jul 23, 2015

We have https://github.com/jekyll/jekyll-feed now. I believe Ben Balter is working on bringing it into GitHub Pages if it isn't already there.

@parkr parkr closed this as completed Jul 23, 2015
@jekyll jekyll locked and limited conversation to collaborators Jul 23, 2015