layout | title | date | categories | tags | x-source
---|---|---|---|---|---
post | The rich in RSS | 2009-04-16 03:21:00 -0700 | | |
Last month my colleagues and I had a team "hackday" – an opportunity to work together (we usually work individually on projects) and rapidly develop some software prototypes. We had a few ideas beforehand, held a brainstorm, then got down to business in the Digilab. It was a general success; however, we aren't ready to show the results just yet – I'll update when we do ;). Richard, Juliette, Patrick and Will worked with Twitter. Sam and I put together an event feed aggregator using Yahoo Pipes.
We used RememberTheMilk and Google Calendar feeds as examples, and I was struck again: why don't people use existing standards? Specifically, why don't the feeds provided by RTM and Google use the RSS 1.0 Event module? What they do instead is mark up (or not) the data for the event (start date, location etc.) as HTML inside RSS or Atom. So, for RememberTheMilk we have,
{% highlight xml %}
<entry>
...
<id>tag:rememberthemilk.com,1999:tasks-nfre ...</id>
<content type="xhtml">
<div xmlns="">
<div class="rtm_due"><span class="rtm_due_title">Due: </span>
<span class="rtm_due_value">Thu 10 Apr 08 at 10:00AM</span></div>
<div class="rtm_priority"><span class="rtm_priority_title">Priority: </span>
<span class="rtm_due_value">none</span></div>
...
{% endhighlight %}
And for Google Calendar,
{% highlight xml %}
<entry xmlns="http://www.w3.org/2005/Atom">
<id>http://www.google.com/calendar/feeds/d4 ... </id>
<published>2009-02-18T18:08:04.000Z</published>
<category scheme="http://schemas.google.com/g/2005#kind" term="http://schemas.google.com/g/2005#event"/>
<title type="html">Quick ...</title>
<summary type="html">
When: Wed 18 Feb 2009 18:00 to 18:15 GMT<br>
<br>Event Status: confirmed
</summary>
<author><name>Sam ...</name></author>
...
{% endhighlight %}
Now, the examples above are fine for consumption by humans in a feed reader. However, they are a pain to machine-parse. The HTML 'divs' in RTM are easier to handle than Google's free text, but you still have to do something special for each calendar provider (regular expressions for Google, yuck!).
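To make the "something special for each provider" point concrete, here is a minimal Python sketch of what the two extractors might look like. The sample strings are cut-down copies of the snippets above, and the names `parse_google_when`, `RtmDueParser` and `parse_rtm_due` are hypothetical. It assumes the date formats are exactly as shown, which real feeds may not guarantee.

```python
import re
from datetime import datetime
from html.parser import HTMLParser

# Cut-down copies of the feed snippets above (assumed formats).
GOOGLE_SUMMARY = "When: Wed 18 Feb 2009 18:00 to 18:15 GMT<br><br>Event Status: confirmed"
RTM_CONTENT = (
    '<div class="rtm_due"><span class="rtm_due_title">Due: </span>'
    '<span class="rtm_due_value">Thu 10 Apr 08 at 10:00AM</span></div>'
)


def parse_google_when(summary):
    """Pull (start, end) out of Google's free-text 'When:' line.

    Only handles the same-day form shown above and ignores the
    trailing timezone label, so the datetimes returned are naive.
    """
    match = re.search(
        r"When: \w{3} (\d{1,2} \w{3} \d{4}) (\d{2}:\d{2}) to (\d{2}:\d{2})",
        summary,
    )
    if match is None:
        return None
    day, start, end = match.groups()
    fmt = "%d %b %Y %H:%M"
    return (
        datetime.strptime(f"{day} {start}", fmt),
        datetime.strptime(f"{day} {end}", fmt),
    )


class RtmDueParser(HTMLParser):
    """Collect the text of the first <span class="rtm_due_value">."""

    def __init__(self):
        super().__init__()
        self._in_due = False
        self.due = None

    def handle_starttag(self, tag, attrs):
        if tag == "span" and dict(attrs).get("class") == "rtm_due_value":
            self._in_due = True

    def handle_data(self, data):
        if self._in_due and self.due is None:
            self.due = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_due = False


def parse_rtm_due(content):
    """Return the RTM due date as a naive datetime, or None if absent."""
    parser = RtmDueParser()
    parser.feed(content)
    if parser.due is None:
        return None
    return datetime.strptime(parser.due, "%a %d %b %y at %I:%M%p")
```

Two providers, two parsers, two date formats: every new calendar source would need another bespoke function along these lines.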
The RSS 1.0 Event module was published in 2001. It defines the elements `startdate`, `enddate` (W3CDTF), `location`, `organizer` (person or body) and `type` (fixed taxonomy ??).
So the Google Calendar entry above becomes something like,
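{% highlight xml %}
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:ev="http://purl.org/rss/1.0/modules/event/">
<id>http://www.google.com/calendar/feeds/d4 ... </id>
<published>2009-02-18T18:08:04.000Z</published>
<title type="html">Quick ...</title>
<ev:startdate>2009-02-18T18:00+00:00</ev:startdate>
<ev:enddate>2009-02-18T18:15+00:00</ev:enddate>
...
<author><name>Sam ...</name></author>
<summary type="html">Event Status: confirmed</summary>
...
{% endhighlight %}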
So now we have data that is easily accessible both to humans (via generic feed readers) and to machines (specialist event parsers). Simple? (The code samples above are cut down for illustration purposes.) [26 March, 3 April]
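As a rough illustration of the "specialist event parser" side, here is a minimal Python sketch that reads Event module elements from an Atom entry with the standard library. The sample entry is an assumed, cut-down version of the Google Calendar example with the event elements added under the module's namespace; the function name `read_event` is hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Namespace URIs for Atom and the RSS 1.0 Event module.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "ev": "http://purl.org/rss/1.0/modules/event/",
}

# Assumed, cut-down entry for illustration only.
ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:ev="http://purl.org/rss/1.0/modules/event/">
  <title type="html">Quick ...</title>
  <ev:startdate>2009-02-18T18:00+00:00</ev:startdate>
  <ev:enddate>2009-02-18T18:15+00:00</ev:enddate>
</entry>"""


def read_event(entry_xml):
    """Return title, start and end for one entry; missing fields are None."""
    entry = ET.fromstring(entry_xml)

    def text(path):
        node = entry.find(path, NS)
        return node.text if node is not None else None

    def when(path):
        value = text(path)
        # fromisoformat covers the full date-time form used here; other
        # W3CDTF forms (e.g. a bare year) would need extra handling.
        return datetime.fromisoformat(value) if value else None

    return {
        "title": text("atom:title"),
        "start": when("ev:startdate"),
        "end": when("ev:enddate"),
    }


event = read_event(ENTRY)
duration = event["end"] - event["start"]
```

The same function would work unchanged for any provider that emitted these elements, which is the whole appeal: no per-provider regular expressions.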