WebGlue
========

  PubSubHubbub Ruby implementation


Overview
---------

PubSubHubbub (PSHB) is a simple, open, server-to-server, web-hook-based pubsub 
(publish/subscribe) protocol, defined as an extension to Atom. For more details see
http://code.google.com/p/pubsubhubbub/ .
This project is a deliberately minimal Ruby implementation of a hub compatible with
the PSHB Core 0.2 draft.

Implemented features:

 - Publishing new topics (Atom feeds)
 - Subscriptions to existing topics, including callback verification in both 'sync'
   and 'async' modes
 - Fetching Atom feeds, finding the new entries and sending them to all subscribers
 - Authenticated Content Distribution ('hub.secret')
 
Known limitations:

 - No automatic check for updated feeds: publishers need to ping the hub manually
 - Only Atom feeds can be processed, not RSS


Required gems
--------------

 - sinatra - web framework - routing etc.
 - httpclient - POST requests, callbacks verification
 - crack - XML parsing
 - ratom - Atom feeds fetching/parsing
 - SystemTimer - for timeouts on unsuccessful requests
 - ruby-hmac - HMAC-SHA1 for content digests

If you are using http://heroku.com/ for deployment, your '.gems' file will look like:

  httpclient
  SystemTimer
  crack
  ratom
  ruby-hmac


Running
-------

The whole system is implemented as a Sinatra [ http://www.sinatrarb.com/ ] application. 
To start it locally (on port 4567 for example):

  git clone git://github.com/zh/webglue.git   
  cd webglue
  bundle install
  bundle exec rackup -p 4567 -s thin

For a production environment, 'unicorn' may be a better choice:

  bundle exec unicorn -c ./unicorn.conf -E production

Optionally, run the 'async' verification worker process:

- from crontab:

  require 'worker'
  WebGlue::Worker.verify

- as an independent daemon (checks every WebGlue::Config.CHECK minutes):

  require 'worker'
  WebGlue::Worker.run

Or you can uncomment the relevant lines in worker.rb (the 'if __FILE__ == $0' block) and run:

  ruby worker.rb

Heroku ( http://heroku.com/ ) automatically recognizes the startup file (config.ru) and
runs your application after deployment:

  git clone git://github.com/zh/webglue.git
  cd webglue
  heroku create mypubhub
  git push heroku master


PubSub
-------

Both publishing and subscription requests go to the same endpoint - '/'. So if your
application is running on URL http://localhost:4567/ , that will be the endpoint for both
publishers (atom:link[@rel="hub"] in the feed) and subscribers (POST requests for subscription).
For debugging purposes only, there are two web forms at the '/publish' and '/subscribe' URLs.


Publishing new topics
----------------------

If your application is running on URL http://localhost:4567/ , go to 
http://localhost:4567/publish and enter your Atom feed URL in the 'Topic:' text box. Press
"Publish". Nothing will change on the screen, because the hub responds with
HTTP code 204 "No Content".

You can also POST directly to http://localhost:4567/ with the parameters described
in the "PubSubHubbub Core 0.2" document - 
http://pubsubhubbub.googlecode.com/svn/trunk/pubsubhubbub-core-0.2.html
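
As a rough sketch of such a direct POST, the snippet below builds the publish
parameters named in the PSHB Core 0.2 document ('hub.mode' and 'hub.url') and sends
them with Ruby's standard Net::HTTP. The helper names and the example feed URL are
made up for illustration and are not part of WebGlue.

```ruby
require 'net/http'
require 'uri'

# Form parameters for a "new content" ping, as named in PSHB Core 0.2.
def publish_params(topic_url)
  { 'hub.mode' => 'publish', 'hub.url' => topic_url }
end

# POST the ping to the hub and return the Net::HTTPResponse.
# hub_url is an assumption: use whatever URL your install runs on.
def ping_hub(hub_url, topic_url)
  Net::HTTP.post_form(URI.parse(hub_url), publish_params(topic_url))
end

# Example (needs a running hub; the feed URL is a placeholder):
#   response = ping_hub('http://localhost:4567/', 'http://example.com/feed.atom')
#   response.code  # per the text above, the hub should answer with 204
```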

Also in your feed, insert the line:

  <link rel="hub" href="http://localhost:4567/" />  (adjust for your install)

There is no automatic check for updated feeds yet, so after a feed changes,
repeat the steps described above (from the web form or via POST).
Publishing is possible at most once every 5 minutes.


Subscription to existing topics
--------------------------------

If your application is running on URL http://localhost:4567/ , go to
http://localhost:4567/subscribe and fill in the web form:

 - Callback - URL of the webhook that will receive Atom-formatted notifications
 - Topic - Atom feed URL, ALREADY PUBLISHED to the system - see 'Publishing new topics' 
   above
 - Verify mode - both 'Synchronous' and 'Asynchronous' modes are supported
 - Mode - 'Subscribe' for adding new subscribers and 'Unsubscribe' for removing 
   existing ones
 - Verify token - an opaque value that the hub passes back to your callback during
   verification, so the callback can confirm the request is one it initiated
 - Secret - used to compute an HMAC digest of the content sent to the subscriber

You can also send POST subscription requests with the required parameters directly to
the hub endpoint (http://localhost:4567/ in our example).
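
A direct subscription request might look roughly like the sketch below. The
parameter names ('hub.callback', 'hub.mode', 'hub.topic', 'hub.verify',
'hub.verify_token', 'hub.secret') come from the PSHB Core 0.2 document and mirror
the form fields listed above. The helper names, URLs and token values are
hypothetical.

```ruby
require 'net/http'
require 'uri'

# Build the form parameters for a (un)subscription request.
# verify is 'sync' or 'async'; verify_token and secret are optional.
def subscription_params(callback:, topic:, mode: 'subscribe', verify: 'sync',
                        verify_token: nil, secret: nil)
  params = {
    'hub.callback' => callback,
    'hub.mode'     => mode,
    'hub.topic'    => topic,
    'hub.verify'   => verify
  }
  params['hub.verify_token'] = verify_token if verify_token
  params['hub.secret'] = secret if secret
  params
end

# POST the request to the hub and return the Net::HTTPResponse.
def send_subscription(hub_url, params)
  Net::HTTP.post_form(URI.parse(hub_url), params)
end

# Example (needs a running hub and a reachable callback; all values are placeholders):
#   params = subscription_params(callback: 'http://example.org/hook',
#                                topic: 'http://example.com/feed.atom',
#                                verify_token: 'my-token', secret: 'my-secret')
#   send_subscription('http://localhost:4567/', params)
```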

The hub implements the callback verification described in the "PubSubHubbub Core 0.2" 
document: the callback must send back the 'hub.challenge' parameter in its response body.
In 'Synchronous' mode nothing will change on the screen after subscription,
because the hub responds with HTTP code 204 "No Content".
In 'Asynchronous' mode the hub responds with HTTP code 202 "Scheduled for verification".

Notifications format
---------------------

The current implementation fetches the Atom feed and simply removes the already known
entries from it, without touching other parts of the feed ('title', 'id', 'author' etc.). 
The feed, now containing only the new entries, is then re-sent to all of the topic's
subscribers. Because of that, the hub cannot process RSS feeds and may have problems with 
feeds that are not well-formed. I'll try to fix this in future releases.
If the subscriber supplied a value for 'hub.secret' in their subscription request, 
the hub will generate an HMAC signature of the payload and include that signature in 
the 'X-Hub-Signature' header of the notification request.
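
A subscriber can check that signature roughly as follows. This sketch assumes the
header value has the form 'sha1=<hex digest>' (the format given in the PSHB
specification) and that the digest is HMAC-SHA1 over the raw request body, keyed
with the shared secret. It uses Ruby's standard OpenSSL library rather than the
ruby-hmac gem listed above, and all function names are hypothetical.

```ruby
require 'openssl'

# Compute the expected header value for a notification body.
def hub_signature(secret, body)
  'sha1=' + OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new('sha1'), secret, body)
end

# Compare two strings without stopping at the first differing byte.
def secure_equal?(a, b)
  return false unless a.bytesize == b.bytesize
  a.bytes.zip(b.bytes).reduce(0) { |acc, (x, y)| acc | (x ^ y) }.zero?
end

# True if the received header value matches the body and secret.
def valid_signature?(secret, body, header_value)
  return false if header_value.nil?
  secure_equal?(hub_signature(secret, body), header_value)
end

# Example:
#   valid_signature?('my-secret', raw_request_body, x_hub_signature_header)
```

The signature must be computed over the exact bytes received, so verify it before
parsing or re-serializing the feed.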
