
RacketMQ: An implementation of W3C WebSub

This is an implementation of a W3C WebSub Hub in Racket, using the actor-style research language Syndicate.

What is WebSub?

On the 24th of November 2016, the W3C released a Working Draft of WebSub, "an open, simple, web-scale and decentralized pubsub protocol".

See the specification of the W3C WebSub protocol at https://www.w3.org/TR/websub/ (and track its development at https://github.com/w3c/websub).

Quick Start

  1. Install Racket from http://download.racket-lang.org/
  2. Install RacketMQ by running raco pkg install --auto racketmq
  3. Start the server: racketmq --baseurl http://localhost:7827/ --listen localhost 7827

To install from git, replace the raco pkg install ... step above with an invocation of make link from the top directory of your git checkout.

Features

  • Offers both local topics (topics whose canonical hub is this hub) and remote topics (topics whose canonical hub is some other, "upstream", hub).

  • Supports both polling and push notification for remote topics, with a configurable poll interval; this allows hub chaining.

  • Uses HTTP Link headers when retrieving a topic to determine canonical hub and topic URLs; does not extract link elements from any kind of XML or HTML document, nor does it implement .host-meta discovery

  • WebSocket-based subscriptions to WebSub topics, in addition to the usual WebHook-based subscriptions.
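
The Link-header discovery mentioned in the list above can be illustrated with a short Python sketch. This is not RacketMQ's own code (the server is written in Racket); the function name discover_links is hypothetical, and the parsing is deliberately simplified, assuming each link value has the form <URL>; rel="..." with rel as its first parameter.

```python
import re

# Simplified matcher for one link-value: <URL>; rel="..." (assumption: rel is
# the first parameter; other parameters are ignored).
_LINK_RE = re.compile(r'<([^>]*)>\s*;\s*rel\s*=\s*"?([^";,]+)"?')

def discover_links(link_headers):
    """Collect the hub and self URLs from a list of Link header values."""
    found = {"hub": [], "self": []}
    for header in link_headers:
        for url, rels in _LINK_RE.findall(header):
            # A rel parameter may carry several space-separated relation types.
            for rel in rels.split():
                if rel in found:
                    found[rel].append(url)
    return found
```

A topic that advertises no hub link would yield an empty "hub" list here; per the polling feature above, such a topic could still be followed by polling.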

Configuration

The most important RacketMQ configuration variable is its canonical base URL: the URL prefix used to build URLs for clients to use.

When the RacketMQ startup script is given a "-f filename" option, it loads configuration data from the named file. The option can be supplied more than once; all named files are imported.

For a fully-commented example configuration file, see racketmq/defaults.rktd.

Within each file, each configuration entry should be a list (see Racket syntax) whose first item is a symbol (the "key"), followed by zero or more further items. Line comments start with a semicolon (;), as usual for S-expression languages.

Each configuration file is automatically reread by the server when it is changed: if you need to make changes, consider doing so atomically by producing an updated configuration file and using rename(2)/mv(1) to activate it.
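
As an illustration of the atomic-update advice, here is a minimal Python sketch that writes the new configuration to a temporary file and then renames it into place. The function name replace_config_atomically and the file name used in the usage note are hypothetical; the approach assumes a POSIX-like filesystem where a rename within one directory is atomic.

```python
import os
import tempfile

def replace_config_atomically(path, new_text):
    """Write new_text to a temporary file, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so that the final
    # rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(new_text)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)  # performs a rename on POSIX systems
    except BaseException:
        # Remove the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

For example, replace_config_atomically("racketmq.rktd", '(max-upstream-redirects 3)\n') would swap in the new contents in one step, so the server should never observe a half-written file.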

Required configuration data

(canonical-baseurl "http://localhost:7827/")

Exactly one "canonical-baseurl" key, containing a URL string naming the base URL used for constructing URLs that are given out to third parties, such as subscription endpoints for upstream hubs to use.

This is just for URL construction, and does NOT create any HTTP listeners. Those are configured with "http-listener" keys:

(http-listener "localhost" 7827)
;; (http-listener "localhost" 80)
;; (http-listener "www.example.com" 7827)
;;
;; etc.

At least one "http-listener" key is required. These cause an HTTP server to be spun up for each mentioned port number. Traffic will only be accepted for HTTP Host headers mentioned in these keys.

Since these are the only mandatory configuration items, RacketMQ can run without any configuration file at all if the server is started with the --baseurl and --listen command-line arguments:

racketmq --baseurl http://localhost:7827/ --listen localhost 7827

Fine tuning

You will seldom want to alter these settings.

(max-upstream-redirects 5)

When performing discovery or upstream content retrieval, the hub will follow at most this many redirects before giving up.

(default-lease 86400) ;; 86400 seconds = one day
(max-lease 604800) ;; 604800 seconds = one week

If a subscription request arrives with no specified hub.lease_seconds, then default-lease is used. If a requested lease duration exceeds max-lease, then max-lease is used instead.
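
The lease rule can be written down as a small Python sketch. The function name effective_lease is hypothetical, and the defaults simply mirror the example values above; how the hub treats a default-lease larger than max-lease is not stated, so this sketch returns the default unchanged in that case.

```python
def effective_lease(requested, default_lease=86400, max_lease=604800):
    """Return the lease, in seconds, granted to a subscription request.

    requested is the hub.lease_seconds value, or None when it was omitted.
    """
    if requested is None:
        return default_lease
    # Requests longer than max_lease are shortened to max_lease.
    return min(requested, max_lease)
```

Under these assumptions, a request for one hour (3600) is granted as asked, while a request for a month is cut down to one week.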

(min-poll-interval 60) ;; seconds
(default-poll-interval "none") ;; seconds, or "none"

Upstream topics will be polled from time to time, according to the settings of each local subscription to the topic. Subscriptions may supply hub.poll_interval_seconds as either a number or the string "none". If no hub.poll_interval_seconds is supplied in a subscription, default-poll-interval is used. If all subscriptions to an upstream topic have "none" as their poll interval, no polling will occur; otherwise, polling will occur at the fastest requested rate, but never more frequently than every min-poll-interval seconds.
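
A rough Python sketch of how the per-subscription settings might combine into a single polling rate for one upstream topic. The function name effective_poll_interval is hypothetical and the defaults mirror the example values above; None stands for a subscription that supplied no hub.poll_interval_seconds.

```python
def effective_poll_interval(requested, min_poll_interval=60,
                            default_poll_interval="none"):
    """Combine the poll intervals requested by all subscriptions to a topic.

    requested holds one entry per subscription: a number of seconds, the
    string "none", or None when hub.poll_interval_seconds was not supplied.
    Returns the polling interval in seconds, or None for no polling.
    """
    resolved = [default_poll_interval if r is None else r for r in requested]
    numeric = [r for r in resolved if r != "none"]
    if not numeric:
        return None  # every subscription asked for "none"
    # Fastest requested rate, but never faster than min_poll_interval.
    return max(min(numeric), min_poll_interval)
```

For instance, subscriptions asking for 300, 120 and "none" would lead to polling every 120 seconds, while a lone request for 10 seconds would be held to the 60-second minimum.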

(subscription-retry-delay 600) ;; seconds

If subscription to an upstream hub fails immediately, we will schedule a retry in this many seconds.

(max-dead-letters 10)
(max-delivery-retries 10)
(initial-retry-delay 5.0) ;; seconds
(retry-delay-multiplier 1.618)
(max-retry-delay 30) ;; seconds

Subscriptions last until explicitly terminated by an unsubscription request, implicitly terminated by lease expiry, or implicitly terminated by sustained delivery failure.

When the hub sends a content distribution request (see the WebSub spec) to a subscription's callback, if a success response is returned, the delivery is considered successful.

Otherwise, the hub begins an exponential backoff process, with an initial delay of initial-retry-delay seconds, increasing by a factor of retry-delay-multiplier (subject to a cap of max-retry-delay seconds) with each subsequent attempt until max-delivery-retries attempts have been made. At that point, if all attempts to deliver the particular content distribution request have failed, the request is considered a "dead letter" and is effectively discarded. Once a request has either succeeded or become a dead letter, the hub continues with any further pending content distribution requests for the subscription.

If more than max-dead-letters dead letters pile up for a subscription, the subscription is considered too damaged to continue to exist, and is terminated.
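
The backoff schedule described above can be sketched in Python as follows. The function name retry_delays is hypothetical and the defaults mirror the example values; whether the very first delivery attempt counts toward max-delivery-retries is not stated, so this sketch assumes one wait per retry.

```python
def retry_delays(max_delivery_retries=10, initial_retry_delay=5.0,
                 retry_delay_multiplier=1.618, max_retry_delay=30):
    """Return the waits, in seconds, before each successive retry."""
    delays = []
    delay = initial_retry_delay
    for _ in range(max_delivery_retries):
        # Each wait is capped at max_retry_delay.
        delays.append(min(delay, max_retry_delay))
        delay *= retry_delay_multiplier
    return delays
```

With these values the waits start at 5 seconds, grow by a factor of 1.618 each time, and reach the 30-second cap on the fifth wait.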

Hub URL layout

  • /hub — Local subscription management; main Hub URL.

    This is the main URL for creating and deleting subscriptions to (local or remote) topics.

    • method POST: create or delete a subscription, following the specification. Supply hub.mode, hub.topic, hub.callback and other relevant parameters to manage subscriptions.

    • method GET, when an upgrade header with value websocket is present: create a streaming subscription to a topic. See below.

  • /topic/topic — Local topic endpoint.

    A local topic is a topic managed by this hub; the final path segment is the topic's name. Publishers POST their content to the local topic endpoint, and subscribers are notified of the change. Local topics may be created explicitly or implicitly: any subscription to a local topic automatically creates it, even if it has not previously been PUT into existence.

    • method PUT: create a local topic explicitly
    • method DELETE: delete an explicitly-created local topic
    • method HEAD: get headers associated with the most recent topic value
    • method GET: get the most recent topic value
    • method POST: update the topic value with the post body
  • /sub/sub-id — Upstream subscription endpoint.

    When a subscription to a remote topic is created, if the remote topic has an advertised hub, this hub subscribes to the remote hub, and content distribution requests are POSTed to a fresh upstream subscription endpoint URL.

    • method GET: for verification-of-intent requests from upstream.
    • method POST: for content distribution requests from upstream.
  • /path/to/file/in/htdocs — Static resource.

    The racketmq/htdocs subdirectory contains static resources to be served by the hub.

    • method GET: retrieve a static resource.
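
To make the /hub POST endpoint concrete, here is a minimal Python sketch that builds, but does not send, a subscription request using only the standard library. The function name build_subscription_request and the URLs in the usage note are hypothetical; the parameter names are the ones listed above plus hub.lease_seconds, sent form-encoded as the WebSub specification describes.

```python
from urllib.parse import urlencode, urljoin
from urllib.request import Request

def build_subscription_request(base_url, topic_url, callback_url,
                               mode="subscribe", lease_seconds=None):
    """Build (but do not send) a POST request to the hub's /hub endpoint."""
    params = {
        "hub.mode": mode,  # "subscribe" or "unsubscribe"
        "hub.topic": topic_url,
        "hub.callback": callback_url,
    }
    if lease_seconds is not None:
        params["hub.lease_seconds"] = str(lease_seconds)
    body = urlencode(params).encode("ascii")
    return Request(
        urljoin(base_url, "hub"),
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

Calling build_subscription_request("http://localhost:7827/", "http://localhost:7827/topic/MYTOPIC", "http://subscriber.example/callback") yields a request that could then be passed to urllib.request.urlopen; the callback must be reachable by the hub for the verification-of-intent step to succeed.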

Streaming WebSocket-based Subscriptions

In addition to the standard WebHook-based subscriptions, RacketMQ offers WebSocket-based subscriptions.

If your server's base URL is https://example.com/, then connecting a WebSocket to the URL wss://example.com/hub?hub.topic=MYTOPIC will create a streaming subscription to the topic MYTOPIC. (For plain http:, use ws:.)

Content will be delivered from the server as JSON messages of the form

{
  "topic": "MYTOPIC",
  "link": {
    "hub": "https://example.com/hub",
    "self": "https://example.com/topic/MYTOPIC"
  },
  "content-type": "text/plain",
  "content-base64": "..."
}

The link object corresponds to the Link headers that would usually be sent in a WebSub WebHook-based content distribution request, the content-type string to the Content-Type header, and the content-base64 string to the base64-encoded bytes of the body. The topic string is always based on the hub.topic parameter supplied in the URL that the WebSocket was initially connected to.
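
A client receiving these messages might decode them along the following lines. This Python sketch uses only the standard library; the function name decode_message and the shape of its return value are hypothetical, and it assumes every message carries the fields shown in the example above.

```python
import base64
import json

def decode_message(text):
    """Decode one JSON message received on a streaming subscription."""
    message = json.loads(text)
    return {
        "topic": message["topic"],
        "link": message.get("link", {}),
        "content_type": message.get("content-type"),
        # The body arrives base64-encoded; recover the raw bytes.
        "body": base64.b64decode(message["content-base64"]),
    }
```

The body is returned as bytes; whether to decode it further as text depends on the content-type value.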

Conformance

At the time of writing, no official list of conformance criteria exists; however, there is a draft list of Candidate Recommendation implementation criteria at https://github.com/w3c/websub/issues/56.

Codebase Layout

Files at the toplevel of the git checkout:

  • COPYING, gpl.txt, lgpl.txt: Licensing and copyright information
  • info.rkt: Racket package control metadata
  • nginx.conf: Example nginx configuration file, for running RacketMQ behind nginx

In the racketmq/ directory are the sources for the RacketMQ server:

  • hub.rkt: Main entry point for RacketMQ server
  • config.rkt: Actor that tracks changes in config files
  • protocol.rkt: Definitions of protocol structures for coordination among RacketMQ actors
  • hub/: Source code for the main functions of the RacketMQ server
    • hub/static-content.rkt: Actor serving static content from htdocs/
    • hub/subscription.rkt: Actors implementing downstream WebHook-based subscriptions
    • hub/websocket.rkt: Actors implementing downstream WebSocket-based subscriptions
    • hub/topic-demand.rkt: Actor that analyzes a subscription topic URL, deciding whether it represents a local topic or a remote topic.
    • hub/local-topic.rkt: Actor implementing a local RacketMQ topic
    • hub/remote-topic.rkt: Actors implementing a remote RacketMQ topic and WebSub subscribers that relay content from upstream hubs (if any) to downstream subscribers

The racketmq/ directory also contains a few other files of interest:

  • defaults.rktd: Fully-commented RacketMQ configuration file
  • poke.rkt: Simple interactive tool for interacting with RacketMQ
  • run: Daemontools startup script for the server
  • log/run: Daemontools logging startup script for the server
  • htdocs/: Static files to be served by the server
    • htdocs/500.html: Error document used by nginx when it cannot reach RacketMQ
    • htdocs/client.html: Simple interactive tool for experimenting with WebSocket subscriptions in the browser
    • htdocs/client.js: JavaScript code for client.html

Bug Reports

Please report issues using this project's GitHub issues page, https://github.com/tonyg/racketmq/issues.

License

Copyright © 2016 Tony Garnock-Jones <tonyg@leastfixedpoint.com>

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with this program (see the files "lgpl.txt" and "gpl.txt"). If not, see http://www.gnu.org/licenses/.