PulpTunes is an app, coupled with a service, that lets you stream music files from the iTunes library on your desktop to a browser (desktop or mobile) elsewhere on the net.
This repository contains the backend service part, which relays the data between the desktop app and the browser. It must be coupled with some other service for creating the accounts (see the `licenses` table explained below); that service is not provided here.
This is the same service that powers the pulptunes.com site, which offers the app and this service as a paid service.
The desktop app might also be released as open source in the future, if enough interest is shown (it's a JavaFX app that I'd like to rewrite in ScalaFX before doing so...).
Table of Contents:
- Tech Stack
- Architecture Summary
- Deploying and Running
- Future Developments
File streaming can be very resource-intensive. Pulptunes-relay was designed from the ground up to be very easily scalable horizontally.
Whenever the need arises, you just start new instances and register them in the `servers` table (details explained below). Instances expose a set of webservices through which they communicate with each other in a point-to-point fashion.
There are only two single-point-of-failure elements: a load balancer distributing load among all instances, and a single database. The industry already provides many solutions for scaling these services. At pulptunes.com we run multiple pulptunes-relay instances behind a single HAProxy load balancer and a single MySQL database, which have proved enough.
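Registering a new instance is then just a row in the `servers` table. A hedged sketch, assuming the column names described in the database section below:

```sql
-- Hypothetical example: register a third instance so the cluster picks it up.
-- Column names follow the `servers` table description below; check the
-- evolutions scripts for the actual schema.
INSERT INTO servers (id, public_dns, online)
VALUES ('pulp3', 'pulp3.example.org', true);
```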
- This is a Scala application written on top of the Play Framework (v.2.5).
- It uses the Slick (v.3.1) library for interacting with a relational database (MySQL by default).
- It uses Akka actors to handle concurrency.
- It relies on the Cats library for handling interactions with the database and the actors asynchronously, saving us from callback hell.
- It uses Iteratees, a functional abstraction for dealing with streams of data.
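As a taste of the asynchronous style this stack buys us, here's a minimal sketch in plain Scala, with hypothetical lookup functions standing in for the real Slick queries and actor calls, showing two async steps sequenced without nested callbacks:

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object AsyncLookup {
  // Hypothetical stand-ins for a real database query and an actor ask.
  def backendFor(subdomain: String): Future[String] = Future.successful("backend-a")
  def isOnline(backendId: String): Future[Boolean]  = Future.successful(true)

  // A for-comprehension chains the async steps sequentially;
  // no nested callbacks, and a failure in either step short-circuits the chain.
  def routable(subdomain: String): Future[Boolean] =
    for {
      backendId <- backendFor(subdomain)
      online    <- isOnline(backendId)
    } yield online
}
```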
The PulpTunes desktop app first asks you to install a license obtained from pulptunes.com. This determines the subdomain under which your music will be accessed, like john.pulptunes.com. The pulptunes-relay server lets you handle as many subdomains as you like. They must be registered in the `licenses` table as explained below. There is no limit to the number of concurrent streams a subdomain can handle; in practice the limit is set by the bandwidth available to the desktop app's network.
This sequence diagram explains how the desktop app joins the pulptunes-relay cluster, and the lifecycle of a file request. (diagram generated with Mermaid, which is awesome)
Desktop App Connection Establishment
When the desktop app (depicted as Desktop Server) starts, it requests some specific subdomain and establishes a websocket connection (falling back to long-polling if needed) with a pulptunes-relay instance randomly assigned by the load balancer. In this example that instance is identified by `backend_id` and depicted as Backend A in the diagram.
That backend adds an entry into the `serving_listeners` table, mapping the subdomain to its `backend_id`.
Music File Request Lifecycle
When a browser points to the subdomain and asks for a file, the load balancer connects it to a random backend instance, depicted as Backend B in the diagram.
That instance spawns an Enumerator identified by some `stream_id`. Whenever this Enumerator gets fed bytes in the following steps, it forwards those bytes back to the browser.
Given the subdomain the browser used to connect, the backend instance asks the database which `backend_id` that subdomain corresponds to, i.e. which backend the desktop app linked to this subdomain is connected to (Backend A in this case).
Then Backend B directly calls a webservice in that other backend (A), informing it which music file was requested and which `stream_id` is waiting for it.
Backend A sends a message through the websocket connection to the desktop app, asking it to send the music file directly to Backend B, along with the `stream_id`.
The desktop app connects directly to Backend B and sends the requested file, passing that `stream_id` alongside.
Backend B starts feeding the Enumerator as it receives the bytes from the desktop app, and they are forwarded to the browser that initiated the request.
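The per-request plumbing can be pictured as a registry of sinks keyed by `stream_id`. The sketch below is a simplified stand-in for the real Iteratee/Enumerator machinery; all names are illustrative, not the actual API:

```scala
import scala.collection.mutable

// Simplified model of Backend B's bookkeeping: each browser request opens a
// sink under its stream_id; bytes arriving from the desktop app are looked up
// by that id and pushed to the waiting browser.
final class StreamRegistry {
  private val sinks = mutable.Map.empty[String, Array[Byte] => Unit]

  // Called when the browser's request arrives and an Enumerator is spawned.
  def open(streamId: String)(sink: Array[Byte] => Unit): Unit =
    sinks.update(streamId, sink)

  // Called as chunks arrive from the desktop app; unknown ids are ignored.
  def feed(streamId: String, chunk: Array[Byte]): Unit =
    sinks.get(streamId).foreach(_(chunk))

  // Called when the file transfer completes or the browser disconnects.
  def close(streamId: String): Unit =
    sinks.remove(streamId)
}
```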
In that config file we've declared two instances of pulptunes-relay (called `pulp1`, listening on port 9001, and `pulp2`, listening on port 9002), amongst which the load balancer distributes the load in a round-robin fashion.
As explained in the previous section, when the desktop app sends a music file it connects directly to the instance the browser is connected to. In this example that could be either `pulp1.example.org` or `pulp2.example.org`. You should adapt this config file to use your own domain name.
If you're testing locally, make sure you have entries for your subdomains in your /etc/hosts file (that's for Mac and Linux; on Windows it lives under C:\Windows\System32\drivers\etc). In our example it would be something like this:
::1 localhost localhost.localdomain pulp.localhost.localdomain pulp1.localhost.localdomain pulp2.localhost.localdomain
You can upgrade your pulptunes-relay instances without downtime by switching off instances one at a time.
To do so, first set the field `online` to false in the entry corresponding to the instance you want to upgrade in the `servers` table (read below for a more detailed explanation of the database tables). Wait for the current streams on that instance to finish (for this you'll need to rely on your load balancer's monitoring interface). Then kill the instance, upgrade it, launch it again and set
online back to
true. Repeat for all your instances!
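Assuming the `servers` schema described below, the toggle is a one-line update:

```sql
-- Drain pulp1 before upgrading it; new connections go to the other instances.
UPDATE servers SET online = false WHERE id = 'pulp1';
-- ...upgrade and relaunch, then bring it back:
UPDATE servers SET online = true WHERE id = 'pulp1';
```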
By default, the database engine we use is MySQL, and the database name is
pulptunes. You can change this by editing the
slick.dbs.default.db.* config directives located in
conf/application.conf. To use a different db engine make sure you also properly set up its JDBC driver dependency in
build.sbt. The database username and password need to be provided as explained in Deploying and Running.
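For example, switching to PostgreSQL would look roughly like this. This is a sketch only: the exact Slick driver names depend on your Slick/Play-Slick versions, and the PostgreSQL JDBC driver must be added to `build.sbt`:

```
# conf/application.conf — hypothetical PostgreSQL setup
slick.dbs.default.driver = "slick.driver.PostgresDriver$"
slick.dbs.default.db.driver = "org.postgresql.Driver"
slick.dbs.default.db.url = "jdbc:postgresql://localhost/pulptunes"
```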
Upon running for the first time, the following tables will be created (using the evolutions scripts under conf/evolutions). So after starting the first instance for the first time, or after starting an updated instance containing database changes, make sure you hit it directly with a browser so that the evolution scripts get applied.
Most importantly, it contains the list of subdomains that the server will serve. Each subdomain has associated user info such as first name, last name, email and password. This user info is used on the pulptunes.com site to manage accounts; you may ignore it, or integrate it into your own solution for user registration handling. For pulptunes-relay the only thing you need here is to have entries for the subdomains you wish to handle. By default, the system gives you a "pulp" subdomain, attached to a user "John" with an email "firstname.lastname@example.org" and a password "whateva".
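Adding a subdomain of your own is then just another row. A hedged example, with column names assumed to match the user info described above (check the evolutions scripts under conf/evolutions for the real schema):

```sql
-- Hypothetical column names; the evolutions scripts define the actual schema.
INSERT INTO licenses (subdomain, first_name, last_name, email, password)
VALUES ('jane', 'Jane', 'Doe', 'jane.doe@example.org', 'some-password');
```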
This table contains the list of pulptunes-relay instances you have running, amongst which the load will be spread.
- id: can be any string you like; it must match the `pulp.server_id` config directive of the corresponding instance (see Deploying and Running).
- public_dns: network-reachable host name for the instance.
- online: can be either `true` or `false`. Set it to false before bringing down an instance, for example before upgrading it. See above, Staggered Upgrades.
By default this table contains entries for two instances listening on pulp1.localhost.localdomain:9001 and pulp2.localhost.localdomain:9002, only the first being flagged as online.
This table maps each subdomain to a pulptunes-relay instance.
This logs every connection made into the cluster.
Deploying and Running
This is a standard Play application. Please read Play's docs to decide the best deployment/running strategy for you.
These are the config directives you should override in production. You can override them through the command line as explained in Play's docs. Each instance launched should have a different
pulp.server_id directive, and be registered in the table
servers. Also make sure your start script sets a different port for each instance (Play's `http.port` setting):
pulp.server_id="pulp1" pulp.production=true play.evolutions.schema=pulptunes slick.dbs.default.db.url="jdbc:mysql://localhost/pulptunes" slick.dbs.default.db.user="root" slick.dbs.default.db.password=""
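Put together, a start script for the first instance might look something like this (a sketch only; the launcher name depends on how you package the Play app):

```shell
# Hypothetical launcher name produced by packaging (e.g. `sbt dist`);
# adjust the binary name and values to your setup.
./bin/pulptunes-relay \
  -Dhttp.port=9001 \
  -Dpulp.server_id="pulp1" \
  -Dpulp.production=true \
  -Dplay.evolutions.schema=pulptunes \
  -Dslick.dbs.default.db.url="jdbc:mysql://localhost/pulptunes" \
  -Dslick.dbs.default.db.user="root" \
  -Dslick.dbs.default.db.password=""
```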
By pointing your browser to
/stats you'll see a table listing the desktop apps connected to a given instance, as well as the current streams. Note that the HAProxy config we provide blocks all access except for the IP whitelist defined there.
The subdomain_log table contains the history of all these connections.
- Development process: Collective Code Construction Contract (C4). Among other things this means aggressive (as in fast and without any red tape) PR merging policy :)
- Coding standards: we follow the Databricks Scala Guide
Even though I've tried using modern techniques in this code base, this is an old project and the concept itself is pretty dated. If I had to rewrite this today, I'd build it over WebRTC. Nowadays the majority of desktop browsers support that protocol, and so does Chrome for Android. Mobile Safari doesn't yet, but it seems it will soon.
With WebRTC this would be a truly peer-to-peer streamer, more performant and requiring less backend support. In those scenarios where p2p connections fail, one has to provide a relay fallback (TURN), which would replace this pulptunes-relay solution. Since it'd only handle edge cases, that would imply fewer backend resources consumed. There's a standard TURN server, Coturn, that we could use. WebRTC also requires other minor backend support, like a STUN server and a signaling mechanism, which are standard enough to be handled by other 3rd-party servers.