an experiment with Windows Azure and the Instagram Realtime API
tree: d61f87a2c8

backend
frontend
.gitignore
LICENSE.md
README.md
ServiceConfiguration.Cloud.cscfg
ServiceConfiguration.Local.cscfg
ServiceDefinition.csdef
architecture.png
cli.png
deploymentSettings.json
waz-logo.png

README.md

WAZSTAGRAM


Wazstagram is a fun experiment with node.js on Windows Azure and the Instagram Realtime API. The project uses various services in Windows Azure to create a scalable window into Instagram traffic across multiple cities.


How does it work?

The application is written in node.js, using cloud services in Windows Azure. A scalable set of backend nodes receives messages from the Instagram Realtime API and sends them to the front end nodes using Windows Azure Service Bus. The front end nodes run node.js with express and socket.io.

WAZSTAGRAM Architecture

Websites, Virtual Machines, and Cloud Services, oh my.

One of the first things you need to grok when using Windows Azure is the different options you have for your runtimes. Windows Azure supports three distinct models, which can be mixed and matched depending on what you're trying to accomplish:

Websites

Websites in Windows Azure follow a traditional PaaS model, comparable to something like Heroku or AppHarbor. They work with node.js, ASP.NET, and PHP. There is a free tier. You can use git to deploy, and they offer various scaling options. For an example of a real time node.js site that works well in the Website model, check out my TwitterMap example. I chose not to use Websites for this project because a.) websockets are currently not supported in our Website model, and b.) I want to be able to scale my back end processes independently of the front end processes. If you don't have crazy enterprise architecture or scaling needs, Websites work great.

Virtual Machines

The Virtual Machine story in Windows Azure is pretty consistent with IaaS offerings in other clouds. You stand up a VM, you install an OS you like (yes, we support Linux), and you take on the management of the host. This didn't sound like a lot of fun to me because I can't be trusted to install patches on my OS or do other maintainency things.

Cloud Services

Cloud Services in Windows Azure are kind of a different animal. They provide a full Virtual Machine that is stateless - that means you never know when the VM is going to go away, and a new one will appear in its place. It's interesting because it means you have to architect your app to not depend on stateful system resources pretty much from the start. It's great for new apps that you're writing to be scalable. The best part is that the OS is patched automagically, so there's no OS maintenance. I chose this model because a.) we have some large scale needs, b.) we want separation of concerns with our worker nodes and web nodes, and c.) I can't be bothered to maintain my own VMs.

Getting Started

After picking your runtime model, the next thing you'll need is some tools. Before we move ahead, you'll need to sign up for an account. Next, get the command line tools. Windows Azure is a little different because we support two types of command line tools:

  • PowerShell Cmdlets: these are great if you're on Windows and dig the PowerShell thing.
  • X-Platform CLI: this tool is interesting because it's written in node, and is available as a node module. You can actually just npm install -g azure-cli and start using this right away. It looks awesome, though I wish they had kept the flames that were in the first version.

X-Plat CLI

For this project, I chose to use the PowerShell cmdlets. I went down this path because the Cloud Services stuff is not currently supported by the X-Platform CLI (I'm hoping this changes). If you're on MacOS and want to use Cloud Services, you should check out git-azure. To bootstrap the project, I pretty much followed the 'Build a Node.js Chat Application with Socket.IO on a Windows Azure Cloud Service' tutorial. This will get all of your scaffolding set up.

Why not WebMatrix?

After using the PowerShell cmdlets to scaffold my site, I used Microsoft WebMatrix to do the majority of the work. I am very biased towards WebMatrix, as I helped build the node.js experience in it last year. In a nutshell, it's rad because it has a lot of good editors, and just works. Oh, and it has IntelliSense for everything:

I <3 WebMatrix

The Backend

The backend part of this project is a worker role that accepts HTTP POST messages from the Instagram API. The idea is that their API batches messages, and sends them to an endpoint you define. Here are some details on how their API works. I chose to use express to build out the backend routes, because it's convenient. There are a few pieces to the backend that are interesting:

  1. Use nconf to store secrets. Look at the .gitignore. If you're going to build a site like this, you are going to need to store a few secrets. The backend includes things like the Instagram API key, my Windows Azure Storage account key, and my Service Bus keys. I create a keys.json file to store this, though you could add it to the environment. I include an example of this file with the project. DO NOT CHECK THIS FILE INTO GITHUB! Seriously, don't do that. Also, pay close attention to my .gitignore file. You don't want to check in any *.cspkg or *.csx files, as they contain archived versions of your site that are generated while running the emulator and deploying. Those archives contain your keys.json file. That having been said - nconf does make it really easy to read stuff from your config:
// read in keys and secrets
var nconf = require('nconf');
nconf.argv().env().file('keys.json');
var sbNamespace = nconf.get('AZURE_SERVICEBUS_NAMESPACE');
var sbKey = nconf.get('AZURE_SERVICEBUS_ACCESS_KEY');
var stName = nconf.get('AZURE_STORAGE_NAME');
var stKey = nconf.get('AZURE_STORAGE_KEY');
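A missing or misnamed entry in keys.json only shows up later as an undefined value. As a minimal sketch (findMissingKeys is a hypothetical helper, not part of the project), you could check the four settings read above at startup. The helper takes any lookup function, so with nconf you would pass nconf.get.bind(nconf):

```javascript
// Hypothetical helper, not part of wazstagram: returns the names of any
// settings that the lookup function cannot resolve to a non-empty value.
function findMissingKeys(lookup, names) {
    return names.filter(function (name) {
        var value = lookup(name);
        return value === undefined || value === null || value === '';
    });
}

// The four settings read in the snippet above.
var requiredKeys = [
    'AZURE_SERVICEBUS_NAMESPACE',
    'AZURE_SERVICEBUS_ACCESS_KEY',
    'AZURE_STORAGE_NAME',
    'AZURE_STORAGE_KEY'
];

// Stand-in for nconf.get so the sketch runs on its own; in the backend you
// would pass nconf.get.bind(nconf) instead. The values are made up.
var fakeConfig = { AZURE_STORAGE_NAME: 'mystorage', AZURE_STORAGE_KEY: '' };
var missing = findMissingKeys(function (name) {
    return fakeConfig[name];
}, requiredKeys);

if (missing.length > 0) {
    console.error('missing settings: ' + missing.join(', '));
}
```

Failing fast like this is a design choice: it trades a slightly noisier startup for not having to debug an undefined account key deep inside a Service Bus call.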
  2. Use winston and winston-skywriter for logging. The cloud presents some challenges at times, like how to get console output when something goes wrong. Every node.js project I start these days, I just use winston from the get go. It's awesome because it lets you pick where your console output and logging gets stored. I like to just pipe the output to console at dev time, and write to Table Storage in production. Here's how you set it up:
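var winston = require('winston');
require('winston-skywriter'); // registers the Skywriter transport used below

// set up a single instance of a winston logger, writing to azure table storage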
var logger = new (winston.Logger)({
    transports: [
        new (winston.transports.Console)(),
        new (winston.transports.Skywriter)({ 
            account: stName,
            key: stKey,
            partition: require('os').hostname() + ':' + process.pid
        })
    ]
});

logger.info('Started wazstagram backend');
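The logger above always writes to both the console and Table Storage. If you want console output only at dev time and Table Storage only in production, one option is to pick the transports from the environment. This is a sketch under assumptions: buildTransports is a hypothetical helper, and keying off NODE_ENV is an assumption rather than something the project is known to do.

```javascript
var os = require('os');

// Hypothetical helper: picks winston transports for the given environment.
// `winston` is passed in so the sketch has no hard dependency; in the backend
// it would be require('winston') with winston-skywriter already loaded.
function buildTransports(winston, env, stName, stKey) {
    if (env === 'production') {
        // production: write log entries to Table Storage only
        return [
            new (winston.transports.Skywriter)({
                account: stName,
                key: stKey,
                partition: os.hostname() + ':' + process.pid
            })
        ];
    }
    // anything else is treated as dev time: console only
    return [new (winston.transports.Console)()];
}

// Possible usage in the backend (assumes winston, stName and stKey exist):
// var logger = new (winston.Logger)({
//     transports: buildTransports(winston, process.env.NODE_ENV, stName, stKey)
// });
```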
  3. Use Service Bus - it's pub/sub (+) a basket of kittens. Service Bus is Windows Azure's Swiss Army knife of messaging. I usually use it in the places where I would otherwise use the PubSub features of Redis. It does all kinds of neat things like PubSub, Durable Queues, and more recently Notification Hubs. I use the topic subscription model to create a single channel for messages. Each worker node publishes messages to a single topic. Each web node creates a subscription to that topic, and polls for messages. There's great support for Service Bus in the Windows Azure Node.js SDK.

To get the basic implementation set up, just follow the Service Bus Node.js guide. The interesting part of my use of Service Bus is the subscription clean up. Each new front end node that connects to the topic creates its own subscription. As we scale out and add a new front end node, it creates another subscription. This is a durable object in Service Bus that hangs around after the connection from one end goes away (this is a feature). To make sure you don't leave random subscriptions lying around, you need to do a little cleanup:

function cleanUpSubscriptions() {
    logger.info('cleaning up subscriptions...');
    serviceBusService.listSubscriptions(topicName, function (error, subs, response) {
        if (!error) {
            logger.info('found ' + subs.length + ' subscriptions');
            for (var i = 0; i < subs.length; i++) {
                // if there are more than 100 messages on the subscription, assume the edge node is down 
                if (subs[i].MessageCount > 100) {
                    logger.info('deleting subscription ' + subs[i].SubscriptionName);
                    serviceBusService.deleteSubscription(topicName, subs[i].SubscriptionName, function (error, response) {
                        if (error) {
                            logger.error('error deleting subscription', error);
                        }
                    });
                }                
            }
        } else {
            logger.error('error getting topic subscriptions', error);
        }
        setTimeout(cleanUpSubscriptions, 60000);
    });
}
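For context, the other side of this, where each front end node creates its own subscription and polls it, might look like the sketch below. It assumes the createSubscription and receiveSubscriptionMessage calls from the Windows Azure Node.js SDK. The startPolling helper, the subscription name, the 5 second receive timeout, and the retry delay are all hypothetical. The service object is passed in so the function can be exercised with a stub.

```javascript
// Hypothetical helper: creates a subscription on the topic, then polls it.
// serviceBusService is expected to be the object returned by
// azure.createServiceBusService(sbNamespace, sbKey).
function startPolling(serviceBusService, topicName, subscriptionName, onMessage, retryMs) {
    var stopped = false;

    function poll() {
        if (stopped) { return; }
        serviceBusService.receiveSubscriptionMessage(topicName, subscriptionName,
            { timeoutIntervalInS: 5 }, function (error, message) {
                // as far as I know, the SDK also reports "no message waiting"
                // as an error, so errors are not logged here
                if (!error && message) {
                    onMessage(message.body);
                }
                // schedule the next poll instead of recursing directly,
                // backing off a little when nothing was received
                setTimeout(poll, error ? retryMs : 0);
            });
    }

    serviceBusService.createSubscription(topicName, subscriptionName, function (error) {
        if (error) {
            console.error('error creating subscription', error);
            return;
        }
        poll();
    });

    // call the returned function to stop polling (for example on shutdown)
    return function stop() { stopped = true; };
}
```

Using something like the hostname plus process id as the subscription name keeps each front end node on its own subscription, which is what makes the message-count cleanup above necessary in the first place.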
  4. The NewImage endpoint. All of the stuff above is great, but it doesn't cover what happens when the Instagram API actually hits our endpoint. The route that accepts this request gets metadata for each image, and pushes it through the Service Bus topic:
serviceBusService.sendTopicMessage('wazages', message, function (error) {
    if (error) {
        logger.error('error sending message to topic!', error);
    } else {
        logger.info('message sent!');
    }
});
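To put that call in context, here is a rough sketch of what the handlers behind such an endpoint could look like. It assumes express-style req and res objects with an already parsed JSON body, and the hub.challenge handshake that the Instagram real-time API uses to verify a callback URL. The names verifySubscription, handleNewImage, fetchMetadata, and publish are hypothetical; publish would wrap the sendTopicMessage call above.

```javascript
// GET handler: Instagram verifies the callback URL by sending hub.challenge,
// which has to be echoed back as the response body.
function verifySubscription(req, res) {
    res.send(req.query['hub.challenge']);
}

// POST handler factory. fetchMetadata(update, callback) and
// publish(message, callback) are hypothetical hooks supplied by the caller.
function handleNewImage(fetchMetadata, publish) {
    return function (req, res) {
        // the API batches updates, so the body is expected to be an array
        var updates = Array.isArray(req.body) ? req.body : [];
        updates.forEach(function (update) {
            fetchMetadata(update, function (error, metadata) {
                if (error) {
                    console.error('error getting image metadata', error);
                    return;
                }
                // Service Bus message bodies are strings, so serialize first
                publish({ body: JSON.stringify(metadata) }, function (err) {
                    if (err) {
                        console.error('error sending message to topic!', err);
                    }
                });
            });
        });
        // acknowledge the batch; any async work above finishes on its own
        res.send('ok');
    };
}
```

Wiring this up would be a matter of registering both handlers on the same path with app.get and app.post; the actual path and the metadata lookup depend on how the Instagram subscription was created, so they are left out here.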

Questions?

If you have any questions, feel free to submit an issue here, or find me @JustinBeckwith
