Skip to content
This repository

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Massively Distributed Browser-based Javascript Map Reduce Framework. node.js + socket.io (websockets) + webworkers + fun = global warming

branch: master

Fetching latest commit…

Octocat-spinner-32-eaf2f5

Cannot retrieve the latest commit at this time

Octocat-spinner-32 public
Octocat-spinner-32 test
Octocat-spinner-32 views
Octocat-spinner-32 .gitignore
Octocat-spinner-32 README.textile
Octocat-spinner-32 app.js
README.textile

Map Crowd Reduce

This project is a “SETI-at-home-like infrastructure for massively distributed cpu-intensive jobs based on HTML5 WebWorkers and node.js for distributing the tasks”. (quote from Pedro Teixeira’s blog).

This project made into the final SAPO Codebits 2010 programming contest, but unfortunately didn’t win :)

The code was built in an afternoon with the help of some friends. It was my first real-world JavaScript code, so there’s a lot to cleanup and improve :)

Quick Overview

The project consists of a node.js server that created a HTTP server on port 3000.

To start a new computation, you must open the main page, and supply 3 or 4 things:

A payload (optional)

You can supply a file payload that will be chuncked by the next function.

A split/segmenter JS function


  function s(input) {
    // optionally read the input and generate a list of segments to be mapped

    return [ ... ];
  }

This function will run on the server side, and will generate a list of jobs to map.

A map JS function


  function map(args) {
    // do heavy computation with args

    return result;
  }

This function will run on client side. Please generate as many as you want :)

A reduce JS function


  function reduce(arg) {
    // arg is [result, result, result, result]

    // reduce them :)
    return final_result;
  }

This function will run on the client side.

After you define the functions, the system gives you an URL that you can share with anyone wanting to contribute with CPU to the computation.

The client side functions are run on a WebWorker so it doesn’t block the main UI thread.

WebSockets are used for maintaining communication between server and clients.

Requirements / installation

On the client you need a modern browser (latest version of Firefox, Chrome and Webkit should work).

On the sever you will need latest node.js stable and npm, and the following node modules:

  • express
  • connect-form
  • uuid
  • haml
  • socket.io

Examples

With the code there are 2 prewritten examples:

  • find the number of primes between 0 and 60000 (brute force)

this uses a brute-force method of testing if each number is a prime, from 0 to 60000. it runs very fast and can be used to debug.

  • find a MD5 collision with the MD5 on all words with 5 or less alfa words

this lasts 15 minutes on a single machine. when I tested it on the main stage, 50 people connected their browsers at it lasted like 20 seconds (#win).

To run the examples you just open the root path of the web server and click on the button example. It will fill all the JS boxes with the prewritten code.

Limitations

The main limitation is how we handle and store the input and results. Right now, when you upload the payload, node.js loads it into memory to pass them to the s function.

Also, if your s function generates billion of jobs, the things could not work because all the data and arguments are kept in memory.

This also means that right now, if you close the server, you loose the current computation.

Security Considerations

There are two main security problems at the moment. First of all, the s function is run on the server side without any security check. I admit that I have no idea on if and how I could do that on node.js. Any hints (and patches) are appreciated :)

The second security problem is a feature :) At the end of the computation, the result of the reduce function is sent to all connected clients and injected in the DOM. This means that the reduce function can generate cool stuff like canvas with a beautiful fractal, but it also means that a malicious user can inject some nasty HTML on the client side.

Further work

As this was built as a quick hack, and by people with no experience on JS other that quick hacks on the browser, the code has a lot to improve. I’m specially looking into better modularization/encapsulation of the code.

As I’m writing this, some friends of mine are looking to build other cool examples, like building a distributed Mandelbrot fractal!

Also, I’m trying to get my feet wet with WebGL, because I think this could be improved by making some heavily number crunching on the client GPU :)

Something went wrong with that request. Please try again.