goggles! or: a separate layer for visualization

gog

  • connector: gogr (for R), gogpy (for Python)
  • server: gogd (in Clojure)
  • visualization: gog-dummy (toy example), gog-charted.co (the charted.co interface), gogi (general scatterplot)

What is gog?

gog separates data processing and data visualization. Everybody wants to have nice interactive visualizations in a browser anyway. gog is a three-piece architecture:

  1. connector from data processing environment to server
  2. gog server to pass data from connector to visualization
  3. browser-based data visualization that accepts data from server

All the pieces can be swapped around and even hosted in different places, allowing quite a few combinations.

1. connector from data processing environment

Analogy with ggplot2: ggplot(data=your_data)

All you need is a function (gog) that HTTP POSTs your data to a gog server. As currently implemented, that means POST to http://localhost:4808/data. Currently, data is passed as a JSON array of simple objects, like [{"var_name": 5, ....

  • gogr: an R package for sending data to a gog server
  • gogpy: a Python package for sending data to a gog server

These are super easy to make in any language with support for JSON and HTTP.
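
As a rough illustration of how little a connector needs, here is a minimal Python sketch using only the standard library. The names `build_request` and `gog` and the `timeout` parameter are made up for this example and are not taken from gogpy or gogr; the only things assumed from the description above are the POST to http://localhost:4808/data and the JSON array of simple objects.

```python
import json
import urllib.request

GOG_URL = "http://localhost:4808/data"  # endpoint described in this README


def build_request(rows, url=GOG_URL):
    """Build (but do not send) an HTTP POST carrying rows as a JSON array.

    rows is expected to be a list of flat dicts, e.g. [{"var_name": 5}].
    """
    body = json.dumps(list(rows)).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def gog(rows, url=GOG_URL, timeout=5):
    """Send rows to a gog server and return the HTTP status code.

    Hypothetical helper: needs a gog server listening at url.
    """
    with urllib.request.urlopen(build_request(rows, url), timeout=timeout) as resp:
        return resp.status
```

With a gog server running locally, `gog([{"x": 1, "y": 2.5}, {"x": 2, "y": 3.1}])` would push two rows to whatever visualizations are listening.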

2. gog server

Analogy with ggplot2: you don't need a server because everything's in R

As currently implemented, a gog server runs on port 4808. That port is also used by the game "Command and Conquer Red Alert" and it is certainly acceptable to use another port.

As currently implemented, a gog server accepts a POST body at /data and rebroadcasts it to all clients listening to the websocket at /data. The server only passes the contents through, as text.

These are super easy to make in any language with support for HTTP and websockets.
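
The pass-through behavior can be sketched without committing to any particular HTTP or websocket library. The `Hub` class below is a hypothetical illustration, not code from gogd: a POST handler for /data would call `publish` with the request body, and each websocket connection at /data would call `subscribe` and forward whatever arrives on its queue. The transport layer is deliberately left out.

```python
import asyncio


class Hub:
    """Fan out each posted body, unchanged, to every connected listener."""

    def __init__(self):
        self._listeners = set()

    def subscribe(self):
        """Register a listener and return the queue it should read from."""
        queue = asyncio.Queue()
        self._listeners.add(queue)
        return queue

    def unsubscribe(self, queue):
        """Stop delivering to a listener, e.g. when its websocket closes."""
        self._listeners.discard(queue)

    def publish(self, body):
        """Pass body through as text; return how many listeners received it."""
        for queue in self._listeners:
            queue.put_nowait(body)
        return len(self._listeners)
```

Note that the hub never parses the body. That matches the description above, where the server only passes the contents through as text.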

3. browser-based data visualization

Analogy with ggplot2: aes(x=variable) + geom_histogram() etc.

or

"Dear internet, please port ggplot from R to Javascript" - Joshua Gourneau, 2011

These are just HTML/CSS/JavaScript, viewed in a browser. They connect via websocket to http://localhost:4808/data and accept incoming JSON arrays of simple objects, like [{"var_name": 5, .... Then they present a data visualization and support some level of interactivity.

It's not super easy to make a good component here, but a component can then be used with any language/environment/system that sends data into gog.

It would be nice to have visualizations that support useful features like exporting to common formats, maintaining a history of recent data sets and visualizations, and switching between common visualization types.

Something like the "graphboard" from Wilkinson's Grammar of Graphics would be nice.

Why is this good?

You should be able to use whatever language you want for data processing and still have all the same visualization tools at your fingertips.

You should be able to visualize interactively—both quickly making new plots and interacting with your current plots—regardless of what machine(s) your data code is running on.

You should be able to have total control over your data and visualization systems, without handing data over to or otherwise relying on external providers.

Why is this bad?

We need more and better browser-based data visualization tools that are gog-compatible, sufficiently flexible, and sufficiently feature-rich.

There are places where the separation between data processing and data visualization is not always clear. Which end of the system is responsible for binning a histogram?
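
To make the histogram question concrete, here is a small Python sketch of the two options. Both produce a JSON-ready array of simple objects, so either could be sent through gog as described above. The function names and the `x_min` and `count` field names are invented for this example; gog does not define any such convention.

```python
from collections import Counter


def raw_payload(values, name="x"):
    """Option A: send every observation and let the visualization bin them."""
    return [{name: v} for v in values]


def binned_payload(values, width, name="x"):
    """Option B: bin on the data side and send one row per non-empty bin.

    Each value falls in the bin starting at floor(v / width) * width.
    """
    counts = Counter((v // width) * width for v in values)
    return [{name + "_min": lo, "count": counts[lo]} for lo in sorted(counts)]
```

Option A keeps the connector simple and lets the user change bin widths interactively in the browser. Option B sends far less data for large datasets but fixes the binning before the visualization ever sees it.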

Ad hoc development and extension of gog could break compatibility between components.

Ideas for extension

  • Gosh a lot of it is just building out cool front-end pieces.
  • Add an additional control channel for interacting with visualizations from programming environments.
  • Develop a format, or adopt an existing one, for representing a visualization so that components can interoperate.
  • Some clever scheme for dynamic port assignments and so on.
  • Would it make sense to implement with web components somehow?
  • Bundle a gog server with some good visualizations and distribute as an easy-to-run package.
  • A web service for sharing visualizations.

Related Things

ggobi

ggobi, successor to XGobi, is a system for multivariate data visualization. It's mostly separate from data processing tools because of its focus on visualization. There is a package (rggobi) for interacting with ggobi from R. So with that, the architecture is something like gog. Unfortunately, ggobi doesn't seem to be actively maintained. And it uses Gtk2. But it does have a lot of neat features that aren't available in many other places. Hadley says that ggvis and tourr will eventually succeed ggobi.

imMens

imMens (read as "immense") is a cool project that has a gog-style split architecture but very tight coupling between the data/server and browser side. It does pre-processing of possibly large datasets, then passes data encoded as PNG graphics to a browser where it is further processed and displayed using clever WebGL. The whole idea is a lot of fun and they say (in the paper) that they're working on making it easier to create imMens visualizations.

R htmlwidgets

htmlwidgets is a very neat project that makes it easy to generate web visualizations from and in R. It's all very R-based, and the functions that get produced can take any sort of input data and arguments. The way they've standardized the approach, however, means it would likely be relatively straightforward to take an htmlwidget-ized visualization and transform it to a gog visualization.

Plotly

Plotly is pretty neat. It's similar to gog but has more requirements for articulating data and plot options in the connectors (specifying traces, etc.) and the server and front-ends come from Plotly's machines. Also Plotly is a business that needs to make money, and their products are not Free or open source.