
Guide for Developers

wattenberg edited this page Sep 14, 2010 · 18 revisions

The TimeFlow Analytical Timeline is a work in progress, and we’d love to have you help it grow. This page gives an overview of the code, along with some possible places to dive in. Whether you’re just curious about the TimeFlow inner workings, want to customize for a new use, or are a student looking for a long-term project, read on!

If you just want a few tips before reading the code, the top things to know are:


  • The main data structure for a timeline database is the timeflow.data.db.ActDB, a simple in-memory database.

  • The many UI components communicate via the timeflow.model.TFModel, which holds the current data, selections, and visual mappings and fires events when they change.

  • timeflow.views is the package with the entry points for the main data views.

  • timeflow.app.* contains the entry point for the application: TimeflowAppLauncher.
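
As a rough illustration of the "shared model that fires events" idea, here is a minimal sketch. Every name in it (ModelSketch, Listener, setColorField) is hypothetical and chosen for this example; the real TFModel API may look quite different.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a shared model; not the real TFModel API.
final class ModelSketch {
    /** Callback invoked after some part of the model changes. */
    interface Listener {
        void modelChanged(String what);
    }

    private final List<Listener> listeners = new ArrayList<>();
    private String colorField; // name of the field currently mapped to color

    void addListener(Listener listener) {
        listeners.add(listener);
    }

    String getColorField() {
        return colorField;
    }

    /** Updates the visual mapping, then notifies every registered listener. */
    void setColorField(String fieldName) {
        this.colorField = fieldName;
        fire("colorField");
    }

    private void fire(String what) {
        // Iterate over a copy so listeners may register others during a callback.
        for (Listener listener : new ArrayList<>(listeners)) {
            listener.modelChanged(what);
        }
    }
}
```

A view would register itself once with addListener and repaint whenever modelChanged is called, so UI components never need to know about each other.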

Compiling and building

The application was developed as an Eclipse project, and we’ve included the appropriate Eclipse .project and .settings material so that Eclipse users can import it directly. But because this is all Java and currently uses no third-party libraries, you could also compile easily with other tools or even from the command line. If you do use Eclipse, and want to contribute back to the main codebase, please set your project settings not to autoformat: this keeps text changes to a minimum.

If you’re interested in editing images, the images_original directory contains layered Photoshop files with text preserved.

Code Structure

TimeFlow is a Java desktop application, using the Swing UI framework. Creating a desktop program sidesteps the issues of security and reliability that surround hosted applications, and allows us to take advantage of Java’s fast processing speed and flexible graphics.

The program itself is divided into three main parts: data handling, visualization, and the main application UI code. If you’re the kind of person who likes to dive headfirst into the code, go for it! Otherwise, here’s a roadmap.

Data

timeflow.data.time
As you might expect, a timeline program needs to deal with dates and times at many points. We’ve steered clear of complex third-party libraries, and have created a few minimal classes for handling time data explicitly. Internally, we represent Date/Time objects in two ways: sometimes as long-valued timestamps, and sometimes as a RoughTime object that contains a timestamp plus a TimeUnit object that functions as an indicator of precision. This package also contains an Interval object representing a span between two long-valued timestamps.
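
To make the two representations concrete, here is a minimal sketch of a timestamp paired with a precision marker, plus a span between two timestamps. The class, enum, and method names are invented for this example and are not the actual timeflow.data.time API.

```java
// Hypothetical names throughout; TimeFlow's real classes may differ.
enum SketchUnit { DAY, MONTH, YEAR }

/** A timestamp plus an indicator of how precisely it is known. */
final class SketchRoughTime {
    final long timestamp;       // milliseconds since the epoch
    final SketchUnit precision; // e.g. YEAR means "sometime in this year"

    SketchRoughTime(long timestamp, SketchUnit precision) {
        this.timestamp = timestamp;
        this.precision = precision;
    }
}

/** A closed span between two long-valued timestamps. */
final class SketchInterval {
    final long start;
    final long end;

    SketchInterval(long start, long end) {
        if (end < start) {
            throw new IllegalArgumentException("end before start");
        }
        this.start = start;
        this.end = end;
    }

    boolean contains(long t) {
        return t >= start && t <= end;
    }

    boolean intersects(SketchInterval other) {
        return start <= other.end && other.start <= end;
    }

    long length() {
        return end - start;
    }
}
```

Keeping the precision next to the timestamp lets display code decide whether to show "1969" or "July 20, 1969" without guessing from the raw long value.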

Some additional time-related classes can be found in the timeflow.vis.timeline package; these handle axis-creation and are designed to find pleasing round units to demarcate a given interval.

timeflow.data.db
The data that backs “acts” on a timeline is represented using a simple in-memory database, an ActDB that contains individual records called Acts. An ActDB has a schema that includes Fields that can be of arbitrary types.

In addition, each ActDB has a source field and a description field, to note its provenance. It also allows “aliasing,” where a Field with one name can be referred to by a different one. This may seem odd, but it turns out to be extremely convenient in setting up the visualization: we can simply alias special Strings to mean the fields that define color, size, and other visual attributes.
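
The aliasing idea can be sketched in a few lines. This is an illustration only: AliasSchemaSketch, its methods, and the "COLOR" alias string are assumptions made for the example, not the real ActDB API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical schema with aliases; not the real ActDB API.
final class AliasSchemaSketch {
    private final Map<String, String> fieldTypes = new HashMap<>(); // field name -> type label
    private final Map<String, String> aliases = new HashMap<>();    // alias -> real field name

    void addField(String name, String type) {
        fieldTypes.put(name, type);
    }

    /** Makes `alias` refer to an existing field. */
    void setAlias(String alias, String fieldName) {
        if (!fieldTypes.containsKey(fieldName)) {
            throw new IllegalArgumentException("Unknown field: " + fieldName);
        }
        aliases.put(alias, fieldName);
    }

    /** Resolves a field name or alias to the real field name, or null if unknown. */
    String resolve(String nameOrAlias) {
        String target = aliases.getOrDefault(nameOrAlias, nameOrAlias);
        return fieldTypes.containsKey(target) ? target : null;
    }
}
```

With this arrangement a view can always ask for the "COLOR" field, and changing which data field drives color is a single setAlias call.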

timeflow.data.db.filter
Selecting subsets of Acts is handled by ActFilter subclasses that either “accept” or “reject” individual Acts. This package contains an unsurprising set of filters, for selecting items by particular field values, by regex matching, etc.
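
Here is a minimal sketch of the accept/reject pattern. To stay self-contained it models a record as a plain map from field name to String value; the class names are hypothetical and the real ActFilter signatures may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical names; a record is modeled as a field-name -> value map.
abstract class RecordFilterSketch {
    abstract boolean accept(Map<String, String> record);

    /** Returns the records this filter accepts, in their original order. */
    List<Map<String, String>> select(List<Map<String, String>> records) {
        List<Map<String, String>> kept = new ArrayList<>();
        for (Map<String, String> record : records) {
            if (accept(record)) {
                kept.add(record);
            }
        }
        return kept;
    }
}

/** Accepts records whose field equals a given value exactly. */
final class ValueFilterSketch extends RecordFilterSketch {
    private final String field;
    private final String value;

    ValueFilterSketch(String field, String value) {
        this.field = field;
        this.value = value;
    }

    @Override
    boolean accept(Map<String, String> record) {
        return value.equals(record.get(field));
    }
}

/** Accepts records whose field contains a match for a regular expression. */
final class RegexFilterSketch extends RecordFilterSketch {
    private final String field;
    private final Pattern pattern;

    RegexFilterSketch(String field, String regex) {
        this.field = field;
        this.pattern = Pattern.compile(regex);
    }

    @Override
    boolean accept(Map<String, String> record) {
        String text = record.get(field);
        return text != null && pattern.matcher(text).find();
    }
}
```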

timeflow.data.analysis
TimeFlow offers a small set of “analysis” operators in its Summary view—quick statistics involving number and frequencies of values, ranges, etc. These statistics are defined in this package. These have a very straightforward API, so extending to include new statistics would be a good first entry-point into the code.
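
To give a feel for how small such a statistic can be, here is a sketch of a value-frequency count. FrequencySketch is a hypothetical name, and the sketch works on a plain list of Strings rather than on the package's real interfaces.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; not part of timeflow.data.analysis.
final class FrequencySketch {
    /** Counts how often each non-null value occurs, with keys in sorted order. */
    static Map<String, Integer> count(List<String> values) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String value : values) {
            if (value == null) {
                continue; // missing values are skipped, not counted
            }
            counts.merge(value, 1, Integer::sum);
        }
        return counts;
    }
}
```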

timeflow.data.format.field
This package handles reading and writing the values of individual fields. The main subtlety is that we need to do a lot of guessing. We want to be able to import data from sources where types may not be specified, so we need to guess the type of a value given only its String representation. Sometimes this is straightforward: for example, a string beginning “http://” is probably a URL. But in other cases it’s tricky. Is “90210” really a number?

To encapsulate the code for this messiness, there is an abstract FieldFormat class that contains stubs for parsing, guessing, and formatting data. Then we have subclasses corresponding to Strings, String arrays, Date/Times, numbers, and URLs.
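
The shape of this design can be sketched as follows. All names here are hypothetical, and the guessing rule (pick the first format that matches every value in a column) is an assumption made for the example, not a description of what FieldFormat actually does.

```java
import java.util.List;

// Hypothetical base class with the three responsibilities: guess, parse, format.
abstract class FormatSketch {
    abstract boolean looksLike(String text);
    abstract Object parse(String text);
    abstract String format(Object value);
}

final class NumberFormatSketch extends FormatSketch {
    @Override boolean looksLike(String text) {
        try {
            Double.parseDouble(text.trim());
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }
    @Override Object parse(String text) { return Double.valueOf(text.trim()); }
    @Override String format(Object value) { return String.valueOf(value); }
}

final class UrlFormatSketch extends FormatSketch {
    @Override boolean looksLike(String text) {
        String t = text.trim();
        return t.startsWith("http://") || t.startsWith("https://");
    }
    @Override Object parse(String text) { return text.trim(); }
    @Override String format(Object value) { return String.valueOf(value); }
}

final class TextFormatSketch extends FormatSketch {
    @Override boolean looksLike(String text) { return true; } // anything is text
    @Override Object parse(String text) { return text; }
    @Override String format(Object value) { return String.valueOf(value); }
}

final class TypeGuesserSketch {
    /** Picks the first specific format that matches every value, else plain text. */
    static FormatSketch guess(List<String> values) {
        FormatSketch fallback = new TextFormatSketch();
        if (values.isEmpty()) {
            return fallback;
        }
        FormatSketch[] candidates = { new UrlFormatSketch(), new NumberFormatSketch() };
        for (FormatSketch candidate : candidates) {
            boolean allMatch = true;
            for (String value : values) {
                if (!candidate.looksLike(value)) {
                    allMatch = false;
                    break;
                }
            }
            if (allMatch) {
                return candidate;
            }
        }
        return fallback;
    }
}
```

Note that this naive rule happily classifies a column of ZIP codes such as "90210" as numbers, which is exactly the kind of case that makes guessing tricky.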

timeflow.data.format.file
Code for importing and exporting timelines resides in this package. There is a native “TimeFlow” format, which consists of some metadata (field names and types, etc.) followed by tab-delimited text. There are also routines for reading tab/comma-separated values (TSV, CSV) and the MIT SIMILE timeline format. Some of the parsing code is a bit complex because it turns out that our key use case for TSV is cutting and pasting from Excel, and Excel’s output TSV can actually be ambiguous in situations involving quoted text with line breaks.
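
To show why line-by-line splitting is not enough, here is a minimal sketch of a TSV reader that keeps line breaks inside quoted cells. It assumes the common convention that a cell beginning with a double quote runs until the matching closing quote, with "" standing for a literal quote. TsvSketch is a hypothetical name, and the sketch does not attempt to resolve the genuinely ambiguous cases mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader; not the parser in timeflow.data.format.file.
final class TsvSketch {
    static List<List<String>> parse(String text) {
        List<List<String>> rows = new ArrayList<>();
        List<String> row = new ArrayList<>();
        StringBuilder cell = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (inQuotes) {
                if (c != '"') {
                    cell.append(c); // tabs and line breaks are kept verbatim
                } else if (i + 1 < text.length() && text.charAt(i + 1) == '"') {
                    cell.append('"'); // "" inside quotes is a literal quote
                    i++;
                } else {
                    inQuotes = false; // closing quote
                }
            } else if (c == '"' && cell.length() == 0) {
                inQuotes = true; // a quote only opens at the start of a cell
            } else if (c == '\t') {
                row.add(cell.toString());
                cell.setLength(0);
            } else if (c == '\n' || c == '\r') {
                if (c == '\r' && i + 1 < text.length() && text.charAt(i + 1) == '\n') {
                    i++; // treat \r\n as one line break
                }
                row.add(cell.toString());
                cell.setLength(0);
                rows.add(row);
                row = new ArrayList<>();
            } else {
                cell.append(c);
            }
        }
        // Flush a final row that has no trailing line break.
        if (cell.length() > 0 || !row.isEmpty()) {
            row.add(cell.toString());
            rows.add(row);
        }
        return rows;
    }
}
```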

Visualizations

timeflow.vis
We don’t have anything like a general visualization framework—the timeline and calendar view are created very much “by hand”—but this package contains some very simple classes for handling mouseovers, etc. It also contains some timeline-specific code, including VisualAct and its subclasses, which represent objects to be rendered onscreen. The main complexity here lies in the fact that in the Calendar we sometimes want to dynamically aggregate items that are close together onscreen. The code to handle this turns out to be subtle.

timeflow.views contains the UI for the various views. The BarChartView is an excellent starting place to see how these work.

timeflow.vis.calendar
Classes for rendering the calendar visualization; the meat of the code is in the Grid class, and much of the work is delegated to objects in the timeflow.vis package.

timeflow.vis.timeline
The timeline code itself has to handle a few special tasks:


  • Finding and drawing pleasing axis increments, handled by AxisTicMarks and AxisRenderer

  • Navigating the timeline, handled by the TimeSlider UI component

  • Dividing items into “tracks” and laying them out appropriately. TimelineTrack does the layout. It offers several different layout methods, including the “loose” layout which, bizarrely, requires the use of the Golden Ratio and the Euclidean algorithm.
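
As an illustration of the first task, here is a minimal sketch of choosing a round step for a time axis. It is not how AxisTicMarks works: the class name, the candidate list, and the "at most maxTicks" rule are assumptions for the example. It also handles only fixed-length units, measured in UTC milliseconds; months and years need calendar arithmetic.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tick chooser; not the AxisTicMarks implementation.
final class TickSketch {
    static final long SECOND = 1000L;
    static final long MINUTE = 60 * SECOND;
    static final long HOUR = 60 * MINUTE;
    static final long DAY = 24 * HOUR;

    // Candidate steps in milliseconds, in ascending order.
    private static final long[] STEPS = {
        SECOND, 5 * SECOND, 15 * SECOND,
        MINUTE, 5 * MINUTE, 15 * MINUTE,
        HOUR, 6 * HOUR, DAY
    };

    /** Returns the smallest candidate step that yields at most maxTicks ticks. */
    static long chooseStep(long start, long end, int maxTicks) {
        long span = end - start;
        for (long step : STEPS) {
            if (span / step < maxTicks) {
                return step;
            }
        }
        return STEPS[STEPS.length - 1]; // very long spans fall back to days
    }

    /** Lists every multiple of step that falls within [start, end]. */
    static List<Long> ticks(long start, long end, long step) {
        List<Long> result = new ArrayList<>();
        long first = -Math.floorDiv(-start, step) * step; // round start up to a multiple
        for (long t = first; t <= end; t += step) {
            result.add(t);
        }
        return result;
    }
}
```

For a two-hour window and a limit of ten ticks, this picks 15-minute steps, since 5-minute steps would produce too many marks.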

The Application

timeflow.app holds the main application code itself. The app is launched in the TimeflowAppLauncher, but the JFrame that holds the whole thing is TimeflowApp. (This indirection seemed to be necessary to get correct menu bar behavior on Macs.)

timeflow.app.ui has a collection of various Swing UI components for controls, editing, etc.

timeflow.app.ui.filter has UI components specifically related to filtering by date, number, and category.

timeflow.app.actions has Swing Actions that relate to various menu items in the main application. Not every action is implemented as a Java Action, however; we found that keeping the file-related actions in the main application actually simplified the code a lot.

Entry points for contributors

We’d be delighted if you’d like to help us with this project! As you play with the application and look through code, you’ll likely have your own ideas of things that could be improved. But here are a few ways we can think of to extend the code.

Input/Output

It would be wonderful to have better ways of acquiring data and exporting it. Wikipedia has many timelines of data… the format’s not completely standard, but writing a flexible parser would instantly give access to a huge amount of information.

It would be terrific to allow users to export timelines as images, PDF/EPS, or as SIMILE interactive timelines. Implementing these features may not be too difficult. Possible approaches are:


  • Images The current rendering implementations of the timeline and calendar were specifically designed to decouple drawing from UI components. It might be relatively simple to render to an offscreen image and save it.

  • EPS/PDF We could not find a good PDF library compatible with the BSD license, but if you find one—or use one with a different license for your own use—this would likely be very similar to the case of saving to a JPEG or TIFF.

  • SIMILE The MIT SIMILE project gives an excellent way to show timelines on the web. Their format is quite nice, and writing to it is probably not hard. The most difficult thing might be deciding how to let people customize their export.
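
For the image option, the standard Java approach is to paint into a BufferedImage and write it with ImageIO. The sketch below shows that technique. The Painter interface is a hypothetical stand-in for whatever code draws a TimeFlow view, and nothing here is taken from the existing codebase.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical exporter; Painter stands in for a view's drawing code.
final class ImageExportSketch {
    interface Painter {
        void paint(Graphics2D g, int width, int height);
    }

    /** Renders the painter into an offscreen image and saves it as a PNG file. */
    static void exportPng(Painter painter, int width, int height, File out) throws IOException {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_ANTIALIASING,
                               RenderingHints.VALUE_ANTIALIAS_ON);
            g.setColor(Color.WHITE); // new images start out black, so fill a background
            g.fillRect(0, 0, width, height);
            painter.paint(g, width, height);
        } finally {
            g.dispose(); // release the graphics context even if painting fails
        }
        if (!ImageIO.write(image, "png", out)) {
            throw new IOException("No PNG writer available");
        }
    }
}
```

Because the drawing code only sees a Graphics2D, the same painter can serve both the onscreen component and the export, which is the benefit of keeping rendering decoupled from the UI.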

More visualizations

As we programmed this project, we realized that it was fairly easy to write non-time-based views. See the Bar Chart, for example. If you’re interested in writing Java visualization code, the data and selection infrastructure of TimeFlow may be quite useful. (In fact, conceivably it could point the way to a more general open-source visualization application; currently there are many open source toolkits but fewer actual applications.)

More statistics

From adding outlier/error detection to statistical modeling, TimeFlow could benefit from more analytics. This would be a good place to start in the code; the timeflow.data.analysis package, for instance, is very self-contained.