Week 3

dmak78 edited this page Feb 10, 2012 · 5 revisions

This week we're going to talk about 3d.

Class Notes


Unlike most of the other technology we'll discuss in this class, 3d scanning has had strong ties to the entertainment industry. In 1972 Ed Catmull (co-founder of Pixar) was already digitizing models and making short films. 3d scanning has been used for entertainment, surveying, plastic surgery, city planning, and practically anywhere you can imagine needing more information than a 2d camera can provide. It's easier to talk about the individual techniques than to try to theorize a global arc for 3d scanning in general, so I'll just mention the recent history.

In 2008, Radiohead released the music video for House of Cards. They even released the data and a visualizer and people made a bunch of remixes. They branded it as being made "without lights or cameras", but this was just misleading marketing. Three kinds of scanning technology were involved in the shoot, and one of them, structured light, relies on cameras and projectors.

The release of the Kinect in 2010 marked the introduction of the first 3d scanner to the consumer market. It was quickly appropriated by hackers after Adafruit offered a bounty. It's worth noting that before the open source solution was provided, a separate team tried to raise $10k before releasing their solution. A month later, PrimeSense released OpenNI, which provided skeleton tracking and a reference implementation for connecting to the Kinect. In mid-2011 Microsoft released their own SDK, and in late 2011 they tried to associate themselves with the hacker movement by launching The Kinect Effect campaign.

Greg has some thoughts on the relationship between PrimeSense and the military.

Active Techniques

Coordinate measurement machines

This is what Ed Catmull was using in the video above. Pixar still uses very expensive machines for this, but you can always build your own out of legos.

Optical triangulation (laser scanning)

Laser line scanning will almost always give the highest-resolution data in any scenario. You can get started with the free version of DAVID Laserscanner.

Using a turntable with a laser line scanner is super popular. Here are four examples.

If you need a line, you can always make one with a wine glass stem. Or you can use a shadow, like they did at SIGGRAPH a few years ago. You don't have to use a turntable; you can move yourself.

Or you can work with submersion instead of projection. It can be quite theatrical.


LIDAR

Flying over the Netherlands in a plane and rendering the massive point cloud.

Newer lidar (from the last few years) can run at high speeds, e.g., SICK scanners or the Velodyne scanners.

A Velodyne was used for the Radiohead video (the outdoor scenes and the party scene). Doing a lidar scan looks something like this.

Can you make your own lidar scanner? Maybe starting with a laser rangefinder?

Lidar is similar to sonar and radar (light detection and ranging, sound navigation and ranging, radio detection and ranging). Lidar uses pulses of light, radar uses pulses of radio waves, and sonar uses pulses of sound.
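All three share the same ranging idea: send a pulse, time the echo, and halve the round trip. Here is a minimal sketch of that arithmetic. The function name and the sample timings are made up for illustration, and the speed of sound in water is only a rough figure.

```python
SPEED_OF_LIGHT = 299_792_458.0  # meters per second, in vacuum
SPEED_OF_SOUND_IN_WATER = 1500.0  # meters per second, rough figure (assumption)


def range_from_round_trip(seconds, wave_speed=SPEED_OF_LIGHT):
    """Distance to a target, given how long a pulse took to go out and come back."""
    # The pulse covers the distance twice, so divide the path length by two.
    return wave_speed * seconds / 2.0


# A lidar or radar echo that comes back after 100 nanoseconds.
print(range_from_round_trip(100e-9), "meters")

# A sonar ping that comes back after 20 milliseconds.
print(range_from_round_trip(0.02, SPEED_OF_SOUND_IN_WATER), "meters")
```

The two examples land at about the same distance even though the timings differ by a factor of roughly 200,000, which is why a DIY lidar needs much faster timing electronics than a DIY sonar.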

Time-of-flight infrared

Wikipedia has a good overview.

The example video is 320x240, with poor fps. Compare to the Kinect at 640x480 and 30 fps. This video comes from the Baumer TZG01.

Structured light

Some of my work: I started out with Gray code subdivision, then moved on to three-phase scanning, and started getting realtime results from the scanner. The process of scanning looks like this. A year before the Kinect, it was running at around 60 fps. I shot and visualized 3d data for a Broken Social Scene video, and I collaborated with three other artists on the Janus Machine.
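For a sense of what three-phase decoding involves, here is a minimal sketch of the standard wrapped-phase formula for one pixel. It assumes three sinusoidal patterns shifted by -120, 0 and +120 degrees. It is not the code from the scanner described above, and the function names and the offset and amplitude defaults are made up for illustration.

```python
import math


def wrapped_phase(i1, i2, i3):
    """Recover the projector phase at one pixel from three intensity samples.

    Assumes the samples come from patterns shifted by -120, 0 and +120 degrees.
    The result is wrapped into (-pi, pi], so it still needs phase unwrapping.
    """
    return math.atan2(math.sqrt(3.0) * (i1 - i3), 2.0 * i2 - i1 - i3)


def simulate_pixel(phase, offset=0.5, amplitude=0.4):
    """Fake the three camera readings for a pixel with a known phase (test helper)."""
    shift = 2.0 * math.pi / 3.0
    return tuple(offset + amplitude * math.cos(phase + k * shift) for k in (-1, 0, 1))


true_phase = 1.25
readings = simulate_pixel(true_phase)
print(readings)
print(wrapped_phase(*readings))  # should come back close to true_phase
```

The offset and amplitude cancel out of the formula, which is part of the appeal: the decoded phase does not depend on ambient light or surface brightness, as long as the pixel is neither saturated nor black.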

The highest-resolution structured light comes from Debevec, whose work is regularly used in Hollywood. See "The Digital Emily Project".

Maybe the lowest-resolution structured light comes from this iPhone app, which is actually more similar to "shape from shading", assuming it calculates surface normals and propagates them.

You can detect edges with multiple flashes (there is a quick explanation in this generally mindblowing lecture from Ramesh Raskar).

If you don't put any light in the scene, you might still be able to use the known lighting somehow (shape from shading).

At the airport


The Kinect is just a structured light system, using an infrared projector.

The video that explained what the Kinect is.

Skeletonization has historically been incredibly difficult. The Kinect makes it almost trivial.

MRI + CT Scans

These are the only ways to get genuine volumetric data. They're both expensive, and CT scans are somewhat dangerous. Maybe it's similar to what a shark sees? Can we recreate that?

If you extrude the human form, captured by physical separation rather than MRI/CT, through space, you get 12:31.

Passive Techniques

There are so many techniques it's hard to list all of them. Using light field cameras, you can make depth guesses. Or just defocus.

Stereo vision and Multiview stereo

Before the Kinect, the Bumblebee2 from Point Grey was your next best bet at $2k. Golan used this for Double-Taker/Snout and you can see it in his debug screen. The Bumblebee drove a Radiohead-style music video in 2006, two years before Radiohead, and a performance that same year.

Stereo is interesting because there are so many different techniques for decoding the data. I think the state of the art, as of a year ago, is based on using passive data for the highest resolution details.
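Whatever matching technique you use, the last step for a rectified two-camera rig is usually the same triangulation: depth is focal length times baseline, divided by disparity. Here is a minimal sketch. The focal length, baseline and disparity values are invented for illustration and are not the specs of any camera mentioned here.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in meters of a point seen by two rectified, parallel cameras.

    disparity_px: horizontal shift of the point between left and right images
    focal_px:     focal length expressed in pixels
    baseline_m:   distance between the two camera centers, in meters
    """
    if disparity_px <= 0:
        # Zero disparity means the point is effectively infinitely far away.
        return float("inf")
    return focal_px * baseline_m / disparity_px


# Hypothetical rig: 800 px focal length, 12 cm baseline.
for disparity in (40, 20, 10, 5):
    print(disparity, "px ->", depth_from_disparity(disparity, 800.0, 0.12), "m")
```

Because depth is inversely proportional to disparity, a one-pixel matching error matters much more for far objects than for near ones, which is one reason stereo rigs lose precision quickly with distance.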

If you have more than two cameras, you can start reconstructing more complex geometry, at all different scales: from an entire city to a single face. MeshLab now has tools to intercept Photosynth data, which you can then export in any format you like. Photosynth is based on a tool called Bundler that is also used by a tool called CMVS (formerly PMVS2) which gives some great scans.

Structure from motion

SfM is very similar in theory to multiview stereo, but it is used in a different market: matchmoving for special effects in movies. You can run SfM in realtime with toolkits like PTAM. Nonrealtime SfM software ranges in price from free to $10k.

Output Techniques

3d on 2d screens

Good depth cues are hard to find. The wiggle is one place to look. Depth of field is another (I've explored this technique). Sometimes giving someone access to a 3d controller helps for understanding the space.

Laser cutting

Olafur Eliasson's Housebook

Jared Tarbell's slices and height maps.

CNC Milling

If you want to make something really big, CNC milling large pieces of foam and stacking them is the easiest way to go.

3d printing and laser etching

For commercial printers, see Shapeways, Ponoko and Crystal Fox. Sophie Kahn is a great reference for 3d printed work.

Reading Code Together

There are a ton of examples for this week that demonstrate a number of different ways of visualizing 3d data. We'll walk through the examples in order. First the basics:

  • CloudExample: visualize a depth image as a point cloud
  • MeshExample: visualize a depth image as a mesh
  • SlicesExample: visualize MRI slices as texture planes
  • VoxelsExample: visualize MRI points in a voxel grid
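To make the first bullet concrete, here is a minimal sketch of turning a depth image into a point cloud with a pinhole camera model. This is not the code from CloudExample. The function name, the tiny 2x2 depth image and the intrinsics (fx, fy, cx, cy) are all made up for illustration, and zero is assumed to mean "no reading".

```python
def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth image (a list of rows) into a list of (x, y, z) points.

    fx, fy: focal lengths in pixels; cx, cy: principal point in pixels.
    Output units match the units of the depth values.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                # Assumption: zero or negative depth marks a missing sample.
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# A tiny 2x2 depth image with two missing pixels.
depth = [[0, 1000],
         [2000, 0]]
for point in depth_to_points(depth, fx=500.0, fy=500.0, cx=0.5, cy=0.5):
    print(point)
```

A mesh version of the same idea keeps the pixel grid and connects neighboring valid points into triangles instead of drawing them as separate dots.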

Then the more advanced:

  • CloudDOFExample: apply a depth of field effect to the point cloud
  • FilmicExample: apply a number of film artifacts to the point cloud
  • LightingExample: how to generate normals and add lights
  • LineArtExample: more advanced toon rendering techniques
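As a rough idea of what "generating normals" means, here is a minimal sketch that computes a unit normal for one triangle with a cross product, then uses it for simple Lambert (diffuse) shading. It is not taken from LightingExample, and the function names are made up for illustration.

```python
import math


def face_normal(a, b, c):
    """Unit normal of triangle (a, b, c), following the right-hand rule."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    length = math.sqrt(sum(x * x for x in n))
    if length == 0.0:
        # Degenerate triangle (zero area): there is no meaningful normal.
        return (0.0, 0.0, 0.0)
    return tuple(x / length for x in n)


def lambert(normal, to_light):
    """Diffuse brightness in [0, 1]; both vectors are assumed to be unit length."""
    return max(0.0, sum(n * l for n, l in zip(normal, to_light)))


normal = face_normal((0, 0, 0), (1, 0, 0), (0, 1, 0))
print(normal)
print(lambert(normal, (0.0, 0.0, 1.0)))   # light directly in front of the face
print(lambert(normal, (0.0, 0.0, -1.0)))  # light behind the face
```

For smooth shading on a mesh built from a depth image, a common approach is to average the normals of the faces that share each vertex.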

And two tools:

  • KinectExportExample: export an RGB+D PNG from a Kinect
  • ModelLoaderExample: load any kind of 3d file using the Assimp library



Find an interesting subject. This could mean:

  1. Finding a data set online (there are LIDAR datasets, elevation datasets, depth images, 3d models...)
  2. Finding something interesting to scan, and using a Kinect to scan it.
  3. Using your own technique to build a mechanism for capturing 3d information.

Once you have your subject, you should do one of two things with it:

  1. Render it as a still image or video, starting with one of the examples from this week, or starting with your own code.
  2. Create a tool for exploring the data in an interactive way.

I'll also offer some alternatives to doing the above (talk to me first):

  1. Build a DIY LIDAR scanner.
  2. Get really good looking SSAO+DOF working in OF 007 with Kinect. Akira has a start.
  3. Write an intuitive 3d optical flow implementation.

Discussion and Link

I will be sending an email with a prompt, and I would like you to respond with your thoughts in a short email (less than 300 words). I would prefer you respond to the list, but you may respond directly to me if you like.

Finally, add a link to your favorite 3d scanning or Kinect project on the wiki.


I recommend Hands Up by Golan Levin for thinking about what has been called "the cactus" or "psi" (or, as PrimeSense says internally, "stick 'em up").