Scanner: Efficient Video Analysis at Scale

Scanner is a system for developing applications that efficiently process large video datasets. Scanner has been used for both video analysis and video synthesis tasks, such as:

  • Labeling and data mining large video collections: Scanner is in use at Stanford University as the compute engine for visual data mining applications that detect faces, commercials, human poses, etc. in datasets as big as 70,000 hours of TV news (12 billion frames, 20 TB) or 600 feature-length movies (106 million frames). We've used Scanner to run these tasks on hundreds of GPUs or thousands of CPUs on Google Compute Engine.
  • VR video synthesis: Scanner is in use at Facebook to scale the Surround 360 VR video stitching software to hundreds of CPUs. This application processes fourteen 2048x2048 input videos to produce 8K omnidirectional stereo video output for VR display.

To learn more about Scanner, see the documentation below, check out the various example applications, or read the SIGGRAPH 2018 Technical Paper: "Scanner: Efficient Video Analysis at Scale".

For easy access to off-the-shelf pipelines like face detection and optical flow built using Scanner, check out our scannertools library.

Key Features

Scanner's key features include:

  • Video processing computations as dataflow graphs: Like many modern ML frameworks, Scanner structures video analysis tasks as dataflow graphs whose nodes produce and consume sequences of per-frame data. Scanner's embodiment of the dataflow model includes operators useful for video processing tasks such as sparse frame sampling (e.g., "frames known to contain a face"), sliding window frame access (e.g., stencils for temporal smoothing), and stateful processing across frames (e.g., tracking).

  • Videos as logical tables: To simplify the management of and access to large numbers of videos, Scanner represents video collections and the pixel-level products of video frame analysis (e.g., flow fields, depth maps, activations) as tables in a data store. Scanner's data store features first-class support for video frame column types to facilitate key performance optimizations, such as storing video in compressed form and providing fast access to sparse lists of video frames.

  • First-class support for GPU acceleration: Since many video processing algorithms benefit from GPU acceleration, Scanner provides first-class support for writing dataflow graph operations that utilize GPU execution. Scanner also leverages specialized GPU hardware for video decoding when available.

  • Fault-tolerant, distributed execution: Scanner applications can be run on the cores of a single machine, on a multi-GPU server, or scaled to hundreds of machines (potentially with heterogeneous numbers of GPUs), without significant source-level change. Scanner also provides fault tolerance, so your applications can not only utilize many machines, but also use cheaper preemptible machines on cloud computing platforms.
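The sparse-sampling and stencil access patterns named above can be illustrated in plain Python, independent of Scanner's API. This is a toy sketch: the frame values and the `stencil` helper below are hypothetical stand-ins, not part of Scanner.

```python
# Stand-ins for a sequence of decoded video frames.
frames = list(range(10))

# Sparse sampling: keep every third frame (the pattern Scanner's
# strided stream sampling expresses in a dataflow graph).
strided = frames[::3]
assert strided == [0, 3, 6, 9]

# Stencil access: each output position sees frames [i-1, i, i+1],
# e.g. for temporal smoothing. Window indices clamp at the edges
# so every window only touches valid frames.
def stencil(seq, offsets=(-1, 0, 1)):
    windows = []
    for i in range(len(seq)):
        window = [seq[min(max(i + o, 0), len(seq) - 1)] for o in offsets]
        windows.append(window)
    return windows

assert stencil(frames)[0] == [0, 0, 1]  # left edge clamps to frame 0
assert stencil(frames)[5] == [4, 5, 6]  # interior window
```

In Scanner, these patterns are expressed declaratively as graph operators rather than Python loops, which is what lets the system schedule decode and compute efficiently.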

What Scanner is not:

Scanner is not a system for implementing new high-performance image and video processing kernels from scratch. However, Scanner can be used to create scalable video processing applications by composing kernels that already exist in popular libraries such as OpenCV, Caffe, and TensorFlow, or that have been implemented in performance-oriented languages like CUDA or Halide. (Yes, you can also write your dataflow graph operations in plain Python or C++!)
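The composition idea can be sketched without Scanner at all: a pipeline is just a chain of per-frame kernels. In the toy example below, plain functions stand in for kernels you would normally borrow from OpenCV or implement in CUDA/Halide; all names here are hypothetical.

```python
def grayscale(frame):
    # Stand-in for an OpenCV color-conversion kernel: average RGB.
    r, g, b = frame
    return (r + g + b) // 3

def threshold(pixel, cutoff=128):
    # Stand-in for a simple C++/CUDA thresholding kernel.
    return 255 if pixel >= cutoff else 0

def run_pipeline(frames, kernels):
    # Apply each kernel, in order, to every frame in the sequence.
    out = frames
    for k in kernels:
        out = [k(f) for f in out]
    return out

frames = [(200, 180, 220), (10, 20, 30)]
result = run_pipeline(frames, [grayscale, threshold])
assert result == [255, 0]
```

Scanner's contribution is not these kernels but the machinery around them: scheduling the composed graph across CPUs/GPUs, decoding video efficiently, and handling failures.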


Documentation

Scanner's documentation is hosted at scanner.run.

Example code

Scanner applications are written using the Python API. Here's an example application that resizes every third frame from a video and then saves the result as an mp4 video (the Quickstart walks through this example in more detail):

from scannerpy import Database, Job

# Ingest a video into the database (create a table with a row per video frame)
db = Database()
db.ingest_videos([('example_table', 'example.mp4')])

# Define a Computation Graph
frame = db.sources.FrameColumn()                                    # Read input frames from database
sampled_frame = db.streams.Stride(input=frame, stride=3)            # Select every third frame
resized = db.ops.Resize(frame=sampled_frame, width=640, height=480) # Resize input frames
output_frame = db.sinks.Column(columns={'frame': resized})          # Save resized frames as new video

# Set parameters of computation graph ops
job = Job(op_args={
    frame: db.table('example_table').column('frame'), # Column to read input frames from
    output_frame: 'resized_example'                   # Table name for computation output
})
# Execute the computation graph and return a handle to the newly produced tables
output_tables = db.run(output=output_frame, jobs=[job], force=True)

# Save the resized video as an mp4 file
output_tables[0].column('frame').save_mp4('resized_example')

If you'd like to see other example applications written with Scanner, check out the Examples directory in this repository.


Contributing

If you'd like to contribute to the development of Scanner, you should first build Scanner from source.

Please submit a pull request rebased against the most recent version of the master branch, and we will review your changes before merging. Thanks for contributing!

Running tests

You can run the full suite of tests by executing make test in the directory you used to build Scanner. This will run both the C++ tests and the end-to-end tests that verify the Python API.


About

Scanner is an active research project, part of a collaboration between Stanford University and Carnegie Mellon University. Please contact Alex Poms and Will Crichton with questions.

Scanner was developed with the support of the NSF (IIS-1539069), the Intel Corporation (through the Intel Science and Technology Center for Visual Cloud Computing and the NSF/Intel VEC program), and by Google.

Paper citation

Scanner was published at SIGGRAPH 2018 as "Scanner: Efficient Video Analysis at Scale" by Poms, Crichton, Hanrahan, and Fatahalian. If you use Scanner in your research, we'd appreciate it if you cite the paper with the following BibTeX entry:

@article{poms2018scanner,
 author = {Poms, Alex and Crichton, Will and Hanrahan, Pat and Fatahalian, Kayvon},
 title = {Scanner: Efficient Video Analysis at Scale},
 journal = {ACM Trans. Graph.},
 issue_date = {August 2018},
 volume = {37},
 number = {4},
 month = jul,
 year = {2018},
 issn = {0730-0301},
 pages = {138:1--138:13},
 articleno = {138},
 numpages = {13},
 url = {http://doi.acm.org/10.1145/3197517.3201394},
 doi = {10.1145/3197517.3201394},
 acmid = {3201394},
 publisher = {ACM},
 address = {New York, NY, USA},
}