Scanner: Efficient Video Analysis at Scale

Scanner is a system for developing applications that efficiently process large video datasets. Scanner has been used for both video analysis and video synthesis tasks, such as:

  • Labeling and data mining large video collections: Scanner is in use at Stanford University as the compute engine for visual data mining applications that detect faces, commercials, human poses, etc. in datasets as big as 70,000 hours of TV news (12 billion frames, 20 TB) or 600 feature-length movies (106 million frames). We've used Scanner to run these tasks on hundreds of GPUs or thousands of CPUs on Google Compute Engine.
  • VR video synthesis: Scanner is in use at Facebook to scale the Surround 360 VR video stitching software to hundreds of CPUs. This application processes fourteen 2048x2048 input videos to produce 8K omnidirectional stereo video output for VR display.

To learn more about Scanner, see the documentation below, check out the various example applications, or read the SIGGRAPH 2018 Technical Paper: "Scanner: Efficient Video Analysis at Scale".

For easy access to off-the-shelf pipelines like face detection and optical flow built using Scanner, check out our scannertools library.

Key Features

Scanner's key features include:

  • Video processing computations as dataflow graphs: Like many modern ML frameworks, Scanner structures video analysis tasks as dataflow graphs whose nodes produce and consume sequences of per-frame data. Scanner's embodiment of the dataflow model includes operators useful for video processing tasks such as sparse frame sampling (e.g., "frames known to contain a face"), sliding window frame access (e.g., stencils for temporal smoothing), and stateful processing across frames (e.g., tracking).

  • Videos as logical tables: To simplify the management of and access to large numbers of videos, Scanner represents video collections and the pixel-level products of video frame analysis (e.g., flow fields, depth maps, activations) as tables in a data store. Scanner's data store features first-class support for video frame column types to facilitate key performance optimizations, such as storing video in compressed form and providing fast access to sparse lists of video frames.

  • First-class support for GPU acceleration: Since many video processing algorithms benefit from GPU acceleration, Scanner provides first-class support for writing dataflow graph operations that utilize GPU execution. Scanner also leverages specialized GPU hardware for video decoding when available.

  • Fault-tolerant, distributed execution: Scanner applications can be run on the cores of a single machine, on a multi-GPU server, or scaled to hundreds of machines (potentially with heterogeneous numbers of GPUs), without significant source-level change. Scanner also provides fault tolerance, so your applications can not only utilize many machines, but also use cheaper preemptible machines on cloud computing platforms.
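
To make the three access patterns from the dataflow bullet concrete, here is a minimal plain-Python sketch. It does not use the Scanner API: the helper names (gather, stencil, stateful) are hypothetical, and each "frame" is stood in for by a single brightness value. It only illustrates the semantics of sparse sampling, sliding-window access, and stateful processing over a sequence of per-frame data.

```python
from typing import Callable, Iterable, List, Sequence, Tuple


def gather(frames: Sequence[float], indices: Iterable[int]) -> List[float]:
    """Sparse sampling: keep only the frames at the requested indices."""
    return [frames[i] for i in indices]


def stencil(frames: Sequence[float], offsets: Sequence[int],
            fn: Callable[[List[float]], float]) -> List[float]:
    """Sliding window: apply fn to the frames at i + offset, clamping at the ends."""
    last = len(frames) - 1
    return [
        fn([frames[min(max(i + o, 0), last)] for o in offsets])
        for i in range(len(frames))
    ]


def stateful(frames: Sequence[float],
             step: Callable[[float, float], Tuple[float, float]],
             state: float) -> List[float]:
    """Stateful processing: thread a state value through the frames in order."""
    out = []
    for f in frames:
        state, result = step(state, f)
        out.append(result)
    return out


# Stand-in "frames": one brightness value per frame (illustrative data only).
frames = [0.0, 3.0, 6.0, 9.0, 12.0, 15.0]

# Pretend frames 1 and 4 are the ones known to contain a face.
face_frames = gather(frames, [1, 4])

# Temporal smoothing with a three-frame window centered on each frame.
smoothed = stencil(frames, [-1, 0, 1], lambda window: sum(window) / len(window))

# A running total, as a stand-in for state carried across frames (e.g. a tracker).
running_total = stateful(frames, lambda s, f: (s + f, s + f), 0.0)
```

In a real Scanner graph these roles are played by the system's own sampling, stenciling, and stateful operators; the sketch only shows what each pattern reads and produces.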

What Scanner is not:

Scanner is not a system for implementing new high-performance image and video processing kernels from scratch. However, Scanner can be used to create scalable video processing applications by composing kernels that already exist as part of popular libraries such as OpenCV, Caffe, TensorFlow, etc. or have been implemented in popular performance-oriented languages like CUDA or Halide. Yes, you can write your dataflow graph operations in Python or C++ too!

Documentation

Scanner's documentation is hosted at scanner.run.

Example code

Scanner applications are written using the Python API. Here's an example application that resizes every third frame from a video and then saves the result as an mp4 video (the Quickstart walks through this example in more detail):

from scannerpy import Database, Job

# Ingest a video into the database (create a table with a row per video frame)
db = Database()
db.ingest_videos([('example_table', 'example.mp4')])

# Define a Computation Graph
frame = db.sources.FrameColumn()                                    # Read input frames from database
sampled_frame = db.streams.Stride(input=frame, stride=3)            # Select every third frame
resized = db.ops.Resize(frame=sampled_frame, width=640, height=480) # Resize input frames
output_frame = db.sinks.Column(columns={'frame': resized})          # Save resized frames as new video

# Set parameters of computation graph ops
job = Job(op_args={
    frame: db.table('example_table').column('frame'), # Column to read input frames from
    output_frame: 'resized_example'                   # Table name for computation output
})

# Execute the computation graph and return a handle to the newly produced tables
output_tables = db.run(output=output_frame, jobs=[job], force=True)

# Save the resized video as an mp4 file
output_tables[0].column('frame').save_mp4('resized_video')
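
As a rough guide to what the example produces, the sketch below works out which input frames a stride of 3 would keep and how many rows the output table would then hold. It is plain Python rather than Scanner code, the helper name strided_rows is hypothetical, and it assumes sampling starts at frame 0, which the text above does not state.

```python
import math


def strided_rows(num_frames: int, stride: int) -> list:
    """Indices of the input frames kept when taking every `stride`-th frame.

    Assumption: sampling starts at frame 0.
    """
    return list(range(0, num_frames, stride))


# A hypothetical 10-frame input video sampled with stride=3.
kept = strided_rows(10, 3)

# Under the same assumption, the output holds ceil(num_frames / stride) rows.
expected_rows = math.ceil(10 / 3)
```

Under that assumption, a 10-frame input yields 4 resized frames, taken from input frames 0, 3, 6, and 9.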

If you'd like to see other example applications written with Scanner, check out the Examples directory in this repository.

Contributing

If you'd like to contribute to the development of Scanner, you should first build Scanner from source.

Please submit a pull request rebased against the most recent version of the master branch, and we will review your changes for merging. Thanks for contributing!

Running tests

You can run the full suite of tests by executing make test in the directory you used to build Scanner. This runs both the C++ tests and the end-to-end tests that verify the Python API.

About

Scanner is an active research project, part of a collaboration between Stanford and Carnegie Mellon University. Please contact Alex Poms and Will Crichton with questions.

Scanner was developed with the support of the NSF (IIS-1539069), the Intel Corporation (through the Intel Science and Technology Center for Visual Cloud Computing and the NSF/Intel VEC program), and by Google.

Paper citation

Scanner will appear in the proceedings of SIGGRAPH 2018 as "Scanner: Efficient Video Analysis at Scale" by Poms, Crichton, Hanrahan, and Fatahalian. If you use Scanner in your research, we'd appreciate it if you cite the paper.