Parallel k-D Tree Construction
C++ C Python
Latest commit 6fb58d6 Mar 13, 2012 Choi gcc 4.4.x fix

README

# Copyright (c) 2010 University of Illinois
# All rights reserved.
#
# Developed by:           DeNovo group, Graphis@Illinois
#                         University of Illinois
#                         http://denovo.cs.illinois.edu
#                         http://graphics.cs.illinois.edu
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the
# "Software"), to deal with the Software without restriction, including
# without limitation the rights to use, copy, modify, merge, publish,
# distribute, sublicense, and/or sell copies of the Software, and to
# permit persons to whom the Software is furnished to do so, subject to
# the following conditions:
#
#  * Redistributions of source code must retain the above copyright
#    notice, this list of conditions and the following disclaimers.
#
#  * Redistributions in binary form must reproduce the above
#    copyright notice, this list of conditions and the following disclaimers
#    in the documentation and/or other materials provided with the
#    distribution.
#
#  * Neither the names of DeNovo group, Graphics@Illinois, 
#    University of Illinois, nor the names of its contributors may be used to 
#    endorse or promote products derived from this Software without specific 
#    prior written permission.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
# IN NO EVENT SHALL THE CONTRIBUTORS OR COPYRIGHT HOLDERS BE LIABLE FOR
# ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
# SOFTWARE OR THE USE OR OTHER DEALINGS WITH THE SOFTWARE.

                                 ParKD README
                                 ============

Date: 2010-09-15 16:05:44 CDT



Table of Contents
=================
1 Introduction
2 How to...
    2.1 Setup
    2.2 Build
    2.3 Running
    2.4 Regression Tests
    2.5 Benchmarking
3 make Target Index
4 Acknowledgements
    4.1 Sponsors
    4.2 Credits


1 Introduction
==============
  The k-D tree is a well-studied acceleration data structure for ray
  tracing. It is used to organize primitives in a scene to allow efficient
  execution of intersection operations between rays and the primitives. The
  highest quality k-D tree can be obtained using greedy cost optimization based
  on a surface area heuristc (SAH). While the high quality enables very fast
  ray tracing times, a key drawback is that the k-D tree construction time
  remains prohibitively expensive. This cost is unreasonable for rendering
  dynamic scenes for future visual computing applications on emerging multicore
  systems. Much work has therefore been focused on faster parallel k-D tree
  construction performance at the expense of approximating or ignoring SAH
  computation, which produces k-D trees that degrade rendering time. Here, we
  present new, faster multicore algorithms for building precise SAH-optimized
  k-D trees.

  This package, ParKD, contains implementations of all of SAH k-D tree
  construction algorithms discussed in the paper published at High-Performance
  Graphics 2010 [1]. All the implementations are written in C++ using
  Intel Threading Building Blocks (tm).

    Algorithm                                          Directory                
   --------------------------------------------------+-------------------------
    1. Serial                                          Accel-serial             
    2. "Nested" parallel                               Accel-nested             
    3. "In-place" parallel                             Accel-inplace-AoS        
    4. "In-place" parallel (AoS-to-SoA [2])   Accel-inplace-SoA        

2 How to...
===========

2.1 Setup
---------

* TBB configuration
  ParKD uses Intel Threading Building Blocks (tm) extensively. Please modify
  the Makefile.tbb_version file according to your TBB installation.
  
* External utilities
  ParKD uses a number of external utilities. These are defined at the top of
  Makefile.common.
  

2.2 Build
---------
   Simply invoking "make" builds all the available implementations (Accel-*
   directories). For every Accel-# directory, an executable named "parkd-#" is
   created. The executables can be built individually as well: "make parkd-#".

   debug binaries can be built by simply attaching "-debug" to the target
   name, such as "all-debug", "parkd-inplace-SoA-debug", and "packer-debug".

   By default, Makefile/s suppress excessive messages. To view the actual
   commands executed by make, define VERBOSE variable.

   > make VERBOSE=1

* packer
  We found that loading the mesh files can be very time-consuming, due mostly
  to their size. We include six input models (in Tests/Models/) that
  consists of a few thousand triangles to over a million triangles.
  
                          Mesh         # of triangles  
                         ------------+----------------
                          teapot.obj             2256  
                          bunny.obj             69451  
                          fairy.obj            171949  
                          angel.obj            474048  
                          dragon.obj           871414  
                          happy.obj           1087716  
  
  Loading bigger models take considerable amount of time, by the use of
  ASCII-based .obj file format. "packer" is a custom utility that converts
  the .obj input file to a binary representation that can be readily loaded
  by the parkd binaries. This can cut down the loading time by up to 75%,
  which significantly speeds up the benchmarking process. The "packed" .obj
  files are given .blob suffix.
  

2.3 Running
-----------
   The simplest way to run a parkd binary is to simply specify the input mesh.

   > ./parkd-inplace-SoA Tests/Models/teapot.blob

   The parkd binary can accept both plain .obj and .blob file types.

   For help:

   > ./parkd-inplace-SoA -h

* Output
  
  Here is an example output. Note that this is printed on stderr stream.
  
    -------------------------------------------------------------------------
    ParKD - In-place (SoA) version                                                  
                                                                                
        Input mesh : Tests/Models/teapot.blob                                   
           Threads : 1                                                          
          MaxDepth : 8                                                          
                                                                                
              TIMING INFORMATION (in CPU ticks)                                 
                                                                                
    Start time                        :    57996741994251831                        
    Mesh load finish time             :              +876186                        
    Build start time                  :    57996741995128053                        
    Build finish time                 :            +21368025                        
                                                                                
     In-place (SoA) TIMING INFORMATION (in CPU ticks)                           
                                                                                
    Start time                        :    57996741995128926                        
    Init (CreateEdges)                :             +2021634                        
    Init (parallel_sort)              :             +3198726                        
    Init (SetupTriangles)             :              +141876                        
    Init (AoS->SoA)                   :              +764937                        
    Init (Total)                      :             +6127173                        
                                                                                
    Build start time                  :    57996742001256225                        
    FindBestPlane time (Avg)          :              1679724                        
    NewGen time (Avg)                 :                22308                        
    ClassifyTriangles time (Avg)      :               148776                        
    Fill time                         :               381951                        
    Total build time                  :             21348666                        
    Total build time (w/o Init)       :             15221367       
    -------------------------------------------------------------------------
  
  
  First, the header portion includes identifying marks to indicate the
  implementation, the input used, the number of hardware threads used, and
  finally the depth of the tree constructed.
  
  Second portion, staring with the line "TIMING INFORMATION" shows the
  implementation-independent timing statistics. Note that the numbers
  starting with "+" indicates the time relative to the previous line. For
  instance, in this invocation, the input mesh was loaded in 876186 CPU
  ticks.
  
    Title                   Description                        
   -----------------------+-----------------------------------
    Start time              At the beginning of execution      
    Mesh load finish time   Right after the input processing   
    Build start time        At the implementation entry point  
    Build finish time       At the implementation exit point   
  
  (Build finish time) - (Build start time) represents the run time of the
  implementation function perceived from outside.
  
  The last portion contains the implementation-dependent timing
  statistics. More detailed information on each item can be found in later
  sections of this document.
  

2.4 Regression Tests
--------------------
   "make check" works as expected, invoking internal self-tests. Models used by
   the self-tests can be configured by changing the "TEST_MODELS" variable
   found in Makefile.common.

2.5 Benchmarking
----------------
   Currently, the x86-specific "rdtsc" instruction is used to measure
   timing. rdtsc instruction returns an unsigned 64-bit integer that represents
   the number of CPU ticks since boot.

   The number of hardware threads used for benchmarking purposes can be
   configured by changing the "BENCH_THREADS" variable found in
   Makefile.common.

   The input meshes used can also be configured using the "MODELS" variable
   found in Makefile.common.

* TODO Quick run (benchmark) - overall scalability analysis
  Under development.
  
* Detailed timing (benchmark-csv) - csv output
  Gathering more detailed, per-phase timing information can be achieved by
  "make benchmark-csv". For each implementation, this runs the parkd binary
  using varied number of threads ("BENCH_THREADS") for all the specified
  inputs. The result is a set of .csv files, one per model, that contains a
  detailed, per-phase timing information, a line per invocation (threads).
  

3 make Target Index
===================
 -----------------+------------------------------------------------
  all, all-debug    Build all targets                               
                    parkd-* + packer                                
 -----------------+------------------------------------------------
  parkd-*(-debug)   Build the parkd executable using the            
                    implementation found in Accel-* directory       
  packer(-debug)    Build packer executable                         
  dev(-debug)       Used to build dev- targets                      
 -----------------+------------------------------------------------
  benchmark         Run all benchmarks                              
  benchmark-*       Run benchmark using Accel-* only                
  benchmark-csv     Create .csv output for all implementations      
  benchmark-csv-*   Create .csv output using Accel-* only           
 -----------------+------------------------------------------------
  check             Run regression tests for all implementations    
  check-*           Run regression tests for Accel-* only           
 -----------------+------------------------------------------------
  clean             Clean all the executables and .o files          
  clean-results     Clean up regression test and benchmark results  
  clean-models      Clean up uncompressed and packed input files    
  clean-all         clean + clean-results + clean-models            

4 Acknowledgements
==================

4.1 Sponsors
------------
   This work was funded by the Universal Parallel Computing Research Center at
   the University of Illinois at Urbana-Champaign
   [http://upcrc.illinois.edu]. The Center is sponsored by Intel Corporation and
   Microsoft Corporation.

4.2 Credits
-----------

This software was developed by Byn Choi, Rakesh Komuravelli, Victor Lu, Hyojin
Sung, and Robert L. Bocchino, who are members of DeNovo group and Graphics
group at University of Illinois, Urbana-Champaign.

[1] Byn Choi, Rakesh Komuravelli, Victor Lu, Hyojin Sung, Robert
L. Bocchino, Sarita V. Adve, John C. Hart "Parallel SAH k-D Tree Construction"
Proceedings of High-Performance Graphics (HPG), June, 2010

[2] "Array-of-Structs to Struct-of-Arrays" optimization