Distributed Gradient-Domain Processing of Planar and Spherical Images

LINKS
ToG 2010 Paper
Windows (x64) Executables
Source Code
(Older Versions: Version 4.1 Page Version 4 Page Version 3.5 Page Version 3.11 Page Version 3.1 Page Version 3 Page Version 2 Page Version 1 Page)
License

CODE DESCRIPTION
    This distribution provides an implementation of our distributed multigrid Poisson solver for image stitching, smoothing, and sharpening:
    • Stitching: Given an input image consisting of composited images, and given a mask which assigns the same color ID to pixels coming from the same source image, our code outputs the stitched image whose gradients are a best-fit to the gradients of the composite image, with forced zero gradients across seam boundaries.
    • Smoothing/Sharpening/High-Low Compositing: Given input low- and high-frequency images, a gradient modulation term, and a pixel fidelity term, our code outputs the best-fit image whose gradients match the modulated gradients of the high-frequency input image and whose pixel values match the pixel values of the input low-frequency image. Formally, if F1 is the low-frequency image, F2 is the high-frequency image, α is the fidelity term, and β is the gradient modulation term, the output image G is the image minimizing the sum of square norms:
        E(G) = α‖G − F1‖² + ‖∇G − β∇F2‖²
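The two operations described above can be illustrated in one dimension. The following is a minimal sketch, not the distributed multigrid solver shipped in this distribution: it assumes the energy has the form α‖G − F1‖² + ‖∇G − β∇F2‖², uses forward differences for the gradient, and relaxes with plain Gauss-Seidel. The function names `stitch_1d` and `blend_1d` are hypothetical and do not appear in the code base.

```python
def stitch_1d(composite, labels):
    """Integrate the composite's gradients, zeroing gradients that cross a seam.

    In 1D every gradient field is integrable, so the least-squares fit reduces
    to a running sum. (In 2D a Poisson solve is needed instead.)
    """
    out = [float(composite[0])]
    for i in range(1, len(composite)):
        same_source = labels[i] == labels[i - 1]
        grad = composite[i] - composite[i - 1] if same_source else 0.0
        out.append(out[-1] + grad)
    return out


def blend_1d(low, high, alpha, beta, sweeps=500):
    """Gauss-Seidel relaxation for the minimizer of the ASSUMED energy

        E(G) = alpha * sum_i (G[i] - low[i])^2
             + sum_i ((G[i+1] - G[i]) - beta * (high[i+1] - high[i]))^2
    """
    n = len(low)
    # Modulated target gradients between neighbouring samples.
    d = [beta * (high[i + 1] - high[i]) for i in range(n - 1)]
    g = [float(v) for v in low]  # start from the low-frequency signal
    for _ in range(sweeps):
        for i in range(n):
            rhs, degree = alpha * low[i], 0
            if i > 0:  # left neighbour
                rhs += g[i - 1] + d[i - 1]
                degree += 1
            if i < n - 1:  # right neighbour
                rhs += g[i + 1] - d[i]
                degree += 1
            g[i] = rhs / (alpha + degree)
    return g


# Two "source images" with a brightness offset at the seam.
print(stitch_1d([10, 12, 14, 50, 52, 54], [0, 0, 0, 1, 1, 1]))

# A step edge, smoothed (beta = 0) and sharpened (beta = 2).
step = [0, 0, 0, 0, 10, 10, 10, 10]
print([round(v, 2) for v in blend_1d(step, step, alpha=0.5, beta=0.0)])
print([round(v, 2) for v in blend_1d(step, step, alpha=0.5, beta=2.0)])
```

In this sketch, β = 1 with identical low- and high-frequency inputs returns the input unchanged, β < 1 flattens the step, and β > 1 makes it overshoot on both sides of the edge.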
    Our system is implemented using a client-server model, and our distribution consists of two executables:
    • ServerSocket: This executable acts as the hub of the solver. Clients connect to it, and it establishes all necessary connections and orchestrates the solve. It is also responsible for solving the linear system once multigrid has reduced the problem to a size that can be solved on a single core.
    • ClientSocket: This executable is the client side of the solver, and is responsible for managing the solution of a subset of the image. The clients can be distributed across a network and only require that the subset of the image that they are responsible for is on a locally accessible disk. Since these clients can be run in a multi-threaded fashion, there should be no reason to run multiple clients on the same physical machine.

    NOTE: This version of the code is cross-platform and compilation requires installation of Boost. (The code was developed and tested under Boost version 1.55.0.)
    The code also requires installation of the zlib, png, tiff, and jpg libraries to support image I/O. Source code for these is included for compilation using Visual Studio. The Makefile assumes that the header files can be found in /usr/local/include/ and that the library files can be found in /usr/local/lib/.

EXECUTABLE ARGUMENTS
  • ServerSocket:
    --count <client count>
    This integer specifies the number of clients that will be connecting to the server.
    [--port <server port>]
    This optional integer specifies the port at which the server should listen for client requests. If no port is specified, the server will ask the system to provide one, and will print out the address and port to the command line.
    [--iters <number of Gauss-Seidel iterations>]
    This optional integer specifies the number of Gauss-Seidel iterations that should be performed per pass. (Default value is 5.)
    [--minMGRes <coarsest resolution of the multigrid solver>]
    This optional integer specifies the coarsest resolution at which Gauss-Seidel relaxation should be performed by the multigrid solver. Once the solver reaches this resolution, it solves using a conjugate-gradients solver. (Default value is 64.)
    [--inCoreRes <in-core/out-of-core transition resolution>]
    This optional integer specifies the transition at which the solver should switch from a distributed solver to a serial solver. (Default value is 1024.)
    [--verbose]
    If this optional argument is specified, the solver will output the magnitude of the residuals at the different levels.
    [--progress]
    If this optional argument is specified, the solver will show the progress of the solver through the restriction and prolongation phases.
    [--unknownType <unknown label behavior>]
    This optional integer specifies how pixels with white (255, 255, 255) labels should be treated. A value of 0 indicates that white should be treated as any other label color. A value of 1 indicates that the associated region should be filled in with black. A value of 2 indicates that the associated region should be filled in with a harmonic function. (Default value is 1.)
    [--quality <JPEG compression quality>]
    This optional integer, in the range [0,100], specifies the compression quality that should be used if the output image is in JPEG format. (Default value is 100.)
    [--hdr]
    By default, the output image pixels are represented at 8 bits per pixel. When using either PNG or TIFF output, enabling this flag will output the images at 16 bits per pixel.
    [--gray]
    By default, the image pixels are assumed to be color. If they are gray, enabling this flag will reduce running time and memory usage by only solving for the single channel.
    [--deramp]
    If enabled, ramping artefacts are removed by subtracting off the average gradient.
    [--spherical/--cylindrical]
    If enabled, these flags indicate the image should be treated as either a spherical (TOAST parameterization) or cylindrical image so that the appropriate boundary continuity constraints are met.
    [--lump]
    If enabled, the mass-matrix is lumped to a diagonal.
    [--tileWidth <iGrid tile width>]
    [--tileHeight <iGrid tile height>]
    [--tileExt <iGrid tile extension>]
    When outputting to a tiled grid of images in iGrid format, these parameters specify the width, height, and file-type for the output tiles. The default resolution for the output tiles is 8192x8192 and the default file type is JPG.
    [--iWeight <pixel fidelity term>]
    If the system is solving the Poisson equation to perform image smoothing or sharpening, this value specifies the fidelity term α.
    [--gScale <gradient modulation term>]
    If the system is solving the Poisson equation to perform image smoothing or sharpening, this value specifies the gradient modulation β.
  • ClientSocket:
    --pixels <input composite/high-frequency image>
    This string is the name of the image file containing the image band that this client is responsible for. (Currently supported file-types include PNG, JPEG, BMP, WDP, TIFF, and our tiled image format iGrid.) The image height is unconstrained (though it must be the same across all clients). However, the width for all but the last band should be a multiple of a power of two. (Roughly, the power should be equal to the number of levels over which the solver is parallelized.)
    --lowPixels <input low-frequency image>
    This string is the name of the image file containing the low-frequency image band that this client is responsible for. (Currently supported file-types include PNG, JPEG, BMP, WDP, TIFF, and our tiled image format iGrid.) The image height is unconstrained (though it must be the same across all clients). However, the width for all but the last band should be a multiple of a power of two. (Roughly, the power should be equal to the number of levels over which the solver is parallelized.) If this file is not specified, the argument to --pixels is used for both low- and high-frequency content.
    --labels <input mask image>
    This string is the name of the image file serving as the mask for stitching. (Since the values of the mask are used to determine if adjacent pixels in the composite image come from the same source, the mask should not be compressed using lossy compression. Similarly, in representing the composited pixels, be wary of using JPEG compression. Even at 100% quality, it can blur out the seams between images, so that setting the seam-crossing gradient to zero is no longer sufficient.)
    This parameter is required when performing stitching, as it lets the system know where to set the seam-crossing gradients to zero. However, when performing smoothing or sharpening (i.e. if a value is specified for either of the server parameters --iWeight and --gScale), this parameter is ignored and the system assumes that all the pixels come from a single image.
    --out <output image>
    This string is the name of the image to which the output will be written.
    --address <server address>
    This string specifies the address to which the client should connect in order to establish a connection with the server.
    --port <server port>
    This integer specifies the port to which the client should connect in order to establish a connection with the server.
    [--index <client index>]
    This optional integer specifies the index of the band within the whole image that the client is responsible for. Indexing starts at zero, and if no value is specified, the client is assigned an index of zero.
    [--temp <scratch directory>]
    This optional string specifies the directory to which the temporary I/O streams, used for storing data between the restriction and prolongation phases, are written.
    [--threads <number of threads>]
    This optional integer specifies the number of threads the client should spawn in order to solve its part of the problem. (Default value is 1.)
    [--inCore]
    By default, the executable assumes that the problem is large and uses a solver that streams the system to and from disk. If this argument is specified, the entire system will be solved in-core.
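The width constraint on the --pixels bands (every band except the last should be a multiple of a power of two) can be checked before cutting an image into bands. The following is a minimal sketch of one way to pick band widths; the helper `band_widths` and its rounding policy are assumptions made for illustration and are not part of this distribution.

```python
def band_widths(image_width, client_count, levels):
    """Split image_width into client_count vertical bands.

    Every band except the last gets the same width, a multiple of 2**levels
    chosen close to an even split. The last band takes whatever remains.
    (Hypothetical helper: the rounding policy is an assumption.)
    """
    if client_count == 1:
        return [image_width]
    unit = 2 ** levels
    # Nearest multiple of `unit` to an even split, but at least one unit.
    k = max(1, (image_width // client_count + unit // 2) // unit)
    # Shrink until the last band keeps at least one pixel column.
    while k > 0 and image_width - k * unit * (client_count - 1) <= 0:
        k -= 1
    if k == 0:
        raise ValueError("image too narrow for this many bands at this unit")
    width = k * unit
    return [width] * (client_count - 1) + [image_width - width * (client_count - 1)]


# Widths taken from the sample datasets; the choice of `levels` is illustrative.
print(band_widths(7963, 2, 10))
print(band_widths(16950, 9, 11))
```

With a unit of 2048 and nine bands, the 16,950-pixel-wide image yields eight bands of width 2048 and a narrower final band, since 16,950 − 8 × 2048 = 566.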

USAGE
For testing purposes, two sample datasets are provided.
  • PNC3: This dataset, courtesy of Matt Uyttendaele, consists of a panorama of 7 images resulting in an image of resolution 7,963 x 3,589.
    For distributed processing, the image and the mask are broken up into two roughly uniform-sized bands, shown below.
    [Figure: composite and mask bands for Process 0 and Process 1]
    To obtain the stitched image, we first run the server, letting it know that there will be two processes connecting to it:
    ServerSocket.exe --count 2
    which gives an output of:
    Server Address: 123.456.789.101:11213
    Using the address and the port, we can now start up the clients. We start the first client, specifying the input and output data, the address and port for connecting to the server, the number of threads to use for processing, and the index of the band:
    ClientSocket.exe --pixels pixels.0.png --labels labels.0.png --out out.0.jpeg --address 123.456.789.101 --port 11213 --threads 2 --index 0 --inCore
    Similarly, we start the second client:
    ClientSocket.exe --pixels pixels.1.png --labels labels.1.png --out out.1.jpeg --address 123.456.789.101 --port 11213 --threads 1 --index 1 --inCore
    The resulting stitched image bands are:
    [Figure: output bands for Process 0 and Process 1]
    Alternatively, if we just want to run on a single machine, we set up the server to expect a single connection, and use the iGrid format to merge the data:
    ServerSocket.exe --count 1 --port 12345
    ClientSocket.exe --pixels pixels.iGrid --labels labels.iGrid --out out.jpeg --address 127.0.0.1 --port 12345 --threads 3 --inCore
    The resulting stitched image is:
    [Figure: output for Process 0]
  • Edinburgh: This dataset, courtesy of Brian Curless, consists of a panorama of 25 images resulting in an image of resolution 16,950 x 2,956. For distributed processing, the image and the mask are broken up into 9 blocks of width 2048, shown below.
    [Figure: composite and mask data for Process 0 and Process 1]
    These images are grouped into two iGrid files for distributed processing over two processors, and we stitch the images by first running the server, letting it know that there will be two processes connecting to it:
    ServerSocket.exe --count 2 --port 11213 --tileWidth 5000 --tileHeight 5000 --tileExt jpg
    Then, we run the two clients:
    ClientSocket.exe --address 123.456.789.101 --port 11213 --index 0 --pixels pixels.0.iGrid --labels labels.0.iGrid --out out.0.iGrid
    ClientSocket.exe --address 123.456.789.101 --port 11213 --index 1 --pixels pixels.1.iGrid --labels labels.1.iGrid --out out.1.iGrid
    The resulting stitched image bands are:
    [Figure: output bands for Process 0 and Process 1]

    Using the stitched output, we can now perform gradient-domain sharpening and smoothing by specifying gradient modulation and pixel fidelity terms. To smooth the images, we specify that we would like to set the gradients to zero while preserving the pixel values:

    ServerSocket.exe --count 2 --port 11213 --iWeight 0.005 --gScale 0 --tileWidth 5000 --tileHeight 5000 --tileExt jpg
    ClientSocket.exe --address 123.456.789.101 --port 11213 --index 0 --pixels out.0.iGrid --out smooth.0.iGrid
    ClientSocket.exe --address 123.456.789.101 --port 11213 --index 1 --pixels out.1.iGrid --out smooth.1.iGrid
    and to sharpen the data we amplify the gradients while preserving the pixel values:
    ServerSocket.exe --count 2 --port 11213 --iWeight 0.005 --gScale 2 --tileWidth 5000 --tileHeight 5000 --tileExt jpg
    ClientSocket.exe --address 123.456.789.101 --port 11213 --index 0 --pixels out.0.iGrid --out sharp.0.iGrid
    ClientSocket.exe --address 123.456.789.101 --port 11213 --index 1 --pixels out.1.iGrid --out sharp.1.iGrid
    The smoothed and sharpened image bands are:
    [Figure: smoothed and sharpened output bands for Process 0 and Process 1]
    Note that the fidelity weight, --iWeight 0.005, was chosen so that the smoothing/sharpening would be perceptible in the down-sampled images displayed on this web page. In practice the fidelity weight should be set higher, as its value is (roughly) the reciprocal of the width of the smoothing/sharpening filter, in pixels.
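The server and client invocations above follow one pattern: a single server told how many clients to expect, then one client per band. The following is a minimal sketch that assembles those argument lists using only flags documented in this README. The helper names `server_args` and `client_args` are hypothetical, and the executable names are written without the `.exe` suffix used in the Windows examples.

```python
def server_args(count, port=None, extra=()):
    """Argument list for ServerSocket; `extra` carries optional flags verbatim."""
    args = ["ServerSocket", "--count", str(count)]
    if port is not None:
        args += ["--port", str(port)]
    return args + [str(e) for e in extra]


def client_args(index, address, port, pixels, out,
                labels=None, threads=1, in_core=False):
    """Argument list for one ClientSocket process (one image band)."""
    args = ["ClientSocket", "--pixels", pixels]
    if labels is not None:  # omitted for smoothing/sharpening runs
        args += ["--labels", labels]
    args += ["--out", out, "--address", address, "--port", str(port),
             "--index", str(index), "--threads", str(threads)]
    if in_core:
        args.append("--inCore")
    return args


# Two-band stitching run; the address is the README's placeholder value.
commands = [server_args(2, port=11213)]
for i in range(2):
    commands.append(client_args(i, "123.456.789.101", 11213,
                                f"pixels.{i}.png", f"out.{i}.jpeg",
                                labels=f"labels.{i}.png", in_core=True))
for cmd in commands:
    print(" ".join(cmd))
```

Each list can be handed to `subprocess.Popen` as is. The server should be started first so that it is listening when the clients connect.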

CHANGES
Version 2:
  1. The code has been modified so that --gScale and --iWeight can be applied when performing stitching.
  2. The --gray flag has been added to make processing of gray images more efficient.
  3. The --deramp flag has been added to remove structured ramping artefacts in the image tiles.
  4. The --spherical/--cylindrical flags have been added to support the processing of spherical/cylindrical images.

Version 3:

  1. The code has been re-written to be cross-platform.
  2. The code has been modified by replacing the --noBlackOut flag with the more powerful --unknownType flag to support harmonic fill-ins.

Version 3.1:

  1. The parameter --lowPixels has been added to allow users to specify a low-frequency band that is different from the high-frequency band. This enables blending of frequency content, where the "amount" of low-frequency content preserved increases with larger values of --iWeight.

Version 3.11:

  1. Fixed a memory allocation problem in ImageStream.inl.

Version 3.5:

  1. Removed a deadlock opportunity that could cause the code to lock up right before termination.

Version 4:

  1. General code clean-up.
  2. Improved support for single-channel images.

Version 4.1:

  1. Fixed int to float casting bug.
  2. Fixed HDR I/O bug.

Version 4.5:

  1. Fixed bad memory access when no labels are provided.
  2. Fixed missing scaling when threads are forced to merge.
