Shane Neph and Scott Kuehn
An efficient implementation of the Maximal Overlap Discrete Wavelet Transform (MODWT). See D. B. Percival and A. T. Walden (2000), Wavelet Methods for Time Series Analysis. Cambridge, England: Cambridge University Press. This is not the usual discrete wavelet transform found in, for example, gsl, but an extended set of algorithms designed to overcome some problems with the usual discrete wavelet transform.
See http://faculty.washington.edu/dbp/PDFFILES/4-Lund-A4.pdf for an overview and comparison to the regular discrete transform.
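To make the difference from the usual transform concrete, here is a minimal Python sketch of the MODWT pyramid algorithm as described by Percival and Walden, using the Haar filter and a periodic boundary. It is an illustration only and is not taken from this library's code; the function name `modwt_haar` is hypothetical. Unlike the decimated transform, every level keeps as many coefficients as the input has samples, and the input length does not need to be a power of two.

```python
def modwt_haar(x, levels):
    """Illustrative MODWT with the Haar filter and a periodic boundary.

    Hypothetical helper for explanation only; not this library's API.
    Returns (wavelet, scaling): `wavelet` is a list with one coefficient
    list per level, `scaling` holds the scaling coefficients of the
    last level.
    """
    n = len(x)
    h = [0.5, -0.5]  # MODWT Haar wavelet filter (DWT filter divided by sqrt(2))
    g = [0.5, 0.5]   # MODWT Haar scaling filter
    v = list(x)      # level-0 scaling coefficients are the series itself
    wavelet = []
    for j in range(1, levels + 1):
        step = 2 ** (j - 1)  # filter taps are spread 2^(j-1) samples apart
        # Circular filtering of the previous level's scaling coefficients.
        w = [sum(h[l] * v[(t - step * l) % n] for l in range(len(h)))
             for t in range(n)]
        v = [sum(g[l] * v[(t - step * l) % n] for l in range(len(g)))
             for t in range(n)]
        wavelet.append(w)
    return wavelet, v


if __name__ == "__main__":
    series = [2.0, 4.0, 1.0, 7.0, 3.0, 5.0, 8.0, 6.0]
    w, v = modwt_haar(series, 3)
    for level, coeffs in enumerate(w, start=1):
        print("wavelet level", level, coeffs)
    print("scaling level 3", v)
```

Each level only needs the previous level's scaling coefficients, so the sketch holds two working arrays of length N at a time in addition to whatever output it keeps.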
make -C src/
bin/modwt --help
doc/ has an html document to open in your browser (same information as shown below)
bin/modwt --help includes all option arguments
The Maximal Overlap Discrete Wavelet Transform (MODWT) library is written to be as efficient in RAM and time requirements as possible, with particular emphasis on RAM. The application uses the library in the most efficient way, allowing us to scale to the whole-genome level.
- Make it fast and memory efficient, with particular emphasis on RAM requirements.
- Build as a generic library API that can work with any number of different data types, such as simple numeric, BED, WIG, etc. A generic API may be used in any number of ways in any number of applications. The application discussed here does NOT utilize the full features of the library API, and is only a single example of how an application may be built from the library components.
- Make computing any type of MODWT wavelet values independent of the level/scale requested in terms of RAM requirements.
- Build a wrapper around the most useful features of the library and expose as a command-line tool
- Use the library in the most efficient ways possible, even if the application itself becomes slightly cumbersome (see Output)
NOTE: modwt --help shows a lot of useful information. It includes all available filters, boundary conditions and more.
modwt
[--boundary <string = periodic>]
[--filter <string = LA8>]
[--help]
[--level <integer = 4>]
[--operation <string = smooth>]
[--prefix <string = "">]
[--to-stdout]
<file-name>
Where
--boundary
- periodic [default]
- reflected
--filter
- haar
- d4, d6, d8, d10, d12, d14, d16, d18, d20 (daubechies)
- la8, la10, la12, la14, la16, la18, la20 (least asymmetric) [la8 by default]
- bl14, bl18, bl20 (best localized)
- c6, c12, c18, c24, c30 (coiflet)
--level
- is the number of levels the program will sweep through [4 by default]
--operation
- all
- details
- mra
- scale (coefficients)
- smooth [default]
- wave (coefficients)
- wave-scale (coefficients)
--prefix
- may be anything you want as a prefix to all output files generated. This may not be used with --to-stdout.
--to-stdout
- only available when --operation is set to smooth or scale
- may not be used with --prefix
Option names are NOT case sensitive
Values passed to --boundary, --filter or --operation are NOT case sensitive
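The option rules above can be restated as a short Python check. This is only a paraphrase of the documented constraints, not the program's own argument parser; `check_options` and its parameters are hypothetical names, and the filter list is left out for brevity.

```python
# Values copied from the option list in this document.
BOUNDARIES = {"periodic", "reflected"}
OPERATIONS = {"all", "details", "mra", "scale", "smooth", "wave", "wave-scale"}


def check_options(boundary="periodic", operation="smooth", prefix="", to_stdout=False):
    """Return a list of rule violations for a hypothetical modwt invocation."""
    problems = []
    # Values are documented as NOT case sensitive, so compare in lower case.
    boundary = boundary.lower()
    operation = operation.lower()
    if boundary not in BOUNDARIES:
        problems.append("unknown --boundary value: " + boundary)
    if operation not in OPERATIONS:
        problems.append("unknown --operation value: " + operation)
    if to_stdout and operation not in {"smooth", "scale"}:
        problems.append("--to-stdout requires --operation smooth or scale")
    if to_stdout and prefix:
        problems.append("--to-stdout may not be used with --prefix")
    return problems


if __name__ == "__main__":
    print(check_options(operation="ALL", to_stdout=True))
```

An empty list means the combination is consistent with the rules listed in this document; the real program may enforce further checks.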
File names produced from the application (not the library) are of the form:
- details.i : i = 1..level
- scaling-coefficients.level
- smoothing.level
- wavelet-coefficients.i : i = 1..level
Any --prefix specified by the end user precedes each name shown above.
All of these files are produced only when --operation is set to all; other operations produce a subset.
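As a sketch of the naming scheme above, the following Python function lists the file names one would expect when --operation is set to all. It assumes the prefix is joined directly to each name with no separator, which this document does not state explicitly; `expected_output_files` is a hypothetical helper, not part of the application.

```python
def expected_output_files(level, prefix=""):
    """List the output names described above for --operation all.

    Assumption: the prefix is concatenated directly in front of each name.
    """
    names = []
    names += ["details.%d" % i for i in range(1, level + 1)]
    names.append("scaling-coefficients.%d" % level)
    names.append("smoothing.%d" % level)
    names += ["wavelet-coefficients.%d" % i for i in range(1, level + 1)]
    return [prefix + name for name in names]


if __name__ == "__main__":
    for name in expected_output_files(4, prefix="chr1-"):
        print(name)
```

With the default level of 4, this yields ten names: four details files, one scaling-coefficients file, one smoothing file, and four wavelet-coefficients files.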
- Only MODWT and related items are available from the library right now. See D. B. Percival and A. T. Walden (2000), Wavelet Methods for Time Series Analysis. Cambridge, England: Cambridge University Press.
- We did not expose the capability to feed files back into the program to recalculate the original series. The library does have this capability.
- Files are written to the current working directory (cwd) when neither --to-stdout nor --prefix is used.