PyCASP aims to provide a single software environment for productive, efficient, portable and scalable application development. PyCASP is a collection of specializers (mini-compilers) that automatically map computations onto parallel processors (NVIDIA GPUs, Intel multicore CPUs and clusters). PyCASP targets audio content analysis applications (speech and music processing, for example); however, the specializers can be used for other applications (at your own risk, of course). PyCASP's specializers are built on top of the ASP framework (https://github.com/shoaibkamil/asp).
AUTHORS: Katya Gonina, Henry Cook, Shoaib Kamil
DISCLAIMER: This is research code and a work in progress; use at your own risk!
We have documented known Issues and Things to Note that may be helpful if you encounter strange behavior/bugs.
Contact egonina (at) eecs (dot) berkeley (dot) edu with questions.
For details, see E. Gonina's PhD thesis.
Installing the framework
Simply check the code out of the repo, and in the base directory run
$> python setup.py install --user
or
$> sudo python setup.py install
The package managers should fetch ASP and all its attendant dependencies and install all of them on your machine. If you have trouble with this step, consult these directions for manual installation of ASP. You can also get ASP pre-installed on a VM Image.
However, there are some external requirements that Pythonic package managers cannot take care of on your behalf, specifically the compilers required to actually build the specialized code.
If you want to use the CUDA backend for GPUs, you must install NVIDIA's compiler (nvcc), runtime and driver, and have at least one GPU card. The compiler must be on your $PATH, and the runtime libraries must be on your $LD_LIBRARY_PATH. We recommend a CUDA toolkit release later than 3.0 (especially 4.1), but the specializer should work with cards of compute capability as low as 1.2.
If you want to use the Cilk+ backend for multicores, you must install Intel's compiler (icc), libraries, and the Cilk+ runtime. The compiler must be on your $PATH, and the runtime libraries must be on your $LD_LIBRARY_PATH. We recommend the 12.0.5 release of Cilk+.
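The $PATH and $LD_LIBRARY_PATH requirements above can be checked before attempting to specialize anything. The following is a minimal sketch, not part of PyCASP: the helper name `check_backend` is hypothetical, and it only reports whether a compiler is discoverable and which library directories are listed. It does not verify that the CUDA or Cilk+ runtime libraries are actually present in those directories.

```python
import os
import shutil


def check_backend(compiler, env=None):
    """Report where a backend compiler is found and what LD_LIBRARY_PATH lists.

    `check_backend` is a hypothetical helper for illustration; it is not
    part of PyCASP or ASP.
    """
    env = os.environ if env is None else env
    # shutil.which searches the given PATH string for an executable.
    compiler_path = shutil.which(compiler, path=env.get("PATH", ""))
    # Split LD_LIBRARY_PATH into its non-empty entries.
    library_dirs = [d for d in env.get("LD_LIBRARY_PATH", "").split(os.pathsep) if d]
    return {
        "compiler": compiler,
        "compiler_path": compiler_path,  # None if not found on PATH
        "library_dirs": library_dirs,
    }


if __name__ == "__main__":
    # nvcc is needed for the CUDA backend, icc for the Cilk+ backend.
    for name in ("nvcc", "icc"):
        report = check_backend(name)
        status = report["compiler_path"] or "NOT FOUND on $PATH"
        print("%s: %s" % (name, status))
        print("  LD_LIBRARY_PATH entries: %s" % (report["library_dirs"] or "none"))
```

Only the compiler for the backend you intend to use needs to be found; a missing icc is not a problem if you only target CUDA.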
Finally, all specializers built on ASP have a configuration file that contains some simple directives for each specializer. We provide an example configuration in asp_config.yml. If you already have some ASP-based specializers installed, just append this file to the existing one. Otherwise, copy it to ~/.asp_config.yml. With these settings you can control whether the Cilk or CUDA backend will be the target of specialization, which CUDA device specialized code will be run on, and whether the specializer will attempt to auto-tune itself to your particular machine and problem space (experimental).
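The copy-or-append step for the configuration file can be scripted. The sketch below is an illustration only: `install_asp_config` is a hypothetical helper, not something PyCASP ships, and it appends the example file verbatim without checking for directives that already exist in your current configuration, so review the result by hand afterwards.

```python
import os
import shutil


def install_asp_config(example_path, target_path=None):
    """Copy the example config into place, or append it to an existing one.

    Hypothetical helper for illustration. Returns "copied" or "appended".
    """
    if target_path is None:
        target_path = os.path.expanduser("~/.asp_config.yml")

    if os.path.exists(target_path):
        # Another ASP-based specializer already created the file:
        # append the example directives after a blank separator line.
        with open(example_path) as src, open(target_path, "a") as dst:
            dst.write("\n")
            dst.write(src.read())
        return "appended"

    # No config yet: the example becomes the new config file.
    shutil.copyfile(example_path, target_path)
    return "copied"
```

Run from the base directory of the checkout, `install_asp_config("asp_config.yml")` would write to ~/.asp_config.yml; pass an explicit `target_path` to try it on a scratch file first.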
Once you think the python dependencies, compilers, environment variables and config file are set up correctly, try
Then take a look at the sample applications provided in examples/ and read on.
PyCASP comes with the following specializers:
PyCASP comes with the following examples:
- Speaker Verification (GMM-SVM system). For details please refer to E. Gonina's PhD thesis.
- Speaker Diarization (Agglomerative clustering of GMMs). See documentation here.
- Music Recommendation (Based on UBM adaptation). For details please refer to the TOMCCAP paper and demo slides.
- Video event detection using MapReduce. This example uses MapReduce (Hadoop via mrjob); see documentation here.
Importing PyCASP specializers
When you install PyCASP, all of its specializers are installed in the Python package directory, along with an internal package containing PyCASP's composition logic.
To import a particular specializer, use the following in your Python code:
from gmm_specializer.gmm import *
from svm_specializer.svm import *
For a usage description of each specializer, see the corresponding wiki pages.
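Before relying on the imports above, it can help to confirm that the specializer packages actually ended up on the Python path. A minimal sketch follows, assuming only the two top-level package names that appear in the import lines (`gmm_specializer` and `svm_specializer`); the function name `available_specializers` is hypothetical. It checks that the packages can be located, not that their compiled backends work.

```python
import importlib.util

# Top-level package names taken from the import statements above.
SPECIALIZER_PACKAGES = ("gmm_specializer", "svm_specializer")


def available_specializers(packages=SPECIALIZER_PACKAGES):
    """Return the packages from `packages` that Python can locate.

    find_spec returns None for a top-level package that is not installed,
    so nothing is imported or compiled by this check.
    """
    return [name for name in packages
            if importlib.util.find_spec(name) is not None]


if __name__ == "__main__":
    found = available_specializers()
    for name in SPECIALIZER_PACKAGES:
        print("%s: %s" % (name, "installed" if name in found else "missing"))
```

If a package is reported missing, rerun the installation step and check whether you installed with `--user` under a different Python interpreter than the one you are running.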