A Parallel Implementation of Smolyak Method in CUDA



A Parallel Implementation of Smolyak Method

In this project, I show how to parallelize a popular projection method, the Smolyak algorithm, which is built on sparse grids. The main hotspot in projection methods is the evaluation of a large polynomial at every point of a large grid. Fortunately, this problem turns out to be embarrassingly parallel, because the value at each grid point can be computed independently of all the others. My program works in MATLAB by invoking a CUDA (Compute Unified Device Architecture) kernel function precompiled to PTX (Parallel Thread Execution) assembly for NVIDIA graphics processing units. This allows users to keep their existing MATLAB code without having to translate it into C. I illustrate the practical application of my method by solving the international real business cycle model with ten countries. My algorithm improves performance in double precision by up to a factor of 140 compared with the serial implementation in Judd, Maliar, Maliar, and Valero's Smolyak toolbox, which is also written in MATLAB. For example, their model with twenty states can now be solved with the third level of approximation in 30 minutes on two NVIDIA Tesla K80 GPUs rather than 70 hours on an Intel CPU.
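
To make the "embarrassingly parallel" claim concrete, here is a minimal serial sketch in Python. It is not the code from this repository. It assumes the polynomial is a sum of coefficients times products of Chebyshev polynomials, one factor per state variable, which is the usual form of a Smolyak interpolant. The names `chebyshev`, `eval_point`, `eval_grid`, `coeffs`, and `degrees` are hypothetical and chosen only for illustration.

```python
def chebyshev(n, x):
    """Chebyshev polynomial of the first kind, T_n(x), by the three-term recurrence."""
    t_prev, t_curr = 1.0, x
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
    return t_curr


def eval_point(point, coeffs, degrees):
    """Evaluate the polynomial at ONE grid point.

    point   : tuple of state values, one per dimension
    coeffs  : one coefficient per basis function
    degrees : for each basis function, the Chebyshev degree used in each dimension

    Nothing here reads or writes data belonging to another grid point,
    so on a GPU this body could be the work assigned to a single thread.
    """
    total = 0.0
    for b, degs in zip(coeffs, degrees):
        term = b
        for x, n in zip(point, degs):
            term *= chebyshev(n, x)
        total += term
    return total


def eval_grid(grid, coeffs, degrees):
    """Serial loop over grid points; every iteration is independent of the rest."""
    return [eval_point(p, coeffs, degrees) for p in grid]


if __name__ == "__main__":
    # Toy two-dimensional example: p(x, y) = 1 + 2*T1(x) + 3*T1(y) + 0.5*T2(x)
    degrees = [(0, 0), (1, 0), (0, 1), (2, 0)]
    coeffs = [1.0, 2.0, 3.0, 0.5]
    grid = [(0.0, 0.0), (0.5, -1.0), (1.0, 1.0)]
    print(eval_grid(grid, coeffs, degrees))
```

Because the loop in `eval_grid` carries no dependence between iterations, the same per-point computation can be distributed across thousands of GPU threads. The cost per point grows with the number of basis functions, which is why the evaluation dominates run time for models with many states.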