A Parallel Implementation of the Smolyak Method
In this project, I show how to parallelize a popular projection method, the Smolyak algorithm, which uses sparse grids. The main hotspot in projection methods is the evaluation of a large polynomial on a large grid. Fortunately, this problem turns out to be embarrassingly parallel. My program works in MATLAB by invoking a precompiled CUDA (Compute Unified Device Architecture) kernel function as PTX (Parallel Thread Execution) assembly for NVIDIA graphics processing units. This allows users to keep their existing MATLAB code without having to translate it into C. I illustrate the practical application of my method by solving the international real business cycle model with ten countries. My algorithm improves performance in double precision by up to 400 times compared with the serial implementation in Judd, Maliar, Maliar, and Valero's Smolyak toolbox, which is also written in MATLAB. For example, their model with twenty states can now be solved with the third level of approximation in 6 minutes on an NVIDIA Tesla V100 GPU rather than 41 hours on an Intel CPU.
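To illustrate why the polynomial evaluation is embarrassingly parallel, here is a minimal sequential sketch in Python. It is not the CUDA kernel described above. It assumes a polynomial written as a sum of products of Chebyshev basis functions, which is a common choice in Smolyak implementations, and the names `chebyshev`, `evaluate_at_point`, `evaluate_on_grid`, and the sample coefficients are all hypothetical. The point is that the value at each grid point depends only on that point and the shared coefficients, so each call to `evaluate_at_point` could in principle be handed to its own GPU thread.

```python
from math import prod


def chebyshev(n, x):
    """Chebyshev polynomial of the first kind, T_n(x), via the three-term recurrence."""
    t_prev, t_curr = 1.0, x
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
    return t_curr


def evaluate_at_point(point, coeffs, multi_indices):
    """Work for one grid point: sum of coefficient times product of 1-D basis terms."""
    return sum(
        c * prod(chebyshev(n, x) for n, x in zip(idx, point))
        for c, idx in zip(coeffs, multi_indices)
    )


def evaluate_on_grid(grid, coeffs, multi_indices):
    """Evaluate the polynomial at every grid point.

    No iteration reads or writes data belonging to another iteration,
    so the loop can be split across threads without synchronization.
    """
    return [evaluate_at_point(p, coeffs, multi_indices) for p in grid]


# Hypothetical two-dimensional example:
# p(x, y) = 1 + 2*T_1(x) + 3*T_2(x)*T_1(y)
multi_indices = [(0, 0), (1, 0), (2, 1)]
coeffs = [1.0, 2.0, 3.0]
grid = [(0.0, 0.0), (1.0, 1.0), (0.5, -1.0)]

values = evaluate_on_grid(grid, coeffs, multi_indices)
print(values)
```

In a GPU version of this sketch, the list comprehension in `evaluate_on_grid` would be replaced by a kernel launch in which the thread index selects the grid point, while the coefficients and multi-indices are shared read-only inputs.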