Inverse transform sampling with Chebyshev polynomial approximation
These codes provide a Fortran 90 implementation of inverse transform sampling with Chebyshev polynomial approximation in one and two dimensions. Cumulative and probability distribution functions are approximated by Chebyshev polynomials, which significantly speeds up the evaluation of functions in root finding. Implementation on massively parallel computers with MPI allows for generation of large sample size. For our application, we use this sampling algorithm to load particles in particle-in-cell simulations. Similarly, this algorithm can be used to initialize other distributions in Monte Carlo simulations, molecular dynamics simulations, and gravitational simulations. The practical use of this method is demonstrated through concrete examples in space plasmas. Compared with the classical rejection sampling in low dimensions, our sampling algorithm is particularly efficient when the distribution function is highly localized into a small fraction of the domain.
The core of Chebsampling is an iterative bisection root-finding algorithm with functions represented by Chebyshev polynomials, which is contained in the module invsampling (invsampling.f90). In one dimension, the sampling algorithm is wrapped in the subroutine inv_sampling_1D (available in module invsampling). In two dimensions, samples are generated by drawing from marignal and conditional distribution functions using the subroutine inv_sampling_1D repetitively, which is wrapped in the subroutine pp_inv_sampling_2D (available in module invsampling). The discrete Chebyshev tranform in module invsampling uses trusted fast Fourier transform code inherited from the UPIC framework (libmfft1.f, libmfft1_h.f90, modmfft1.f90; see https://github.com/UCLA-Plasma-Simulation-Group/UPIC-2.0.git).
The domain decomposition for MPI parallelization is done in the module ppush2 (ppush2.f90). In this module, subroutine pfedges2 decomposes the domain such that the number of particles in each partition is approximately the same; subroutine pdcomp2 decomposes the domain such that the number of grids in each partition is approximately the same. We use the subroutine pfedges2 in our application for better load balance. But it is also possible to pdcomp2 if uniform paritition is preferred.
The library pplib2 (pplib2.f90) for basic MPI communications is inherited from the UPIC framework.
The target distribution data is generated by the module distr (distr.f90). Users can add new subroutines in this module to generate other distributions. Alternatively, because these target distribution data are defined on grids. Users can easily modify this module to read realistic distribution data from external files.
The workflow of our sampling algorithm is wrapped in main.f90. For input, the number of particles (npx, npy) and the number of grids (ncx, ncy) are taken from the namelist 'input1.nml'. For output, the deposited density data and the target density data are stored in 'data.bin'.
A detailed description of all subroutines in the codes is available in 'manual.pdf'.
For prerequisites, users are suggested to install Linux environment with GNU compilers, openmpi and make.
A makefile is available to compile the code using either GNU or Intel compilers. To compile the code, type 'make' or 'make chebsampling'.
To run the code, type 'echo
A docker container to run our code and plot the output data is available at https://codeocean.com/capsule/0988490/tree/v2, where the enviroment with all the required softwares has been readily set up.
This sampling algorithm and its applications in space plasmas are documented in 'inv_cheb_sampling.pdf', which has been published in Journal of Geophysical Research https://doi.org/10.1029/2021JA030031 (arxiv: https://doi.org/10.48550/arXiv.2202.08203). Useful references in developing our software are given in the manuscript. In particular, we benefit a lot from the textbook 'Approximation Theory and Approximation Practice' by Nick Trefethen.