A 2D spectral Galerkin code written in CUDA C and runs on nVidia GPUs
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
demos
src
tools
COPYING
Makefile
README

README

Sg2 is a simple 2D spectral hydrodynamic code written in CUDA C

COMPILE: make sure that the nVidia compiler `nvcc` is in your path so
  that `which nvcc` returns something like "/usr/local/cuda/bin/nvcc".
  There is no need for configuration.  Simply typing `make` in the
  source root compiles sg2 in double precision.  The resulting binary
  is "bin/ihd".  Running `make float` compiles sg2 in single precision
  mode, which is about two times faster on Fermi Architecture.

USAGE: sg2 doesn't use configuration file.  It follows Unix trandition
  and takes comman line options at runtime.  To get a list of options,
  type `bin/ihd --help`.  It returns the following:

    Usage: ihd [OPTION...] [SEED/INPUT_FILE]
    Spectral Galerkin Incompressible Hydrodynamic in 2D (with CUDA)

          --help        display this help and exit
      -b                quasi-geostrophy beta parameter
      -d                device id
      -f                forcing [amplitude and] wavenumber
      -m                Ekman coefficient
      -n                kinematic viscosity
      -o                prefix of the outputs
      -rk3, -rk4        pick different time integrators
      -s                number of frames and grids
      -t                [Courant number and] total time [and fixed step size]

    Report bugs to <ckch@nordita.org>.

  To run a simulation of Kelvin-Helmholtz instability with 100 outputs
  (not including the initial condition) and resolution 256x256 up to
  time = 10, one simply type

    bin/ihd KH -f 0 -s 100 256 256 -t 10

  The above command creates log files 0000.txt, 0001.txt, ...,
  0100.txt and binary data files 0000.raw, ..., 0100.raw.

  We have provided some IDL scripts in "tools/" to visualize the
  outputs.  To use them, you can either add "tools/" to !path, or just
  copy all the IDL scripts to your working directory.

  If one like the idea of configuration files, one can of course wrap
  the above command in a simple shell script or a submission file.  We
  have provided some examples in the "demos/" folder.

CONVENTIONS: let u be the two dimensional velocity.  We define the
  stream function f such that

    u = curl(f).

  In component form, we have ux = df/dy, uy = -df/dx.  It is fine that
  we do not distinguish between the scalar f and the vector (0, 0, f)
  in this README file.  The vorticity is defined as

    w = curl(u).

  Similarly, we do not distinguish between w or (0, 0, w).  It is
  clear that the stream function and the vorticity satisfy the
  following Poisson equation:

    w = - grad^2 f,

  or, in the Fourier space,

    W_k = k^2 F_k.  (*)

  We will use nu to denote the kinematic viscosity, mu be the Ekman
  coefficient, and beta be the linear expansion coefficient of the
  Coriolis parameter.  We will also use

    J(f, w; x, y) = (df/dx)(dw/dy) - (dw/dx)(df/dy)

  to denote the Jacobian determinant.

EQUATION: from equation (*), it is clear that W_k falls off slower
  than F_k.  For a pseudospectral (or collocation) method, it is
  better to solve the vorticity equation

    dw/dt - J(f, w) + beta uy = nu grad^2 w - mu w,

  instead of the stream function equation because of the finite
  precision in the fast Fourier transforms.

  For Galerkin spectral method, we keep track of the Fourier modes in
  stead of the functions.  This allows easy implementation of implicit
  or semi-implicit integrators for the linear terms.  Although
  evolving the stream function does not reduce accuracy in Galerkin
  spectral method, we will stick with the vorticity equation.

NYQUIST FREQUENCY: for an n-point sampling of a real function f_i,
  there are two modes in the Fourier space that are guaranteed to be
  real if n is even, namely, the "mean value" F_0 and the "Nyquist
  mode" F_KN, where KN = n / 2.

  If one naively takes the derivative at the Nyquist frequency, i KN
  F_KN becomes purely imaginary.  This seems to causes problem to
  inverse transform back to a real function.  Nevertheless, the
  Nyquist mode is simply [1,-1,1,-1,...] in the real space.  One can
  therefore interpret the Nyquist mode as the super position of two
  --- the imaginary part vanishes because of the Hermit symmetry:

    F_KN = G_KN + G_{-KN} = G_KN + G_KN* = 2 Re G_KN

  Taking derivatives of those two modes independently and sum them up,

    i KN G_KN - i KN G_{-KN} = i KN (G_KN - G_KN*) = - 2 KN Im(G_KN).

  The above expression is of course real.  There is no inconsistency
  in taking derivative of the Nyquist mode.  However, it is impossible
  to obtain the value Im(G_KN) from the sampling data f_i.

  For simplicity, we always set F_KN = 0.  Indeed, this is done
  automatically in the Galerkin truncation.  See next section for
  details.

GALERKIN TRUNCATION: because of the Nyquist frequency complication
  described above, we only keep odd number of modes along the wave-
  space axes.  That is, along k_x, we use the modes [-K, ..., K] for
  certain K.  K is chosen so that that are enough zeros to avoid
  aliasing error.  I.e.,

    Number of zeros = N - (2 * K + 1) >= K,

  For multi-dimension problems, we zero out all modes outside the
  circle |k| = K.  The above inequality needs to be satisfied for each
  dimension, yet we want to keep more information that are not along
  the axes.  Therefore, we choose

    K = 0.99 + (min(N) - 1) / 3.

  The division in the second term is an integer operation so it rounds
  down.  The value 0.99 is chosen so that the test k * k <= K * K is
  accurate enough, even in single-precision float, for N ~ 2048.

TIME-STEP: The stability condition for the explicit (advection) step
  is u dt < 3.34 / K.  Using K = n / 3, we have

    dt u n < 10.0

  The stability condition for the implicit (diffusive) step is
  1 - 0.5 dt (alpha[i+1] - alpha[i]) (nu K^2 + mu) >= 0.  Because
  max(alpha[i+1] - alpha[i]) ~ 0.336 for 4th-order Runge-Kutta/Crank-
  Nicolson, we have

    dt (nu n^2 / 9 + mu) < (2 / 0.336) ~ 5.95

  The overall stable time step is the smaller one.

FORCING: we implemented two kinds of forcings.  One is determinant
  Kolmogorov forcing, and the other one is random forcing at a given
  wavenumber.  Both forcing terms are implemented in force.cu.

  For the Kolmogorov forcing, we use the following equation:

    f_K(x, y) = fi ki cos(ki x)

  where fi and ki are parameters describing the amplitude and wavenumber
  of the forcing.  This forcing is integrated explicitly in the
  low-storage 4th-order Runge-Kutta method.

  For random forcing, we use

    f_r(x, y) = 2 fi ki cos(ki_x x + ki_y y) / sqrt(dt)

  and update the vorticity at the end of each Runge-Kutta (full) step.
  The extra factor 2 is introduced so that the energy input is simply
  fi^2.  The update is only 1st order in time.  The randomness comes
  in because of a phase and the direction of ki.

ENERGY AND ENSTROPHY TRANSFER: the non-linear energy transfer (rate)
  is one of the most important quantities in turbulence theories.  In
  two dimension, the energy inversely cascades to large scale; while
  the enstrophy forward cascades to small scale.  We should compute
  both of them.

  Let's write the two dimensional Navier-Stokes equation in Fourier
  space as

    (d/dt + mu + nu k^2) u_k = NL_k

  where NL denotes the non-linear and pressure terms.  Using "*" to
  denote complex conjugate, we can easily write down the energy
  transfer through each mode

    T(k) = (d/dt + 2 mu + 2 nu k^2) u_k*.u_k / 2
         = (u_k*.NL_k + u_k.NL_k*) / 2
	 = Re(u_k*.NL_k)

  Because we evolve the vorticity equation in this code, it is easier
  to compute the enstrophy transfer instead:

    T_Z(k) = (d/dt + 2 mu + 2 nu k^2) w_k* w_k / 2
           = (w_k* J_k + w_k J_k*) / 2
	   = Re(w_k* J_k)

  We can now use the following relation to obtain the energy transfer

    T(k) = T_Z(k) / k^2.

  The energy flux Pi(k) and the enstrophy flux Pi_Z(k) are given by

    d/dk Pi  (k) = - T  (k),
    d/dk Pi_Z(k) = - T_Z(k).