```
This notebook sets up and runs a set of benchmarks to compare
different numerical discretizations of the SWEs

Copyright (C) 2016  SINTEF ICT

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program.  If not, see <http://www.gnu.org/licenses/>.
```

# Basic Particle Filter

In this notebook we will make test implementations of basic particle filters.

The aim is to find a decent implementation of the particles that can be used by both the simulators and the particle filter.

All post-processing of particles will be done on the CPU in this first iteration.

## Overview of basic resampling strategies

The following four resampling algorithms are described in Section 3a of van Leeuwen's 2009 review paper.

The starting point is a prior distribution of model states with pdf $p(\psi^0)$, from which $N$ model state samples (particles) $\psi_i^0$, $i = 1,...,N$ are drawn.
Run the simulation model on all particles, $\psi_i^n = f(\psi_i^{n-1})$, or in our case `sim.step(t)`. This is the same as sampling from $p(\psi^n | \psi_i^{n-1})$.
At this point we see an observation $d$.

Now, define a posterior distribution $p(\psi^n | d)$. By Bayes' theorem, $p(\psi^n | d) \propto p(d | \psi^n) \, p(\psi^n)$, so each particle receives a weight $w_i \propto p(d | \psi_i^n)$, normalized so that $\sum_i w_i = 1$.
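
As a minimal sketch of how such weights could be computed, assume the observation error is Gaussian with a scalar variance, so that the likelihood of a particle depends only on its distance to the observation. The name `gaussian_weights` and its arguments are illustrative assumptions and not part of the `CPUDrifter` class used later in this notebook.

```python
import math

def gaussian_weights(distances, variance):
    """Return normalized weights w_i proportional to exp(-d_i^2 / (2*variance)).

    Assumption: isotropic Gaussian observation error with the given variance.
    """
    unnormalized = [math.exp(-d * d / (2.0 * variance)) for d in distances]
    total = sum(unnormalized)
    return [w / total for w in unnormalized]

# Particles closer to the observation receive larger weights.
weights = gaussian_weights([0.1, 0.5, 1.0], variance=0.08)
```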


The most basic resampling strategies are as follows:
### Probabilistic resampling
Use the weights as a discrete distribution and sample directly from this.
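
A minimal sketch of this strategy in plain Python, assuming the weights are non-negative and not all zero. The function name and the `rng` argument are hypothetical; this is not the code in the `Resampling` module.

```python
import bisect
import random

def probabilistic_resampling(weights, rng=random):
    """Draw len(weights) particle indices, each with probability w_i."""
    # Cumulative sums turn the weights into a discrete CDF.
    cumulative = []
    running = 0.0
    for w in weights:
        running += w
        cumulative.append(running)
    n = len(weights)
    indices = []
    for _ in range(n):
        u = rng.random() * cumulative[-1]
        # Clamp the index to guard against round-off at the upper end.
        indices.append(min(bisect.bisect_right(cumulative, u), n - 1))
    return indices
```

The returned indices say which particles to copy into the new ensemble; particles with zero weight are never selected.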

### Residual sampling
Here, we first resample particles deterministically based on their weights, so that particle $i$ is resampled $\lfloor Nw_i \rfloor$ times (`np.floor`). Define the left-over weights as $w^*_i = Nw_i - \lfloor Nw_i \rfloor$, normalize them to sum to one, and use them as a discrete distribution. Draw particles from this discrete distribution until the ensemble consists of $N$ particles again.
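
The two stages can be sketched as follows, assuming normalized weights. The function name is hypothetical, and the stochastic stage simply walks the cumulative sum of the left-over weights, which is equivalent to normalizing them first.

```python
import math
import random

def residual_sampling(weights, rng=random):
    """Return N indices: floor(N*w_i) copies of each i, then a random fill."""
    n = len(weights)
    indices = []
    residuals = []
    for i, w in enumerate(weights):
        copies = int(math.floor(n * w))
        indices.extend([i] * copies)        # deterministic part
        residuals.append(n * w - copies)    # left-over weight of particle i
    remaining = n - len(indices)
    total = sum(residuals)
    for _ in range(remaining):
        # Sample one index from the left-over weights.
        u = rng.random() * total
        running = 0.0
        chosen = n - 1                      # fallback in case of round-off
        for i, r in enumerate(residuals):
            running += r
            if u < running:
                chosen = i
                break
        indices.append(chosen)
    return indices
```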


### Stochastic universal sampling
Let each weight represent a bucket on the line $[0, 1]$ with length $w_i$, and draw a random number $u \sim U[0, 1/N]$.
Put $N$ line pieces starting from $u$ with lengths $1/N$ on the line $[0,1]$. 
The bucket in which each line piece ends defines the particle that is resampled for each of the $N$ line pieces.
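
One way to sketch this, assuming normalized weights and treating the $N$ end points as pointers at $u, u + 1/N, \dots, u + (N-1)/N$. The function name and the `rng` argument are hypothetical.

```python
import random

def stochastic_universal_sampling(weights, rng=random):
    """Return N indices using one random offset and N evenly spaced pointers."""
    n = len(weights)
    u = rng.random() / n               # single draw from U[0, 1/N)
    indices = []
    i = 0
    cumulative = weights[0]            # right edge of the current bucket
    for k in range(n):
        pointer = u + float(k) / n     # pointers are spaced 1/N apart
        # Advance to the bucket that contains this pointer.
        while pointer >= cumulative and i < n - 1:
            i += 1
            cumulative += weights[i]
        indices.append(i)
    return indices
```

Only one random number is drawn, so a particle with weight $w_i$ is selected either $\lfloor Nw_i \rfloor$ or $\lceil Nw_i \rceil$ times.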

### Monte Carlo Metropolis-Hastings sampling
This sampling scheme is already described algorithmically in Section 3a(4) of the paper (van Leeuwen, 2009).
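
The sketch below shows one possible reading of such a scheme, under the assumption that the particles are visited in order, the first particle is always kept, and a candidate is accepted with probability $\min(1, w_{new}/w_{current})$. The function name is hypothetical and the details may differ from both the paper and the `Resampling` module.

```python
import random

def metropolis_hastings_sampling(weights, rng=random):
    """Return N indices from one Metropolis-Hastings sweep over the particles."""
    indices = [0]                       # the first particle is always accepted
    current = 0
    for candidate in range(1, len(weights)):
        if weights[current] > 0:
            ratio = weights[candidate] / weights[current]
        else:
            ratio = 1.0                 # always move away from a zero-weight particle
        # Accept a heavier candidate outright, a lighter one with probability ratio.
        if ratio >= 1.0 or rng.random() < ratio:
            current = candidate
        indices.append(current)         # on rejection the current particle is duplicated
    return indices
```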


#### Import modules and set up environment

In [None]:
#Let's have matplotlib "inline"
%matplotlib inline
%config InlineBackend.figure_format = 'retina'

#Import packages we need
import numpy as np
from matplotlib import animation, rc
from matplotlib import pyplot as plt
from matplotlib import gridspec


import os
import pyopencl
import datetime
import sys

#Set large figure sizes
rc('figure', figsize=(16.0, 12.0))
rc('animation', html='html5')

#Import our simulator
from SWESimulators import CTCS, CDKLM16, PlotHelper, Common
#Import initial condition and bathymetry generating functions:
from SWESimulators.BathymetryAndICs import *
from SWESimulators import CPUDrifter
from SWESimulators import Resampling

In [None]:
#Make sure we get compiler output from OpenCL
os.environ["PYOPENCL_COMPILER_OUTPUT"] = "1"

#Set which CL device to use, and disable kernel caching
if sys.platform.lower().startswith("linux"):
    os.environ["PYOPENCL_CTX"] = "0"
else:
    os.environ["PYOPENCL_CTX"] = "1"
os.environ["CUDA_CACHE_DISABLE"] = "1"
os.environ["PYOPENCL_NO_CACHE"] = "1"

#Create OpenCL context
cl_ctx = pyopencl.create_some_context()
print "Using ", cl_ctx.devices[0].name

## Thoughts on code structure

The observation will in these initial cases be a chosen model realization. When initializing a data assimilation with N particles, N+1 particles should be created and distributed on simulators.

One hypothesis for our ocean simulator is that integrating 100 particles within the same simulation is as expensive as integrating 1 particle. Each particle integration should be done with a single thread on the GPU, so all 100 particles can be processed in parallel.

If this assumption is true, it is best to have all particle positions contiguous in memory. Hence, they will be implemented as a struct of arrays.

## Create random particles, and create random observation

Particles are created in a GlobalParticle class, which holds the positions of all ensemble member particles, and one additional particle which serves as our observation.
This class should have all computational functionality that relies on the relationships between particles and the observation, and the ensemble itself.

Filtering and resampling of particles does not belong in this class.

List of functions that could be useful (and which of them are implemented):
- [x] Initialize uniformly on the unit square
- [ ] Initialize from a Gaussian
- [x] Calculate distances from the observation
- [x] Get weights from a Gaussian distribution
- [x] Get weights from a Cauchy distribution
- [x] Find the ensemble mean position
- [ ] Find the ensemble variance
- [ ] Set the observation to a given coordinate
- [x] Copy function

**About distances**: To calculate distances in a unified way, the class needs to know the boundary conditions. For example, on a unit square domain the distance between a particle at $(0.99, 0.99)$ and an observation at $(0.01, 0.02)$ is about $1.38$, close to $\sqrt{2}$. With periodic boundary conditions, however, their distance is only about $0.036$, since the shortest path crosses the boundary with offsets of $0.02$ in $x$ and $0.03$ in $y$.

Let the domain size be $(L_x, L_y)$, and define the particle and observation positions as $(x_p, y_p)$ and $(x_o, y_o)$, respectively.
The minimal distance with a periodic boundary can then be found by
$$ d_{x,min} = \min \left\{ |x_p - x_o|, |(x_p - L_x) - x_o|, |(x_p + L_x) - x_o| \right\}$$
$$ d_{y,min} = \min \left\{ |y_p - y_o|, |(y_p - L_y) - y_o|, |(y_p + L_y) - y_o| \right\}$$
and 
$$ d_{min} = \sqrt{ d_{x,min}^2 + d_{y,min}^2}$$
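
The formulas above translate directly into a few lines of plain Python. The function name and argument order are illustrative assumptions and do not mirror the `getDistances` method of `CPUDrifter`.

```python
import math

def periodic_distance(xp, yp, xo, yo, Lx, Ly):
    """Shortest distance between a particle and the observation on a periodic domain."""
    dx_min = min(abs(xp - xo), abs((xp - Lx) - xo), abs((xp + Lx) - xo))
    dy_min = min(abs(yp - yo), abs((yp - Ly) - yo), abs((yp + Ly) - yo))
    return math.sqrt(dx_min**2 + dy_min**2)

# Unit square example: the offsets across the boundary are 0.02 and 0.03,
# so d is approximately 0.036.
d = periodic_distance(0.99, 0.99, 0.01, 0.02, 1.0, 1.0)
```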

**About ensemble mean position**: When finding the mean position of the particles, the above considerations need to be taken into account as well. The ensemble mean should be computed from the periodic image of each particle that gives the minimal distance above.
In other words, the position we should consider for the mean is
$$ x^*_p = {\arg\min}_{x \in \{x_p, x_p \pm L_x \}} |x - x_o|, $$
$$ y^*_p = {\arg\min}_{y \in \{y_p, y_p \pm L_y \}} |y - y_o|. $$
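
A small sketch of this idea, assuming the particle coordinates are given as two plain lists. Both helper names are hypothetical and are not the `getEnsembleMean` method of `CPUDrifter`.

```python
def closest_periodic_image(p, o, L):
    """Among p, p - L and p + L, return the coordinate nearest to o."""
    return min((p, p - L, p + L), key=lambda c: abs(c - o))

def periodic_ensemble_mean(xs, ys, xo, yo, Lx, Ly):
    """Mean position using each particle's periodic image closest to the observation."""
    n = float(len(xs))
    mean_x = sum(closest_periodic_image(x, xo, Lx) for x in xs) / n
    mean_y = sum(closest_periodic_image(y, yo, Ly) for y in ys) / n
    return mean_x, mean_y

# Two particles straddling the periodic boundary, observation at (0.0, 0.5).
mean = periodic_ensemble_mean([0.9, 0.1], [0.5, 0.5], 0.0, 0.5, 1.0, 1.0)
```

In this example a naive average of the $x$ coordinates would give $0.5$, on the opposite side of the domain, whereas the periodic version places the mean at the observation.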

## Small test with Particle class

In [None]:
# Initialize an ensemble of particles:
reload(CPUDrifter)
N = 100
observation_variance = 0.08
resample_variance = 0.005
bc = 2
boundaryConditions = Common.BoundaryConditions(bc,bc,bc,bc)
mainGlobalParticles = CPUDrifter.CPUDrifter(N, observation_variance, boundaryConditions=boundaryConditions)
mainGlobalParticles.initializeParticles()

###
mainGlobalParticles.positions[-1,0] = 0.1
mainGlobalParticles.positions[-1,1] = 0.5
###

# The particles are by now drawn from the prior distribution

# Here, the simulation/time integration should take place

# Inspect initial ensemble
mainGlobalParticles.plotDistanceInfo(title="Initial particles")
print "Observation: ", mainGlobalParticles.getObservationPosition()

print "Mean: ", mainGlobalParticles.getEnsembleMean()



In [None]:
print "Testing distance calculation over periodic boundary conditions"
test_N = 3
far   = "far      "
close = "close    "
test_solutions = np.array([[[far, far, far], [far, close, far]], [[far, far, close], [close, close, close]]])
for test_bc_ns in [1, 2]:
    for test_bc_ew in [1, 2]:
        test_boundaryConditions = Common.BoundaryConditions(test_bc_ns, test_bc_ew, test_bc_ns, test_bc_ew)
        test_particles = CPUDrifter.CPUDrifter(test_N, boundaryConditions=test_boundaryConditions)
        test_particles.positions[0,:] = [0.9, 0.9]
        test_particles.positions[1,:] = [0.9, 0.1]
        test_particles.positions[2,:] = [0.1, 0.9]
        test_particles.positions[3,:] = [0.1, 0.1]
        print "(test_bc_ns, test_bc_ew)", (test_bc_ns, test_bc_ew)
        print test_particles.getDistances()
        print test_solutions[test_bc_ns-1, test_bc_ew-1, :]
        print "--------------------"
print test_particles.positions
#test_particles.plotDistanceInfo()

print "--------------------------"
print "Testing enforcement of periodic boundary conditions"

test_N = 5
test_bc = 2
test_boundaryConditions = Common.BoundaryConditions(test_bc, test_bc, test_bc, test_bc)
test_particles = CPUDrifter.CPUDrifter(test_N, boundaryConditions=test_boundaryConditions)
test_particles.positions[0,:] = [0.5, 0.6]
test_particles.positions[1,:] = [0.6, 0.5]
test_particles.positions[2,:] = [-0.1, 0.5]
test_particles.positions[3,:] = [0.5, -0.1]
test_particles.positions[4,:] = [1.1, 1.1]
test_particles.positions[-1,:] = [0.5, 0.5]
test_particles.plotDistanceInfo()
test_particles.enforceBoundaryConditions()
test_particles.plotDistanceInfo()
#print test_particles.getDistances()
print test_particles.positions
#test_particles.plotDistanceInfo()



## Probabilistic Resampling

In [None]:
globalParticles = mainGlobalParticles.copy()
globalParticles.plotDistanceInfo(title="Initial particles")

Resampling.probabilisticResampling(globalParticles, reinitialization_variance=0.0)
globalParticles.plotDistanceInfo(title="From probabilistic resampling (identical resampling)")

globalParticles = mainGlobalParticles.copy()
Resampling.probabilisticResampling(globalParticles, reinitialization_variance=resample_variance)
globalParticles.plotDistanceInfo(title="From probabilistic resampling (gaussian resampling)")

## Residual Sampling

In [None]:
reload(Resampling)
globalParticles = mainGlobalParticles.copy()
globalParticles.plotDistanceInfo(title="Initial particles")

Resampling.residualSampling(globalParticles, reinitialization_variance=resample_variance)
globalParticles.plotDistanceInfo(title="From residual sampling")

resize_ensemble_enabled = False
if resize_ensemble_enabled:
    globalParticles = mainGlobalParticles.copy()
    Resampling.residualSampling(globalParticles, reinitialization_variance=resample_variance, \
                                                               onlyDeterministic=True)
    globalParticles.plotDistanceInfo(title="From residual sampling - Deterministic part only")

    globalParticles = mainGlobalParticles.copy()
    Resampling.residualSampling(globalParticles, reinitialization_variance=resample_variance,\
                                                                onlyStochastic=True)
    globalParticles.plotDistanceInfo(title="From residual sampling - Stochastic part only")


## Stochastic Universal Sampling

In [None]:
reload(Resampling)
globalParticles = mainGlobalParticles.copy()
globalParticles.plotDistanceInfo(title="Initial particles")

Resampling.stochasticUniversalSampling(globalParticles, reinitialization_variance=resample_variance)
globalParticles.plotDistanceInfo(title="From Stochastic universal sampling")


## Monte Carlo Metropolis-Hastings

In [None]:
globalParticles = mainGlobalParticles.copy()
globalParticles.plotDistanceInfo(title="Initial particles")

Resampling.metropolisHastingSampling(globalParticles, reinitialization_variance=resample_variance)
globalParticles.plotDistanceInfo(title="From Monte Carlo Metropolis-Hastings")

print "First particle is automatically chosen - position: ", globalParticles.positions[0,:]

# Naïve drift trajectories in the SWE simulators

Here, we will make a naive implementation of particles drifting within our simplified ocean models.
For simplicity, a non-staggered implementation is chosen, as it makes it easier to evaluate the velocity field.


In [None]:
# DEFINE PARAMETERS

#Coriolis well balanced reconstruction scheme
nx = 50
ny = 50

dx = 4.0
dy = 4.0

dt = 0.1
g = 9.81

f = 0.5
r = 0.0

waterHeight = 10

# WIND
wind = Common.WindStressParams(type=99)

ghosts = np.array([2,2,2,2]) # north, east, south, west
validDomain = np.array([2,2,2,2])
boundaryConditions = Common.BoundaryConditions(2,2,2,2)

# Define the index of the cell whose lower left corner is at position (0,0)
x_zero_ref = 2
y_zero_ref = 2

dataShape = (ny + ghosts[0]+ghosts[2], 
             nx + ghosts[1]+ghosts[3])

eta0 = np.zeros(dataShape, dtype=np.float32, order='C')
u0 = np.zeros(dataShape, dtype=np.float32, order='C')
v0 = np.zeros(dataShape, dtype=np.float32, order='C')

# Bathymetry:
Hi = np.ones((dataShape[0]+1, dataShape[1]+1), dtype=np.float32, order='C')*waterHeight

# Add disturbance:
addBump(eta0, nx, ny, dx, dy, 0.3, 0.5, 0.05, validDomain)
addBump(eta0, nx, ny, dx, dy, 0.7, 0.2, 0.10, validDomain)
addBump(eta0, nx, ny, dx, dy, 0.1, 0.8, 0.03, validDomain)
eta0 = eta0*0.3

#Calculate radius from center of bump for plotting
x_center = dx*nx/2.0
y_center = dy*ny/2.0
y_coords, x_coords = np.mgrid[0:ny*dy:dy, 0:nx*dx:dx]
#x_coords = np.subtract(x_coords, x_center)
#y_coords = np.subtract(y_coords, y_center)
radius = np.sqrt(np.multiply(x_coords, x_coords) + np.multiply(y_coords, y_coords))



In [None]:
np.random.seed(4)

## Define a bunch of particles to be released within the given domain
numParticles = 50
observation_variance = 5*dx
resample_variance = 10*dx

constOceanParticles = CPUDrifter.CPUDrifter(numParticles, observation_variance, boundaryConditions)
constOceanParticles.initializeParticles(dx*nx, dy*ny)

constOceanParticles.plotDistanceInfo(title="Initial particles")

copyOfConstOceanParticles = constOceanParticles.copy()
Resampling.residualSampling(copyOfConstOceanParticles, reinitialization_variance=resample_variance)
copyOfConstOceanParticles.plotDistanceInfo(title="From residual sampling (gaussian resampling)")

In [None]:
def particleDrifter(particles, eta, hu, hv, H0, dt, nx, ny, dx, dy, x_zero_ref, y_zero_ref, \
                    sensitivity=1, doPrint=False):
    # Change positions by reference
    positions = particles.positions
    
    numParticles = positions.shape[0]
    # Loop over particles
    for i in range(numParticles):
        if doPrint: print "---------- Particle " + str(i) + " ---------------"
        x0, y0 = positions[i,0], positions[i,1]
        if doPrint: print "(x0, y0): ", (x0,y0)
        
        # First, find which cell each particle is in
        
        # In x-direction:
        cell_id_x = int(np.ceil(x0/dx) + x_zero_ref)
        cell_id_y = int(np.ceil(y0/dy) + y_zero_ref)
        
        if (cell_id_x < 0 or cell_id_x > nx + 4 or cell_id_y < 0 or cell_id_y > ny + 4):
            print "ERROR! Cell id " + str((cell_id_x, cell_id_y)) + " is outside of the domain!"
            print "\t\tParticle position is: " + str((x0, y0))
        
        if doPrint: print "cell values in x-direction: ", ((cell_id_x-2-0.5)*dx, (cell_id_x-2+0.5)*dx)
        if doPrint: print "cell values in y-direction: ", ((cell_id_y-2-0.5)*dy, (cell_id_y-2+0.5)*dy)
        
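        # Total water depth; H0 is the equilibrium depth passed in by the caller
        h = H0 + eta[cell_id_y, cell_id_x]
        # hu and hv are momenta, so divide by the depth to get velocities
        u = hu[cell_id_y, cell_id_x]/h
        v = hv[cell_id_y, cell_id_x]/h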
        
        if doPrint: print "Velocity: ", (u, v)
        
        x1 = sensitivity*u*dt + x0
        y1 = sensitivity*v*dt + y0
        if doPrint: print "(x1, y1): ", (x1, y1)
        
        positions[i,0] = x1
        positions[i,1] = y1
        
    
    # Enforce the boundary conditions (assumed to be periodic)
    particles.enforceBoundaryConditions()
    #applyPeriodicBoundaryConditionsToParticles(positions, nx, ny, dx, dy)
        
        
#eta1, hu1, hv1 = sim.download()
#particleDrifter(oceanParticles.positions, eta1, hu1, hv1, waterHeight, \
#                dt, nx, ny, dx, dy, x_zero_ref, y_zero_ref, \
#                doPrint=True)

In [None]:

oceanParticles = constOceanParticles.copy()

#Clean up old simulator if any:
if 'sim' in globals():
    sim.cleanUp()
    
#Initialize simulator
reload(CDKLM16)
reload(PlotHelper)
sim = CDKLM16.CDKLM16(cl_ctx, eta0, u0, v0, Hi, \
                nx, ny, dx, dy, dt, g, f, r, \
                boundary_conditions=boundaryConditions)

fig = plt.figure()
plotter = PlotHelper.PlotHelper(fig, x_coords, y_coords, radius, 
                                eta0[validDomain[2]:-validDomain[0], validDomain[3]:-validDomain[1]], 
                                u0[validDomain[2]:-validDomain[0], validDomain[3]:-validDomain[1]], 
                                v0[validDomain[2]:-validDomain[0], validDomain[3]:-validDomain[1]])

plotter.showParticles(oceanParticles)

T = 200
sensitivity = 20
loopsPerFrame = 10
oceanParticleSets = [oceanParticles.copy()]
plotTitles = ["Initial ensemble"]
def animate(i):
    if (i>0):
        for j in range(loopsPerFrame):
            t = sim.step(dt)
            eta1, hu1, hv1 = sim.download()

            particleDrifter(oceanParticles, eta1, hu1, hv1, waterHeight, \
                            dt, nx, ny, dx, dy, x_zero_ref, y_zero_ref, sensitivity=sensitivity)

    else:
        t = 0.0

    eta1, hu1, hv1 = sim.download()
    plotter.plot(eta1[validDomain[2]:-validDomain[0], validDomain[3]:-validDomain[1]], 
                 hu1[validDomain[2]:-validDomain[0], validDomain[3]:-validDomain[1]], 
                 hv1[validDomain[2]:-validDomain[0], validDomain[3]:-validDomain[1]]);
    
    plotter.showParticles(oceanParticles)
    
    fig.suptitle("CDKLM16 with CPU drift (high sensitivity) using residualSampling - Time = " + "{:04.0f}".format(t) + " s", fontsize=18)
    
    if (i%50 == 0 and i > 0):
        oceanParticleSets.append(oceanParticles.copy())
        plotTitles.append("Before particle filter at t = " + str(t))
        
        Resampling.residualSampling(oceanParticles, reinitialization_variance=resample_variance)
        
        oceanParticleSets.append(oceanParticles.copy())
        plotTitles.append("After particle filter at t = " + str(t))

    if (i%20 == 0):
        print "{:03.0f}".format(100*i / T) + " % => t=" + str(t) + "\tMax eta: " + str(np.max(eta1)) + \
        "\tMax hu: " + str(np.max(hu1)) + \
        "\tMax hv: " + str(np.max(hv1))
        print "\t\tObservation pos: ", oceanParticles.getObservationPosition()
                     
anim = animation.FuncAnimation(fig, animate, range(T), interval=100)
plt.close(anim._fig)
anim

In [None]:
for i in range(len(oceanParticleSets)):
    oceanParticleSets[i].plotDistanceInfo(title=plotTitles[i])

In [None]:
fig = plt.figure(figsize=(4,4))
posis = np.array([[0.1, 0.2, 0.3, 0.4],[0.2, 0.4, 0.6, 0.8]])
print posis
scat = plt.scatter(x=posis[0,:], y=posis[1,:])
plt.xlim(0,1)
plt.ylim(0,1)
plt.grid()
posis[0,:] = posis[0,:]*1.8
print posis
scat.set_offsets(posis.T)


In [None]:
#### Sorting example from https://stackoverflow.com/a/21077060

people = np.array(['Jim', 'Pam', 'Micheal', 'Dwight'])
ages = np.array([27, 25, 4, 9])
sorted_indices = ages.argsort()
print people[sorted_indices]