# NumPy array excursions

The purpose of this notebook is to explore multidimensional NumPy arrays so that DECSKS can manipulate higher-order objects instead of passing only 1D arrays when applying the convected scheme (DECSKS.lib.convect.scheme). We aim to better understand the array structures through some index slicing exercises, and to verify that products of higher-order objects reproduce the same computations as the 1D array multiplications involved in the convected scheme.

The 1D1V version of DECSKS evolves a density function $f = f(t,x,v)$, $(t,x,v)\in \mathbb{R}^+\times\mathbb{R}\times\mathbb{R}$, stored as a three-dimensional array $\underline{\underline{f}} = \underline{\underline{f}}(\underline{t},\underline{x},\underline{v})$ on $[0,T]\times\mathcal{D}_x\times\mathcal{D}_v$,

where

\begin{eqnarray*}
\underline{t} & = (t^n) = & (t^0, t^1, t^2, \ldots , t^{N_t})\\
&& \\
\underline{x} & = (x_i) = & (x_0, x_1, x_2, \ldots , x_{N_x - 1}) \\
&& \\
\underline{v} & = (v_j) = & (v_0, v_1, v_2, \ldots , v_{N_v - 1}) 
\end{eqnarray*}

We define spacings on the uniform grids as:

$$\Delta t = \frac{T}{N_t}, \qquad \Delta x = \frac{L_x}{N_x - 1}, \qquad \Delta v = \frac{L_v}{N_v - 1}$$

Here $T\in\mathbb{R}^+$ is the simulation time, $L_x = |\mathcal{D}_x| = b_x - a_x$ and $L_v = |\mathcal{D}_v| = b_v - a_v$, so that

$$t^n = n\Delta t, \qquad x_i = a_x + i\Delta x, \qquad v_j = a_v + j\Delta v$$

for $\mathcal{D}_x = [a_x,b_x]$, $\mathcal{D}_v = [a_v,b_v]$, where $a_x,a_v,b_x,b_v\in\mathbb{R}$ define the $x$ and $v$ grids.

In this notebook, we are only interested in understanding the shapes of higher-dimensional arrays and how to index them so that any desired entry can be accessed with confidence. Hence, the contents of the array do not matter so long as the elements of each vector can be traced. We create NumPy arrays of strings for easy identification and choose distinct sizes, $N_x = 5$, $N_v = 7$, $N_t = 3$, since we rarely use the same number of gridpoints on more than one grid. The vectors $\underline{x}$, $\underline{v}$, $\underline{t}$ are each 1D, and together they span a 3D object by a straightforward Cartesian product.

Note that $N_t$ is the number of time steps, not the number of time gridpoints, so there are $N_t+1$ time gridpoints, whereas for the phase space variables $N_x$ and $N_v$ denote the total number of gridpoints. Hence we have the following vectors, enumerated with 0-based indexing:

    x = ['x0', 'x1', 'x2', 'x3', 'x4']
    v = ['v0', 'v1', 'v2', 'v3', 'v4', 'v5', 'v6']
    t = ['t0', 't1', 't2', 't3']
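
As a numerical counterpart to the string labels, the grid definitions above can be sketched with `np.linspace`. The domain bounds `a_x`, `b_x`, `a_v`, `b_v` and the final time `T` below are made-up values chosen only for illustration; they do not come from DECSKS.

```python
import numpy as np

# Hypothetical domain bounds and final time, for illustration only
a_x, b_x = -1.0, 1.0
a_v, b_v = -3.0, 3.0
T = 1.5

Nx, Nv, Nt = 5, 7, 3   # Nx, Nv count gridpoints; Nt counts time steps

# Nx gridpoints span Nx - 1 intervals; Nt steps produce Nt + 1 time gridpoints
x_grid = np.linspace(a_x, b_x, Nx)
v_grid = np.linspace(a_v, b_v, Nv)
t_grid = np.linspace(0.0, T, Nt + 1)

dx = (b_x - a_x) / (Nx - 1)
dv = (b_v - a_v) / (Nv - 1)
dt = T / Nt

# Each gridpoint matches the closed-form expressions x_i = a_x + i*dx, etc.
assert np.allclose(x_grid, a_x + np.arange(Nx) * dx)
assert np.allclose(v_grid, a_v + np.arange(Nv) * dv)
assert np.allclose(t_grid, np.arange(Nt + 1) * dt)

print(x_grid.shape, v_grid.shape, t_grid.shape)
```

The asymmetry between `Nx - 1` and `Nt` in the spacings is the same one noted in the text: the phase space grids count points, the time grid counts steps.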
    
Construct the arrays:

In [75]:
import numpy as np

x = np.array(['x0', 'x1', 'x2', 'x3', 'x4'])
v = np.array(['v0', 'v1', 'v2', 'v3', 'v4', 'v5', 'v6'])
t = np.array(['t0', 't1', 't2', 't3'])

Nx,Nv = len(x), len(v)
Nt = len(t) - 1

Next, we create a higher-order object f = f(t,x,v). In a 1D1V Vlasovian electrostatic plasma, f is a density function whose value at each (x,v) gives the density at some time t. Here, layering on more physical significance gets in the way of our indexing exercise. Hence, for clarity we choose f to be the identity function, so that f = f(t,x,v) = (t,x,v) is the 3D object itself. We load the entries in a straightforward way, storing each one as a string that spells out its 3-tuple.

In [76]:
f = np.chararray([Nt+1,Nx,Nv], itemsize = 14)
comma = ','
delimiter_left = '('
delimiter_right = ')'

for n in range(Nt+1):
    for i in range(Nx):
        for j in range(Nv):
            entry = delimiter_left + t[n] + comma + x[i] + comma + v[j] + delimiter_right
            f[n,i,j] = entry
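
For comparison, the same label array can be built without `np.chararray`, which NumPy keeps mainly for backward compatibility. The sketch below uses a nested list comprehension and lets NumPy infer a fixed-width string dtype, so no `itemsize` has to be guessed. The name `f_labels` is introduced here only for illustration.

```python
import numpy as np

x = np.array(['x0', 'x1', 'x2', 'x3', 'x4'])
v = np.array(['v0', 'v1', 'v2', 'v3', 'v4', 'v5', 'v6'])
t = np.array(['t0', 't1', 't2', 't3'])

# Outermost comprehension runs over t, then x, then v, so the axes are (t, x, v)
f_labels = np.array([[['(%s,%s,%s)' % (tn, xi, vj) for vj in v]
                      for xi in x]
                     for tn in t])

print(f_labels.shape)      # (4, 5, 7), i.e. (Nt + 1, Nx, Nv)
print(f_labels[3, 4, 6])   # (t3,x4,v6)
```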

We now do some index slicing to see what kinds of arrays we get. If I try f(t0,x,v) on paper, I would expect to get a 2D array of ordered pairs (x,v) at time t0.

In [77]:
print f[0,:,:]

[['(t0,x0,v0)' '(t0,x0,v1)' '(t0,x0,v2)' '(t0,x0,v3)' '(t0,x0,v4)'
  '(t0,x0,v5)' '(t0,x0,v6)']
 ['(t0,x1,v0)' '(t0,x1,v1)' '(t0,x1,v2)' '(t0,x1,v3)' '(t0,x1,v4)'
  '(t0,x1,v5)' '(t0,x1,v6)']
 ['(t0,x2,v0)' '(t0,x2,v1)' '(t0,x2,v2)' '(t0,x2,v3)' '(t0,x2,v4)'
  '(t0,x2,v5)' '(t0,x2,v6)']
 ['(t0,x3,v0)' '(t0,x3,v1)' '(t0,x3,v2)' '(t0,x3,v3)' '(t0,x3,v4)'
  '(t0,x3,v5)' '(t0,x3,v6)']
 ['(t0,x4,v0)' '(t0,x4,v1)' '(t0,x4,v2)' '(t0,x4,v3)' '(t0,x4,v4)'
  '(t0,x4,v5)' '(t0,x4,v6)']]


This checks out. If I try f(t0,x,v1), I would expect to get a 1D array of all x values with velocity v1 at time t0.

In [78]:
print f[0,:,1]

['(t0,x0,v1)' '(t0,x1,v1)' '(t0,x2,v1)' '(t0,x3,v1)' '(t0,x4,v1)']


This also checks out. If I try f(t3,x4,v), I would expect to get a 1D array of all v at location x4 at time t3.

In [79]:
print f[3,4,:]

['(t3,x4,v0)' '(t3,x4,v1)' '(t3,x4,v2)' '(t3,x4,v3)' '(t3,x4,v4)'
 '(t3,x4,v5)' '(t3,x4,v6)']


This also checks out; everything works as expected so far. If I try f(t,x2,v6), I expect to get a 1D array of all times corresponding to the pair (x2,v6). This sounds a little silly in this language, but in an actual problem it would correspond to the density at (x2,v6) over all times in the simulation.

In [80]:
print f[:,2,6]

['(t0,x2,v6)' '(t1,x2,v6)' '(t2,x2,v6)' '(t3,x2,v6)']
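
The four slices above follow one rule: every integer index removes its axis and every `:` keeps it. A minimal sketch with a numeric stand-in of the same shape (the array `f_num` is hypothetical and its values carry no meaning):

```python
import numpy as np

Nt, Nx, Nv = 3, 5, 7
f_num = np.arange((Nt + 1) * Nx * Nv).reshape(Nt + 1, Nx, Nv)

print(f_num[0, :, :].shape)    # (5, 7): one time page, all (x, v)
print(f_num[0, :, 1].shape)    # (5,):   all x at fixed t and v
print(f_num[3, 4, :].shape)    # (7,):   all v at fixed t and x
print(f_num[:, 2, 6].shape)    # (4,):   all t at fixed (x, v)

# A length-1 slice keeps the axis instead of dropping it
print(f_num[0:1, :, :].shape)  # (1, 5, 7)

# Basic slices are views: writing through one modifies f_num itself
page = f_num[0, :, :]
page[0, 0] = -1
print(f_num[0, 0, 0])          # -1
```

The view behaviour matters for an update pattern such as `g[:, j] = ...`, since assigning into a slice writes straight into the parent array.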


In [83]:
print x
print v
print t

['x0' 'x1' 'x2' 'x3' 'x4']
['v0' 'v1' 'v2' 'v3' 'v4' 'v5' 'v6']
['t0' 't1' 't2' 't3']


Next, we try to recast the 1D array multiplications with CFL numbers at a given x or v so that they are done in one sweep as an array (matrix) multiplication.

To examine how array multiplication transpires, we require numerical entries. We set up

\begin{eqnarray*}
\underline{x} & = & (x_0, x_1, x_2, x_3, x_4) = (10, 11, 12, 13, 14) \\
&& \\
\underline{v} & = & (v_0, v_1, v_2, v_3, v_4, v_5, v_6) = (20, 21, 22, 23, 24, 25, 26) \\
&& \\
\underline{t} & = & (t_0, t_1, t_2, t_3) = (0, 1, 2, 3)
\end{eqnarray*}

And construct the array f as

$$\underline{\underline{f}} = \left\{
\left(\begin{array}{ccc ccc c} 
x_0v_0 & x_0v_1 & x_0v_2 & x_0v_3 & x_0v_4 & x_0v_5 &x_0v_6 \\
x_1v_0 & x_1v_1 & x_1v_2 & x_1v_3 & x_1v_4 & x_1v_5 &x_1v_6 \\
x_2v_0 & x_2v_1 & x_2v_2 & x_2v_3 & x_2v_4 & x_2v_5 &x_2v_6 \\
x_3v_0 & x_3v_1 & x_3v_2 & x_3v_3 & x_3v_4 & x_3v_5 &x_3v_6 \\
x_4v_0 & x_4v_1 & x_4v_2 & x_4v_3 & x_4v_4 & x_4v_5 &x_4v_6 
\end{array}
\right)_{t = t_0}\\
\left(\begin{array}{ccc ccc c} 
x_0v_0 & x_0v_1 & x_0v_2 & x_0v_3 & x_0v_4 & x_0v_5 &x_0v_6 \\
x_1v_0 & x_1v_1 & x_1v_2 & x_1v_3 & x_1v_4 & x_1v_5 &x_1v_6 \\
x_2v_0 & x_2v_1 & x_2v_2 & x_2v_3 & x_2v_4 & x_2v_5 &x_2v_6 \\
x_3v_0 & x_3v_1 & x_3v_2 & x_3v_3 & x_3v_4 & x_3v_5 &x_3v_6 \\
x_4v_0 & x_4v_1 & x_4v_2 & x_4v_3 & x_4v_4 & x_4v_5 &x_4v_6 
\end{array}
\right)_{t = t_1}\\
\left(\begin{array}{ccc ccc c} 
x_0v_0 & x_0v_1 & x_0v_2 & x_0v_3 & x_0v_4 & x_0v_5 &x_0v_6 \\
x_1v_0 & x_1v_1 & x_1v_2 & x_1v_3 & x_1v_4 & x_1v_5 &x_1v_6 \\
x_2v_0 & x_2v_1 & x_2v_2 & x_2v_3 & x_2v_4 & x_2v_5 &x_2v_6 \\
x_3v_0 & x_3v_1 & x_3v_2 & x_3v_3 & x_3v_4 & x_3v_5 &x_3v_6 \\
x_4v_0 & x_4v_1 & x_4v_2 & x_4v_3 & x_4v_4 & x_4v_5 &x_4v_6 
\end{array}
\right)_{t = t_2} \\
\left(\begin{array}{ccc ccc c} 
x_0v_0 & x_0v_1 & x_0v_2 & x_0v_3 & x_0v_4 & x_0v_5 &x_0v_6 \\
x_1v_0 & x_1v_1 & x_1v_2 & x_1v_3 & x_1v_4 & x_1v_5 &x_1v_6 \\
x_2v_0 & x_2v_1 & x_2v_2 & x_2v_3 & x_2v_4 & x_2v_5 &x_2v_6 \\
x_3v_0 & x_3v_1 & x_3v_2 & x_3v_3 & x_3v_4 & x_3v_5 &x_3v_6 \\
x_4v_0 & x_4v_1 & x_4v_2 & x_4v_3 & x_4v_4 & x_4v_5 &x_4v_6 
\end{array}
\right)_{t = t_3}
\right\}$$

Here, $t$ indexes the third dimension. To help with our intuition, we can regard $t$ as enumerating "pages" of 2D arrays $x_iv_j = x_i(t)v_j(t)$. In DECSKS, we begin with a container f = np.zeros([t.Ngridpoints, x.N, v.N]) and initialize the problem by filling the density at time zero, f[0, :, :] $\equiv \underline{\underline{f}}(0,\underline{x},\underline{v})$. Time stepping must then fill in the remaining entries f[1:, :, :] $\equiv \underline{\underline{f}}(\underline{t} > 0, \underline{x},\underline{v})$ one page at a time. Thus, the multiplications we are interested in involve 2D arrays. We pick any $t$ above and examine matrices that look like

$$\underline{\underline{g}} = \left(\begin{array}{ccc ccc c} 
x_0v_0 & x_0v_1 & x_0v_2 & x_0v_3 & x_0v_4 & x_0v_5 &x_0v_6 \\
x_1v_0 & x_1v_1 & x_1v_2 & x_1v_3 & x_1v_4 & x_1v_5 &x_1v_6 \\
x_2v_0 & x_2v_1 & x_2v_2 & x_2v_3 & x_2v_4 & x_2v_5 &x_2v_6 \\
x_3v_0 & x_3v_1 & x_3v_2 & x_3v_3 & x_3v_4 & x_3v_5 &x_3v_6 \\
x_4v_0 & x_4v_1 & x_4v_2 & x_4v_3 & x_4v_4 & x_4v_5 &x_4v_6 
\end{array}
\right)$$

We take this to be an object computed as the dyadic $\underline{x}\,\underline{v}$ so that we can easily keep track of the multiplication that transpires. We construct the array g below:

In [92]:
import numpy as np

x = np.array([10, 11, 12, 13, 14])
v = np.array([20, 21, 22, 23, 24, 25, 26])

g = np.outer(x,v)

We check the entries have been computed properly in the following loop:

In [104]:
for i in range(len(x)):
    for j in range(len(v)):
        entry = x[i]*v[j]
        
        if g[i,j] == entry:
            print "TRUE: g[%d,%d] = x[%d]v[%d] = %d*%d = %d" % (i,j, i,j, x[i], v[j], entry)
        else:
            print "FALSE: the entry is not the expected product"

TRUE: g[0,0] = x[0]v[0] = 10*20 = 200
TRUE: g[0,1] = x[0]v[1] = 10*21 = 210
TRUE: g[0,2] = x[0]v[2] = 10*22 = 220
TRUE: g[0,3] = x[0]v[3] = 10*23 = 230
TRUE: g[0,4] = x[0]v[4] = 10*24 = 240
TRUE: g[0,5] = x[0]v[5] = 10*25 = 250
TRUE: g[0,6] = x[0]v[6] = 10*26 = 260
TRUE: g[1,0] = x[1]v[0] = 11*20 = 220
TRUE: g[1,1] = x[1]v[1] = 11*21 = 231
TRUE: g[1,2] = x[1]v[2] = 11*22 = 242
TRUE: g[1,3] = x[1]v[3] = 11*23 = 253
TRUE: g[1,4] = x[1]v[4] = 11*24 = 264
TRUE: g[1,5] = x[1]v[5] = 11*25 = 275
TRUE: g[1,6] = x[1]v[6] = 11*26 = 286
TRUE: g[2,0] = x[2]v[0] = 12*20 = 240
TRUE: g[2,1] = x[2]v[1] = 12*21 = 252
TRUE: g[2,2] = x[2]v[2] = 12*22 = 264
TRUE: g[2,3] = x[2]v[3] = 12*23 = 276
TRUE: g[2,4] = x[2]v[4] = 12*24 = 288
TRUE: g[2,5] = x[2]v[5] = 12*25 = 300
TRUE: g[2,6] = x[2]v[6] = 12*26 = 312
TRUE: g[3,0] = x[3]v[0] = 13*20 = 260
TRUE: g[3,1] = x[3]v[1] = 13*21 = 273
TRUE: g[3,2] = x[3]v[2] = 13*22 = 286
TRUE: g[3,3] = x[3]v[3] = 13*23 = 299
TRUE: g[3,4] = x[3]v[4] = 13*24 = 312
TRUE: g[3,5]

Thus, we have the following array to work with

In [103]:
print 'g = '
print g

print '\n which has dimensions %s' % (str(g.shape))

g = 
[[200 210 220 230 240 250 260]
 [220 231 242 253 264 275 286]
 [240 252 264 276 288 300 312]
 [260 273 286 299 312 325 338]
 [280 294 308 322 336 350 364]]

 which has dimensions (5, 7)
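
The entry-by-entry loop can be condensed into a single array comparison. The sketch below rebuilds the dyadic by broadcasting a column of `x` against a row of `v` and compares it with `np.outer`; the name `g_check` is introduced here for illustration.

```python
import numpy as np

x = np.array([10, 11, 12, 13, 14])
v = np.array([20, 21, 22, 23, 24, 25, 26])

g = np.outer(x, v)

# x[:, None] has shape (5, 1) and v[None, :] has shape (1, 7);
# broadcasting their product gives g_check[i, j] = x[i] * v[j]
g_check = x[:, None] * v[None, :]

print(np.array_equal(g, g_check))   # True
print(g.shape)                      # (5, 7)
```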


Currently, DECSKS updates every 1D subarray in routines that look like the pseudocode below. Note that the phase space variables x and v and the time variable t are instances of classes that carry attributes that fully characterize them: number of gridpoints, prepoint values, postpoints of all the MCs, domain boundaries, etc. The attribute names should be clear enough to follow:

    if convecting in x:
        for j in v.prepoints:
            x.MCs   = x.generate_Lagrangian_mesh(x.prepointvalues, v.prepointvalues[j], frac*t.width)
            g[:,j] = DECSKS.lib.convect.scheme(g[:,j], sim_params, args*)
    
    elif convecting in v:
        compute acceleration a = a(x)
        for i in x.indices:
            v.MCs   = v.generate_Lagrangian_mesh(v.prepointvalues, a[i], frac*t.width)
            g[i,:] = DECSKS.lib.convect.scheme(g[i,:], sim_params, args*)
        
where a is a self-consistently computed acceleration. We wish to recast the above in terms of something to the effect of:

    if convecting in x:
        x.MCs   = x.generate_Lagrangian_mesh(x.prepointvalues, v.prepointvalues, frac*t.width)
        g = DECSKS.lib.convect.scheme(g, sim_params, args*)
    elif convecting in v:
        compute acceleration a = a(x)
        v.MCs   = v.generate_Lagrangian_mesh(v.prepointvalues, a, frac*t.width)
        g = DECSKS.lib.convect.scheme(g, sim_params, args*)
    
where 2D arrays are passed, and x.MCs and v.MCs should be recast from 1D arrays to collections of 1D arrays (i.e. 2D arrays) that store the MC postpoints for all prepoints in x and v. For example, instead of passing v.prepointvalues[j], we pass v.prepointvalues as a whole.
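
To make the intended recasting concrete, here is a minimal sketch. It assumes, purely for illustration, that a postpoint is the prepoint displaced by velocity (or acceleration) times the time step; the actual `generate_Lagrangian_mesh` in DECSKS may compute more than this. The names `dt`, `a`, `X_loop`, `X_post`, `V_loop`, and `V_post` are hypothetical.

```python
import numpy as np

x = np.array([10., 11., 12., 13., 14.])
v = np.array([20., 21., 22., 23., 24., 25., 26.])
dt = 0.01                            # hypothetical time step
a = np.linspace(-1.0, 1.0, len(x))   # hypothetical acceleration a(x_i)

# Column-by-column version, mirroring the loop over j for convection in x
X_loop = np.empty((len(x), len(v)))
for j in range(len(v)):
    X_loop[:, j] = x + v[j] * dt

# Single-sweep version: X_post[i, j] = x[i] + v[j] * dt
X_post = x[:, None] + v[None, :] * dt

# Row-by-row version, mirroring the loop over i for convection in v
V_loop = np.empty((len(x), len(v)))
for i in range(len(x)):
    V_loop[i, :] = v + a[i] * dt

# Single-sweep version: V_post[i, j] = v[j] + a[i] * dt
V_post = v[None, :] + a[:, None] * dt

print(np.allclose(X_loop, X_post), np.allclose(V_loop, V_post))
```

Both 2D results keep the (x, v) axis order of g, so each column of `X_post` and each row of `V_post` is exactly what the corresponding loop iteration produced.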

In [115]:
print x
print v

[10 11 12 13 14]
[20 21 22 23 24 25 26]


In [133]:

one = np.ones(len(v))
X = np.outer(x,one)

X + v*.01

array([[ 10.2 ,  10.21,  10.22,  10.23,  10.24,  10.25,  10.26],
       [ 11.2 ,  11.21,  11.22,  11.23,  11.24,  11.25,  11.26],
       [ 12.2 ,  12.21,  12.22,  12.23,  12.24,  12.25,  12.26],
       [ 13.2 ,  13.21,  13.22,  13.23,  13.24,  13.25,  13.26],
       [ 14.2 ,  14.21,  14.22,  14.23,  14.24,  14.25,  14.26]])

In [172]:
onet = np.ones([1,len(v)])

In [173]:
onet.shape

(1, 7)

In [175]:
onet[0,1]

1.0

In [176]:
onett = np.ones([1,len(x)])

In [190]:
np.outer(x,np.ones([1,len(v)])) + np.outer(np.ones([len(x),1]),v*.01)

array([[ 10.2 ,  10.21,  10.22,  10.23,  10.24,  10.25,  10.26],
       [ 11.2 ,  11.21,  11.22,  11.23,  11.24,  11.25,  11.26],
       [ 12.2 ,  12.21,  12.22,  12.23,  12.24,  12.25,  12.26],
       [ 13.2 ,  13.21,  13.22,  13.23,  13.24,  13.25,  13.26],
       [ 14.2 ,  14.21,  14.22,  14.23,  14.24,  14.25,  14.26]])

In [188]:
print np.transpose(x).shape
print x.shape

(5,)
(5,)


In [194]:
z = np.zeros([1,5])
print z.shape
print np.transpose(z).shape

(1, 5)
(5, 1)
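
The shapes above show that transposing a 1D array is a no-op: a column vector needs an explicit second axis. With that axis in place, broadcasting reproduces the `np.outer`-with-ones construction directly. A short sketch (the names `x_col`, `shifted`, and `reference` are introduced here for illustration):

```python
import numpy as np

x = np.array([10, 11, 12, 13, 14])
v = np.array([20, 21, 22, 23, 24, 25, 26])

print(x.T.shape)               # (5,): a 1D array has no second axis to swap

x_col = x[:, np.newaxis]       # shape (5, 1); x.reshape(-1, 1) is equivalent
print(x_col.shape)

# (5, 1) + (7,) broadcasts to (5, 7) without building arrays of ones
shifted = x_col + v * 0.01

reference = np.outer(x, np.ones(len(v))) + np.outer(np.ones(len(x)), v * 0.01)
print(np.allclose(shifted, reference))   # True
```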


In [198]:
lst = ['a', 'c', 'd']

if 'b' in lst:
    print 'yes'
else:
    print 'no'

no


In [199]:
import numpy as np
import numpy.ma as ma
x = np.array([1, 2, 3, -1, 5])

In [200]:
mx = ma.masked_array(x, mask=[0, 0, 0, 1, 0])


In [204]:
mx.mean()

2.75

In [205]:
x.mean()

2.0

In [206]:
(1 + 2 + 3 + 5) / 4.

2.75

In [207]:
(1 + 2 + 3 - 1 + 5) / 5.

2.0

In [208]:
len(mx)

5

In [209]:
len(x)

5

In [210]:
print mx

[1 2 3 -- 5]


In [211]:
sum(mx)

masked
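
The `masked` result arises because Python's built-in `sum` adds the elements one at a time, and adding the masked element turns the running total into `masked`. The array's own `sum` method skips masked entries instead, consistent with `mx.mean()`:

```python
import numpy as np
import numpy.ma as ma

x = np.array([1, 2, 3, -1, 5])
mx = ma.masked_array(x, mask=[0, 0, 0, 1, 0])

print(mx.sum())     # 11, the masked -1 is ignored
print(mx.count())   # 4 unmasked entries
print(mx.mean())    # 2.75 = 11 / 4
```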

In [213]:
mx

masked_array(data = [1 2 3 -- 5],
             mask = [False False False  True False],
       fill_value = 999999)

In [214]:
mx.data

array([ 1,  2,  3, -1,  5])

In [220]:
mx.data[3]

-1

In [221]:
x

array([ 1,  2,  3, -1,  5])

In [223]:
xmask = ma.masked_greater(x,0)

In [224]:
xmask

masked_array(data = [-- -- -- -1 --],
             mask = [ True  True  True False  True],
       fill_value = 999999)

In [225]:
xmask.data

array([ 1,  2,  3, -1,  5])

In [226]:
np.floor(2.5)

2.0

In [231]:
np.ceil(ma.masked_greater_equal(x,0))

masked_array(data = [-- -- -- -1.0 --],
             mask = [ True  True  True False  True],
       fill_value = 1e+20)

In [233]:
b = np.floor(ma.masked_less(x,0))

In [234]:
b.astype(int)

masked_array(data = [1 2 3 -- 5],
             mask = [False False False  True False],
       fill_value = 999999)
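
The last few cells round the nonnegative entries down and the negative entries up on separate masked arrays. One way to combine the two halves, sketched below with made-up non-integer data, is to fill each masked region with zero and add; the result agrees with `np.trunc`. Whether DECSKS needs exactly this rounding is an assumption here, not something the notebook states, and the names `y`, `pos`, `neg`, and `rounded` are hypothetical.

```python
import numpy as np
import numpy.ma as ma

# Made-up sample with fractional parts, so the rounding is visible
y = np.array([1.5, 2.25, -0.5, -1.75, 3.0])

# floor applies only where y >= 0 (negative entries are masked)
pos = np.floor(ma.masked_less(y, 0))
# ceil applies only where y < 0 (nonnegative entries are masked)
neg = np.ceil(ma.masked_greater_equal(y, 0))

# Replace each masked region with 0 and add the two halves
rounded = pos.filled(0) + neg.filled(0)

print(rounded)
print(np.array_equal(rounded, np.trunc(y)))
```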