# NumPy: Numerical Python package

- [NumPy](https://www.numpy.org) is perhaps the most important package for numerical computing in Python
- the n-dimensional array in NumPy is used as a basic object for data exchange in most Python packages
  - we will look at its methods and semantics through examples, starting today
- some of the most important features of NumPy
  - ndarray: multidimensional array for fast and efficient array-oriented operations and arithmetic
  - mathematical functions for fast operations on arrays without explicit loops
  - tools for I/O to and from disk
  - linear algebra
  - random number generation
  - API to connect NumPy with C and C++ libraries
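A few of the features listed above can be sketched in a handful of lines. The matrix, the seed and the in-memory buffer below are illustrative assumptions, not part of the original notes.

```python
import io
import numpy as np

# vectorised math: no explicit loop over the elements
angles = np.linspace(0.0, np.pi, 5)
sines = np.sin(angles)

# linear algebra: solve the 2x2 system A @ sol = b
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
sol = np.linalg.solve(A, b)

# random generation with a seeded generator (seed chosen arbitrarily)
rng = np.random.default_rng(0)
samples = rng.random(3)

# I/O: save to and load from a buffer (a file name works the same way)
buf = io.BytesIO()
np.save(buf, A)
buf.seek(0)
A_loaded = np.load(buf)

print(sol)
print(np.array_equal(A, A_loaded))
```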

## NumPy ndarray: a multidimensional array object
While we will often use it for 2D and 3D calculations, an ndarray can have an arbitrary number of dimensions.

Let's first see how much faster an ndarray is than a list.

In [1]:
import numpy as np

my_arr = np.arange(100000)
print(type(my_arr))

my_list = list(range(100000))
print(type(my_list))

%time for _ in range(10): my_arr2 = my_arr * 2

%time for _ in range(10): my_list2 = [ x*2 for x in my_list ]
    

<class 'numpy.ndarray'>
<class 'list'>
CPU times: user 1.4 ms, sys: 698 µs, total: 2.1 ms
Wall time: 1.86 ms
CPU times: user 67.5 ms, sys: 14.3 ms, total: 81.8 ms
Wall time: 81.9 ms


__NumPy-based algorithms are generally 10 to 100 times faster than their pure Python counterparts!__
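`%time` is an IPython magic and only works in a notebook or an IPython shell. A rough equivalent for a plain Python script is the standard `timeit` module. The sketch below assumes the same array size as above, and the measured ratio will depend on the machine.

```python
import timeit
import numpy as np

my_arr = np.arange(100000)
my_list = list(range(100000))

# time 10 repetitions of each approach
t_arr = timeit.timeit(lambda: my_arr * 2, number=10)
t_list = timeit.timeit(lambda: [x * 2 for x in my_list], number=10)

print("ndarray: %.4f s   list: %.4f s" % (t_arr, t_list))
```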

In [2]:
import numpy as np
arr2 = np.array( [ [ 1,2,3], [4,5,6], [7,8,9]  ] )
print(arr2)

[[1 2 3]
 [4 5 6]
 [7 8 9]]


In [3]:
arr3 = np.random.rand(2,5)
print(arr3)
arr4 = np.random.rand(2,2,2)
print(arr4)

[[0.84427596 0.27861956 0.25615588 0.11302969 0.36882147]
 [0.12166928 0.45103546 0.1540956  0.51679095 0.32092015]]
[[[0.54182932 0.57173973]
  [0.35215243 0.66480996]]

 [[0.23963299 0.23191085]
  [0.60556823 0.37112058]]]


## creating ndarrays
As seen above, the NumPy array-creation functions return an ndarray. You only need to specify the shape of the array to be created, that is, its size along each dimension.

### uniform random numbers

In [19]:
np.random.rand(10,3)

array([[0.96153329, 0.68227069, 0.74699186],
       [0.56530556, 0.78139006, 0.98501338],
       [0.0031835 , 0.75006218, 0.28725375],
       [0.4853958 , 0.93083974, 0.77128018],
       [0.71460934, 0.13556474, 0.91235606],
       [0.87502189, 0.33686751, 0.64443819],
       [0.81927906, 0.5344741 , 0.85360352],
       [0.73920228, 0.79252824, 0.31396606],
       [0.80641295, 0.76174506, 0.5673283 ],
       [0.42876781, 0.3999522 , 0.20277509]])
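The numbers above change at every run. When reproducible values are needed, one option is NumPy's `Generator` interface, sketched here with an arbitrary seed (the seed `42` is an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)      # seeded generator
u = rng.random((10, 3))              # uniform in [0, 1), same shape as above

rng_again = np.random.default_rng(42)
u_again = rng_again.random((10, 3))  # same seed gives the same numbers

print(u.shape, np.array_equal(u, u_again))
```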

### array of zeros

In [21]:
v = np.zeros(3)
print(v)

[0. 0. 0.]


In [23]:
A = np.zeros( (3,4) )
print(A)

[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]


### arrays of ones

In [24]:
w = np.ones(4)
print(w)

[1. 1. 1. 1.]


In [26]:
B = np.ones ( (3,4))
print(B)


[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]


In [27]:
z = np.full( 4, fill_value=3.12)
print(z)

[3.12 3.12 3.12 3.12]


In [29]:
C = np.full( (4,5), fill_value=-4.3)
print(C)

[[-4.3 -4.3 -4.3 -4.3 -4.3]
 [-4.3 -4.3 -4.3 -4.3 -4.3]
 [-4.3 -4.3 -4.3 -4.3 -4.3]
 [-4.3 -4.3 -4.3 -4.3 -4.3]]


### Identity array

In [74]:
data = np.identity(3)
print(data)

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]


## Shape of arrays
Each ndarray is characterized by its
- shape
- size 
- type of data

In [75]:
data = np.array([ [-1., 2.3], [2.3, 4.5], [-8.4, 1.9] ])
print(data)
print(type(data))
print(data.ndim)
print(data.shape)
print(data.dtype)

[[-1.   2.3]
 [ 2.3  4.5]
 [-8.4  1.9]]
<class 'numpy.ndarray'>
2
(3, 2)
float64
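The cell above prints the number of dimensions, the shape and the data type. The total size is available as well; a small sketch using the same array:

```python
import numpy as np

data = np.array([[-1., 2.3], [2.3, 4.5], [-8.4, 1.9]])

print(data.size)      # total number of elements: 3 * 2
print(data.itemsize)  # bytes per element (8 for float64)
print(data.nbytes)    # size * itemsize
```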


## Reshaping arrays

In [127]:
data = np.arange(1,101)
data

array([  1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,  13,
        14,  15,  16,  17,  18,  19,  20,  21,  22,  23,  24,  25,  26,
        27,  28,  29,  30,  31,  32,  33,  34,  35,  36,  37,  38,  39,
        40,  41,  42,  43,  44,  45,  46,  47,  48,  49,  50,  51,  52,
        53,  54,  55,  56,  57,  58,  59,  60,  61,  62,  63,  64,  65,
        66,  67,  68,  69,  70,  71,  72,  73,  74,  75,  76,  77,  78,
        79,  80,  81,  82,  83,  84,  85,  86,  87,  88,  89,  90,  91,
        92,  93,  94,  95,  96,  97,  98,  99, 100])

In [128]:
mat1 = data.reshape(4,25)
mat1

array([[  1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,  13,
         14,  15,  16,  17,  18,  19,  20,  21,  22,  23,  24,  25],
       [ 26,  27,  28,  29,  30,  31,  32,  33,  34,  35,  36,  37,  38,
         39,  40,  41,  42,  43,  44,  45,  46,  47,  48,  49,  50],
       [ 51,  52,  53,  54,  55,  56,  57,  58,  59,  60,  61,  62,  63,
         64,  65,  66,  67,  68,  69,  70,  71,  72,  73,  74,  75],
       [ 76,  77,  78,  79,  80,  81,  82,  83,  84,  85,  86,  87,  88,
         89,  90,  91,  92,  93,  94,  95,  96,  97,  98,  99, 100]])

In [129]:
mat2 = mat1.reshape(10,10)
mat2

array([[  1,   2,   3,   4,   5,   6,   7,   8,   9,  10],
       [ 11,  12,  13,  14,  15,  16,  17,  18,  19,  20],
       [ 21,  22,  23,  24,  25,  26,  27,  28,  29,  30],
       [ 31,  32,  33,  34,  35,  36,  37,  38,  39,  40],
       [ 41,  42,  43,  44,  45,  46,  47,  48,  49,  50],
       [ 51,  52,  53,  54,  55,  56,  57,  58,  59,  60],
       [ 61,  62,  63,  64,  65,  66,  67,  68,  69,  70],
       [ 71,  72,  73,  74,  75,  76,  77,  78,  79,  80],
       [ 81,  82,  83,  84,  85,  86,  87,  88,  89,  90],
       [ 91,  92,  93,  94,  95,  96,  97,  98,  99, 100]])

Note that the original data array has not been modified

In [130]:
data

array([  1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,  13,
        14,  15,  16,  17,  18,  19,  20,  21,  22,  23,  24,  25,  26,
        27,  28,  29,  30,  31,  32,  33,  34,  35,  36,  37,  38,  39,
        40,  41,  42,  43,  44,  45,  46,  47,  48,  49,  50,  51,  52,
        53,  54,  55,  56,  57,  58,  59,  60,  61,  62,  63,  64,  65,
        66,  67,  68,  69,  70,  71,  72,  73,  74,  75,  76,  77,  78,
        79,  80,  81,  82,  83,  84,  85,  86,  87,  88,  89,  90,  91,
        92,  93,  94,  95,  96,  97,  98,  99, 100])
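`reshape` leaves the shape of `data` untouched, but for a contiguous array like this one the result is a view on the same memory, so writing into the reshaped array would also change `data`. A minimal sketch that checks this with `np.shares_memory`, and also shows `-1` as a placeholder for a dimension to be inferred:

```python
import numpy as np

data = np.arange(1, 101)
mat = data.reshape(4, 25)

print(data.shape, mat.shape)          # data keeps its 1-D shape
print(np.shares_memory(data, mat))    # the reshaped array is a view here

# -1 lets NumPy infer one dimension from the total size
auto = data.reshape(-1, 20)
print(auto.shape)
```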

You can create an array with the same shape as an existing array

In [135]:
data2 = np.ones_like(data)
print(data2)

[1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]


In [112]:
np.zeros_like(data)

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [114]:
np.full_like(data, -4.5)

array([-4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4,
       -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4,
       -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4,
       -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4,
       -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4,
       -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4])

Since the array used for the shape contains integers, the new array is also made of integers, even though you provided `-4.5` as fill_value. If you need a float array, you have to specify the dtype explicitly:

In [133]:
np.full_like(data, fill_value=-4.5,dtype='float64')

array([-4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5,
       -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5,
       -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5,
       -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5,
       -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5,
       -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5,
       -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5,
       -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5,
       -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5,
       -4.5])

## Operations with ndarray
You can use an ndarray in the basic mathematical operations typically used with scalars. The operations are applied element by element.



### Arithmetics with arrays

In [134]:
data = np.arange(1,17).reshape(4,4)
print(data)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]
 [13 14 15 16]]


In [136]:
data2 = np.ones_like(data)
print(data2)

[[1 1 1 1]
 [1 1 1 1]
 [1 1 1 1]
 [1 1 1 1]]


In [137]:
data + data2

array([[ 2,  3,  4,  5],
       [ 6,  7,  8,  9],
       [10, 11, 12, 13],
       [14, 15, 16, 17]])

In [138]:
data - data2

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

Division by an array applies the division to each element. It is not the same as inverting a matrix.

In [139]:
1./data

array([[1.        , 0.5       , 0.33333333, 0.25      ],
       [0.2       , 0.16666667, 0.14285714, 0.125     ],
       [0.11111111, 0.1       , 0.09090909, 0.08333333],
       [0.07692308, 0.07142857, 0.06666667, 0.0625    ]])

In [142]:
10 / data

array([[10.        ,  5.        ,  3.33333333,  2.5       ],
       [ 2.        ,  1.66666667,  1.42857143,  1.25      ],
       [ 1.11111111,  1.        ,  0.90909091,  0.83333333],
       [ 0.76923077,  0.71428571,  0.66666667,  0.625     ]])
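To make the difference between element-wise division and matrix inversion concrete, here is a small sketch. The 4x4 `data` array above happens to be singular, so a different small matrix `M` is assumed here for illustration:

```python
import numpy as np

M = np.array([[2.0, 1.0], [1.0, 3.0]])   # illustrative invertible matrix

elementwise = 1.0 / M            # reciprocal of each element
inverse = np.linalg.inv(M)       # matrix inverse

print(elementwise)
print(inverse)
print(np.allclose(M @ inverse, np.eye(2)))      # True
print(np.allclose(M @ elementwise, np.eye(2)))  # False
```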

Adding a scalar to an array adds the same value to every element

In [144]:
3+data

array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19]])

In [146]:
3.67+ np.sin(data)/data

array([[4.51147098, 4.12464871, 3.71704   , 3.48079938],
       [3.47821515, 3.62343075, 3.76385523, 3.79366978],
       [3.71579094, 3.61559789, 3.5790918 , 3.62528559],
       [3.70232054, 3.74075767, 3.71335252, 3.65200604]])

### functions
To apply mathematical functions to arrays you have to use the NumPy versions, so __np.sin__ instead of __math.sin__
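A quick way to see why is to try both on the same array. The values below are made up for illustration:

```python
import math
import numpy as np

values = np.array([0.0, np.pi / 2, np.pi])

print(np.sin(values))        # works element by element

try:
    math.sin(values)         # math.sin expects a single number
except TypeError as err:
    print("math.sin failed:", err)
```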

In [119]:
data = np.arange(1, 101)   # back to the 1-D array with the values 1..100
data**2

array([    1,     4,     9,    16,    25,    36,    49,    64,    81,
         100,   121,   144,   169,   196,   225,   256,   289,   324,
         361,   400,   441,   484,   529,   576,   625,   676,   729,
         784,   841,   900,   961,  1024,  1089,  1156,  1225,  1296,
        1369,  1444,  1521,  1600,  1681,  1764,  1849,  1936,  2025,
        2116,  2209,  2304,  2401,  2500,  2601,  2704,  2809,  2916,
        3025,  3136,  3249,  3364,  3481,  3600,  3721,  3844,  3969,
        4096,  4225,  4356,  4489,  4624,  4761,  4900,  5041,  5184,
        5329,  5476,  5625,  5776,  5929,  6084,  6241,  6400,  6561,
        6724,  6889,  7056,  7225,  7396,  7569,  7744,  7921,  8100,
        8281,  8464,  8649,  8836,  9025,  9216,  9409,  9604,  9801,
       10000])

In [120]:
print(data)

[  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36
  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54
  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71  72
  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90
  91  92  93  94  95  96  97  98  99 100]


Note how applying the operation __does not__ modify data. Instead, a new array is returned. This is the same behavior as with scalars in Python and in other languages.

In [121]:
data_sqr = data**2
print(data_sqr)

[    1     4     9    16    25    36    49    64    81   100   121   144
   169   196   225   256   289   324   361   400   441   484   529   576
   625   676   729   784   841   900   961  1024  1089  1156  1225  1296
  1369  1444  1521  1600  1681  1764  1849  1936  2025  2116  2209  2304
  2401  2500  2601  2704  2809  2916  3025  3136  3249  3364  3481  3600
  3721  3844  3969  4096  4225  4356  4489  4624  4761  4900  5041  5184
  5329  5476  5625  5776  5929  6084  6241  6400  6561  6724  6889  7056
  7225  7396  7569  7744  7921  8100  8281  8464  8649  8836  9025  9216
  9409  9604  9801 10000]


In [122]:
data_2 = np.log(data_sqr) + np.sin( data*np.pi)

In [123]:
data_3 = np.sqrt( data_sqr )
print(data_3)

[  1.   2.   3.   4.   5.   6.   7.   8.   9.  10.  11.  12.  13.  14.
  15.  16.  17.  18.  19.  20.  21.  22.  23.  24.  25.  26.  27.  28.
  29.  30.  31.  32.  33.  34.  35.  36.  37.  38.  39.  40.  41.  42.
  43.  44.  45.  46.  47.  48.  49.  50.  51.  52.  53.  54.  55.  56.
  57.  58.  59.  60.  61.  62.  63.  64.  65.  66.  67.  68.  69.  70.
  71.  72.  73.  74.  75.  76.  77.  78.  79.  80.  81.  82.  83.  84.
  85.  86.  87.  88.  89.  90.  91.  92.  93.  94.  95.  96.  97.  98.
  99. 100.]
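All the expressions above return new arrays. If you do want to modify an array in place, augmented assignment or the `out` argument of a NumPy function can be used. A minimal sketch on a short array chosen for illustration:

```python
import numpy as np

data = np.arange(1, 6)

squared = data ** 2      # new array, data unchanged
print(data, squared)

data *= 2                # in-place: data itself is modified
print(data)

np.add(data, 1, out=data)   # ufuncs can also write into an existing array
print(data)
```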


## Indexing and slicing

In [151]:
data= np.arange(24).reshape(3,8)
data

array([[ 0,  1,  2,  3,  4,  5,  6,  7],
       [ 8,  9, 10, 11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20, 21, 22, 23]])

In [155]:
data[1,:]

array([ 8,  9, 10, 11, 12, 13, 14, 15])

In [154]:
data[:,4]

array([ 4, 12, 20])

In [157]:
data[:2,3:7]

array([[ 3,  4,  5,  6],
       [11, 12, 13, 14]])

Unlike slices of a list, which are copies, slices of an ndarray are views: they refer to the same memory as the original array, unless you explicitly make a copy


In [167]:
x = data[:2,1:3]
print(x)

[[ 1  2]
 [ 9 10]]


In [170]:
x[0]

array([1, 2])

In [171]:
x[0][1]

2

In [172]:
x[0][1] = -5.55
print(x)
print(data)

[[ 1 -5]
 [ 9 10]]
[[ 0  1 -5  3  4  5  6  7]
 [ 8  9 10 11 12 13 14 15]
 [16 17 18 19 20 21 22 23]]


So you have modified not just x, but also the original data array! Note also that `-5.55` was truncated to `-5`, because the array holds integers.

If this is not what you want, you have to create a new copy

In [185]:
data= np.arange(16).reshape(4,4)
data

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

In [186]:
x = np.array( data[0])
print(x)

[0 1 2 3]


In [187]:
x[1:3]

array([1, 2])

In [189]:
x[1:3] = [ -2, -4]
print(x)

[ 0 -2 -4  3]


In [190]:
print(data)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]]
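Besides `np.array(...)`, the `copy()` method also creates an independent array, and `np.shares_memory` tells whether two arrays refer to the same data. A short sketch:

```python
import numpy as np

data = np.arange(16).reshape(4, 4)

view = data[0]            # basic indexing: a view on data
copy = data[0].copy()     # explicit copy with its own memory

print(np.shares_memory(data, view))   # True
print(np.shares_memory(data, copy))   # False

copy[1:3] = [-2, -4]      # does not touch data
view[0] = 99              # does change data[0, 0]
print(data[0])
```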


## converting and casting
When called with integer arguments, arange() returns an array of integer type.

In [191]:
data= np.arange(42).reshape(6,7)
data

array([[ 0,  1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12, 13],
       [14, 15, 16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25, 26, 27],
       [28, 29, 30, 31, 32, 33, 34],
       [35, 36, 37, 38, 39, 40, 41]])

In [192]:
data[2]

array([14, 15, 16, 17, 18, 19, 20])

Assigning a floating-point value to an integer array truncates it, so sometimes you might want to change the type of the array to do floating-point calculations

In [194]:
data[2][0] = -np.pi
data

array([[ 0,  1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12, 13],
       [-3, 15, 16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25, 26, 27],
       [28, 29, 30, 31, 32, 33, 34],
       [35, 36, 37, 38, 39, 40, 41]])

In [195]:
data_float = data.astype( np.float64)
data_float

array([[ 0.,  1.,  2.,  3.,  4.,  5.,  6.],
       [ 7.,  8.,  9., 10., 11., 12., 13.],
       [-3., 15., 16., 17., 18., 19., 20.],
       [21., 22., 23., 24., 25., 26., 27.],
       [28., 29., 30., 31., 32., 33., 34.],
       [35., 36., 37., 38., 39., 40., 41.]])

In [197]:
data_float[4][5] = -1.24
data_float

array([[ 0.  ,  1.  ,  2.  ,  3.  ,  4.  ,  5.  ,  6.  ],
       [ 7.  ,  8.  ,  9.  , 10.  , 11.  , 12.  , 13.  ],
       [-3.  , 15.  , 16.  , 17.  , 18.  , 19.  , 20.  ],
       [21.  , 22.  , 23.  , 24.  , 25.  , 26.  , 27.  ],
       [28.  , 29.  , 30.  , 31.  , 32.  , -1.24, 34.  ],
       [35.  , 36.  , 37.  , 38.  , 39.  , 40.  , 41.  ]])
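As the `-np.pi` assignment above shows, converting to an integer type drops the fractional part. If rounding to the nearest integer is wanted instead, one possible approach is to call `np.rint` before `astype`. The sample values are made up for illustration:

```python
import numpy as np

vals = np.array([-3.7, -np.pi, 2.4, 7.9])

truncated = vals.astype(np.int64)          # drops the fractional part
rounded = np.rint(vals).astype(np.int64)   # round to nearest first

print(truncated)
print(rounded)
```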

## Boolean arrays
As with all operations, logical operations are also vectorised

In [271]:
data = np.random.normal(1.,0.5, 10)
data

array([0.74211767, 1.47999512, 1.10897368, 1.20607992, 0.24631132,
       0.20653833, 1.23289433, 1.50771445, 1.34696734, 1.5213118 ])

In [272]:
data = np.random.normal(0,1., 10)
data > 0.

array([ True, False,  True, False, False,  True,  True, False,  True,
       False])

Since booleans are automatically converted to 0 and 1, you can easily count the `True` values by using the `sum` function

In [274]:
(data >0).sum()

5
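The same count can be obtained with `np.count_nonzero`, and the mean of the boolean array gives the fraction directly. A sketch with a seeded generator (the seed is an arbitrary assumption, so the numbers differ from the run above):

```python
import numpy as np

rng = np.random.default_rng(1)        # arbitrary seed
data = rng.normal(0.0, 1.0, 10)

mask = data > 0.0
n_pos = mask.sum()                    # True counts as 1
print(n_pos, np.count_nonzero(mask))  # two ways to count
print(mask.mean())                    # fraction of positive values
```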

Logical arrays can be used to index an array!

In this case we want the array with just the positive elements

In [215]:
pos_vals = data[ data> 0. ]
print(type(pos_vals), pos_vals.shape)

<class 'numpy.ndarray'> (5,)


In [216]:
pos_vals[0]

0.4790110229469974

In [222]:
pos_vals[0] = -1

In [223]:
print (pos_vals)

[-1.          0.10287567  0.1759219   1.00980333  0.22562508]


In [224]:
print(data)

[ 0.47901102 -0.29646927 -0.19406709 -1.49848078 -1.79624113  0.10287567
  0.1759219   1.00980333 -1.28679243  0.22562508]


Note how indexing with a boolean array creates a new array, which is not a reference to the original array
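Reading through a boolean mask gives a copy, but assigning through the mask does change the original array. A small sketch with made-up values:

```python
import numpy as np

data = np.array([0.5, -0.3, -0.2, 1.0, -1.3])

pos_vals = data[data > 0.0]   # copy: changing it leaves data alone
pos_vals[0] = -1.0
print(data)

data[data < 0.0] = 0.0        # assignment through the mask changes data
print(data)
```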

In [278]:
mu =1.0
sig = 0.2
nsig = 3.
nvals = 100000
data = np.random.normal(mu,sig, nvals)
print( (abs(data-mu)>nsig*sig).sum()   )
tail = data[ abs(data-mu)>nsig*sig  ]
print(len(tail))
print("fraction of points beyond %.1f sigma: %.1f"%(nsig,100*len(tail)/nvals),"%")

274
274
fraction of points beyond 3.0 sigma: 0.3 %
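The observed fraction can be compared with the expectation for a Gaussian, which for a two-sided tail beyond `nsig` standard deviations is `erfc(nsig / sqrt(2))`. A sketch of this check, with an arbitrary seed assumed for reproducibility:

```python
import math
import numpy as np

nsig = 3.0
# two-sided Gaussian tail probability: P(|x - mu| > nsig * sigma)
expected = math.erfc(nsig / math.sqrt(2.0))

rng = np.random.default_rng(0)        # arbitrary seed
mu, sig, nvals = 1.0, 0.2, 100000
data = rng.normal(mu, sig, nvals)
observed = (np.abs(data - mu) > nsig * sig).mean()

print("expected: %.2f %%   observed: %.2f %%" % (100 * expected, 100 * observed))
```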


## Using NumPy ndarray instead of Lists
We now solve the same projectile problem, but this time using a 2D array, so that a single comprehension computes both x(t) and y(t).

When plotting, you have to use slicing to specify that the first column holds the x values and the second column holds the y values.

In [11]:
%matplotlib notebook
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import numpy as np
import time


# initial conditions
g = 9.8
h = 10.
theta = (30./180.)*np.pi
v0 = 30.
dt=0.01

#compute velocity components
v0x = v0*np.cos(theta)
v0y = v0*np.sin(theta)
print("v0_x: %.1f m/s \t v0_y: %.1f m/s"%(v0x,v0y))

x0 = 0
y0 = h

def x(t):
    return x0+v0x*t

def y(t):
    return y0+v0y*t-0.5*g*t*t


dt = 0.01
# generate list of times for sampling
times = np.arange(0., 1000., dt)

#print first 10 elements
print(times[:10])

# use 2D array to do one comprehension
pos = np.array([ [x(t),y(t)] for t in times if y(t)>=0. ])
print("shape of pos array: ",pos.shape)
# create a figure object
fig = plt.figure()

# add subplot (just 1) and set x and y limits based on data
# ax is the object containing objects to be plotted
ax = fig.add_subplot(111, autoscale_on=False, xlim=(-0.1, max(pos[:,0])*1.2), ylim=(-0.1,max(pos[:,1])*1.2) )
ax.grid()
ax.set_xlabel('x(t) [m]')
ax.set_ylabel("y(t) [m]")
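# raw string keeps the LaTeX backslashes; theta is converted from radians to degrees for the label
plt.title(r"trajectory of a projectile with $v_0$: %.1f m/s, $\Theta_0$: %.1f$^\circ$" % (v0, np.degrees(theta)))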

# plot slices for ndarray
line = ax.plot(pos[:,0], pos[:,1],  lw=2, color='red')
plt.show()


xi = list(pos[:,0])
yi = list(pos[:,1])
print("max height: %.2f at x = %.2f"%(max(yi),xi[yi.index(max(yi))]))


v0_x: 26.0 m/s 	 v0_y: 15.0 m/s
[0.   0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09]
shape of pos array:  (363, 2)



max height: 21.48 at x = 39.75


## Using a function with multiple return values
We now get rid of x(t) and y(t) and replace them with a single function pos(t) returning 2 values

We use ndarray everywhere instead of the list type. However, note that
- to print the position of the maximum, using slices can cause some headache and confusion for whoever reads the code
  - you can create lists xi and yi to make the code more readable
- a slice does not have the same methods as a list. For example, you cannot call `index()` on a slice, so we create a list on the fly: `list(pos[:,1]).index(max(pos[:,1]))`
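The `index()` workaround mentioned above can be avoided with the array method `argmax()`, and the comprehension itself can be replaced by whole-array arithmetic plus a boolean mask. The following is a sketch of that alternative, not the approach used in the cell below. It assumes the same initial conditions, and the names `traj`, `above` and `imax` and the 10 s time window are introduced here for illustration:

```python
import numpy as np

g, h = 9.8, 10.0
theta = np.radians(30.0)
v0, dt = 30.0, 0.01
v0x, v0y = v0 * np.cos(theta), v0 * np.sin(theta)

times = np.arange(0.0, 10.0, dt)        # 10 s is enough for this throw
x = v0x * times                         # whole array at once
y = h + v0y * times - 0.5 * g * times**2

above = y >= 0.0                        # boolean mask replaces the `if`
traj = np.column_stack((x[above], y[above]))

imax = traj[:, 1].argmax()              # index of the highest point
print("shape:", traj.shape)
print("max height: %.2f at x = %.2f" % (traj[imax, 1], traj[imax, 0]))
```

With these settings the result should agree with the values printed by the comprehension-based cells.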

In [13]:
%matplotlib notebook
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import numpy as np
import time


# initial conditions
g = 9.8
h = 10.
theta = (30./180.)*np.pi
v0 = 30.
dt=0.01

#compute velocity components
v0x = v0*np.cos(theta)
v0y = v0*np.sin(theta)
print("v0_x: %.1f m/s \t v0_y: %.1f m/s"%(v0x,v0y))

x0 = 0
y0 = h

def pos(t):
    return x0+v0x*t, y0+v0y*t-0.5*g*t*t


dt = 0.01
# generate list of times for sampling
times = np.arange(0., 1000., dt)

#print first 10 elements
print(times[:10])


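# use 2D array to do one comprehension
# note: the array reuses the name pos, so the function pos(t) is no longer available after this line
pos = np.array([ pos(t) for t in times if pos(t)[1]>=0. ])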

print("shape of pos: ",pos.shape)

# you can create list for xi and yi
#xi = list(pos[:,0])
#yi = list(pos[:,1])
#print("max height: %.2f at x = %.2f"%(max(yi),xi[yi.index(max(yi))]))

# or you can simply use the slicing again. In this case it can be a bit confusing if not familiar
# also note that index() is a method for a list not for slices.
print("max height: %.2f at x = %.2f"%(max(pos[:,1]),pos[ list(pos[:,1]).index(max(pos[:,1])),0 ] ) )



# create a figure object
fig = plt.figure()

# add subplot (just 1) and set x and y limits based on data
# ax is the object containing objects to be plotted
ax = fig.add_subplot(111, autoscale_on=False, xlim=(-0.1, max(pos[:,0])*1.2), ylim=(-0.1,max(pos[:,1])*1.2) )
ax.grid()
ax.set_xlabel('x(t) [m]')
ax.set_ylabel("y(t) [m]")
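# raw string keeps the LaTeX backslashes; theta is converted from radians to degrees for the label
plt.title(r"trajectory of a projectile with $v_0$: %.1f m/s, $\Theta_0$: %.1f$^\circ$" % (v0, np.degrees(theta)))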

# plot slices for ndarray
line = ax.plot(pos[:,0], pos[:,1],  lw=2, color='red')

plt.show()

v0_x: 26.0 m/s 	 v0_y: 15.0 m/s
[0.   0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09]
shape of pos:  (363, 2)
max height: 21.48 at x = 39.75



## Exercise
- extend the problem to 3D and use a 3D plot to show the trajectory in space

# Random walks with ndarrays

In this example we use ndarray to solve the classical random walk problem.

A typical C-like solution uses an explicit loop:

In [311]:
%matplotlib notebook
import numpy as np
import matplotlib.pyplot as plt

nstep = 100

x = 0
position = [x] 


for i in range(nstep):
    step = 1
    if np.random.uniform(0.,1.) < 0.5: step = -1
    x += step
    position.append(x)    

    
plt.plot(position[:nstep])
plt.grid()
plt.xlabel('step')
plt.ylabel('position')



Text(0, 0.5, 'position')

## Random walk with arrays
We now note that the position of the walk is the cumulative sum of the random steps. We can use this to rewrite the problem using arrays only

In [376]:
%matplotlib notebook
import numpy as np
import matplotlib.pyplot as plt

nsteps = 50

- first we draw the random numbers for all `nsteps` at once; think of each one as a coin toss

In [377]:
draws = np.random.randint(0,2, size=nsteps)
print(draws)

[0 1 0 0 0 1 0 1 1 1 0 1 0 1 0 0 1 1 0 0 1 1 0 0 0 1 1 1 0 0 1 0 1 1 0 1 0
 1 1 0 1 1 1 1 0 0 0 0 0 1]


Based on the coin tosses, we decide whether each step is positive or negative

In [378]:
steps = np.where(draws>0, 1, -1)
print(steps)

[-1  1 -1 -1 -1  1 -1  1  1  1 -1  1 -1  1 -1 -1  1  1 -1 -1  1  1 -1 -1
 -1  1  1  1 -1 -1  1 -1  1  1 -1  1 -1  1  1 -1  1  1  1  1 -1 -1 -1 -1
 -1  1]
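Since the draws are only 0 or 1, the same mapping can also be written as plain arithmetic. A sketch comparing the two, with an arbitrary seed assumed:

```python
import numpy as np

rng = np.random.default_rng(3)            # arbitrary seed
draws = rng.integers(0, 2, size=50)       # 0 or 1, like the coin above

steps_where = np.where(draws > 0, 1, -1)
steps_arith = 2 * draws - 1               # maps 0 -> -1 and 1 -> +1

print(np.array_equal(steps_where, steps_arith))
```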


- then we compute the cumulative sum of the steps

In [379]:
walk = steps.cumsum()
print(walk)

[-1  0 -1 -2 -3 -2 -3 -2 -1  0 -1  0 -1  0 -1 -2 -1  0 -1 -2 -1  0 -1 -2
 -3 -2 -1  0 -1 -2 -1 -2 -1  0 -1  0 -1  0  1  0  1  2  3  4  3  2  1  0
 -1  0]


The cumulative sum of an array gives, at each position, the sum of all values up to and including that position.
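On a short array it is easy to check that `cumsum()` matches the running total computed with an explicit loop. The five steps below are made up for illustration:

```python
import numpy as np

steps = np.array([-1, 1, -1, -1, 1])

# explicit running total
total, running = 0, []
for s in steps.tolist():
    total += s
    running.append(total)

print(running)
print(steps.cumsum())
```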

In [380]:
plt.plot(walk)
plt.grid()
plt.xlabel('step')
plt.ylabel('position')




Text(0, 0.5, 'position')


Finding information about the walk and analysing it is now trivial.

The maximum position is

In [381]:
print(walk.max())
print(walk.argmax())
print(len(walk))

4
43
50


`argmax()` returns the first index at which the `max()` value occurs. Similarly for the minimum

In [382]:
print(walk.min())
print(walk.argmin())
print(len(walk))

-3
4
50


But we want to find the location (positive or negative) with the largest distance. This can be easily done with vector operations

In [383]:
print(np.abs(walk))
print(np.abs(walk).max())

[1 0 1 2 3 2 3 2 1 0 1 0 1 0 1 2 1 0 1 2 1 0 1 2 3 2 1 0 1 2 1 2 1 0 1 0 1
 0 1 0 1 2 3 4 3 2 1 0 1 0]
4


In [384]:
print(np.abs(walk).argmax())

43


We can also easily find the step at which the walk reaches a certain position. For example, we want to find when the walk is back at the origin. This is done using boolean arrays

In [385]:
np.abs(walk) == 0

array([False,  True, False, False, False, False, False, False, False,
        True, False,  True, False,  True, False, False, False,  True,
       False, False, False,  True, False, False, False, False, False,
        True, False, False, False, False, False,  True, False,  True,
       False,  True, False,  True, False, False, False, False, False,
       False, False,  True, False,  True])

In [386]:
(np.abs(walk)==0).max()

True

In a boolean array, `max()` is `True` whenever at least one element is `True`. So using `argmax()` we can find the first time the particle returns to the origin

In [387]:
(np.abs(walk)==0).argmax()

1
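One caveat: if the walk never returns to the origin, the boolean array contains only `False` and `argmax()` still returns 0. A possible guard is to check `any()` first, as sketched here on a made-up walk:

```python
import numpy as np

walk = np.array([1, 2, 3, 2, 3, 4])   # illustrative walk that never returns to 0

at_origin = (walk == 0)
print(at_origin.argmax())   # 0, although the walk is not at the origin there

# guard with any() before trusting argmax()
first_return = at_origin.argmax() if at_origin.any() else None
print(first_return)
```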

## Simulating many random walks at once

We now want to simulate `nexp` random walks and study statistics about the number of crossings, the maximum distance, etc. Instead of using nested loops, we can simply use a 2D array to keep track of the steps for `nexp` experiments.

In fact, the code changes in only two places: the drawing of the random numbers that decide the direction of each step, and the axis passed to `cumsum`.

In [401]:
%matplotlib notebook
import numpy as np
import matplotlib.pyplot as plt

nsteps = 100
nexp = 1000

draws = np.random.randint(0,2, size=(nexp,nsteps) )
print(draws.shape)

steps = np.where(draws>0, 1, -1)
print(steps.shape)


walks = steps.cumsum(1)
print(walks.shape)

(1000, 100)
(1000, 100)
(1000, 100)


By calling `cumsum(1)` we sum along dimension 1, that is, across the columns. Recall that each row represents an experiment, and the columns are the steps of that experiment. So by accumulating the steps along each row, we obtain the random walk for each experiment.
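The effect of the axis argument is easier to see on a tiny array. In this made-up example there are 2 experiments with 3 steps each:

```python
import numpy as np

steps = np.array([[1, -1, 1],
                  [-1, -1, 1]])    # 2 experiments, 3 steps each

print(steps.cumsum(axis=1))   # running sum along each row (one walk per row)
print(steps.cumsum(axis=0))   # running sum down each column (mixes experiments)
```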

In [402]:
walks

array([[  1,   2,   1, ...,  -2,  -1,   0],
       [  1,   0,  -1, ...,  -2,  -1,   0],
       [ -1,  -2,  -3, ...,  -4,  -3,  -4],
       ...,
       [  1,   2,   3, ...,  -4,  -5,  -6],
       [ -1,   0,  -1, ...,   2,   3,   4],
       [  1,   2,   1, ...,  -8,  -9, -10]])

The maximum distance ever reached is

In [403]:
np.abs(walks).max()

32

Check whether each walk ever goes back to the origin

In [412]:
hits_home = (np.abs(walks)==0).any(1)


In [413]:
hits10 = (np.abs(walks)==10).any(1)
print(hits_home.sum())

927
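The `hits10` mask computed above can be used to select only the walks that reach a distance of 10 and to find when they first do so. A sketch with a seeded generator (the seed is an arbitrary assumption, so the numbers differ from the run above):

```python
import numpy as np

rng = np.random.default_rng(5)                 # arbitrary seed
nexp, nsteps = 1000, 100
draws = rng.integers(0, 2, size=(nexp, nsteps))
walks = np.where(draws > 0, 1, -1).cumsum(axis=1)

hits10 = (np.abs(walks) == 10).any(axis=1)     # walks that reach distance 10
# for those walks only, index of the first step at distance 10
crossing_times = (np.abs(walks[hits10]) == 10).argmax(axis=1)

print("walks reaching 10:", hits10.sum())
print("mean first-crossing step: %.1f" % crossing_times.mean())
```

Selecting with `walks[hits10]` before calling `argmax` avoids the ambiguous 0 that `argmax` would return for walks that never reach the distance.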


In [None]:
print(walks)

In [406]:
plt.plot(walks[0,:])
plt.grid()
plt.xlabel('step')
plt.ylabel('position')

### Exercise
- use animation to show the experiments in sequence
- simulate the random walk in 3D and compute the fraction of experiments going back to the origin or reaching an arbitrary distance from the origin