# NumPy: Numerical Python package

- [NumPy](https://www.numpy.org) is perhaps the most important package for numerical computing in Python
- the n-dimensional array in NumPy is used as a basic object for data exchange by most Python packages
  - we will look at its methods and semantics starting today through examples
- some of the most important features of NumPy
  - ndarray: multidimensional array for fast and efficient array-oriented operations and arithmetic
  - mathematical functions for fast operations on arrays without using loops and iterations
  - tools for I/O to and from disk
  - linear algebra
  - random number generation
  - API to connect NumPy with C and C++ libraries

## NumPy ndarray: a multidimensional array object
While we will often use it for 2D and 3D calculations, an ndarray can have an arbitrary number of dimensions.

Let's first see how much faster an ndarray is than a Python list.

In [1]:
import numpy as np

my_arr = np.arange(100000)
print(type(my_arr))

my_list = list(range(100000))
print(type(my_list))

%time for _ in range(10): my_arr2 = my_arr * 2

%time for _ in range(10): my_list2 = [ x*2 for x in my_list ]
    

<class 'numpy.ndarray'>
<class 'list'>
CPU times: user 1.84 ms, sys: 673 µs, total: 2.52 ms
Wall time: 2.05 ms
CPU times: user 98 ms, sys: 21.4 ms, total: 119 ms
Wall time: 121 ms


__NumPy-based algorithms are generally 10 to 100 times faster than their pure Python counterparts!__
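
The `%time` magic only works inside IPython or Jupyter. As a rough sketch, the same comparison can be made in a plain Python script with the standard `timeit` module; the variable names `t_numpy` and `t_list` are chosen here for illustration, and the measured times depend on the machine.

```python
import timeit
import numpy as np

n = 100_000
my_arr = np.arange(n)
my_list = list(range(n))

# Time 10 repetitions of each approach, mirroring the %time cells above
t_numpy = timeit.timeit(lambda: my_arr * 2, number=10)
t_list = timeit.timeit(lambda: [x * 2 for x in my_list], number=10)

print(f"NumPy: {t_numpy * 1e3:.2f} ms, list: {t_list * 1e3:.2f} ms")
print(f"speed-up: {t_list / t_numpy:.0f}x")

# Both approaches produce the same values
same_values = (my_arr * 2).tolist() == [x * 2 for x in my_list]
print(same_values)  # True
```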

In [2]:
import numpy as np
arr2 = np.array( [ [ 1,2,3], [4,5,6], [7,8,9]  ] )
print(arr2)

[[1 2 3]
 [4 5 6]
 [7 8 9]]


In [3]:
arr3 = np.random.rand(2,5)
print(arr3)
arr4 = np.random.rand(2,2,2)
print(arr4)

[[0.56758663 0.75430454 0.50160076 0.69635579 0.01572714]
 [0.72879785 0.40165576 0.43786906 0.17067457 0.66499337]]
[[[0.42650681 0.35989558]
  [0.95024834 0.03097878]]

 [[0.62210778 0.07109238]
  [0.31800587 0.03307419]]]


## Creating ndarrays
As seen above, NumPy's array-creation functions return an ndarray. You only need to specify the shape of the array to be created, i.e. its size along each dimension.

### Uniform random numbers

In [4]:
np.random.rand(10,3)

array([[0.33983654, 0.35940453, 0.23689947],
       [0.21462987, 0.99259533, 0.78790307],
       [0.09056752, 0.174153  , 0.07787499],
       [0.06428135, 0.14751294, 0.23600176],
       [0.53396062, 0.66196393, 0.12754598],
       [0.2818487 , 0.25793597, 0.26959377],
       [0.08895102, 0.68758629, 0.13679696],
       [0.05283103, 0.23229202, 0.80167335],
       [0.49833238, 0.43352968, 0.60895795],
       [0.18468357, 0.75829072, 0.21268281]])

In [5]:
np.random.rand(2,3,4)

array([[[0.01928531, 0.26991759, 0.93058517, 0.37111633],
        [0.66837314, 0.1788304 , 0.2992275 , 0.52874933],
        [0.62388296, 0.85077842, 0.68431789, 0.90989868]],

       [[0.7572903 , 0.16773231, 0.36429347, 0.75491156],
        [0.57401456, 0.98833995, 0.12774465, 0.3405537 ],
        [0.90628459, 0.52956919, 0.1950213 , 0.09558987]]])
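
The numbers above change at every run. If reproducible numbers are needed, one option is NumPy's `Generator` interface, created with `np.random.default_rng`. The sketch below assumes an arbitrary seed of 42; the names `rng`, `a` and `b` are illustrative.

```python
import numpy as np

# A seeded generator returns the same sequence every time it is recreated
rng = np.random.default_rng(seed=42)   # the seed value 42 is arbitrary
a = rng.random((2, 5))                 # uniform numbers in [0, 1), shape (2, 5)

rng_again = np.random.default_rng(seed=42)
b = rng_again.random((2, 5))

print(a.shape)               # (2, 5)
print(np.array_equal(a, b))  # True
```

Note that `rng.random` takes the shape as a single tuple, while `np.random.rand` takes one argument per dimension.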

### Arrays of zeros

In [6]:
v = np.zeros(3)
print(v)

[0. 0. 0.]


In [7]:
A = np.zeros((3,4) )
print(A)

[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]


### Arrays of ones

In [8]:
w = np.ones(4)
print(w)

[1. 1. 1. 1.]


In [9]:
B = np.ones ( (3,4))
print(B)

val = 3.6*B
print(val)

[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]
[[3.6 3.6 3.6 3.6]
 [3.6 3.6 3.6 3.6]
 [3.6 3.6 3.6 3.6]]


In [10]:
z = np.full( 4, fill_value=3.12)
print(z)

[3.12 3.12 3.12 3.12]


In [11]:
C = np.full( (4,5), fill_value=-4.3)
print(C)

[[-4.3 -4.3 -4.3 -4.3 -4.3]
 [-4.3 -4.3 -4.3 -4.3 -4.3]
 [-4.3 -4.3 -4.3 -4.3 -4.3]
 [-4.3 -4.3 -4.3 -4.3 -4.3]]


### Identity array

In [12]:
data = np.identity(7)
print(data)

[[1. 0. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0.]
 [0. 0. 1. 0. 0. 0. 0.]
 [0. 0. 0. 1. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0.]
 [0. 0. 0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 0. 0. 1.]]


## Shape of arrays
Each ndarray is characterized by its
- shape
- size 
- type of data

In [13]:
data = np.array([ [-1., 2.3], [2.3, 4.5], [-8.4, 1.9] ])
print(data)
print(type(data))
print(data.ndim)
print(data.shape)
print(data.dtype)

data_int = np.array([ [-1., 2.3], [2.3, 4.5], [-8.4, 1.9] ], dtype=np.int64)
print(data_int)

[[-1.   2.3]
 [ 2.3  4.5]
 [-8.4  1.9]]
<class 'numpy.ndarray'>
2
(3, 2)
float64
[[-1  2]
 [ 2  4]
 [-8  1]]
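
The cell above prints the shape and the type but not the size. A short sketch of the remaining attributes, using the same `data` array:

```python
import numpy as np

data = np.array([[-1., 2.3], [2.3, 4.5], [-8.4, 1.9]])

print(data.shape)     # (3, 2)   rows and columns
print(data.size)      # 6        total number of elements
print(data.dtype)     # float64  type of each element
print(data.itemsize)  # 8        bytes per element
print(data.nbytes)    # 48       size * itemsize
```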


## Reshaping arrays

In [19]:
data = np.arange(1,101)
data

array([  1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,  13,
        14,  15,  16,  17,  18,  19,  20,  21,  22,  23,  24,  25,  26,
        27,  28,  29,  30,  31,  32,  33,  34,  35,  36,  37,  38,  39,
        40,  41,  42,  43,  44,  45,  46,  47,  48,  49,  50,  51,  52,
        53,  54,  55,  56,  57,  58,  59,  60,  61,  62,  63,  64,  65,
        66,  67,  68,  69,  70,  71,  72,  73,  74,  75,  76,  77,  78,
        79,  80,  81,  82,  83,  84,  85,  86,  87,  88,  89,  90,  91,
        92,  93,  94,  95,  96,  97,  98,  99, 100])

In [20]:
mat1 = data.reshape(25,4)
print(mat1)
print(data)

[[  1   2   3   4]
 [  5   6   7   8]
 [  9  10  11  12]
 [ 13  14  15  16]
 [ 17  18  19  20]
 [ 21  22  23  24]
 [ 25  26  27  28]
 [ 29  30  31  32]
 [ 33  34  35  36]
 [ 37  38  39  40]
 [ 41  42  43  44]
 [ 45  46  47  48]
 [ 49  50  51  52]
 [ 53  54  55  56]
 [ 57  58  59  60]
 [ 61  62  63  64]
 [ 65  66  67  68]
 [ 69  70  71  72]
 [ 73  74  75  76]
 [ 77  78  79  80]
 [ 81  82  83  84]
 [ 85  86  87  88]
 [ 89  90  91  92]
 [ 93  94  95  96]
 [ 97  98  99 100]]
[  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36
  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54
  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71  72
  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90
  91  92  93  94  95  96  97  98  99 100]


In [21]:
mat2 = mat1.reshape(10,10)
mat2

array([[  1,   2,   3,   4,   5,   6,   7,   8,   9,  10],
       [ 11,  12,  13,  14,  15,  16,  17,  18,  19,  20],
       [ 21,  22,  23,  24,  25,  26,  27,  28,  29,  30],
       [ 31,  32,  33,  34,  35,  36,  37,  38,  39,  40],
       [ 41,  42,  43,  44,  45,  46,  47,  48,  49,  50],
       [ 51,  52,  53,  54,  55,  56,  57,  58,  59,  60],
       [ 61,  62,  63,  64,  65,  66,  67,  68,  69,  70],
       [ 71,  72,  73,  74,  75,  76,  77,  78,  79,  80],
       [ 81,  82,  83,  84,  85,  86,  87,  88,  89,  90],
       [ 91,  92,  93,  94,  95,  96,  97,  98,  99, 100]])

__Note that the original data array has not been modified__

In [22]:
data

array([  1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,  13,
        14,  15,  16,  17,  18,  19,  20,  21,  22,  23,  24,  25,  26,
        27,  28,  29,  30,  31,  32,  33,  34,  35,  36,  37,  38,  39,
        40,  41,  42,  43,  44,  45,  46,  47,  48,  49,  50,  51,  52,
        53,  54,  55,  56,  57,  58,  59,  60,  61,  62,  63,  64,  65,
        66,  67,  68,  69,  70,  71,  72,  73,  74,  75,  76,  77,  78,
        79,  80,  81,  82,  83,  84,  85,  86,  87,  88,  89,  90,  91,
        92,  93,  94,  95,  96,  97,  98,  99, 100])

In [23]:
data.reshape(9,8)

ValueError: cannot reshape array of size 100 into shape (9,8)
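
The reshape fails because 9 x 8 = 72 does not match the 100 elements of `data`. One way to avoid computing a dimension by hand is to pass `-1` and let NumPy infer it. The sketch below also shows that the reshaped array here is a view sharing memory with `data`: only the shape differs, the values are not copied. The names `mat` and `error_message` are illustrative.

```python
import numpy as np

data = np.arange(1, 101)

# -1 asks NumPy to infer this dimension from the total size
mat = data.reshape(-1, 4)
print(mat.shape)                     # (25, 4)

# The reshaped array shares its memory with the original one
print(np.shares_memory(mat, data))   # True

# An incompatible shape raises ValueError
error_message = None
try:
    data.reshape(9, 8)
except ValueError as err:
    error_message = str(err)
print("reshape failed:", error_message)
```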

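You can create an array with the same shape and type as an existing array. Note that `np.empty_like` does not initialize its entries, so the values it shows are arbitrary.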

In [24]:
np.empty_like(data)

array([4611686018427387904, 4611686018427387904, 3544386187360600198,
       6638917327069928290, 4265009324410024053, 7310868735423492403,
       3491798732661486149, 2322288612321931568, 8223700941521445219,
       6998705354278728549, 2334675642104246898, 3472329327913888115,
       7526395065333082400, 4047673009604227169, 7020660066564776489,
       7809617933841360237, 2482168881968670069, 7310868735959835180,
       8029476550202243618, 8097868449821368436, 2340008625568817253,
       2334406575183128175, 8031165433299087409, 2891422494116836128,
       2314987794142342201, 6714921058457893152, 4611686018427387904,
       4611686018427387904, 3688764892345663609, 2314898811142545452,
       3185787800459687217, 3539864681389826080, 2318290791344122936,
       2336906602263622176, 2331492554444382240, 3184663000064471346,
       3611922275360710688, 2318286397592579124, 3977276743474360864,
       2314898828339191852, 6714923257481148722,                   1,
       2318282003302

In [25]:
data2 = np.ones_like(data)
print(data2)

In [26]:
np.zeros_like(data)

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [27]:
np.full_like(data, -4.5)

array([-4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4,
       -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4,
       -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4,
       -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4,
       -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4,
       -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4])

Since the array used for the shape holds integers, the new array is also made of integers, despite `-4.5` being provided as `fill_value`. If you need a float array, you have to specify the dtype:

In [28]:
np.full_like(data, fill_value=-4.5,dtype=np.float64)

array([-4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5,
       -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5,
       -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5,
       -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5,
       -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5,
       -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5,
       -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5,
       -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5,
       -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5,
       -4.5])

In [29]:
data_2 = 1.*np.full_like(data, -4.5)
print(data_2)

[-4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4.
 -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4.
 -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4.
 -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4.
 -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4. -4.
 -4. -4. -4. -4. -4. -4. -4. -4. -4. -4.]


## Operations with ndarray
You can use an ndarray in the basic mathematical operations typically used with scalars.



### Arithmetics with arrays

In [30]:
data = np.arange(1,17).reshape(4,4)
print(data)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]
 [13 14 15 16]]


In [31]:
data2 = np.ones_like(data)
print(data2)

[[1 1 1 1]
 [1 1 1 1]
 [1 1 1 1]
 [1 1 1 1]]


In [32]:
data + data2

array([[ 2,  3,  4,  5],
       [ 6,  7,  8,  9],
       [10, 11, 12, 13],
       [14, 15, 16, 17]])

In [33]:
data - data2

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

Division by an array applies the division to each element. It is not the same as inverting a matrix.

In [34]:
1./data

array([[1.        , 0.5       , 0.33333333, 0.25      ],
       [0.2       , 0.16666667, 0.14285714, 0.125     ],
       [0.11111111, 0.1       , 0.09090909, 0.08333333],
       [0.07692308, 0.07142857, 0.06666667, 0.0625    ]])
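
To make the difference concrete, the sketch below compares the element-wise reciprocal with the matrix inverse from `np.linalg.inv`. The small matrix `A` is an arbitrary invertible example, not taken from the cells above.

```python
import numpy as np

A = np.array([[2., 1.],
              [1., 3.]])

elementwise = 1. / A            # reciprocal of each element
inverse = np.linalg.inv(A)      # matrix inverse

print(elementwise)
print(inverse)

# Only the matrix inverse gives the identity when multiplied with A
print(np.allclose(A @ inverse, np.eye(2)))      # True
print(np.allclose(A @ elementwise, np.eye(2)))  # False
```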

In [35]:
10 / data

array([[10.        ,  5.        ,  3.33333333,  2.5       ],
       [ 2.        ,  1.66666667,  1.42857143,  1.25      ],
       [ 1.11111111,  1.        ,  0.90909091,  0.83333333],
       [ 0.76923077,  0.71428571,  0.66666667,  0.625     ]])

Adding a scalar to an array adds the same value to every element

In [36]:
3+data

array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19]])

In [37]:
data_5 = 3.67+ np.sin(data)/data

### Functions
You have to use the NumPy functions, e.g. __np.sin__ instead of __math.sin__, because the functions in `math` only accept scalars.
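
A small sketch of what happens with each function; the flag `math_failed` and the name `np_result` are illustrative only.

```python
import math
import numpy as np

data = np.arange(1, 5)

# np.sin is applied to every element and returns a new array
np_result = np.sin(data)
print(np_result)

# math.sin expects a single number and rejects an array with several elements
math_failed = False
try:
    math.sin(data)
except TypeError as err:
    math_failed = True
    print("math.sin failed:", err)
```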

In [38]:
data**2

array([[  1,   4,   9,  16],
       [ 25,  36,  49,  64],
       [ 81, 100, 121, 144],
       [169, 196, 225, 256]])

In [39]:
print(data)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]
 [13 14 15 16]]


Note how applying the function __does not__ modify data. Instead a new array is returned. This is the same behavior as with scalars in Python and in other languages.

In [40]:
data_sqr = data**2
print(data_sqr)

[[  1   4   9  16]
 [ 25  36  49  64]
 [ 81 100 121 144]
 [169 196 225 256]]


In [41]:

data_2 = np.log(data_sqr) + np.sin( data*np.pi)

In [42]:
data_3 = np.sqrt( data_sqr )
print(data_3)

[[ 1.  2.  3.  4.]
 [ 5.  6.  7.  8.]
 [ 9. 10. 11. 12.]
 [13. 14. 15. 16.]]


## Indexing and slicing

In [43]:
data= np.arange(20).reshape(4,5)
data

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

In [44]:
data[2:3,2:3].shape

(1, 1)

In [45]:
data[2:4,2:4].shape

(2, 2)

In [46]:
data[1:,:]

array([[ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

In [47]:
data[:,4]

array([ 4,  9, 14, 19])

In [48]:
data[:2,3:7]

array([[3, 4],
       [8, 9]])

Unlike slices of a list, which are copies, a slice of an ndarray is a view that refers to the data of the original array, unless you explicitly make a copy


In [49]:
x = data[:2,1:3]
print(x)

[[1 2]
 [6 7]]


In [50]:
x[0]

array([1, 2])

In [51]:
x[0][1]

2

In [52]:
x[0][1] = -5.55
print(x)
print(data)

[[ 1 -5]
 [ 6  7]]
[[ 0  1 -5  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]


So you have modified not just x, but also the original data array! Note also that `-5.55` was truncated to `-5`, because the array holds integers.

If this is not what you want, then you have to create a new copy

In [53]:
data= np.arange(16).reshape(4,4)
data

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

In [54]:
x = np.array( data[0])
print(x)

[0 1 2 3]


In [55]:
x[1:3]

array([1, 2])

In [56]:
x[1:3] = [ -2, -4]
print(x)

[ 0 -2 -4  3]


In [57]:
print(data)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]]
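
Besides wrapping a slice in `np.array(...)`, an explicit copy can also be made with the `copy()` method. The sketch below uses `np.shares_memory` to check which arrays refer to the same data; the names `view` and `copy` are illustrative.

```python
import numpy as np

data = np.arange(16).reshape(4, 4)

view = data[:2, 1:3]          # a slice: refers to the data of the original
copy = data[:2, 1:3].copy()   # an independent copy

print(np.shares_memory(data, view))  # True
print(np.shares_memory(data, copy))  # False

copy[0, 0] = -99
print(data[0, 1])   # 1, the original is untouched

view[0, 0] = -99
print(data[0, 1])   # -99, the original is modified through the view
```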


## Converting and casting
When `arange()` is called with integer arguments, the resulting array has an integer type.

In [58]:
data= np.arange(42).reshape(6,7)
data

array([[ 0,  1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12, 13],
       [14, 15, 16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25, 26, 27],
       [28, 29, 30, 31, 32, 33, 34],
       [35, 36, 37, 38, 39, 40, 41]])

In [59]:
data[2]

array([14, 15, 16, 17, 18, 19, 20])

Sometimes you might want to change its type to do floating point calculations. Assigning a float to an integer array does not change the type; the value is truncated instead:

In [60]:
data[2][0] = -np.pi
data

array([[ 0,  1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12, 13],
       [-3, 15, 16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25, 26, 27],
       [28, 29, 30, 31, 32, 33, 34],
       [35, 36, 37, 38, 39, 40, 41]])

In [61]:
data_float = data.astype( np.float64)
data_float

array([[ 0.,  1.,  2.,  3.,  4.,  5.,  6.],
       [ 7.,  8.,  9., 10., 11., 12., 13.],
       [-3., 15., 16., 17., 18., 19., 20.],
       [21., 22., 23., 24., 25., 26., 27.],
       [28., 29., 30., 31., 32., 33., 34.],
       [35., 36., 37., 38., 39., 40., 41.]])

In [62]:
data_float[4][5] = -1.24
data_float

array([[ 0.  ,  1.  ,  2.  ,  3.  ,  4.  ,  5.  ,  6.  ],
       [ 7.  ,  8.  ,  9.  , 10.  , 11.  , 12.  , 13.  ],
       [-3.  , 15.  , 16.  , 17.  , 18.  , 19.  , 20.  ],
       [21.  , 22.  , 23.  , 24.  , 25.  , 26.  , 27.  ],
       [28.  , 29.  , 30.  , 31.  , 32.  , -1.24, 34.  ],
       [35.  , 36.  , 37.  , 38.  , 39.  , 40.  , 41.  ]])
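
Casting in the other direction, from float to integer, drops the fractional part instead of rounding. A brief sketch with made-up values, where `np.rint` is used when rounding to the nearest integer is wanted:

```python
import numpy as np

values = np.array([-3.7, -0.5, 0.5, 2.9])

# astype returns a new array and truncates toward zero
truncated = values.astype(np.int64)
print(truncated.tolist())   # [-3, 0, 0, 2]

# round to the nearest integer first (halves go to the even neighbour)
rounded = np.rint(values).astype(np.int64)
print(rounded.tolist())     # [-4, 0, 0, 3]

# the original array keeps its type and values
print(values.dtype)         # float64
```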

## Boolean arrays
As with all operations, logical operations are also vectorized

In [63]:
data = np.random.normal(0,1., 25).reshape(5,5)
print(data)
data > 0.
np.shape(data[data>0])

[[-0.99021874 -0.20468999 -0.01000783 -1.17858584  1.19494933]
 [ 0.98834929  1.25867633  0.7236783  -0.27397653  0.70001685]
 [ 0.52855297  0.16824097  0.71176472  0.78311828  0.32649156]
 [-0.972799   -1.15828996  0.46969251  1.52831476 -1.07997632]
 [ 0.02369929  0.9143921   0.2087082   0.77422045 -0.55491193]]


(16,)

In [64]:
data = np.random.normal(0,1., 10)
print(data)
data > 0.

[ 0.02228463 -1.4260072   0.12058832 -0.79831769  0.99221701  1.24317065
 -1.02799729 -0.34648788  0.53587557  0.49727269]


array([ True, False,  True, False,  True,  True, False, False,  True,
        True])

Since booleans are automatically converted to 0 and 1, you can easily count them by using the `sum` function

In [65]:
(data >0).sum()

6
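
The same count can be obtained with `np.count_nonzero`, and the mean of a boolean array gives the fraction of `True` values directly. The values below are the ones printed above, rounded to two decimals for this sketch.

```python
import numpy as np

# values from the output above, rounded for illustration
data = np.array([0.02, -1.43, 0.12, -0.80, 0.99, 1.24, -1.03, -0.35, 0.54, 0.50])
mask = data > 0

print(mask.sum())              # 6
print(np.count_nonzero(mask))  # 6
print(mask.mean())             # 0.6, the fraction of positive values
```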

Logical arrays can be used to slice and index an array!

In this case we want the array with just the positive cells

In [66]:
pos_vals = data[ data> 0. ]
print(type(pos_vals), pos_vals.shape)

<class 'numpy.ndarray'> (6,)


In [67]:
pos_vals[0]

0.02228462522490106

In [68]:
pos_vals[0] = -1

In [69]:
print (pos_vals)

[-1.          0.12058832  0.99221701  1.24317065  0.53587557  0.49727269]


In [70]:
print(data)

[ 0.02228463 -1.4260072   0.12058832 -0.79831769  0.99221701  1.24317065
 -1.02799729 -0.34648788  0.53587557  0.49727269]


Note how indexing with a boolean array creates a new array, which is not a reference to the original array
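
If the goal is to change the selected cells of the original array, the boolean mask can be used on the left-hand side of an assignment. A minimal sketch with made-up values:

```python
import numpy as np

data = np.array([0.5, -1.0, 2.0, -3.0])

# Indexing with a mask returns a copy: changing it leaves data untouched
selected = data[data > 0]
selected[0] = 100.
print(data.tolist())   # [0.5, -1.0, 2.0, -3.0]

# Assigning through the mask modifies data in place
data[data > 0] = 0.
print(data.tolist())   # [0.0, -1.0, 0.0, -3.0]
```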

## Example: computing tails of a Gaussian

In [72]:
mu =1.0
sig = 0.2
nsig = 1.
nvals = 10000000
data = np.random.normal(mu,sig, nvals)
print( (abs(data-mu)>nsig*sig).sum()   )
tail = data[ abs(data-mu)>nsig*sig  ]
print(len(tail))
print("fraction of points beyond %.1f sigma: %.1f"%(nsig,100*len(tail)/nvals),"%")

3173098
3173098
fraction of points beyond 1.0 sigma: 31.7 %
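
The measured fraction can be checked against the analytic value: for a Gaussian, the probability of falling more than $n$ standard deviations from the mean is $\mathrm{erfc}(n/\sqrt{2})$. The sketch below uses fewer points than the cell above and a seeded generator (both are assumptions made to keep it fast and reproducible).

```python
import math
import numpy as np

mu, sig, nsig = 1.0, 0.2, 1.0
nvals = 1_000_000                      # fewer points than above, for speed

rng = np.random.default_rng(seed=0)    # arbitrary seed
data = rng.normal(mu, sig, nvals)

# fraction of points beyond nsig sigma, computed as the mean of a boolean array
measured = (np.abs(data - mu) > nsig * sig).mean()

# two-sided Gaussian tail probability
expected = math.erfc(nsig / math.sqrt(2.))

print(f"measured: {100 * measured:.2f} %")
print(f"expected: {100 * expected:.2f} %")
```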


## Using NumPy ndarray instead of Lists
We now solve the same projectile problem, but this time using a 2D array so that just one comprehension computes both x(t) and y(t).

When plotting you have to use slicing to specify that the 1st column holds the x values and the 2nd column the y values.

In [73]:
%matplotlib notebook
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import numpy as np
import time


# initial conditions
g = 9.8
h = 10.
theta = (30./180.)*np.pi
v0 = 30.
dt=0.01

#compute velocity components
v0x = v0*np.cos(theta)
v0y = v0*np.sin(theta)
print("v0_x: %.1f m/s \t v0_y: %.1f m/s"%(v0x,v0y))

x0 = 0
y0 = h

def x(t):
    return x0+v0x*t

def y(t):
    return y0+v0y*t-0.5*g*t*t


dt = 0.01
# generate list of times for sampling
times = np.arange(0., 1000., dt)

#print first 10 elements
print(times[:10])

# use 2D array to do one comprehension
pos = np.array([ [x(t),y(t)] for t in times if y(t)>=0. ])
print("shape of pos array: ",pos.shape)
# create a figure object
fig = plt.figure()

# add subplot (just 1) and set x and y limits based on data
# ax is the object containing objects to be plotted
ax = fig.add_subplot(111, autoscale_on=False, xlim=(-0.1, max(pos[:,0])*1.2), ylim=(-0.1,max(pos[:,1])*1.2) )
ax.grid()
ax.set_xlabel('x(t) [m]')
ax.set_ylabel("y(t) [m]")
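# raw string keeps the LaTeX backslashes intact; theta is converted from radians to degrees for display
plt.title(r"trajectory of a projectile with $v_0$: %.1f m/s, $\Theta_0$: %.1f$^\circ$"%(v0,np.degrees(theta)))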

# plot slices for ndarray
line = ax.plot(pos[:,0], pos[:,1],  lw=2, color='red')
plt.show()


xi = list(pos[:,0])
yi = list(pos[:,1])
print("max height: %.2f at x = %.2f"%(max(yi),xi[yi.index(max(yi))]))


v0_x: 26.0 m/s 	 v0_y: 15.0 m/s
[0.   0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09]
shape of pos array:  (363, 2)



max height: 21.48 at x = 39.75


## Using a function with multiple return values
We now get rid of x(t) and y(t) and replace them with just one function pos(t) returning 2 values.

We use ndarray everywhere instead of the list type. However, note that
- to print the position of the maximum, using slices can cause some headache and confusion for whoever reads the code
  - you can create lists xi and yi to make the code more readable
- a slice does not have the same methods as a list. For example, you cannot call `index()` on a slice, so we create a list on the fly: `list(pos[:,1]).index(max(pos[:,1]))`

In [74]:
%matplotlib notebook
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import numpy as np
import time


# initial conditions
g = 9.8
h = 10.
theta = (30./180.)*np.pi
v0 = 30.
dt=0.01

#compute velocity components
v0x = v0*np.cos(theta)
v0y = v0*np.sin(theta)
print("v0_x: %.1f m/s \t v0_y: %.1f m/s"%(v0x,v0y))

x0 = 0
y0 = h

def pos(t):
    return x0+v0x*t, y0+v0y*t-0.5*g*t*t


dt = 0.01
# generate list of times for sampling
times = np.arange(0., 1000., dt)

#print first 10 elements
print(times[:10])


# use 2D array to do one comprehension
# note: this rebinds the name pos from the function to the resulting array
pos = np.array([ pos(t) for t in times if pos(t)[1]>=0. ])

print("shape of pos: ",pos.shape)

# you can create list for xi and yi
#xi = list(pos[:,0])
#yi = list(pos[:,1])
#print("max height: %.2f at x = %.2f"%(max(yi),xi[yi.index(max(yi))]))

# or you can simply use the slicing again. In this case it can be a bit confusing if not familiar
# also note that index() is a method for a list not for slices.
print("max height: %.2f at x = %.2f"%(max(pos[:,1]),pos[ list(pos[:,1]).index(max(pos[:,1])),0 ] ) )



# create a figure object
fig = plt.figure()

# add subplot (just 1) and set x and y limits based on data
# ax is the object containing objects to be plotted
ax = fig.add_subplot(111, autoscale_on=False, xlim=(-0.1, max(pos[:,0])*1.2), ylim=(-0.1,max(pos[:,1])*1.2) )
ax.grid()
ax.set_xlabel('x(t) [m]')
ax.set_ylabel("y(t) [m]")
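# raw string keeps the LaTeX backslashes intact; theta is converted from radians to degrees for display
plt.title(r"trajectory of a projectile with $v_0$: %.1f m/s, $\Theta_0$: %.1f$^\circ$"%(v0,np.degrees(theta)))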

# plot slices for ndarray
line = ax.plot(pos[:,0], pos[:,1],  lw=2, color='red')

plt.show()

v0_x: 26.0 m/s 	 v0_y: 15.0 m/s
[0.   0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09]
shape of pos:  (363, 2)
max height: 21.48 at x = 39.75
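
As an alternative sketch, the trajectory can be computed without any comprehension by applying the formulas to the whole `times` array, and the maximum can be located with `argmax()` instead of `list.index()`. The shorter time range of 10 s and the names `above` and `imax` are assumptions made for this example.

```python
import numpy as np

# same initial conditions as above
g, h, v0 = 9.8, 10., 30.
theta = np.radians(30.)
v0x, v0y = v0 * np.cos(theta), v0 * np.sin(theta)

# 10 s is enough here: the projectile lands after about 3.6 s
times = np.arange(0., 10., 0.01)

# vectorized: the formulas are evaluated for all times at once
x = v0x * times
y = h + v0y * times - 0.5 * g * times**2

# keep only the points above the ground, as columns of a 2D array
above = y >= 0.
pos = np.column_stack((x[above], y[above]))

# argmax gives the row index of the highest point
imax = pos[:, 1].argmax()
print("shape of pos:", pos.shape)
print("max height: %.2f at x = %.2f" % (pos[imax, 1], pos[imax, 0]))
```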



## Exercise
- extend the problem to 3D and use a 3D plot to show the trajectory in space

# Random walks with ndarrays

In this example we use ndarray to solve the classical random walk problem.

The typical C-like solution is

In [75]:
%matplotlib notebook
import numpy as np
import matplotlib.pyplot as plt

nstep = 100

x = 0
position = [x] 


for i in range(nstep):
    step = 1
    if np.random.uniform(0.,1.) < 0.5: step = -1
    x += step
    position.append(x)    

    
plt.plot(position[:nstep])
plt.grid()
plt.xlabel('step')
plt.ylabel('position')



Text(0, 0.5, 'position')

## Random walk with arrays
We now note that the position of the walk is the cumulative sum of the random steps. We can use this to rewrite the problem using arrays only

In [76]:

import numpy as np
import matplotlib.pyplot as plt

nsteps = 10000

- first we draw the random numbers for all `nsteps` at once; think of each draw as a coin toss

In [77]:
draws = np.random.randint(0,2, size=nsteps)
print(draws)

[1 0 0 ... 0 1 1]


Based on the coin tosses, we decide whether each step is positive or negative

In [78]:
steps = np.where(draws>0, 1, -1)
print(steps)

[ 1 -1 -1 ... -1  1  1]


- then we compute the cumulative sum of the steps

In [79]:
walk = steps.cumsum()
print(walk)
print(steps.sum())

[ 1  0 -1 ... 30 31 32]
32


The cumulative sum of an array gives, at each position, the sum of all values up to and including that position. Its last element therefore equals `steps.sum()`.
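
A tiny example with five hand-picked steps makes this easy to verify by hand:

```python
import numpy as np

steps = np.array([1, -1, -1, 1, 1])
walk = steps.cumsum()

print(walk.tolist())            # [1, 0, -1, 0, 1]
print(walk[-1] == steps.sum())  # True: the last position is the sum of all steps
```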

In [80]:
%matplotlib notebook
plt.plot(walk)
plt.grid()
plt.xlabel('step')
plt.ylabel('position')




Text(0, 0.5, 'position')


Finding information about the walk and doing analysis is now trivial.

The maximum position is

In [81]:
print(walk.max())
print(walk.argmax())
print(len(walk))

38
9493
10000


`argmax()` provides the first index where the `max()` value occurs. Similarly for the minimum

In [82]:
print(walk.min())
print(walk.argmin())
print(len(walk))

-114
5483
10000


But we want to find the location (positive or negative) with the largest distance. This can be easily done with vector operations

In [83]:
print(np.abs(walk))
print(np.abs(walk).max())

[ 1  0  1 ... 30 31 32]
114


In [84]:
print(np.abs(walk).argmax())

5483


We can also easily find the time at which we cross a certain position. For example, we want to find when the position is back at the origin. This is done using boolean arrays

In [85]:
np.abs(walk) == 0

array([False,  True, False, ..., False, False, False])

In [86]:
(np.abs(walk)==0).max()

True

In a boolean array, `max()` is `True` if at least one element is `True`. So using `argmax()` we can find the first time the particle returns to the origin

In [87]:
(np.abs(walk)==0).argmax()

1
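
One caveat: if the walk never returns to the origin, the boolean array contains only `False` and `argmax()` still returns 0. A possible guard, sketched with a toy walk and an illustrative name `first_return`, is to check `any()` first:

```python
import numpy as np

walk = np.array([1, 2, 3, 2, 3])    # toy walk that never returns to 0
at_origin = walk == 0

# argmax of an all-False array is 0, which could be misread as "step 0"
print(at_origin.argmax())           # 0

# check any() before trusting the index
first_return = at_origin.argmax() if at_origin.any() else None
print(first_return)                 # None
```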

## Simulating many random walks at once

We now want to simulate `nexp` random walks and study statistics about the number of crossings, the maximum distance, etc. Instead of using nested loops, we can simply use a 2D array to keep track of the steps for `nexp` experiments.

In fact the only place where we change the code is the extraction of random numbers to decide the direction of the random walk.

In [88]:
import numpy as np
import matplotlib.pyplot as plt

nsteps = 1000
nexp = 1000

draws = np.random.randint(0,2, size=(nexp,nsteps) )
print(draws.shape)

steps = np.where(draws>0, 1, -1)
print(steps.shape)


walks = steps.cumsum(1)
print(walks.shape)

(1000, 1000)
(1000, 1000)
(1000, 1000)


By calling `cumsum(1)` we are summing along dimension 1 (axis 1), i.e. along the columns. Recall that each row represents an experiment, while the columns are the draws or the steps for a given experiment. So by summing the steps along the columns, we obtain the random walk for each experiment.
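
The effect of the axis argument is easiest to see on a small array. The sketch below uses two toy experiments with three steps each:

```python
import numpy as np

# 2 experiments (rows), 3 steps each (columns)
steps = np.array([[ 1, -1, 1],
                  [-1, -1, 1]])

# axis=1: accumulate along each row, giving one walk per experiment
print(steps.cumsum(axis=1).tolist())   # [[1, 0, 1], [-1, -2, -1]]

# axis=0: accumulate down each column, mixing different experiments
print(steps.cumsum(axis=0).tolist())   # [[1, -1, 1], [0, -2, 2]]
```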

In [89]:
walks

array([[  1,   0,   1, ...,  34,  33,  32],
       [ -1,   0,  -1, ...,  24,  23,  24],
       [  1,   0,  -1, ...,  18,  17,  16],
       ...,
       [ -1,   0,  -1, ...,  -8,  -9,  -8],
       [  1,   2,   3, ...,  26,  27,  28],
       [ -1,   0,  -1, ..., -18, -19, -18]])

The maximum distance ever reached in all experiments is 

In [90]:
np.abs(walks).max()

113

We can also find out the maximum distance for each experiment. This is done by using the [`numpy.amax`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.amax.html) function. We want to find the maximum along the columns (axis=1) for each experiment (row)

In [91]:
max_experiment = np.amax( np.abs(walks),1   )
print(max_experiment.shape)

(1000,)


As a sanity check, we see that the maximum distance across experiments is

In [92]:
max_experiment.max()

113

which occurred in this experiment

In [93]:
max_experiment.argmax()

899

Similarly the smallest maximum distance ever reached is

In [94]:
max_experiment.min()

12

Finally we make a histogram of the maximum distance ever reached. We use
- `numpy.amax` function to compute the max for each experiment
- `set` to create a unique list of distances reached
- `list` and `count` to compute the frequency for each max distance
- a dictionary to store the frequency for each max distance


In [97]:
max_dict = { i:list(max_experiment).count(i)   for i in set(max_experiment)  }
    
%matplotlib notebook
import matplotlib.pyplot as plt
plt.bar( list(max_dict.keys()), list(max_dict.values()), color='blue' ) 
plt.grid()
plt.xlabel('maximum distance')
plt.ylabel('experiments')


Text(0, 0.5, 'experiments')
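
Calling `list(...).count(i)` for every distance scans the whole list each time. A possible alternative is `np.unique` with `return_counts=True`, which returns the distinct values and their frequencies in one call. The sketch below generates its own smaller set of walks with a seeded generator (an assumption, so that it runs on its own) and omits the plot.

```python
import numpy as np

rng = np.random.default_rng(seed=1)             # arbitrary seed
steps = rng.choice([-1, 1], size=(200, 500))    # 200 experiments, 500 steps
walks = steps.cumsum(axis=1)
max_experiment = np.abs(walks).max(axis=1)

# distinct maximum distances and how many experiments reached each of them
distances, counts = np.unique(max_experiment, return_counts=True)
max_dict = {int(d): int(c) for d, c in zip(distances, counts)}

print(len(max_dict), "distinct maximum distances")
print(counts.sum())   # 200, one entry per experiment
```

The arrays `distances` and `counts` can be passed directly to `plt.bar`.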

### Computing crossing of a position

With boolean arrays we can easily check if the walk ever goes back to the origin or any given position

In [98]:
hits_home = (np.abs(walks)==0).any(1)


Similarly we can check how many experiments reached a certain distance

In [99]:
hits_x = (np.abs(walks)==30).any(1)
print(hits_x.sum())

708
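
Going one step further, `argmax(axis=1)` on the boolean array gives the step at which each experiment first reaches that distance. Since `argmax` returns 0 for rows without any `True`, the sketch below first selects the experiments that do reach it. The walks are regenerated with a seeded generator (an assumption made so the example is self-contained).

```python
import numpy as np

rng = np.random.default_rng(seed=2)     # arbitrary seed
nexp, nsteps = 500, 1000
walks = rng.choice([-1, 1], size=(nexp, nsteps)).cumsum(axis=1)

# experiments that reach distance 30 at least once
hits_30 = (np.abs(walks) == 30).any(axis=1)

# for those experiments only: index of the first step at distance 30
crossing_times = (np.abs(walks[hits_30]) == 30).argmax(axis=1)

print(hits_30.sum(), "of", nexp, "walks reach distance 30")
print("mean first-crossing step:", crossing_times.mean())
```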


Plot a given experiment

In [None]:
%matplotlib notebook
plt.plot(walks[np.random.randint(0,nexp),:])
plt.grid()
plt.xlabel('step')
plt.ylabel('position')

Plot the experiment with the smallest maximum distance

In [101]:
%matplotlib notebook
plt.plot(walks[max_experiment.argmin(),:])
plt.grid()
plt.xlabel('step')
plt.ylabel('position')


Text(0, 0.5, 'position')

Plot the experiment with the largest maximum distance

In [102]:
%matplotlib notebook
plt.plot(walks[max_experiment.argmax(),:])
plt.grid()
plt.xlabel('step')
plt.ylabel('position')


Text(0, 0.5, 'position')

### Exercise
- use animation to show the experiments in sequence
- simulate the random walk in 2D and 3D and compute the fraction of experiments going back to the origin or reaching an arbitrary distance from the origin