## Chapter 2: Introduction to NumPy

### 2.1 Understanding data types in Python

- Python is dynamically typed: the type of a variable does not need to be declared explicitly. This differs from C, where every variable's type must be declared.

- Built-in data types include bool, str, float, and int.


- A Python list can hold items of different types:

In [226]:
L=[True, "2", 3.0, 4]
[type(item) for item in L]


[bool, str, float, int]

-  Fixed-type arrays in Python: The NumPy array essentially contains a single pointer to one contiguous block of data. The Python list, on the other hand, contains a pointer to a block of pointers, each of which in turn points to a full Python object. 
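The difference between the two containers can be seen in a small sketch. The variable names (`mixed`, `arr`) are made up for illustration; the point is only that a list keeps each element's own type, while an array settles on one dtype for the whole block of data.

```python
import numpy as np

# A Python list keeps a separate, fully typed object for every element.
mixed = [True, "2", 3.0, 4]
list_types = [type(item).__name__ for item in mixed]
print(list_types)  # ['bool', 'str', 'float', 'int']

# A NumPy array stores a single dtype in one contiguous block,
# so mixed numeric input is upcast to a common type.
arr = np.array([1, 2.5, 3])
print(arr.dtype)  # float64

# The data block is simply (number of elements) x (bytes per element).
print(arr.nbytes == arr.size * arr.itemsize)  # True
```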


#### Creating arrays from Python lists

In [18]:
import numpy as np
L1=np.array([1, 4,2,5,3])
print(L1)
type(L1)
[type(item) for item in L1]

[1 4 2 5 3]


[numpy.int64, numpy.int64, numpy.int64, numpy.int64, numpy.int64]

In [20]:
L2=np.array([3.14, 4, 2, 3])
print(L2)
type(L2)
[type(item) for item in L2]

[3.14 4.   2.   3.  ]


[numpy.float64, numpy.float64, numpy.float64, numpy.float64]

- To set the data type of the resulting array explicitly, use the ```dtype``` keyword:

In [32]:
np.array([1,2,3,4], dtype="float32")

array([1., 2., 3., 4.], dtype=float32)

- Nested lists produce a multi-dimensional array:

In [35]:
np.array([range(i, i+3) for i in [2, 4, 6]])

array([[2, 3, 4],
       [4, 5, 6],
       [6, 7, 8]])

#### Creating arrays from scratch

In [43]:
# create a length 10 integer array with zeros
np.zeros(10, dtype=int)

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [44]:
# create a 3*5 floating-point array filled with ones
np.ones((3,5), dtype=float)

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

In [47]:
# create a 3*5 array filled with 5
np.full((3,5), 5)

array([[5, 5, 5, 5, 5],
       [5, 5, 5, 5, 5],
       [5, 5, 5, 5, 5]])

In [50]:
# create an array filled with a linear sequence
# starting at 0, stepping by 2, and stopping before 20
np.arange(0, 20, 2)

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

In [51]:
# create an array of five values evenly spaced between 0 and 1
np.linspace(0, 1, 5)

array([0.  , 0.25, 0.5 , 0.75, 1.  ])
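One detail that is easy to miss is how the two range functions treat the end value. The sketch below, with illustrative variable names, shows that `np.arange` excludes the stop value while `np.linspace` includes it unless `endpoint=False` is passed.

```python
import numpy as np

a = np.arange(0, 20, 2)                   # the stop value 20 is excluded
b = np.linspace(0, 1, 5)                  # both endpoints are included by default
c = np.linspace(0, 1, 5, endpoint=False)  # drop the right endpoint

print(a[-1])  # 18
print(b[-1])  # 1.0
print(c)      # five values starting at 0 with spacing 0.2
```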

In [54]:
# create a 3*3 array of uniformly distributed random values between 0 and 1
np.random.random((3,3))

array([[0.87641471, 0.31473667, 0.36155557],
       [0.26638459, 0.76331628, 0.87240441],
       [0.6886841 , 0.48063639, 0.59972997]])

In [56]:
# create a 3*3 array of standard normal random values (mean 0, standard deviation 1)
np.random.normal(0, 1, (3,3))

array([[ 1.21607182, -0.22823368, -1.2858008 ],
       [ 0.19271299,  2.31885433,  1.11263326],
       [ 0.21980152, -0.80520053,  1.03954917]])

In [57]:
# create a 3*3 array of random integers in the interval [0, 10)
np.random.randint(0, 10, (3,3))

array([[3, 9, 1],
       [3, 9, 3],
       [4, 9, 9]])
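The random outputs above change on every run. As an alternative sketch, the same three kinds of arrays can be drawn from a seeded `np.random.default_rng` generator, which makes the results repeatable. The seed value 0 and the variable names are arbitrary choices for this example, not part of the notes above.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed chosen arbitrarily for this sketch

uniform = rng.random((3, 3))            # uniform values in [0, 1)
normal = rng.normal(0, 1, (3, 3))       # mean 0, standard deviation 1
integers = rng.integers(0, 10, (3, 3))  # integers in [0, 10)

# Re-creating the generator with the same seed reproduces the same draws.
same_uniform = np.random.default_rng(seed=0).random((3, 3))
print(np.array_equal(uniform, same_uniform))  # True
```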

In [58]:
# create a 3*3 identity matrix
np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

#### NumPy standard data types 

- bool_	Boolean (True or False) stored as a byte
- int_	Default integer type (same as C long; normally either int64 or int32)
- intc	Identical to C int (normally int32 or int64)
- intp	Integer used for indexing (same as C ssize_t; normally either int32 or int64)
- int8	Byte (-128 to 127)
- int16	Integer (-32768 to 32767)
- int32	Integer (-2147483648 to 2147483647)
- int64	Integer (-9223372036854775808 to 9223372036854775807)
- uint8	Unsigned integer (0 to 255)
- uint16	Unsigned integer (0 to 65535)
- uint32	Unsigned integer (0 to 4294967295)
- uint64	Unsigned integer (0 to 18446744073709551615)
- float_	Shorthand for float64 (this alias was removed in NumPy 2.0; use float64 directly).
- float16	Half precision float: sign bit, 5 bits exponent, 10 bits mantissa
- float32	Single precision float: sign bit, 8 bits exponent, 23 bits mantissa
- float64	Double precision float: sign bit, 11 bits exponent, 52 bits mantissa
- complex_	Shorthand for complex128 (this alias was removed in NumPy 2.0; use complex128 directly).
- complex64	Complex number, represented by two 32-bit floats
- complex128	Complex number, represented by two 64-bit floats
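The ranges and bit layouts listed above do not need to be memorized; they can be queried with `np.iinfo` and `np.finfo`. A minimal sketch:

```python
import numpy as np

# Integer types: smallest and largest representable values.
for int_type in (np.int8, np.int16, np.uint8, np.uint16):
    info = np.iinfo(int_type)
    print(int_type.__name__, info.min, info.max)

# Floating-point types: total width and number of mantissa bits.
for float_type in (np.float16, np.float32, np.float64):
    info = np.finfo(float_type)
    print(float_type.__name__, info.bits, "bits,", info.nmant, "mantissa bits")
```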

### 2.2 The basics of NumPy arrays

#### NumPy array attributes

- Each array has attributes ```ndim``` (the number of dimensions), ```shape``` (the size of each dimension), and ```size``` (the total size of the array).

In [67]:
x3=np.random.randint(10, size=(3,4,5))

In [68]:
x3

array([[[0, 8, 4, 6, 6],
        [0, 2, 4, 6, 8],
        [5, 6, 7, 1, 8],
        [6, 6, 4, 2, 3]],

       [[5, 9, 1, 3, 7],
        [6, 1, 8, 2, 1],
        [0, 4, 0, 6, 2],
        [7, 1, 5, 0, 8]],

       [[5, 3, 9, 6, 8],
        [2, 2, 9, 7, 4],
        [3, 3, 1, 2, 6],
        [0, 7, 7, 4, 1]]])

In [72]:
print("x3 ndim:", x3.ndim)

x3 ndim: 3


In [73]:
print("x3 shape:", x3.shape)

x3 shape: (3, 4, 5)


In [74]:
print("x3 size:", x3.size)

x3 size: 60


- Another useful attribute is the ```dtype```, the data type of the array. 

In [79]:
print("dtype:", x3.dtype)

dtype: int64


- Other attributes include ```itemsize```, which lists the size in bytes of each array element, and ```nbytes```, which lists the total size in bytes of the array.

In [80]:
print("itemsize:", x3.itemsize, "bytes")

itemsize: 8 bytes


In [81]:
print("nbytes:", x3.nbytes, "bytes")

nbytes: 480 bytes
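These attributes are related: `nbytes` equals `size` times `itemsize`. The sketch below checks this on an array with the same shape as `x3` and shows how a smaller dtype shrinks the footprint. The array here is filled with zeros and the name `small` is illustrative.

```python
import numpy as np

x = np.zeros((3, 4, 5), dtype=np.int64)
print(x.size)      # 60
print(x.itemsize)  # 8
print(x.nbytes)    # 480

# The same 60 elements stored as 1-byte integers need only 60 bytes.
small = x.astype(np.int8)
print(small.nbytes)  # 60
```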


#### Array indexing 

In [83]:
x1=np.random.randint(10, size=6)

In [84]:
x1

array([2, 9, 2, 6, 9, 4])

In [85]:
x1[0]

2

In [86]:
x1[4]

9

In [87]:
x1[-1]

4

In [88]:
x1[-2]

9

In [92]:
x2=np.random.randint(10, size=(3,4))

In [93]:
x2

array([[6, 4, 4, 3],
       [9, 4, 4, 1],
       [0, 1, 6, 9]])

In [94]:
x2[0,0]

6

In [95]:
x2[2,0]

0

In [96]:
x2[2,-1]

9

In [97]:
x2[0, 0]=12

In [98]:
x2


array([[12,  4,  4,  3],
       [ 9,  4,  4,  1],
       [ 0,  1,  6,  9]])

#### Array slicing

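- The NumPy slicing syntax follows that of the standard Python list. To access a slice of an array ```x```, use ```x[start:stop:step]```. Any omitted value takes its default: ```start=0```, ```stop=```size of the dimension, ```step=1```.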

In [114]:
x=np.arange(10)

In [115]:
x

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [116]:
x[:5]

array([0, 1, 2, 3, 4])

In [117]:
x[5:]

array([5, 6, 7, 8, 9])

In [118]:
x[4:7]

array([4, 5, 6])

In [119]:
x[::2] # every other element 

array([0, 2, 4, 6, 8])

In [121]:
x[1::2] # every other element, starting at index 1

array([1, 3, 5, 7, 9])

In [123]:
x[::-1] # all elements, reversed

array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])

In [124]:
x[5::-2] # reversed every other from index 5

array([5, 3, 1])

In [128]:
x2

array([[12,  4,  4,  3],
       [ 9,  4,  4,  1],
       [ 0,  1,  6,  9]])

In [129]:
x2[:2, :3]

array([[12,  4,  4],
       [ 9,  4,  4]])

In [130]:
x2[:3, ::2]

array([[12,  4],
       [ 9,  4],
       [ 0,  6]])

In [133]:
x2[::-1, ::-1] # reverse

array([[ 9,  6,  1,  0],
       [ 1,  4,  4,  9],
       [ 3,  4,  4, 12]])

In [140]:
x2[::-1, ]

array([[ 0,  1,  6,  9],
       [ 9,  4,  4,  1],
       [12,  4,  4,  3]])

In [135]:
x2[:, 0]

array([12,  9,  0])

In [136]:
x2

array([[12,  4,  4,  3],
       [ 9,  4,  4,  1],
       [ 0,  1,  6,  9]])

In [143]:
x2[0, :]

array([12,  4,  4,  3])

In [144]:
x2[0]

array([12,  4,  4,  3])

In [145]:
x2[0][0]

12

- Subarrays as no-copy views: **array slices return views rather than copies of the array data**. 

In [146]:
x2

array([[12,  4,  4,  3],
       [ 9,  4,  4,  1],
       [ 0,  1,  6,  9]])

In [147]:
x2_sub=x2[:2, :2]

In [148]:
x2_sub

array([[12,  4],
       [ 9,  4]])

In [149]:
x2_sub[0, 0]=99

In [150]:
x2_sub

array([[99,  4],
       [ 9,  4]])

In [151]:
x2

array([[99,  4,  4,  3],
       [ 9,  4,  4,  1],
       [ 0,  1,  6,  9]])

- Creating copies of arrays: use the ```copy()``` method to copy the data explicitly, so that changes to the subarray do not touch the original.

In [152]:
x2_sub_copy=x2[:2, :2].copy()

In [153]:
x2_sub_copy

array([[99,  4],
       [ 9,  4]])

In [154]:
x2_sub_copy[0,0]=100

In [155]:
x2_sub_copy

array([[100,   4],
       [  9,   4]])

In [157]:
x2 # the original array is not touched 

array([[99,  4,  4,  3],
       [ 9,  4,  4,  1],
       [ 0,  1,  6,  9]])
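Whether two arrays refer to the same data can also be checked directly with `np.shares_memory`, rather than by modifying one and inspecting the other. A short sketch with illustrative names:

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)
view = grid[:2, :2]         # a slice: a view onto grid's data
copy = grid[:2, :2].copy()  # an explicit, independent copy

print(np.shares_memory(grid, view))  # True
print(np.shares_memory(grid, copy))  # False

view[0, 0] = 99   # writes through to grid
copy[0, 1] = -1   # leaves grid untouched
print(grid[0, 0], grid[0, 1])  # 99 1
```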

#### Reshaping of arrays

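- The most flexible way of doing this is with the ```reshape``` method. The size of the new shape must match the size of the initial array. Where possible, ```reshape``` returns a **no-copy view of the initial array**; with non-contiguous memory buffers this is not always the case.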

In [158]:
np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [159]:
np.arange(1,10)

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [162]:
grid=np.arange(1,10).reshape((3,3))

In [163]:
grid

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [169]:
x=np.array([1,2,3])

In [170]:
x


array([1, 2, 3])

In [171]:
x.shape

(3,)

In [179]:
y=x.reshape((1,3)) 

In [180]:
y

array([[1, 2, 3]])

In [181]:
y.shape

(1, 3)

In [182]:
x.shape

(3,)

In [185]:
x.reshape((3,1))

array([[1],
       [2],
       [3]])
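The view behaviour of `reshape` can be checked with a small sketch. The names are illustrative, and the transposed case is included only as one example of a layout that, as far as I can tell, forces NumPy to copy.

```python
import numpy as np

a = np.arange(6)
b = a.reshape((2, 3))   # contiguous input: reshape returns a view
b[0, 0] = 99
print(a[0])             # 99, because a and b share data

t = b.T                 # transposed view, no longer C-contiguous
c = t.reshape(6)        # flattening it cannot be done without copying
print(np.shares_memory(a, b))  # True
print(np.shares_memory(a, c))  # False

d = a.reshape((3, -1))  # -1 lets NumPy infer the remaining dimension
print(d.shape)          # (3, 2)
```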

- Another common reshaping pattern is the use of ```np.newaxis``` within a slice operation:

In [183]:
x

array([1, 2, 3])

In [188]:
x[np.newaxis, :] # row vector via newaxis

array([[1, 2, 3]])

In [189]:
x[:, np.newaxis] # column vector via newaxis



array([[1],
       [2],
       [3]])
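As a quick check, the `newaxis` forms give the same result as the corresponding `reshape` calls, and `np.newaxis` itself is just an alias for `None`:

```python
import numpy as np

x = np.array([1, 2, 3])
row = x[np.newaxis, :]
col = x[:, np.newaxis]

print(row.shape, col.shape)                      # (1, 3) (3, 1)
print(np.array_equal(row, x.reshape((1, 3))))    # True
print(np.array_equal(col, x.reshape((3, 1))))    # True
print(np.newaxis is None)                        # True
```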

#### Array concatenation and splitting

- Concatenation: ```np.concatenate```, ```np.vstack```, ```np.hstack```


In [191]:
x=np.array([1,2,3])
y=np.array([3,2,1])

In [192]:
x

array([1, 2, 3])

In [193]:
y

array([3, 2, 1])

In [194]:
x.shape

(3,)

In [195]:
y.shape

(3,)

In [198]:
np.concatenate([x, y])

array([1, 2, 3, 3, 2, 1])

In [199]:
z=np.array([99,99,99])

In [200]:
np.concatenate([x, y, z])

array([ 1,  2,  3,  3,  2,  1, 99, 99, 99])

In [201]:
grid=np.array([
    [1,2,3],
    [4,5,6]
])

In [202]:
grid

array([[1, 2, 3],
       [4, 5, 6]])

In [208]:
np.concatenate([grid, grid]) # concatenate along the first axis

array([[1, 2, 3],
       [4, 5, 6],
       [1, 2, 3],
       [4, 5, 6]])

In [209]:
np.concatenate([grid, grid], axis=1) # concatenate along the second axis

array([[1, 2, 3, 1, 2, 3],
       [4, 5, 6, 4, 5, 6]])

In [216]:
np.vstack([x, grid])

array([[1, 2, 3],
       [1, 2, 3],
       [4, 5, 6]])

In [217]:
y=np.array([99, 99]).reshape((2,1))

In [218]:
y

array([[99],
       [99]])

In [220]:
np.hstack([grid, y])

array([[ 1,  2,  3, 99],
       [ 4,  5,  6, 99]])
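For the inputs used here, `np.vstack` and `np.hstack` behave like `np.concatenate` along the first and second axis. The sketch below makes that correspondence explicit. Note that the one-dimensional `x` has to be promoted to a row first, which `np.vstack` does automatically. The name `col` is illustrative.

```python
import numpy as np

x = np.array([1, 2, 3])
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])
col = np.array([[99],
                [99]])

v1 = np.vstack([x, grid])
v2 = np.concatenate([x[np.newaxis, :], grid], axis=0)  # promote x to shape (1, 3)

h1 = np.hstack([grid, col])
h2 = np.concatenate([grid, col], axis=1)

print(np.array_equal(v1, v2))  # True
print(np.array_equal(h1, h2))  # True
```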

- Splitting: ```np.split```, ```np.hsplit```, ```np.vsplit```. For each of these, we can pass a list of indices giving the split points. 

In [221]:
x=[1,2,3,99,99,3,2,1]

In [222]:
x1,x2,x3=np.split(x, [3,5])

In [223]:
x1


array([1, 2, 3])

In [224]:
x2

array([99, 99])

In [225]:
x3

array([3, 2, 1])

In [234]:
grid=np.arange(16).reshape(4,4)

In [235]:
grid

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

In [236]:
upper, lower=np.vsplit(grid, [2])

In [237]:
upper

array([[0, 1, 2, 3],
       [4, 5, 6, 7]])

In [238]:
lower

array([[ 8,  9, 10, 11],
       [12, 13, 14, 15]])

In [240]:
left, right=np.hsplit(grid, [2])
print(left)
print(right)

[[ 0  1]
 [ 4  5]
 [ 8  9]
 [12 13]]
[[ 2  3]
 [ 6  7]
 [10 11]
 [14 15]]
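Two properties of splitting are worth checking: N split points produce N + 1 subarrays, and joining the pieces again recovers the original array. A minimal sketch:

```python
import numpy as np

x = np.array([1, 2, 3, 99, 99, 3, 2, 1])
parts = np.split(x, [3, 5])   # two split points
print(len(parts))             # 3
print(np.array_equal(np.concatenate(parts), x))  # True

grid = np.arange(16).reshape(4, 4)
upper, lower = np.vsplit(grid, [2])
left, right = np.hsplit(grid, [2])
print(np.array_equal(np.vstack([upper, lower]), grid))  # True
print(np.array_equal(np.hstack([left, right]), grid))   # True
```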


### 2.3 Computation on NumPy arrays

- Computation on NumPy arrays can be very fast, or it can be very slow. The key to making it fast is to use **vectorized** operations, generally implemented through NumPy's universal functions. 

In [270]:
np.random.seed(0)

def compute_reciporals(values):
    output=np.empty(len(values))
    for i in range(len(values)):
        output[i]=1.0/values[i]
    return output    

In [271]:
values=np.random.randint(1,10, size=5)

In [272]:
compute_reciporals(values)

array([0.16666667, 1.        , 0.25      , 0.25      , 0.125     ])

In [273]:
big_array=np.random.randint(1,100, size=1000000)

In [274]:
%timeit compute_reciporals(big_array)

3.21 s ± 80.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


#### Universal functions

- Vectorized operations in NumPy are implemented via ufuncs, whose main purpose is to quickly execute repeated operations on values in NumPy arrays. 

- ```+```	```np.add```	Addition (e.g., 1 + 1 = 2)
- ```-```	```np.subtract```	Subtraction (e.g., 3 - 2 = 1)
- ```-```	```np.negative```	Unary negation (e.g., -2)
- ```*```	```np.multiply```	Multiplication (e.g., 2 * 3 = 6)
- ```/```	```np.divide```	Division (e.g., 3 / 2 = 1.5)
- ```//```	```np.floor_divide```	Floor division (e.g., 3 // 2 = 1)
- ```**```	```np.power```	Exponentiation (e.g., 2 ** 3 = 8)
- ```%```	```np.mod```	Modulus/remainder (e.g., 9 % 4 = 1)
- ```abs()```	```np.absolute``` (alias ```np.abs```)	Absolute value (e.g., abs(-2) = 2)
- ```np.sin```
- ```np.cos```
- ```np.tan```
- ```np.exp(x)```: e^x
- ```np.exp2(x)```: 2^x
- ```np.power(3,x)```: 3^x
- ```np.log(x)```: log(x)
- ```np.log2(x)```: log2(x)
- ```np.log10(x)```: log10(x)
- Specialized ufuncs: an excellent source for more specialized and obscure ufuncs is the submodule ```scipy.special```

```python
from scipy import special
```
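The arithmetic operators listed above are thin wrappers around the corresponding NumPy ufuncs, so both spellings should give identical results. The following sketch checks a few of them and exercises the exponential and logarithm ufuncs on a sample array chosen only for illustration.

```python
import numpy as np

x = np.array([1, 2, 4, 10])

# Operators and their ufunc equivalents agree element by element.
print(np.array_equal(x + 2, np.add(x, 2)))            # True
print(np.array_equal(x // 2, np.floor_divide(x, 2)))  # True
print(np.array_equal(x % 3, np.mod(x, 3)))            # True

# Exponentials and logarithms in different bases.
pow2 = np.exp2(x)       # 2**x
pow3 = np.power(3, x)   # 3**x
log2 = np.log2(x)       # base-2 logarithm
log10 = np.log10(x)     # base-10 logarithm
ln = np.log(x)          # natural logarithm
print(pow2, pow3, log2, log10, ln)
```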




In [277]:
compute_reciporals(values)

array([0.16666667, 1.        , 0.25      , 0.25      , 0.125     ])

In [278]:
1.0/values

array([0.16666667, 1.        , 0.25      , 0.25      , 0.125     ])

In [279]:
%timeit (1.0/big_array)

5.48 ms ± 346 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
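The `%timeit` magic only works inside IPython or Jupyter. As a rough sketch of the same comparison in a plain script, the standard-library `timeit` module can be used. The function name `reciprocals_loop`, the array size, and the repeat count are choices made for this example, and the absolute timings will differ from machine to machine.

```python
import timeit
import numpy as np

def reciprocals_loop(values):
    # Element-by-element loop: every step goes through the Python interpreter.
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1.0 / values[i]
    return output

rng = np.random.default_rng(0)
big = rng.integers(1, 100, size=100_000)  # smaller than above to keep the run short

loop_time = timeit.timeit(lambda: reciprocals_loop(big), number=3)
vec_time = timeit.timeit(lambda: 1.0 / big, number=3)
print(f"loop: {loop_time:.4f} s, vectorized: {vec_time:.4f} s")

# Both approaches compute the same values.
loop_result = reciprocals_loop(big)
vec_result = 1.0 / big
print(np.allclose(loop_result, vec_result))  # True
```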


In [280]:
np.arange(5)/np.arange(1,6)

array([0.        , 0.5       , 0.66666667, 0.75      , 0.8       ])

In [281]:
np.arange(5)

array([0, 1, 2, 3, 4])

In [282]:
np.arange(1,6)

array([1, 2, 3, 4, 5])

In [283]:
x=np.arange(9).reshape((3,3))

In [284]:
x

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [287]:
2**x

array([[  1,   2,   4],
       [  8,  16,  32],
       [ 64, 128, 256]])

In [288]:
x=np.arange(4)

In [289]:
x

array([0, 1, 2, 3])

In [290]:
x+5

array([5, 6, 7, 8])

In [291]:
x-5

array([-5, -4, -3, -2])

In [292]:
x*2

array([0, 2, 4, 6])

In [293]:
x/2

array([0. , 0.5, 1. , 1.5])

In [294]:
x//2 # floor division

array([0, 0, 1, 1])

In [295]:
-x

array([ 0, -1, -2, -3])

In [296]:
x**2

array([0, 1, 4, 9])

In [297]:
x%2

array([0, 1, 0, 1])

In [301]:
-(0.5*x+1)**2

array([-1.  , -2.25, -4.  , -6.25])

In [302]:
np.add(x,2)

array([2, 3, 4, 5])

In [303]:
np.negative(x)

array([ 0, -1, -2, -3])

In [304]:
np.power(x, 2)

array([0, 1, 4, 9])

In [305]:
abs(x)

array([0, 1, 2, 3])

In [306]:
np.absolute(x)

array([0, 1, 2, 3])

In [307]:
theta=np.linspace(0, np.pi,3)

In [308]:
theta

array([0.        , 1.57079633, 3.14159265])

In [309]:
np.sin(theta)

array([0.0000000e+00, 1.0000000e+00, 1.2246468e-16])

In [310]:
np.cos(theta)

array([ 1.000000e+00,  6.123234e-17, -1.000000e+00])

In [311]:
np.tan(theta)

array([ 0.00000000e+00,  1.63312394e+16, -1.22464680e-16])

In [313]:
x

array([0, 1, 2, 3])

In [314]:
np.exp(x)

array([ 1.        ,  2.71828183,  7.3890561 , 20.08553692])

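- Specifying output: for large calculations, the ```out``` argument lets a ufunc write its result directly into an existing array instead of creating a temporary one.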

In [316]:
x=np.arange(5)
y=np.empty(5)
np.multiply(x, 10, out=y)

array([ 0., 10., 20., 30., 40.])

In [317]:
y

array([ 0., 10., 20., 30., 40.])
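The `out` argument also accepts array views. In the sketch below, the results are written into every other element of a larger array. The array `y` and its size are chosen only for illustration.

```python
import numpy as np

x = np.arange(5)
y = np.zeros(10)

# Write 2**x straight into positions 0, 2, 4, 6, 8 of y.
np.power(2, x, out=y[::2])
print(y)  # even positions hold 1, 2, 4, 8, 16; odd positions stay 0
```

Writing `y[::2] = 2 ** x` would give the same values, but it first builds a temporary array for `2 ** x` and then copies it into `y`.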

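- Aggregates: for binary ufuncs, ```reduce``` repeatedly applies the operation to the elements of an array until a single result remains, ```accumulate``` stores all the intermediate results, and ```outer``` computes the output for all pairs of two inputs.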

In [318]:
x=np.arange(1,6)

In [320]:
x

array([1, 2, 3, 4, 5])

In [322]:
np.add.reduce(x)

15

In [323]:
np.multiply.reduce(x)

120

In [324]:
np.add.accumulate(x)

array([ 1,  3,  6, 10, 15])

In [325]:
np.multiply.accumulate(x)

array([  1,   2,   6,  24, 120])

In [326]:
x=np.arange(1,6)

In [327]:
x

array([1, 2, 3, 4, 5])

In [328]:
np.multiply.outer(x,x)

array([[ 1,  2,  3,  4,  5],
       [ 2,  4,  6,  8, 10],
       [ 3,  6,  9, 12, 15],
       [ 4,  8, 12, 16, 20],
       [ 5, 10, 15, 20, 25]])
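For these particular cases NumPy also provides dedicated functions that give the same results. The sketch below compares the ufunc methods with `np.sum`, `np.prod`, `np.cumsum`, `np.cumprod`, and `np.outer`:

```python
import numpy as np

x = np.arange(1, 6)

print(np.add.reduce(x) == np.sum(x))        # True
print(np.multiply.reduce(x) == np.prod(x))  # True
print(np.array_equal(np.add.accumulate(x), np.cumsum(x)))        # True
print(np.array_equal(np.multiply.accumulate(x), np.cumprod(x)))  # True
print(np.array_equal(np.multiply.outer(x, x), np.outer(x, x)))   # True
```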