# NumPy
NumPy is the core Python package for numerical computing on arrays. <br>
Many other scientific packages, such as pandas, SciPy, and scikit-learn, are built on top of NumPy. <br>
<br>
To install numpy, use the pip command:<br>
```bash
pip install numpy
```


In [214]:
import numpy as np

## Array definition
A NumPy array is an ndarray object. <br>
An array can be created from a list (1-dimensional) or from nested lists (2 or more dimensions).

In [215]:
a = np.array([[1,2], [3,4]])
print(f"a = \n{a} \ndata type: {a.dtype}\ndimension: {a.ndim}")  # data type of the elements and number of array dimensions

a = 
[[1 2]
 [3 4]] 
data type: int32
dimension: 2


You can also define a 2-dimensional array as a matrix. <br>
Numpy matrices are strictly 2-dimensional, while numpy arrays (ndarrays) are N-dimensional.<br>
Matrix objects are a subclass of ndarray, so they inherit all the attributes and methods of ndarrays.<br>
Using ndarrays is preferred because they are more general than matrices; the NumPy documentation no longer recommends the matrix class, even for linear algebra.


In [216]:
b = np.matrix([[1, 2], [3, 4]])
b

matrix([[1, 2],
        [3, 4]])
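
One practical reason to prefer ndarrays is operator behavior. The following is a minimal sketch (the variable names are illustrative and not part of the original notebook) that contrasts `*` on the two types: on an ndarray it multiplies element by element, while on a matrix it is the matrix product. For ndarrays, the `@` operator gives the matrix product.

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])
mat = np.matrix([[1, 2], [3, 4]])

elementwise = arr * arr   # element-by-element product on an ndarray
matmul_arr = arr @ arr    # matrix product on an ndarray
matmul_mat = mat * mat    # on np.matrix, * already means matrix product

print(elementwise)  # [[1, 4], [9, 16]]
print(matmul_arr)   # [[7, 10], [15, 22]]
print(matmul_mat)   # same values as matmul_arr, but of type np.matrix
```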

Define special kinds of arrays:

In [217]:
# Define all-ones array
x = np.ones((3, 3))
x

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

In [218]:
# The default data type is float64; change it with the dtype argument
np.ones((3, 3), dtype = np.int32)

array([[1, 1, 1],
       [1, 1, 1],
       [1, 1, 1]])

In [219]:
# Define all-zeros array
np.zeros((2, 3, 4))

array([[[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]],

       [[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]]])

In [220]:
# Generate an integer vector containing the sequence 0, 1, ..., n-1
np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [221]:
# Generate an integer vector from start (inclusive) to stop (exclusive) with a given step
np.arange(1, 10, 3)

array([1, 4, 7])

In [222]:
# Generate a float sequence over a specified interval
np.linspace(1, 10, 3) # start (inclusive), stop (inclusive), number of elements

array([ 1. ,  5.5, 10. ])
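
The difference between the two functions is what you specify: `arange` takes a step, `linspace` takes the number of points. As a small sketch (variable names are illustrative), `retstep=True` reports the spacing that `linspace` chose, and `endpoint=False` excludes the stop value, as `arange` does.

```python
import numpy as np

# Ask linspace to also return the spacing between consecutive points
points, step = np.linspace(1, 10, 3, retstep=True)
print(points, step)  # [ 1.   5.5 10. ] 4.5

# Excluding the stop value gives the same grid as np.arange(1, 10, 3), but as floats
half_open = np.linspace(1, 10, 3, endpoint=False)
print(half_open)     # [1. 4. 7.]
```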

In [223]:
# Generate a sequence of 12 numbers and reshape it into a 3 x 4 array
reshaped = np.arange(12).reshape((3, 4))
reshaped

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [224]:
# Revert the reshaping
#   Method-1:
reflat = reshaped.reshape(-1)
reflat

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In [225]:
# reshape() returns a view of the original data whenever possible. Thus, if you modify the returned value, the original array also changes.
reflat[0] = 100
print(reflat)
print(reshaped)

[100   1   2   3   4   5   6   7   8   9  10  11]
[[100   1   2   3]
 [  4   5   6   7]
 [  8   9  10  11]]


In [226]:
#   Method-2:
reflat = reshaped.flatten()
reflat

array([100,   1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11])

In [227]:
# flatten() returns a copy of the original data. Thus, if you modify the returned value, the original one is not changed.
reflat[1] = 200
print(reflat)
print(reshaped)

[100 200   2   3   4   5   6   7   8   9  10  11]
[[100   1   2   3]
 [  4   5   6   7]
 [  8   9  10  11]]
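
Instead of modifying an element to find out whether two arrays share data, you can ask NumPy directly. The sketch below (variable names are illustrative) uses `np.shares_memory`, and also shows a case where `reshape()` cannot return a view and falls back to a copy.

```python
import numpy as np

original = np.arange(12).reshape((3, 4))

view = original.reshape(-1)    # contiguous data: reshape returns a view
copy = original.flatten()      # flatten always returns a copy

print(np.shares_memory(original, view))  # True
print(np.shares_memory(original, copy))  # False

# The transpose is not C-contiguous, so flattening it with reshape needs a copy
from_transpose = original.T.reshape(-1)
print(np.shares_memory(original, from_transpose))  # False
```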


In [228]:
# Generate a random integer array, elements are from low (inclusive) to high (exclusive), shape is specified
r1 = np.random.randint(0, 10, (3, 4))
r1

array([[4, 5, 5, 4],
       [5, 8, 4, 5],
       [3, 4, 1, 6]])

In [229]:
# Generate random values in a given shape from a uniform distribution over [0, 1)
np.random.rand(2, 3)

array([[0.90611883, 0.8898753 , 0.75878293],
       [0.81778825, 0.56966104, 0.27044322]])

In [230]:
# Generate a random array from a uniform distribution
# Parameters: min value (inclusive), max value (exclusive), array shape
myRandomArray = np.random.uniform(1, 5, (2, 3)) # Min value = 1, max value = 5, shape = 2 x 3
print("My random array = ", myRandomArray)

My random array =  [[4.99153919 4.47141293 3.58496976]
 [2.68224083 1.45804365 2.56045524]]


In [231]:
# Generate a random array from the standard normal distribution (mean=0, stdev=1).
myRandomArray = np.random.standard_normal((2, 3))
print("My random array = ", myRandomArray)

My random array =  [[ 0.60080134 -0.95257613 -1.87779078]
 [ 1.11367825  0.58708535  1.28279094]]
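
The random values above differ on every run. If reproducible results are needed, one option is NumPy's `Generator` interface with a fixed seed. This is a sketch rather than part of the original notebook; the seed value 42 and the variable names are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # the seed value is arbitrary

ints = rng.integers(0, 10, size=(3, 4))    # low inclusive, high exclusive
unit = rng.random((2, 3))                  # uniform over [0, 1)
uniform = rng.uniform(1, 5, size=(2, 3))   # uniform over [1, 5)
normal = rng.standard_normal((2, 3))       # mean 0, standard deviation 1

# A new generator with the same seed reproduces the same numbers
rng_again = np.random.default_rng(seed=42)
ints_again = rng_again.integers(0, 10, size=(3, 4))
print(np.array_equal(ints, ints_again))    # True
```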


In [232]:
# Sets as arrays
aa = np.array([1, 2, 3, 4, 1, 3])
print("unique values: ", np.unique(aa))

unique values:  [1 2 3 4]


In [233]:
# Union and Intersection of sets as arrays
bb = np.array([7, 8, 9, 1, 3])
print("The first set=", aa)
print("The 2nd array =", bb)
print("union = ", np.union1d(aa, bb))
print("Intersection = ", np.intersect1d(aa, bb))

# The 1d suffix in the function names means that the input arrays are flattened if they are not already 1-D

The first set= [1 2 3 4 1 3]
The 2nd array = [7 8 9 1 3]
union =  [1 2 3 4 7 8 9]
Intersection =  [1 3]
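
A few more set-style helpers follow the same pattern. The sketch below reuses the values of `aa` and `bb` from above; the result variable names are illustrative.

```python
import numpy as np

aa = np.array([1, 2, 3, 4, 1, 3])
bb = np.array([7, 8, 9, 1, 3])

# Unique values together with how often each one occurs
values, counts = np.unique(aa, return_counts=True)
print(values, counts)   # [1 2 3 4] [2 1 2 1]

# Elements of aa that are not in bb
only_in_aa = np.setdiff1d(aa, bb)
print(only_in_aa)       # [2 4]

# Elements that appear in exactly one of the two arrays
in_exactly_one = np.setxor1d(aa, bb)
print(in_exactly_one)   # [2 4 7 8 9]
```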


## Shape and type

In [234]:
# Get size
print(a)
np.size(a)

[[1 2]
 [3 4]]


4

In [235]:
# Get shape
np.shape(a)

(2, 2)

NumPy data types and their usual C equivalents (the exact C type depends on the platform) are as follows:<br>
| NumPy   | Equivalent C type              |
|:--------|:-------------------------------|
| float64 | double                         |
| float32 | float                          |
| int64   | long long [-2^63, 2^63-1]      |
| uint64  | unsigned long long [0, 2^64-1] |
| int32   | int [-2^31, 2^31-1]            |
| uint32  | unsigned int [0, 2^32-1]       |
| uint8   | unsigned char [0, 255]         |
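
The ranges in the table can be checked from Python with `np.iinfo` (integers) and `np.finfo` (floats). The sketch below also shows what happens when an array operation exceeds the range of an integer type: the result wraps around instead of raising an error. Variable names are illustrative.

```python
import numpy as np

# Print the representable range of a few integer types
for dtype in (np.uint8, np.int32, np.int64):
    info = np.iinfo(dtype)
    print(dtype.__name__, info.min, info.max)

# Largest finite float32 value
print(np.finfo(np.float32).max)

# 250 + 10 = 260 does not fit in uint8, so the result wraps around modulo 256
small = np.array([250], dtype=np.uint8)
wrapped = small + np.array([10], dtype=np.uint8)
print(wrapped)  # [4]
```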

In addition to numerical types, NumPy also supports string and byte types:
- unicode strings, via the numpy.str_ dtype (U character code),
- null-terminated byte sequences, via numpy.bytes_ (S character code), and
- arbitrary byte sequences, via numpy.void (V character code).

Discussion about these types is out of the scope of this course!

In [236]:
z = np.array([1, 2, 3, 4], dtype="uint8")
print ("z = ", z, "data type: ", z.dtype)

z =  [1 2 3 4] data type:  uint8


## Indexing

In [237]:
print("a = \n", a)
print("a[0, 1]= ", a[0, 1])
print("a[1]= ", a[1])
print("a[0, 0], a[1, 0], a[1, 1] = ", np.array([a[0, 0], a[1, 0], a[1, 1]]))
print("a[[0, 1, 1], [0, 0, 1]] = ", np.array(a[[0, 1, 1], [0, 0, 1]])) # same as previous

a = 
 [[1 2]
 [3 4]]
a[0, 1]=  2
a[1]=  [3 4]
a[0, 0], a[1, 0], a[1, 1] =  [1 3 4]
a[[0, 1, 1], [0, 0, 1]] =  [1 3 4]


## Slicing

In [238]:
b = np.arange(1, 13).reshape((4,3))
print("b= ", b)
print("b[1, :]= ", b[1, :])
print("b[:, 1]= ", b[:, 1])
print("b[:2, :]= ", b[:2, :])
print("b[:2, :2]= ", b[:2, :2])
print("b[:1, :1]= ", b[:1, :1])

b=  [[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
b[1, :]=  [4 5 6]
b[:, 1]=  [ 2  5  8 11]
b[:2, :]=  [[1 2 3]
 [4 5 6]]
b[:2, :2]=  [[1 2]
 [4 5]]
b[:1, :1]=  [[1]]
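
Slices like the ones above are views of the original array, whereas indexing with lists of integers (as in the Indexing section) returns a copy. A minimal sketch of the difference, with illustrative variable names:

```python
import numpy as np

b = np.arange(1, 13).reshape((4, 3))

block = b[:2, :2]            # slice: a view into b
picked = b[[0, 1], [0, 1]]   # integer-array indexing: a copy holding b[0, 0] and b[1, 1]

block[0, 0] = 99             # writes through to b
picked[1] = -1               # only changes the copy

print(b[0, 0])  # 99
print(b[1, 1])  # 5

# Use .copy() when an independent slice is needed
independent = b[:2, :2].copy()
```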


## Stepping

In [239]:
print("b= ", b)
# Stepping in rows
print("b[::2]= ", b[::2]) # The 3rd number is the step
print("b[::3]= ", b[::3])
print("b[::-2]= ", b[::-2]) # reverse order + steps

b=  [[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
b[::2]=  [[1 2 3]
 [7 8 9]]
b[::3]=  [[ 1  2  3]
 [10 11 12]]
b[::-2]=  [[10 11 12]
 [ 4  5  6]]


In [240]:
print("b= ", b)
# Stepping in columns
print("b[:, ::2]= ", b[:, ::2])

b=  [[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
b[:, ::2]=  [[ 1  3]
 [ 4  6]
 [ 7  9]
 [10 12]]


In [241]:
print("b= ", b)
# Stepping along both axes (rows and columns)
print("b[::-2, ::2]= ", b[::-2, ::2])

b=  [[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
b[::-2, ::2]=  [[10 12]
 [ 4  6]]


Indexing for more dimensions

In [242]:
q = np.arange(24).reshape((4, 3, 2))
q

array([[[ 0,  1],
        [ 2,  3],
        [ 4,  5]],

       [[ 6,  7],
        [ 8,  9],
        [10, 11]],

       [[12, 13],
        [14, 15],
        [16, 17]],

       [[18, 19],
        [20, 21],
        [22, 23]]])

In [243]:
q[1, :, :]

array([[ 6,  7],
       [ 8,  9],
       [10, 11]])

## Modification

In [244]:

q[1, :, :] = [[66, 77],[88, 99],[100, 111]]
q

array([[[  0,   1],
        [  2,   3],
        [  4,   5]],

       [[ 66,  77],
        [ 88,  99],
        [100, 111]],

       [[ 12,  13],
        [ 14,  15],
        [ 16,  17]],

       [[ 18,  19],
        [ 20,  21],
        [ 22,  23]]])

In [245]:
q[2,...] = [[122, 133],[144, 155],[166, 177]] # ... means as many : as needed
q

array([[[  0,   1],
        [  2,   3],
        [  4,   5]],

       [[ 66,  77],
        [ 88,  99],
        [100, 111]],

       [[122, 133],
        [144, 155],
        [166, 177]],

       [[ 18,  19],
        [ 20,  21],
        [ 22,  23]]])
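
The ellipsis can stand for leading axes as well as trailing ones. A short sketch on a fresh array of the same shape (variable names are illustrative):

```python
import numpy as np

q = np.arange(24).reshape((4, 3, 2))

last_axis_first = q[..., 0]   # same as q[:, :, 0]
second_block = q[1, ...]      # same as q[1, :, :]

print(last_axis_first.shape)  # (4, 3)
print(second_block.shape)     # (3, 2)
print(np.array_equal(last_axis_first, q[:, :, 0]))  # True
```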

## Aggregation functions

In [246]:
# Sum of all elements
print (a)
a.sum()

[[1 2]
 [3 4]]


10

In [247]:
# Cumulative sum
print(a)
print ("Cumulative sum of the elements in array a=\n", a.cumsum())
print ("Cumulative sum (axis=0) of the elements in array a=\n", a.cumsum(axis=0))
print ("Cumulative sum (axis=1) of the elements in array a=\n", a.cumsum(axis=1))

[[1 2]
 [3 4]]
Cumulative sum of the elements in array a=
 [ 1  3  6 10]
Cumulative sum (axis=0) of the elements in array a=
 [[1 2]
 [4 6]]
Cumulative sum (axis=1) of the elements in array a=
 [[1 3]
 [3 7]]


In [248]:
print(q)
q.cumsum(axis=0)

[[[  0   1]
  [  2   3]
  [  4   5]]

 [[ 66  77]
  [ 88  99]
  [100 111]]

 [[122 133]
  [144 155]
  [166 177]]

 [[ 18  19]
  [ 20  21]
  [ 22  23]]]


array([[[  0,   1],
        [  2,   3],
        [  4,   5]],

       [[ 66,  78],
        [ 90, 102],
        [104, 116]],

       [[188, 211],
        [234, 257],
        [270, 293]],

       [[206, 230],
        [254, 278],
        [292, 316]]])

In [249]:
# Product of all elements: 1*2*3*4
a.prod()

24
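
Like `cumsum`, the functions `sum` and `prod` accept an `axis` argument. The sketch below (variable names are illustrative) reduces the same 2 x 2 array along each axis; `keepdims=True` keeps the reduced axis with length 1, which is convenient for later broadcasting.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

col_sums = a.sum(axis=0)    # collapse the rows: one sum per column
row_sums = a.sum(axis=1)    # collapse the columns: one sum per row
col_prods = a.prod(axis=0)  # one product per column

print(col_sums)   # [4 6]
print(row_sums)   # [3 7]
print(col_prods)  # [3 8]

row_sums_2d = a.sum(axis=1, keepdims=True)
print(row_sums_2d.shape)  # (2, 1)
```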

## Statistics

In [250]:
print (f"for a={a},")
print (f"Max = {np.max(a)}, Min = {np.min(a)}")
print (f"Mean = {np.mean(a)}, Median = {np.median(a)}")
print (f"Variance = {np.var(a)}, Standard deviation = {np.std(a)}")
print(f"Quartiles={np.percentile(a, [25, 50, 75])}")

for a=[[1 2]
 [3 4]],
Max = 4, Min = 1
Mean = 2.5, Median = 2.5
Variance = 1.25, Standard deviation = 1.118033988749895
Quartiles=[1.75 2.5  3.25]
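
By default, `np.var` and `np.std` divide by N (population statistics), which is why the variance above is 1.25. For the sample variance, which divides by N - 1, pass `ddof=1`. A minimal sketch with illustrative variable names:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

population_var = np.var(a)       # sum of squared deviations / N       = 5 / 4
sample_var = np.var(a, ddof=1)   # sum of squared deviations / (N - 1) = 5 / 3
sample_std = np.std(a, ddof=1)   # square root of the sample variance

print(population_var)  # 1.25
print(sample_var)
print(sample_std)
```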


## Filtering

In [251]:
celsius = np.array([[-17, -11, -20, -15]])
print(f"{celsius < -15}")

[[ True False  True False]]


In [252]:
print(a)
my_mask = a > 1
print("my_mask=\n", my_mask)
print("a[my_mask]=\n ", a[my_mask])

[[1 2]
 [3 4]]
my_mask=
 [[False  True]
 [ True  True]]
a[my_mask]=
  [2 3 4]


In [253]:
my_mask = np.logical_and(a > 1, a < 3)
print(my_mask)
print("1 < a < 3 mask -> ", a[my_mask])

[[False  True]
 [False False]]
1 < a < 3 mask ->  [2]


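NumPy provides the following logical operations: logical_and, logical_or, logical_not, and logical_xor.

The operators & and | are element-wise alternatives to logical_and and logical_or. They bind more tightly than comparison operators, so each comparison must be wrapped in parentheses.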

In [254]:
my_mask = (a > 1) & (a < 3)
print(my_mask)
print(a[my_mask])

[[False  True]
 [False False]]
[2]
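
The other operators work the same way: `|` for "or" and `~` for "not". The sketch below (variable names are illustrative) also shows what happens when the parentheses around the comparisons are left out: Python then evaluates `1 & a` first and ends up with a chained comparison, which raises a `ValueError` for arrays.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

outside = a[(a <= 1) | (a >= 3)]   # element-wise "or"
not_two = a[~(a == 2)]             # element-wise "not"
print(outside)  # [1 3 4]
print(not_two)  # [1 3 4]

# Without parentheses the expression is parsed as a > (1 & a) < 3
try:
    a > 1 & a < 3
    raised = False
except ValueError:
    raised = True
print(raised)   # True
```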


In [255]:
# Test whether each element of an array is also present in a second array.
arr = np.array([0, 1, 2, 5, 0])
states = [0, 2]
mask = np.isin(arr, states)  # np.isin replaces np.in1d, which is deprecated since NumPy 2.0
mask 

array([ True, False,  True, False,  True])

In [256]:
# Choose elements from multiple arrays depending on condition
a1 = np.array([1, 2, 3])
a2 = np.array([4, 5, 6])
cond = np.array([True, False, True])
np.where(cond, a1, a2) # Returns the element from a1 if cond is true, otherwise returns the element from a2

array([1, 5, 3])

In [257]:
a1 = np.random.randn(2, 3)
print(a1)
np.where(a1 > 0, a1, 0)

[[-0.14582304 -0.48838021 -0.92405107]
 [ 0.9896214  -0.85033342 -1.71731785]]


array([[0.       , 0.       , 0.       ],
       [0.9896214, 0.       , 0.       ]])
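
When called with only a condition, `np.where` returns the indices of the elements that satisfy it, one index array per axis. A short sketch with illustrative variable names:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

rows, cols = np.where(a > 1)   # positions where the condition is True
print(rows)            # [0 1 1]
print(cols)            # [1 0 1]
print(a[rows, cols])   # [2 3 4], the same elements that a[a > 1] selects
```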

## Broadcasting

In [258]:
f = np.array([1, 2, 3])
x = np.ones((3, 3))
print("[1, 2, 3] + 5 = " , f + 5)
print("\n[[1, 1, 1]\n [1, 1, 1]\n [1, 1, 1]] + \n[1, 2, 3] =\n" , x + f)
print("\n[[1]\n [1]\n [1]]\n+ [1, 2, 3] = \n" , np.array([[1], [1], [1]]) + f)

[1, 2, 3] + 5 =  [6 7 8]

[[1, 1, 1]
 [1, 1, 1]
 [1, 1, 1]] + 
[1, 2, 3] =
 [[2. 3. 4.]
 [2. 3. 4.]
 [2. 3. 4.]]

[[1]
 [1]
 [1]]
+ [1, 2, 3] = 
 [[2 3 4]
 [2 3 4]
 [2 3 4]]
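
Broadcasting compares shapes from the last axis backwards; two axes are compatible when they are equal or when one of them is 1. The sketch below (variable names are illustrative) uses `np.broadcast_shapes` to predict the result shape and shows that incompatible shapes raise a `ValueError`.

```python
import numpy as np

column = np.ones((3, 1))
row = np.array([1, 2, 3])      # shape (3,)

# (3, 1) and (3,) broadcast to (3, 3)
print(np.broadcast_shapes(column.shape, row.shape))  # (3, 3)
result = column + row
print(result.shape)            # (3, 3)

# (3, 3) and (2,) are incompatible: the last axes are 3 and 2
try:
    np.ones((3, 3)) + np.array([1, 2])
    compatible = True
except ValueError:
    compatible = False
print(compatible)              # False
```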


## Save and load to/from file

In [259]:
# Save and load to/from file
print("r1= ", r1)
np.save("random.npy", r1) # Binary format; the .npy extension is appended if it is missing
r2 = np.load("random.npy")
print("r2= ", r2)

r1=  [[4 5 5 4]
 [5 8 4 5]
 [3 4 1 6]]
r2=  [[4 5 5 4]
 [5 8 4 5]
 [3 4 1 6]]


In [260]:
# Delete as usual
import os
os.remove('random.npy')
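
Two related options are `np.savez`, which stores several arrays in one `.npz` file, and `np.savetxt`, which writes a human-readable text file. The sketch below is an illustration only: the file names and array names are made up, and a temporary directory is used so that nothing is left behind.

```python
import os
import tempfile

import numpy as np

ints = np.arange(6).reshape((2, 3))
floats = np.linspace(0.0, 1.0, 5)

with tempfile.TemporaryDirectory() as tmp_dir:
    # Several arrays in one binary archive, retrieved by keyword name
    npz_path = os.path.join(tmp_dir, "arrays.npz")
    np.savez(npz_path, ints=ints, floats=floats)
    with np.load(npz_path) as archive:
        ints_back = archive["ints"]
        floats_back = archive["floats"]

    # Plain-text format for a 1-D or 2-D array
    txt_path = os.path.join(tmp_dir, "ints.txt")
    np.savetxt(txt_path, ints, fmt="%d")
    ints_from_text = np.loadtxt(txt_path, dtype=int)

print(np.array_equal(ints, ints_back))       # True
print(np.array_equal(ints, ints_from_text))  # True
```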

## Others

In [261]:
# NumPy provides np.sqrt, np.nan, and np.inf (the capitalized aliases np.NaN and np.Inf were removed in NumPy 2.0).
# For single scalar values, the standard math package offers the same functionality:

# Square root
import math
print ("sqrt of 5= ", math.sqrt(5))

# Not a number -> application = missing values
y = math.nan
print("y = ", y)

# Infinity
print("Infinity = ", math.inf)


sqrt of 5=  2.23606797749979
y =  nan
Infinity =  inf
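
For arrays, NumPy has its own element-wise `np.sqrt` along with the constants `np.nan` and `np.inf`. The sketch below (variable names are illustrative) also shows how missing values can be detected and skipped in an aggregate.

```python
import numpy as np

values = np.array([1.0, 4.0, np.nan, np.inf])

roots = np.sqrt(values)       # element-wise square root
missing = np.isnan(values)    # True where the value is NaN
infinite = np.isinf(values)   # True where the value is infinite

print(roots)     # [ 1.  2. nan inf]
print(missing)   # [False False  True False]
print(infinite)  # [False False False  True]

# nanmean ignores NaN entries, which is useful for missing data
mean_ignoring_nan = np.nanmean(np.array([1.0, 4.0, np.nan]))
print(mean_ignoring_nan)  # 2.5
```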
