# NumPy

NumPy is an open source Python library that's used in science and engineering. The NumPy API is used extensively in Pandas, SciPy, Matplotlib, scikit-learn, scikit-image and most other data science and scientific Python packages.

NumPy gives us a wide range of fast and efficient ways to create arrays and manipulate the numerical data inside them. While a Python list can contain different data types within a single list, all of the elements in a NumPy array share a single data type (the array is homogeneous).
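
To illustrate the homogeneity point above, here is a minimal sketch. The variable names are made up for illustration; it only shows that NumPy picks one common data type when the input values are mixed.

```python
import numpy as np

# A Python list can hold an int and a float side by side.
mixed_list = [1, 2.5, 3]

# Converting the list to an array upcasts every element to one common dtype.
mixed_array = np.array(mixed_list)

print(mixed_array.dtype)  # float64
print(mixed_array)
```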

NumPy arrays are faster and more compact than Python lists. An array consumes less memory than an equivalent list and is convenient to use. NumPy also provides a mechanism for specifying the data type of the elements, which allows the code to be optimized further.
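
As a rough sketch of how the data type affects memory use, the example below compares two arrays of the same length with different dtypes. The array names and the length of 1000 are arbitrary choices for illustration.

```python
import numpy as np

# 1000 elements stored as 64-bit integers: 8 bytes per element.
wide = np.zeros(1000, dtype=np.int64)

# The same number of elements stored as 8-bit integers: 1 byte per element.
narrow = np.zeros(1000, dtype=np.int8)

print(wide.itemsize, wide.nbytes)      # 8 8000
print(narrow.itemsize, narrow.nbytes)  # 1 1000
```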

### NumPy Arrays

In [1]:
import numpy as np

In [13]:
age = np.array([22, 25, 32, 22, 21])

for i in range(len(age)):
    print(age[i])

22
25
32
22
21


An N-dimensional array ("ndarray") is simply an array with any number of dimensions. The ndarray class is used to represent both matrices and vectors.

### Attributes of an array

In NumPy, dimensions are called axes.

In [5]:
[[0., 0., 0.],
[1., 1., 1.,]]

[[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]

The above array has 2 axes. The first axis has a length of 2 and the second axis has a length of 3.
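
A short sketch of what the axes mean in practice, assuming the nested list above is wrapped in np.array (the name arr is chosen here for illustration). Reducing along axis 0 collapses the rows, and reducing along axis 1 collapses the columns.

```python
import numpy as np

arr = np.array([[0., 0., 0.],
                [1., 1., 1.]])

print(arr.ndim)   # 2 axes
print(arr.shape)  # (2, 3): axis 0 has length 2, axis 1 has length 3

# Summing along axis 0 adds the two rows together element by element.
print(arr.sum(axis=0))  # [1. 1. 1.]

# Summing along axis 1 adds the three values within each row.
print(arr.sum(axis=1))  # [0. 3.]
```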

### Creating Basic Arrays

- np.array()
- np.zeros()
- np.ones()
- np.empty()
- np.arange()
- np.linspace()
- dtype

In [3]:
a = np.array([1, 2, 3])
print(a)

[1 2 3]


In [5]:
zero_array = np.zeros(2)
ones_array = np.ones(2)

print(zero_array)
print(ones_array)

[0. 0.]
[1. 1.]


In [6]:
# np.empty creates an array without initializing its elements;
# the values are whatever happens to be in that memory at the time
empty_array = np.empty(2)
print(empty_array)

[1. 1.]


In [8]:
# creating array with range of elements
firstTen_array = np.arange(10)
print(firstTen_array)

# arange with a step: np.arange(start, stop, step); stop itself is excluded
firstTenEven_array = np.arange(0, 10, 2)
print(firstTenEven_array)

[0 1 2 3 4 5 6 7 8 9]
[0 2 4 6 8]


In [9]:
# num controls total number of elements in the output array
spacelin_array = np.linspace(0, 10, num = 4)
print(spacelin_array)

[ 0.          3.33333333  6.66666667 10.        ]


In [11]:
# we can specify our datatype using dtype argument
onesInt64_array = np.ones(2, dtype = np.int64)
print(onesInt64_array)

[1 1]


### Adding, Removing, and Sorting elements

- np.sort()
    - argsort
    - lexsort
    - searchsorted
    - partition
- np.concatenate()

In [14]:
arr = np.array([10, 20, 30, 40, 60, 50])

In [15]:
sorted_arr = np.sort(arr)
print(sorted_arr)

[10 20 30 40 50 60]


In [23]:
# argsort: returns the indices that would sort the array along the given axis

# 1D array
arr1 = np.array([35, 22, 30])
print(np.argsort(arr1))

# 2D array
arr2 = np.array([[20, 30], 
                 [40, 50]])
print('Sorting along column: \n', np.argsort(arr2, axis = 1))

[1 2 0]
Sorting along column: 
 [[0 1]
 [0 1]]


In [26]:
# lexsort: indirect sort on several keys; the LAST key (arr2 here) is the primary one

arr1 = [1,5,1,4,5,9,4]
arr2 = [150,239,832,333,696,787,222]

print(np.lexsort((arr1, arr2)))

[0 6 1 3 4 5 2]
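
In the lexsort cell above every value in arr2 is unique, so arr1 never gets to break a tie. The sketch below uses made-up keys with repeated values to show the priority order: the last key in the tuple is sorted first, and earlier keys only decide between equal entries.

```python
import numpy as np

primary = np.array([2, 1, 2, 1])    # hypothetical main sort key
secondary = np.array([9, 8, 7, 6])  # hypothetical tie-breaker

# The last key passed to lexsort is the primary key.
order = np.lexsort((secondary, primary))
print(order)  # [3 1 2 0]

# Applying the indices shows both keys in their final order.
print(primary[order])    # [1 1 2 2]
print(secondary[order])  # [6 8 7 9]
```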


In [28]:
#searchsorted: find indices where elements should be inserted to maintain order
arr1 = [232, 333, 444, 565, 689]
element_to_be_inserted = 569

print(np.searchsorted(arr1, element_to_be_inserted))

4


In [30]:
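# partition: the element at index kth (here 1) lands in its sorted position;
# smaller elements come before it and equal or larger ones after it, in no particular order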
arr1 = [1,5,1,4,5,9,4]
print(np.partition(arr1, 1))

[1 1 5 4 5 9 4]
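
The following sketch checks what np.partition guarantees, using the same values as the cell above with kth set to 3 (an arbitrary choice). Only the element at index kth is promised to be in its sorted position; the order within the two sides is undefined, so the code tests the property rather than printing the full result.

```python
import numpy as np

arr = np.array([1, 5, 1, 4, 5, 9, 4])
k = 3  # arbitrary kth index for illustration

part = np.partition(arr, k)

# The element at index k matches the one a full sort would put there.
kth_matches_sort = bool(part[k] == np.sort(arr)[k])

# Everything before index k is <= part[k]; everything after is >= part[k].
left_ok = bool((part[:k] <= part[k]).all())
right_ok = bool((part[k + 1:] >= part[k]).all())

print(kth_matches_sort, left_ok, right_ok)  # True True True
```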


In [37]:
# concatenation
arr1 = [1,5,1,4,5,9,4]
arr2 = [150,239,832,333,696,787,222]

print(np.concatenate((arr1, arr2)))

[  1   5   1   4   5   9   4 150 239 832 333 696 787 222]


### Shape and Size of Array

- ndarray.ndim
- ndarray.size
- ndarray.shape

In [7]:
arr = np.array([10, 20, 30, 40, 50])
brr = np.array([[10,20,30],
              [40,50,60]])

print(arr.ndim)
print(arr.shape)
print(arr.size)
print('\n')
print(brr.ndim)
print(brr.shape)
print(brr.size)

1
(5,)
5


2
(2, 3)
6


### Reshaping an Array

- arr.reshape()

In [10]:
arr = np.arange(10)
print(arr, '\n')

reshaped_arr = arr.reshape(2,5)
print(reshaped_arr, '\n')

reshaped_arr = arr.reshape(5,2)
print(reshaped_arr)

[0 1 2 3 4 5 6 7 8 9] 

[[0 1 2 3 4]
 [5 6 7 8 9]] 

[[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]


### Converting 1D array into a 2D array

- np.newaxis
- np.expand_dims

In [13]:
arr = np.arange(10)
print(arr.shape,'\n')

arr1 = arr[np.newaxis, :]
print(arr1.shape,'\n')

arr2 = arr1[np.newaxis, :]
print(arr2.shape)

(10,) 

(1, 10) 

(1, 1, 10)


In [14]:
arr = np.arange(10)
print(arr.shape,'\n')

arr1 = np.expand_dims(arr, axis = 0)
print(arr1.shape)

(10,) 

(1, 10)


### Indexing and Slicing

In [16]:
data = np.arange(10)
print(data)

print(data[0:3])

print(data[:-1])

[0 1 2 3 4 5 6 7 8 9]
[0 1 2]
[0 1 2 3 4 5 6 7 8]


In [18]:
data2 = np.array([[1,2,3], [4,5,6], [7,8,9]])
print(data2[0:2])

[[1 2 3]
 [4 5 6]]


Conditional Indexing and Slicing

In [21]:
data = np.array([[1 , 2, 3, 4], [5, 6, 7, 8], [9, 0, 11, 12]])
condition = (data < 5)

print(data[condition])
print(condition)

[1 2 3 4 0]
[[ True  True  True  True]
 [False False False False]
 [False  True False False]]


In [22]:
data = np.array([[1 , 2, 3, 4], [5, 6, 7, 8], [9, 0, 11, 12]])
data = np.nonzero(data < 5)

print(data)

(array([0, 0, 0, 0, 2], dtype=int64), array([0, 1, 2, 3, 1], dtype=int64))


In this example, a tuple of arrays is returned: one for each dimension. The first array represents the row indices where these values are found, and the second array represents the column indices where the values are found.
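
As a small sketch of how the two index arrays fit together, the example below pairs them into (row, column) coordinates and then uses them to pull the matching values back out. The names rows, cols and coordinates are chosen here for illustration.

```python
import numpy as np

data = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 0, 11, 12]])
rows, cols = np.nonzero(data < 5)

# Pair the row and column indices into (row, col) coordinates.
coordinates = [(int(r), int(c)) for r, c in zip(rows, cols)]
print(coordinates)  # [(0, 0), (0, 1), (0, 2), (0, 3), (2, 1)]

# Indexing with both arrays returns the values at those coordinates.
print(data[rows, cols])  # [1 2 3 4 0]
```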

### Creating Array from Existing Data

- np.vstack()
- np.hstack()
- np.hsplit()
- .view()
- .copy()

In [4]:
arr1 = np.array([[10,20],
                [30,40]])

arr2 = np.array([[1, 2],
                [3, 4]])

# Stacking arrays vertically
arr_vertical = np.vstack((arr1, arr2))
print("Vstack: \n", arr_vertical)

# Stacking arrays horizontally
arr_horizontal = np.hstack((arr1, arr2))
print("\nHstack: \n", arr_horizontal)

Vstack: 
 [[10 20]
 [30 40]
 [ 1  2]
 [ 3  4]]

Hstack: 
 [[10 20  1  2]
 [30 40  3  4]]


In [9]:
# horizontal split
arr = np.arange(10).reshape(5,2)
print(arr)

arr_split = np.hsplit(arr, 2)
print("\n", arr_split)

[[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]

 [array([[0],
       [2],
       [4],
       [6],
       [8]]), array([[1],
       [3],
       [5],
       [7],
       [9]])]


In [14]:
# view
arr = np.arange(10)
print("arr: ", arr)

arr_view = arr.view()
print("\narr_view: ", arr_view)

# changing view changes original array as well
arr_view[0] = 100

print("\narr: ", arr)

arr:  [0 1 2 3 4 5 6 7 8 9]

arr_view:  [0 1 2 3 4 5 6 7 8 9]

arr:  [100   1   2   3   4   5   6   7   8   9]


In [18]:
# copy
arr = np.arange(10)
print("arr: ", arr)

arr_copy = arr.copy()
print("\narr_copy: ", arr_copy)

# modifying copy of an array does not modify the original array
arr_copy[0] = 100

print("\narr: ", arr)

arr:  [0 1 2 3 4 5 6 7 8 9]

arr_copy:  [0 1 2 3 4 5 6 7 8 9]

arr:  [0 1 2 3 4 5 6 7 8 9]


### Basic Array Operations

- Addition
- Subtraction
- Multiplication
- Division

In [2]:
# addition
arr1 = np.array([20, 30])
arr2 = np.array([0.1, 0.1])

print(arr1 + arr2)

[20.1 30.1]


In [3]:
# subtraction
arr1 = np.array([20, 30])
arr2 = np.array([0.1, 0.1])

print(arr1 - arr2)

[19.9 29.9]


In [4]:
# division
arr1 = np.array([20, 30])
arr2 = np.array([0.1, 0.1])

print(arr1 / arr2)

[200. 300.]


In [7]:
# multiplication
arr1 = np.array([20, 30])
arr2 = np.array([0.1, 0.1])

print(arr1 * arr2)

[2. 3.]


### Other Array Operations

- maximum
- minimum
- sum
- mean
- product
- standard deviation

In [13]:
arr = np.arange(10)
print(arr)

# maximum
print('Max(arr): ', arr.max())

# minimum
print('Min(arr): ', arr.min())

# sum
print('Sum(arr): ', arr.sum())

# mean
print('Mean(arr): ', arr.mean())

# standard deviation
print('StandardDeviation(arr): ', np.std(arr))

# product (np.prod; the older alias np.product was removed in NumPy 2.0)
print('Product(arr): ', np.prod(arr))

[0 1 2 3 4 5 6 7 8 9]
Max(arr):  9
Min(arr):  0
Sum(arr):  45
Mean(arr):  4.5
StandardDeviation(arr):  2.8722813232690143
Product(arr):  0


### Random Number Generation

The use of random number generation is an important part of the configuration and evaluation of many numerical and machine learning algorithms. Whether you need to randomly initialize weights in an artificial neural network, split data into random sets, or randomly shuffle your dataset, being able to generate random numbers is essential.

With Generator.integers, we can generate random integers from low (inclusive) to high (exclusive). You can set endpoint=True to make the high number inclusive.

**Source: NumPy Documentation**

In [15]:
from numpy.random import default_rng
rng = default_rng()

print(rng.integers(6, size = (3,2)))

[[2 0]
 [3 0]
 [5 4]]
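
The cell above passes a single bound, which Generator.integers treats as high with low defaulting to 0. The sketch below shows the effect of endpoint. The seed of 42 and the sample size of 1000 are arbitrary choices made here so that the run is repeatable.

```python
from numpy.random import default_rng

rng = default_rng(seed=42)  # seed chosen arbitrarily for repeatability

# Default: high is exclusive, so every value lies in 1..5.
half_open = rng.integers(1, 6, size=1000)

# endpoint=True: high is inclusive, so values may also be 6.
closed = rng.integers(1, 6, size=1000, endpoint=True)

print(half_open.min(), half_open.max())
print(closed.min(), closed.max())
```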


### Unique Items and Counts

In [18]:
arr = np.array([10,10,20,10,20,10,30,40,20,30,30,40,40,50])
print("Unique entries in arr: ", np.unique(arr))

unique_items, indices_list = np.unique(arr, return_index = True)
print("Index of unique items: ",indices_list)

unique_items, frequency = np.unique(arr, return_counts = True)
print("Frequency of Unique Items: ",frequency)

Unique entries in arr:  [10 20 30 40 50]
Index of unique items:  [ 0  2  6  7 13]
Frequency of Unique Items:  [4 3 3 3 1]


### Transposing and Reshaping a Matrix

In [23]:
arr = np.arange(15)
print("arr: ", arr)

arr_reshaped = arr.reshape(3, 5)
print("reshaped arr: \n", arr_reshaped)

arr_transposed = arr_reshaped.transpose()
print("transposed arr: \n", arr_transposed)

arr:  [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14]
reshaped arr: 
 [[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]]
transposed arr: 
 [[ 0  5 10]
 [ 1  6 11]
 [ 2  7 12]
 [ 3  8 13]
 [ 4  9 14]]


### Reversing array

In [24]:
# 1D
arr = np.arange(15)
print(arr)

arr_reversed = np.flip(arr)
print(arr_reversed)

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14]
[14 13 12 11 10  9  8  7  6  5  4  3  2  1  0]


In [31]:
arr = np.array([[10, 20, 30],
               [40, 50, 60]])

print("arr: \n", arr)

# reversing the contents of each row as well as the order of the rows
arr_reverse = np.flip(arr)
print("Reversed: \n", arr_reverse)

# reversing only rows
arr_reverse_rows = np.flip(arr, 0)
print("Reversing only rows: \n", arr_reverse_rows)

# reversing only columns
arr_reverse_columns = np.flip(arr, 1)
print("Reversing only columns: \n", arr_reverse_columns)

# reversing only 1 row
arr[1] = np.flip(arr[1])
print("Reversing one row: \n", arr)

# reversing only 1 column
arr[:, 2] = np.flip(arr[:, 2])
print("Reversing one column: \n", arr)

arr: 
 [[10 20 30]
 [40 50 60]]
Reversed: 
 [[60 50 40]
 [30 20 10]]
Reversing only rows: 
 [[40 50 60]
 [10 20 30]]
Reversing only columns: 
 [[30 20 10]
 [60 50 40]]
Reversing one row: 
 [[10 20 30]
 [60 50 40]]
Reversing one column: 
 [[10 20 40]
 [60 50 30]]


### Reshaping and Flattening Multidimensional arrays

- .flatten()
- .ravel()

The difference between the two is that flatten() always returns a copy, while the array returned by ravel() is a view of the parent array whenever possible. This means that changes made through the raveled array will affect the parent as well. Since ravel() does not create a copy in that case, it is more memory efficient.

In [4]:
arr = np.arange(16).reshape(4,4)
print(arr)

flatten_arr = arr.flatten()
print(flatten_arr)

ravel_arr = arr.ravel()
print(ravel_arr)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15]
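
The two printed results look identical, so here is a short sketch that makes the copy-versus-view difference visible. It assumes a freshly created, contiguous array like the one above, for which ravel() can return a view; np.shares_memory reports whether two arrays use the same underlying data.

```python
import numpy as np

arr = np.arange(16).reshape(4, 4)

flat_copy = arr.flatten()  # always an independent copy
flat_view = arr.ravel()    # a view here, because arr is contiguous

print(np.shares_memory(arr, flat_copy))  # False
print(np.shares_memory(arr, flat_view))  # True

# Writing through the view changes the parent array.
flat_view[0] = 100
print(arr[0, 0])  # 100

# Writing to the copy leaves the parent array untouched.
flat_copy[1] = -1
print(arr[0, 1])  # 1
```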


### Saving and Loading NumPy Objects

- np.save
- np.savez
- np.savetxt
- np.load
- np.loadtxt

The ndarray objects can be saved to and loaded from disk files with the **loadtxt** and **savetxt** functions, which handle plain text files; the **load** and **save** functions, which handle NumPy binary files with a .npy file extension; and the **savez** function, which handles NumPy files with a .npz file extension.

The .npy and .npz files store data, shape, dtype, and other information required to reconstruct the ndarray in a way that allows the array to be correctly retrieved, even when the file is on another machine with a different architecture.

If you want to store a single ndarray object, store it as a .npy file. If you want to store more than one ndarray object in a single file, save it as a .npz file.
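
The functions listed above can be sketched as a round trip. The file names and the use of a temporary directory are assumptions made here so that the example cleans up after itself; in practice you would write to a path of your choice.

```python
import os
import tempfile

import numpy as np

a = np.arange(6).reshape(2, 3)
b = np.linspace(0.0, 1.0, num=5)

with tempfile.TemporaryDirectory() as tmp_dir:
    # A single array goes into a .npy file.
    npy_path = os.path.join(tmp_dir, "a.npy")
    np.save(npy_path, a)
    a_loaded = np.load(npy_path)

    # Several arrays go into one .npz file, stored under keyword names.
    npz_path = os.path.join(tmp_dir, "arrays.npz")
    np.savez(npz_path, a=a, b=b)
    with np.load(npz_path) as archive:
        a_from_npz = archive["a"]
        b_from_npz = archive["b"]

    # Text files are human readable but do not record the dtype.
    txt_path = os.path.join(tmp_dir, "a.txt")
    np.savetxt(txt_path, a)
    a_from_txt = np.loadtxt(txt_path)

print(np.array_equal(a, a_loaded), a_loaded.dtype == a.dtype)  # True True
print(np.array_equal(b, b_from_npz))                           # True
print(a_from_txt.dtype)                                        # float64
```

Note that the text round trip returns floats even though the original array held integers, which is one reason to prefer the binary formats when the dtype matters.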