NumPy Basics

NumPy is a Python library used for working with arrays.

In Python, lists can serve the purpose of arrays, but they are slow to process for large numeric workloads.
NumPy aims to provide an array object that is up to 50x faster than traditional Python lists.
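
As a rough illustration of the speed claim, the sketch below doubles every element with a Python list comprehension and with one vectorized NumPy operation, then times both. The array size and repeat count are arbitrary assumptions, and the measured times depend on the machine and NumPy version, so no specific speedup is claimed here.

```python
import timeit
import numpy as np

n = 100_000  # arbitrary size chosen for illustration
py_list = list(range(n))
np_arr = np.arange(n)

# Double every element: a Python-level loop vs. one vectorized operation
list_result = [v * 2 for v in py_list]
arr_result = np_arr * 2

# Time both approaches; the numbers vary by machine and NumPy version
list_time = timeit.timeit(lambda: [v * 2 for v in py_list], number=20)
arr_time = timeit.timeit(lambda: np_arr * 2, number=20)

print('same values  :', list_result == arr_result.tolist())
print('list seconds :', list_time)
print('array seconds:', arr_time)
```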


In [1]:
import numpy as np
a = np.array([2,4,5,9,6]) # creates 1-D array 
b = np.array(32) # creates 0-D array
c = np.array([[1, 2, 3], [4, 5, 6]]) # creates 2-D array
d = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]]) # creates 3-D array (an array whose elements are 2-D arrays (matrices) is called a 3-D array)

print(a)
print(b)
print(c)
print(d)

print(type(a))

print(np.__version__)

[2 4 5 9 6]
32
[[1 2 3]
 [4 5 6]]
[[[1 2 3]
  [4 5 6]]

 [[1 2 3]
  [4 5 6]]]
<class 'numpy.ndarray'>
1.26.4


In [2]:
# The ndim attribute returns an integer that tells us how many dimensions the array has
print(a.ndim)
print(b.ndim)
print(c.ndim)
print(d.ndim)

1
0
2
3


In [3]:
# When the array is created, we can set the minimum number of dimensions by using the ndmin argument.

arr = np.array([1, 2, 3, 4], ndmin=5)

print(arr)
print('number of dimensions :', arr.ndim)

[[[[[1 2 3 4]]]]]
number of dimensions : 5
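
A small sketch of how ndmin behaves as a minimum rather than an exact count. The variable names `padded` and `already_2d` are made up for this illustration.

```python
import numpy as np

# ndmin adds leading axes of length 1 until the array has at least ndmin dimensions
padded = np.array([1, 2, 3, 4], ndmin=3)
print(padded.shape)      # (1, 1, 4)

# An input that already has more dimensions than ndmin is left unchanged
already_2d = np.array([[1, 2], [3, 4]], ndmin=1)
print(already_2d.ndim)   # 2
```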


In [4]:
# NumPy Array Indexing

import numpy as np

arr = np.array([1, 2, 3, 4])
print(arr[2])
print(arr[2] + arr[3])


3
7


In [5]:
# Access 2-D, 3-D Arrays

arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])

print('2nd element on 1st row: ', arr[0, 1]) # arr[row index, column index]
print('5th element on 2nd row: ', arr[1, 4]) 

brr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
print(brr[0, 1, 2])

# Negative indexing

print(arr[1, -1])


2nd element on 1st row:  2
5th element on 2nd row:  10
6
10
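
To see how the 3-D lookup above is resolved, the sketch below peels off one dimension at a time. The intermediate names `block`, `row` and `value` are only for illustration.

```python
import numpy as np

brr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

block = brr[0]    # first 2-D block: [[1, 2, 3], [4, 5, 6]]
row = block[1]    # second row of that block: [4, 5, 6]
value = row[2]    # third element of that row: 6

print(value == brr[0, 1, 2])   # True
print(brr[-1, -1, -1])         # 12, negative indexes count from the end on every axis
```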


In [6]:
# Slicing

# Note: The result includes the start index, but excludes the end index.

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print('1)', arr[1:5])
print('2)', arr[4:])
print('3)', arr[:4])
print('4)', arr[-3:-1]) # Negative Slicing
print('5)', arr[1:5:2]) # Steps
print('6)', arr[::2]) # return every other element from the entire array

# Slicing 2-D Arrays

b = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

print('7)', b[1, 1:4])
print('8)', b[0:2, 2])
print('9)', b[0:2, 1:4]) # returns a 2-D array



1) [2 3 4 5]
2) [5 6 7]
3) [1 2 3 4]
4) [5 6]
5) [2 4]
6) [1 3 5 7]
7) [7 8 9]
8) [3 8]
9) [[2 3 4]
 [7 8 9]]
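
One detail worth checking when slicing 2-D arrays is the shape of the result: an integer index drops its axis, while a slice keeps it. A minimal sketch, with `col_1d` and `col_2d` as illustrative names:

```python
import numpy as np

b = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

col_1d = b[0:2, 2]     # integer column index drops that axis
col_2d = b[0:2, 2:3]   # a slice of length 1 keeps the axis

print(col_1d.shape)    # (2,)
print(col_2d.shape)    # (2, 1)
```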


========== Data Types in Python ===============
By default, Python has these data types:

strings - used to represent text data; the text is given inside quote marks. e.g. "ABCD"
integer - used to represent integer numbers. e.g. -1, -2, -3
float - used to represent real numbers. e.g. 1.2, 42.42
boolean - used to represent True or False.
complex - used to represent complex numbers. e.g. 1.0 + 2.0j, 1.5 + 2.5j

======= Data Types in NumPy ==========
NumPy has some extra data types, and refers to data types with one character, like i for integers, u for unsigned integers, etc.

Below is a list of all data types in NumPy and the characters used to represent them.

i - integer
b - boolean
u - unsigned integer
f - float
c - complex float
m - timedelta
M - datetime
O - object
S - string
U - unicode string
V - fixed chunk of memory for other type ( void )

==== Note: ValueError: In Python, ValueError is raised when an argument passed to a function has the right type but an inappropriate value.

In [7]:
# Checking the Data Type of an Array

a = np.array([1, 2, 3, 4])
print('a:', a.dtype)

b = np.array(['apple', 'banana', 'cherry'])
print('b:', b.dtype)

c = np.array([1, 2, 3, 4], dtype='S')
print('c:', c)
print(c.dtype)

d = np.array([1, 2, 3, 4], dtype='i4')
print('d:', d)
print(d.dtype)

a: int32
b: <U6
c: [b'1' b'2' b'3' b'4']
|S1
d: [1 2 3 4]
int32
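
The one-character codes can be followed by a size in bytes, as in 'i4' above. The sketch below checks a few such codes against NumPy's named types and inspects the element size; it is an illustration, not a complete list of codes.

```python
import numpy as np

d = np.array([1, 2, 3, 4], dtype='i4')

print(np.dtype('i4') == np.int32)     # True: 'i4' is a 4-byte integer
print(np.dtype('f8') == np.float64)   # True: 'f8' is an 8-byte float
print(d.itemsize)                     # 4 (bytes per element)
print(d.dtype.kind)                   # i
```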


In [8]:
##### ValueError: In Python, ValueError is raised when an argument passed to a function has the right type but an inappropriate value.

# A non-integer string like 'a' cannot be converted to an integer (this raises an error):

arr = np.array(['a', '2', '3'], dtype='i')

ValueError: invalid literal for int() with base 10: 'a'
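
If the input may contain bad values, the conversion can be wrapped in try/except. This is a sketch only; setting `arr` to None on failure is an assumption made for the example, not a NumPy convention.

```python
import numpy as np

try:
    arr = np.array(['a', '2', '3'], dtype='i')
except ValueError as err:
    arr = None  # illustrative fallback when conversion fails
    print('conversion failed:', err)

# Strings that all look like integers convert without error
ok = np.array(['1', '2', '3'], dtype='i')
print(ok)
```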


Converting Data Type on Existing Arrays
The best way to change the data type of an existing array is to make a copy of the array with the astype() method.

The astype() method creates a copy of the array and lets you specify the data type as a parameter.

The data type can be specified using a string, like 'f' for float, 'i' for integer, etc., or you can use the data type directly, like float for float and int for integer.

In [9]:
# Change data type from float to integer by using 'i' as parameter value:

arr = np.array([1.1, 2.1, 3.1])

newarr = arr.astype('i')

print('newarr : ',newarr)
print(newarr.dtype)

newarr1 = arr.astype(int) # bool also can be used for boolean
print('newarr1 : ', newarr1)
print(newarr1.dtype)

newarr :  [1 2 3]
int32
newarr1 :  [1 2 3]
int32
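
Two points the example above does not show: converting floats to integers truncates toward zero rather than rounding, and the original array is left untouched. A small sketch with illustrative values:

```python
import numpy as np

arr = np.array([1.9, -1.9, 0.0, 2.5])

as_int = arr.astype(int)     # fractional part is dropped, not rounded
as_bool = arr.astype(bool)   # 0.0 becomes False, everything else True

print(as_int.tolist())    # [1, -1, 0, 2]
print(as_bool.tolist())   # [True, True, False, True]
print(arr.tolist())       # [1.9, -1.9, 0.0, 2.5], the original is unchanged
```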


Copy and View

Copy : The copy owns the data and any changes made to the copy will not affect original array, and any changes made to the original array will not affect the copy.

View : The view does not own the data and any changes made to the view will affect the original array, and any changes made to the original array will affect the view.

In [10]:
#====== Copy =====#
# A copy is independent of the original: changes made to the original after copying do not affect the copy

a = np.array([1, 2, 3, 4, 5])
x = a.copy()
a[0] = 42
print('Copy example: ')
print('a: ', a)
print('x: ',x)

#====== View =====#
# A view shares its data with the original: if you change the original data, the view also changes

b = np.array([1, 2, 3, 4, 5])
y = b.view()
b[0] = 42
print('View example: ')
print('b: ', b)
print('y: ',y)


Copy example: 
a:  [42  2  3  4  5]
x:  [1 2 3 4 5]
View example: 
b:  [42  2  3  4  5]
y:  [42  2  3  4  5]
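
Basic slicing also produces a view, which is an easy way to modify an array by accident. The sketch below shows the effect and how an explicit copy() avoids it; the names `s` and `c` are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

s = a[1:4]       # basic slicing returns a view of a
s[0] = 99        # writes through to a
print(a.tolist())   # [1, 99, 3, 4, 5]

c = a[1:4].copy()   # an explicit copy is independent
c[0] = -1
print(a.tolist())   # still [1, 99, 3, 4, 5]
```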


In [11]:
#=================Important =============#
# Check whether the array owns its data
# The base attribute is None if the array owns its data; otherwise it refers to the original array.

arr = np.array([1, 2, 3, 4, 5])

x = arr.copy()
y = arr.view()

print(x.base)
print(y.base)


None
[1 2 3 4 5]
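
The same check can be written as explicit tests instead of reading printed output. np.shares_memory() is used here as a second way to confirm whether two arrays use the same data.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
x = arr.copy()
y = arr.view()

print(x.base is None)             # True: the copy owns its data
print(y.base is arr)              # True: the view refers back to arr
print(np.shares_memory(arr, x))   # False
print(np.shares_memory(arr, y))   # True
```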


In [12]:
# Shape of an Array
# The shape of an array is the number of elements in each dimension.
# NumPy arrays have an attribute called shape that returns a tuple in which each entry gives the number of elements along the corresponding dimension.

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(arr.shape)

# Example: Create an array with 5 dimensions using ndmin from a vector with values 1,2,3,4 and verify that the last dimension has size 4:

b = np.array([1, 2, 3, 4], ndmin=5)

print(b)
print('shape of array :', b.shape)

(2, 4)
[[[[[1 2 3 4]]]]]
shape of array : (1, 1, 1, 1, 4)


In [13]:
# Reshaping arrays
# Reshaping means changing the shape of an array.
# reshape() returns a view of the original array where possible, not a copy

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
newarr = arr.reshape(4, 3)
print('newarr =', newarr)

b = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
newb = b.reshape(2, 3, 2)
print('newb =', newb)

newarr = [[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
newb = [[[ 1  2]
  [ 3  4]
  [ 5  6]]

 [[ 7  8]
  [ 9 10]
  [11 12]]]
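
A short sketch of two reshape() properties: the result here shares memory with the original, and the requested shape must hold exactly the same number of elements. The `failed` flag is only there to make the outcome easy to check.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
newarr = arr.reshape(4, 3)

print(np.shares_memory(arr, newarr))   # True: newarr is a view of arr
newarr[0, 0] = 100
print(arr[0])                          # 100, the change is visible in arr

# 12 elements cannot fill a 5 x 3 array, so this raises ValueError
try:
    arr.reshape(5, 3)
    failed = False
except ValueError as err:
    failed = True
    print('cannot reshape:', err)
```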



Unknown Dimension
You are allowed to have one "unknown" dimension.

This means that you do not have to specify an exact number for one of the dimensions in the reshape method.

Pass -1 as the value, and NumPy will calculate this number for you.

In [14]:

# Example
# Convert 1D array with 8 elements to 3D array with 2x2 elements:

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

newarr = arr.reshape(2, 2, -1)
print(newarr)

# Note: We can not pass -1 to more than one dimension.


[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]
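
The sketch below shows the inferred dimension in a few positions, and two cases that fail: using -1 twice, and asking for a shape the element count cannot fill. The `errors` list is a helper made up for this example.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

print(arr.reshape(2, -1).shape)      # (2, 4)
print(arr.reshape(-1, 2, 2).shape)   # (2, 2, 2)

errors = []
for bad_shape in [(-1, -1), (3, -1)]:
    try:
        arr.reshape(bad_shape)
    except ValueError as err:
        # (-1, -1): only one unknown dimension is allowed
        # (3, -1): 8 elements cannot be divided into 3 equal rows
        errors.append(bad_shape)
        print(bad_shape, '->', err)
```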


In [15]:

# Flattening the arrays
# Flattening array means converting a multidimensional array into a 1D array.

# We can use reshape(-1) to do this.

arr = np.array([[1, 2, 3], [4, 5, 6]])
newarr = arr.reshape(-1)
print(newarr)

# NumPy has many functions for changing the shape of arrays (flatten, ravel) and for rearranging the elements (rot90, flip, fliplr, flipud, etc.).


[1 2 3 4 5 6]
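
A brief sketch of some of the functions mentioned above. For this contiguous array, flatten() returns an independent copy while ravel() returns a view; that difference is the main reason to choose one over the other.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

flat_copy = arr.flatten()   # always a copy
flat_view = arr.ravel()     # a view when possible, as it is here

print(np.shares_memory(arr, flat_copy))   # False
print(np.shares_memory(arr, flat_view))   # True

print(np.fliplr(arr).tolist())   # [[3, 2, 1], [6, 5, 4]], columns reversed
print(np.flipud(arr).tolist())   # [[4, 5, 6], [1, 2, 3]], rows reversed
print(np.rot90(arr).shape)       # (3, 2)
```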


In [16]:
# Iterating Arrays

arr = np.array([[1, 2, 3], [4, 5, 6]])
for x in arr:
  print(x)

# To return the actual values, the scalars, we have to iterate the arrays in each dimension.

b = np.array([[1, 2, 3], [4, 5, 6]])

for x in b:
  for y in x:
    print(y)

c = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
for x in c:
    print("x represents the 2-D array:")
    print(x)


[1 2 3]
[4 5 6]
1
2
3
4
5
6
x represents the 2-D array:
[[1 2 3]
 [4 5 6]]
x represents the 2-D array:
[[ 7  8  9]
 [10 11 12]]


In [17]:

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

for x in arr:
  for y in x:
    for z in y:
      print(z)
        

1
2
3
4
5
6
7
8
9
10
11
12



Iterating Arrays Using nditer()


In [18]:

# Iterating on Each Scalar Element
# With basic for loops, iterating through each scalar of an n-dimensional array takes n nested for loops, which can be difficult to write for arrays with very high dimensionality.

arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

for x in np.nditer(arr):
  print(x)

1
2
3
4
5
6
7
8


In [19]:

# Iterate through the array as a string:

arr = np.array([1, 2, 3])

for x in np.nditer(arr, flags=['buffered'], op_dtypes=['S']):
  print(x)

# Iterate through every scalar element of the 2-D array, skipping every other element:

b = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

for y in np.nditer(b[:, ::2]):
  print(y)

b'1'
b'2'
b'3'
1
3
5
7


In [20]:
# Enumerated Iteration Using ndenumerate()
# np.ndenumerate is a convenience iterator that loops over all elements of the array and gives you their index (multi-dimensional) along with the value.

arr = np.array([[1, 2, 3], [4,5,6]])

for idx, x in np.ndenumerate(arr):
  print(idx, x)


(0, 0) 1
(0, 1) 2
(0, 2) 3
(1, 0) 4
(1, 1) 5
(1, 2) 6


In [29]:
# Joining NumPy Arrays
# concatenate() function

# We pass a sequence of arrays that we want to join to the concatenate() function, along with the axis. If axis is not explicitly passed, it is taken as 0.

arr1 = np.array([[1, 2], [3, 4]])

arr2 = np.array([[5, 6], [7, 8]])

x = np.concatenate((arr1, arr2))
y = np.concatenate((arr1, arr2), axis=1)

print('x:', x)
print('y:', y)

x: [[1 2]
 [3 4]
 [5 6]
 [7 8]]
y: [[1 2 5 6]
 [3 4 7 8]]


In [33]:
# Joining Arrays Using Stack Functions
# Stacking is the same as concatenation; the only difference is that stacking is done along a new axis.

import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

x = np.stack((arr1, arr2))            # stack as rows → shape (2, 3)
y = np.stack((arr1, arr2), axis=1)    # stack as columns → shape (3, 2)
h = np.hstack((arr1, arr2))           # join side by side → shape (6,)
v = np.vstack((arr1, arr2))           # join one on top of the other → shape (2, 3)

print('x:', x)
print('y:', y)
print('h:', h)
print('v:', v)


x: [[1 2 3]
 [4 5 6]]
y: [[1 4]
 [2 5]
 [3 6]]
h: [1 2 3 4 5 6]
v: [[1 2 3]
 [4 5 6]]
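
Comparing result shapes makes the "new axis" point concrete. dstack() is not used above; it is included here as one more member of the same family.

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

print(np.concatenate((arr1, arr2)).shape)   # (6,): joined along the existing axis
print(np.stack((arr1, arr2)).shape)         # (2, 3): a new first axis is created
print(np.stack((arr1, arr2), axis=1).shape) # (3, 2): the new axis is placed second
print(np.dstack((arr1, arr2)).shape)        # (1, 3, 2): stacked along the third axis
```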


In [36]:
# Splitting NumPy Arrays
# array_split()

arr = np.array([1, 2, 3, 4, 5, 6])

newarr = np.array_split(arr, 3)

print(newarr) # Note: The return value is a list containing three arrays.

##### If the array has fewer elements than required, it will adjust from the end accordingly. ########

newarr2 = np.array_split(arr, 4)

print(newarr2)


[array([1, 2]), array([3, 4]), array([5, 6])]
[array([1, 2]), array([3, 4]), array([5]), array([6])]



Note: The split() function is also available, but it does not adjust when the array cannot be divided equally. In the example above, array_split() worked properly, but split() would fail.
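
The difference can be checked directly. In this sketch split() succeeds for 3 equal parts and raises ValueError for 4; the `split_failed` flag is only there to record the outcome.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

# 6 elements divide evenly into 3 parts, so split() works
equal_parts = np.split(arr, 3)
print([part.tolist() for part in equal_parts])   # [[1, 2], [3, 4], [5, 6]]

# 6 elements do not divide evenly into 4 parts, so split() raises ValueError
try:
    np.split(arr, 4)
    split_failed = False
except ValueError as err:
    split_failed = True
    print('split() failed:', err)

# array_split() accepts the uneven division
uneven = np.array_split(arr, 4)
print([part.tolist() for part in uneven])   # [[1, 2], [3, 4], [5], [6]]
```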


In [38]:
# Accessing the arrays from the result

print(newarr[0])
print(newarr[1])
print(newarr[2])

[1 2]
[3 4]
[5 6]


In [39]:
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18]])

newarr = np.array_split(arr, 3, axis=1)

print(newarr)


[array([[ 1],
       [ 4],
       [ 7],
       [10],
       [13],
       [16]]), array([[ 2],
       [ 5],
       [ 8],
       [11],
       [14],
       [17]]), array([[ 3],
       [ 6],
       [ 9],
       [12],
       [15],
       [18]])]


In [42]:
# hsplit() or vsplit()

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18]])

newarr = np.hsplit(arr, 3)

print(newarr)

[array([[ 1],
       [ 4],
       [ 7],
       [10],
       [13],
       [16]]), array([[ 2],
       [ 5],
       [ 8],
       [11],
       [14],
       [17]]), array([[ 3],
       [ 6],
       [ 9],
       [12],
       [15],
       [18]])]
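
Only hsplit() is shown above. For comparison, here is a sketch of vsplit() on the same array, which splits along the rows instead of the columns.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9],
                [10, 11, 12], [13, 14, 15], [16, 17, 18]])

parts = np.vsplit(arr, 3)   # three blocks of two rows each

print(len(parts))           # 3
print(parts[0].tolist())    # [[1, 2, 3], [4, 5, 6]]
print(parts[2].shape)       # (2, 3)
```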


In [43]:
#### Searching Arrays
# You can search an array for a certain value, and return the indexes that get a match.
# where()

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 4, 4])

x = np.where(arr == 4)

print(x)


(array([3, 5, 6], dtype=int64),)


In [44]:
# Find the indexes where the values are even:

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

x = np.where(arr%2 == 0)

print(x)

(array([1, 3, 5, 7], dtype=int64),)
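
The index tuple returned by where() can be used directly to pull out the matching values. where() also has a three-argument form that picks between two values per element; replacing odd values with 0 below is just an example.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

idx = np.where(arr % 2 == 0)
evens = arr[idx]                          # values at the matching indexes
masked = np.where(arr % 2 == 0, arr, 0)   # keep even values, replace the rest with 0

print(evens.tolist())    # [2, 4, 6, 8]
print(masked.tolist())   # [0, 2, 0, 4, 0, 6, 0, 8]
```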


In [48]:
###### Search Sorted
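# searchsorted() performs a binary search and returns the index where a value should be inserted to maintain the sort order.
# It assumes that the array is already sorted.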

arr = np.array([6, 7, 8, 9])

x = np.searchsorted(arr, 7)

print(x)

# Example explained: The number 7 should be inserted at index 1 to maintain the sort order.
# The method starts the search from the left and returns the first index where the number 7 is no longer larger than the next value.

# By default the leftmost index is returned, but we can pass side='right' to return the rightmost index instead.
b = np.array([6, 7, 8, 9])

y = np.searchsorted(b, 7, side='right')

print(y)


1
2
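
searchsorted() also accepts several values at once and returns one insertion index per value. The query values below are chosen for illustration, including one smaller and one larger than everything in the array.

```python
import numpy as np

arr = np.array([6, 7, 8, 9])

left = np.searchsorted(arr, [5, 7, 10])
right = np.searchsorted(arr, [5, 7, 10], side='right')

print(left.tolist())    # [0, 1, 4]
print(right.tolist())   # [0, 2, 4]
```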


In [55]:
##### NumPy Sorting Arrays
# sort()

arr = np.array([3, 2, 0, 1])
print(np.sort(arr))

# Note: np.sort() returns a sorted copy of the array, leaving the original array unchanged.

b = np.array(['banana', 'cherry', 'apple'])
print(np.sort(b))

# If you use sort() on a 2-D array, each row is sorted independently (sorting is along the last axis by default):

c = np.array([[3, 2, 4], [5, 0, 1]])
print(np.sort(c))


[0 1 2 3]
['apple' 'banana' 'cherry']
[[2 3 4]
 [0 1 5]]
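
A sketch of three related options: sorting down the columns with axis=0, getting the sort order with argsort(), and sorting in place with the array's own sort() method.

```python
import numpy as np

c = np.array([[3, 2, 4], [5, 0, 1]])
by_column = np.sort(c, axis=0)   # sort each column instead of each row
print(by_column.tolist())        # [[3, 0, 1], [5, 2, 4]]

arr = np.array([3, 2, 0, 1])
order = np.argsort(arr)          # indexes that would sort arr
print(order.tolist())            # [2, 3, 1, 0]

arr.sort()                       # in-place: modifies arr and returns None
print(arr.tolist())              # [0, 1, 2, 3]
```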


In [57]:
########## NumPy Filter Array
# Getting some elements out of an existing array and creating a new array out of them is called filtering.
# In NumPy, you filter an array using a boolean index list.
# A boolean index list is a list of booleans corresponding to indexes in the array.
# If the value at an index is True that element is contained in the filtered array, if the value at that index is False that element is excluded from the filtered array.

arr = np.array([41, 42, 43, 44])
x = arr[[True, False, True, False]]
print(x)




[41 43]


In [58]:
# Creating the Filter Array
# In the example above we hard-coded the True and False values, but the common use is to create a filter array based on conditions.
# Example: Create a filter array that will return only values higher than 42:

import numpy as np

arr = np.array([41, 42, 43, 44])

# Create an empty list
filter_arr = []

# go through each element in arr
for element in arr:
  # if the element is higher than 42, set the value to True, otherwise False:
  if element > 42:
    filter_arr.append(True)
  else:
    filter_arr.append(False)

newarr = arr[filter_arr]

print(filter_arr)
print(newarr)

[False, False, True, True]
[43 44]


In [59]:
# Create a filter array that will return only even elements from the original array:

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

# Create an empty list
filter_arr = []

# go through each element in arr
for element in arr:
  # if the element is evenly divisible by 2, set the value to True, otherwise False
  if element % 2 == 0:
    filter_arr.append(True)
  else:
    filter_arr.append(False)

newarr = arr[filter_arr]

print(filter_arr)
print(newarr)

[False, True, False, True, False, True, False]
[2 4 6]


In [60]:
# Creating Filter Directly From Array

# Create a filter array that will return only values higher than 42:

import numpy as np

arr = np.array([41, 42, 43, 44])

filter_arr = arr > 42

newarr = arr[filter_arr]

print(filter_arr)
print(newarr)


[False False  True  True]
[43 44]
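
Conditions can also be combined directly on the array. The sketch below uses & (and), | (or) and ~ (not); each condition needs its own parentheses because these operators bind more tightly than comparisons.

```python
import numpy as np

arr = np.array([41, 42, 43, 44])

even_above_41 = arr[(arr > 41) & (arr % 2 == 0)]   # both conditions must hold
ends_only = arr[(arr == 41) | (arr == 44)]         # either condition may hold
not_above_42 = arr[~(arr > 42)]                    # invert a condition

print(even_above_41.tolist())   # [42, 44]
print(ends_only.tolist())       # [41, 44]
print(not_above_42.tolist())    # [41, 42]
```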
