# NumPy

What is NumPy?


NumPy is a Python library for scientific computing.

It provides a high-performance multidimensional array object and tools for working with these arrays.

It also has functions for linear algebra, Fourier transforms, and matrices.

NumPy stands for Numerical Python.



Why NumPy?


In Python we have lists that serve the purpose of arrays, but they are slow to process.

NumPy aims to provide an array object that is up to 50x faster than traditional Python lists; the actual speedup depends on the operation and the size of the data.

The array object in NumPy is called ndarray. It provides many supporting functions that make working with ndarray very easy.

Arrays are very frequently used in data science, where speed and resources are very important.


In [1]:
pip install numpy

Note: you may need to restart the kernel to use updated packages.


In [2]:
import numpy

Create a NumPy array:

In [2]:
import numpy

arr = numpy.array([1, 2, 3, 4, 5])

print(arr)

[1 2 3 4 5]


In [3]:
type(arr)

numpy.ndarray

Checking NumPy Version

The version string is stored in the __version__ attribute.

In [7]:
import numpy as np

print(np.__version__)

1.23.0


NumPy Creating Arrays

<b>Create a NumPy ndarray Object</b>

NumPy is used to work with arrays. The array object in NumPy is called ndarray.

We can create a NumPy ndarray object by using the array() function.

In [9]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

print(arr)

print(type(arr))

[1 2 3 4 5]
<class 'numpy.ndarray'>




type(): This built-in Python function tells us the type of the object passed to it. In the code above, it shows that arr is of type numpy.ndarray.

To create an ndarray, we can pass a list, tuple, or any array-like object into the array() function, and it will be converted into an ndarray:

<b>Use a tuple to create a NumPy array</b>

In [10]:
import numpy as np

arr = np.array((1, 2, 3, 4, 5))

print(arr) 

[1 2 3 4 5]


<b>Dimensions in Arrays</b>

A dimension in arrays is one level of array depth (nested arrays).

In [4]:
# 0-D Arrays

#0-D arrays, or Scalars, are the elements in an array. Each value in an array is a 0-D array.


#Create a 0-D array with value 42
import numpy as np

arr = np.array(42)

print(arr) 

42


1-D Arrays

An array that has 0-D arrays as its elements is called a one-dimensional or 1-D array.

These are the most common and basic arrays.

In [5]:
# Create a 1-D array containing the values 1,2,3,4,5:
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

print(arr) 

[1 2 3 4 5]


2-D Arrays

An array that has 1-D arrays as its elements is called a 2-D array.

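These are often used to represent matrices or 2nd-order tensors.

NumPy has a whole submodule dedicated to linear algebra operations on such arrays, called numpy.linalg. (numpy.mat is not a submodule: in NumPy 1.x it is a function that returns the legacy numpy.matrix type.)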


In [6]:
# Create a 2-D array containing two arrays with the values 1,2,3 and 4,5,6:
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr) 

[[1 2 3]
 [4 5 6]]


3-D arrays

An array that has 2-D arrays (matrices) as its elements is called a 3-D array.

These are often used to represent a 3rd order tensor.




In [7]:
#Create a 3-D array with two 2-D arrays, both containing two arrays with the values 1,2,3 and 4,5,6:
import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

print(arr) 

[[[1 2 3]
  [4 5 6]]

 [[1 2 3]
  [4 5 6]]]


<b>Check Number of Dimensions</b>

NumPy arrays provide the ndim attribute, which returns an integer telling us how many dimensions the array has.

Example

Check how many dimensions the arrays have:

In [8]:
import numpy as np

a = np.array(42)
b = np.array([1, 2, 3, 4, 5])
c = np.array([[1, 2, 3], [4, 5, 6]])
d = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

print(a.ndim)
print(b.ndim)
print(c.ndim)
print(d.ndim) 

0
1
2
3


<b>Higher Dimensional Arrays</b>

An array can have any number of dimensions.

When the array is created, you can set the minimum number of dimensions by using the <b>ndmin</b> argument.

Example

<b>Create an array with 5 dimensions and verify that it has 5 dimensions:</b>

In [5]:
import numpy as np

arr = np.array([1, 2, 3, 4], ndmin=5)

print(arr)
print('number of dimensions :', arr.ndim) 

[[[[[1 2 3 4]]]]]
number of dimensions : 5


In this array the innermost dimension (5th dim) has 4 elements, the 4th dim has 1 element that is the vector, the 3rd dim has 1 element that is the matrix containing the vector, the 2nd dim has 1 element that is a 3-D array, and the 1st dim has 1 element that is a 4-D array.
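
The nesting described above can also be read off the array's shape. The following is a small sketch, assuming the same input as the example; it shows that ndmin adds leading axes of length 1 and that indexing through them recovers the original 1-D data.

```python
import numpy as np

arr = np.array([1, 2, 3, 4], ndmin=5)

# ndmin prepends axes of length 1 until the array has 5 dimensions.
print(arr.shape)

# Indexing with 0 on each of the four leading axes recovers the 1-D data.
print(arr[0, 0, 0, 0])
```

Only the last axis holds more than one element, which is why the printed array is wrapped in five pairs of brackets.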

<b>NumPy Array Indexing</b>
Access Array Elements

Array indexing is the same as accessing an array element.

You can access an array element by referring to its index number.

The indexes in NumPy arrays start with 0, meaning that the first element has index 0, and the second has index 1 etc.

In [10]:
# Get the first element from the following array:
import numpy as np

arr = np.array([1, 2, 3, 4])

print(arr[0]) 

1


In [11]:
#Get the second element from the following array.
import numpy as np

arr = np.array([1, 2, 3, 4])

print(arr[1]) 

2


In [12]:
# Get third and fourth elements from the following array and add them.
import numpy as np

arr = np.array([1, 2, 3, 4])

print(arr[2] + arr[3]) 

7


<b>Access 2-D Arrays</b>

To access elements from 2-D arrays, we can use comma-separated integers, one index per dimension.

Think of a 2-D array as a table with rows and columns, where the first index selects the row and the second index selects the column.

In [13]:
# Access the element on the first row, second column:
import numpy as np

arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])

print('2nd element on 1st row: ', arr[0, 1]) 

2nd element on 1st row:  2


In [14]:
# Access the element on the 2nd row, 5th column:
import numpy as np

arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])

print('5th element on 2nd row: ', arr[1, 4]) 

5th element on 2nd row:  10
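
The same comma-separated form carries over to arrays with more dimensions. The sketch below uses a made-up 3-D array (not one from the examples above) to show one index per dimension, and what happens when fewer indexes are supplied.

```python
import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

# First index picks the 2-D block, second the row, third the column.
print(arr[0, 1, 2])

# Supplying fewer indexes than dimensions returns a sub-array.
print(arr[1, 0])
```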


<b>Negative Indexing</b>

Use negative indexing to access an array from the end.


In [16]:
# Print the last element of the 2nd row:
import numpy as np

arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])

print('Last element from 2nd row: ', arr[1, -1])

Last element from 2nd row:  10


<b>NumPy Array Slicing</b>

Slicing arrays

Slicing in Python means taking elements from one given index to another given index.

We pass a slice instead of an index, like this: [start:end].

We can also define the step, like this: [start:end:step].

The result includes the start index but excludes the end index.

If we don't pass start, it is considered 0.

If we don't pass end, it is considered the length of the array in that dimension.

If we don't pass step, it is considered 1.
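
The defaults listed above can be checked directly. In this sketch each line prints a short slice next to the equivalent slice with the default written out; the array is the same illustrative one used in the examples that follow.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

# Each pair spells out the default that the shorter form relies on.
print(arr[:3], arr[0:3])           # omitted start -> 0
print(arr[4:], arr[4:len(arr)])    # omitted end   -> length of the axis
print(arr[1:6], arr[1:6:1])        # omitted step  -> 1
```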

In [17]:
# Slice elements from index 1 up to, but not including, index 5 from the following array:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[1:5]) 

[2 3 4 5]


In [18]:
# Slice elements from index 4 to the end of the array:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[4:]) 

[5 6 7]


In [19]:
# slice elements from the beginning to index 4 (not included):
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[:4]) 

[1 2 3 4]


In [1]:
#Negative Slicing

#Use the minus operator to refer to an index from the end:


#Slice from the index 3 from the end up to, but not including, index 1 from the end:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[-3:-1]) 

[5 6]


In [21]:
#STEP

#Use the step value to determine the step of the slicing:


# Return every other element from index 1 to index 5:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[1:5:2]) 

[2 4]


In [22]:
# Return every other element from the entire array:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[::2]) 

[1 3 5 7]


In [23]:
#Slicing 2-D Arrays


#From the second row, slice elements from index 1 to index 4 (not included):
import numpy as np

arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

print(arr[1, 1:4]) 

[7 8 9]
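
Slices can be used on either axis, or on both at once. The following sketch reuses the array from the example above to show a single column taken from both rows, and a 2-D sub-array cut out with two slices.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

# A single column index combined with a row slice returns a 1-D array.
print(arr[0:2, 2])

# Slices on both axes return a 2-D sub-array.
print(arr[0:2, 1:4])
```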


<b>NumPy Data Types</b>
    
Data Types in Python

By default Python has these data types:

    strings - used to represent text data; the text is given in quotation marks, e.g. "ABCD"
    integer - used to represent integer numbers, e.g. -1, -2, -3
    float - used to represent real numbers, e.g. 1.2, 42.42
    boolean - used to represent True or False
    complex - used to represent complex numbers, e.g. 1.0 + 2.0j, 1.5 + 2.5j

Data Types in NumPy

NumPy has some extra data types, and refers to data types with one character, like i for integers, u for unsigned integers, etc.

Below is a list of all data types in NumPy and the characters used to represent them.

    i - integer
    b - boolean
    u - unsigned integer
    f - float
    c - complex float
    m - timedelta
    M - datetime
    O - object
    S - string
    U - unicode string
    V - fixed chunk of memory for other type ( void )
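
One way to see these character codes in practice is the kind attribute of a dtype. The sketch below builds a few small arrays (the sample values and the dictionary labels are made up for illustration) and prints the one-character code NumPy reports for each.

```python
import numpy as np

samples = {
    "integers": np.array([1, 2, 3]),
    "floats": np.array([1.5, 2.5]),
    "booleans": np.array([True, False]),
    "complex": np.array([1 + 2j]),
    "text": np.array(["apple"]),
    "bytes": np.array([b"apple"]),
}

# dtype.kind is the one-character code for the general category of the type.
for label, arr in samples.items():
    print(label, arr.dtype.kind)
```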



In [2]:
#Checking the Data Type of an Array

#The NumPy array object has a property called dtype that returns the data type of the array:


#Get the data type of an array object:
import numpy as np

arr = np.array([1, 2, 3, 4])

print(arr.dtype) 

int64


In [29]:
#Get the data type of an array containing strings:
import numpy as np

arr = np.array(['apple', 'banana', 'cherry'])

print(arr.dtype) 

<U6


<b>Creating Arrays With a Defined Data Type</b>

We use the array() function to create arrays. This function takes an optional argument, dtype, that lets us define the expected data type of the array elements:

In [1]:
# Create an array with data type string:
import numpy as np

arr = np.array([1, 2, 3, 4], dtype='S')

print(arr)
print(arr.dtype) 

[b'1' b'2' b'3' b'4']
|S1
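
For the i, u, f, S and U codes, a size can be appended to the character. The following is a minimal sketch, assuming a 4-byte signed integer is wanted; itemsize reports how many bytes each element occupies.

```python
import numpy as np

# 'i4' asks for a signed integer stored in 4 bytes (32 bits).
arr = np.array([1, 2, 3, 4], dtype='i4')

print(arr)
print(arr.dtype)
print(arr.itemsize)
```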


What if a Value Can Not Be Converted?

If a type is given to which the elements cannot be cast, NumPy raises a ValueError.

ValueError: In Python, ValueError is raised when an argument passed to a function has the right type but an inappropriate value.

Example

A non-integer string like 'a' cannot be converted to an integer (this raises an error):

In [2]:
import numpy as np

arr = np.array(['a', '2', '3'], dtype='i')

ValueError: invalid literal for int() with base 10: 'a'
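
In a script, the failed conversion can be caught instead of stopping the program. This is only a sketch of one way to handle it; the variable name error_message is introduced here for illustration.

```python
import numpy as np

error_message = None

try:
    arr = np.array(['a', '2', '3'], dtype='i')
except ValueError as err:
    # The conversion of 'a' fails, so no array is created.
    error_message = str(err)
    print('Conversion failed:', error_message)
```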

<b>Converting Data Type on Existing Arrays</b>

The best way to change the data type of an existing array is to make a copy of the array with the astype() method.

The astype() method creates a copy of the array and lets you specify the data type as a parameter.

The data type can be specified using a string, like 'f' for float or 'i' for integer, or you can use the data type directly, like float for float and int for integer.

In [17]:
# Change data type from float to integer by using 'i' as parameter value:
import numpy as np

arr = np.array([1.1, 2.1, 3.1])

newarr = arr.astype('i')

print(newarr)
print(newarr.dtype)

[1 2 3]
int32


In [19]:
# Change data type from float to integer by using int as parameter value:
import numpy as np

arr = np.array([1.1, 2.1, 3.1])

newarr = arr.astype(int)

print(newarr)
print(newarr.dtype) 

[1 2 3]
int64


In [34]:
# Change data type from integer to boolean:
import numpy as np

arr = np.array([1, 0, 3])

newarr = arr.astype(bool)

print(newarr)
print(newarr.dtype) 

[ True False  True]
bool


<b>The Difference Between Copy and View</b>

The main difference between a copy and a view of an array is that the copy is a new array, and the view is just a view of the original array.

The copy owns the data and any changes made to the copy will not affect original array, and any changes made to the original array will not affect the copy.

The view does not own the data and any changes made to the view will affect the original array, and any changes made to the original array will affect the view.
COPY:
Example

Make a copy, change the original array, and display both arrays:

In [35]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
x = arr.copy()
arr[0] = 42

print(arr)
print(x) 

[42  2  3  4  5]
[1 2 3 4 5]


The copy SHOULD NOT be affected by the changes made to the original array.

In [36]:
# VIEW:


# Make a view, change the original array, and display both arrays:
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
x = arr.view()
arr[0] = 42

print(arr)
print(x) 

[42  2  3  4  5]
[42  2  3  4  5]


The view SHOULD be affected by the changes made to the original array.

In [37]:
#Make Changes in the VIEW:

# Make a view, change the view, and display both arrays:
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
x = arr.view()
x[0] = 31

print(arr)
print(x) 

[31  2  3  4  5]
[31  2  3  4  5]


<b>NumPy Array Shape</b>

Shape of an Array

The shape of an array is the number of elements in each dimension.

Get the Shape of an Array

NumPy arrays have an attribute called shape that returns a tuple in which each entry is the number of elements along the corresponding dimension.

In [41]:
# Print the shape of a 2-D array:
import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

print(arr.shape) 

(2, 4)


The example above returns (2, 4), which means that the array has 2 dimensions, where the first dimension has 2 elements and the second has 4.


<b>NumPy Array Reshaping</b>

Reshaping arrays

Reshaping means changing the shape of an array.

The shape of an array is the number of elements in each dimension.

By reshaping we can add or remove dimensions or change the number of elements in each dimension.

Reshape From 1-D to 2-D

In [44]:
#Convert the following 1-D array with 12 elements into a 2-D array.

#The outermost dimension will have 4 arrays, each with 3 elements:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

newarr = arr.reshape(4, 3)

print(newarr) 

[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]


In [45]:
# Reshape From 1-D to 3-D


#Convert the following 1-D array with 12 elements into a 3-D array.

#The outermost dimension will have 2 arrays, each containing 3 arrays, each with 2 elements:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

newarr = arr.reshape(2, 3, 2)

print(newarr) 

[[[ 1  2]
  [ 3  4]
  [ 5  6]]

 [[ 7  8]
  [ 9 10]
  [11 12]]]
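
Reshaping only works when the new shape holds exactly the same number of elements. The sketch below shows two related cases: passing -1 for one dimension so NumPy works out its size, and a shape that cannot fit 12 elements. The flag reshape_failed is a name introduced here for illustration.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

# -1 lets NumPy work out the missing dimension: 12 / (2 * 2) = 3.
newarr = arr.reshape(2, 2, -1)
print(newarr.shape)

# 12 elements cannot fill a 5 x 3 array, so this raises ValueError.
try:
    arr.reshape(5, 3)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print('Cannot reshape:', err)
```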


<b>Flattening the arrays</b>

Flattening an array means converting a multidimensional array into a 1-D array.

We can use reshape(-1) to do this.

In [50]:
# Convert the array into a 1D array:
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

newarr = arr.reshape(-1)

print(newarr) 

[1 2 3 4 5 6]


<b>NumPy Array Iterating</b>

Iterating Arrays

Iterating means going through elements one by one.

When we deal with multi-dimensional arrays in NumPy, we can do this using Python's basic for loop.

If we iterate on a 1-D array it will go through each element one by one.


In [51]:
# Iterate on the elements of the following 1-D array:
import numpy as np

arr = np.array([1, 2, 3])

for x in arr:
  print(x) 

1
2
3
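
On a 2-D array the same loop behaves differently: each step yields a whole row rather than a scalar. The sketch below, using a made-up 2 x 3 array, shows this and the nested loop needed to reach the individual values.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# Looping over a 2-D array yields one row (a 1-D array) per step.
for row in arr:
    print(row)

# A nested loop is needed to reach the individual scalars.
scalars = []
for row in arr:
    for x in row:
        scalars.append(int(x))
print(scalars)
```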


Iterating Arrays Using nditer()

The function nditer() is a helper function that can be used for anything from very basic to very advanced iteration. It solves some basic issues that we face in iteration; let's go through it with examples.

Iterating on Each Scalar Element

With basic for loops, iterating through each scalar of an n-dimensional array requires n nested for loops, which can be difficult to write for arrays with very high dimensionality.


In [56]:
#Iterate through the following 3-D array:
import numpy as np
 
arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

for x in np.nditer(arr):
  print(x) 

1
2
3
4
5
6
7
8
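
When the position of each value is needed as well, np.ndenumerate() is a related helper. This is a short sketch on a made-up 2 x 2 array; it yields an index tuple together with each value.

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])

# ndenumerate yields (index tuple, value) pairs in row-major order.
for idx, x in np.ndenumerate(arr):
    print(idx, x)
```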


<b>NumPy Joining Array</b>
    
Joining NumPy Arrays

Joining means putting contents of two or more arrays in a single array.

In SQL we join tables based on a key, whereas in NumPy we join arrays by axes.

We pass a sequence of arrays that we want to join to the concatenate() function, along with the axis. If axis is not explicitly passed, it is taken as 0.

In [62]:
# Join two arrays
import numpy as np

arr1 = np.array([1, 2, 3])

arr2 = np.array([4, 5, 6])

arr = np.concatenate((arr1, arr2))

print(arr) 

[1 2 3 4 5 6]


In [7]:
#Join two 2-D arrays along axis 1, so each row of arr2 is appended to the matching row of arr1:
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])

arr2 = np.array([[5, 6], [7, 8]])

arr = np.concatenate((arr1, arr2), axis=1)

print(arr) 

[[1 2 5 6]
 [3 4 7 8]]
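
For comparison, here is a sketch of the same two arrays joined along axis 0, which is what concatenate() does when no axis is passed. The name stacked_rows is introduced here for illustration.

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

# axis=0 (the default) places arr2 below arr1, giving 4 rows of 2 values.
stacked_rows = np.concatenate((arr1, arr2), axis=0)

print(stacked_rows)
print(stacked_rows.shape)
```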


<b>NumPy Splitting Array</b>

Splitting NumPy Arrays

Splitting is the reverse operation of joining.

Joining merges multiple arrays into one, and splitting breaks one array into multiple.

We use array_split() for splitting arrays. We pass it the array we want to split and the number of splits.

In [8]:
# Split the array in 3 parts:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

newarr = np.array_split(arr, 3)

print(newarr) 

[array([1, 2]), array([3, 4]), array([5, 6])]


In [69]:
# Split the array in 4 parts:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

newarr = np.array_split(arr, 4)

print(newarr) 

[array([1, 2]), array([3, 4]), array([5]), array([6])]


Note: The function split() is also available, but it does not adjust the parts when the source array cannot be divided evenly. In the example above, array_split() works, but split() would raise an error.
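
The difference described in the note can be seen by calling split() on the same array. The sketch below catches the error so that it can be printed; the flag split_failed is a name introduced here for illustration.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

# 6 elements divide evenly into 3 parts, so split() succeeds.
print(np.split(arr, 3))

# 6 elements do not divide evenly into 4 parts, so split() raises ValueError.
try:
    np.split(arr, 4)
    split_failed = False
except ValueError as err:
    split_failed = True
    print('split() failed:', err)
```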

In [70]:
#Split Into Arrays

#The return value of the array_split() function is a list containing each of the splits as an array.

#If you split an array into 3 arrays, you can access them from the result just like any list element:


#Access the split arrays:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

newarr = np.array_split(arr, 3)

print(newarr[0])
print(newarr[1])
print(newarr[2]) 

[1 2]
[3 4]
[5 6]


In [75]:
#NumPy Searching Arrays
#Searching Arrays

#You can search an array for a certain value, and return the indexes that get a match.

#To search an array, use the where() function.

# Find the indexes where the value is 4:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 4, 4])

x = np.where(arr == 4)

print(x) 

(array([3, 5, 6]),)


The example above will return a tuple: (array([3, 5, 6]),)

Which means that the value 4 is present at index 3, 5, and 6.
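
The condition passed to where() can be any boolean expression on the array. The following sketch, on a made-up array, finds the indexes of the even values, unpacks the one-element tuple, and uses the result to pull out the matching values. The name even_idx is introduced here for illustration.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

# Any boolean condition works; here, indexes of the even values.
even_idx = np.where(arr % 2 == 0)
print(even_idx)

# For a 1-D array the tuple has one entry; [0] extracts the index array itself.
print(even_idx[0])

# The index array can be used to pull out the matching values.
print(arr[even_idx])
```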

<b>NumPy Sorting Arrays</b>

Sorting means putting elements in an ordered sequence.

An ordered sequence is any sequence that has an order corresponding to its elements, like numeric or alphabetical, ascending or descending.

NumPy has a function called sort() that returns a sorted copy of the specified array, leaving the original unchanged.


In [83]:
# Sort the array:
import numpy as np

arr = np.array([3, 2, 0, 1])

print(np.sort(arr)) 

[0 1 2 3]
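
NumPy offers sorting both as a function and as an array method, and the two behave differently. The sketch below contrasts them on the same data; the name sorted_copy is introduced here for illustration.

```python
import numpy as np

arr = np.array([3, 2, 0, 1])

# np.sort() returns a sorted copy and leaves the original untouched.
sorted_copy = np.sort(arr)
print(sorted_copy)
print(arr)

# The ndarray.sort() method sorts the array in place and returns None.
arr.sort()
print(arr)
```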


In [84]:
#You can also sort arrays of strings, or any other data type:


#Sort the array alphabetically:
import numpy as np

arr = np.array(['banana', 'cherry', 'apple'])

print(np.sort(arr)) 

['apple' 'banana' 'cherry']


In [85]:
#  Sort a boolean array:
import numpy as np

arr = np.array([True, False, True])

print(np.sort(arr)) 

[False  True  True]


In [86]:
# Sorting a 2-D Array

#If you use sort() on a 2-D array, each row will be sorted independently:


# Sort a 2-D array:
import numpy as np

arr = np.array([[3, 2, 4], [5, 0, 1]])

print(np.sort(arr)) 

[[2 3 4]
 [0 1 5]]
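
Rows are sorted because np.sort() works along the last axis by default. The sketch below reuses the array from the example above to show the axis argument: sorting within rows, down columns, and over the flattened array.

```python
import numpy as np

arr = np.array([[3, 2, 4], [5, 0, 1]])

# Default axis=-1 sorts within each row.
print(np.sort(arr))

# axis=0 sorts down each column instead.
print(np.sort(arr, axis=0))

# axis=None flattens the array before sorting.
print(np.sort(arr, axis=None))
```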
