What is NumPy?

NumPy is a Python library used for working with arrays.

It also has functions for working in the domains of linear algebra, Fourier transforms, and matrices.

NumPy was created in 2005 by Travis Oliphant. It is an open source project and you can use it freely.

NumPy stands for Numerical Python.

Why Use NumPy?

In Python we have lists that serve the purpose of arrays, but they are slow to process.

NumPy aims to provide an array object that is up to 50x faster than traditional Python lists.

The array object in NumPy is called ndarray. It provides many supporting functions that make working with ndarray very easy.

Arrays are very frequently used in data science, where speed and resources are very important.

Data science is a branch of computer science where we study how to store, use, and analyze data to derive information from it.

Why is NumPy Faster Than Lists?

NumPy arrays are stored in one contiguous block of memory, unlike lists, so processes can access and manipulate them very efficiently.

This behavior is called locality of reference in computer science.

This is the main reason why NumPy is faster than lists. It is also optimized to work with the latest CPU architectures.
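
As a small illustration of the memory layout described above, the sketch below inspects how an array is stored. The variable names are only illustrative, and the dtype is fixed to int64 here (an assumption made so the byte counts are the same on every platform).

```python
import numpy as np

# Fix the dtype so each element takes 8 bytes on every platform
arr = np.array([1, 2, 3, 4, 5], dtype=np.int64)

# True when the elements sit in one unbroken block of memory
is_contiguous = arr.flags['C_CONTIGUOUS']

# Bytes per element, and bytes to step from one element to the next
item_bytes = arr.itemsize
step_bytes = arr.strides[0]

print(is_contiguous)
print(item_bytes, step_bytes)
```

When the step between elements equals the element size, there are no gaps between neighbouring values, which is what makes sequential access fast.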

Which Language is NumPy written in?

NumPy is a Python library and is written partially in Python, but most of the parts that require fast computation are written in C or C++.

Where is the NumPy Codebase?

The source code for NumPy is located in this GitHub repository: https://github.com/numpy/numpy

Installation of NumPy

If you have Python and pip already installed on a system, then installation of NumPy is very easy.

Install it using this command:

pip install numpy

Import NumPy

Once NumPy is installed, import it in your applications by adding the import keyword:

In [655]:

import numpy

Now NumPy is imported and ready to use.

In [656]:
import numpy

arr = numpy.array([1, 2, 3, 4, 5])

print(arr)

[1 2 3 4 5]


NumPy as np
NumPy is usually imported under the np alias.

alias: In Python, an alias is an alternate name for referring to the same thing.

Create an alias with the as keyword while importing:

import numpy as np

In [657]:
# Now the NumPy package can be referred to as np instead of numpy.

import numpy as np

arr = np.array([1, 2, 3, 4, 5])

print(arr)

[1 2 3 4 5]


Checking NumPy Version

The version string is stored under the __version__ attribute.

In [658]:
# To check NumPy version

import numpy as np

print(np.__version__)

1.26.4


NumPy Creating Arrays

Create a NumPy ndarray Object

NumPy is used to work with arrays. The array object in NumPy is called ndarray.

We can create a NumPy ndarray object by using the array() function.

In [659]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

print(arr)

print(type(arr))

[1 2 3 4 5]
<class 'numpy.ndarray'>


type(): This built-in Python function tells us the type of the object passed to it. In the code above, it shows that arr is of type numpy.ndarray.

To create an ndarray, we can pass a list, tuple, or any array-like object into the array() function, and it will be converted into an ndarray:

In [660]:
# Use a tuple to create a NumPy array:

import numpy as np

arr = np.array((1, 2, 3, 4, 5))

print(arr)

[1 2 3 4 5]


Dimensions in Arrays

A dimension in arrays is one level of array depth (nested arrays).

nested arrays: arrays that have arrays as their elements.

0-D Arrays

0-D arrays, or Scalars, are the elements in an array. Each value in an array is a 0-D array.

In [661]:
# Create a 0-D array with value 42

import numpy as np

arr = np.array(42)

print(arr)

42


1-D Arrays

An array that has 0-D arrays as its elements is called a uni-dimensional or 1-D array.

These are the most common and basic arrays.

In [662]:
# Create a 1-D array containing the values 1,2,3,4,5:

import numpy as np

arr = np.array([1, 2, 3, 4, 5])

print(arr)

[1 2 3 4 5]


2-D Arrays

An array that has 1-D arrays as its elements is called a 2-D array.

These are often used to represent matrices or 2nd order tensors.

NumPy has a whole submodule dedicated to linear algebra operations on such arrays, called numpy.linalg.

In [663]:
# Create a 2-D array containing two arrays with the values 1,2,3 and 4,5,6:

import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr)

[[1 2 3]
 [4 5 6]]
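
To illustrate how a 2-D array can stand in for a matrix, here is a minimal sketch using the transpose attribute and the @ operator. The variable names are illustrative only.

```python
import numpy as np

# A 2 x 3 matrix stored as a 2-D array
m = np.array([[1, 2, 3], [4, 5, 6]])

# Transpose: rows become columns, giving a 3 x 2 matrix
t = m.T

# Matrix product of (2 x 3) and (3 x 2) gives a 2 x 2 matrix
product = m @ t

print(t.shape)
print(product)
```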


3-D Arrays

An array that has 2-D arrays (matrices) as its elements is called a 3-D array.

These are often used to represent a 3rd order tensor.

In [664]:
# Create a 3-D array with two 2-D arrays, both containing two arrays with the values 1,2,3 and 4,5,6:

import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

print(arr)

[[[1 2 3]
  [4 5 6]]

 [[1 2 3]
  [4 5 6]]]


From the output above,

We have 2 tables (first dimension).

Each table has 2 rows (second dimension).

Each row has 3 columns (third dimension).

How many rows?

Each "block" (or "table") has 2 rows:

[1, 2, 3] is one row,

[4, 5, 6] is another row.

So each table has 2 rows.

And since there are 2 blocks, each block still has 2 rows — rows are counted inside each block, not across blocks.

Answer: 2 rows per block

How many columns?

Look at any one row, like [1, 2, 3].
It has 3 numbers across — so 3 columns.

Answer: 3 columns per row

Summary:

Rows: 2 per block,

Columns: 3 per row,

Blocks: 2 total.

Shape: (2 blocks, 2 rows, 3 columns) ➔ (2, 2, 3)

So:

The first number (2) tells you how many tables (or "slices") you have.

The second number (2) tells you how many rows are on each table.

The third number (3) tells you how many columns are in each row.

That's why it’s three dimensions:

Table ➔ Row ➔ Column ➔ (3 directions to move).

Shape: (2, 2, 3)

NOTE =>

In a table:

Row = a horizontal line (left to right),

Column = a vertical line (up and down).
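
The counting of tables, rows, and columns can be checked in code. This is a small sketch using the same 3-D array, with illustrative variable names: len() counts the elements at one level, so applying it level by level reproduces the shape.

```python
import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

tables = len(arr)          # number of tables (first dimension)
rows = len(arr[0])         # rows in one table (second dimension)
columns = len(arr[0][0])   # columns in one row (third dimension)

print(tables, rows, columns)
print(arr.shape)
```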

Check Number of Dimensions

NumPy arrays provide the ndim attribute, which returns an integer that tells us how many dimensions the array has.

In [665]:
# Check how many dimensions the arrays have:

import numpy as np

a = np.array(42)
b = np.array([1, 2, 3, 4, 5])
c = np.array([[1, 2, 3], [4, 5, 6]])
d = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

print(a.ndim)
print(b.ndim)
print(c.ndim)
print(d.ndim)

0
1
2
3


Higher Dimensional Arrays

An array can have any number of dimensions.

When the array is created, you can define the minimum number of dimensions by using the ndmin argument.

In [666]:
# Create an array with 5 dimensions and verify that it has 5 dimensions:

import numpy as np

arr = np.array([1, 2, 3, 4], ndmin=5)

print(arr)
print('number of dimensions :', arr.ndim)

[[[[[1 2 3 4]]]]]
number of dimensions : 5


NOTE => Reading the output above from the inside out: the innermost dimension (5th dim) has 4 elements, the 4th dim has 1 element that is the vector, the 3rd dim has 1 element that is the matrix containing the vector, the 2nd dim has 1 element that is a 3-D array, and the 1st dim has 1 element that is a 4-D array.

So basically, the numbers 1 2 3 4 are wrapped in 5 pairs of square brackets, which makes the array 5 levels deep = 5 dimensions.

Each pair of square brackets [] represents one level (one dimension).

Very simple way to imagine it:

Imagine putting a pencil (the numbers) inside a box (1 layer),

then putting that box inside another bigger box (2 layers),

and again inside another bigger box (3 layers),

and again... until 5 boxes are wrapped around it.

The more boxes (layers), the more dimensions.
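
A short sketch of how ndmin behaves, using illustrative variable names. It only adds outer "boxes" when the input has fewer dimensions than requested, and it never removes dimensions that are already there.

```python
import numpy as np

# ndmin adds length-1 dimensions in front until the minimum is reached
padded = np.array([1, 2, 3, 4], ndmin=3)

# The input is already 2-D, so ndmin=1 changes nothing
unchanged = np.array([[1, 2], [3, 4]], ndmin=1)

print(padded.shape)
print(unchanged.ndim)
```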

NumPy Array Indexing

Access Array Elements

Array indexing means accessing an array element.

You can access an array element by referring to its index number.

The indexes in NumPy arrays start with 0, meaning that the first element has index 0, and the second has index 1 etc.

In [667]:
# Get the first element from the following array:

import numpy as np

arr = np.array([1, 2, 3, 4])

print(arr[0])

1


In [668]:
# Get the second element from the following array.

import numpy as np

arr = np.array([1, 2, 3, 4])

print(arr[1])

2


In [669]:
# Get third and fourth elements from the following array and add them.

import numpy as np

arr = np.array([1, 2, 3, 4])

print(arr[2] + arr[3])

7


Access 2-D Arrays

To access elements from 2-D arrays we can use comma-separated integers, one index for each dimension.

Think of a 2-D array as a table with rows and columns, where the first index selects the row and the second index selects the column.

In [670]:
# Access the element on the first row, second column:

import numpy as np

arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])

print('2nd element on 1st row: ', arr[0, 1])

2nd element on 1st row:  2


In [671]:
# Access the element on the 2nd row, 5th column:

import numpy as np

arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])

print('5th element on 2nd row: ', arr[1, 4])

5th element on 2nd row:  10
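
The following sketch, with illustrative variable names, shows that a single index on a 2-D array selects a whole row, and that the comma form gives the same value as indexing in two steps.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

second_row = arr[1]    # a single index selects a whole row
one_step = arr[1, 4]   # row 1, column 4 in one step
two_steps = arr[1][4]  # first pick row 1, then element 4 of that row

print(second_row)
print(one_step, two_steps)
```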


Access 3-D Arrays

To access elements from 3-D arrays we can use comma-separated integers, one index for each dimension.

In [672]:
# Access the third element of the second array of the first array:

import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

print(arr[0, 1, 2])

6
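
To see how arr[0, 1, 2] arrives at 6, the sketch below applies the indexes one at a time. The variable names are illustrative.

```python
import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

block = arr[0]        # first 2-D array: [[1 2 3] [4 5 6]]
row = arr[0, 1]       # second row of that block: [4 5 6]
value = arr[0, 1, 2]  # third element of that row

print(block)
print(row)
print(value)
```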


Negative Indexing

Use negative indexing to access an array from the end.

In [673]:
# Print the last element of the 2nd row:

import numpy as np

arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])

print('Last element of 2nd row: ', arr[1, -1])

Last element of 2nd row:  10
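
A negative index counts backwards from the end, so -1 refers to the same position as the length minus 1. A minimal sketch on a 1-D array, with illustrative names:

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

last = arr[-1]                 # last element
same_last = arr[len(arr) - 1]  # the same position, written as a positive index
second_last = arr[-2]          # one step further from the end

print(last, same_last, second_last)
```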


NumPy Array Slicing

Slicing arrays

Slicing in Python means taking elements from one given index to another given index.

We pass a slice instead of an index, like this: [start:end].

We can also define the step, like this: [start:end:step].

If we don't pass start, it is considered 0.

If we don't pass end, it is considered the length of the array in that dimension.

If we don't pass step, it is considered 1.
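
The three defaults can be checked by writing the same slice with and without them. This is a small sketch with illustrative variable names.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

short_form = arr[:]             # start, end and step all left out
long_form = arr[0:len(arr):1]   # the same slice with the defaults written out

print(short_form)
print(long_form)
```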

In [674]:
# Slice elements from index 1 to index 5 (not included) from the following array:

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[1:5])

[2 3 4 5]


Note: The result includes the start index, but excludes the end index.

In [675]:
# Slice elements from index 4 to the end of the array:

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[4:])

[5 6 7]


In [676]:
# Slice elements from the beginning to index 4 (not included):

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[:4])

[1 2 3 4]


Negative Slicing

Use the minus operator to refer to an index from the end:

In [677]:
# Slice from index 3 from the end to index 1 from the end (not included):

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[-3:-1])

[5 6]
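
For an array of length 7, index -3 is the same position as index 4, and index -1 is the same position as index 6. The sketch below (illustrative names) confirms that both ways of writing the slice give the same result.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

negative_slice = arr[-3:-1]  # counted from the end
positive_slice = arr[4:6]    # the same positions counted from the start

print(negative_slice)
print(positive_slice)
```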


Step

Use the step value to determine the step of the slicing:

In [678]:
# Return every other element from index 1 to index 5:

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[1:5:2])

[2 4]


In [679]:
# Return every other element from the entire array:

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[::2]) # NOTE: Counting starts at the beginning of the array and takes every second element.

[1 3 5 7]
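
The step can also be negative, in which case the slice walks backwards through the array. A minimal sketch with illustrative variable names:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

reversed_arr = arr[::-1]  # whole array, from last element to first
backwards = arr[5:1:-2]   # from index 5 down to index 1 (not included), every second element

print(reversed_arr)
print(backwards)
```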


Slicing 2-D Arrays

In [680]:
# From the second row, slice elements from index 1 to index 4 (not included):

import numpy as np

arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

print(arr[1, 1:4])

[7 8 9]


Note: In the example above, remember that the second row has index 1.

In [681]:
# From both rows, return the element at index 2:

import numpy as np

arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

print(arr[0:2, 2])

[3 8]


A detailed explanation of the code above

The array looks like this:

[[ 1  2  3  4  5]     ← row index 0
 [ 6  7  8  9 10]]    ← row index 1
   ↑  ↑  ↑  ↑  ↑
   0  1  2  3  4      ← column index

It’s a 2x5 matrix (2 rows, 5 columns).

This is a slice and index operation combined:

arr[0:2, 2] reads as:

* Get rows 0 up to (but not including) 2
* And from those rows, get the element at column index 2

By analyzing it:

* 0:2 → selects both rows: row 0 and row 1

* 2 → selects the 3rd column (indexing starts at 0)

So this line is retrieving the element at column 2 for both rows.

That means:

From row 0 → column 2: 3

From row 1 → column 2: 8

In [682]:
# From both rows, slice index 1 to index 4 (not included). This will return a 2-D array:

import numpy as np

arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

print(arr[0:2, 1:4])

[[2 3 4]
 [7 8 9]]


Here is a detailed explanation of the example above:

🔸 Syntax: arr[row_start:row_end, column_start:column_end]

* 0:2 → selects rows from index 0 up to but not including 2, so this includes row 0 and row 1.

* 1:4 → selects columns from index 1 up to but not including 4, so this includes column 1, column 2, and column 3.

That means we are selecting the intersection of:

Rows: 0 and 1

Columns: 1, 2, and 3

The result is a 2x3 NumPy array containing the selected block.
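
The difference between using an integer index and using a slice shows up in the shape of the result. In this sketch (illustrative variable names), an integer index removes that dimension, while a slice keeps it, even when the slice selects only one column.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

column = arr[0:2, 2]       # integer column index: result is 1-D
column_2d = arr[0:2, 2:3]  # length-1 column slice: result stays 2-D
block = arr[0:2, 1:4]      # slices on both axes: result is 2-D

print(column.shape)
print(column_2d.shape)
print(block.shape)
```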

NumPy Data Types

Data Types in Python
By default Python has these data types:

* strings - used to represent text data, the text is given under quote marks. e.g. "ABCD"
* integer - used to represent integer numbers. e.g. -1, -2, -3
* float - used to represent real numbers. e.g. 1.2, 42.42
* boolean - used to represent True or False.
* complex - used to represent complex numbers. e.g. 1.0 + 2.0j, 1.5 + 2.5j

Data Types in NumPy
NumPy has some extra data types, and refers to data types with one character, like i for integers, u for unsigned integers, etc.

Below is a list of all data types in NumPy and the characters used to represent them.

* i - integer
* b - boolean
* u - unsigned integer
* f - float
* c - complex float
* m - timedelta
* M - datetime
* O - object
* S - string
* U - unicode string
* V - fixed chunk of memory for other type ( void )
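
The one-character codes in the list above can be read back from any array through dtype.kind. The sketch below builds a few sample arrays (the dictionary and its keys are purely illustrative) and collects the kind character of each.

```python
import numpy as np

# Illustrative sample arrays, one per data type family
samples = {
    'integer': np.array([1, 2]),
    'float': np.array([1.5, 2.5]),
    'boolean': np.array([True, False]),
    'complex': np.array([1 + 2j]),
    'unicode': np.array(['a', 'b']),
    'bytes': np.array([b'a', b'b']),
    'datetime': np.array(['2024-01-01'], dtype='datetime64[D]'),
}

# dtype.kind holds the single character that identifies the type family
kinds = {name: a.dtype.kind for name, a in samples.items()}

for name, kind in kinds.items():
    print(name, '->', kind)
```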

Checking the Data Type of an Array

The NumPy array object has a property called dtype that returns the data type of the array:

In [683]:
# Get the data type of an array object:

import numpy as np

arr = np.array([1, 2, 3, 4])

print(arr.dtype)

int32


In [684]:
# Get the data type of an array containing strings:

import numpy as np

arr = np.array(['apple', 'banana', 'cherry'])

print(arr.dtype)

<U6


Creating Arrays With a Defined Data Type

We use the array() function to create arrays. This function can take an optional argument, dtype, that allows us to define the expected data type of the array elements:

In [685]:
# Create an array with data type string:

import numpy as np

arr = np.array([1, 2, 3, 4], dtype='S')

print(arr)
print(arr.dtype)

[b'1' b'2' b'3' b'4']
|S1


From the code in the example above:

* [1, 2, 3, 4] is a list of integers.
* dtype='S' tells NumPy to store the data as byte strings (not regular strings or integers).

🔸 What is a Byte String ('S') in NumPy?

'S' means a string stored as bytes, also called a byte string.

Each element will be converted to a byte representation of a string.

So:

1 becomes b'1'

2 becomes b'2'

etc.

🔸 What happens under the hood?
* NumPy automatically determines the maximum string length based on the input.

In this case, all numbers are single-digit, so the byte string will be of type |S1:

| → means byte order is not applicable (each character is a single byte)

S1 → means a byte string of length 1

If you had longer values (e.g., 10 or 200), NumPy would automatically use a larger length to fit the longest one (e.g., S2 or S3).
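
A minimal sketch of the automatic length, using string inputs of different lengths (the input values are chosen only for illustration). The longest element has 3 characters, so every element is stored in a 3-byte slot.

```python
import numpy as np

# The longest input has 3 characters, so NumPy picks a length of 3
arr = np.array(['1', '10', '200'], dtype='S')

print(arr)
print(arr.dtype)
print(arr.dtype.itemsize)  # bytes reserved for each element
```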

NOTE => For i, u, f, S and U we can define the size as well.

In [692]:
# Create an array with data type 4 bytes integer:

import numpy as np

arr = np.array([1, 2, 3, 4], dtype='i4')

print(arr)
print(arr.dtype)

[1 2 3 4]
int32
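
The number after the type character is the size of each element in bytes. The sketch below (illustrative variable names) reads that size back through itemsize, and the total memory used by the data through nbytes.

```python
import numpy as np

small = np.array([1, 2, 3, 4], dtype='i2')  # 2-byte integers
large = np.array([1, 2, 3, 4], dtype='i8')  # 8-byte integers
single = np.array([1.5], dtype='f4')        # 4-byte float

print(small.itemsize, small.nbytes)
print(large.itemsize, large.nbytes)
print(single.dtype)
```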


What if a Value Can Not Be Converted?

If a type is given to which the elements can't be cast, NumPy will raise a ValueError.

* ValueError: In Python, ValueError is raised when an argument passed to a function has the right type but an inappropriate value.

In [693]:
# A non integer string like 'a' can not be converted to integer (will raise an error):

import numpy as np

arr = np.array(['a', '2', '3'], dtype='i')

ValueError: invalid literal for int() with base 10: 'a'
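
If the input may contain values that can't be converted, the error can be caught with try/except. This is only a sketch of one way to handle it; the converted and message names are illustrative.

```python
import numpy as np

try:
    arr = np.array(['a', '2', '3'], dtype='i')
    converted = True
except ValueError as err:
    # 'a' is not a valid integer literal, so the conversion fails here
    converted = False
    message = str(err)
    print('Conversion failed:', message)
```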

Converting Data Type on Existing Arrays

The best way to change the data type of an existing array is to make a copy of the array with the astype() method.

The astype() method creates a copy of the array, and allows you to specify the data type as a parameter.

The data type can be specified using a string, like 'f' for float, 'i' for integer etc. or you can use the data type directly like float for float and int for integer.

In [695]:
# Change data type from float to integer by using 'i' as parameter value:

import numpy as np

arr = np.array([1.1, 2.1, 3.1])

newarr = arr.astype('i')

print(arr.dtype)
print(newarr)
print(newarr.dtype)

float64
[1 2 3]
int32


In [696]:
# Change data type from float to integer by using int as parameter value:

import numpy as np

arr = np.array([1.1, 2.1, 3.1])

newarr = arr.astype(int)

print(newarr)
print(newarr.dtype)

[1 2 3]
int32


In [701]:
# Change data type from integer to boolean:

import numpy as np

arr = np.array([1, 0, 3])

newarr = arr.astype(bool)

print(newarr)
print(newarr.dtype)

[ True False  True]
bool
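
Note that converting floats to integers with astype() drops the fractional part instead of rounding. The sketch below uses illustrative values to show this, and also that the original array keeps its own data type.

```python
import numpy as np

arr = np.array([-1.7, -0.2, 0.9, 2.5])

# The fractional part is discarded (truncation toward zero), not rounded
ints = arr.astype(int)

print(ints)
print(arr.dtype)  # the original array is left unchanged
```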


NumPy Array Copy vs View

The Difference Between Copy and View

The main difference between a copy and a view of an array is that the copy is a new array, and the view is just a view of the original array.

The copy owns the data and any changes made to the copy will not affect original array, and any changes made to the original array will not affect the copy.

The view does not own the data and any changes made to the view will affect the original array, and any changes made to the original array will affect the view.

COPY:

In [1]:
# Make a copy, change the original array, and display both arrays:

import numpy as np

arr = np.array([1, 2, 3, 4, 5])
x = arr.copy()
arr[0] = 42

print(arr)
print(x)

[42  2  3  4  5]
[1 2 3 4 5]


The copy SHOULD NOT be affected by the changes made to the original array.

In [2]:
# Make a view, change the original array, and display both arrays:

import numpy as np

arr = np.array([1, 2, 3, 4, 5])
x = arr.view()
arr[0] = 42

print(arr)
print(x)

[42  2  3  4  5]
[42  2  3  4  5]


NOTE => The view SHOULD be affected by the changes made to the original array.

Make Changes in the VIEW:

In [3]:
# Make a view, change the view, and display both arrays:

import numpy as np

arr = np.array([1, 2, 3, 4, 5])
x = arr.view()
x[0] = 31

print(arr)
print(x)

[31  2  3  4  5]
[31  2  3  4  5]


NOTE => The original array SHOULD be affected by the changes made to the view.
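
Slicing also returns a view, which is easy to overlook. In this sketch (illustrative variable names), changing a slice changes the original array, while a slice followed by copy() is independent.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

part = arr[1:4]  # a slice is a view of the original data
part[0] = 99     # this also changes arr[1]

independent = arr[1:4].copy()  # copy() breaks the link
independent[1] = -1            # arr is not affected

print(arr)
print(independent)
print(np.shares_memory(arr, part))
```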

Check if Array Owns its Data

As mentioned above, copies own the data, and views do not own the data, but how can we check this?

Every NumPy array has the attribute base that returns None if the array owns the data.

Otherwise, the base attribute refers to the original object.

In [4]:
# Print the value of the base attribute to check if an array owns its data or not:

import numpy as np

arr = np.array([1, 2, 3, 4, 5])

x = arr.copy()
y = arr.view()

print(x.base)
print(y.base)

None
[1 2 3 4 5]


NOTE => The copy returns None.
The view returns the original array.
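
Instead of printing base, the check can be written with the is operator, which gives a clear True or False. A minimal sketch with illustrative variable names, including a slice to show that it behaves like a view:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

x = arr.copy()
y = arr.view()
z = arr[1:3]

copy_owns_data = x.base is None  # a copy owns its data
view_uses_arr = y.base is arr    # a view points back to arr
slice_uses_arr = z.base is arr   # a slice points back to arr as well

print(copy_owns_data, view_uses_arr, slice_uses_arr)
```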

NumPy Array Shape

Shape of an Array
The shape of an array is the number of elements in each dimension.

Get the Shape of an Array
NumPy arrays have an attribute called shape that returns a tuple in which each index holds the number of elements in the corresponding dimension.

In [5]:
# Print the shape of a 2-D array:

import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

print(arr.shape)

(2, 4)


The example above returns (2, 4), which means that the array has 2 dimensions, where the first dimension has 2 elements and the second has 4.

In [6]:
# Create an array with 5 dimensions using ndmin from a vector with values 1,2,3,4 and verify that the last dimension has size 4:

import numpy as np

arr = np.array([1, 2, 3, 4], ndmin=5)

print(arr)
print('shape of array :', arr.shape)

[[[[[1 2 3 4]]]]]
shape of array : (1, 1, 1, 1, 4)


What does the shape tuple represent?
The integer at each index tells us the number of elements in the corresponding dimension.

In the example above, at index 4 we have the value 4, so we can say that the 5th (4 + 1) dimension has 4 elements.
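
The shape tuple ties together two other attributes: its length is the number of dimensions, and the product of its entries is the total number of elements. A short sketch with illustrative variable names:

```python
import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

dims = len(arr.shape)  # one entry in the tuple per dimension
total = arr.size       # total number of elements in the array

print(arr.shape)
print(dims, arr.ndim)
print(total)
```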