# NumPy Basics

Welcome to NumPy! This notebook covers the essential features and usage of NumPy, the fundamental package for scientific computing in Python. NumPy provides powerful tools for working with arrays, matrices, and mathematical functions efficiently.

## Arrays

Arrays are similar to Python lists, but all elements share one data type, which makes element-wise operations faster and more memory-efficient. The main type is **ndarray**, which can be created by passing a sequence (such as a list or tuple) to `numpy.array()`.

In [1]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5])  # create a 1D array
print(arr)  # print the array
print(type(arr))  # print the type of the array

[1 2 3 4 5]
<class 'numpy.ndarray'>


## Dimensions

Every array has an `ndim` attribute that holds its number of dimensions:
- **0-dim:** A single value (scalar)
- **1-dim:** A sequence of values (vector)
- **2-dim:** Matrix
- **3-dim:** A stack of matrices (often called a 3D tensor)

In [2]:
arr0 = np.array(1)  # 0-dim array
arr1 = np.array((1, 2, 3, 4, 5))  # 1-dim array
arr2 = np.array(((1, 2, 3), (4, 5, 6)))  # 2-dim array
arr3 = np.array((((1, 2), (3, 4)), ((5, 6), (7, 8))))  # 3-dim array

print("Array:")
print(arr0)
print("Dimension:", arr0.ndim)

print("Array:")
print(arr1)
print("Dimension:", arr1.ndim)

print("Array:")
print(arr2)
print("Dimension:", arr2.ndim)

print("Array:")
print(arr3)
print("Dimension:", arr3.ndim)

Array:
1
Dimension: 0
Array:
[1 2 3 4 5]
Dimension: 1
Array:
[[1 2 3]
 [4 5 6]]
Dimension: 2
Array:
[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]
Dimension: 3


### Set dimension as a parameter

You can set a minimum number of dimensions with the `ndmin` parameter of `np.array()`. If the input has fewer dimensions, NumPy adds leading axes of length 1 until the array has `ndmin` dimensions.

In [3]:
arr = np.array([1, 2, 3, 4], ndmin=5)  # create array with minimum 5 dimensions
print(arr)  # print the array
print("Dimension:", arr.ndim)  # print the number of dimensions

[[[[[1 2 3 4]]]]]
Dimension: 5
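
As a small sketch of how `ndmin` behaves, the cell below checks the resulting shape and shows that `ndmin` is only a lower bound: an input that already has more dimensions is left as it is. The names `padded` and `matrix` are just illustrative.

```python
import numpy as np

# ndmin prepends axes of length 1 until the array has at least ndmin dimensions
padded = np.array([1, 2, 3, 4], ndmin=3)
print(padded.shape)  # (1, 1, 4)

# an input that already has 2 dimensions is not reduced by ndmin=1
matrix = np.array([[1, 2], [3, 4]], ndmin=1)
print(matrix.ndim)  # 2
```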


## Indexing Arrays

Indexes in NumPy start from 0, so `arr[0]` gives the first element in an array.

For 2D arrays (matrices), use `arr[row, column]`. Higher dimensions work the same way, with one index per axis.

In [4]:
arr = np.array([1, 2, 3, 4, 5])  # create a 1D array
print(arr[0])  # print first element
print(arr[2] + arr[3])  # sum third and fourth elements

1
7


In [5]:
m = np.array([[1, 2, 3], [4, 5, 6]])  # create a 2D array
print(m[0, 2])  # print element at 1st row, 3rd column

3


**Note:** Negative indices count from the end of the array, so `arr[-1]` is the last element.
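
A minimal sketch of negative indexing, reusing the same values as the cells above (the variable names `last`, `second_last`, and `corner` are illustrative):

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
m = np.array([[1, 2, 3], [4, 5, 6]])

last = arr[-1]         # last element
second_last = arr[-2]  # element before the last one
corner = m[-1, -1]     # last row, last column

print(last)         # 5
print(second_last)  # 4
print(corner)       # 6
```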

## Slicing

You can slice arrays using the syntax `[start:end]`.
- To specify a step: `[start:end:step]`

Notes:
- The index passed as end is not included in the slice
- If start is omitted, it defaults to 0
- If end is omitted, it defaults to the array length
- If step is omitted, it defaults to 1

You can also use **fancy indexing**: pass a list or array of indices to select exactly those elements.

In [6]:
print(arr[1:5])  # elements from index 1 to 4
print(arr[:5])  # elements from start to index 4
print(arr[1:])  # elements from index 1 to end
print(arr[::2])  # every second element

[2 3 4 5]
[1 2 3 4 5]
[2 3 4 5]
[1 3 5]


In [7]:
print(m[1, :])  # all columns in second row
print(m[:, 0:2])  # first two columns in all rows

[4 5 6]
[[1 2]
 [4 5]]


In [8]:
print(arr[[1, 3]])  # select elements at index 1 and 3

[2 4]
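
Slices also accept negative indices and negative steps. The sketch below assumes the same 1D array as above. With a negative step the slice runs backwards, so an omitted start means the last element rather than index 0.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

tail = arr[-3:]      # last three elements
inner = arr[1:-1]    # everything except the first and last element
flipped = arr[::-1]  # negative step: walk the array backwards

print(tail)     # [3 4 5]
print(inner)    # [2 3 4]
print(flipped)  # [5 4 3 2 1]
```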


## Data Types

NumPy identifies each kind of data type with a one-character code:
- i: integer
- b: boolean
- u: unsigned integer
- f: float
- c: complex float
- m: timedelta
- M: datetime
- O: object
- S: byte string
- U: Unicode string
- V: void

Use the `dtype` attribute to check the type of an array.

In [9]:
print(arr.dtype)  # print the data type of the array

int64
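
The one-character codes above can be read back from any array through `dtype.kind`, and `dtype.itemsize` gives the size of one element in bytes. A quick sketch with made-up sample arrays:

```python
import numpy as np

ints = np.array([1, 2, 3])
floats = np.array([1.0, 2.5])
flags = np.array([True, False])
text = np.array(['a', 'bc'])
small = np.array([1, 2, 3], dtype='i4')  # 4-byte integers

print(ints.dtype.kind)       # i
print(floats.dtype.kind)     # f
print(flags.dtype.kind)      # b
print(text.dtype.kind)       # U
print(small.dtype.itemsize)  # 4
```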


### Set the data type

You can define the array with a preset type using the `dtype` parameter in `np.array()`.

For *i*, *u*, *f*, *S*, and *U*, you can set the size as well, e.g. `'i4'` is a 4-byte (32-bit) integer.

**Note:** If a value cannot be converted to the requested type, NumPy raises a `ValueError`.

In [10]:
arr = np.array([1, 2, 3, 4], dtype='S')  # string type
print(arr)
print(arr.dtype)

arr = np.array([1, 2, 3, 4], dtype='i4')  # 32-bit integer
print(arr)
print(arr.dtype)

[b'1' b'2' b'3' b'4']
|S1
[1 2 3 4]
int32
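
To see what happens when a value cannot be converted, the sketch below tries to build an integer array from strings where one entry is not a number. The input list is made up for illustration, and the exact error text may differ between NumPy versions, so it is printed rather than assumed.

```python
import numpy as np

error_message = None
try:
    # 'a' cannot be parsed as an integer, so the conversion fails
    np.array(['a', '2', '3'], dtype='i')
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```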


#### Change the data type on a built array

The data type of an existing array cannot be changed in place. Instead, `astype()` returns a copy of the array with the requested type and leaves the original unchanged.

In [11]:
arr = np.array([1, 2, 3, 4])  # create a 1D array
copy_arr = arr.astype('i')  # convert to C int (32-bit on most platforms)
print(copy_arr)
print(copy_arr.dtype)

[1 2 3 4]
int32


## Copy & View

- A copy of an array owns its data and changes will not affect the original array.
- A view does not own data and changes will affect the original array.

**Note:** To check whether an array owns its data, use the `base` attribute: it is `None` for an array that owns its data and refers to the original array for a view.

In [12]:
x = arr.copy()  # create a copy
y = arr.view()  # create a view

x[0] = 0  # change first element in copy
print(arr)  # original array is unchanged

y[0] = 0  # change first element in view
print(arr)  # original array is changed

print(x.base)  # None means owns data
print(y.base)  # points to original array

[1 2 3 4]
[0 2 3 4]
None
[0 2 3 4]
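
Views also appear without calling `view()`. As a sketch: basic slicing returns a view, while fancy indexing returns a copy. `np.shares_memory()` is used here to check this; the array values are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

sliced = arr[1:4]        # basic slice: a view on arr's data
picked = arr[[1, 2, 3]]  # fancy indexing: an independent copy

print(np.shares_memory(arr, sliced))  # True
print(np.shares_memory(arr, picked))  # False

sliced[0] = 99  # writing through the view changes the original
print(arr)      # [ 1 99  3  4  5]
print(picked)   # [2 3 4]
```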


## Shape

NumPy arrays have a `shape` attribute that returns a tuple with the number of elements in each dimension.

In [13]:
print(arr0.shape)  # shape of 0-dim array
print(arr1.shape)  # shape of 1-dim array
print(arr2.shape)  # shape of 2-dim array
print(arr3.shape)  # shape of 3-dim array

()
(5,)
(2, 3)
(2, 2, 2)


### Reshape

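Reshaping changes the shape of an array: you can add or remove dimensions or change the number of elements along each one, as long as the total number of elements stays the same.

#### Can we reshape into any dimension?
Yes, as long as the new shape holds exactly the same number of elements.
E.g. 8 elements can be reshaped to 2x4 or 2x2x2, but not to 3x3.

**Note:** `reshape` returns a view of the original array whenever possible; otherwise it returns a copy.

**Note:** Only one argument can be -1 (unknown dimension); NumPy calculates it from the array's size and the other dimensions.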

In [14]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])  # create a 1D array
reshaped = arr.reshape(3, 3)  # reshape to 3x3
print(reshaped)  # print reshaped array
print(reshaped.ndim)  # print number of dimensions

[[1 2 3]
 [4 5 6]
 [7 8 9]]
2
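
The notes above can be tried out in a short sketch: `-1` lets NumPy infer one dimension, and a shape with the wrong number of elements raises a `ValueError`. The 8-element array is an illustrative choice.

```python
import numpy as np

arr = np.arange(1, 9)      # 8 elements: 1..8
auto = arr.reshape(2, -1)  # -1 is inferred as 4
print(auto.shape)          # (2, 4)

# in this simple case the result shares memory with the original (a view)
print(np.shares_memory(arr, auto))  # True

reshape_failed = False
try:
    arr.reshape(3, 3)  # 9 slots for 8 elements
except ValueError:
    reshape_failed = True
print(reshape_failed)  # True
```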


## Iteration through arrays

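You can iterate through arrays with simple for loops. Iterating over an n-dim array yields its (n-1)-dim sub-arrays, so reaching individual elements of higher-dimensional arrays takes nested loops.

`np.nditer()` visits every element one at a time (as 0-dim arrays), whatever the number of dimensions.

The `op_dtypes` argument of `nditer()` converts the data type of each element during iteration. The conversion cannot be done in place and needs extra buffer space, so pass `flags=['buffered']` as well.

The function `np.ndenumerate()` yields each element together with its index.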

In [15]:
for x in np.nditer(arr3):
    print(x)  # print each element

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])  # create a 2D array
# iterate over every second column of each row
for x in np.nditer(arr[:, ::2]):
    print(x)  # print every second element

1
2
3
4
5
6
7
8
1
3
5
7


In [16]:
for idx, x in np.ndenumerate(arr):
    print(idx, ':', x)  # print index and value

(0, 0) : 1
(0, 1) : 2
(0, 2) : 3
(0, 3) : 4
(1, 0) : 5
(1, 1) : 6
(1, 2) : 7
(1, 3) : 8
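
The `op_dtypes` conversion mentioned above is not shown in the cells, so here is a minimal sketch. It assumes an integer array converted to `float64`, which NumPy treats as a safe cast; the list `values` is just a helper for collecting the results.

```python
import numpy as np

arr = np.array([1, 2, 3])  # integer input

values = []
# buffered iteration lets nditer convert each element to float64 on the fly
for x in np.nditer(arr, flags=['buffered'], op_dtypes=['float64']):
    values.append(x.item())  # copy the value out as a Python float

print(values)          # [1.0, 2.0, 3.0]
print(arr.dtype.kind)  # i (the original array keeps its integer type)
```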


## Join

Joining means putting contents of two or more arrays in a single array.

Use the `concatenate()` function. The **axis** parameter is 0 by default.

In [17]:
a1 = np.array([1, 2, 3])  # first array
a2 = np.array([4, 5, 6])  # second array
a = np.concatenate((a1, a2))  # join arrays
print(a)  # print result

[1 2 3 4 5 6]


In [18]:
# 2D arrays concatenation
a1 = np.array([[1, 2], [3, 4]])  # first 2D array
a2 = np.array([[5, 6], [7, 8]])  # second 2D array
a = np.concatenate((a1, a2), axis=1)  # join along columns
print(a)  # print result

[[1 2 5 6]
 [3 4 7 8]]


### Stack

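`np.stack()` is like `concatenate` but joins the arrays along a new axis, e.g. several 1D arrays become one 2D array.

NumPy also provides helpers that join along a fixed direction:
- To stack horizontally (column-wise): `hstack()`
- To stack vertically (row-wise): `vstack()`
- To stack along depth (the third axis): `dstack()`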

In [19]:
a1 = np.array([1, 2, 3])  # first array
a2 = np.array([4, 5, 6])  # second array
a = np.stack((a1, a2), axis=1)  # stack along new axis
print(a)  # print result

[[1 4]
 [2 5]
 [3 6]]


In [20]:
b = np.hstack((a1, a2))  # stack horizontally
c = np.vstack((a1, a2))  # stack vertically
d = np.dstack((a1, a2))  # stack along depth
print(b)  # print horizontal stack
print(c)  # print vertical stack
print(d)  # print depth stack

[1 2 3 4 5 6]
[[1 2 3]
 [4 5 6]]
[[[1 4]
  [2 5]
  [3 6]]]
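
One way to keep the stacking helpers apart is to compare the shapes they produce for the same two 1D inputs. This is a small sketch using the arrays from the cells above:

```python
import numpy as np

a1 = np.array([1, 2, 3])
a2 = np.array([4, 5, 6])

shapes = {
    'stack axis=0': np.stack((a1, a2)).shape,          # new leading axis
    'stack axis=1': np.stack((a1, a2), axis=1).shape,  # new trailing axis
    'hstack': np.hstack((a1, a2)).shape,               # joined end to end
    'vstack': np.vstack((a1, a2)).shape,               # one row per input
    'dstack': np.dstack((a1, a2)).shape,               # joined along a third axis
}
for name, shape in shapes.items():
    print(name, shape)
```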


### Split

Splitting is the reverse of joining.

Use `np.array_split()` and pass the array and the number of splits you want. If the array cannot be divided evenly, `array_split()` returns parts of unequal size, whereas `np.split()` raises a `ValueError`.

In [21]:
arr = np.array([1, 2, 3, 4, 5, 6])  # create a 1D array
parts = np.array_split(arr, 3)  # split into 3 parts
print(parts)  # print all parts
print(parts[0])  # print first part
print(parts[1])  # print second part
print(parts[2])  # print third part

# 6 elements do not divide evenly into 4 parts; array_split returns uneven parts
incomp = np.array_split(arr, 4)  # split into 4 parts
print(incomp)  # print result

[array([1, 2]), array([3, 4]), array([5, 6])]
[1 2]
[3 4]
[5 6]
[array([1, 2]), array([3, 4]), array([5]), array([6])]
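
For comparison, here is a sketch of how `np.split()` behaves on the same array: it works when the division is even and raises a `ValueError` otherwise. It also accepts a list of indices at which to cut, shown here with illustrative cut points.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

equal_parts = np.split(arr, 3)  # 6 elements divide evenly into 3 parts
print(equal_parts)

split_failed = False
try:
    np.split(arr, 4)  # 6 elements cannot be divided into 4 equal parts
except ValueError:
    split_failed = True
print(split_failed)  # True

# cut before index 2 and before index 5
pieces = np.split(arr, [2, 5])
print(pieces)
```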


#### Splitting 2D arrays

`array_split` will split matrices along **axis=0** by default, but you can set the axis. You can also use `hsplit()` which is the reverse of `hstack()`.

In [22]:
m = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]])  # create a 2D array
spm = np.array_split(m, 3)  # split rows
print(spm)  # print result

m = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18]])  # another 2D array
spm = np.array_split(m, 3, axis=1)  # split columns
print(spm)  # print result

# alt_spm = np.hsplit(m, 3)
# print(alt_spm)  # alternative split

[array([[1, 2],
       [3, 4]]), array([[5, 6],
       [7, 8]]), array([[ 9, 10],
       [11, 12]])]
[array([[ 1],
       [ 4],
       [ 7],
       [10],
       [13],
       [16]]), array([[ 2],
       [ 5],
       [ 8],
       [11],
       [14],
       [17]]), array([[ 3],
       [ 6],
       [ 9],
       [12],
       [15],
       [18]])]


## Search

Use `np.where()` with a boolean condition to get the indices where the condition is True. The result is a tuple with one index array per dimension.

`searchsorted()` runs a binary search on a **sorted** array and returns the index where the value should be inserted to keep the array sorted. By default (`side='left'`) it returns the leftmost such index; with `side='right'` it returns the rightmost one.

In [23]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])  # create a 1D array
evens = np.where(arr % 2 == 0)  # find indices of even numbers
print(evens)  # print result

# searchsorted example
arr = np.array([6, 7, 8, 9])  # sorted array
x = np.searchsorted(arr, 7)  # find index for 7
print(x)  # print result

# searchsorted with multiple values
arr = np.array([1, 3, 5, 7])  # sorted array
x = np.searchsorted(arr, [2, 4, 6], side='right')  # find indices for multiple values
print(x)  # print result

(array([1, 3, 5, 7]),)
1
[1 2 3]
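
The difference between the two `side` settings only shows up when the searched value already occurs in the array. A small sketch with a made-up array containing duplicates:

```python
import numpy as np

arr = np.array([1, 3, 3, 3, 5])  # sorted, with repeated 3s

left = np.searchsorted(arr, 3)                 # before the first 3
right = np.searchsorted(arr, 3, side='right')  # after the last 3
matches = np.where(arr == 3)[0]                # unpack the 1-element tuple

print(left)     # 1
print(right)    # 4
print(matches)  # [1 2 3]
```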


## Sort

Sort elements in ascending order using `np.sort(arr)`.

**Note:** `np.sort()` returns a sorted copy and sorts along the last axis by default, so on a 2D array each row is sorted independently.

In [24]:
arr = np.array([3, 2, 0, 1])  # create a 1D array
print(np.sort(arr))  # sort and print

m = np.array([[3, 2, 4], [5, 0, 1]])  # create a 2D array
print(np.sort(m))  # sort each row and print

[0 1 2 3]
[[2 3 4]
 [0 1 5]]
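
The `axis` argument of `np.sort()` controls the direction of the sort. The sketch below reuses the arrays from the cell above. Reversing the sorted result with `[::-1]` is one simple way to get descending order, used here as an illustration.

```python
import numpy as np

arr = np.array([3, 2, 0, 1])
m = np.array([[3, 2, 4], [5, 0, 1]])

by_column = np.sort(m, axis=0)     # sort each column
flattened = np.sort(m, axis=None)  # flatten first, then sort
descending = np.sort(arr)[::-1]    # ascending sort, then reversed

print(by_column)   # [[3 0 1]
                   #  [5 2 4]]
print(flattened)   # [0 1 2 3 4 5]
print(descending)  # [3 2 1 0]
print(arr)         # [3 2 0 1] (np.sort does not modify its input)
```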


## Filter

Index an array with a boolean list or array (a mask) of the same length to filter its elements.

**True** means keep the element, **False** means ignore it.

In [25]:
arr = np.array([41, 42, 43, 44])  # create a 1D array
x = [True, False, True, False]  # manual boolean list
filter_arr = arr > 42  # condition-based boolean array
a = arr[x]  # filter using manual list
b = arr[filter_arr]  # filter using condition
print(a)  # print filtered array (manual)
print(b)  # print filtered array (condition)

[41 43]
[43 44]
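
Conditions can also be written inline and combined. In the sketch below, `&` means "and" and `|` means "or"; each condition needs its own parentheses because these operators bind more tightly than comparisons.

```python
import numpy as np

arr = np.array([41, 42, 43, 44])

even = arr[arr % 2 == 0]                # single inline condition
both = arr[(arr > 42) & (arr % 2 == 0)]  # greater than 42 AND even
either = arr[(arr < 42) | (arr > 43)]    # below 42 OR above 43

print(even)    # [42 44]
print(both)    # [44]
print(either)  # [41 44]
```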


## Special Arrays

NumPy provides functions to create special arrays:
- `np.zeros()` creates an array of zeros
- `np.ones()` creates an array of ones
- `np.eye()` creates an identity matrix
- `np.arange()` creates an array with an arithmetic sequence
- `np.linspace()` creates an array with evenly spaced values over a specified interval

**Note:** `arange` works like `range`: the right endpoint is excluded.

**Note:** `linspace` returns the requested number of evenly spaced values and includes the right endpoint by default, so `n` values divide the interval into `n - 1` equal steps.

In [26]:
a = np.zeros((3, 3))  # 3x3 array of zeros
b = np.ones((2, 2))  # 2x2 array of ones
c = np.eye(4)  # 4x4 identity matrix
d = np.arange(1, 10, 3)  # start 1, stop 10 (excluded), step 3
e = np.linspace(0.001, 0.1, 100)  # 100 values from 0.001 to 0.1
print(a)  # print zeros array
print(b)  # print ones array
print(c)  # print identity matrix
print(d)  # print arange array
print(e)  # print linspace array

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
[[1. 1.]
 [1. 1.]]
[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]
[1 4 7]
[0.001 0.002 0.003 0.004 0.005 0.006 0.007 0.008 0.009 0.01  0.011 0.012
 0.013 0.014 0.015 0.016 0.017 0.018 0.019 0.02  0.021 0.022 0.023 0.024
 0.025 0.026 0.027 0.028 0.029 0.03  0.031 0.032 0.033 0.034 0.035 0.036
 0.037 0.038 0.039 0.04  0.041 0.042 0.043 0.044 0.045 0.046 0.047 0.048
 0.049 0.05  0.051 0.052 0.053 0.054 0.055 0.056 0.057 0.058 0.059 0.06
 0.061 0.062 0.063 0.064 0.065 0.066 0.067 0.068 0.069 0.07  0.071 0.072
 0.073 0.074 0.075 0.076 0.077 0.078 0.079 0.08  0.081 0.082 0.083 0.084
 0.085 0.086 0.087 0.088 0.089 0.09  0.091 0.092 0.093 0.094 0.095 0.096
 0.097 0.098 0.099 0.1  ]
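
A shorter sketch makes the endpoint behaviour of `linspace` and `arange` easier to see. The interval from 0 to 1 and the count of 5 are illustrative choices.

```python
import numpy as np

points = np.linspace(0, 1, 5)  # 5 values, both endpoints included
step = points[1] - points[0]   # (1 - 0) / (5 - 1)
half_open = np.linspace(0, 1, 5, endpoint=False)  # right endpoint left out
ticks = np.arange(0, 1, 0.25)  # like range: stop value excluded

print(points)     # [0.   0.25 0.5  0.75 1.  ]
print(step)       # 0.25
print(half_open)  # [0.  0.2 0.4 0.6 0.8]
print(ticks)      # [0.   0.25 0.5  0.75]
```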
