# NumPy Tutorial

NumPy = **Num**erical **Py**thon

"NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more."

https://numpy.org/

In [2]:
import numpy as np

In [3]:
a = np.array([[1, 2, 3],
              [4, 5, 6]])

In [4]:
a.shape

(2, 3)

"NumPy shines when there are large quantities of “homogeneous” (same-type) data to be processed on the CPU."

## Array fundamentals

In [5]:
a = np.array([1, 2, 3, 4, 5, 6])

In [6]:
a

array([1, 2, 3, 4, 5, 6])

In [7]:
a[0]

1

In [8]:
a[0] = 10

In [9]:
a

array([10,  2,  3,  4,  5,  6])

In [10]:
a[:3]

array([10,  2,  3])

- View: an object that refers to the data in the original array.

In [11]:
b = a[3:]
b

array([4, 5, 6])

- The original array can be mutated using the view.

In [12]:
b[0] = 40
a

array([10,  2,  3, 40,  5,  6])

- Copies and views: https://numpy.org/doc/stable/user/basics.copies.html#basics-copies-and-views
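
- Whether an array is a view can also be checked in code. The following is a minimal sketch using the `ndarray.base` attribute and `np.shares_memory`; the variable names are illustrative and do not come from the cells above.

```python
import numpy as np

original = np.array([10, 2, 3, 4, 5, 6])

view = original[3:]          # basic slicing returns a view
copy = original[3:].copy()   # .copy() allocates new, independent data

# A view keeps a reference to the array that owns the data in .base;
# an array that owns its data has .base set to None.
view_is_view = view.base is original
copy_owns_data = copy.base is None

# np.shares_memory reports whether two arrays overlap in memory.
view_shares = np.shares_memory(original, view)
copy_shares = np.shares_memory(original, copy)

print(view_is_view, copy_owns_data, view_shares, copy_shares)
```

- Calling `.copy()` is the usual way to get an array that can be changed without touching the original.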

In [13]:
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
a

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [14]:
a[1, 3]  # same element as a[1][3], without creating an intermediate row

8

## Array attributes

In [21]:
a = np.array([1, 2, 3])  # This is a simple 1D vector of length 3

- `ndim` = number of dimensions

In [22]:
a.ndim

1

In [23]:
a.shape

(3,)

In [24]:
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

In [25]:
a.ndim

2

- The shape of an array is a tuple of non-negative integers that specify the number of elements along each dimension.

In [26]:
a.shape

(3, 4)

- The fixed, total number of elements in an array is contained in the `size` attribute.

In [27]:
a.size

12

In [28]:
import math

In [29]:
a.size == math.prod(a.shape)

True

- Arrays are typically "homogeneous", which means they contain elements of the same data type.
- The data type is recorded in the `dtype` attribute.

In [30]:
a.dtype

dtype('int32')

- A note on `int32`:
    - `int32` means **32-bit integer**: an integer stored using 32 binary digits (bits), each of which is either 0 or 1.
    - In other words, 32 bits = a sequence of 32 zeros or ones.
    - For example, `00000000 00000000 00000000 00000001` represents the number 1.
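
- The bit layout described above can be inspected directly. This is a small sketch using `np.iinfo`, `np.binary_repr`, and the `itemsize` attribute; the array `x` is a made-up example.

```python
import numpy as np

info = np.iinfo(np.int32)            # integer limits for the int32 type
bits = np.binary_repr(1, width=32)   # the 32-bit pattern for the number 1

x = np.array([1, 2, 3], dtype=np.int32)

print(info.min, info.max)   # smallest and largest representable values
print(bits)
print(x.itemsize)           # bytes per element: 32 bits / 8 bits per byte
```

- Because one bit is used for the sign, the largest `int32` value is 2**31 - 1.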

## How to create a basic array

In [32]:
np.zeros(2)

array([0., 0.])

In [33]:
np.ones(5)

array([1., 1., 1., 1., 1.])

In [34]:
np.empty(2)  # contents are uninitialized; the values shown are whatever was in memory

array([0., 0.])

In [35]:
np.arange(4)

array([0, 1, 2, 3])

In [36]:
np.arange(2, 13, 2)

array([ 2,  4,  6,  8, 10, 12])

In [37]:
np.linspace(0, 10, num=5)

array([ 0. ,  2.5,  5. ,  7.5, 10. ])

In [38]:
np.linspace(0, 10, num=6)

array([ 0.,  2.,  4.,  6.,  8., 10.])

In [39]:
np.linspace(0, 10, num=7)

array([ 0.        ,  1.66666667,  3.33333333,  5.        ,  6.66666667,
        8.33333333, 10.        ])

In [40]:
x = np.ones(2, dtype=np.int64)

In [41]:
x

array([1, 1], dtype=int64)

## Adding, removing, and sorting elements

In [42]:
arr = np.array([2, 1, 5, 3, 7, 4, 6, 8])

In [43]:
np.sort(arr)

array([1, 2, 3, 4, 5, 6, 7, 8])

In [44]:
a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])

In [45]:
np.concatenate((a, b))

array([1, 2, 3, 4, 5, 6, 7, 8])

In [46]:
x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6]])

In [49]:
np.concatenate((x, y), axis=0)

array([[1, 2],
       [3, 4],
       [5, 6]])

## How do you know the shape and size of an array?

In [50]:
array_example = np.array([[[0, 1, 2, 3],
                           [4, 5, 6, 7]],

                          [[0, 1, 2, 3],
                           [4, 5, 6, 7]],

                          [[0, 1, 2, 3],
                           [4, 5, 6, 7]]])

In [51]:
array_example.ndim

3

In [52]:
array_example.size

24

In [53]:
array_example.shape

(3, 2, 4)

## Can you reshape an array?

In [55]:
a = np.arange(6)
print(a)

[0 1 2 3 4 5]


In [None]:
b = a.reshape(3, 2)  # the method form takes only the new shape, not the array itself
print(b)

[[0 1]
 [2 3]
 [4 5]]


In [68]:
np.reshape(a, (1, 6), order='C')

array([[0, 1, 2, 3, 4, 5]])
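
- As a small extension of the reshape calls above, one dimension can be given as `-1` so that NumPy infers it from the array's size. The sketch below also shows, as an illustrative case, that a shape whose element count does not match raises `ValueError`.

```python
import numpy as np

a = np.arange(6)

b = a.reshape(3, 2)     # explicit shape
c = a.reshape(-1, 2)    # -1 lets NumPy infer that dimension (here 3)

try:
    a.reshape(4, 2)     # 8 slots cannot hold 6 elements
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(b.shape, c.shape, mismatch_raised)
```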

## Convert a 1D array into a 2D array

In [69]:
a = np.array([1, 2, 3, 4, 5, 6])
a.shape

(6,)

In [72]:
a2 = a[np.newaxis, :]
a2.shape

(1, 6)

In [73]:
row_vector = a[np.newaxis, :]
row_vector.shape

(1, 6)

In [74]:
col_vector = a[:, np.newaxis]
col_vector.shape

(6, 1)

In [75]:
a = np.array([1, 2, 3, 4, 5, 6])
a.shape

(6,)

In [76]:
b = np.expand_dims(a, axis=1)
b.shape

(6, 1)

In [77]:
c = np.expand_dims(a, axis=0)
c.shape

(1, 6)

## Indexing and Slicing

In [78]:
data = np.array([1, 2, 3])
data[1]

2

In [79]:
data[0:2]

array([1, 2])

In [80]:
data[1:]

array([2, 3])

In [81]:
data[-2:]

array([2, 3])

In [82]:
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
print(a[a < 5])

[1 2 3 4]


In [83]:
five_up = (a >= 5)
print(a[five_up])

[ 5  6  7  8  9 10 11 12]


In [84]:
divisible_by_2 = a[a%2==0]
print(divisible_by_2)

[ 2  4  6  8 10 12]


In [85]:
c = a[(a > 2) & (a < 11)]
print(c)

[ 3  4  5  6  7  8  9 10]


In [86]:
five_up = (a > 5) | (a == 5)
print(five_up)

[[False False False False]
 [ True  True  True  True]
 [ True  True  True  True]]


In [87]:
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
b = np.nonzero(a < 5)
print(b)

(array([0, 0, 0, 0], dtype=int64), array([0, 1, 2, 3], dtype=int64))


In [88]:
list_of_coordinates = list(zip(b[0], b[1]))
for coord in list_of_coordinates:
    print(coord)

(0, 0)
(0, 1)
(0, 2)
(0, 3)


In [89]:
print(a[b])

[1 2 3 4]


## How to create an array from existing data

In [90]:
a = np.array([1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [91]:
arr1 = a[3:8]
arr1

array([4, 5, 6, 7, 8])

In [92]:
a1 = np.array([[1, 1],
               [2, 2]])

a2 = np.array([[3, 3],
               [4, 4]])

In [93]:
np.vstack((a1, a2))

array([[1, 1],
       [2, 2],
       [3, 3],
       [4, 4]])

In [94]:
np.hstack((a1, a2))

array([[1, 1, 3, 3],
       [2, 2, 4, 4]])

In [95]:
x = np.arange(1, 25).reshape(2, 12)
x

array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12],
       [13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24]])

In [96]:
np.hsplit(x, 3)

[array([[ 1,  2,  3,  4],
        [13, 14, 15, 16]]),
 array([[ 5,  6,  7,  8],
        [17, 18, 19, 20]]),
 array([[ 9, 10, 11, 12],
        [21, 22, 23, 24]])]

In [97]:
np.hsplit(x, (3, 4))

[array([[ 1,  2,  3],
        [13, 14, 15]]),
 array([[ 4],
        [16]]),
 array([[ 5,  6,  7,  8,  9, 10, 11, 12],
        [17, 18, 19, 20, 21, 22, 23, 24]])]

- "Views are an important NumPy concept! NumPy functions, as well as operations like indexing and slicing, will return views whenever possible. This saves memory and is faster (no copy of the data has to be made). However it’s important to be aware of this - **modifying data in a view also modifies the original array**!"

In [98]:
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

In [99]:
b1 = a[0, :]
b1

array([1, 2, 3, 4])

In [100]:
b1[0] = 99
b1

array([99,  2,  3,  4])

In [101]:
a

array([[99,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [102]:
b2 = a.copy()

In [103]:
b2

array([[99,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [104]:
b2[0] = 100

In [105]:
b2

array([[100, 100, 100, 100],
       [  5,   6,   7,   8],
       [  9,  10,  11,  12]])

In [106]:
a

array([[99,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])
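
- Views are returned "whenever possible", but not every indexing operation can produce one. As a hedged illustration (the variable names are made up), basic slicing gives a view while boolean-mask indexing gives a copy:

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

row_view = a[0, :]        # basic slicing: a view
selected = a[a > 10]      # boolean-mask indexing: a copy

selected[0] = -1          # does not touch a
row_view[0] = 99          # writes through to a

print(np.shares_memory(a, row_view), np.shares_memory(a, selected))
print(a[0, 0], a[2, 2])
```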

## Basic array operations

In [107]:
data = np.array([1, 2])
ones = np.ones(2, dtype=int)
data + ones

array([2, 3])

![image.png](attachment:image.png)

In [109]:
data - ones

array([0, 1])

In [110]:
data * data

array([1, 4])

In [111]:
data / data

array([1., 1.])

![image.png](attachment:image.png)

In [112]:
a = np.array([1, 2, 3, 4])
a.sum()

10

In [115]:
b = np.array([[1, 1], [2, 2]])

In [118]:
b.sum(axis=0)  # sum down each column (collapses the rows)

array([3, 3])

In [119]:
b.sum(axis=1)  # sum across each row (collapses the columns)

array([2, 4])

## Broadcasting

In [120]:
data = np.array([1.0, 2.0])
data * 1.6

array([1.6, 3.2])
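
- Broadcasting also applies between two arrays of different shapes, not only between an array and a scalar. The sketch below uses made-up `col` and `row` arrays and `np.broadcast_shapes` (available in NumPy 1.20 and later) to show the resulting shape.

```python
import numpy as np

col = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3, 4])        # shape (4,)

# Shapes are compared from the trailing dimension; a dimension of size 1
# (or a missing one) is stretched to match the other operand.
result = col + row                  # shape (3, 4)

print(np.broadcast_shapes(col.shape, row.shape))
print(result)
```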

## More useful array operations

In [121]:
data

array([1., 2.])

In [122]:
data.max()

2.0

In [123]:
data.min()

1.0

In [124]:
data.sum()

3.0

In [130]:
a = np.random.uniform(low=0.0, high=1.0, size=(3,4))

In [131]:
a

array([[0.74708129, 0.81068866, 0.28373304, 0.27733943],
       [0.12406873, 0.09213657, 0.25316103, 0.86729674],
       [0.62163331, 0.79707664, 0.45346394, 0.75632084]])

In [132]:
a.sum()

6.084000209626186

In [133]:
a.min()

0.09213657053141466

In [134]:
a.min(axis=0)

array([0.12406873, 0.09213657, 0.25316103, 0.27733943])

## Creating matrices

In [135]:
data = np.array([[1, 2], [3, 4], [5, 6]])
data

array([[1, 2],
       [3, 4],
       [5, 6]])

In [136]:
data[0, 1]

2

In [137]:
data[1:3]

array([[3, 4],
       [5, 6]])

In [138]:
data[0:2, 0]

array([1, 3])

In [139]:
data.max()

6

In [140]:
data.min()

1

In [141]:
data.sum()

21

In [142]:
data = np.array([[1, 2], [5, 3], [4, 6]])

In [143]:
data.max(axis=0)

array([5, 6])

In [144]:
data.max(axis=1)

array([2, 5, 6])

In [145]:
data = np.array([[1, 2], [3, 4]])
ones = np.array([[1, 1], [1, 1]])
data + ones

array([[2, 3],
       [4, 5]])

In [146]:
data = np.array([[1, 2], [3, 4], [5, 6]])
ones_row = np.array([[1, 1]])
data + ones_row

array([[2, 3],
       [4, 5],
       [6, 7]])

In [147]:
np.ones((4, 3, 2))

array([[[1., 1.],
        [1., 1.],
        [1., 1.]],

       [[1., 1.],
        [1., 1.],
        [1., 1.]],

       [[1., 1.],
        [1., 1.],
        [1., 1.]],

       [[1., 1.],
        [1., 1.],
        [1., 1.]]])

## Generating random numbers

In [150]:
rng = np.random.default_rng()
rng.random(3)

array([0.19793066, 0.18282637, 0.86933147])

In [151]:
rng.integers(5, size=(2, 4))

array([[2, 2, 0, 1],
       [1, 3, 1, 4]], dtype=int64)

## How to get unique items and counts

In [152]:
a = np.array([11, 11, 12, 13, 14, 15, 16, 17, 12, 13, 11, 14, 18, 19, 20])
unique_values = np.unique(a)
print(unique_values)

[11 12 13 14 15 16 17 18 19 20]


In [153]:
unique_values, indices_list = np.unique(a, return_index=True)
print(indices_list)

[ 0  2  3  4  5  6  7 12 13 14]


In [154]:
unique_values, occurrence_count = np.unique(a, return_counts=True)
print(occurrence_count)

[3 2 2 2 1 1 1 1 1 1]


In [155]:
a_2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [1, 2, 3, 4]])
unique_values = np.unique(a_2d)
print(unique_values)

[ 1  2  3  4  5  6  7  8  9 10 11 12]


In [156]:
unique_rows = np.unique(a_2d, axis=0)
print(unique_rows)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]


In [158]:
unique_rows, indices, occurrence_count = np.unique(
     a_2d, axis=0, return_counts=True, return_index=True)
print(unique_rows)
print(indices)
print(occurrence_count)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
[0 1 2]
[2 1 1]


## Transposing and reshaping a matrix

In [159]:
data.reshape(2, 3)

array([[1, 2, 3],
       [4, 5, 6]])

In [160]:
data.reshape(3, 2)

array([[1, 2],
       [3, 4],
       [5, 6]])

![image.png](attachment:image.png)

In [171]:
arr = np.arange(1, 7).reshape((3, 2))
arr

array([[1, 2],
       [3, 4],
       [5, 6]])

In [172]:
arr.transpose()

array([[1, 3, 5],
       [2, 4, 6]])

In [173]:
arr.T

array([[1, 3, 5],
       [2, 4, 6]])

![image.png](attachment:image.png)

## How to reverse an array

In [174]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

In [175]:
reversed_arr = np.flip(arr)

In [176]:
reversed_arr

array([8, 7, 6, 5, 4, 3, 2, 1])

In [177]:
arr_2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

In [178]:
reversed_2d_arr = np.flip(arr_2d)
reversed_2d_arr

array([[12, 11, 10,  9],
       [ 8,  7,  6,  5],
       [ 4,  3,  2,  1]])

In [179]:
reversed_arr_rows = np.flip(arr_2d, axis=0)  # axis=0 reverses the order of the rows

In [180]:
reversed_arr_rows

array([[ 9, 10, 11, 12],
       [ 5,  6,  7,  8],
       [ 1,  2,  3,  4]])

In [181]:
arr_2d[1] = np.flip(arr_2d[1])
print(arr_2d)

[[ 1  2  3  4]
 [ 8  7  6  5]
 [ 9 10 11 12]]


In [182]:
arr_2d[:,1] = np.flip(arr_2d[:,1])
print(arr_2d)

[[ 1 10  3  4]
 [ 8  7  6  5]
 [ 9  2 11 12]]


## Reshaping and flattening multidimensional arrays

- "There are two popular ways to flatten an array: `.flatten()` and `.ravel()`. The primary difference between the two is that the new array created using ravel() is actually a reference to the parent array (i.e., a “view”). This means that any changes to the new array will affect the parent array as well. Since ravel does not create a copy, it’s memory efficient."

In [183]:
x = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

In [184]:
x.flatten()

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])

- When you use `flatten`, changes to your new array won’t change the parent array.

In [185]:
a1 = x.flatten()
a1[0] = 99
print(x)
print(a1)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
[99  2  3  4  5  6  7  8  9 10 11 12]


- But when you use `ravel`, the changes you make to the new array will affect the parent array.

In [186]:
a2 = x.ravel()
a2[0] = 98
print(x)
print(a2)

[[98  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
[98  2  3  4  5  6  7  8  9 10 11 12]
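
- One caveat worth sketching: `ravel` returns a view only when the memory layout allows it. In the example below (an illustrative case, not from the cells above), raveling a transposed array produces a copy because the transposed data is not contiguous in C order.

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]])

flat_view = x.ravel()       # x is C-contiguous, so this is a view
flat_of_t = x.T.ravel()     # x.T is not C-contiguous, so this is a copy

print(np.shares_memory(x, flat_view))
print(np.shares_memory(x, flat_of_t))
print(flat_of_t)
```

- When independence from the parent array matters, `flatten` (or an explicit `.copy()`) makes the intent clear.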


## How to access the docstring for more information

- "When it comes to the data science ecosystem, Python and NumPy are built with the user in mind. One of the best examples of this is the built-in access to documentation. Every object contains the reference to a string, which is known as the docstring. In most cases, this docstring contains a quick and concise summary of the object and how to use it. Python has a built-in `help()` function that can help you access this information. This means that nearly any time you need more information, you can use `help()` to quickly find the information that you need."

In [187]:
help(max)

Help on built-in function max in module builtins:

max(...)
    max(iterable, *[, default=obj, key=func]) -> value
    max(arg1, arg2, *args, *[, key=func]) -> value
    
    With a single iterable argument, return its biggest item. The
    default keyword-only argument specifies an object to return if
    the provided iterable is empty.
    With two or more arguments, return the largest argument.



In [188]:
max?

Docstring:
max(iterable, *[, default=obj, key=func]) -> value
max(arg1, arg2, *args, *[, key=func]) -> value

With a single iterable argument, return its biggest item. The
default keyword-only argument specifies an object to return if
the provided iterable is empty.
With two or more arguments, return the largest argument.
Type:      builtin_function_or_method

In [191]:
sum?

Signature: sum(iterable, /, start=0)
Docstring:
Return the sum of a 'start' value (default: 0) plus an iterable of numbers

When the iterable is empty, return the start value.
This function is intended specifically for use with numeric values and may
reject non-numeric types.
Type:      builtin_function_or_method

In [194]:
a = np.array([1, 2, 3, 4, 5, 6])

In [195]:
a?

Type:        ndarray
String form: [1 2 3 4 5 6]
Length:      6
File:        c:\python310\lib\site-packages\numpy\__init__.py
Docstring:
ndarray(shape, dtype=float, buffer=None, offset=0,
        strides=None, order=None)

An array object represents a multidimensional, homogeneous array
of fixed-size items.  An associated data-type object describes the
format of each element in the array (its byte-order, how many bytes it
occupies in memory, whether it is an integer, a floating point number,
or something else, etc.)

Arrays should be constructed using `array`, `zeros` or `empty` (refer
to the See Also section below).  The parameters given here refer to
a low-level method (`ndarray(...)`) for instantiating an array.

For more information, refer to the `numpy` module and examine the
methods and attributes of an array.

Parameters
----------
(for the __new__ method; see Notes below)

shape : tuple of ints
    Shape of created array.


## Working with mathematical formulas

![image.png](attachment:image.png)

![image.png](attachment:image.png)

![image.png](attachment:image.png)

![image.png](attachment:image.png)
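
- As one illustration of how a formula translates into array operations (an assumed example; the formula and the sample data are not taken from the text above), consider the mean squared error, $\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}(\hat{y}_i - y_i)^2$. A minimal sketch with hypothetical `predictions` and `labels` arrays:

```python
import numpy as np

# Hypothetical example data, chosen only for illustration.
predictions = np.array([1.0, 1.0, 1.0])
labels = np.array([1.0, 2.0, 3.0])

n = predictions.size

# Subtract element-wise, square element-wise, sum, then divide by n.
mse = (1 / n) * np.sum(np.square(predictions - labels))
print(mse)
```

- The subtraction and squaring act on whole arrays at once, so no explicit loop is needed, and the same line works for arrays of any length.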