# NumPy Tutorial for Beginners

## 0. Introduction
NumPy (Numerical Python) is a Python library for numerical computing. It provides a high-performance multidimensional array object, `ndarray`, and tools for working with these arrays.



## 1. Installation and Import

### Install NumPy

Install using pip:

```pip install numpy```

Install using conda:

```conda install numpy```

Install a specific version:

```pip install numpy==1.19.5```


```conda install numpy==1.19.5```

### Importing NumPy

In [1]:
import numpy as np

## 2. Create arrays and their properties

Create arrays using NumPy and inspect their properties, such as number of dimensions, shape, data type, item size, and total size.

### Initializing NumPy Arrays
You can create a NumPy array using the `np.array()` function:

In [2]:
a = np.array([1, 2, 3, 4, 5], dtype=np.int32)
print(a)

[1 2 3 4 5]


In [3]:
b = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
print(b[0,2])

3.0
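
When no `dtype` is given, `np.array()` infers one from the values. The following is a small sketch of that behavior; the variable names are illustrative only, and the exact width of the default integer type depends on the platform.

```python
import numpy as np

ints = np.array([1, 2, 3])                        # all Python ints -> an integer dtype
floats = np.array([1, 2.5, 3])                    # one float promotes every element to float64
explicit = np.array([1, 2, 3], dtype=np.float32)  # dtype requested explicitly

print(ints.dtype.kind)   # 'i' means signed integer (width is platform dependent)
print(floats.dtype)      # float64
print(explicit.dtype)    # float32
```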


#### More initializing methods

In [4]:
# All 1s matrix
print(np.ones((4, 2, 2), dtype='int32'))

[[[1 1]
  [1 1]]

 [[1 1]
  [1 1]]

 [[1 1]
  [1 1]]

 [[1 1]
  [1 1]]]


In [5]:
# All 0s matrix
print(np.zeros((2, 3, 3)))

[[[0. 0. 0.]
  [0. 0. 0.]
  [0. 0. 0.]]

 [[0. 0. 0.]
  [0. 0. 0.]
  [0. 0. 0.]]]


In [6]:
# Any other number
print(np.full((2, 2), 99))

[[99 99]
 [99 99]]


In [7]:
# Fill with a value, reusing the shape and dtype of another array (full_like)
print(np.full_like(a, 4))

[4 4 4 4 4]
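
The `*_like` family follows the same pattern as `full_like`: each function takes an existing array and reuses its shape and dtype. A brief sketch, where `template` is a made-up example array:

```python
import numpy as np

template = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)

zeros = np.zeros_like(template)   # same shape and dtype, filled with 0
ones = np.ones_like(template)     # same shape and dtype, filled with 1

# Passing dtype overrides the template's dtype, so a fractional fill value is kept.
halves = np.full_like(template, 2.5, dtype=float)

print(zeros.shape, zeros.dtype)   # (2, 3) int32
print(halves)
```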


#### Random Numbers
NumPy can also generate arrays of random numbers.

In [8]:
# Random floats drawn uniformly from [0, 1)
print(np.random.rand(4, 2))

[[0.69858802 0.36854694]
 [0.24143252 0.20404743]
 [0.03222985 0.82495731]
 [0.75254135 0.23985028]]


In [9]:
# Random integers in [low, high): high is exclusive, and a single argument means low = 0
print(np.random.randint(11, size=(3, 3)))
print(np.random.randint(-4, 8, size=(3, 3)))

[[3 1 9]
 [4 2 9]
 [3 7 2]]
[[ 4  2 -2]
 [-3  6  7]
 [ 6  6 -4]]


In [10]:
# Random numbers drawn from the standard normal distribution (mean 0, variance 1)
print(np.random.randn(100))

[-0.91757644  0.93477579  1.72478178  2.13774354 -0.23021869 -2.29382083
 -0.03858306 -0.22844565  1.16706453  0.48180358  1.22998182  0.78366379
 -1.45214163  1.38076612  0.21487297 -1.37576405 -0.18378876 -0.51165954
 -0.77417807 -0.6702308  -1.25287922  0.5003666  -1.34214091 -0.37230661
 -2.04392282 -0.25132036  0.44505337  0.10099788  0.23944978 -0.67005164
 -0.16485469 -1.44522033  0.1254572   1.78613868 -0.3883826  -1.45304241
 -1.35636289 -0.47765804 -0.9677432  -0.67498827 -0.11880555 -1.41566303
  0.21014055 -1.49676959  0.3115381  -0.49581236 -3.40698781 -0.10836408
 -0.7970611   0.52394414 -0.28758045  0.48281376  0.24107885 -2.10311495
  0.82888799  0.40540958  0.58009827  0.21424426 -0.46938238 -0.40335774
 -2.25372029 -0.92759076 -0.38269245  1.09809202 -0.88613012  0.59144691
  2.87127114 -0.14020926  0.72964685  0.52059551 -0.4447406  -0.41719663
 -0.00486875  0.0231684  -0.85528619  1.10597908 -1.35261355  0.21132248
 -0.99352084 -0.14612128 -2.03541551  0.86509452  0
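
The functions above (`np.random.rand`, `np.random.randint`, `np.random.randn`) belong to NumPy's legacy random interface. The NumPy documentation recommends the `Generator` interface for new code, and a seeded generator also makes results reproducible. This sketch shows rough equivalents; the seed value 42 is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # arbitrary seed, chosen only for reproducibility

uniform = rng.random((4, 2))                  # floats in [0, 1), like np.random.rand(4, 2)
integers = rng.integers(-4, 8, size=(3, 3))   # ints in [-4, 8), like np.random.randint(-4, 8, ...)
normal = rng.standard_normal(100)             # standard normal, like np.random.randn(100)

# Two generators built from the same seed produce the same sequence.
same = np.array_equal(np.random.default_rng(7).random(3),
                      np.random.default_rng(7).random(3))
print(same)  # True
```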

#### Other 

In [11]:
# The identity matrix
print(np.identity(5))

[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]


In [12]:
# linspace (start, end, number of points); the end value is included
print(np.linspace(1, 5, 10))

[1.         1.44444444 1.88888889 2.33333333 2.77777778 3.22222222
 3.66666667 4.11111111 4.55555556 5.        ]


In [13]:
# An array with a range of numbers (start, end, step); the end value is excluded
print(np.arange(1, 5))
print(np.arange(1, 5, 0.5))

[1 2 3 4]
[1.  1.5 2.  2.5 3.  3.5 4.  4.5]
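
The main difference between the two functions is how they treat the end value: `arange` excludes it and takes a step, while `linspace` includes it by default and takes a number of points. A short comparison, using values chosen for illustration:

```python
import numpy as np

stop_excluded = np.arange(1, 5)                    # [1 2 3 4]
stop_included = np.linspace(1, 5, 5)               # [1. 2. 3. 4. 5.]
half_open = np.linspace(1, 5, 4, endpoint=False)   # [1. 2. 3. 4.]

print(stop_excluded)
print(stop_included)
print(half_open)
```

`linspace` is usually the safer choice for fractional spacing, because the number of points is stated explicitly instead of depending on floating-point rounding of the step.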


In [14]:
# Repeat an array
arr = np.array([[1, 2, 3]])
r1 = np.repeat(arr, 3, axis=0)
print(r1)

[[1 2 3]
 [1 2 3]
 [1 2 3]]
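
`np.repeat` repeats individual elements (or rows, when an axis is given), whereas `np.tile` repeats the array as a whole. The sketch below contrasts the two on a small 1-D array:

```python
import numpy as np

values = np.array([1, 2, 3])

repeated = np.repeat(values, 2)   # each element twice: [1 1 2 2 3 3]
tiled = np.tile(values, 2)        # whole array twice:  [1 2 3 1 2 3]

print(repeated)
print(tiled)
```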


In [15]:
# Challenge: build a 5x5 array of ones with a 3x3 block of zeros in the middle and a 9 at the center
output = np.ones((5, 5))
print(output)

z = np.zeros((3, 3))
z[1, 1] = 9
print(z)

output[1:4, 1:4] = z
print(output)

[[1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]]
[[0. 0. 0.]
 [0. 9. 0.]
 [0. 0. 0.]]
[[1. 1. 1. 1. 1.]
 [1. 0. 0. 0. 1.]
 [1. 0. 9. 0. 1.]
 [1. 0. 0. 0. 1.]
 [1. 1. 1. 1. 1.]]


##### Be careful when copying arrays

In [16]:
a = np.array([1, 2, 3])
b = a  # binds a second name to the same array; no data is copied
b[0] = 100
print(a)

[100   2   3]


In [17]:
# Use copy() to avoid this
a = np.array([1, 2, 3])
b = a.copy()
b[0] = 100
print(a)

[1 2 3]
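
The same caution applies to slices: basic slicing returns a view that shares memory with the original array, so writing through the slice changes the original. `np.shares_memory` can be used to check. A minimal sketch with illustrative variable names:

```python
import numpy as np

original = np.arange(6)          # [0 1 2 3 4 5]
view = original[1:4]             # basic slicing returns a view
copied = original[1:4].copy()    # an independent copy of the same elements

view[0] = 100                    # writes through to original[1]
copied[1] = -1                   # does not affect original

print(original)                              # [  0 100   2   3   4   5]
print(np.shares_memory(original, view))      # True
print(np.shares_memory(original, copied))    # False
```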


### Get dimension and shape of the arrays

In [18]:
# Get Dimension
print(a.ndim)
print(b.ndim)

1
1


In [19]:
# Get shape
print(a.shape)
print(b.shape)

(3,)
(3,)


### Get the data type, item size, and total size of the arrays

In [20]:
# Get Type
print(a.dtype, b.dtype)

int64 int64


In [21]:
# Get item size (bytes per element)
print(a.itemsize, b.itemsize)

8 8


In [22]:
# Get total size in bytes (number of elements * bytes per element)
print(a.size * a.itemsize, b.size * b.itemsize)
print(a.nbytes, b.nbytes)

24 24
24 24
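
The item size is determined entirely by the dtype, so the same number of elements can occupy very different amounts of memory. The sketch below compares three dtypes for a 10-element array. Note that the `int64` shown above is a platform default; on some platforms (for example Windows with NumPy versions before 2.0) the default integer is 32-bit.

```python
import numpy as np

sizes = {}
for dtype in (np.int8, np.int32, np.float64):
    arr = np.zeros(10, dtype=dtype)
    # nbytes is always size * itemsize
    sizes[arr.dtype.name] = (arr.itemsize, arr.nbytes)

print(sizes)  # {'int8': (1, 10), 'int32': (4, 40), 'float64': (8, 80)}
```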


### Hands on practice (5 mins)
Please install NumPy and practice the following basics:
1. Install and import NumPy.
2. Create 3 different kinds of NumPy arrays, e.g. with specific numbers, with random values, or with a different data type.
3. Get the shape, dimension, data type, item size, and total size of the arrays.

## 2. Accessing/Changing specific elements, rows, columns, etc.

In [23]:
a = np.array([[1, 2, 3, 4, 5, 6, 7], [8, 9, 0, 1, 2, 3, 4]])
print(a)

[[1 2 3 4 5 6 7]
 [8 9 0 1 2 3 4]]


In [24]:
# Get a specific element [r, c]
print(a[1, 5])

3


In [25]:
# Get a specific row
print(a[0, :])

[1 2 3 4 5 6 7]


In [26]:
# Get a specific column
print(a[:, 2])

[3 0]


In [27]:
# Slice with [start:stop:step]; stop is exclusive and negative indices count from the end
print(a[0, 1:-2:2])

[2 4]


In [28]:
# Change values
a[1, 5] = 20
print(a)

a[:, 2] = [1, 2]
print(a)


[[ 1  2  3  4  5  6  7]
 [ 8  9  0  1  2 20  4]]
[[ 1  2  1  4  5  6  7]
 [ 8  9  2  1  2 20  4]]


In [29]:
# 3D example
b = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print(b)

[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]


In [30]:
# Get specific element (work outside in)
print(b[0, 1, 1])
print(b[:, 1, :])

4
[[3 4]
 [7 8]]


In [31]:
# Replace a sub-array (the right-hand side must match the shape of the selection)
b[:, 1, :] = [[9, 9], [8, 8]]
print(b)

[[[1 2]
  [9 9]]

 [[5 6]
  [8 8]]]
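
One detail that the examples above rely on is that an integer index removes a dimension, while a slice keeps it. The following sketch, on a made-up 3x4 array, shows the resulting shapes along with a few common slicing patterns:

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

row_1d = grid[1, :]      # integer index drops the axis -> shape (4,)
row_2d = grid[1:2, :]    # slice keeps the axis         -> shape (1, 4)
block = grid[:2, 1:3]    # rows 0-1, columns 1-2        -> [[1 2] [5 6]]
last_col = grid[:, -1]   # negative index = last column -> [ 3  7 11]
flipped = grid[::-1]     # negative step reverses the row order

print(row_1d.shape, row_2d.shape)
print(block)
print(last_col)
```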


### Practice time (5 mins)
Please practice the following:
1. Create different arrays, such as all zeros, all NaNs, filled with a specific number, or filled with random numbers.
2. Copy an array in different ways and check their differences.

## 3. Array Operations
NumPy arrays support the standard arithmetic operations, which are applied element-wise.

In [32]:
# Mathematics
a = np.array([1, 2, 3, 4])
print(a)

# Addition
print(a + 2)

# Subtraction
print(a - 2)

# Multiplication
print(a * 2)

# Division
print(a / 2)

# Power
print(a ** 2)

# Take the sin
print(np.sin(a))

# Take the cos
print(np.cos(a))



[1 2 3 4]
[3 4 5 6]
[-1  0  1  2]
[2 4 6 8]
[0.5 1.  1.5 2. ]
[ 1  4  9 16]
[ 0.84147098  0.90929743  0.14112001 -0.7568025 ]
[ 0.54030231 -0.41614684 -0.9899925  -0.65364362]


In [33]:
# Operation on 2 arrays
b = np.array([1, 2, 1, 2])

# Addition
print(a + b)

# Subtraction
print(a - b)

# Multiplication
print(a * b)

# Division
print(a / b)

[2 4 4 6]
[0 0 2 2]
[1 4 3 8]
[1. 1. 3. 2.]
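
The two arrays above have the same shape, but NumPy can also combine arrays of different shapes through broadcasting: dimensions of size 1 are stretched to match the other operand. Adding a scalar to an array, as in the earlier cell, is the simplest case. A small sketch with illustrative values:

```python
import numpy as np

col = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3])           # shape (3,)

grid = col + row                    # broadcast to shape (3, 3)
print(grid)
# [[ 1  2  3]
#  [11 12 13]
#  [21 22 23]]

# Shapes that cannot be aligned raise a ValueError.
try:
    np.array([1, 2, 3]) + np.array([1, 2])
    failed = False
except ValueError:
    failed = True
print(failed)  # True
```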


In [34]:
# Linear Algebra
a = np.ones((2, 3))
print(a)

b = np.full((3, 2), 2)
print(b)

# Matrix Multiplication
print(np.matmul(a, b))

# Find the determinant
c = np.identity(3)
print(np.linalg.det(c))

[[1. 1. 1.]
 [1. 1. 1.]]
[[2 2]
 [2 2]
 [2 2]]
[[6. 6.]
 [6. 6.]]
1.0
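
It is worth distinguishing `*`, which multiplies element by element, from matrix multiplication, which can be written with `np.matmul` or the `@` operator. The sketch below uses two small matrices picked for easy mental arithmetic:

```python
import numpy as np

m = np.array([[1, 2], [3, 4]])
swap = np.array([[0, 1], [1, 0]])

elementwise = m * swap      # [[0 2] [3 0]]
matrix_product = m @ swap   # same as np.matmul(m, swap): [[2 1] [4 3]]
det = np.linalg.det(m)      # 1*4 - 2*3 = -2, up to floating-point error

print(elementwise)
print(matrix_product)
print(round(det, 6))
```

For `a @ b` to be defined on 2-D arrays, the number of columns of `a` must equal the number of rows of `b`, as with the `(2, 3)` and `(3, 2)` arrays in the cell above.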


In [35]:
# Statistics
stats = np.array([[1, 2, 3], [4, 5, 6]])
print(stats)

# Min
print(np.min(stats))

# Max
print(np.max(stats))

# Sum
print(np.sum(stats))

# axis=0 sums down the rows, giving one total per column
print(np.sum(stats, axis=0))

# axis=1 sums across the columns, giving one total per row
print(np.sum(stats, axis=1))

[[1 2 3]
 [4 5 6]]
1
6
21
[5 7 9]
[ 6 15]
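
The `axis` argument names the dimension that is collapsed by the reduction, and it works the same way for `mean`, `min`, `max`, and similar functions. Passing `keepdims=True` keeps the collapsed axis with length 1, which is convenient for broadcasting the result back against the original array. A sketch reusing the same values:

```python
import numpy as np

stats = np.array([[1, 2, 3], [4, 5, 6]])

col_means = np.mean(stats, axis=0)                 # [2.5 3.5 4.5]
row_means = np.mean(stats, axis=1)                 # [2. 5.]
row_sums = np.sum(stats, axis=1, keepdims=True)    # shape (2, 1): [[ 6] [15]]

# Dividing by the (2, 1) sums broadcasts across columns, so each row sums to 1.
row_fractions = stats / row_sums

print(col_means)
print(row_means)
print(row_fractions.sum(axis=1))
```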


### Reorganizing Arrays (reshape, vstack, hstack)

In [36]:
# Reorganizing Arrays
before = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(before)

after = before.reshape((4, 2))
print(after)

# Vertically stacking vectors
v1 = np.array([1, 2, 3, 4])
v2 = np.array([5, 6, 7, 8])

print(np.vstack([v1, v2, v1, v2]))

# Horizontal stack
h1 = np.ones((2, 4))
h2 = np.zeros((2, 2))

print(np.hstack((h1, h2)))

[[1 2 3 4]
 [5 6 7 8]]
[[1 2]
 [3 4]
 [5 6]
 [7 8]]
[[1 2 3 4]
 [5 6 7 8]
 [1 2 3 4]
 [5 6 7 8]]
[[1. 1. 1. 1. 0. 0.]
 [1. 1. 1. 1. 0. 0.]]
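
For 1-D inputs the stacking functions behave differently from one another, which is a common source of confusion. The sketch below compares the resulting shapes, using `np.concatenate` and `np.stack` as additional points of reference:

```python
import numpy as np

v1 = np.array([1, 2, 3, 4])
v2 = np.array([5, 6, 7, 8])

rows = np.vstack([v1, v2])            # each vector becomes a row     -> shape (2, 4)
joined = np.hstack([v1, v2])          # vectors are joined end to end -> shape (8,)
same = np.concatenate([v1, v2])       # equivalent to hstack for 1-D input
columns = np.stack([v1, v2], axis=1)  # each vector becomes a column  -> shape (4, 2)

print(rows.shape, joined.shape, same.shape, columns.shape)
```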


### Indexing
You can access elements of a NumPy array using indices:

In [37]:
a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [38]:
# Indexing with a list of booleans (or a boolean mask)
print(a[[True, False, True, False, True, False, True, False, True]])
print(a[a > 5])

[1 3 5 7 9]
[6 7 8 9]


In [39]:
# Index with a list of integer positions (fancy indexing)
print(a[[1, 2, 8]])

[2 3 9]


In [40]:
# Any and All
print(a > 5)
print(np.any(a > 5))
print(np.all(a > 5))

[False False False False False  True  True  True  True]
True
False
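
Boolean masks can be combined with `&` (and), `|` (or), and `~` (not); each comparison needs its own parentheses because these operators bind more tightly than `>` and `<`. Masks can also be used to count, assign, or select conditionally with `np.where`. A short sketch using the same values as above:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

middle = a[(a > 2) & (a < 7)]          # [3 4 5 6]
count = np.count_nonzero(a > 5)        # 4 elements are greater than 5
clipped = np.where(a > 5, a, 0)        # keep values > 5, replace the rest with 0

capped = a.copy()
capped[capped > 5] = 5                 # assign through a mask

print(middle)
print(count)
print(clipped)   # [0 0 0 0 0 6 7 8 9]
print(capped)    # [1 2 3 4 5 5 5 5 5]
```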


In [41]:
# Challenge (requires a comma-separated file named data.txt in the working directory)
filedata = np.genfromtxt('data.txt', delimiter=',')
filedata = filedata.astype('int32')
print(filedata)

# You can index with a list in NumPy

print(filedata[[2, 4, 6, 8, 10, 12, 14, 16, 18, 20]])
print(np.any(filedata > 50, axis=0))
print(np.all(filedata > 50, axis=0))
print((~((filedata > 50) & (filedata < 100))))

FileNotFoundError: data.txt not found.
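
The cell above fails because `data.txt` is not present. To try the same operations without the file, `np.genfromtxt` also accepts a file-like object. In the sketch below, the text in `sample_text` is a made-up stand-in and does not reproduce the contents of the original `data.txt`.

```python
import io
import numpy as np

# Hypothetical stand-in for data.txt: three rows of comma-separated integers.
sample_text = "1,13,21\n3,42,88\n7,99,150\n"

filedata = np.genfromtxt(io.StringIO(sample_text), delimiter=",")
filedata = filedata.astype("int32")
print(filedata)

any_over_50 = np.any(filedata > 50, axis=0)               # per column: [False  True  True]
all_over_50 = np.all(filedata > 50, axis=0)               # per column: [False False False]
outside_range = ~((filedata > 50) & (filedata < 100))     # False only for 88 and 99

print(any_over_50)
print(all_over_50)
print(outside_range)
```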

### Slicing
You can slice NumPy arrays much like Python lists:

In [5]:
arr = np.array([1, 2, 3, 4, 5])
print(arr[1:4])  # prints [2 3 4]

[2 3 4]


### Shape and Reshape
You can get the shape of an array using the `shape` attribute and change the shape using the `reshape` method:

In [6]:
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr.shape)  # prints (2, 3)

reshaped_arr = arr.reshape(3, 2)
print(reshaped_arr)

(2, 3)
[[1 2]
 [3 4]
 [5 6]]
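
Two further points about `reshape` are sketched below: one dimension can be given as `-1` so that NumPy infers it from the total size, and the new shape must contain exactly the same number of elements as the original. For a contiguous array like this one, the reshaped result is a view that shares memory with the original.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

inferred = arr.reshape(3, -1)   # -1 is inferred as 2 -> shape (3, 2)
flat = arr.reshape(-1)          # flatten to shape (6,)

print(inferred.shape, flat.shape)
print(np.shares_memory(arr, inferred))  # True: no data was copied here

# 6 elements cannot fill a (4, 2) array, so this raises a ValueError.
try:
    arr.reshape(4, 2)
    failed = False
except ValueError:
    failed = True
print(failed)  # True
```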


### Practice Time

Time: 5 mins

1. Read the example dataset.
2. Generate another random array with the same size.
3. Compare their statistics such as mean, min, max, etc.
4. Perform operations on these two arrays.

## Further Reading

NumPy is a fundamental and widely used Python package, and there are many free online resources for exploring it further.

For example:

1. Check the [official documentation](https://numpy.org/doc/stable/user/absolute_beginners.html), which provides very helpful descriptions and examples.
2. [Numpy Tutorial (2022): For Physicists, Engineers, and Mathematicians](https://www.youtube.com/watch?v=DcfYgePyedM&ab_channel=Mr.PSolver)
3. [Numpy for Machine Learning](https://www.youtube.com/playlist?list=PLCC34OHNcOtpalASMlX2HHdsLNipyyhbK)
4. [NumPy Explained - Full Course (3 Hrs)](https://www.youtube.com/watch?v=eClQWW_gbFk&ab_channel=GormAnalysis)