# Introduction to NumPy for Beginners

---

## Chapter 1: Why Use NumPy?

NumPy (Numerical Python) is the foundation of scientific computing in Python. We use it for two main reasons: **performance** and **convenience**.

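NumPy arrays are usually much faster than Python lists for numerical work because they store elements of a single type in a contiguous block of memory, so operations can run as optimized loops written in C. Applying one operation to a whole array at once, instead of writing a Python loop, is called **vectorization**.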

In [5]:
!pip install numpy



In [None]:
import numpy as np


In [4]:
import time

# Define a large number of elements for our test
size = 1_000_000

# Create a standard Python list and a NumPy array
py_list = list(range(size))
np_array = np.arange(size)

# --- Test Python List Performance ---
start_time = time.perf_counter()  # High-resolution clock intended for measuring short durations
py_list_doubled = [x * 2 for x in py_list]  # Double each element with a list comprehension
end_time = time.perf_counter()
print(f"Python List (loop) took: {(end_time - start_time) * 1000:.2f} ms")

# --- Test NumPy Array Performance ---
start_time = time.perf_counter()
np_array_doubled = np_array * 2  # Double each element using a single vectorized operation
end_time = time.perf_counter()
print(f"NumPy Array (vectorization) took: {(end_time - start_time) * 1000:.2f} ms")

Python List (loop) took: 62.17 ms
NumPy Array (vectorization) took: 2.90 ms


---

## Chapter 2: Creating Arrays & Understanding Shape

### 2.1 One-Dimensional Arrays
The most basic way to create a NumPy array is from a Python list. A flat list gives a 1D array; nested lists give 2D and 3D arrays.

In [26]:
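arr = np.array([1, 2, 3, 4, 5], dtype=float)            # 1D array of floats
a2D = np.array([[1.99, 2], [3, 4], [4, 6]], dtype=int)  # 2D; dtype=int truncates 1.99 to 1

# A list of lists like this one is still 2D, even with three rows:
# a3D = np.array([[1, 2],
#                 [3, 4],
#                 [5, 6]])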

a3D = np.array([[[1, 2],
                  [3, 4]],
                 [[5, 6],
                   [7, 8]]])
print(a3D)
# print(a2D)
# print(arr)

[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]


### 2.2 Understanding Array Attributes (1D)
Once an array is created, we can inspect its properties, called **attributes**.

In [17]:
print(arr.ndim)
print(a2D.ndim)
print(a3D.ndim)

1
2
3


In [24]:
print(a2D.shape)
print(a3D.shape)

(3, 2)
(2, 2, 2)


In [28]:
print(arr.size)
print(a2D.size)
print(a3D.size)

5
6
8


In [30]:
print(arr.dtype)
print(a2D.dtype)

float64
int64


### 2.3 Multi-Dimensional Arrays
To create a 2D array (a matrix), we use a list of lists.

In [19]:
b = np.array([[1, 2, 3, 4, 5], [1, 2, 3, 4, 5]])

print(b)

[[1 2 3 4 5]
 [1 2 3 4 5]]


In [20]:
b.size

10

In [21]:
b.dtype

dtype('int64')

### 2.4 Understanding Array Attributes (2D)
Let's look at the attributes for our new 2D array. Notice how `.ndim` and `.shape` have changed.

In [22]:
b.ndim

2

In [23]:
b.shape

(2, 5)

---

## Chapter 3: Reshaping and Creating Arrays from Shapes

### 3.1 Creating Arrays with Sequences
First, let's learn how to create simple 1D sequences.

In [38]:
# c = np.arange(0, 11, 2)
# c

# np.arange(start, stop, step); stop is excluded
c = np.arange(0, 11, 2, dtype=float)
c

array([ 0.,  2.,  4.,  6.,  8., 10.])

In [48]:
# d = np.linspace(start=0, stop=5, num=5, retstep=True)  # from 0 to 5; num sets how many evenly spaced values to generate
d = np.linspace(1, 4, 10, retstep=True)  # retstep=True returns a tuple: (array, step)
print(d)

(array([1.        , 1.33333333, 1.66666667, 2.        , 2.33333333,
       2.66666667, 3.        , 3.33333333, 3.66666667, 4.        ]), np.float64(0.3333333333333333))
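The two functions differ mainly in how the endpoint and spacing are chosen. A minimal sketch of that difference, using illustrative variable names (`evens`, `points`) that are not part of the notebook:

```python
import numpy as np

# np.arange(start, stop, step): stop is excluded, spacing is fixed by step.
evens = np.arange(0, 10, 2)

# np.linspace(start, stop, num): stop is included by default,
# and the spacing is derived from how many points you ask for.
points = np.linspace(0, 10, 5)

print(evens.tolist())   # [0, 2, 4, 6, 8]
print(points.tolist())  # [0.0, 2.5, 5.0, 7.5, 10.0]
```

As a rule of thumb, `np.arange` fits when you know the step, and `np.linspace` fits when you know how many points you need.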


### 3.2 The `.reshape()` Method
Now that we understand what `.shape` is, we can learn how to change it. The `.reshape()` method allows you to rearrange the elements of an array into a new shape. The only rule is that the `size` (total number of elements) of the new shape must be the same as the original.
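The size rule can be checked directly. The sketch below uses an illustrative 12-element array (`values`, a name not used elsewhere in the notebook) and catches the error NumPy raises when the element counts do not match:

```python
import numpy as np

values = np.arange(12)        # 12 elements

grid = values.reshape(3, 4)   # 3 * 4 == 12, so this works
print(grid.shape)             # (3, 4)

try:
    values.reshape(5, 3)      # 5 * 3 == 15, which is not 12
except ValueError as err:
    reshape_error = str(err)
    print("ValueError:", reshape_error)
```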

In [60]:
c = np.arange(16)
print(c)
arr = list(range(10))
arr

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15]


[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [62]:
# d = c.reshape(4, 4)  # a mismatched shape such as (5, 3) raises ValueError: cannot reshape array of size 16 into shape (5,3)
# d

arr = np.reshape(c, (-1, 2))
arr

array([[ 0,  1],
       [ 2,  3],
       [ 4,  5],
       [ 6,  7],
       [ 8,  9],
       [10, 11],
       [12, 13],
       [14, 15]])

### 3.3 The `-1` Trick in Reshape
You can use `-1` as a placeholder for one of the dimensions. NumPy will automatically calculate what that dimension should be.

In [28]:
c = np.arange(12)     # 12 elements for this example
d = c.reshape(6, -1)  # NumPy infers 2 columns. With only 12 elements, 13 rows is impossible: ValueError: cannot reshape array of size 12 into shape (13,newaxis)
d

array([[ 0,  1],
       [ 2,  3],
       [ 4,  5],
       [ 6,  7],
       [ 8,  9],
       [10, 11]])

In [29]:
d = c.reshape(-1, 6)  # (-1, -1) raises ValueError: can only specify one unknown dimension
d

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11]])

### 3.4 Creating Arrays with a Given Shape
Now that you understand shape tuples like `(rows, columns)`, you can learn the functions that create arrays directly from a desired shape.

In [63]:
zeros_arr = np.zeros((2, 3), dtype=int)
# Create an array filled with zeros; np.zeros takes a shape tuple (rows, columns)
zeros_arr

array([[0, 0, 0],
       [0, 0, 0]])

In [31]:
ones_arr = np.ones((3, 4), dtype=int)
ones_arr

array([[1, 1, 1, 1],
       [1, 1, 1, 1],
       [1, 1, 1, 1]])

In [32]:
zeros_like_arr = np.zeros_like(ones_arr, dtype=int)
zeros_like_arr

array([[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]])

In [76]:
full_arr = np.full((2, 2), 100)
full_arr

array([[100, 100],
       [100, 100]])

In [67]:
eye_arr = np.eye(5)
eye_arr

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

---

## Chapter 4: 1D Arrays vs. 2D Vectors (A Critical Distinction)

This is a common point of confusion. A 1D NumPy array is **neither a row vector nor a column vector**. It has no orientation.

- **1D Array:** Has a shape of `(n,)`. It's a flat list of `n` elements.
- **Row Vector:** Is a **2D array** with a shape of `(1, n)`. It has one row and `n` columns.
- **Column Vector:** Is a **2D array** with a shape of `(n, 1)`. It has `n` rows and one column.

This distinction is critical for linear algebra and machine learning.

In [35]:
a = np.array([1, 2, 3])       # shape (3,)

# Row Vector (2D)
row_vec = a.reshape(1, -1)    # shape (1, 3)

# Column Vector (2D)
col_vec = a.reshape(-1, 1)    # shape (3, 1)

print("1D Array:", a.shape)
print("Row Vector:", row_vec.shape)
print("Column Vector:", col_vec.shape)

1D Array: (3,)
Row Vector: (1, 3)
Column Vector: (3, 1)
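To see why the distinction matters in practice, the following sketch compares how a 1D array and the two 2D vectors behave under transpose and addition. The names `row` and `col` are illustrative:

```python
import numpy as np

a = np.array([1, 2, 3])     # 1D, shape (3,)
row = a.reshape(1, -1)      # row vector, shape (1, 3)
col = a.reshape(-1, 1)      # column vector, shape (3, 1)

# Transposing a 1D array does nothing, because it has only one axis.
print(a.T.shape)            # (3,)

# Transposing a 2D row vector gives a column vector.
print(row.T.shape)          # (3, 1)

# Orientation changes results: column + row broadcasts to a 3x3 grid,
# while 1D + 1D stays element-wise.
print((col + row).shape)    # (3, 3)
print((a + a).shape)        # (3,)
```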


---

## Chapter 5: Views vs. Copies

This is another vital concept.

- A **Copy** is a new array with its own data. Changes to the copy will not affect the original.
- A **View** is just a different way of looking at the *same* data. Changes to the view **will** affect the original array.

**Important:** Basic slicing creates a view, not a copy. Fancy and Boolean indexing create copies.

In [103]:
arr = np.arange(stop=101, step=10)
arr

array([  0,  10,  20,  30,  40,  50,  60,  70,  80,  90, 100])


In [91]:
slice_view = arr[::-1]  # [start:stop:step]; stop is exclusive, and a step of -1 reverses the array
slice_view

array([100,  90,  80,  70,  60,  50,  40,  30,  20,  10,   0])

In [104]:
slice_copy = arr[8:].copy()  # .copy() makes an independent array
slice_copy[2] = 999          # changes the copy only
print(slice_copy)
print(arr)                   # the original is unchanged

[ 80  90 999]
[  0  10  20  30  40  50  60  70  80  90 100]
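The cell above modifies an explicit copy. For contrast, here is a minimal sketch of what happens with a plain slice, which is a view. The names `original`, `view` and `independent` are illustrative, and `np.shares_memory` is used to confirm whether two arrays share a buffer:

```python
import numpy as np

original = np.arange(0, 101, 10)   # 0, 10, ..., 100

view = original[8:]                # basic slice: a view onto the same data
view[0] = -1                       # writes through to `original`

print(original[8])                              # -1
print(np.shares_memory(original, view))         # True

independent = original[8:].copy()  # explicit copy: separate data
independent[1] = 999               # `original` is not touched

print(original[9])                              # 90
print(np.shares_memory(original, independent))  # False
```

If you need to change a slice without touching the source array, call `.copy()` first.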


In [107]:
a = np.arange(101)
a

array([  0,   1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,
        13,  14,  15,  16,  17,  18,  19,  20,  21,  22,  23,  24,  25,
        26,  27,  28,  29,  30,  31,  32,  33,  34,  35,  36,  37,  38,
        39,  40,  41,  42,  43,  44,  45,  46,  47,  48,  49,  50,  51,
        52,  53,  54,  55,  56,  57,  58,  59,  60,  61,  62,  63,  64,
        65,  66,  67,  68,  69,  70,  71,  72,  73,  74,  75,  76,  77,
        78,  79,  80,  81,  82,  83,  84,  85,  86,  87,  88,  89,  90,
        91,  92,  93,  94,  95,  96,  97,  98,  99, 100])

---

## Chapter 6: Indexing and Slicing

### 6.1 Basic and 2D Slicing
The syntax is `arr[start:stop:step]`, which can be applied to each dimension.

In [111]:
matrix = np.arange(25).reshape(5, 5)
matrix

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

In [121]:
matrix[::2, ::-1]  # [start:stop:step, start:stop:step]: every second row, columns reversed

array([[ 4,  3,  2,  1,  0],
       [14, 13, 12, 11, 10],
       [24, 23, 22, 21, 20]])

### 6.2 Boolean Indexing
Use conditions to filter data directly. Combine conditions with `&` (and) and `|` (or). **Crucially, you must wrap each condition in parentheses.**

In [128]:
data = np.arange(1, 11)
data

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [None]:
condition1 = (data % 2 == 0) | (data == 9)  # even numbers, or exactly 9
print(condition1)
print(data[condition1])

print(data[data > 5])

[False  True False  True False  True False  True  True  True]
[ 2  4  6  8  9 10]
[ 6  7  8  9 10]


In [43]:
condition2 = (data <= 2) | (data > 8)
# condition2
data[condition2]

array([ 1,  2,  9, 10])
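The parentheses are needed because of Python's operator precedence: `|` and `&` bind more tightly than comparisons such as `==`. The sketch below shows what goes wrong without them. The `raised` flag is only there to make the outcome visible:

```python
import numpy as np

data = np.arange(1, 11)

# With parentheses: each comparison runs first, then the masks are combined.
mask = (data % 2 == 0) | (data == 9)
print(data[mask].tolist())   # [2, 4, 6, 8, 9, 10]

# Without parentheses, Python parses the expression as the chained comparison
#   data % 2 == (0 | data) == 9
# which needs a single True/False value from a whole array, so it fails.
try:
    data % 2 == 0 | data == 9
    raised = False
except ValueError:
    raised = True
print(raised)                # True
```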

In [44]:
data2 = np.array(["Hello", "hello"])
condition3 = np.char.startswith(data2, 'h')
data2[condition3]

array(['hello'], dtype='<U5')

### 6.3 Fancy Indexing
Use lists of integers to select data in a specific order. Remember, this creates a copy.

In [133]:
arr = np.arange(100, 110)
arr

array([100, 101, 102, 103, 104, 105, 106, 107, 108, 109])

In [138]:
arr[[5, 6, 7]]

array([105, 106, 107])
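A short sketch of both points, order and copy semantics, using an illustrative variable `picked`:

```python
import numpy as np

arr = np.arange(100, 110)

picked = arr[[7, 5, 5]]      # any order, and repeats are allowed
print(picked.tolist())       # [107, 105, 105]

picked[0] = -1               # modifies the copy only
print(int(arr[7]))           # 107
print(np.shares_memory(arr, picked))   # False
```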

---

## Chapter 7: NumPy Operations

### 7.1 Vectorization & Universal Functions (Ufuncs)
A ufunc is a function that operates on arrays in an element-by-element fashion. They are the core of NumPy's vectorized operations.

In [151]:
x = np.array([1, 4, 9, 16])
y = np.array([1, 2, 3, 4])

In [144]:
x+y

array([ 2,  6, 12, 20])

In [145]:
np.add(x, y)  # the same ufunc that x + y calls; any run-time difference is negligible

array([ 2,  6, 12, 20])

In [146]:
x-y

array([ 0,  2,  6, 12])

In [147]:
np.subtract(x, y)

array([ 0,  2,  6, 12])

In [152]:
x @ y  # dot product of two 1D arrays: 1*1 + 4*2 + 9*3 + 16*4

np.int64(100)

In [None]:
np.sqrt(x).astype(int)

array([1, 2, 3, 4])

In [None]:
np.power(x, 1/3) 

array([1.        , 1.58740105, 2.08008382, 2.5198421 ])

In [52]:
list_x = np.array([1, 4, 9, 16])
list_y = np.array([1, 2, 3, 4])
list_x + list_y


array([ 2,  6, 12, 20])
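To connect the operator and function forms, the sketch below checks that `np.add` is a ufunc object and that it matches both `x + y` and an explicit Python loop. The name `looped` is illustrative:

```python
import numpy as np

x = np.array([1, 4, 9, 16])
y = np.array([1, 2, 3, 4])

# np.add is a ufunc object; the + operator on arrays gives the same result.
print(isinstance(np.add, np.ufunc))          # True
print(np.array_equal(x + y, np.add(x, y)))   # True

# The equivalent element-by-element Python loop, for comparison.
looped = np.array([a + b for a, b in zip(x, y)])
print(np.array_equal(looped, x + y))         # True
```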

### 7.2 Broadcasting
Broadcasting describes how NumPy treats arrays with different shapes during arithmetic operations. The smaller array is "broadcast" across the larger one so that they have compatible shapes.

In [161]:
matrix = np.ones((3, 4), dtype=int)
arr = np.arange(2, 9, 2)

print(matrix)
print(arr)


[[1 1 1 1]
 [1 1 1 1]
 [1 1 1 1]]
[2 4 6 8]


In [162]:
print(arr + matrix)  # arr has shape (4,), matching matrix's 4 columns, so it is broadcast across all 3 rows

[[3 5 7 9]
 [3 5 7 9]
 [3 5 7 9]]
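Broadcasting compares shapes from the right: two dimensions are compatible when they are equal or when one of them is 1. The following sketch illustrates one compatible row, one compatible column, and one incompatible case. The names `row`, `col` and `broadcast_failed` are illustrative:

```python
import numpy as np

matrix = np.ones((3, 4), dtype=int)
row = np.arange(2, 9, 2)             # shape (4,)
col = np.array([[10], [20], [30]])   # shape (3, 1)

print((matrix + row).shape)   # (3, 4): (4,) is treated as (1, 4)
print((matrix + col).shape)   # (3, 4): the single column is stretched over 4 columns

# (3,) against (3, 4): the trailing dimensions are 3 and 4, which do not match.
try:
    matrix + np.array([1, 2, 3])
    broadcast_failed = False
except ValueError:
    broadcast_failed = True
print(broadcast_failed)       # True
```

Reshaping a 1D array to `(n, 1)` is the usual way to make it broadcast down the rows instead of across the columns.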


### 7.3 Statistical Operations
NumPy has a rich suite of statistical functions. The most crucial concept is the **`axis`** parameter, which specifies the dimension to operate along.

#### Finding Indices with `argmin` and `argmax`
Instead of getting the min/max value, these functions give you the *index* of that value. This is useful in machine learning, for example to turn a row of class scores into a predicted class label.
- **`axis=0`**: Finds the index of the min/max value in each **column**.
- **`axis=1`**: Finds the index of the min/max value in each **row**.

In [166]:
data = np.array([[10, 7, 4], [1, 2, 3], [9, 99, 999]])
data

array([[ 10,   7,   4],
       [  1,   2,   3],
       [  9,  99, 999]])

In [167]:
# axis=0 runs down the rows, giving one result per column
# axis=1 runs across the columns, giving one result per row
print(data.min(axis=0))
print(data.max(axis=0))

print(data.max(axis=1))

[1 2 3]
[ 10  99 999]
[ 10   3 999]


In [169]:
print(data.argmax(axis=1))  # index of the max within each row
print(data.argmin(axis=0))  # index of the min within each column

[0 2 2]
[1 1 1]
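One way to remember `axis` is that the axis you name is the one that disappears from the result. The sketch below shows this with `sum`, then uses the indices from `argmax` to look the values back up. The names `best_col` and `row_max` are illustrative:

```python
import numpy as np

data = np.array([[10, 7, 4], [1, 2, 3], [9, 99, 999]])

print(data.sum(axis=0).tolist())   # [20, 108, 1006]  one value per column
print(data.sum(axis=1).tolist())   # [21, 6, 1107]    one value per row

# argmax returns positions; index with them to recover the values.
best_col = data.argmax(axis=1)                       # column index of each row's max
row_max = data[np.arange(data.shape[0]), best_col]   # pair each row with its column
print(row_max.tolist())            # [10, 3, 999]
```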


### 7.4 Linear Algebra
The `np.linalg` submodule contains core linear algebra functionality.

In [58]:
A = np.arange(1, 5).reshape(2, 2)
B = np.arange(5, 9).reshape(2, 2)

print(A)
print(B)

[[1 2]
 [3 4]]
[[5 6]
 [7 8]]


In [59]:
A.T

array([[1, 3],
       [2, 4]])

In [60]:
A @ B  # matrix product, same as np.dot(A, B)

array([[19, 22],
       [43, 50]])

In [61]:
np.linalg.inv(A) # Invert matrix

array([[-2. ,  1. ],
       [ 1.5, -0.5]])

In [62]:
np.linalg.inv(A) @ A

array([[1.00000000e+00, 0.00000000e+00],
       [1.11022302e-16, 1.00000000e+00]])
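The tiny `1.11e-16` entry above is floating-point rounding, not a mistake in the inverse. A minimal sketch of how to compare against the identity matrix with a tolerance, using `np.allclose` and the illustrative name `product`:

```python
import numpy as np

A = np.arange(1, 5).reshape(2, 2)
product = np.linalg.inv(A) @ A

# Compare with a tolerance instead of relying on exact equality,
# which can fail because of rounding.
print(np.allclose(product, np.eye(2)))   # True
```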

In [63]:
np.linalg.det(A)

np.float64(-2.0000000000000004)

### 7.5 Modern Random Number Generation
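For reproducible science, it's best to create a **Generator** instance with `default_rng` and seed it. The first three cells below use the legacy `np.random` functions for comparison; the remaining cells use a seeded Generator.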

In [64]:
(np.random.rand(3, 5)*100//10).astype(int)

array([[7, 1, 1, 6, 1],
       [4, 7, 7, 5, 9],
       [4, 4, 4, 3, 4]])

In [None]:
(np.random.rand(5, 5)*10).astype(int)

array([[7, 3, 5, 9, 5],
       [8, 0, 1, 0, 3],
       [5, 9, 1, 9, 0],
       [1, 7, 2, 9, 2],
       [3, 3, 6, 5, 0]])

In [320]:
np.random.randint(1, 6, size=(3, 3))

array([[3, 4, 1],
       [3, 2, 2],
       [4, 4, 5]], dtype=int32)

In [399]:
from numpy.random import default_rng
import sys

rng = default_rng(seed=sys.maxsize)  # any fixed integer seed makes the sequence repeatable
rng.integers(10, size=5)

array([4, 1, 2, 4, 6])

In [476]:
rng.standard_normal(10)

array([ 0.98392154, -1.44081813,  1.76648489,  0.13567687, -0.87472528,
       -1.5606416 ,  2.58360278,  0.13459948,  1.09168312,  0.44742714])

In [490]:
deck = np.arange(10)
rng.shuffle(deck)
deck

array([3, 7, 5, 4, 9, 8, 0, 6, 1, 2])

In [510]:
rng.choice(deck, size=3, replace=True)  # with replace=False, no value is drawn twice

array([1, 0, 7])
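To show what seeding buys you, the sketch below creates two generators with the same seed and checks that they produce identical draws. The seed value `42` and the names `rng_a`, `rng_b` and `hand` are arbitrary choices for illustration:

```python
import numpy as np

rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)

draw_a = rng_a.integers(10, size=5)
draw_b = rng_b.integers(10, size=5)
print(np.array_equal(draw_a, draw_b))   # True: same seed, same sequence

# replace=False guarantees that the sampled values are all distinct.
hand = rng_a.choice(np.arange(10), size=3, replace=False)
print(len(set(hand.tolist())))          # 3
```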