# Introduction to NumPy for Beginners

---

## Chapter 1: Why Use NumPy?

NumPy (Numerical Python) is the foundation of scientific computing in Python. We use it for two main reasons: **performance** and **convenience**.

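NumPy arrays are much faster than Python lists for numerical work because they store elements of a single type in a contiguous block of memory, which lets NumPy run operations in optimized, compiled C code. Applying one operation to a whole array at once, instead of writing a Python loop, is called **vectorization**.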

In [59]:
!pip install numpy



In [60]:
import numpy as np

In [61]:
import time

# Define a large number of elements for our test
size = 1_000_000

# Create a standard Python list and a NumPy array
py_list = list(range(size))
np_array = np.arange(size)

# --- Test Python List Performance ---
start_time = time.perf_counter()  # perf_counter is a high-resolution clock intended for timing
py_list_doubled = [x * 2 for x in py_list]  # Double each element with a list comprehension (a Python-level loop)
end_time = time.perf_counter()
print(f"Python List (loop) took: {(end_time - start_time) * 1000:.2f} ms")

# --- Test NumPy Array Performance ---
start_time = time.perf_counter()
np_array_doubled = np_array * 2  # Double each element using a single vectorized operation
end_time = time.perf_counter()
print(f"NumPy Array (vectorization) took: {(end_time - start_time) * 1000:.2f} ms")

Python List (loop) took: 35.12 ms
NumPy Array (vectorization) took: 1.27 ms
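
A single timing like the one above can be noisy. As a rough sketch, assuming a smaller array so that it runs quickly, the standard-library `timeit` module can repeat each measurement and keep the best run. The variable names `list_best` and `numpy_best` are illustrative only, and the timings will differ from machine to machine.

```python
import timeit

import numpy as np

size = 100_000  # smaller than the cell above so the sketch runs quickly
py_list = list(range(size))
np_array = np.arange(size)

# Run each statement 20 times per measurement, take 5 measurements, keep the best
list_best = min(timeit.repeat(lambda: [x * 2 for x in py_list], number=20, repeat=5))
numpy_best = min(timeit.repeat(lambda: np_array * 2, number=20, repeat=5))

print(f"Python list: {list_best / 20 * 1000:.3f} ms per run")
print(f"NumPy array: {numpy_best / 20 * 1000:.3f} ms per run")

# Both approaches compute the same values
same_result = [x * 2 for x in py_list] == (np_array * 2).tolist()
print(same_result)
```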


---

## Chapter 2: Creating Arrays & Understanding Shape

### 2.1 One-Dimensional Arrays
The most basic way to create a NumPy array is from a Python list.

In [62]:
arr = np.array([1, 2, 3, 4, 5])

print(arr)

[1 2 3 4 5]


### 2.2 Understanding Array Attributes (1D)
Once an array is created, we can inspect its properties, called **attributes**.

In [63]:
arr.ndim

1

In [64]:
arr.shape

(5,)

In [65]:
arr.size

5

In [66]:
arr.dtype

dtype('int64')

### 2.3 Multi-Dimensional Arrays
To create a 2D array (a matrix), we use a list of lists.

In [67]:
b = np.array([[1, 2, 3, 4, 5], [1, 2, 3, 4, 5]])

print(b)

[[1 2 3 4 5]
 [1 2 3 4 5]]


In [68]:
b.size

10

In [69]:
b.dtype

dtype('int64')

### 2.4 Understanding Array Attributes (2D)
Let's look at the attributes for our new 2D array. Notice how `.ndim` and `.shape` have changed.

In [70]:
b.ndim

2

In [71]:
b.shape

(2, 5)
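
The attributes are related to one another, which a short sketch can make concrete. It rebuilds the same 2D array as above; the names `rows`, `cols`, `n_dims`, and `n_elements` are illustrative.

```python
import numpy as np

b = np.array([[1, 2, 3, 4, 5], [1, 2, 3, 4, 5]])

rows, cols = b.shape          # unpack the shape tuple
n_dims = len(b.shape)         # same as b.ndim
n_elements = rows * cols      # same as b.size

print(n_dims == b.ndim)       # True
print(n_elements == b.size)   # True
print(len(b))                 # 2: len() reports the size of the first axis only
```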

---

## Chapter 3: Reshaping and Creating Arrays from Shapes

### 3.1 Creating Arrays with Sequences
First, let's learn how to create simple 1D sequences.

In [72]:
c = np.arange(0, 11, 2)
c

array([ 0,  2,  4,  6,  8, 10])

In [73]:
d = np.linspace(0, 5, 10)  # from 0 to 5; the last argument is how many evenly spaced points to generate
d  # 10 points

array([0.        , 0.55555556, 1.11111111, 1.66666667, 2.22222222,
       2.77777778, 3.33333333, 3.88888889, 4.44444444, 5.        ])
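
The two functions differ in what the last argument means and in whether the end value is included. A minimal sketch of the difference, using values chosen only for illustration:

```python
import numpy as np

# arange: fixed step, stop value excluded
stepped = np.arange(0, 10, 2)          # 0, 2, 4, 6, 8

# linspace: fixed number of points, stop value included by default
spaced = np.linspace(0, 10, 5)         # 0.0, 2.5, 5.0, 7.5, 10.0

# The spacing linspace uses is (stop - start) / (num - 1)
spacing = (10 - 0) / (5 - 1)
print(stepped, spaced, spacing)
```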

### 3.2 The `.reshape()` Method
Now that we understand what `.shape` is, we can learn how to change it. The `.reshape()` method allows you to rearrange the elements of an array into a new shape. The only rule is that the `size` (total number of elements) of the new shape must be the same as the original.

In [74]:
c = np.arange(12)
c

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In [75]:
d = c.reshape(4, 3)  # reshape(5, 3) would raise ValueError: cannot reshape array of size 12 into shape (5,3)
d

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])
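
The size rule can be checked before reshaping. The sketch below uses a hypothetical helper, `can_reshape`, which is not part of NumPy, and then shows the error NumPy raises when the rule is broken.

```python
import numpy as np

c = np.arange(12)

def can_reshape(arr, shape):
    """Hypothetical helper: True if `shape` holds exactly arr.size elements."""
    return int(np.prod(shape)) == arr.size

print(can_reshape(c, (4, 3)))   # True:  4 * 3 == 12
print(can_reshape(c, (5, 3)))   # False: 5 * 3 == 15

try:
    c.reshape(5, 3)
except ValueError as err:
    error_message = str(err)
    print(error_message)
```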

### 3.3 The `-1` Trick in Reshape
You can use `-1` as a placeholder for one of the dimensions. NumPy will automatically calculate what that dimension should be.

In [76]:
d = c.reshape(6, -1)  # 12 elements cannot fill 13 rows: reshape(13, -1) raises ValueError: cannot reshape array of size 12 into shape (13,newaxis)
d

array([[ 0,  1],
       [ 2,  3],
       [ 4,  5],
       [ 6,  7],
       [ 8,  9],
       [10, 11]])

In [77]:
d = c.reshape(-1, 6)  # reshape(-1, -1) raises ValueError: can only specify one unknown dimension
d

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11]])
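
The value NumPy fills in for `-1` is simply the total size divided by the product of the dimensions you did specify. A short sketch of that arithmetic, with `inferred_cols` as an illustrative name:

```python
import numpy as np

c = np.arange(12)

# NumPy infers the missing dimension as size // (product of the known dimensions)
inferred_cols = c.size // 6
same = c.reshape(6, -1).shape == (6, inferred_cols)
print(same)                               # True

# -1 also works with more than two dimensions: 12 // (2 * 3) == 2
print(c.reshape(2, -1, 3).shape)          # (2, 2, 3)

# reshape(-1) flattens any array back to 1D
print(c.reshape(3, 4).reshape(-1).shape)  # (12,)
```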

### 3.4 Creating Arrays with a Given Shape
Now that you understand shape tuples like `(rows, columns)`, you can learn the functions that create arrays directly from a desired shape.

In [78]:
zeros_arr = np.zeros((2, 3))
# Create an array filled with zeros: np.zeros(shape), where shape is a tuple (rows, columns)
zeros_arr

array([[0., 0., 0.],
       [0., 0., 0.]])

In [79]:
ones_arr = np.ones((3, 4), dtype=int)
ones_arr

array([[1, 1, 1, 1],
       [1, 1, 1, 1],
       [1, 1, 1, 1]])

In [80]:
zeros_like_arr = np.zeros_like(ones_arr, dtype=int)
zeros_like_arr

array([[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]])

In [81]:
full_arr = np.full((5, 5), 999)
full_arr

array([[999, 999, 999, 999, 999],
       [999, 999, 999, 999, 999],
       [999, 999, 999, 999, 999],
       [999, 999, 999, 999, 999],
       [999, 999, 999, 999, 999]])

In [82]:
eye_arr = np.eye(3, dtype=int)
eye_arr

array([[1, 0, 0],
       [0, 1, 0],
       [0, 0, 1]])
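
The outputs above show both floats (`0.`) and integers (`1`), which comes down to each function's default `dtype`. The following sketch illustrates those defaults; the exact integer type depends on the platform, so the code only checks that it is some integer type.

```python
import numpy as np

print(np.zeros((2, 3)).dtype)            # float64: the default for zeros and ones
print(np.ones((2, 3), dtype=int).dtype)  # the platform's default integer type

# np.full infers its dtype from the fill value unless dtype is given
print(np.full((2, 2), 999).dtype)        # an integer dtype
print(np.full((2, 2), 9.5).dtype)        # float64

# zeros_like copies both the shape and the dtype of its argument
template = np.ones((3, 4), dtype=int)
print(np.zeros_like(template).dtype == template.dtype)  # True
```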

---

## Chapter 4: 1D Arrays vs. 2D Vectors (A Critical Distinction)

This is a common point of confusion. A 1D NumPy array is **neither a row vector nor a column vector**. It has no orientation.

- **1D Array:** Has a shape of `(n,)`. It's a flat list of `n` elements.
- **Row Vector:** Is a **2D array** with a shape of `(1, n)`. It has one row and `n` columns.
- **Column Vector:** Is a **2D array** with a shape of `(n, 1)`. It has `n` rows and one column.

This distinction is critical for linear algebra and machine learning.

In [117]:
a = np.array([1, 2, 3])       # shape (3,)

# Row Vector (2D)
row_vec = a.reshape(1, -1)    # shape (1, 3)

# Column Vector (2D)
col_vec = a.reshape(-1, 1)    # shape (3, 1)

print("1D Array:", a.shape)
print("Row Vector:", row_vec.shape)
print("Column Vector:", col_vec.shape)

1D Array: (3,)
Row Vector: (1, 3)
Column Vector: (3, 1)
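
One way to see why the distinction matters is to look at how transposition and matrix multiplication treat each form. This is a small sketch reusing the names from the cell above:

```python
import numpy as np

a = np.array([1, 2, 3])
row_vec = a.reshape(1, -1)
col_vec = a.reshape(-1, 1)

print(a.T.shape)                  # (3,): transposing a 1D array changes nothing
print(row_vec.T.shape)            # (3, 1): a transposed row vector is a column vector

print((row_vec @ col_vec).shape)  # (1, 1): inner product, value 14
print((col_vec @ row_vec).shape)  # (3, 3): outer product

# np.newaxis is another common way to add the missing dimension
print(a[:, np.newaxis].shape)     # (3, 1)
```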


---

## Chapter 5: Views vs. Copies

Whether an operation returns a view or a copy determines whether modifying the result also modifies the original array.

- A **Copy** is a new array with its own data. Changes to the copy will not affect the original.
- A **View** is just a different way of looking at the *same* data. Changes to the view **will** affect the original array.

**Important:** Basic slicing creates a view, not a copy. Fancy and Boolean indexing create copies.

In [83]:
arr = np.arange(stop=101, step=10)
arr

array([  0,  10,  20,  30,  40,  50,  60,  70,  80,  90, 100])


In [84]:
slice_view = arr[::-1]
slice_view

array([100,  90,  80,  70,  60,  50,  40,  30,  20,  10,   0])

In [85]:
slice_copy = arr[:5].copy()  # .copy() gives the slice its own data
slice_copy[0] = 9999         # so this change does not reach arr

print(slice_copy)
print(arr)

[9999   10   20   30   40]
[  0  10  20  30  40  50  60  70  80  90 100]
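
For contrast, here is a minimal sketch of what happens without `.copy()`. It rebuilds the same array and uses `np.shares_memory` to check whether two arrays may refer to the same underlying data; the names `view` and `fancy` are illustrative.

```python
import numpy as np

arr = np.arange(0, 101, 10)

view = arr[:5]                       # basic slicing: a view
view[0] = 9999                       # writes through to arr
print(arr[0])                        # 9999
print(np.shares_memory(arr, view))   # True

fancy = arr[[1, 2, 3]]               # fancy indexing: a copy
fancy[0] = -1                        # arr is unaffected
print(arr[1])                        # 10
print(np.shares_memory(arr, fancy))  # False
```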


---

## Chapter 6: Indexing and Slicing

### 6.1 Basic and 2D Slicing
The syntax is `arr[start:stop:step]`, which can be applied to each dimension.

In [86]:
matrix = np.arange(25).reshape(5, 5)
matrix

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

In [87]:
matrix[:2, :3]  # first two rows, first three columns

array([[0, 1, 2],
       [5, 6, 7]])
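
A few more patterns built from the same `start:stop:step` syntax, sketched on the same 5x5 matrix:

```python
import numpy as np

matrix = np.arange(25).reshape(5, 5)

print(matrix[1])          # second row: [5 6 7 8 9]
print(matrix[:, 1])       # second column: [ 1  6 11 16 21]
print(matrix[1:4, 2:])    # rows 1 to 3, columns 2 onward
print(matrix[::2, ::2])   # every other row and every other column
print(matrix[-1, -2:])    # last row, last two columns: [23 24]
```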

### 6.2 Boolean Indexing
Use conditions to filter data directly. Combine conditions with `&` (and) and `|` (or). **Crucially, you must wrap each condition in parentheses**, because `&` and `|` take precedence over comparison operators such as `>` and `==`.

In [88]:
data = np.arange(1, 11)
data

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [89]:
condition1 = (data > 3) & (data % 2 == 0)
data[condition1]

array([ 4,  6,  8, 10])

In [90]:
condition2 = (data <= 2) | (data > 8)
# condition2
data[condition2]

array([ 1,  2,  9, 10])

In [91]:
data2 = np.array(["Hello", "hello"])
condition3 = np.char.startswith(data2, 'h')
data2[condition3]

array(['hello'], dtype='<U5')
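
The parentheses rule is easiest to appreciate by leaving them out. The sketch below shows the resulting error, the corrected version, and `~` for negating a mask; the condition itself is chosen only for illustration.

```python
import numpy as np

data = np.arange(1, 11)

# Without parentheses, & is evaluated before the comparisons,
# so Python ends up asking for the truth value of a whole array
try:
    data[data > 3 & data < 8]
    raised = False
except ValueError:
    raised = True
print(raised)                      # True

# With parentheses the comparisons run first, then & combines the masks
in_range = (data > 3) & (data < 8)
print(data[in_range])              # [4 5 6 7]

# ~ negates a mask
print(data[~in_range])             # [ 1  2  3  8  9 10]
```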

### 6.3 Fancy Indexing
Use lists of integers to select data in a specific order. Remember, this creates a copy.

In [92]:
arr = np.arange(100, 110)
arr

array([100, 101, 102, 103, 104, 105, 106, 107, 108, 109])

In [93]:
arr[[1, 4, 7]]

array([101, 104, 107])
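
Fancy indexing also allows repeated indices and works on 2D arrays. A short sketch, with a 5x5 matrix built only for this example:

```python
import numpy as np

arr = np.arange(100, 110)

# Indices may repeat and appear in any order
print(arr[[7, 1, 1]])              # [107 101 101]

# In 2D, one index list selects whole rows
matrix = np.arange(25).reshape(5, 5)
print(matrix[[0, 2]])              # rows 0 and 2

# Two index lists are paired up: elements (0, 1) and (2, 3)
print(matrix[[0, 2], [1, 3]])      # [ 1 13]
```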

---

## Chapter 7: NumPy Operations

### 7.1 Vectorization & Universal Functions (Ufuncs)
A ufunc is a function that operates on arrays in an element-by-element fashion. They are the core of NumPy's vectorized operations.

In [94]:
x = np.array([1, 4, 9, 16])
y = np.array([1, 2, 3, 4])

In [95]:
x+y

array([ 2,  6, 12, 20])

In [96]:
np.add(x, y)  # the ufunc behind x + y; both give the same result

array([ 2,  6, 12, 20])

In [97]:
x-y

array([ 0,  2,  6, 12])

In [98]:
np.sqrt(x).astype(int)

array([1, 2, 3, 4])

In [99]:
# Despite their names, list_x and list_y are NumPy arrays, so + adds element by element
list_x = np.array([1, 4, 9, 16])
list_y = np.array([1, 2, 3, 4])
list_x + list_y


array([ 2,  6, 12, 20])
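
To make "element-by-element" concrete, this sketch compares the operators with their ufunc equivalents, with an explicit Python loop, and with plain Python lists, where `+` means something different. The name `looped` is illustrative.

```python
import numpy as np

x = np.array([1, 4, 9, 16])
y = np.array([1, 2, 3, 4])

# The arithmetic operators are shorthand for ufuncs
print(np.array_equal(x + y, np.add(x, y)))        # True
print(np.array_equal(x * y, np.multiply(x, y)))   # True

# A ufunc does the work of this explicit Python loop
looped = [int(a) + int(b) for a, b in zip(x, y)]
print(looped)                                     # [2, 6, 12, 20]

# Plain Python lists behave differently: + concatenates
print([1, 4, 9, 16] + [1, 2, 3, 4])               # [1, 4, 9, 16, 1, 2, 3, 4]
```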

### 7.2 Broadcasting
Broadcasting describes how NumPy treats arrays with different shapes during arithmetic operations. The smaller array is "broadcast" across the larger one so that they have compatible shapes.

In [100]:
matrix = np.ones((3, 4), dtype=int)
arr = np.arange(4)

print(matrix)
print(arr)


[[1 1 1 1]
 [1 1 1 1]
 [1 1 1 1]]
[0 1 2 3]


In [101]:
print(matrix + arr)  # arr has shape (4,), matching matrix's 4 columns, so it is broadcast across every row

[[1 2 3 4]
 [1 2 3 4]
 [1 2 3 4]]
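
NumPy decides whether two shapes are compatible by comparing them from the right: each pair of dimensions must be equal, or one of them must be 1. A brief sketch of a compatible case, an incompatible case, and a fix; `per_column` and `per_row` are illustrative names.

```python
import numpy as np

matrix = np.ones((3, 4), dtype=int)

# Shapes are compared from the right: (3, 4) vs (4,) is compatible
per_column = np.arange(4)
print((matrix + per_column).shape)        # (3, 4)

# (3, 4) vs (3,): the trailing dimensions 4 and 3 do not match
per_row = np.arange(3)
try:
    matrix + per_row
    failed = False
except ValueError:
    failed = True
print(failed)                             # True

# Reshaping to (3, 1) makes it compatible: a size-1 dimension is stretched
print(matrix + per_row.reshape(3, 1))
```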


### 7.3 Statistical Operations
NumPy has a rich suite of statistical functions. The most crucial concept is the **`axis`** parameter, which specifies the dimension to operate along.

#### Finding Indices with `argmin` and `argmax`
Instead of getting the min/max value, these functions give you the *index* of that value. This is extremely useful in machine learning.
- **`axis=0`**: Finds the index of the min/max value in each **column**.
- **`axis=1`**: Finds the index of the min/max value in each **row**.

In [102]:
data = np.array([[10, 7, 4], [1, 2, 3], [9, 99, 999]])
data

array([[ 10,   7,   4],
       [  1,   2,   3],
       [  9,  99, 999]])

In [103]:
"""
axis=0: collapse the rows, giving one result per column
axis=1: collapse the columns, giving one result per row

"""
print(data.min(axis=0))
print(data.max(axis=0))

print(data.max(axis=1))

[1 2 3]
[ 10  99 999]
[ 10   3 999]


In [104]:
data.argmax(axis=1)  # index of the largest value in each row

array([0, 2, 2])
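
The same `axis` rule applies to every reduction, not just `min` and `max`. The sketch below uses `sum` on the same data and then shows one way to turn the indices from `argmax` back into values; `best_cols` and `row_max` are illustrative names.

```python
import numpy as np

data = np.array([[10, 7, 4], [1, 2, 3], [9, 99, 999]])

print(data.sum(axis=0))                # [  20  108 1006]: one sum per column
print(data.sum(axis=1))                # [  21    6 1107]: one sum per row
print(data.sum())                      # 1134: no axis means the whole array

# Use the indices from argmax to look up the values themselves
best_cols = data.argmax(axis=1)                   # [0 2 2]
row_max = data[np.arange(len(data)), best_cols]   # [ 10   3 999]
print(np.array_equal(row_max, data.max(axis=1)))  # True
```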

### 7.4 Linear Algebra
The `np.linalg` submodule contains core linear algebra functionality.

In [105]:
A = np.arange(1, 5).reshape(2, 2)
B = np.arange(5, 9).reshape(2, 2)

print(A)
print(B)

[[1 2]
 [3 4]]
[[5 6]
 [7 8]]


In [106]:
A.T

array([[1, 3],
       [2, 4]])

In [107]:
A @ B  # matrix product; same as np.matmul(A, B), or np.dot(A, B) for 2D arrays

array([[19, 22],
       [43, 50]])

In [108]:
np.linalg.inv(A) # Invert matrix

array([[-2. ,  1. ],
       [ 1.5, -0.5]])

In [109]:
np.linalg.inv(A) @ A

array([[1.00000000e+00, 0.00000000e+00],
       [1.11022302e-16, 1.00000000e+00]])

In [110]:
np.linalg.det(A)

np.float64(-2.0000000000000004)
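
Two details from the cells above are worth a quick sketch: `*` is not the matrix product, and results such as `1.11022302e-16` or `-2.0000000000000004` are floating-point rounding, so they are best compared with a tolerance.

```python
import numpy as np

A = np.arange(1, 5).reshape(2, 2)
B = np.arange(5, 9).reshape(2, 2)

# * multiplies element by element; @ is the matrix product
print(A * B)   # [[ 5 12] [21 32]]
print(A @ B)   # [[19 22] [43 50]]

# Compare floating-point results with a tolerance, not with ==
A_inv = np.linalg.inv(A)
print(np.allclose(A_inv @ A, np.eye(2)))     # True
print(np.isclose(np.linalg.det(A), -2.0))    # True
```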

### 7.5 Modern Random Number Generation
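For reproducible science, it's best to create a **Generator** instance and seed it. The first two cells below use the older `np.random.*` functions, which draw from a hidden global state; the remaining cells use a seeded Generator created with `default_rng`.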

In [111]:
(np.random.rand(3, 5)*100//10).astype(int)  # legacy API: random integers from 0 to 9

array([[0, 5, 5, 9, 9],
       [4, 0, 1, 0, 9],
       [4, 7, 7, 7, 9]])

In [112]:
np.random.randint([1, 5, 7])  # legacy API: one integer per entry, drawn from [0, 1), [0, 5), and [0, 7)

array([0, 1, 1], dtype=int32)

In [113]:
from numpy.random import default_rng 

rng = default_rng(seed=42)
rng.integers(10, size=5)

array([0, 7, 6, 4, 4])

In [114]:
rng.standard_normal(10)

array([ 0.94056472, -1.95103519, -1.30217951,  0.1278404 , -0.31624259,
       -0.01680116, -0.85304393,  0.87939797,  0.77779194,  0.0660307 ])

In [115]:
deck = np.arange(10)
rng.shuffle(deck)
deck

array([0, 1, 8, 4, 7, 9, 3, 5, 6, 2])

In [116]:
rng.choice(deck, size=3, replace=False)  # replace=False: sample without replacement, so no value is picked twice

array([0, 5, 6])
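
What seeding buys you is repeatability: two generators created with the same seed produce the same stream of numbers. A minimal sketch, which also shows Generator methods that could stand in for the legacy calls at the start of this section (the pairing is an assumption made for illustration, and the names `rng_a`, `rng_b`, `uniform`, and `digits` are illustrative):

```python
import numpy as np

# Two generators created with the same seed produce the same stream
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)

draws_a = rng_a.integers(10, size=5)
draws_b = rng_b.integers(10, size=5)
print(np.array_equal(draws_a, draws_b))      # True

# Generator methods that cover the same ground as the legacy calls
uniform = rng_a.random((3, 5))               # floats in [0, 1), like np.random.rand(3, 5)
digits = rng_a.integers(0, 10, size=(3, 5))  # integers from 0 to 9
print(uniform.shape, digits.shape)
```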