# Self-learning NumPy

## What's NumPy?

NumPy is a powerful library for numerical computing in Python, offering support for ***arrays*** (including multidimensional arrays), as well as ***an assortment of mathematical functions*** to operate on these arrays. 

## The potential of NumPy

The library's advantages might not be obvious if we're only working with small datasets or straightforward operations that we could manually code.

However, as we progress in our data manipulation and scientific computing tasks, we'll likely find that NumPy offers several key benefits:

1. **Efficiency**: NumPy's core is implemented in C and optimized for performance. It's generally ***much faster*** than pure-Python loops for numerical operations, especially on large arrays.

2. **Convenience**: NumPy provides ***a large library*** of built-in functions for mathematical computations, statistics, linear algebra, etc., making it easier to perform complex tasks without needing to write these algorithms from scratch.

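3. **Memory Efficiency**: NumPy arrays are ***more memory-efficient*** than native Python lists because they store elements of a single data type in one contiguous block instead of as separate Python objects. This can be crucial when we're working with large datasets.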

4. **Broadcasting**: This feature enables NumPy to ***handle arrays with different shapes*** during arithmetic operations, which can reduce the need for explicit loops and make the code more readable and faster.

5. **Ecosystem**: NumPy is ***fundamental to other Python libraries*** like pandas for data manipulation, matplotlib for plotting, and scikit-learn for machine learning. Knowledge of NumPy will be beneficial when diving into these areas.

6. **Vectorization**: NumPy allows for ***vectorized operations***, meaning a single expression operates on an entire array without an explicit Python loop over its elements, leading to more concise and readable code.

7. **Scientific Computing**: In fields like data science, machine learning, scientific research, and engineering, NumPy is practically the standard for numerical computations in Python.

For instance, consider a case where you have to invert a large matrix, perform eigenvalue decompositions, or apply Fourier transforms. Doing this manually would require a significant amount of code and would not be as optimized. NumPy has built-in functions for these that are both convenient and efficient.
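As a rough illustration of those three tasks, the sketch below uses `np.linalg.inv`, `np.linalg.eig`, and `np.fft.fft`. The small matrix `m` and the four-sample `signal` are made-up values chosen so the results are easy to check by hand; real workloads would be much larger.

```python
import numpy as np

# Hypothetical 2x2 symmetric matrix, chosen only for illustration
m = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Matrix inverse: m_inv @ m should be (approximately) the identity matrix
m_inv = np.linalg.inv(m)
print(m_inv @ m)

# Eigenvalue decomposition: the eigenvalues of this matrix are 1 and 3
# (np.linalg.eig does not guarantee any particular order)
eigenvalues, eigenvectors = np.linalg.eig(m)
print(eigenvalues)

# Discrete Fourier transform of a short, made-up signal
signal = np.array([1.0, 0.0, -1.0, 0.0])
spectrum = np.fft.fft(signal)
print(spectrum)
```

For symmetric matrices such as `m`, `np.linalg.eigh` is an alternative that returns the eigenvalues in ascending order.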

So while you might not see the immediate benefit as a beginner, NumPy is a tool that becomes increasingly valuable as your tasks become more complex.
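To make the vectorization and broadcasting points from the list above more concrete, here is a minimal sketch. The array names (`values`, `col`, `row`) and their contents are assumptions made up for this example.

```python
import numpy as np

values = np.arange(5)  # [0 1 2 3 4]

# Pure-Python approach: loop over the elements one by one
squared_loop = [v * v for v in values]

# Vectorized approach: one expression covers the whole array
squared_vec = values * values
print(squared_vec)  # [ 0  1  4  9 16]

# Broadcasting: a (3, 1) column combined with a (4,) row gives a (3, 4) grid
col = np.array([[0], [10], [20]])
row = np.array([1, 2, 3, 4])
grid = col + row
print(grid.shape)  # (3, 4)
print(grid)
```

Both approaches compute the same squares; the vectorized form simply leaves the looping to NumPy's compiled code.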

## Basic usage

**Create an array and access it**

```python
import numpy as np  # 'np' is a commonly used alias for NumPy

# Creating a 1D array
a = np.array([1, 2, 3])

# Print the array
print(a)

# Print shape of array
print(a.shape)

# Print size of array
print(a.size)

# Print data type of array
print(a.dtype)
```

```
[1 2 3]
(3,)
3
int32
```


```python
# Create a 2D array with 3 rows and 4 columns
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# Display its shape
print(arr.shape)

# Display its size
print(arr.size)

# Access data
print("The last row :{}".format(arr[-1]))

# Indexing: Get the element in the first row and second column
print(arr[0, 1])

# Slicing: Get the first two rows and first two columns
print(arr[:2, :2])

# Reshape into a 1D array and display it
print(arr.reshape(-1))
```

```
(3, 4)
12
The last row :[ 9 10 11 12]
2
[[1 2]
 [5 6]]
[ 1  2  3  4  5  6  7  8  9 10 11 12]
```


**Manipulate an array**

```python
# Broadcasting: Add 1 to each element of the array
broadcasted = arr + 1
print(f"Broadcasting result: {broadcasted}")

# Sum along axis=0 (sum of each column)
sum_axis_0 = np.sum(arr, axis=0)
print(f"Summed result: {sum_axis_0}")

# Mean along axis=1 (mean of each row)
mean_axis_1 = np.mean(arr, axis=1)
print(f"Meaned result: {mean_axis_1}")

# Vertical stack
vstacked = np.vstack((arr, arr))
print(f"Vstacked result: {vstacked}")

# Horizontal stack
hstacked = np.hstack((arr, arr))
print(f"hstacked result: {hstacked}")
```

```
Broadcasting result: [[ 2  3  4  5]
 [ 6  7  8  9]
 [10 11 12 13]]
Summed result: [15 18 21 24]
Meaned result: [ 2.5  6.5 10.5]
Vstacked result: [[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]
 [ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
hstacked result: [[ 1  2  3  4  1  2  3  4]
 [ 5  6  7  8  5  6  7  8]
 [ 9 10 11 12  9 10 11 12]]
```


### Some features of NumPy's syntax

#### Axis

The term "axis" in NumPy refers to one dimension of an array. For example, a 2D array (often thought of as a matrix) has two axes:

- `axis=0`: This axis runs vertically downward across rows. When you perform an operation along this axis, it acts "column-wise."
- `axis=1`: This axis runs horizontally across columns. When you perform an operation along this axis, it acts "row-wise."

When you perform operations like summing or finding the mean along a specific axis, NumPy will collapse that axis by applying the specified operation. 

For instance, if you sum along `axis=0`, NumPy will sum the values in each column together, effectively "collapsing" the array along its vertical axis.
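The sketch below shows how the shape changes when an axis is collapsed, reusing the 3x4 array from the earlier cells. The `keepdims=True` variant is an addition not covered above; it keeps the collapsed axis with length 1.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])  # shape (3, 4)

# Collapsing axis 0 leaves one value per column
col_sums = arr.sum(axis=0)
print(col_sums, col_sums.shape)  # [15 18 21 24] (4,)

# Collapsing axis 1 leaves one value per row
row_sums = arr.sum(axis=1)
print(row_sums, row_sums.shape)  # [10 26 42] (3,)

# keepdims=True keeps the collapsed axis as a length-1 dimension
kept = arr.sum(axis=0, keepdims=True)
print(kept.shape)  # (1, 4)
```

A handy way to remember it: the axis you pass is the one that disappears from the result's shape.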

#### V/H stack

In NumPy, the `vstack` and `hstack` functions are used to stack arrays vertically and horizontally, respectively.

##### Vertical Stack
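The `np.vstack` function takes a sequence (usually a tuple or list) of arrays and stacks them vertically, i.e., along `axis=0`. 1D arrays of shape `(N,)` are first treated as rows of shape `(1, N)`, so stacking them produces a 2D array.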

**Syntax:**
```python
np.vstack((array1, array2, ..., arrayN))
```

**Example:**
```python
import numpy as np
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
result = np.vstack((a, b))
```
`result` will be:
```
[[1, 2, 3],
 [4, 5, 6]]
```

##### Horizontal Stack
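The `np.hstack` function takes a sequence (usually a tuple or list) of arrays and stacks them horizontally, i.e., along `axis=1` for arrays with two or more dimensions. 1D arrays have only one axis, so they are simply joined end to end.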

**Syntax:**
```python
np.hstack((array1, array2, ..., arrayN))
```

**Example:**
```python
import numpy as np
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
result = np.hstack((a, b))
```
`result` will be:
```
[1, 2, 3, 4, 5, 6]
```

###### Note:
- For `vstack`, every array must have the same number of columns (for 1D arrays, the same length).
- For `hstack`, 2D arrays must have the same number of rows; 1D arrays can have different lengths.
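
A small sketch of these shape rules with 2D arrays follows. The arrays `a`, `b`, and `c` are made up for illustration, and the `mismatch_raised` flag is a hypothetical helper used only to record that the mismatched call fails.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])  # shape (2, 2)
b = np.array([[5, 6]])          # shape (1, 2): same number of columns as a
c = np.array([[7], [8]])        # shape (2, 1): same number of rows as a

v = np.vstack((a, b))  # column counts match, result has shape (3, 2)
h = np.hstack((a, c))  # row counts match, result has shape (2, 3)
print(v)
print(h)

# Row counts differ (2 vs 1), so hstack raises a ValueError
mismatch_raised = False
try:
    np.hstack((a, b))
except ValueError as exc:
    mismatch_raised = True
    print("hstack failed:", exc)
```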