# Intro to Numpy `.array()`

In [1]:
import numpy as np

a = np.array(42)
b = np.array([1, 2, 3, 4, 5])
c = np.array([[1, 2, 3], [4, 5, 6]])
d = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

print(a.ndim) # 0-D: a point (scalar)
print(b.ndim) # 1-D: a line
print(c.ndim) # 2d
print(d.ndim) # 3d

0
1
2
3


### Notes

```python
c = np.array([[1, 2, 3], [4, 5, 6], [7,8,9]])
```
Why the array above is still 2-D:
it now has 3 rows and 3 columns, but rows and columns are only two axes, so it remains 2-D.
```python
d = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])
```
In contrast, the array above nests three levels deep:
* [1, 2, 3] : 1-D
* [[1, 2, 3], [4, 5, 6]] : 2-D
* [[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]] : 3-D

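However, the number of `[]` pairs does not always match the number of dimensions.
```python
arr = np.array([1, 2, 3, [4, 5]])
```
What matters is whether every element at a given level is itself a list of the same length. Here only one element is a list, so the input is ragged: NumPy 1.24 and later raise a `ValueError` for it unless `dtype=object` is passed, and older versions build a 1-D object array instead.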
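
To see how NumPy counts dimensions for ragged input, here is a minimal sketch. It assumes `dtype=object` is passed explicitly (required on NumPy 1.24 and later); the variable name `ragged` is only illustrative.

```python
import numpy as np

# Three scalars plus one nested list: the nesting is not uniform.
ragged = np.array([1, 2, 3, [4, 5]], dtype=object)

print(ragged.ndim)   # 1
print(ragged.shape)  # (4,)
print(ragged[3])     # [4, 5]  (stored as a plain Python list)
```

The extra pair of brackets does not add a dimension, because only one of the four elements is a list.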

## Creating Higher Dimensions with `ndmin`
Create an array with 5 dimensions and verify that it has 5 dimensions:

In [2]:
arr = np.array([1, 2, 3, 4], ndmin=5) # the minimum number of dimensions can be set directly
print(arr)
print('number of dimensions :', arr.ndim)

[[[[[1 2 3 4]]]]]
number of dimensions : 5


---

# Indexing


## 1D Array
Same as in Python lists or C++ arrays: use a zero-based index.

```python
import numpy as np
arr = np.array([1, 2, 3, 4])
print(arr[2] + arr[3])
```

## 2D Array
Slightly different from C++.
Instead of `arr[2][3]`, write `arr[2, 3]`.

In [4]:
arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])
print('2nd element on 1st row: ', arr[0, 1])

2nd element on 1st row:  2


In [3]:
arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])
print('5th element on 2nd row: ', arr[1, 4])

5th element on 2nd row:  10


## 3D Array
To access elements from 3-D arrays we can use comma separated integers representing the dimensions and the index of the element.

In [6]:
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
print(arr)
print(arr[0, 1, 2]) # expected: 6, actual: 6. Block 0, row 1, column 2

[[[ 1  2  3]
  [ 4  5  6]]

 [[ 7  8  9]
  [10 11 12]]]
6


In [9]:
arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print(arr)
print(arr[1, 1, 0]) # block 1, row 1, element 0 -> 7

[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]
7


## Negative Indexing
As in Python, a negative index accesses an array from the end.

In [7]:
arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])
print('Last element from 2nd dim: ', arr[1, -1]) # last element of row 1

Last element from 2nd dim:  10


### Exercise: NumPy Indexing Arrays
Insert the correct syntax for printing the number 50 from the array.

In [10]:
arr = np.array([[10, 20, 30, 40], [50, 60, 70, 80]])

# print(arr[1][0])  # ah... this was my first attempt
print(arr[1, 0]) # 50

50


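### Mistake notes

The result is the same, but `arr[1, 0]` is faster and is the recommended form.
`arr[1][0]` indexes in two steps and creates an intermediate array object (a view of row 1) along the way, so it adds overhead when repeated on large arrays.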
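
A small sketch of what the chained form does behind the scenes. The name `row` is only illustrative; it stands for the intermediate object that `arr[1][0]` builds before the second lookup.

```python
import numpy as np

arr = np.array([[10, 20, 30, 40], [50, 60, 70, 80]])

row = arr[1]            # step 1 of arr[1][0]: an intermediate 1-D array
print(row.base is arr)  # True: it is a view onto arr, not a copy
print(row[0])           # 50, step 2 of arr[1][0]
print(arr[1, 0])        # 50, found in a single indexing step
```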

---

# Array Slicing

Slicing in Python means taking elements from one given index to another given index.

We pass a slice instead of an index, like this: `[start:end]`

We can also define the step, like this: `[start:end:step]`

* If we don't pass start, it is considered 0
* If we don't pass end, it is considered the length of the array in that dimension
* If we don't pass step, it is considered 1
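
The three defaults listed above can be checked directly. This is a minimal sketch that compares each shortened slice with its fully spelled-out form.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[:3])   # [1 2 3]    same as arr[0:3]
print(arr[4:])   # [5 6 7]    same as arr[4:len(arr)]
print(arr[1:5])  # [2 3 4 5]  same as arr[1:5:1]
```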

This works like slicing a Python list.
```python
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[1:5])
```

## Negative Slicing
Slice from the index 3 from the end to index 1 from the end:

In [11]:

arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[-3:-1])

[5 6]


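In other words, to print what I actually had in mind, from the third-from-last element through the last one, the end index has to be left out, as below.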

In [12]:

arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[-3:])

[5 6 7]


## Step
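Use the `step` value to determine the step of the slicing:
### The confusing part: step
step = 2 → "move 2 positions from the current index"
In the example below I first thought that `[2, 3, 4, 5]` with step 2 would give 2, 5, but a step of 1 means "the next element".
-> The slice advances `step` positions from the current index, so the result is 2, 4.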

```python
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[1:5:2])
```
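
One way to make the step rule concrete is to list the indexes the slice visits. The sketch below uses Python's built-in `range()` with the same start, end and step; the name `picked` is only illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

picked = list(range(1, 5, 2))  # indexes visited by the slice 1:5:2
print(picked)                  # [1, 3]
print(arr[1:5:2])              # [2 4]
```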

### Return every other element

In [13]:
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[::2])

[1 3 5 7]


## Slicing 2D Array

### Examples
From the second row, slice elements from index 1 to index 4 (not included):

In [14]:
arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
print(arr[1, 1:4])

[7 8 9]


From both rows, return the element at index 2:

In [15]:
arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
print(arr[0:2, 2]) # the element at index 2 of every row

[3 8]


From both rows, slice index 1 to index 4 (not included); this returns a 2-D array:

In [16]:
arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
print(arr[0:2, 1:4]) # the elements at indexes 1 to 3 of every row

[[2 3 4]
 [7 8 9]]


Everything from (including) the second item to (not including) the fifth item.

In [20]:
arr = np.array([10, 15, 20, 25, 30, 35, 40])

print(arr[1:4])

[15 20 25]


Every other item from (including) the second item to (not including) the fifth item.

In [22]:
arr = np.array([10, 15, 20, 25, 30, 35, 40])
print(arr[1:4:2])

[15 25]


---

# Data Types

## Data Types in Python

By default, Python has these data types:

* strings - used to represent text data, the text is given under quote marks. e.g. "ABCD"
* integer - used to represent integer numbers. e.g. -1, -2, -3
* float - used to represent real numbers. e.g. 1.2, 42.42
* boolean - used to represent True or False.
* complex - used to represent complex numbers. e.g. 1.0 + 2.0j, 1.5 + 2.5j

## Data Types in NumPy
NumPy has some extra data types and refers to data types with a single character, such as `i` for integers and `u` for unsigned integers.

Below is a list of all data types in NumPy and the characters used to represent them.

* i - integer
* b - boolean
* u - unsigned integer
* f - float
* c - complex float
* m - timedelta
* M - datetime
* O - object
* S - string
* U - unicode string
* V - fixed chunk of memory for other type ( void )
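
The one-character codes above can be read back from any array through `dtype.kind`. The sketch below builds a few small arrays and collects their codes; the dictionary names `samples` and `kinds` are only illustrative.

```python
import numpy as np

samples = {
    "integer": np.array([1, 2, 3]),
    "unsigned": np.array([1, 2, 3], dtype="u1"),
    "float": np.array([1.5, 2.5]),
    "complex": np.array([1 + 2j]),
    "boolean": np.array([True, False]),
    "unicode": np.array(["apple"]),
    "bytes": np.array([b"apple"]),
    "datetime": np.array(["2024-01-01"], dtype="datetime64[D]"),
    "object": np.array([{"a": 1}], dtype=object),
}

# dtype.kind holds the single-character code for each array's type family.
kinds = {name: a.dtype.kind for name, a in samples.items()}
for name, kind in kinds.items():
    print(name, kind)
```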

## Checking the Data Type of an Array `dtype`

In [23]:
arr = np.array([1, 2, 3, 4])
print(arr.dtype)

int64


In [24]:
arr = np.array(['apple', 'banana', 'cherry'])
print(arr.dtype)

<U6


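* U → Unicode string
* 6 → maximum string length in characters (the leading `<` marks little-endian byte order)

What if I want to handle them as ordinary Python strings instead?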

In [25]:
arr = np.array(['apple', 'banana', 'cherry'], dtype=object)
print(arr.dtype)  # object
print(arr)

object
['apple' 'banana' 'cherry']



* Pros: no length limit, variable-length strings are possible
* Cons: can be slower than the U type (because each element is stored internally as a Python object)
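
The length limit becomes visible when a longer string is assigned into an existing array. This is a minimal sketch; the names `fixed` and `flexible` are only illustrative.

```python
import numpy as np

fixed = np.array(['apple', 'banana', 'cherry'])                   # dtype <U6
flexible = np.array(['apple', 'banana', 'cherry'], dtype=object)  # Python objects

fixed[0] = 'watermelon'     # 10 characters, longer than the 6 that fit
flexible[0] = 'watermelon'

print(fixed[0])     # waterm  (silently truncated to 6 characters)
print(flexible[0])  # watermelon
```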

## Creating Arrays With a Defined Data Type
As already done above, just pass the `dtype` argument.

### Example
Create an array with data type 4 bytes integer:

In [26]:
arr = np.array([1,2,3,4], dtype='i4')
print(arr.dtype)  # int32 (4 bytes x 8 bits = 32 bits)
print(arr)

int32
[1 2 3 4]


## What if a Value Can Not Be Converted?
If a type is given to which the elements cannot be cast, NumPy raises a ValueError.

In [27]:
arr = np.array(['a','b'], dtype='i4')
print(arr.dtype)  # never reached: the line above raises ValueError
print(arr)

ValueError: invalid literal for int() with base 10: 'a'

## Converting Data Type on Existing Arrays `astype()`
`astype()` converts an existing array to another data type. It returns a new array and leaves the original unchanged.

In [31]:
arr = np.array([1.1, 2.1, 3.1])
print(arr.dtype)  # float64
# newarr = arr.astype('i')  # specify integer with the type code
newarr = arr.astype(int) # a Python type can be passed as the parameter too

print(newarr.dtype)
print(newarr)

float64
int64
[1 2 3]
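
Two details of the conversion above are easy to miss: the fractional part is dropped (truncation toward zero, not rounding), and the original array keeps its values. A minimal sketch with one negative value to show the direction of truncation:

```python
import numpy as np

arr = np.array([1.9, -1.9, 3.1])
newarr = arr.astype(int)

print(newarr.tolist())  # [1, -1, 3]
print(arr.tolist())     # [1.9, -1.9, 3.1]  (original unchanged)
```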


### Examples
Consider the following code:
What will be the printed result?

In [32]:
import numpy as np
arr = np.array([-1, 0, 1])
newarr = arr.astype(bool)
print(newarr)

[ True False  True]


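### Why isn't it False False True? The Boolean conversion rule
The rule is the same as Python's `bool()` conversion:

* 0 → False
* any non-zero number → True, including negative numbers such as -1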

---

# Copy vs View
## The Difference Between Copy and View
The main difference between a copy and a view of an array is that the **copy is a new array, and the view is just a view of the original array**

The copy owns the data and any changes made to the copy will not affect original array, and any changes made to the original array will not affect the copy.

The view does not own the data and any changes made to the view will affect the original array, and any changes made to the original array will affect the view.

## Copy `.copy()`


In [33]:
arr = np.array([1, 2, 3, 4, 5])
x = arr.copy()
arr[0] = 42

print(arr)
print(x)

[42  2  3  4  5]
[1 2 3 4 5]


## View `.view()`

In [36]:
arr = np.array([1, 2, 3, 4, 5])
x = arr.view()
arr[0] = 42

print(arr)
print(x)


print("Changing view will affect the original array")
x[0]=-3
print(arr)
print(x)

[42  2  3  4  5]
[42  2  3  4  5]
Changing view will affect the original array
[-3  2  3  4  5]
[-3  2  3  4  5]


## Check if Array Owns its Data `base`
Use the `base` attribute to see the owner: if the array owns its data, `base` is `None`; otherwise it refers to the original array.

In [37]:
arr = np.array([1, 2, 3, 4, 5])

x = arr.copy()
y = arr.view()

print(x.base)
print(y.base)

None
[1 2 3 4 5]
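
As a cross-check on `base`, `np.shares_memory()` reports whether two arrays use the same underlying buffer. A minimal sketch:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
x = arr.copy()
y = arr.view()

print(np.shares_memory(arr, x))  # False: the copy has its own data
print(np.shares_memory(arr, y))  # True: the view reuses arr's data
print(y.base is arr)             # True
print(arr.base is None)          # True: arr owns its data
```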


---

# Array Shapes
## Shape of an Array
The shape of an array is the number of elements in each dimension.

## Shape Attribute

Get the Shape of an Array `.shape`
NumPy arrays have an attribute called `shape` (an attribute, not a method, so no parentheses) that **returns a tuple with each index having the number of corresponding elements**

### Example 
2D: Print the shape of a 2-D array

In [38]:
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(arr.shape) 

(2, 4)


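### Mistake notes
It is (2, 4), not (4, 4).
The example above returns (2, 4), which means that the **array has 2 dimensions**, where the **first dimension has 2 elements** and the **second has 4**

The order goes from outer to inner: the number of blocks (rows) first, then the number of elements in each block.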

In [39]:
arr = np.array([1, 2, 3, 4], ndmin=5)
print(arr)
print('shape of array :', arr.shape)

[[[[[1 2 3 4]]]]]
shape of array : (1, 1, 1, 1, 4)


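### Mistake notes
* Prediction: since `ndmin` makes it 5-D, I expected (5, 4), but that was wrong.
### Explanation
The array is originally 1-D.
As a 1-D array its shape is just its length, 4 (the element count), and that value ends up last in the new shape.
The original 1-D array is then expanded to 5-D.

### How dimensions are added: axes of length 1 are prepended
(1, 1, 1, 1, 4)

### Extra practice problems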

In [40]:
arr = np.array([10, 20, 30])
newarr = np.array(arr, ndmin=4)

print(newarr)
print(newarr.shape)

[[[[10 20 30]]]]
(1, 1, 1, 3)


Prediction
1. 1-D: (3,)
2. `ndmin` adds dimensions: (1, 1, 1, 3)

In [41]:
arr = np.array([[1,2],[3,4]])
newarr = np.array(arr, ndmin=5)

print(newarr)
print(newarr.shape)

[[[[[1 2]
    [3 4]]]]]
(1, 1, 1, 2, 2)


Prediction
1. 2-D: (2, 2)  # two blocks, each of length 2
2. After adding dimensions: (1, 1, 1, 2, 2)

In [42]:
arr = np.array([[[1,2,3],[4,5,6]]])
print(arr.shape)

newarr = arr[0, :, :]
print(newarr.shape)

(1, 2, 3)
(2, 3)


Prediction
1. 3-D: (1, 2, 3)  # one outer block holding two rows of length 3
2. `arr[0, :, :]` drops the outer axis, leaving 2-D: (2, 3)

In [45]:
arr = np.array([1,2,3,4], ndmin=3)
print(arr)
print(arr.shape)

# Which of the following accesses the value 4?
# a) arr[3]
# b) arr[0,0,3]
# c) arr[0,3,0]
print(arr[0,0,3])

[[[1 2 3 4]]]
(1, 1, 4)
4


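1. 1-D: (4,)
2. After adding dimensions: (1, 1, 4)
-> So the new array should look like [[[1, 2, 3, 4]]]
I guessed the shape was (1, 1, 3), but it is (1, 1, 4).

My (wrong) guess for accessing 4: arr[1][1][3]

### Mistake notes
Conclusion: the last entry of the shape is a length, so it is 4. For how to access the value 4, see below.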

### 1️⃣ The original array

```python
arr = np.array([1,2,3,4])
```

* 1-D, length 4
* As a picture:

```
[ 1, 2, 3, 4 ]
```

* shape = `(4,)`
* `ndmin` is not set (default 0) → it simply stays 1-D

---

### 2️⃣ Applying ndmin=3

```python
arr = np.array([1,2,3,4], ndmin=3)
```

* The array must have at least 3 dimensions

* **Two axes of length 1 are prepended**

* So shape = `(1, 1, 4)`

* As a picture:

```
dim 1 (outermost)
┌───────────────┐
│ dim 2         │
│ ┌───────────┐ │
│ │ dim 3     │ │
│ │ [1,2,3,4] │ │
│ └───────────┘ │
└───────────────┘
```

* Here:

  * first dimension = 1 block
  * second dimension = 1 block
  * third dimension = 4 elements

---

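### 3️⃣ Indexing the array (a walkthrough that finally made it click for me)

* `arr[0,0,0]` → 1

* `arr[0,0,1]` → 2

* `arr[0,0,2]` → 3

* `arr[0,0,3]` → 4

* **Note:** the first two dimensions only have index **0** → because their length is 1

* So `arr[1][1][3]` → raises an **IndexError**

---

### 4️⃣ Key points

1. `ndmin` = minimum number of dimensions → missing dimensions are **prepended with length 1**
2. The original array length stays in the last dimension
3. Index order = `(outermost dimension, ..., last dimension)`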
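
The points above can be confirmed in a few lines. This is a minimal sketch; it catches the `IndexError` so the script keeps running.

```python
import numpy as np

arr = np.array([1, 2, 3, 4], ndmin=3)

print(arr.shape)     # (1, 1, 4)
print(arr[0, 0, 3])  # 4

try:
    arr[1, 1, 3]     # the first two axes have length 1, so index 1 is out of range
except IndexError as err:
    print('IndexError:', err)
```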


In [44]:
arr = np.array([5,6,7,8])
newarr = arr.reshape(2,2)

print(newarr)
print(newarr.shape)

[[5 6]
 [7 8]]
(2, 2)


1. 1-D: (4,)
2. `reshape` splits it into [5,6] and [7,8]
3. `newarr` has shape (2, 2): 2-D, with 2 rows of length 2

----

# Reshape `.reshape()`

The shape of an array is the number of elements in each dimension.

By reshaping we can add or remove dimensions or change number of elements in each dimension.

## Reshape From 1-D to 2-D

Convert the following 1-D array with 12 elements into a 2-D array.

The outermost dimension will have 4 arrays, each with 3 elements:

In [48]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

In [49]:
newarr = arr.reshape(4, 3)

print(newarr)
print(newarr.shape)

[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
(4, 3)


## Reshape From 1-D to 3-D
Convert the following 1-D array with 12 elements into a 3-D array.

The outermost dimension will have 2 arrays that contain 3 arrays, each with 2 elements:

In [50]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

In [53]:
newarr = arr.reshape(2, 3, 2)
print(newarr)
print(newarr.shape)

[[[ 1  2]
  [ 3  4]
  [ 5  6]]

 [[ 7  8]
  [ 9 10]
  [11 12]]]
(2, 3, 2)


## Can We Reshape Into Any Shape?
Yes, as long as the number of elements required for reshaping is equal in both shapes.

We can reshape an 8-element 1-D array into a 2-D array with 2 rows of 4 elements, but we cannot reshape it into a 2-D array with 3 rows of 3 elements, as that would require 3x3 = 9 elements.

In [54]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
newarr = arr.reshape(3, 3)
print(newarr)

ValueError: cannot reshape array of size 8 into shape (3,3)
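
The rule is simply that the product of the new shape must equal the array's `size`. The sketch below checks that before reshaping; `fits` is a hypothetical helper written for this note, not a NumPy function.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

def fits(array, shape):
    """Hypothetical helper: True when `shape` needs exactly array.size elements."""
    return array.size == int(np.prod(shape))

print(fits(arr, (2, 4)))     # True:  2 * 4 == 8
print(fits(arr, (2, 2, 2)))  # True:  2 * 2 * 2 == 8
print(fits(arr, (3, 3)))     # False: 3 * 3 == 9
```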

## Returns Copy or View?
Check if the returned array is a copy or a view

In [55]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
print(arr.reshape(2, 4).base)

[1 2 3 4 5 6 7 8]


### Mistake notes
* My prediction: reshaping throws the old array away and builds a new one, so `base` would be `None`.
* Actual: the example above returns the original array, so it is a view.
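
A view is not guaranteed in every case, though. When the requested layout cannot be described on top of the existing memory, `reshape` falls back to a copy. The sketch below shows both cases with `np.shares_memory()`; the transposed array is just one convenient example of a non-contiguous input.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

as_view = arr.reshape(2, 4)
print(np.shares_memory(arr, as_view))  # True: same data, new shape

transposed = as_view.T                 # shape (4, 2), no longer contiguous
flat = transposed.reshape(-1)          # needs the elements in a new order
print(flat.tolist())                   # [1, 5, 2, 6, 3, 7, 4, 8]
print(np.shares_memory(arr, flat))     # False: reshape had to copy
```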

## Unknown Dimension `.reshape(n,n,-1)`

> Generous! You are allowed to have one "unknown" dimension.

Meaning that you do not have to specify an exact number for one of the dimensions in the reshape method.

Pass -1 as the value, and NumPy will calculate this number for you.

> **Note: We can not pass -1 to more than one dimension.**

In [56]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
newarr = arr.reshape(2, 2, -1)
print(newarr)

[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]
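
A short sketch of how `-1` is resolved, and of what happens when it is used twice. The error is caught so the script keeps running.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

print(arr.reshape(2, -1).shape)     # (2, 4): 8 / 2 = 4
print(arr.reshape(-1, 2, 2).shape)  # (2, 2, 2): 8 / (2 * 2) = 2

try:
    arr.reshape(-1, -1)             # two unknowns cannot be resolved
except ValueError as err:
    print('ValueError:', err)
```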


## Flattening the arrays `.reshape(-1)`
Use `reshape(-1)`

Flattening array means converting a multidimensional array into a 1D array.

In [57]:
arr = np.array([[1, 2, 3], [4, 5, 6]])
newarr = arr.reshape(-1)
print(newarr)

[1 2 3 4 5 6]
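
For comparison, NumPy also offers `ravel()` and `flatten()` for the same job. The sketch below is a side note rather than part of the original exercise: it shows that `flatten()` always copies, while `reshape(-1)` and `ravel()` return a view when the memory layout allows it.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

a = arr.reshape(-1)
b = arr.ravel()
c = arr.flatten()

print(np.shares_memory(arr, a))  # True
print(np.shares_memory(arr, b))  # True
print(np.shares_memory(arr, c))  # False: flatten() always returns a copy
```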


---

# Iterating

## Iterating Arrays
We can do this using basic for loop of python.

In [58]:
arr = np.array([1, 2, 3])

for x in arr:
  print(x)

1
2
3


Ordinary Python syntax, such as a generator expression, works too!
```python
print(list(x for x in arr))
# result: [1, 2, 3] (NumPy 2.x shows each item as np.int64(...))
```

## Iterating 2-D Arrays

In [59]:
arr = np.array([[1, 2, 3], [4, 5, 6]])

for x in arr:
  print(x)

[1 2 3]
[4 5 6]


In [61]:
print(list(x for x in arr))

[array([1, 2, 3]), array([4, 5, 6])]


To return the actual values, the scalars, we have to iterate the arrays in each dimension.

In [62]:
for x in arr:
  for y in x:
    print(y)

1
2
3
4
5
6


## Iterating 3-D Arrays

In [63]:
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

In [67]:
print(list(_ for x in arr for _ in x))

[array([1, 2, 3]), array([4, 5, 6]), array([7, 8, 9]), array([10, 11, 12])]


In [68]:
# Likewise, to print every element, use a triple for loop as in C++.
for x in arr:
  for y in x:
    for z in y:
      print(z)

1
2
3
4
5
6
7
8
9
10
11
12


## Iterating Arrays Using `nditer()`
The function `nditer()` is a helper function that can be used for anything from very basic to very advanced iteration. **It solves some basic issues** that we face in iteration; let's go through it with examples.

### Iterating on Each Scalar Element
With basic for loops, iterating through each scalar of an n-dimensional array takes n nested loops, which is tedious to write for arrays with many dimensions.

> **In other words, `nditer` prints directly what needed nested loops earlier.**

In [71]:
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
for x in np.nditer(arr):
  print(x)

1
2
3
4
5
6
7
8
9
10
11
12


## Iterating Array With Different Data Types `op_dtypes`

We can use `op_dtypes` argument and `pass it the expected datatype` **to change the datatype of elements while iterating**

NumPy does not change the data type of the element in-place (where the element is in array) so it needs some other space to perform this action, that extra space is called buffer, and **in order to enable it in nditer() we pass flags=['buffered'].**

Iterate through the array as a string:

In [72]:
arr = np.array([1, 2, 3])

for x in np.nditer(arr, flags=['buffered'], op_dtypes=['S']):
  print(x)

b'1'
b'2'
b'3'


A question of my own: can a float array be iterated as integers?

In [77]:
arr = np.array([1.1, 2.1, 3.1])

for x in np.nditer(arr, flags=['buffered'], op_dtypes=['int']):
    print(int(x))  # x is a 0-D NumPy array; convert with int() if needed


TypeError: Iterator operand 0 dtype could not be cast from dtype('float64') to dtype('int64') according to the rule 'safe'

In [78]:
for x in np.nditer(arr, flags=['buffered'], op_dtypes=['int'], casting='unsafe'):
    print(int(x))

1
2
3


## Iterating With Different Step Size
We can slice with a step first and then iterate over the result. The syntax is the same as Python slicing.

In [79]:
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

for x in np.nditer(arr[:, ::2]):
  print(x)

1
3
5
7


## Enumerated Iteration Using `ndenumerate()`
Enumeration means listing the sequence number of things one by one.

Sometimes we **require the corresponding index of the element** while iterating; the `ndenumerate()` function can be used for those use cases.
-> It corresponds to Python's built-in `enumerate()`.

In [80]:
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

for idx, x in np.ndenumerate(arr):
  print(idx, x)

(0, 0) 1
(0, 1) 2
(0, 2) 3
(0, 3) 4
(1, 0) 5
(1, 1) 6
(1, 2) 7
(1, 3) 8


---

# Joining Array
## Joining NumPy Arrays `.concatenate()`
In SQL we join tables based on a key, whereas in NumPy **we join arrays by axes**

We pass a sequence of arrays that we want to join to the concatenate() function, along with the axis. If axis is not explicitly passed, it is taken as 0.

### `concatenate()`

In [81]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

arr = np.concatenate((arr1, arr2))
print(arr)

[1 2 3 4 5 6]


Join two 2-D arrays along rows (axis=1):

In `np.concatenate`, **axis** decides the direction in which the arrays are joined
* axis=0 → joined vertically (rows are added)
* axis=1 → joined horizontally (columns are added)

In [82]:
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])
arr = np.concatenate((arr1, arr2), axis=1)

print(arr)

[[1 2 5 6]
 [3 4 7 8]]


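Both arr1 and arr2 have 2 rows and 2 columns (shape=(2,2))

axis=1 → column direction (sideways)

→ rows at the same position are joined side by side
```
[1,2] + [5,6] = [1,2,5,6]
[3,4] + [7,8] = [3,4,7,8]
```

axis=0 → row direction (downward)

→ the rows of arr2, [5,6] and [7,8], are placed below the rows of arr1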
```
[[1 2]
 [3 4]
 [5 6]
 [7 8]]
 ```
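
Comparing the result shapes is a quick way to keep the two directions apart. A minimal sketch; the names `down` and `across` are only illustrative.

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

down = np.concatenate((arr1, arr2), axis=0)
across = np.concatenate((arr1, arr2), axis=1)

print(down.shape)    # (4, 2): row count grows
print(across.shape)  # (2, 4): column count grows
```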

## Joining Arrays Using Stack Functions `.stack()`
Stacking is same as concatenation, the only difference is that **stacking is done along a new axis**

We can concatenate two 1-D arrays along the second axis which would result in putting them one over the other, ie. stacking.

We pass a sequence of arrays that we want to join to the stack() method along with the axis. If axis is not explicitly passed it is taken as 0.

In [83]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.stack((arr1, arr2), axis=1) # each input array becomes a column

print(arr)

[[1 4]
 [2 5]
 [3 6]]


## Stacking Along Rows `hstack()`
NumPy provides a helper function: `hstack()` to stack along rows.

In [84]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.hstack((arr1, arr2))

print(arr)

[1 2 3 4 5 6]


## Stacking Along Columns `vstack()`
NumPy provides a helper function: `vstack()` to stack along columns.

In [85]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.vstack((arr1, arr2))

print(arr)

[[1 2 3]
 [4 5 6]]


## Stacking Along Height (depth) `dstack()`
NumPy provides a helper function: dstack() to **stack along height**, which is the same as depth.

In [86]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.dstack((arr1, arr2))

print(arr)

[[[1 4]
  [2 5]
  [3 6]]]


### My confusion: what is the difference between dstack and stack?
With `np.stack`, the result changes completely depending on the axis

In [88]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

Here axis is not specified, so the arrays are stacked along the default axis=0 and the shape comes out as (2,3).

In [89]:
arr = np.stack((arr1, arr2)) # default axis=0: each input array becomes a row
print(arr)
print(arr.shape)

[[1 2 3]
 [4 5 6]]
(2, 3)


If we set axis=1 instead:

In [91]:
arr = np.stack((arr1, arr2), axis=1) # set axis=1
print(arr)
print(arr.shape)

[[1 4]
 [2 5]
 [3 6]]
(3, 2)


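In contrast, look closely at dstack and you can see depth in the output: the result is 3-D.
* stack → general-purpose function that stacks along whichever new axis you choose
* dstack → special-purpose function that always joins along the depth direction (axis=2); 1-D inputs are first promoted to shape (1, N, 1)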

In [92]:
arr_dstack = np.dstack((arr1, arr2))
print(arr_dstack)
print(arr_dstack.shape)

[[[1 4]
  [2 5]
  [3 6]]]
(1, 3, 2)


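## Summary

| Function    | Dimension change                   | `axis` default | Notes                                                                      |
| ----------- | ---------------------------------- | -------------- | -------------------------------------------------------------------------- |
| `np.stack`  | Creates a new axis                 | 0              | Stacks along any chosen new axis; result has one more dimension than the inputs |
| `np.dstack` | Always joins along axis=2 (depth)  | N/A (fixed)    | Result is at least 3-D; usable like image channels                         |
| `np.hstack` | Joins horizontally (axis=1)        | N/A            | 1-D → 1-D (joined end to end); 2-D → joined side by side by columns        |
| `np.vstack` | Joins vertically (axis=0)          | N/A            | 1-D → stacked into 2-D; 2-D → appended below by rows                       |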
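
The table can be checked by printing the result shapes for both 1-D and 2-D inputs. A minimal sketch; the input names `a`, `b`, `m`, `n` are only illustrative.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
m = np.array([[1, 2], [3, 4]])
n = np.array([[5, 6], [7, 8]])

for name, func in [('stack', np.stack), ('hstack', np.hstack),
                   ('vstack', np.vstack), ('dstack', np.dstack)]:
    print(name, func((a, b)).shape, func((m, n)).shape)

# stack  (2, 3)     (2, 2, 2)
# hstack (6,)       (2, 4)
# vstack (2, 3)     (4, 2)
# dstack (1, 3, 2)  (2, 2, 2)
```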


---

# Splitting Array

## Splitting NumPy Arrays `.array_split()`
Splitting is **reverse operation of Joining**

Joining merges multiple arrays into one and Splitting breaks one array into multiple.

We use `array_split()` for splitting arrays, **we pass it the array we want to split and the number of splits**

### Example
Split the array in 3 parts:

In [93]:
arr = np.array([1, 2, 3, 4, 5, 6])
newarr = np.array_split(arr, 3)

print(newarr)

[array([1, 2]), array([3, 4]), array([5, 6])]


> **Note: The return value is a list containing three arrays.**

If the array has **fewer elements than required**, it will **adjust from the end accordingly**

In [96]:
arr = np.array([1, 2, 3, 4, 5, 6])
newarr = np.array_split(arr, 4)
print(newarr)

newarr = np.array_split(arr, 7)
print(newarr)

[array([1, 2]), array([3, 4]), array([5]), array([6])]
[array([1]), array([2]), array([3]), array([4]), array([5]), array([6]), array([], dtype=int64)]


> **Note: We also have the method split() available but it will not adjust the elements when elements are less in source array for splitting like in example above, array_split() worked properly but split() would fail.**
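
The difference described in the note can be reproduced with the same six-element array. A minimal sketch; the error from `np.split()` is caught so the script keeps running.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

parts = np.array_split(arr, 4)
print([len(p) for p in parts])  # [2, 2, 1, 1]

try:
    np.split(arr, 4)            # 6 elements cannot be divided into 4 equal parts
except ValueError as err:
    print('ValueError:', err)
```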

## Split Into Arrays

The **return value** of the array_split() method is a **list** containing each of the split as an array.

If you split an array into 3 arrays, you can access them from the result just like any array element:

In [97]:
arr = np.array([1, 2, 3, 4, 5, 6])
newarr = np.array_split(arr, 3)

print(newarr[0])
print(newarr[1])
print(newarr[2])

[1 2]
[3 4]
[5 6]


## Splitting 2-D Arrays

Use the same syntax when splitting 2-D arrays.

Use the array_split() method, **pass in the array you want to split** and **the number of splits you want to do**

In [98]:
arr = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]])
newarr = np.array_split(arr, 3)

print(newarr)

[array([[1, 2],
       [3, 4]]), array([[5, 6],
       [7, 8]]), array([[ 9, 10],
       [11, 12]])]


Above returns three 2-D arrays.

Now, each element in the 2-D arrays contains 3 elements.

In [99]:
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18]])
newarr = np.array_split(arr, 3)

print(newarr)

[array([[1, 2, 3],
       [4, 5, 6]]), array([[ 7,  8,  9],
       [10, 11, 12]]), array([[13, 14, 15],
       [16, 17, 18]])]


### Specify Axis
You can specify which axis to split along.

The example below also returns three 2-D arrays, but they are split along the column (axis=1).

In [100]:
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18]])
newarr = np.array_split(arr, 3, axis=1)

print(newarr)

[array([[ 1],
       [ 4],
       [ 7],
       [10],
       [13],
       [16]]), array([[ 2],
       [ 5],
       [ 8],
       [11],
       [14],
       [17]]), array([[ 3],
       [ 6],
       [ 9],
       [12],
       [15],
       [18]])]


## `hsplit()`
An alternative is `hsplit()`, the opposite of `hstack()`.

In [101]:
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18]])
newarr = np.hsplit(arr, 3)

print(newarr)

[array([[ 1],
       [ 4],
       [ 7],
       [10],
       [13],
       [16]]), array([[ 2],
       [ 5],
       [ 8],
       [11],
       [14],
       [17]]), array([[ 3],
       [ 6],
       [ 9],
       [12],
       [15],
       [18]])]


> **Note: Similar alternates to vstack() and dstack() are available as vsplit() and dsplit()**

----

# Searching Arrays

## Searching Arrays `where()`
> **Important: `where()` returns the indexes of the elements that satisfy the condition, not the values themselves**

You can search an array for a **certain value**, and **return the indexes** that get a match.

To search an array, **use the where() method**

### Example
Find the indexes where the value is 4:

In [102]:
arr = np.array([1, 2, 3, 4, 5, 4, 4])

In [103]:
x = np.where(arr == 4)
print(x)

(array([3, 5, 6]),)


The example above will return a tuple: `(array([3, 5, 6]),)`
Which means that the value 4 is present at index 3, 5, and 6.
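
Since the result is a tuple holding one index array per dimension, the indexes for a 1-D array sit at position 0. The sketch below unpacks them and uses them to fetch the matching values; the names `result` and `indexes` are only illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 4, 4])

result = np.where(arr == 4)
indexes = result[0]   # the index array for the only dimension

print(indexes)        # [3 5 6]
print(arr[indexes])   # [4 4 4]
```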

### Example
Find the indexes where the values are even:

In [104]:
x = np.where(arr % 2 == 0)
print(x)

(array([1, 3, 5, 6]),)


### Example
Find the indexes where the values are odd

In [105]:
x = np.where(arr % 2 == 1)
print(x)

(array([0, 2, 4]),)


## Search Sorted `searchsorted()`
> **In short: pass the array and the number you want to insert, and the function tells you at which index it should go**

There is a method called searchsorted() which performs a **binary search in the array**, and **returns the index where the specified value would be inserted to maintain the sort order**

The searchsorted() method is assumed to be used on sorted arrays.

In [106]:
arr = np.array([6, 7, 8, 9])

x = np.searchsorted(arr, 7)

print(x)

1


> What if the array is not sorted?

In [113]:
arr = np.array([9, 7, 6, 8])
x = np.searchsorted(arr, 7)

print(x)

3


The answer is meaningless here, so the array must be sorted before use.

In [109]:
arr = np.array([6, 9, 7, 8])
arr.sort()
print(arr)

[6 7 8 9]


So the array has to be sorted first.
Anyway,

The method starts the search from the left and returns the first index where the number 7 is no longer larger than the next value.

## Search From the Right Side `side='right'`
By default the left most index is returned, but we can give side='right' to return the right most index instead.

In [111]:
arr = np.array([6, 7, 8, 9])
x = np.searchsorted(arr, 7, side='right')

print(x)

2


## Multiple Values
To search for more than one value, use an array with the specified values.

In [112]:
arr = np.array([1, 3, 5, 7])
x = np.searchsorted(arr, [2, 4, 6])
print(x)

[1 2 3]


The return value is an array: [1 2 3] containing the three indexes where 2, 4, 6 would be inserted in the original array to maintain the order.
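
To see that the returned index really is an insertion point, the sketch below feeds it to `np.insert()`. The array `[6, 8, 9]` and the names `idx` and `inserted` are chosen only for this illustration.

```python
import numpy as np

arr = np.array([6, 8, 9])

idx = np.searchsorted(arr, 7)
print(idx)                      # 1

inserted = np.insert(arr, idx, 7)
print(inserted)                 # [6 7 8 9]  (still sorted)
```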

---

# Sorting Arrays

## Sorting Arrays `.sort()`
We already used the in-place method `arr.sort()` once above; the examples below use the function `np.sort()`.

Ordered sequence is any sequence that has an order corresponding to elements, like numeric or alphabetical, ascending or descending.

In [116]:
arr = np.array([3, 2, 0, 1])
print(np.sort(arr))

arr = np.array(['banana', 'cherry', 'apple'])
print(np.sort(arr))

arr = np.array([True, False, True])
print(np.sort(arr))

[0 1 2 3]
['apple' 'banana' 'cherry']
[False  True  True]


> **Note: This method returns a copy of the array, leaving the original array unchanged.**
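
The note applies to the function `np.sort()`. The array method `arr.sort()` behaves differently: it sorts in place and returns `None`. A minimal sketch of both:

```python
import numpy as np

arr = np.array([3, 2, 0, 1])

sorted_copy = np.sort(arr)
print(sorted_copy.tolist())  # [0, 1, 2, 3]
print(arr.tolist())          # [3, 2, 0, 1]  (original unchanged)

result = arr.sort()          # in-place sort
print(result)                # None
print(arr.tolist())          # [0, 1, 2, 3]
```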

## Sorting a 2-D Array
If you use the sort() method on a 2-D array, both arrays will be sorted:

In [117]:
arr = np.array([[3, 2, 4], [5, 0, 1]])
print(np.sort(arr))

[[2 3 4]
 [0 1 5]]
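
Each row is sorted separately because `np.sort()` works along the last axis by default. Passing `axis=0` sorts each column instead. A minimal sketch:

```python
import numpy as np

arr = np.array([[3, 2, 4], [5, 0, 1]])

print(np.sort(arr, axis=1).tolist())  # [[2, 3, 4], [0, 1, 5]]  same as the default
print(np.sort(arr, axis=0).tolist())  # [[3, 0, 1], [5, 2, 4]]  each column sorted
```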


### Descending order `np.sort(arr)[::-1]`

In [125]:
arr = np.array([3, 1, 5, 2])
print(arr)
print("=========")

arr = arr[::-1]   # reverse
print(arr)
print("=========")

desc = np.sort(arr)[::-1]
print(desc)   # [5 3 2 1]

[3 1 5 2]
=========
[2 5 1 3]
=========
[5 3 2 1]


----

# Filter Array
## Filtering Arrays
Getting some elements out of an existing array and creating a new array out of them is called filtering.

In NumPy, you `filter` an array **using a boolean index list**

A boolean index list is a list of booleans corresponding to indexes in the array.

**If the value at an index is True** that **element is contained** in the filtered array, if the value at that index is False that element is excluded from the filtered array.

### Example
Create an array from the elements on index 0 and 2:

In [126]:
arr = np.array([41, 42, 43, 44])
x = [True, False, True, False]
newarr = arr[x] # True 인덱스만 뽑아내기
print(newarr)

[41 43]


## Creating the Filter Array
In the example above we hard-coded the True and False values, but the common use is to create a filter array based on conditions.

That sounds fancy, but in short it means you can build the filter yourself with a `for` loop and an `if` statement.

In [127]:
arr = np.array([41, 42, 43, 44])

# Create an empty list
filter_arr = []

# go through each element in arr
for element in arr:
  # if the element is higher than 42, set the value to True, otherwise False:
  if element > 42:
    filter_arr.append(True)
  else:
    filter_arr.append(False)

newarr = arr[filter_arr]

print(filter_arr)
print(newarr)

[False, False, True, True]
[43 44]


### Example
Create a filter array that will return only even elements from the original array:

In [128]:
arr = np.array([1, 2, 3, 4, 5, 6, 7])

# Create an empty list
filter_arr = []

for x in arr :
    if x%2 == 0:
        filter_arr.append(True)
    else :
        filter_arr.append(False)

newarr = arr[filter_arr]
print(newarr)

[2 4 6]


## Creating Filter Directly From Array
The above example is quite a common task in NumPy and NumPy provides a nice way to tackle it.

We can **directly substitute the array** instead of the iterable variable in our condition and it will work just as we expect it to.

### Example
Create a filter array that will return only values higher than 42:

In [129]:
arr = np.array([41, 42, 43, 44])

In [130]:
arr2 = arr > 42
print(arr2)

newarr = arr[arr2]
print(newarr)

[False False  True  True]
[43 44]


### Example
Create a filter array that will return only even elements from the original array:

In [131]:
arr = np.array([41, 42, 43, 44])

In [132]:
arr2 = arr %2 ==0
print(arr2)
newarr = arr[arr2]
print(newarr)

[False  True False  True]
[42 44]
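
Conditions can also be combined into a single filter. The sketch below uses the element-wise operators `&` (and) and `|` (or); each condition needs its own parentheses because these operators bind more tightly than comparisons. The result names are only illustrative.

```python
import numpy as np

arr = np.array([41, 42, 43, 44])

even_and_big = arr[(arr > 41) & (arr % 2 == 0)]
odd_or_44 = arr[(arr % 2 == 1) | (arr == 44)]

print(even_and_big)  # [42 44]
print(odd_or_44)     # [41 43 44]
```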
