<a href="https://colab.research.google.com/github/9-coding/PyTorch/blob/main/05-indexing_slicing_fancyIndexing_booleanMask.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Indexing / Slicing / Fancy Indexing / Boolean Mask

In [1]:
import numpy as np
import torch
import tensorflow as tf

In [2]:
for c in [np, torch, tf]:
  print(c.__name__, c.__version__)

numpy 1.25.2
torch 2.2.1+cu121
tensorflow 2.15.0


## Indexing / slicing

### numpy

- The baseline that the other libraries follow.
- Both `arr[x, y, z]` and `arr[x][y][z]` can be used.

In [3]:
a = np.arange(0, 12).reshape(3,4)

print(a)
print('===================')
print(f'a[0] is "{a[0]}"')
print(f'a[0, 2] is "{a[0, 2]}"')
print(f'a[0][2] is "{a[0][2]}"')
print('-------------------')
print(f'a[1, 2:] is "{a[1,2:]}"')
print(f'a[1, ::2] is "{a[1,::2]}"')
print(f'a[1, ::-2] is "{a[1,::-2]}"')
print(f'a[1, ::-1] is "{a[1,::-1]}"')
print(f'a[1, 3:0:-1] is "{a[1, 3:0:-1]}"')

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
===================
a[0] is "[0 1 2 3]"
a[0, 2] is "2"
a[0][2] is "2"
-------------------
a[1, 2:] is "[6 7]"
a[1, ::2] is "[4 6]"
a[1, ::-2] is "[7 5]"
a[1, ::-1] is "[7 6 5 4]"
a[1, 3:0:-1] is "[7 6 5]"
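The two spellings agree when reading a single element, but they are not interchangeable in every situation. The sketch below (variable names are illustrative, not from the notebook) shows that `a[x][y]` is two separate indexing steps, which matters once the first step returns a copy instead of a view.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# Reading a single element: both spellings return the same value.
same_value = a[0, 2] == a[0][2]

# a[0] is a view of the first row, so writing through it changes a.
a[0][2] = -1

# Indexing with a list returns a copy, so a chained write is lost.
b = np.arange(12).reshape(3, 4)
b[[0, 1]][:, 0] = 99              # writes into a temporary copy, not into b
chained_changed = bool((b[:2, 0] == 99).all())

b[[0, 1], 0] = 99                 # one indexing expression writes into b
direct_changed = bool((b[:2, 0] == 99).all())

print(same_value, a[0, 2], chained_changed, direct_changed)
# True -1 False True
```

For assignment, the single-bracket form `arr[x, y, z]` is therefore the safer habit.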


### pytorch

- Negative steps are not supported (slicing with a negative step raises an error).

In [4]:
a = np.arange(0, 12).reshape(3,4)
a_torch = torch.tensor(a)

print(a_torch)
print('===================')
print(f'a_torch[0] is "{a_torch[0]}"')
print(f'a_torch[0, 2] is "{a_torch[0, 2]}"')
print(f'a_torch[0][2] is "{a_torch[0][2]}"')
print('-------------------')
print(f'a_torch[1, 2:] is "{a_torch[1,2:]}"')
print(f'a_torch[1, ::2] is "{a_torch[1,::2]}"')

# Negative steps do not work in PyTorch, so the lines below are commented out.
# print(f'a_torch[1, ::-2] is "{a_torch[1,::-2]}"')
# print(f'a_torch[1, ::-1] is "{a_torch[1,::-1]}"')
# print(f'a_torch[1, 3:0:-1] is "{a_torch[1, 3:0:-1]}"')

tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])
===================
a_torch[0] is "tensor([0, 1, 2, 3])"
a_torch[0, 2] is "2"
a_torch[0][2] is "2"
-------------------
a_torch[1, 2:] is "tensor([6, 7])"
a_torch[1, ::2] is "tensor([4, 6])"
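Since the negative-step slices are commented out above, here is one possible workaround, sketched with `torch.flip`. This is not part of the original notebook. The exact exception type raised by a negative step may differ between PyTorch versions, so the sketch catches exceptions broadly.

```python
import torch

a_torch = torch.arange(12).reshape(3, 4)

# A negative step is rejected by PyTorch's slicing.
try:
    a_torch[1, ::-1]
    negative_step_ok = True
except Exception:
    negative_step_ok = False

# Reverse first with torch.flip, then slice with a positive step.
reversed_row = torch.flip(a_torch[1], dims=[0])           # like a[1, ::-1]
every_other = torch.flip(a_torch[1], dims=[0])[::2]       # like a[1, ::-2]
partial = torch.flip(a_torch[1, 1:4], dims=[0])           # like a[1, 3:0:-1]

print(reversed_row)   # tensor([7, 6, 5, 4])
print(every_other)    # tensor([7, 5])
print(partial)        # tensor([7, 6, 5])
```

Note that `torch.flip` returns a copy, whereas a reversed numpy slice is a view of the original array.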


### tensorflow

- Similar to numpy, except that tensors are immutable.
- Negative steps work.

In [5]:
a = np.arange(0, 12).reshape(3,4)
a_tf = tf.constant(a)

print(a_tf)
print('===================')
print(f'a_tf[0] is "{a_tf[0]}"')
print(f'a_tf[0, 2] is "{a_tf[0, 2]}"')
print(f'a_tf[0][2] is "{a_tf[0][2]}"')
print('-------------------')
print(f'a_tf[1, 2:] is "{a_tf[1,2:]}"')
print(f'a_tf[1, ::2] is "{a_tf[1,::2]}"')
print(f'a_tf[1, ::-2] is "{a_tf[1,::-2]}"')
print(f'a_tf[1, ::-1] is "{a_tf[1,::-1]}"')
print(f'a_tf[1, 3:0:-1] is "{a_tf[1, 3:0:-1]}"')

tf.Tensor(
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]], shape=(3, 4), dtype=int64)
===================
a_tf[0] is "[0 1 2 3]"
a_tf[0, 2] is "2"
a_tf[0][2] is "2"
-------------------
a_tf[1, 2:] is "[6 7]"
a_tf[1, ::2] is "[4 6]"
a_tf[1, ::-2] is "[7 5]"
a_tf[1, ::-1] is "[7 6 5 4]"
a_tf[1, 3:0:-1] is "[7 6 5]"


## Fancy indexing

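**Select multiple elements at once by placing an index tensor inside [ ]**

- The index array is a sequence-type instance made of integers, each giving an index along one axis.
- tensorflow does not support fancy indexing directly, but provides similar functionality through `tf.gather` and `tf.gather_nd`.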

### 1d tensor

In [7]:
x = np.array([10.,20.,30.,40.,50.])
x_torch = torch.tensor([10.,20.,30.,40.,50.])
x_tf = tf.constant([10.,20.,30.,40.,50.])

f_indices = [3, 4, 1]

print('original:')
print(x)
print('----------')
print('numpy:')
print(x[f_indices])
print('----------')
print('torch:')
print(x_torch[f_indices])
print('----------')
print('tensorflow:')
print(tf.gather(x_tf,f_indices)) # use gather for 1D, gather_nd for 2D and above
print(tf.gather_nd(x_tf, [ i for i in zip(f_indices,)])) # if you insist on gather_nd in 1D, write it like this

original:
[10. 20. 30. 40. 50.]
----------
numpy:
[40. 50. 20.]
----------
torch:
tensor([40., 50., 20.])
----------
tensorflow:
tf.Tensor([40. 50. 20.], shape=(3,), dtype=float32)
tf.Tensor([40. 50. 20.], shape=(3,), dtype=float32)
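A few properties of 1D fancy indexing are easy to miss from the output above. The following numpy-only sketch (names are illustrative) shows that indices may repeat, may be negative, and that the result is a copy, not a view.

```python
import numpy as np

x = np.array([10., 20., 30., 40., 50.])

repeated = x[[0, 0, 4]]          # the same index may appear more than once
from_end = x[[-1, -2]]           # negative indices count from the end
taken = np.take(x, [3, 4, 1])    # function form of x[[3, 4, 1]]

print(repeated)   # [10. 10. 50.]
print(from_end)   # [50. 40.]
print(taken)      # [40. 50. 20.]

# The result is a copy: modifying it leaves x untouched.
repeated[0] = -1.
print(x[0])       # 10.0
```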


### 2d tensor

In [6]:
x = np.arange(5*5).reshape(5,5) * 10
x_torch = torch.arange(5*5).view(size=(5,5)) * 10
x_tf = tf.constant(x)

indices_0 = [0, 1, 2]
indices_1 = [0, 1, 2]

print('original:')
print(x)
print('----------')
print('numpy:')
b = x[indices_0, indices_1]
print('b.shape =',b.shape)
print(b)
print('----------')
print('torch:')
c = x_torch[indices_0, indices_1]
print('c.shape =',c.shape)
print(c)
print('----------')
print('tensorflow:')
d = tf.gather_nd(x_tf, [ i for i in zip(indices_0, indices_1)])
print('d.shape =',d.shape)
print(d)

original:
[[  0  10  20  30  40]
 [ 50  60  70  80  90]
 [100 110 120 130 140]
 [150 160 170 180 190]
 [200 210 220 230 240]]
----------
numpy:
b.shape = (3,)
[  0  60 120]
----------
torch:
c.shape = torch.Size([3])
tensor([  0,  60, 120])
----------
tensorflow:
d.shape = (3,)
tf.Tensor([  0  60 120], shape=(3,), dtype=int64)
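The result has three elements, not a 3x3 block, because the index lists are paired element by element. This is the same pairing that `zip` produces for `tf.gather_nd`. A small numpy sketch of the difference, using `np.ix_` for the block case (names are illustrative):

```python
import numpy as np

x = np.arange(5 * 5).reshape(5, 5) * 10
rows = [0, 1, 2]
cols = [0, 1, 2]

# Paired selection: picks (0, 0), (1, 1), (2, 2).
paired = x[rows, cols]
by_loop = np.array([x[r, c] for r, c in zip(rows, cols)])

# Block selection: every row in `rows` combined with every column in `cols`.
block = x[np.ix_(rows, cols)]

print(paired)        # [  0  60 120]
print(block.shape)   # (3, 3)
print(block)
```

When a sub-matrix is what you want, `np.ix_` (or slicing, if the indices are contiguous) is the appropriate tool; plain fancy indexing with two lists gives the paired elements.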


### 3d tensor

In [8]:
x = np.arange(5*5*5).reshape(5,5,5) * 10
x_torch = torch.arange(5*5*5).view(size=(5,5,5)) * 10
x_tf = tf.constant(x)

indices_0 = [0, 1] # x
indices_1 = [1, 2] # y
indices_2 = [2, 0] # z

print('original:')
print(x)
print('----------')
print('numpy:')
b = x[indices_0, indices_1, indices_2]
print('b.shape=',b.shape)
print(b)
print('----------')
print('torch:')
c = x_torch[indices_0, indices_1, indices_2]
print('c.shape=',c.shape)
print(c)
print('----------')
print('tensorflow')
d = tf.gather_nd(x_tf, [ i for i in zip(indices_0, indices_1, indices_2)]) # use gather_nd for multi-dimensional tensors
print('d.shape=',d.shape)
print(d)

original:
[[[   0   10   20   30   40]
  [  50   60   70   80   90]
  [ 100  110  120  130  140]
  [ 150  160  170  180  190]
  [ 200  210  220  230  240]]

 [[ 250  260  270  280  290]
  [ 300  310  320  330  340]
  [ 350  360  370  380  390]
  [ 400  410  420  430  440]
  [ 450  460  470  480  490]]

 [[ 500  510  520  530  540]
  [ 550  560  570  580  590]
  [ 600  610  620  630  640]
  [ 650  660  670  680  690]
  [ 700  710  720  730  740]]

 [[ 750  760  770  780  790]
  [ 800  810  820  830  840]
  [ 850  860  870  880  890]
  [ 900  910  920  930  940]
  [ 950  960  970  980  990]]

 [[1000 1010 1020 1030 1040]
  [1050 1060 1070 1080 1090]
  [1100 1110 1120 1130 1140]
  [1150 1160 1170 1180 1190]
  [1200 1210 1220 1230 1240]]]
----------
numpy:
b.shape= (2,)
[ 70 350]
----------
torch:
c.shape= torch.Size([2])
tensor([ 70, 350])
----------
tensorflow
d.shape= (2,)
tf.Tensor([ 70 350], shape=(2,), dtype=int64)
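The shape of a fancy-indexed result follows the shape of the index arrays, plus any axes that were not indexed. A numpy-only sketch under that reading (the 2D index array `idx` is an illustrative addition):

```python
import numpy as np

x = np.arange(5 * 5 * 5).reshape(5, 5, 5) * 10
i0, i1, i2 = [0, 1], [1, 2], [2, 0]

# Each additional index list consumes one axis.
print(x[i0].shape)           # (2, 5, 5)
print(x[i0, i1].shape)       # (2, 5)
print(x[i0, i1, i2].shape)   # (2,)

# A 2D index array yields a 2D result of the same shape.
idx = np.array([[0, 1], [2, 3]])
grid = x[idx, idx, idx]      # elements (0,0,0), (1,1,1), (2,2,2), (3,3,3)
print(grid.shape)            # (2, 2)
print(grid)
```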


## Boolean Mask

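Select specific elements through a boolean mask tensor that has **the same shape as the target tensor**.

- A **boolean mask can be obtained by applying a comparison operator** to a tensor instance.
- A common usage is to write an expression built from relational (comparison) operators on the tensor, called a ***condition***, **directly inside the square brackets where an index would go**.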

In [9]:
x = np.arange(3*3*3).reshape(3,3,3) * 10

print('original:')
print(x)
print('----------')
print('boolean mask:')
b = x <= 270/2
print(b.shape)
print(b)
print('----------')
print('x <= 135')
print(x[b])
print('----------')
print(x[x<=270/2])
print('----------')
print('----------')
print('x <= 135 | x>= 200')
b1 = b | (x >= 200)
print('----------')
print('boolean mask')
print(b1)
print('----------')
print(x[b1])
print('----------')
print(x[ (x<=270/2) | (x>=200)])

original:
[[[  0  10  20]
  [ 30  40  50]
  [ 60  70  80]]

 [[ 90 100 110]
  [120 130 140]
  [150 160 170]]

 [[180 190 200]
  [210 220 230]
  [240 250 260]]]
----------
boolean mask:
(3, 3, 3)
[[[ True  True  True]
  [ True  True  True]
  [ True  True  True]]

 [[ True  True  True]
  [ True  True False]
  [False False False]]

 [[False False False]
  [False False False]
  [False False False]]]
----------
x <= 135
[  0  10  20  30  40  50  60  70  80  90 100 110 120 130]
----------
[  0  10  20  30  40  50  60  70  80  90 100 110 120 130]
----------
----------
x <= 135 | x>= 200
----------
boolean mask
[[[ True  True  True]
  [ True  True  True]
  [ True  True  True]]

 [[ True  True  True]
  [ True  True False]
  [False False False]]

 [[False False  True]
  [ True  True  True]
  [ True  True  True]]]
----------
[  0  10  20  30  40  50  60  70  80  90 100 110 120 130 200 210 220 230
 240 250 260]
----------
[  0  10  20  30  40  50  60  70  80  90 100 110 120 130 200 210 220 230
 240 250 260]

Example of selecting elements that are at most 135 or at least 200.

In [10]:
x_torch = torch.arange(3*3*3).view(size=(3,3,3)) * 10
print(x_torch[ (x_torch<=270/2) | (x_torch>=200)])

print('--------------')

x_tf = tf.constant(x)
print(x_tf[ (x_tf<= tf.cast(270/2, tf.int64)) | (x_tf>=200)])

tensor([  0,  10,  20,  30,  40,  50,  60,  70,  80,  90, 100, 110, 120, 130,
        200, 210, 220, 230, 240, 250, 260])
--------------
tf.Tensor(
[  0  10  20  30  40  50  60  70  80  90 100 110 120 130 200 210 220 230
 240 250 260], shape=(21,), dtype=int64)
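The conditions above are always wrapped in parentheses. This is required because `|`, `&`, and `~` bind more tightly than comparison operators in Python. The numpy sketch below (names are illustrative) shows what happens without them, and also shows combining masks with `&` and `~` and assigning through a mask.

```python
import numpy as np

x = np.arange(3 * 3 * 3).reshape(3, 3, 3) * 10

# Without parentheses this parses as x <= (135 | x) >= 200,
# a chained comparison that numpy cannot reduce to one boolean.
needs_parentheses = False
try:
    x[x <= 135 | x >= 200]
except ValueError:
    needs_parentheses = True

middle = x[(x > 135) & (x < 200)]         # logical AND of two masks
outside = x[~((x > 135) & (x < 200))]     # logical NOT of the combined mask

# A mask can also be used on the left-hand side of an assignment.
y = x.copy()
y[y >= 200] = -1

print(needs_parentheses)   # True
print(middle)              # [140 150 160 170 180 190]
print(outside.size)        # 21
print((y == -1).sum())     # 7
```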


**Getting the indices of elements that satisfy a condition → np.where**

In [11]:
import numpy as np

r = np.random.default_rng(seed = 23)

a = r.random((3,3,3)).astype(np.float32) * 10.

bm = a>=5.             # boolean mask with the same shape as a
idxs = np.where(bm)    # tuple of index arrays, one per axis

print(a)
print('-----------------')
print(np.array(idxs).shape)
print(idxs)

[[[6.9393306  6.4145823  1.2864423 ]
  [1.1370804  6.5334554  8.534571  ]
  [2.0177913  2.1801863  7.1658463 ]]

 [[4.706997   4.1522193  3.491478  ]
  [0.6385375  4.546662   3.014533  ]
  [3.8907673  5.402978   6.835897  ]]

 [[6.2475243  7.4270444  0.18217355]
  [6.542572   5.420625   8.513411  ]
  [9.390292   0.12823118 8.283283  ]]]
-----------------
(3, 14)
(array([0, 0, 0, 0, 0, 1, 1, 2, 2, 2, 2, 2, 2, 2]), array([0, 0, 1, 1, 2, 2, 2, 0, 0, 1, 1, 1, 2, 2]), array([0, 1, 1, 2, 2, 1, 2, 0, 1, 0, 1, 2, 0, 2]))
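The tuple returned by `np.where` holds one index array per axis, so it can be passed straight back as a fancy index. The sketch below uses a small hand-written array instead of the random one above, so the values are easy to check. It also shows `np.argwhere` and the three-argument form of `np.where` as related options.

```python
import numpy as np

a = np.array([[1., 7., 3.],
              [8., 2., 9.]])

idxs = np.where(a >= 5)              # (row indices, column indices)
selected = a[idxs]                   # same elements as a[a >= 5]
coords = np.argwhere(a >= 5)         # one (row, col) pair per match
replaced = np.where(a >= 5, a, 0.)   # keep matches, fill the rest with 0

print(idxs)       # (array([0, 1, 1]), array([1, 0, 2]))
print(selected)   # [7. 8. 9.]
print(coords)
print(replaced)
```

`np.where(cond)` suits fancy indexing, while `np.argwhere(cond)` is more convenient when iterating over coordinates one match at a time.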
