# Concepts

### What is a deep learning architecture? ★ (interview question)
- A deep learning architecture is a cascade of filter banks, where each filter bank is a group of correlation filters.
  - Cascade: layers connected one after another
 
### Dense Layer
- A collection of neurons
- Each neuron is connected to the neurons of the preceding layer

### MXNet
- Well suited to running on multiple GPUs
- Keras users who need to train on several GPUs often turn to MXNet

# 1. Dense Layers

## 1.1 Shapes of Dense Layers

In [1]:
import tensorflow as tf
from tensorflow.keras.layers import Dense

In [2]:
N, n_feature = 1, 10
X = tf.random.normal(shape=(N, n_feature))  # input setting

n_neuron = 3
dense = Dense(units=n_neuron, activation='sigmoid')  # one dense layer containing 3 neurons
y = dense(X)

W, B = dense.get_weights()

print('===== Input/Weight/Bias =====')
print('X: ', X.shape)
print('W: ', W.shape)
print('B: ', B.shape)
print('y: ', y.shape)

===== Input/Weight/Bias =====
X:  (1, 10)
W:  (10, 3)
B:  (3,)
y:  (1, 3)


In [3]:
N, n_feature = 8, 10
X = tf.random.normal(shape=(N, n_feature))  # input setting

n_neuron = 3
dense = Dense(units=n_neuron, activation='sigmoid')  # one dense layer containing 3 neurons
y = dense(X)

W, B = dense.get_weights()

print('===== Input/Weight/Bias =====')
print('X: ', X.shape)
print('W: ', W.shape)
print('B: ', B.shape)
print('y: ', y.shape)

===== Input/Weight/Bias =====
X:  (8, 10)
W:  (10, 3)
B:  (3,)
y:  (8, 3)
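
The two runs above differ only in `N`, and only the first dimension of `y` changes. A minimal NumPy sketch of why: the bias of shape `(n_neuron,)` is broadcast across all `N` rows, so `W` and `B` never depend on the batch size. The `dense_forward` helper and the random stand-in weights are assumptions for illustration, not what Keras does internally.

```python
import numpy as np

def dense_forward(X, W, B):
    """Sigmoid dense layer: X is (N, n_feature), W is (n_feature, n_neuron), B is (n_neuron,)."""
    z = X @ W + B                      # B is broadcast across the N rows
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_feature, n_neuron = 10, 3
W = rng.normal(size=(n_feature, n_neuron))   # stand-in weights, not Keras' initializer
B = np.zeros(n_neuron)

output_shapes = {}
for N in (1, 8):
    X = rng.normal(size=(N, n_feature))
    output_shapes[N] = dense_forward(X, W, B).shape
    print(N, W.shape, B.shape, output_shapes[N])
```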


## 1.2 Output Calculations

- Verify the computation
- Check that TensorFlow, matrix multiplication, and dot products all give the same result

### 1.2.1 Reference values

In [4]:
import numpy as np
import tensorflow as tf

from tensorflow.math import exp
from tensorflow.linalg import matmul

from tensorflow.keras.layers import Dense

In [13]:
N, n_feature = 4, 10
X = tf.random.normal(shape=(N, n_feature))  # input setting

n_neuron = 3
dense = Dense(units=n_neuron, activation='sigmoid')  # one dense layer containing 3 neurons
y_tf = dense(X)

W, B = dense.get_weights()
print('y (tensorflow): \n', y_tf.numpy())

y (tensorflow): 
 [[0.38559175 0.4296191  0.4367892 ]
 [0.88488495 0.25752503 0.11596313]
 [0.5542643  0.47863567 0.38486636]
 [0.4652395  0.5958901  0.36504048]]


### 1.2.2 Calculate with matrix multiplication

In [20]:
z = matmul(X, W) + B
y_man = 1/(1+exp(-z))

print('y (manual with matrix multiplication): \n', y_man.numpy())

y (manual with matrix multiplication): 
 [[0.38559175 0.4296191  0.43678918]
 [0.8848849  0.257525   0.11596316]
 [0.5542643  0.47863567 0.38486633]
 [0.46523947 0.5958901  0.36504045]]


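### 1.2.3 Calculate with dot products

- A `tf.Tensor` does not support item assignment such as `y_man_vec[0, 0] = 10`
  - Tensors are immutable (in-place writes would get in the way of gradient computation)
  - So a NumPy array is used to collect the results
- First create the container: `y_man_vec`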

In [21]:
y_man_vec = np.zeros(shape=(N, n_neuron))
print(y_man_vec)

for x_idx in range(N):
    x = X[x_idx]
    
    for nu_idx in range(n_neuron):
        w, b = W[:, nu_idx], B[nu_idx]
        
        z = tf.reduce_sum(x*w) + b
        a = 1/(1 + np.exp(-z))
        y_man_vec[x_idx, nu_idx] = a

print('y (manual with dot products): \n', y_man_vec)

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
y (manual with dot products): 
 [[0.38559174 0.4296191  0.43678921]
 [0.88488491 0.25752506 0.11596314]
 [0.55426431 0.47863564 0.38486636]
 [0.46523948 0.5958901  0.36504048]]
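
The three results above agree only up to float rounding in the last digits, so comparing them by eye is fragile. A minimal sketch of a programmatic check with `np.allclose`, using random NumPy stand-ins for `X`, `W`, and `B` (an assumption, so it runs without TensorFlow):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
N, n_feature, n_neuron = 4, 10, 3
X = rng.normal(size=(N, n_feature))
W = rng.normal(size=(n_feature, n_neuron))   # stand-in for dense.get_weights()[0]
B = rng.normal(size=n_neuron)                # stand-in for dense.get_weights()[1]

# Whole batch at once with a matrix multiplication
y_matmul = sigmoid(X @ W + B)

# One (sample, neuron) pair at a time with dot products
y_dot = np.zeros((N, n_neuron))
for x_idx in range(N):
    for nu_idx in range(n_neuron):
        y_dot[x_idx, nu_idx] = sigmoid(np.dot(X[x_idx], W[:, nu_idx]) + B[nu_idx])

same = np.allclose(y_matmul, y_dot)   # tolerant comparison instead of exact equality
print(same)
```

With the notebook's own variables, the same idea would presumably be `np.allclose(y_tf.numpy(), y_man.numpy())`.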


# 2. Cascaded Dense Layers

## 2.1 Shapes of Cascaded Dense Layers

In [25]:
import tensorflow as tf
from tensorflow.keras.layers import Dense

# N : minibatch size
N, n_feature = 4, 10
X = tf.random.normal(shape=(N, n_feature))

n_neurons = [3, 5]
dense1 = Dense(units=n_neurons[0], activation='sigmoid')   # first dense layer
dense2 = Dense(units=n_neurons[1], activation='sigmoid')   # second dense layer

# Forward Propagation
A1 = dense1(X)
y = dense2(A1)

# get weight/bias
W1, B1 = dense1.get_weights()
W2, B2 = dense2.get_weights()

print('X: {}\n'.format(X.shape))

print('W1: ', W1.shape)
print('B1: ', B1.shape)
print('A1: {}\n'.format(A1.shape))  # minibatch × neuron

print('W2: ', W2.shape)
print('B2: ', B2.shape)
print('y: {}\n'.format(y.shape))    # minibatch × neuron

X: (4, 10)

W1:  (10, 3)
B1:  (3,)
A1: (4, 3)

W2:  (3, 5)
B2:  (5,)
y: (4, 5)
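
The shapes printed above follow one rule: each layer's input width is the previous layer's output width. A small pure-Python sketch of that bookkeeping (the `cascade_shapes` helper is hypothetical, not part of TensorFlow):

```python
def cascade_shapes(batch_size, n_feature, n_neurons):
    """Return (W shape, B shape, output shape) for each layer of a dense cascade."""
    shapes = []
    n_in = n_feature
    for n_out in n_neurons:
        shapes.append(((n_in, n_out), (n_out,), (batch_size, n_out)))
        n_in = n_out   # the next layer's input width is this layer's output width
    return shapes

for w_shape, b_shape, out_shape in cascade_shapes(4, 10, [3, 5]):
    print(w_shape, b_shape, out_shape)
```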



## 2.2 Dense Layers with Python List

- TensorFlow is flexible, so the same thing can be implemented in many ways
  - Ordinary Python syntax works as-is
- With many dense layers, a list and a for loop keep the code compact
  - `dense1 = Dense(units=n_neurons[0], activation='sigmoid')   # first dense layer`
  - `dense2 = Dense(units=n_neurons[1], activation='sigmoid')   # second dense layer`
  - Writing ten of these by hand makes for very inefficient code

In [31]:
import tensorflow as tf
from tensorflow.keras.layers import Dense

# N : minibatch size
N, n_feature = 4, 10
X = tf.random.normal(shape=(N, n_feature))

n_neurons = [10,20,30,40,50,60,70,80,90,100]

# create 10 dense layer objects inside dense_layers
dense_layers = list()
for n_neuron in n_neurons:
    dense = Dense(units=n_neuron, activation='relu')
    dense_layers.append(dense)

print('Input: ', X.shape)
for idx, dense in enumerate(dense_layers):
    X = dense(X)
    print('After dense layer ', idx+1, ' ', X.shape)

y = X

Input:  (4, 10)
After dense layer  1   (4, 10)
After dense layer  2   (4, 20)
After dense layer  3   (4, 30)
After dense layer  4   (4, 40)
After dense layer  5   (4, 50)
After dense layer  6   (4, 60)
After dense layer  7   (4, 70)
After dense layer  8   (4, 80)
After dense layer  9   (4, 90)
After dense layer  10   (4, 100)


## 2.3 Output Calculations

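- Follow TensorFlow's computation as the input passes through each layer
- `tf.identity(X)`: keeps a copy of the original X
  - `X_cp = X` only binds a second name to the same tensor object. Because tensors are immutable and `X = dense(X)` rebinds `X` to a new tensor, the alias would still hold the original values here, but `tf.identity` returns a new tensor with the same contents and makes the intent to copy explicit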
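
The difference between an alias and a copy only shows up under in-place mutation. NumPy arrays, unlike tensors, can be mutated in place, so this short sketch uses them to show both cases (the variable names are illustrative):

```python
import numpy as np

a = np.zeros(3)
alias = a             # second name for the same array object
snapshot = a.copy()   # independent copy

a += 1                # in-place update: visible through `alias`, not through `snapshot`
after_inplace = (alias.tolist(), snapshot.tolist())

a = a + 1             # rebinding: `a` now names a new array, `alias` keeps the old one
after_rebind = (a.tolist(), alias.tolist())

print(after_inplace)
print(after_rebind)
```

The loop `X = dense(X)` in the cell below is the rebinding case.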

In [36]:
import tensorflow as tf

from tensorflow.math import exp
from tensorflow.linalg import matmul
from tensorflow.keras.layers import Dense

# N : minibatch size
N, n_feature = 4, 10
X = tf.random.normal(shape=(N, n_feature))
X_cp = tf.identity(X)

n_neurons = [3,4,5]

dense_layers = list()
for n_neuron in n_neurons:
    dense = Dense(units=n_neuron, activation='sigmoid')
    dense_layers.append(dense)

# forward propagation (Tensorflow)
W, B = list(), list()
for idx, dense in enumerate(dense_layers):
    X = dense(X)
    
    w, b = dense.get_weights()    
    W.append(w)
    B.append(b)

print('y (Tensorflow): \n', X.numpy())

# forward propagation (Manual)
for layer_idx in range(len(n_neurons)):
    w, b = W[layer_idx], B[layer_idx]
    
    X_cp = matmul(X_cp, w) + b
    X_cp = 1/(1+exp(-X_cp))
    
print('y (Manual): \n', X_cp.numpy())

y (Tensorflow): 
 [[0.33424628 0.4058136  0.48894918 0.4621372  0.57884294]
 [0.36388016 0.38423467 0.4965263  0.46321887 0.5773694 ]
 [0.36786014 0.35406166 0.4990071  0.4631284  0.5875218 ]
 [0.34541464 0.3835951  0.49541283 0.45196107 0.5769063 ]]
y (Manual): 
 [[0.3342463  0.4058136  0.4889492  0.4621372  0.57884294]
 [0.36388016 0.38423467 0.49652627 0.46321884 0.5773694 ]
 [0.36786014 0.35406163 0.4990071  0.46312836 0.5875218 ]
 [0.34541464 0.3835951  0.49541286 0.45196107 0.5769063 ]]


# 3. Model Implementation

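Sequential vs Model-subclassing
- Both work
  - For a simple model like the ones below, there is little visible difference
  - Only the way forward propagation is written changes slightly
- ★ It is worth getting more comfortable with Model-subclassing than with Sequential
  - That makes it possible to build more complex models and to inspect the values of interest more flexibly
- Later on, the two are sometimes mixed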

## 3.1 Model Implementation with Sequential Method

In [38]:
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

# Sequential(): creates an object that holds layers (like an empty frame)
model = Sequential()
model.add(Dense(units=10, activation='sigmoid'))  # 10 neurons in this layer
model.add(Dense(units=20, activation='sigmoid'))

The two snippets below are fundamentally the same
- However, to use the other features TensorFlow provides later on, Method 2 is the better choice

```python
# Method 1
model = list()
for n_neuron in n_neurons:
    model.append(Dense(units=n_neuron, activation='sigmoid'))

# Method 2
model = Sequential()
for n_neuron in n_neurons:
    model.add(Dense(units=n_neuron, activation='sigmoid'))
```

## 3.2 Model Implementation with Model-subclassing

- Create a new model by inheriting from the `Model` class that TensorFlow provides

In [39]:
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Model

class TestModel(Model):
    def __init__(self):
        super(TestModel, self).__init__()   # required: initializes the Keras Model base class
        
        # acts as the blueprint: declare the layers here
        self.dense1 = Dense(units=10, activation='sigmoid')
        self.dense2 = Dense(units=20, activation='sigmoid')
        
# create the model
model = TestModel()

In [41]:
print(model.dense1)
print(model.dense2)

<keras.layers.core.dense.Dense object at 0x000001C3C37EFBE0>
<keras.layers.core.dense.Dense object at 0x000001C3C3821670>


## 3.3 Forward Propagation of Models

In [42]:
import tensorflow as tf

from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from tensorflow.keras.models import Model

In [43]:
X = tf.random.normal(shape=(4,10))

### 3.3.1 sequential method

- Very simple to use, since TensorFlow provides this method with beginners in mind

In [44]:
model = Sequential()
model.add(Dense(units=10, activation='sigmoid'))
model.add(Dense(units=20, activation='sigmoid'))

`model(X)`: takes X as input and passes it through the dense layers in order, automatically

In [45]:
y = model(X)
print(y.numpy())  # 4 × 20

[[0.2717339  0.5779645  0.60417205 0.60902226 0.61334085 0.45653743
  0.6171141  0.47270983 0.507848   0.3616433  0.6196947  0.40714997
  0.4430326  0.48904005 0.5354475  0.38162935 0.44112837 0.61389124
  0.3919639  0.4449017 ]
 [0.32069117 0.5001586  0.6161568  0.6949515  0.668616   0.51909095
  0.62964934 0.4325876  0.49256212 0.41801545 0.5936141  0.4002023
  0.5416438  0.45583063 0.49197537 0.46065876 0.42061245 0.5914731
  0.33512124 0.3663872 ]
 [0.36624497 0.49199957 0.55439436 0.6422566  0.5915066  0.4484036
  0.6504155  0.4440811  0.49958375 0.55257857 0.61358964 0.38130254
  0.48231816 0.59819376 0.4823711  0.5699385  0.42495924 0.5741213
  0.28030485 0.40708756]
 [0.2942353  0.57161605 0.57103866 0.60114574 0.5808535  0.39764562
  0.6613636  0.46452296 0.52804756 0.42583257 0.6437923  0.3974021
  0.43033716 0.5687699  0.527315   0.45365173 0.46001095 0.5879725
  0.34410328 0.43245268]]


### 3.3.2 model-subclassing method

<b>Basic code</b>

In [47]:
class TestModel(Model):
    def __init__(self):
        super(TestModel, self).__init__()   # required: initializes the Keras Model base class
        
        self.dense1 = Dense(units=10, activation='sigmoid')
        self.dense2 = Dense(units=20, activation='sigmoid')
        
    # call: a Keras convention, not a Python special method; Model.__call__ invokes it
    def call(self, x):
        x = self.dense1(x)
        x = self.dense2(x)
        return x

model = TestModel()
y = model(X)         # model(X) goes through Python's __call__, which Keras routes to call()
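
Why `model(X)` ends up in `call` can be sketched in plain Python. `TinyModel` below is a deliberately simplified stand-in and not the real Keras implementation, whose `__call__` does additional work around the forward pass (building weights, for example):

```python
class TinyModel:
    """Simplified stand-in for keras.Model; not the real implementation."""

    def __call__(self, x):
        # Python routes `obj(x)` here; the base class then delegates to `call`.
        return self.call(x)

    def call(self, x):
        raise NotImplementedError("subclasses define the forward pass here")


class Doubler(TinyModel):
    def call(self, x):
        # The subclass only overrides `call`, never `__call__`.
        return 2 * x


model = Doubler()
result = model(21)   # same calling style as `y = model(X)`
print(result)
```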

<b>Building many layers with a for loop for more compact code</b>

In [None]:
class TestModel(Model):
    def __init__(self, n_neurons):
        super(TestModel, self).__init__()   # required: initializes the Keras Model base class
        
        self.n_neurons = n_neurons
               
        self.dense_layers = list() 
        for n_neuron in self.n_neurons:
            self.dense_layers.append(Dense(units=n_neuron, activation='sigmoid'))
        
    def call(self, x):
        for dense in self.dense_layers:
            x = dense(x)
        return x

n_neurons = [3,4,5]
model = TestModel(n_neurons)
y = model(X)

<b>Combining a Sequential model with Model-subclassing</b>

In [48]:
class TestModel(Model):
    def __init__(self):
        super(TestModel, self).__init__()   # required: initializes the Keras Model base class
        
        self.model = Sequential()
        self.model.add(Dense(units=10, activation='sigmoid'))
        self.model.add(Dense(units=20, activation='sigmoid'))
        
    def call(self, x):
        x = self.model(x)
        return x

model = TestModel()
y = model(X)

## 3.4 Layers in Models

In [49]:
import tensorflow as tf

from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

In [53]:
X = tf.random.normal(shape=(4,10))

model = Sequential()
model.add(Dense(units=10, activation='sigmoid'))
model.add(Dense(units=20, activation='sigmoid'))

y = model(X)  # forward propagation

print(type(model.layers))
print(model.layers)    # contains two objects (the 2 dense layers)

<class 'list'>
[<keras.layers.core.dense.Dense object at 0x000001C3C3C067F0>, <keras.layers.core.dense.Dense object at 0x000001C3C3AE0040>]


★ Inspect what a dense layer offers
- e.g. get_weights

In [55]:
dense1 = model.layers[0]
for tmp in dir(dense1):
    print(tmp)

_TF_MODULE_IGNORED_PROPERTIES
__call__
__class__
__delattr__
__dict__
__dir__
__doc__
__eq__
__format__
__ge__
__getattribute__
__getstate__
__gt__
__hash__
__init__
__init_subclass__
__le__
__lt__
__module__
__ne__
__new__
__reduce__
__reduce_ex__
__repr__
__setattr__
__setstate__
__sizeof__
__str__
__subclasshook__
__weakref__
_activity_regularizer
_add_trackable
_add_trackable_child
_add_variable_with_custom_getter
_auto_track_sub_layers
_autocast
_autographed_call
_build_input_shape
_call_accepts_kwargs
_call_arg_was_passed
_call_fn_arg_defaults
_call_fn_arg_positions
_call_fn_args
_call_full_argspec
_callable_losses
_cast_single_input
_checkpoint_dependencies
_clear_losses
_compute_dtype
_compute_dtype_object
_dedup_weights
_default_training_arg
_deferred_dependencies
_delete_tracking
_deserialization_dependencies
_deserialize_from_proto
_dtype
_dtype_policy
_dynamic
_eager_losses
_expects_mask_arg
_expects_training_arg
_flatten
_flatten_layers
_flatten_modules
_functional_constru
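
Most of the `dir()` output above is private or dunder names. One way to narrow it down is to keep only names that do not start with an underscore. The `Example` class below is a hypothetical stand-in for a layer object, so the sketch runs without TensorFlow:

```python
class Example:
    """Hypothetical object standing in for a Keras layer."""

    def __init__(self):
        self.units = 3
        self._private_state = None

    def get_weights(self):
        return []


def public_names(obj):
    """Names from dir(obj) that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith('_')]


names = public_names(Example())
print(names)
```

Applied to the notebook's layer, this would be `public_names(dense1)`.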

They can be used like this

In [56]:
for layer in model.layers:
    w, b = layer.get_weights()
    print(w.shape, b.shape)

(10, 10) (10,)
(10, 20) (20,)


## 3.5 Trainable Variables in Models

In [58]:
import tensorflow as tf

from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

X = tf.random.normal(shape=(4,10))

model = Sequential()
model.add(Dense(units=10, activation='sigmoid'))
model.add(Dense(units=20, activation='sigmoid'))

y = model(X)

# list type
# holds the variables being trained, such as weights and biases (4 in total)
print(type(model.trainable_variables))
print(len(model.trainable_variables))

<class 'list'>
4


Access them one by one

In [59]:
for train_var in model.trainable_variables:
    print(type(train_var))

<class 'tensorflow.python.ops.resource_variable_ops.ResourceVariable'>
<class 'tensorflow.python.ops.resource_variable_ops.ResourceVariable'>
<class 'tensorflow.python.ops.resource_variable_ops.ResourceVariable'>
<class 'tensorflow.python.ops.resource_variable_ops.ResourceVariable'>


The shapes below are those of the weights and biases

In [60]:
for train_var in model.trainable_variables:
    print(train_var.shape)

(10, 10)
(10,)
(10, 20)
(20,)
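
The four shapes above are one kernel and one bias per dense layer. A pure-Python sketch that reproduces them and totals the parameter count, assuming every layer keeps its bias (the Keras `Dense` default); both helper functions are hypothetical:

```python
def trainable_shapes(n_feature, n_neurons):
    """Kernel and bias shapes for a cascade of dense layers, in layer order."""
    shapes = []
    n_in = n_feature
    for n_out in n_neurons:
        shapes.append((n_in, n_out))   # kernel (weight matrix)
        shapes.append((n_out,))        # bias
        n_in = n_out
    return shapes


def count_params(shapes):
    """Total number of scalar parameters across all shapes."""
    total = 0
    for shape in shapes:
        size = 1
        for dim in shape:
            size *= dim
        total += size
    return total


shapes = trainable_shapes(10, [10, 20])
total = count_params(shapes)   # 100 + 10 + 200 + 20
print(shapes, total)
```

On a built Keras model, `model.count_params()` should report the same total.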
