# Estimator(Custom)
![image](https://www.tensorflow.org/images/tensorflow_programming_environment.png)
![image2](https://www.tensorflow.org/images/custom_estimators/estimator_types.png)
- TensorFlow high-level API
- [Official TensorFlow documentation](https://www.tensorflow.org/get_started/custom_estimators)
- Besides the pre-made models, you can also build an Estimator with a custom model
- You do not need to manage a tf.Session yourself, and ```tf.global_variables_initializer()``` and ```tf.local_variables_initializer()``` are not needed either
- This post covers custom Estimators

## Components
- ```input_fn()```: returns the features and labels; the features must be a dict!
- ```model_fn(features, labels, mode)```: branches on mode => train produces the loss and train op, evaluate produces predictions and accuracy, predict produces the probability and class
- ```est = tf.estimator.Estimator(model_fn)```
    - ```est.train(input_fn, steps=500)```
    - ```est.evaluate(input_fn, steps=10)```
    - ```est.predict(input_fn=tf.estimator.inputs.numpy_input_fn({'feature': data}, shuffle=False))```
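
The per-mode branching described above can be sketched without TensorFlow. This is only a toy illustration: `toy_model_fn`, the thresholded-mean "model", and the returned dict keys are hypothetical stand-ins, not TensorFlow API. The three mode strings are the values of `tf.estimator.ModeKeys`.

```python
# Toy stand-in for the three-way branching inside a model_fn.
# `toy_model_fn` and the returned dicts are hypothetical; a real model_fn
# returns tf.estimator.EstimatorSpec objects instead.

TRAIN, EVAL, PREDICT = "train", "eval", "infer"  # values of tf.estimator.ModeKeys


def toy_model_fn(features, labels, mode):
    """Return what each mode is expected to produce, as a plain dict."""
    # A fixed "model": the mean of each feature row, thresholded at 5.
    scores = [sum(row) / len(row) for row in features["feature"]]
    classes = [1.0 if s > 5 else 0.0 for s in scores]

    if mode == PREDICT:
        # Labels are not available at predict time.
        return {"predictions": {"score": scores, "class": classes}}

    # Both TRAIN and EVAL compute a loss against the labels.
    loss = sum((c - y) ** 2 for c, y in zip(classes, labels)) / len(labels)
    if mode == TRAIN:
        return {"loss": loss, "train_op": "placeholder for an optimizer step"}
    if mode == EVAL:
        correct = sum(c == y for c, y in zip(classes, labels))
        return {"loss": loss, "eval_metric_ops": {"acc": correct / len(labels)}}
    raise ValueError("unknown mode: {}".format(mode))


batch = {"feature": [[1, 2, 3], [7, 8, 9]]}
print(toy_model_fn(batch, [0.0, 1.0], EVAL))
print(toy_model_fn(batch, None, PREDICT))
```

The point of the sketch is that one function builds the shared part once and then returns only what the requested mode needs, which is why the features dict must use the same key in every mode.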

## References
- [이찬우's YouTube video](https://www.youtube.com/watch?v=4vJ_2NtsTVg&list=PL1H8jIvbSo1piZJRnp9bIww8Fp2ddIpeR&index=4)

---

In [43]:
import tensorflow as tf
import numpy as np


# Set the values below differently for each task
BATCH_SIZE = 100
n_hidden = 9
n_input = 1

## input_function

In [None]:
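def input_fn():
    '''
    Load the data and return the features and labels.
    The features go into a dict so that the same key can also be used at predict time.
    '''
    def map_function(record):
        # Placeholder for per-record preprocessing (identity for now)
        return record

    # "path" is a placeholder. TextLineDataset fits the CSV parsing below;
    # TFRecordDataset would need a different parser.
    dataset = (tf.data.TextLineDataset("path")
               .map(map_function)
               .batch(BATCH_SIZE)
               .repeat(9999))  # repeating is useful when the dataset is small
    lines = dataset.make_one_shot_iterator().get_next()

    # Assumption: each line holds 1 label followed by 9 numeric features
    lines = tf.decode_csv(lines, record_defaults=[[0.0]] * 10)
    feature = tf.stack(lines[1:], axis=1)
    label = tf.expand_dims(lines[0], axis=-1)

    feature = tf.cast(feature, tf.float32)
    label = tf.cast(label, tf.float32)

    return {'feature': feature}, label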
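
The shape handling at the end of `input_fn` can be checked with NumPy alone. This is a minimal sketch, assuming a hypothetical batch of 4 rows with 1 label column followed by 9 feature columns; `np.stack` and `np.expand_dims` stand in for `tf.stack` and `tf.expand_dims`.

```python
import numpy as np

# Hypothetical batch of 4 CSV rows: 1 label column followed by 9 feature columns.
rows = np.arange(40, dtype=np.float32).reshape(4, 10)

# tf.decode_csv yields one 1-D tensor per column; mimic that with a list of columns.
columns = [rows[:, i] for i in range(rows.shape[1])]

feature = np.stack(columns[1:], axis=1)      # (batch, 9): columns become the last axis
label = np.expand_dims(columns[0], axis=-1)  # (batch, 1): matches the single output unit

print(feature.shape, label.shape)
```

Stacking along `axis=1` restores the row-major layout, and the extra label axis keeps the labels aligned with the `(batch, 1)` logits that the model produces.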

## model_function

In [29]:
def model_fn(features, labels, mode, params={}):
    '''
    Branches on mode => train returns the loss and train op, evaluate returns predictions and accuracy.
    params holds the hyperparameters and varies with the task.
    '''
    
    TRAIN = mode == tf.estimator.ModeKeys.TRAIN
    EVAL = mode == tf.estimator.ModeKeys.EVAL
    PRED = mode == tf.estimator.ModeKeys.PREDICT
    
    def layer_function(features):
        '''
        Use this pattern when you want to give the layer a scope name;
        for a simple architecture, tf.layers.dense and similar calls alone are enough.
        '''
        with tf.variable_scope("scope-name"):
            layer = tf.layers.dense(features["feature"], units=n_hidden, activation=tf.nn.relu)
        return layer
    
    layer1 = layer_function(features)
    # Layer names are optional, but each one must be unique within the graph
    layer2 = tf.layers.dense(layer1, units=n_hidden, activation=tf.nn.relu, name='optional_name_2')
    layer3 = tf.layers.dense(layer2, units=n_hidden, activation=tf.nn.relu, name='optional_name_3')
    layer4 = tf.layers.dense(layer3, units=n_hidden, activation=tf.nn.relu, name='optional_name_4')
    out = tf.layers.dense(layer4, units=n_input)
    
    if TRAIN:
        global_step = tf.train.get_global_step()
        loss = tf.losses.sigmoid_cross_entropy(labels, out)
        train_op = tf.train.GradientDescentOptimizer(1e-2).minimize(loss, global_step=global_step)
        return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)
        
    elif EVAL:
        loss = tf.losses.sigmoid_cross_entropy(labels, out) # test loss
        pred = tf.nn.sigmoid(out)
        
        accuracy = tf.metrics.accuracy(labels=labels, predictions=tf.round(pred), name='accuracy')
        # Depending on the task, predictions can take pred as-is or be rounded with tf.round
        # This varies task by task
        
        return tf.estimator.EstimatorSpec(mode=mode, loss=loss, eval_metric_ops={'acc': accuracy})
        
    elif PRED:
        prob = tf.nn.sigmoid(out)
        _class = tf.round(prob)
        return tf.estimator.EstimatorSpec(mode=mode, predictions={'prob': prob, 'class': _class})
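
The numbers each branch works with can be reproduced in plain Python. This is a minimal sketch: the logits and labels are made up, and the loss is assumed to be the mean of the numerically stable per-element form `max(x, 0) - x*z + log(1 + exp(-|x|))`, which is how the default unweighted `tf.losses.sigmoid_cross_entropy` is understood to reduce.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_cross_entropy(labels, logits):
    """Mean of the stable form max(x, 0) - x*z + log(1 + exp(-|x|))."""
    terms = [max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x)))
             for z, x in zip(labels, logits)]
    return sum(terms) / len(terms)


# Hypothetical logits (the `out` tensor) and binary labels for four examples.
logits = [2.0, -1.0, 0.5, -3.0]
labels = [1.0, 0.0, 0.0, 0.0]

probs = [sigmoid(x) for x in logits]        # PREDICT branch: 'prob'
classes = [float(round(p)) for p in probs]  # PREDICT branch: 'class'
accuracy = sum(c == z for c, z in zip(classes, labels)) / len(labels)

print(sigmoid_cross_entropy(labels, logits), classes, accuracy)
```

Rounding the probability is the same as thresholding the logit at 0, so the third example (logit 0.5, label 0) is the one misclassified here.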

In [41]:
if __name__ == '__main__':
    tf.logging.set_verbosity(tf.logging.INFO)
    est = tf.estimator.Estimator(model_fn)
    est.train(input_fn, steps=500)
    est.evaluate(input_fn, steps=10)
    
    data1 = np.array([1,2,3,4,5,6,7,8,9], np.float32)
    data2 = np.array([5,5,5,5,5,5,5,5,5], np.float32)
    data3 = np.array([9-i for i in range(9)], np.float32)
    data = np.stack([data1, data2, data3]) # several examples as one input
    
    pred_input_fn = tf.estimator.inputs.numpy_input_fn({'feature': data}, shuffle=False)
    for d, pred in zip(data, est.predict(pred_input_fn)):
        print('feature: {}, prob: {}, class: {}'.format(d, pred['prob'], pred['class']))
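
The `zip(data, est.predict(...))` loop works because `predict` yields one dict per example by default rather than one per batch. The unbatching can be sketched as follows; `yield_single_examples` is a hypothetical helper written for illustration, and the probabilities are made-up values, not model output.

```python
import numpy as np


def yield_single_examples(batched_predictions):
    """Split a dict of batched arrays into one dict per example (hypothetical helper)."""
    batch_size = len(next(iter(batched_predictions.values())))
    for i in range(batch_size):
        yield {key: value[i] for key, value in batched_predictions.items()}


# Same three input rows as in the cell above.
data = np.stack([np.arange(1, 10), np.full(9, 5), np.arange(9, 0, -1)]).astype(np.float32)

# Made-up batched outputs with the same keys and shapes as the PREDICT branch.
batched = {"prob": np.array([[0.9], [0.5], [0.1]], np.float32),
           "class": np.array([[1.0], [0.0], [0.0]], np.float32)}

for d, pred in zip(data, yield_single_examples(batched)):
    print("feature: {}, prob: {}, class: {}".format(d, pred["prob"], pred["class"]))
```

Because `shuffle=False` keeps the input order, each yielded dict lines up with the matching row of `data`.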

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': '/var/folders/f7/lrsclmhd6mx2hgq049xw8dv80000gn/T/tmpnjf2yyyn', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x12322c9e8>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 1 into /var/folders/f7/