## Understanding TensorFlow 1.x

### Examples of operations

#### Constants

In [1]:
import tensorflow.compat.v1 as tf

In [2]:
tf.disable_v2_behavior()

Instructions for updating:
non-resource variables are not supported in the long term


In [7]:
t_1 = tf.constant(4)
print(t_1)
t_2 = tf.constant([4,3,2])
print(t_2)

Tensor("Const_1:0", shape=(), dtype=int32)
Tensor("Const_2:0", shape=(3,), dtype=int32)


In [9]:
# create a tensor in which all elements are zero
z = tf.zeros([3,3], tf.float32)
print(z)

Tensor("zeros:0", shape=(3, 3), dtype=float32)


In [10]:
# create a zero matrix of same shape as t_2
print(tf.zeros_like(t_2))
# create a ones matrix of same shape as t_2
print(tf.ones_like(t_2))

Tensor("zeros_like:0", shape=(3,), dtype=int32)
Tensor("ones_like:0", shape=(3,), dtype=int32)


In [11]:
print(tf.ones([3,2], tf.float32))

Tensor("ones:0", shape=(3, 2), dtype=float32)


#### Sequences

In [4]:
s = tf.linspace(1.0, 5.0, 5)
print(s)
with tf.Session() as sess:
    outs = sess.run(s)
    print(outs)

[1. 2. 3. 4. 5.]


In [5]:
r = tf.range(10)
with tf.Session() as sess:
    outs = sess.run(r)
    print(outs)

[0 1 2 3 4 5 6 7 8 9]


#### Random tensors

In [6]:
# normal distribution
t_random = tf.random_normal([2,3], mean=2.0, stddev=4, seed=12)
with tf.Session() as sess:
    outs = sess.run(t_random)
    print(outs)

[[ 0.2534746  5.3799095  1.9527606]
 [-1.5376031  1.2588985  2.8478067]]


In [8]:
# truncated normal distribution
t_random = tf.truncated_normal([2,3], seed=12)
with tf.Session() as sess:
    outs = sess.run(t_random)
    print(outs)

[[-0.43663135  0.84497744 -0.01180986]
 [-0.8844008  -1.938745    0.5632738 ]]


In [9]:
t_random = tf.random_uniform([2,3], maxval=4, seed=12)
with tf.Session() as sess:
    outs = sess.run(t_random)
    print(outs)

[[2.54461   3.6963658 2.7051091]
 [2.0085006 3.8445983 3.5426888]]


In [12]:
t_rc = tf.random_crop(t_random, [2,3], seed=12)
with tf.Session() as sess:
    outs = sess.run(t_rc)
    print(outs)

[[2.54461   3.6963658 2.7051091]
 [2.0085006 3.8445983 3.5426888]]


In [13]:
# randomly shuffle a tensor
tf.random_shuffle(t_random)

<tf.Tensor 'RandomShuffle:0' shape=(2, 3) dtype=float32>

In [14]:
# apply the same seed to all randomly generated tensors across all sessions
tf.set_random_seed(54)

#### Variables

In [16]:
weights = tf.Variable(tf.random_normal([100, 100], stddev=2))
bias = tf.Variable(tf.zeros([100]), name = 'biases')

In [17]:
# explicitly initialize the Variables whose initializers were specified above
initial_op = tf.global_variables_initializer()

In [18]:
# Variables can be saved with the Saver class
saver = tf.train.Saver()
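Neither the initializer op nor the Saver does anything until it is run inside a session. The following is a minimal sketch of that step, assuming the same tf.compat.v1 API used above. The separate tf.Graph, the temporary directory, and the checkpoint name model.ckpt are illustrative choices and not part of the original notebook.

```python
import os
import tempfile

import tensorflow.compat.v1 as tf

# Build the variables in their own graph so the sketch does not depend on global state.
graph = tf.Graph()
with graph.as_default():
    weights = tf.Variable(tf.random_normal([100, 100], stddev=2), name='weights')
    bias = tf.Variable(tf.zeros([100]), name='biases')
    initial_op = tf.global_variables_initializer()
    saver = tf.train.Saver()

with tf.Session(graph=graph) as sess:
    # Variables hold no values until the initializer op has been run.
    sess.run(initial_op)
    bias_value = sess.run(bias)
    # Hypothetical checkpoint location inside a temporary directory.
    ckpt_path = saver.save(sess, os.path.join(tempfile.mkdtemp(), 'model.ckpt'))

print(bias_value.shape)
print(ckpt_path)
```

The returned path is a checkpoint prefix; the same prefix could later be passed to saver.restore() in a new session.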

## Understanding TensorFlow 2.x

### AutoGraph

In [1]:
import tensorflow as tf

In [2]:
def linear_layer(x):
    return 3 * x + 2

@tf.function
def simple_nn(x):
    return tf.nn.relu(linear_layer(x))

def simple_function(x):
    return 3*x

In [3]:
# simple_nn is a special handler for interacting with TensorFlow internals
# while simple_function is a normal Python handler
print(simple_nn)
print(simple_function)

<tensorflow.python.eager.def_function.Function object at 0x0000025A09308CC8>
<function simple_function at 0x0000025A092CC438>


In [4]:
# Note that with tf.function you need to annotate only one main function,
# so all other functions called from there will be automatically transformed into
# an optimized computational graph.
# tf.function marks code for Just-In-Time (JIT) compilation.

In [5]:
# internal look at the auto-generated code
print(tf.autograph.to_code(simple_nn.python_function, experimental_optional_features=None))
# automatically generated code will be printed

def tf__simple_nn(x):
  do_return = False
  retval_ = ag__.UndefinedReturnValue()
  with ag__.FunctionScope('simple_nn', 'fscope', ag__.ConversionOptions(recursive=True, user_requested=True, optional_features=(), internal_convert_user_code=True)) as fscope:
    do_return = True
    retval_ = fscope.mark_return_value(ag__.converted_call(tf.nn.relu, (ag__.converted_call(linear_layer, (x,), None, fscope),), None, fscope))
  do_return,
  return ag__.retval(retval_)



In [10]:
# let's compare the speed of code annotated with the tf.function decorator
# against the same code without the annotation, using an LSTMCell layer
import tensorflow as tf
import timeit

cell = tf.keras.layers.LSTMCell(100)

@tf.function
def fn(input, state):
    return cell(input, state)

input = tf.zeros([100, 100])
state = [tf.zeros([100, 100])] * 2

#warmup
cell(input, state)
fn(input, state)

graph_time = timeit.timeit(lambda: cell(input, state), number=100)
auto_graph_time = timeit.timeit(lambda: fn(input, state), number=100)

print('graph_time:', graph_time)
print('auto_graph_time:', auto_graph_time)

graph_time: 0.04774670000006154
auto_graph_time: 0.03041829999995116


In short, you can decorate Python functions and methods with tf.function, which converts them to the equivalent of a static graph, with all the optimization that comes with it
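To make the conversion more concrete, here is a small sketch in which the decorated function contains ordinary Python control flow that depends on tensor values. The function name sum_positive and its input are made up for illustration; the point is only that AutoGraph rewrites the loop and the conditional into graph operations.

```python
import tensorflow as tf

@tf.function
def sum_positive(values):
    # The Python for/if below depend on tensor values, so AutoGraph
    # converts them into graph control-flow ops when the function is traced.
    total = tf.constant(0.0)
    for v in values:
        if v > 0:
            total += v
    return total

result = sum_positive(tf.constant([1.0, -2.0, 3.0]))
print(result)
```

Without the decorator the same function also runs, but eagerly, one operation at a time.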

### Keras APIs - three programming models

#### Sequential API

In [None]:
tf.keras.utils.plot_model(model, to_file='model.png')
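The cell above assumes that a model already exists. As a minimal sketch, a Sequential model is just a linear stack of layers; the layer sizes and the 10-feature input below are arbitrary choices for illustration.

```python
import tensorflow as tf

# A linear stack: input -> hidden Dense layer -> sigmoid output.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='sgd', loss='binary_crossentropy')

# (10 * 16 + 16) weights in the first layer plus (16 * 1 + 1) in the second.
num_params = model.count_params()
print(num_params)
```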

#### Functional API

The Functional API is useful when you want to build a model with more complex (non-linear) topologies, including multiple inputs, multiple outputs, residual connections with non-sequential flows, and shared and reusable layers. Each layer is callable (with a tensor as input), and each layer returns a tensor as output. Let's look at an example where we have two separate inputs, two separate logistic regressions as outputs, and one shared module in the middle.

In [1]:
import tensorflow as tf
def build_model():
    
    # variable-length sequence of integers
    text_input_a = tf.keras.Input(shape=(None,), dtype='int32')
    
    # variable-length sequence of integers
    text_input_b = tf.keras.Input(shape=(None,), dtype='int32')
    
    # Embedding for 1000 unique words mapped to 128-dimensional vectors
    shared_embedding = tf.keras.layers.Embedding(1000, 128)
    
    # we reuse the same layer to encode both inputs
    encoded_input_a = shared_embedding(text_input_a)
    encoded_input_b = shared_embedding(text_input_b)
    
    # two logistic predictions at the end
    # create a layer, then pass it an input
    prediction_a = tf.keras.layers.Dense(1, activation='sigmoid', name='prediction_a')(encoded_input_a)
    prediction_b = tf.keras.layers.Dense(1, activation='sigmoid', name='prediction_b')(encoded_input_b)
    
    # this model has 2 inputs, and 2 outputs
    # in the middle we have a shared model
    model = tf.keras.Model(inputs=[text_input_a, text_input_b], outputs=[prediction_a, prediction_b])
    
    tf.keras.utils.plot_model(model, to_file='shared_model.png')

In [2]:
build_model()

#### Model subclassing

Model subclassing offers the highest flexibility and is generally used when you need to define your own layer. In other words, it is useful when you are in the business of building your own special Lego brick instead of composing more standard and well-known bricks.

In order to create a custom layer, we can subclass tf.keras.layers.Layer and implement the following methods:
<ul>
    <li>__init__ : Optionally used to define all the sublayers to be used by this layer. This is the constructor, where you can declare your model.</li>
    <li>build : Used to create the weights of the layer. You can add weights with add_weight().</li>
    <li>call : Used to define the forward pass. This is where your layer is called and chained in functional style.</li>
    <li>Optionally, a layer can be serialized by using get_config() and deserialized using from_config().</li>
</ul>

In [8]:
# Let's see an example of a custom layer that simply multiplies an input by a matrix named kernel
class MyLayer(tf.keras.layers.Layer):
    def __init__(self, output_dim, **kwargs):
        self.output_dim = output_dim
        super(MyLayer, self).__init__(**kwargs)
    
    def build(self, input_shape):
        # create a trainable weight variable for this layer
        self.kernel = self.add_weight(name='kernel', shape=(input_shape[1], self.output_dim), initializer='uniform', trainable=True)
        
    def call(self, inputs):
        # Do the multiplication and return
        return tf.matmul(inputs, self.kernel)

In [9]:
# Once the MyLayer() custom brick is defined, it can be composed just like any other brick
model = tf.keras.Sequential([MyLayer(20), tf.keras.layers.Activation('softmax')])
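The optional get_config() and from_config() methods mentioned in the list above can be sketched as follows. The class name ProjectionLayer is hypothetical; it mirrors the layer above and only adds the serialization hook, relying on the default from_config() behavior of passing the saved configuration back to the constructor.

```python
import tensorflow as tf

class ProjectionLayer(tf.keras.layers.Layer):
    """Hypothetical layer that multiplies its input by a trainable kernel."""

    def __init__(self, output_dim, **kwargs):
        super().__init__(**kwargs)
        self.output_dim = output_dim

    def build(self, input_shape):
        # Weights are created lazily, once the input shape is known.
        self.kernel = self.add_weight(
            name='kernel',
            shape=(input_shape[-1], self.output_dim),
            initializer='random_uniform',
            trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.kernel)

    def get_config(self):
        # Extend the base configuration with the constructor argument.
        config = super().get_config()
        config.update({'output_dim': self.output_dim})
        return config

layer = ProjectionLayer(20)
out = layer(tf.ones([4, 8]))
# Rebuild an equivalent (untrained) layer from its configuration.
clone = ProjectionLayer.from_config(layer.get_config())
print(out.shape, clone.output_dim)
```

Note that the configuration carries only constructor arguments; the weights themselves are saved separately.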

### Callbacks

Callbacks are objects passed to a model to extend or modify its behavior during training:
<ul>
    <li>tf.keras.callbacks.ModelCheckpoint : This feature is used to save checkpoints of your model at regular intervals and recover in case of problems.</li>
    <li>tf.keras.callbacks.LearningRateScheduler : This feature is used to dynamically change the learning rate during optimization.</li>
    <li>tf.keras.callbacks.EarlyStopping : This feature is used to interrupt training when validation performance has stopped improving after a while.</li>
    <li>tf.keras.callbacks.TensorBoard : This feature is used to monitor the model's behavior using TensorBoard.</li>
</ul>

In [None]:
callbacks = [
    # Write TensorBoard logs to './logs' directory
    tf.keras.callbacks.TensorBoard(log_dir='./logs')
]
model.fit(data, labels, batch_size=256, epochs=100, callbacks=callbacks, validation_data=(val_data, val_labels))
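The cell above relies on model, data, and labels being defined elsewhere. A self-contained sketch that combines two of the other callbacks from the list might look like this; the random data, the tiny model, and the halving schedule are all assumptions chosen only to keep the example small.

```python
import numpy as np
import tensorflow as tf

# Synthetic data, used only so that the example runs on its own.
data = np.random.random((64, 10)).astype('float32')
labels = np.random.randint(2, size=(64, 1)).astype('float32')

model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='sgd', loss='binary_crossentropy')

def halve_every_two_epochs(epoch, lr):
    # Hypothetical schedule: halve the learning rate at epochs 2, 4, ...
    return lr * 0.5 if epoch > 0 and epoch % 2 == 0 else lr

callbacks = [
    # Stop if the validation loss has not improved for 2 consecutive epochs.
    tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=2),
    tf.keras.callbacks.LearningRateScheduler(halve_every_two_epochs),
]

history = model.fit(data, labels, batch_size=16, epochs=5,
                    validation_split=0.25, callbacks=callbacks, verbose=0)
epochs_run = len(history.history['loss'])
print(epochs_run)
```

Because EarlyStopping may end training before the last epoch, the number of epochs actually run can be smaller than the number requested.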

### Saving a model and weights

In [None]:
# Save weights to a TensorFlow Checkpoint file
model.save_weights('./weights/my_model')

In [None]:
# If you want to save in Keras's format, which is portable across multiple backends
# Save weights to a HDF5 file
model.save_weights('my_model.h5', save_format='h5')

In [None]:
# Restore the model's state
model.load_weights(file_path)

In [None]:
# a model can be serialized in JSON with:
json_string = model.to_json() # save
model = tf.keras.models.model_from_json(json_string) # restore

In [None]:
# If you prefer, a model can be serialized in YAML with:
# (note: in TensorFlow 2.6 and later these YAML methods raise a RuntimeError; use JSON there)
yaml_string = model.to_yaml() # save
model = tf.keras.models.model_from_yaml(yaml_string) # restore

In [None]:
# If you want to save a model together with its weights and the optimization parameters,
model.save('my_model.h5') # save
model = tf.keras.models.load_model('my_model.h5') # restore
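The difference between saving the architecture and saving the weights can be seen in a small round trip. This sketch stays in memory (no files) and assumes a toy model; it restores the architecture from JSON and then copies the weights over explicitly.

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(3, activation='relu'),
    tf.keras.layers.Dense(1),
])

# JSON captures the architecture only, not the weights.
json_string = model.to_json()
restored = tf.keras.models.model_from_json(json_string)

# The weights have to be transferred separately.
restored.set_weights(model.get_weights())

x = np.ones((2, 4), dtype='float32')
same_output = np.allclose(model(x).numpy(), restored(x).numpy())
print(same_output)
```

Before set_weights() is called, the restored model has freshly initialized weights and would generally produce different outputs.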

### Training from tf.data.datasets

pip install tensorflow-datasets

In [10]:
import tensorflow as tf
import tensorflow_datasets as tfds

In [11]:
# See all registered datasets
builders = tfds.list_builders()
print(builders)

['abstract_reasoning', 'aflw2k3d', 'amazon_us_reviews', 'bair_robot_pushing_small', 'bigearthnet', 'binarized_mnist', 'binary_alpha_digits', 'caltech101', 'caltech_birds2010', 'caltech_birds2011', 'cats_vs_dogs', 'celeb_a', 'celeb_a_hq', 'chexpert', 'cifar10', 'cifar100', 'cifar10_corrupted', 'clevr', 'cnn_dailymail', 'coco', 'coco2014', 'coil100', 'colorectal_histology', 'colorectal_histology_large', 'curated_breast_imaging_ddsm', 'cycle_gan', 'deep_weeds', 'definite_pronoun_resolution', 'diabetic_retinopathy_detection', 'downsampled_imagenet', 'dsprites', 'dtd', 'dummy_dataset_shared_generator', 'dummy_mnist', 'emnist', 'eurosat', 'fashion_mnist', 'flores', 'food101', 'gap', 'glue', 'groove', 'higgs', 'horses_or_humans', 'image_label_folder', 'imagenet2012', 'imagenet2012_corrupted', 'imdb_reviews', 'iris', 'kitti', 'kmnist', 'lfw', 'lm1b', 'lsun', 'mnist', 'mnist_corrupted', 'moving_mnist', 'multi_nli', 'nsynth', 'omniglot', 'open_images_v4', 'oxford_flowers102', 'oxford_iiit_pet', 

In [12]:
# Load a given dataset by name, along with the DatasetInfo metadata
data, info = tfds.load('mnist', with_info=True)
train_data, test_data = data['train'], data['test']

Downloading and preparing dataset mnist (11.06 MiB) to C:\Users\polas\tensorflow_datasets\mnist\1.0.0...



Dataset mnist downloaded and prepared to C:\Users\polas\tensorflow_datasets\mnist\1.0.0. Subsequent calls will reuse this data.







In [13]:
# we get the metainfo for MNIST
print(info)

tfds.core.DatasetInfo(
    name='mnist',
    version=1.0.0,
    description='The MNIST database of handwritten digits.',
    urls=['https://storage.googleapis.com/cvdf-datasets/mnist/'],
    features=FeaturesDict({
        'image': Image(shape=(28, 28, 1), dtype=tf.uint8),
        'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=10),
    }),
    total_num_examples=70000,
    splits={
        'test': 10000,
        'train': 60000,
    },
    supervised_keys=('image', 'label'),
    citation="""@article{lecun2010mnist,
      title={MNIST handwritten digit database},
      author={LeCun, Yann and Cortes, Corinna and Burges, CJ},
      journal={ATT Labs [Online]. Available: http://yann. lecun. com/exdb/mnist},
      volume={2},
      year={2010}
    }""",
    redistribution_info=,
)



In [15]:
# Sometimes it is useful to create a dataset from a NumPy array
import tensorflow as tf
import numpy as np
num_items = 100
num_list = np.arange(num_items)

# create the dataset from numpy array
num_list_dataset = tf.data.Dataset.from_tensor_slices(num_list)

In [16]:
print(num_list)

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
 96 97 98 99]


In [17]:
print(num_list_dataset)

<TensorSliceDataset shapes: (), types: tf.int32>


In [None]:
# We can also download a dataset, shuffle and batch the data, and take a slice from the generator
datasets, info = tfds.load('imdb_reviews', with_info=True, as_supervised=True)
train_dataset = datasets['train']
# shuffle() is a transformation that randomly shuffles the input dataset
# batch() creates batches of tensors
train_dataset = train_dataset.batch(5).shuffle(50).take(2)
for data in train_dataset:
    print(data)

Downloading and preparing dataset imdb_reviews (80.23 MiB) to C:\Users\polas\tensorflow_datasets\imdb_reviews\plain_text\0.1.0...



A dataset is a library for dealing with data in a principled way.
Operations include:
<ol><li>Creation:
        <ol><li>from_tensor_slices() : which accepts individual (or multiple) NumPy arrays (or tensors) and supports batches</li>
            <li>from_tensors() : which is similar to the above but does not support batches</li>
            <li>from_generator() : which takes input from a generator function</li></ol><br></li>
    <li>Transformation:
        <ol><li>batch() : which sequentially divides the dataset into batches of the specified size</li>
            <li>repeat() : which duplicates the data</li>
            <li>shuffle() : which randomly shuffles the data</li>
            <li>map() : which applies a function to the data</li>
            <li>filter() : which applies a function to filter the data</li></ol><br></li>
    <li>Iterators:
        <ol><li>next_batch = iterator.get_next()</li></ol></li></ol>


Dataset uses TFRecord, a representation of the data (in any format) that can be easily ported across multiple systems and is independent of the particular model used for training
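The creation, transformation, and iteration operations listed above can be chained. The following sketch uses a small integer range so that the result of each step is easy to follow; the specific functions passed to filter() and map() are arbitrary.

```python
import numpy as np
import tensorflow as tf

# Creation: one dataset element per entry of the array.
dataset = tf.data.Dataset.from_tensor_slices(np.arange(10))

# Transformations are chained and evaluated lazily.
dataset = (dataset
           .filter(lambda x: x % 2 == 0)   # keep 0, 2, 4, 6, 8
           .map(lambda x: x * x)           # square: 0, 4, 16, 36, 64
           .repeat(2)                      # go through the data twice
           .batch(4))                      # group into batches of up to 4

# Iteration: in eager mode a dataset is a normal Python iterable.
batches = [batch.numpy().tolist() for batch in dataset]
print(batches)

first_batch = next(iter(dataset)).numpy().tolist()
print(first_batch)
```

Since batch() is applied after repeat(), batches may span the boundary between the two passes over the data.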

### tf.keras or Estimators?

What are Estimators? Put simply, they are another way to build or to use prebuilt bricks. A longer answer is that they are highly efficient learning models for large-scale production-ready environments, which can be trained on single machines or on distributed multi-servers, and they can run on CPUs, GPUs, or TPUs without recoding your model. These models include Linear Classifiers, Deep Learning Classifiers, Gradient Boosted Trees, and many more.

Let's see an example of an Estimator used for building a classifier with 2 dense hidden layers, each with 10 neurons, and with 3 output classes:

In [None]:
# Build a DNN with 2 hidden layers and 10 nodes in each hidden layer.
classifier = tf.estimator.DNNClassifier(feature_columns=my_feature_columns,
    # Two hidden layers of 10 nodes each.
    hidden_units=[10, 10],
    # The model must choose between 3 classes.
    n_classes=3)

The feature_columns=my_feature_columns argument is a list of feature columns, each describing a single feature you want the model to use. For example, a typical use would be something like:

In [None]:
# Fetch the data
(train_x, train_y), (test_x, test_y) = load_data()

# Feature columns describe how to use the input.
my_feature_columns = []
for key in train_x.keys():
    my_feature_columns.append(tf.feature_column.numeric_column(key=key))

Here, tf.feature_column.numeric_column() represents real-valued or numerical features (https://www.tensorflow.org/api_docs/python/tf/feature_column/numeric_column).

For efficiency, Estimators should be trained using tf.data.Dataset objects as input. Here is an example where MNIST is loaded, scaled, shuffled, and batched:

In [8]:
import tensorflow as tf
import tensorflow_datasets as tfds
BUFFER_SIZE = 10000
BATCH_SIZE = 64
def input_fn(mode):
    datasets, info = tfds.load(name='mnist', with_info=True, as_supervised=True)
    mnist_dataset = (datasets['train'] if mode == tf.estimator.ModeKeys.TRAIN else datasets['test'])
    
    def scale(image, label):
        image = tf.cast(image, tf.float32)
        image /= 255
        return image, label
 
    return mnist_dataset.map(scale).shuffle(BUFFER_SIZE).batch(BATCH_SIZE)

In [9]:
test = input_fn('test')
train = input_fn(tf.estimator.ModeKeys.TRAIN)
print(test)
print(train)



<BatchDataset shapes: ((None, 28, 28, 1), (None,)), types: (tf.float32, tf.int64)>
<BatchDataset shapes: ((None, 28, 28, 1), (None,)), types: (tf.float32, tf.int64)>


Then, the Estimator can be trained and evaluated by using tf.estimator.train_and_evaluate() and passing the input_fn which will iterate on the data:

In [None]:
tf.estimator.train_and_evaluate(
    classifier,
    train_spec = tf.estimator.TrainSpec(input_fn=input_fn),
    eval_spec = tf.estimator.EvalSpec(input_fn=input_fn)
)

### Ragged tensors

TensorFlow 2.x added support for "ragged" tensors, which are a special type of dense tensor with non-uniformly shaped dimensions. This is particularly useful for dealing with sequences and other data where the dimensions can change across batches, such as text sentences and hierarchical data. Note that ragged tensors are more efficient than padding a tf.Tensor, since no time or space is wasted.

In [15]:
ragged = tf.ragged.constant([[1,2,3], [3,4], [5,6,7,8]])
print(ragged)

<tf.RaggedTensor [[1, 2, 3], [3, 4], [5, 6, 7, 8]]>
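To illustrate what working with non-uniform rows looks like in practice, the sketch below inspects the row lengths, reduces over the ragged dimension, and converts to a padded dense tensor for comparison. The padding value 0 is an arbitrary choice.

```python
import tensorflow as tf

ragged = tf.ragged.constant([[1, 2, 3], [3, 4], [5, 6, 7, 8]])

# Each row keeps its own length.
lengths = ragged.row_lengths().numpy().tolist()

# Reductions operate on the actual elements of each row only.
row_sums = tf.reduce_sum(ragged, axis=1).numpy().tolist()

# Converting to a dense tensor pads every row to the longest one.
padded = ragged.to_tensor(default_value=0)

print(lengths)
print(row_sums)
print(padded.shape)
```

The dense version stores 12 values for 9 real elements, which is the overhead that the ragged representation avoids.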


### Custom training

Custom training is useful when you want to have finer control over optimization.

There are multiple ways of computing gradients

1. tf.GradientTape() : This class records operations for automatic differentiation. Let's look at an example where we use the parameter persistent=True (a Boolean controlling whether a persistent gradient tape is created, which means that multiple calls can be made to the gradient() method on this object):

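Persistence is defined as follows.

Persistence is the property of an object through which its existence transcends time (that is, the object continues to exist after its creator ceases to exist) and/or space (that is, the object's location moves from the address space in which it was created).

This is how Booch's book explains the meaning of persistence. As the summary at the end puts it, persistence can be described as the property that lets an object transcend time and space and move between programs, but it is hard to express in a single Korean word. The term programmers would find easiest to understand is probably "object storage", but translating it that way seems to lose much of the other meaning that persistence carries, most likely because "storage" usually refers to storing something in a database. In any case, a concept of "object storage (persistence)" that is distinct from "DB storage" seems to be needed.

Source: https://homo-ware.tistory.com/4 [人-ware : Forwards Veritas]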

In [16]:
import tensorflow as tf
x = tf.constant(4.0)
with tf.GradientTape(persistent=True) as g:
    g.watch(x)
    y = x * x
    z = y * y
dz_dx = g.gradient(z, x) # 256.0 (4*x^3 at x = 4)
dy_dx = g.gradient(y, x) # 8.0
print(dz_dx)
print(dy_dx)
del g # Drop the reference to the tape

tf.Tensor(256.0, shape=(), dtype=float32)
tf.Tensor(8.0, shape=(), dtype=float32)


2. tf.gradient_function() : This returns a function that computes the derivatives of its input function parameter with respect to its arguments.
<br><br>
3. tf.value_and_gradients_function() : This returns the value from the input function in addition to the list of derivatives of the input function with respect to its arguments
<br><br>
4. tf.implicit_gradients() : This computes the gradients of the outputs of the input function with regards to all trainable variables these outputs depend on
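Returning to tf.GradientTape() from item 1, the effect of the persistent flag can be seen by leaving it at its default. In this sketch the first gradient() call succeeds and releases the tape's resources, so a second call on the same tape is expected to raise a RuntimeError.

```python
import tensorflow as tf

x = tf.constant(4.0)
with tf.GradientTape() as g:   # persistent defaults to False
    g.watch(x)
    y = x * x

dy_dx = g.gradient(y, x)       # first call works: 2 * x = 8.0

try:
    g.gradient(y, x)           # second call on a non-persistent tape
    second_call_failed = False
except RuntimeError:
    second_call_failed = True

print(dy_dx)
print(second_call_failed)
```

This is why the earlier example needed persistent=True in order to compute both dz_dx and dy_dx from one tape.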

Let's see a skeleton of a custom gradient computation where a model is given as input, and the training steps are computing total_loss = pred_loss + regularization_loss. The decorator @tf.function is used for AutoGraph, and tape.gradient() and apply_gradients() are used to compute and apply the gradients:

In [None]:
@tf.function
def train_step(inputs, labels):
    with tf.GradientTape() as tape:
        predictions = model(inputs, training=True)
        regularization_loss = ...  # TBD according to the problem
        pred_loss = ...  # TBD according to the problem
        total_loss = pred_loss + regularization_loss
    
    gradients = tape.gradient(total_loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))

Then, the training step train_step(inputs, labels) is applied for each epoch, for each input and its associated label in train_data:

In [None]:
for epoch in range(NUM_EPOCHS):
    for inputs, labels in train_data:
        train_step(inputs, labels)
    print("Finished epoch", epoch)

So, put simply, GradientTape() allows us to control and change how the training process is performed internally
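As one possible way of filling in the skeleton, the sketch below trains a single Dense layer on synthetic linear data. The data, the mean squared error loss, the L2 regularizer, and the hyperparameters are all assumptions made for the sake of a runnable example; the original leaves both loss terms open.

```python
import numpy as np
import tensorflow as tf

# Synthetic regression problem: y = x @ true_w (illustrative only).
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 3)).astype('float32')
true_w = np.array([[2.0], [-1.0], [0.5]], dtype='float32')
y = x @ true_w
train_data = tf.data.Dataset.from_tensor_slices((x, y)).batch(32)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(3,)),
    tf.keras.layers.Dense(1, kernel_regularizer=tf.keras.regularizers.l2(1e-4)),
])
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)
loss_fn = tf.keras.losses.MeanSquaredError()

@tf.function
def train_step(inputs, labels):
    with tf.GradientTape() as tape:
        predictions = model(inputs, training=True)
        # Assumed choices for the two loss terms left open in the skeleton.
        pred_loss = loss_fn(labels, predictions)
        regularization_loss = tf.add_n(model.losses)
        total_loss = pred_loss + regularization_loss
    gradients = tape.gradient(total_loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    return total_loss

NUM_EPOCHS = 20
losses = []
for epoch in range(NUM_EPOCHS):
    for inputs, labels in train_data:
        loss = train_step(inputs, labels)
    losses.append(float(loss))  # loss on the last batch of the epoch

print(losses[0], losses[-1])
```

Returning the loss from train_step() is optional; it is done here only so that progress can be inspected from Python.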

### Distributed training in TensorFlow 2.x

You can switch between GPUs, TPUs, and multiple machines by just changing the tf.distribute.Strategy instance.
Strategies can be synchronous, where all workers train over different slices of input data in a form of sync data parallel computation, or asynchronous, where updates from the optimizers are not happening in sync. All strategies require that data is loaded in batches via the tf.data.Dataset API.

#### Multiple GPUs

1. In order to load our data in a way that can be distributed to the GPUs, we simply need a tf.data.Dataset. If we do not have a tf.data.Dataset but we have a normal tensor, then we can easily convert the latter into the former using tf.data.Dataset.from_tensor_slices(). This will take a tensor in memory and return a source dataset, the elements of which are slices of the given tensor.

In [20]:
import tensorflow as tf
import numpy as np
from tensorflow import keras

N_TRAIN_EXAMPLES = 1024*1024
N_FEATURES = 10
SIZE_BATCHES = 256

# 10 random floats in the half-open interval [0.0, 1.0).
x = np.random.random((N_TRAIN_EXAMPLES, N_FEATURES))
y = np.random.randint(2, size=(N_TRAIN_EXAMPLES, 1))
x = tf.dtypes.cast(x, tf.float32)
print(x)

dataset = tf.data.Dataset.from_tensor_slices((x,y))
dataset = dataset.shuffle(buffer_size=N_TRAIN_EXAMPLES).batch(SIZE_BATCHES)

tf.Tensor(
[[0.7064347  0.21554118 0.16618264 ... 0.7284623  0.8475142  0.38597292]
 [0.02400104 0.5342169  0.5304134  ... 0.9043175  0.45012575 0.2092488 ]
 [0.2288879  0.23640999 0.9511873  ... 0.5638631  0.557472   0.72549695]
 ...
 [0.66274357 0.29937664 0.33537513 ... 0.0792544  0.68229526 0.86084676]
 [0.20431727 0.847871   0.62121135 ... 0.43766165 0.45112902 0.42994523]
 [0.9513815  0.9032169  0.87162656 ... 0.4489549  0.8452597  0.16410215]], shape=(1048576, 10), dtype=float32)


2. In order to distribute some computations to GPUs, we instantiate a distribution = tf.distribute.MirroredStrategy() object, which supports synchronous distributed training on multiple GPUs on one machine. Then, we move the creation and compilation of the Keras model inside distribution.scope(). Note that each variable in the model is mirrored across all the replicas.

In [21]:
# this is the distribution strategy
distribution = tf.distribute.MirroredStrategy()

# this piece of code is distributed to multiple GPUs
with distribution.scope():
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Dense(16, activation='relu', input_shape=(N_FEATURES,)))
    model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
    optimizer = tf.keras.optimizers.SGD(0.2)
    model.compile(loss='binary_crossentropy', optimizer=optimizer)
    model.summary()
    
    # Optimize in the usual way but in reality you are using GPUs.
    model.fit(dataset, epochs=5, steps_per_epoch=10)

INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0',)
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense (Dense)                (None, 16)                176       
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 17        
Total params: 193
Trainable params: 193
Non-trainable params: 0
_________________________________________________________________
Train for 10 steps
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


Note that each batch of the given input is divided equally among the multiple GPUs. For instance, if using MirroredStrategy() with two GPUs, each batch of size 256 will be divided among the two GPUs, with each of them receiving 128 input examples for each step. In addition, note that each GPU will optimize on the received batches and the TensorFlow backend will combine all these independent optimizations on our behalf. If you want to know more, you can have a look at the notebook online (https://colab.research.google.com/drive/1mf-PK0a20CkObnT0hCl9VPEje1szhHat#scrollTo=wYar3A0vBVtZ) where I explain how to use GPUs in Colab with a Keras model built for MNIST classification.
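The batch-splitting arithmetic described above can be written down directly. This is a plain Python sketch rather than TensorFlow code; the helper name per_replica_batch_size is hypothetical, and the divisibility check is a simplifying assumption of the sketch. In a real program the replica count would typically come from the strategy's num_replicas_in_sync attribute.

```python
def per_replica_batch_size(global_batch_size, num_replicas):
    """Return how many examples each replica receives per step."""
    if num_replicas <= 0:
        raise ValueError("num_replicas must be positive")
    # Simplifying assumption: only evenly divisible batches are handled here.
    if global_batch_size % num_replicas != 0:
        raise ValueError("global batch size must be divisible by num_replicas")
    return global_batch_size // num_replicas

SIZE_BATCHES = 256
for num_gpus in (1, 2, 4):
    share = per_replica_batch_size(SIZE_BATCHES, num_gpus)
    print(num_gpus, "GPU(s):", share, "examples per replica per step")
```

A practical consequence is that the batch size passed to batch() is the global one, so it is often scaled up with the number of GPUs to keep each replica busy.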

#### MultiWorkerMirroredStrategy

This strategy implements synchronous distributed training across multiple workers, each one with potentially multiple GPUs. This strategy should be used if you are aiming at scaling beyond a single machine with high performance. Data must be loaded with tf.data.Dataset and shared across workers so that each worker can read a unique subset.

#### TPUStrategy

This strategy implements synchronous distributed training on TPUs. TPUs are Google's specialized ASIC chips designed to significantly accelerate machine learning workloads, often more efficiently than GPUs.

#### ParameterServerStrategy

This strategy implements either multi-GPU synchronous local training or asynchronous multi-machine training. For local training on one machine, the variables of the models are placed on the CPU and operations are replicated across all local GPUs.<br>
For multi-machine training, some machines are designated as workers and some as parameter servers with the variables of the model placed on parameter servers. Computation is replicated across all GPUs of all workers. Multiple workers can be set up with the environment variable TF_CONFIG as in the following example:

In [None]:
import json
import os

os.environ['TF_CONFIG'] = json.dumps({
    "cluster":{
        "worker":["host1:port", "host2:port", "host3:port"],
        "ps":["host4:port", "host5:port"]
    },
    "task":{"type":"worker", "index":1}
})
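Every machine in the cluster sees the same "cluster" section but a different "task" section. The sketch below generates one configuration per worker; the helper name make_tf_config and the host names with port 2222 are hypothetical placeholders.

```python
import json
import os

def make_tf_config(workers, parameter_servers, task_type, task_index):
    """Build the TF_CONFIG JSON string for one task in the cluster."""
    return json.dumps({
        "cluster": {"worker": workers, "ps": parameter_servers},
        "task": {"type": task_type, "index": task_index},
    })

# Hypothetical addresses; replace with the real hosts and ports of the cluster.
workers = ["host1:2222", "host2:2222", "host3:2222"]
parameter_servers = ["host4:2222", "host5:2222"]

# One configuration per worker: same cluster description, different task index.
worker_configs = [
    make_tf_config(workers, parameter_servers, "worker", i)
    for i in range(len(workers))
]

# On the second worker machine, its own configuration would be exported.
os.environ["TF_CONFIG"] = worker_configs[1]
print(os.environ["TF_CONFIG"])
```

The parameter servers would get analogous configurations with the task type set to "ps".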

### Changes in namespaces

<ul><li>tf.keras.layers : Contains all symbols that were previously under tf.layers</li>
    <li>tf.keras.losses : Contains all symbols that were previously under tf.losses</li>
    <li>tf.keras.metrics : Contains all symbols that were previously under tf.metrics</li>
    <li>tf.debugging : A new namespace for debugging</li>
    <li>tf.dtypes : A new namespace for data types</li>
    <li>tf.io : A new namespace for I/O</li>
    <li>tf.quantization : A new namespace for quantization</li>
</ul>

### Converting from 1.x to 2.x

In [None]:
# for a single file
tf_upgrade_v2 --infile tensorfoo.py --outfile tensorfoo-upgraded.py
# for multiple files in a directory
tf_upgrade_v2 --intree incode --outtree code-upgraded

### Using TensorFlow 2.x effectively

1. Default to higher-level APIs such as tf.keras (or in certain situations, Estimators) and avoid lower-level APIs
<br><br>
2. Add a tf.function decorator to make it run efficiently in graph mode with AutoGraph. Only use tf.function to decorate high-level computations; all functions invoked by high-level computations are automatically annotated on your behalf. In this way, you get the best of both worlds: high-level APIs with eager support, and the efficiency of computational graphs.
<br><br>
3. Use Python objects to track variables and losses. So, be Pythonic and use tf.Variable instead of tf.get_variable. In this way, variables will be treated with the normal Python scope.
<br><br>
4. Use tf.data datasets for data inputs and provide these objects directly to tf.keras.Model.fit. In this way, you will have a collection of high-performance classes for manipulating data and will adopt the best way to stream training data from disk.
<br><br>
5. Use tf.keras.layers modules to combine predefined "Lego bricks" whenever possible, either with the Sequential or Functional APIs, or with Subclassing.
<br><br>
6. Consider using a distribution strategy across GPUs, CPUs, and multiple servers
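Point 3 above, tracking variables through ordinary Python objects, can be sketched with tf.Module. The Counter class is a made-up example; the variable lives as a normal attribute, and the module discovers it without any global collection or tf.get_variable call.

```python
import tensorflow as tf

class Counter(tf.Module):
    """Hypothetical module whose state is a plain Python attribute."""

    def __init__(self):
        super().__init__()
        self.count = tf.Variable(0, dtype=tf.int32)

    @tf.function
    def increment(self):
        # The variable is reached through self, like any Python attribute.
        return self.count.assign_add(1)

counter = Counter()
for _ in range(3):
    counter.increment()

final_count = int(counter.count.numpy())
tracked = len(counter.variables)
print(final_count, tracked)
```

When the counter object goes out of scope, its variable is garbage-collected with it, which is the "normal Python scope" behavior the list refers to.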