# Sparse Operation Kit #
---
This notebook introduces Sparse Operation Kit and shows how to use it to accelerate the training of recommender systems.

Sparse Operation Kit (hereafter SOK) is a toolkit that wraps efficient algorithms and implementations used in recommendation scenarios, many of which are sparse operations, into a user-friendly Python library. Users who want to leverage these GPU-accelerated algorithms to speed up their applications can get started quickly with this toolkit.

## Features ##
- Embedding Algorithms:
    - HugeCTR::Distributed, which supports combiner={Mean, Sum}
- Synchronized Training:
    - Single-node, multiple GPUs
    - Multi-node, multiple GPUs
- Data-Parallel Inputs:
    - No extra burden on the data reader beyond common data preprocessing.
- Deep Learning Framework compatibility:
    - TensorFlow
        - version
            - 2.x
        - optimizers
            - Adam
            - SGD

## Menu ##
1. **Installation**
2. **Single-node, Multi-GPUs synchronized training**
3. **Multi-node, Multi-GPUs synchronized training**

### Installation ###
+ **Requirements**
    - TensorFlow 2.x


+ **Get SOK from NGC** <br>
SparseOperationKit is preinstalled in the [Merlin TensorFlow Training Container](https://ngc.nvidia.com/catalog/containers/nvidia:merlin:merlin-tensorflow-training): `nvcr.io/nvidia/merlin/merlin-tensorflow-training:0.7`. <br>
After launching this container, you can check that the required libraries are present by running the following command.
```shell
$ python3 -c "import sparse_operation_kit as sok"
```

+ **Build SOK from Source Code** <br>
If you want to build SparseOperationKit from the source code instead of using the NGC container, please refer to [Setup development environment](../docs/hugectr_contributor_guide.md#build-sparse-operation-kit-sok-from-source-code).
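
When a missing library should not abort a script, the import check from the installation step can also be done from Python without importing the module. The following is a minimal sketch using only the standard library; the helper name `sok_is_available` is hypothetical and is not part of SOK.

```python
import importlib.util


def sok_is_available(module_name="sparse_operation_kit"):
    """Return True if the top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("SOK importable:", sok_is_available())
```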

### Single-node, Multi-GPUs synchronized training ###

Firstly, specify the hyperparameters.

In [1]:
%reset -f

args = dict()

args["gpu_num"] = 8                               # the number of available GPUs
args["iter_num"] = 50                             # the number of training iterations
args["max_vocabulary_size_per_gpu"] = 1024        # the maximum vocabulary size held by each GPU
args["slot_num"] = 10                             # the number of feature fields in this embedding layer
args["max_nnz"] = 4                               # the maximum number of valid features in each slot
args["embedding_vec_size"] = 4                    # the dimension of embedding vectors
args["combiner"] = "mean"                         # the reduction applied within each slot, one of [mean, sum]
args["global_batch_size"] = 65536                 # the global batch size across all GPUs
args["optimizer"] = "plugin_adam"                 # the optimizer used for training, one of [plugin_adam, adam, sgd]
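
Several sizes that appear later in this notebook follow directly from these hyperparameters. The sketch below recomputes them in a standalone dictionary (named `hp` here only to avoid overwriting `args`); the variable names are illustrative.

```python
# Standalone copy of the numeric hyperparameters from the cell above.
hp = {
    "gpu_num": 8,
    "iter_num": 50,
    "max_vocabulary_size_per_gpu": 1024,
    "slot_num": 10,
    "embedding_vec_size": 4,
    "global_batch_size": 65536,
}

# Total vocabulary size across all GPUs (used when generating random keys).
total_vocabulary_size = hp["gpu_num"] * hp["max_vocabulary_size_per_gpu"]

# Number of samples each GPU receives per step under data parallelism.
per_replica_batch_size = hp["global_batch_size"] // hp["gpu_num"]

# Number of samples needed to run every iteration exactly once.
num_of_samples = hp["global_batch_size"] * hp["iter_num"]

# Width of each row after the per-slot embedding vectors are flattened.
flattened_width = hp["slot_num"] * hp["embedding_vec_size"]

print(total_vocabulary_size, per_replica_batch_size, num_of_samples, flattened_width)
```

These values correspond to the shapes printed later: `(65536, 40)` for the single-device TensorFlow model and `(8192, 40)` per replica for SOK.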

Secondly, import the required modules.

In [2]:
import sys, os, json
sys.path.append("../")
import sparse_operation_kit as sok
import tensorflow as tf
os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(map(str, range(args["gpu_num"])))
import numpy as np

[INFO]: sparse_operation_kit is imported


Thirdly, define a DNN model using TensorFlow.

In [3]:
class TfDemo(tf.keras.models.Model):
    def __init__(self, 
                 init_tensors, 
                 combiner, 
                 global_batch_size,
                 slot_num, 
                 embedding_vec_size,
                 **kwargs):
        super(TfDemo, self).__init__(**kwargs)
        self.combiner = combiner
        self.global_batch_size = global_batch_size
        self.slot_num = slot_num
        self.embedding_vec_size = embedding_vec_size

        self.init_tensors = init_tensors
        self.params = tf.Variable(initial_value=tf.concat(self.init_tensors, axis=0))

        self.dense_layer = tf.keras.layers.Dense(units=1, activation=None,
                                                 kernel_initializer="ones",
                                                 bias_initializer="zeros")

    def call(self, inputs, training=True):
        # [batchsize * slot_num, embedding_vec_size]
        embedding_vector = tf.nn.embedding_lookup_sparse(params=self.params, sp_ids=inputs,
                                                        sp_weights=None, combiner=self.combiner)

        # [batchsize, slot_num * embedding_vec_size]
        embedding_vector = tf.reshape(embedding_vector, shape=[self.global_batch_size, self.slot_num * self.embedding_vec_size])
        logit = self.dense_layer(embedding_vector)
        return logit, embedding_vector

Fourthly, define the same DNN model using SOK.

In [4]:
class SOKDemo(tf.keras.models.Model):
    def __init__(self,
                 combiner,
                 max_vocabulary_size_per_gpu,
                 slot_num,
                 max_nnz,
                 embedding_vec_size, 
                 **kwargs):
        super(SOKDemo, self).__init__(**kwargs)

        self.combiner = combiner
        self.max_vocabulary_size_per_gpu = max_vocabulary_size_per_gpu
        self.slot_num = slot_num
        self.max_nnz = max_nnz
        self.embedding_vec_size = embedding_vec_size

        self.embedding_layer = sok.DistributedEmbedding(combiner=self.combiner,
                                                           max_vocabulary_size_per_gpu=self.max_vocabulary_size_per_gpu,
                                                           embedding_vec_size=self.embedding_vec_size,
                                                           slot_num=self.slot_num,
                                                           max_nnz=self.max_nnz)

        self.dense_layer = tf.keras.layers.Dense(units=1, activation=None,
                                                 kernel_initializer="ones",
                                                 bias_initializer="zeros")

    def call(self, inputs, training=True):
        # [batchsize, slot_num, embedding_vec_size]
        embedding_vector = self.embedding_layer(inputs, training=training)
        # [batchsize, slot_num * embedding_vec_size]
        embedding_vector = tf.reshape(embedding_vector, shape=[-1, self.slot_num * self.embedding_vec_size])
        # [batchsize, 1]
        logit = self.dense_layer(embedding_vector)
        return logit, embedding_vector
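
Both models reduce the embedding vectors looked up for the keys of one slot into a single vector, using the chosen combiner. The following pure-Python sketch shows only that arithmetic, assuming unit weights (as with `sp_weights=None` in `TfDemo`) and at least one key per slot. The function `combine_slot` and the toy tables are hypothetical; this is not how TensorFlow or SOK implement the lookup.

```python
def combine_slot(table, keys, combiner="mean"):
    """Reduce the embedding vectors of one slot's keys into a single vector."""
    if combiner not in ("mean", "sum"):
        raise ValueError(f"unsupported combiner: {combiner}")
    if not keys:
        raise ValueError("this sketch expects at least one key per slot")
    vec_size = len(table[0])
    # Element-wise sum of the selected rows of the embedding table.
    summed = [sum(table[k][d] for k in keys) for d in range(vec_size)]
    if combiner == "sum":
        return summed
    # "mean": divide the sum by the number of keys in the slot.
    return [s / len(keys) for s in summed]


# Toy table: row i holds the value i in every dimension.
toy_table = [[float(i)] * 4 for i in range(8)]
mean_vec = combine_slot(toy_table, [2, 4], "mean")
sum_vec = combine_slot(toy_table, [2, 4], "sum")

# With an all-ones table, the mean of any non-empty slot is all ones.
ones_table = [[1.0] * 4 for _ in range(8)]
ones_mean = combine_slot(ones_table, [1, 5, 7], "mean")

print(mean_vec, sum_vec, ones_mean)
```

The last case matches the setup in this notebook, where the embedding table is initialized to ones and the combiner is `mean`.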

Fifthly, generate a synthetic dataset and the initial values used to initialize the embedding parameters.

In [5]:
# import utility python script
sys.path.append("../unit_test/test_scripts/")
import utils

In [6]:
# -1 is used to represent the invalid keys
random_samples = utils.generate_random_samples(num_of_samples=args["global_batch_size"] * args["iter_num"],
                                               vocabulary_size=args["gpu_num"] * args["max_vocabulary_size_per_gpu"],
                                               slot_num=args["slot_num"],
                                               max_nnz=args["max_nnz"])

[INFO]: begin to generate random samples
[INFO]: generated random samples
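
The generated keys are padded with `-1` up to `max_nnz` entries per slot. One way to picture how such padded keys map to a sparse representation is to keep only the valid keys together with their (row, column) coordinates. The sketch below is an assumption for illustration: the helper `padded_to_coo` is hypothetical, and the actual conversion inside `utils.tf_dataset(..., to_sparse_tensor=True)` is not shown in this notebook.

```python
def padded_to_coo(samples, invalid_key=-1):
    """Convert [batch][slot][max_nnz] padded keys to (indices, values, dense_shape).

    Rows are flattened to batch * slot_num, one row per (sample, slot) pair.
    """
    slot_num = len(samples[0])
    max_nnz = len(samples[0][0])
    indices, values = [], []
    for b, sample in enumerate(samples):
        for s, slot in enumerate(sample):
            row = b * slot_num + s
            for col, key in enumerate(slot):
                if key != invalid_key:  # drop the padding entries
                    indices.append((row, col))
                    values.append(key)
    dense_shape = (len(samples) * slot_num, max_nnz)
    return indices, values, dense_shape


# Two samples, two slots each, max_nnz = 4.
toy_samples = [
    [[38, -1, -1, -1], [819, -1, -1, -1]],
    [[676, 607, 573, -1], [-1, -1, -1, -1]],
]
indices, values, dense_shape = padded_to_coo(toy_samples)
print(indices)
print(values)
print(dense_shape)
```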


In [7]:
# check random_samples
random_samples

(array([[[  38,   -1,   -1,   -1],
         [ 819,   -1,   -1,   -1],
         [2444, 2431,   -1,   -1],
         ...,
         [6389, 5973,   -1,   -1],
         [  -1,   -1,   -1,   -1],
         [7910, 7850,   -1,   -1]],
 
        [[ 676,  607,  573,   -1],
         [1631, 1563,   -1,   -1],
         [1733, 1699,   -1,   -1],
         ...,
         [6459, 5950,   -1,   -1],
         [7234, 6724,   -1,   -1],
         [8163,   -1,   -1,   -1]],
 
        [[ 438,  283,   -1,   -1],
         [1212, 1057,  902,   -1],
         [1831,   -1,   -1,   -1],
         ...,
         [6414, 6336,   -1,   -1],
         [6863,   -1,   -1,   -1],
         [7469,   -1,   -1,   -1]],
 
        ...,
 
        [[ 688,  618,   -1,   -1],
         [1257,   -1,   -1,   -1],
         [1975, 1897, 1647,   -1],
         ...,
         [6439, 6337,   -1,   -1],
         [7309, 6642,   -1,   -1],
         [7665, 7614, 7512,   -1]],
 
        [[ 505,  454,   -1,   -1],
         [1527, 1476, 1426,   -1],
       

In [8]:
# generate initial value for embedding parameters
init_tensors = utils.get_ones_tensor(max_vocab_size_per_gpu=args["max_vocabulary_size_per_gpu"],
                                     embedding_vec_size=args["embedding_vec_size"],
                                     num=args["gpu_num"])

In [9]:
# check init_tensors
init_tensors

[array([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        ...,
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]], dtype=float32),
 array([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        ...,
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]], dtype=float32),
 array([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        ...,
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]], dtype=float32),
 array([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        ...,
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]], dtype=float32),
 array([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        ...,
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]], dtype=float32),
 array([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1

Sixthly, define the training loops for TensorFlow and SOK.

In [10]:
def test_tf_demo(args, init_tensors, *random_samples):
    dataset = utils.tf_dataset(*random_samples, batchsize=args["global_batch_size"], to_sparse_tensor=True, repeat=1)

    loss_fn = tf.keras.losses.BinaryCrossentropy(from_logits=True)

    tf_demo = TfDemo(init_tensors, args["combiner"], args["global_batch_size"], 
                     args["slot_num"], args["embedding_vec_size"])

    optimizer = utils.get_dense_optimizer(args["optimizer"])(learning_rate=0.1)

    @tf.function
    def _train_step(inputs, labels):
        with tf.GradientTape() as tape:
            logit, embedding_vector = tf_demo(inputs, training=True)
            loss = loss_fn(labels, logit)
        grads = tape.gradient(loss, tf_demo.trainable_variables)
        optimizer.apply_gradients(zip(grads, tf_demo.trainable_variables))
        return logit, embedding_vector

    tf_results = list()

    for i, (sparse_tensors, labels) in enumerate(dataset):
        print("-"*30, str(i), "-"*30)
        logit, embedding_vector = _train_step(sparse_tensors, labels)
        print("[INFO]: embedding_vector:\n", embedding_vector)
        tf_results.append(embedding_vector)

        # FIXME: the SOK training loop sleeps after each step, so this sleep
        # is only here to mirror that loop in the TensorFlow demo.
        import time
        time.sleep(0.2) # seconds

    return tf_results

In [11]:
def test_sok_demo(args, init_tensors, *random_samples):
    strategy = tf.distribute.MirroredStrategy()
    with strategy.scope():
        result = sok.Init(global_batch_size=args["global_batch_size"])

        plugin_demo = SOKDemo(combiner=args["combiner"], 
                                 max_vocabulary_size_per_gpu=args["max_vocabulary_size_per_gpu"],
                                 slot_num=args["slot_num"], max_nnz=args["max_nnz"],
                                 embedding_vec_size=args["embedding_vec_size"])

        emb_opt = utils.get_embedding_optimizer(args["optimizer"])(learning_rate=0.1)
        dense_opt = utils.get_dense_optimizer(args["optimizer"])(learning_rate=0.1)

    plugin_saver = sok.Saver()

    plugin_saver.load_embedding_values(plugin_demo.embedding_layer.embedding_variable, init_tensors)

    loss_fn = tf.keras.losses.BinaryCrossentropy(from_logits=True, reduction=tf.keras.losses.Reduction.NONE)
    def _replica_loss(labels, logits):
        loss = loss_fn(labels, logits)
        return tf.nn.compute_average_loss(loss, global_batch_size=args["global_batch_size"])

    @tf.function
    def _train_step(inputs, labels):
        with tf.GradientTape() as tape:
            logit, embedding_vector = plugin_demo(inputs, training=True)
            loss = _replica_loss(labels, logit)
        embedding_variables, other_variable = sok.split_embedding_variable_from_others(plugin_demo.trainable_variables)
        grads, emb_grads = tape.gradient(loss, [other_variable, embedding_variables])
        if 'plugin' not in args["optimizer"]:
            with sok.OptimizerScope(embedding_variables):
                emb_opt.apply_gradients(zip(emb_grads, embedding_variables),
                                        experimental_aggregate_gradients=False)
        else:
            emb_opt.apply_gradients(zip(emb_grads, embedding_variables),
                                    experimental_aggregate_gradients=False)
        dense_opt.apply_gradients(zip(grads, other_variable))
        return logit, embedding_vector

    sok_results = list()

    def _dataset_fn(input_context):
        replica_batch_size = input_context.get_per_replica_batch_size(args["global_batch_size"])
        dataset = utils.tf_dataset(*random_samples, batchsize=replica_batch_size, to_sparse_tensor=True, repeat=1)
        dataset = dataset.shard(input_context.num_input_pipelines, input_context.input_pipeline_id)
        return dataset

    dataset = strategy.distribute_datasets_from_function(_dataset_fn)
    
    for i, (sparse_tensors, replica_labels) in enumerate(dataset):
        print("-" * 30, "step ", str(i), "-" * 30)
        logit, embedding_vector = strategy.run(_train_step, args=(sparse_tensors, replica_labels))
        print("[INFO]: embedding_vector\n", embedding_vector)
        sok_results.append(embedding_vector)

        # FIXME: when the forward computation is too fast, there may be
        # conflicts with the data reader, which can cause the program to hang.
        import time
        time.sleep(0.2) # seconds

    return sok_results
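
In `_replica_loss`, the per-example losses of one replica are summed and divided by the global batch size rather than by the replica's own batch size, so that adding the per-replica values gives the mean loss over the global batch. The following sketch checks that arithmetic in plain Python; the function `average_loss` and the toy numbers are illustrative only.

```python
def average_loss(per_example_losses, global_batch_size):
    """Sum of per-example losses divided by the global batch size."""
    return sum(per_example_losses) / global_batch_size


# Toy per-example losses for a global batch of 8, split over 4 replicas.
global_losses = [0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6]
num_replicas = 4
global_batch_size = len(global_losses)
per_replica = global_batch_size // num_replicas

# Each replica scales its local sum by the *global* batch size.
replica_losses = [
    average_loss(global_losses[r * per_replica:(r + 1) * per_replica],
                 global_batch_size)
    for r in range(num_replicas)
]

# Summing the replica values recovers the mean over the whole batch.
global_mean = sum(global_losses) / global_batch_size
print(sum(replica_losses), global_mean)
```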

Seventhly, start the training process and compare whether the embedding vectors obtained from TensorFlow and SOK are consistent in every iteration.
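
The comparison described here can be automated instead of inspecting the printed tensors by eye. The sketch below is one possible approach on plain nested lists, under two assumptions that this notebook does not confirm: the per-replica outputs, concatenated in replica order, line up with the rows of the single-device output, and an absolute tolerance of `1e-4` is acceptable. The helpers `max_abs_diff` and `results_match` are hypothetical.

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two 2-D lists."""
    return max(
        (abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)),
        default=0.0,
    )


def results_match(tf_step, sok_replicas, atol=1e-4):
    """Concatenate per-replica rows in replica order and compare with tf_step."""
    sok_step = [row for replica in sok_replicas for row in replica]
    if len(sok_step) != len(tf_step):
        return False
    if any(len(r1) != len(r2) for r1, r2 in zip(tf_step, sok_step)):
        return False
    return max_abs_diff(tf_step, sok_step) <= atol


# Toy rows: one step of single-device output and two replicas of one row each.
tf_step = [[-0.24626109, -0.21986282], [-0.22018900, -0.20702100]]
sok_replicas = [[[-0.24625628, -0.21985784]], [[-0.22018811, -0.20702025]]]

print(results_match(tf_step, sok_replicas))
```

In the real notebook the same idea would be applied to every step of `tf_results` and `sok_results` after converting the tensors to nested lists or NumPy arrays.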

In [12]:
# train TensorFlow Demo Model, this command will print each iteration's embedding vector
tf_results = test_tf_demo(args, init_tensors, *random_samples)

------------------------------ 0 ------------------------------
[INFO]: embedding_vector:
 tf.Tensor(
[[1. 1. 1. ... 1. 1. 1.]
 [1. 1. 1. ... 1. 1. 1.]
 [1. 1. 1. ... 1. 1. 1.]
 ...
 [1. 1. 1. ... 1. 1. 1.]
 [1. 1. 1. ... 1. 1. 1.]
 [1. 1. 1. ... 1. 1. 1.]], shape=(65536, 40), dtype=float32)
------------------------------ 1 ------------------------------
[INFO]: embedding_vector:
 tf.Tensor(
[[0.9004926  0.9004926  0.9004926  ... 0.90051043 0.90051043 0.90051043]
 [0.9005401  0.9005401  0.9005401  ... 0.90051115 0.90051115 0.90051115]
 [0.9005135  0.9005135  0.9005135  ... 0.90062284 0.90062284 0.90062284]
 ...
 [0.90055627 0.90055627 0.90055627 ... 0.9005     0.9005     0.9005    ]
 [0.9004842  0.9004842  0.9004842  ... 0.90050197 0.90050197 0.90050197]
 [0.90063393 0.90063393 0.90063393 ... 0.9005594  0.9005594  0.9005594 ]], shape=(65536, 40), dtype=float32)
------------------------------ 2 ------------------------------
[INFO]: embedding_vector:
 tf.Tensor(
[[0.8013395  0.8013395  

------------------------------ 15 ------------------------------
[INFO]: embedding_vector:
 tf.Tensor(
[[-0.17062765 -0.17062765 -0.17062765 ... -0.17748016 -0.17748016
  -0.17748016]
 [-0.18568465 -0.18568465 -0.18568465 ... -0.16705126 -0.16705126
  -0.16705126]
 [-0.15869047 -0.15869047 -0.15869047 ... -0.17683107 -0.17683107
  -0.17683107]
 ...
 [-0.16965656 -0.16965656 -0.16965656 ... -0.18408883 -0.18408883
  -0.18408883]
 [-0.16662782 -0.16662782 -0.16662782 ... -0.17666113 -0.17666113
  -0.17666113]
 [-0.15216029 -0.15216029 -0.15216029 ... -0.17087948 -0.17087948
  -0.17087948]], shape=(65536, 40), dtype=float32)
------------------------------ 16 ------------------------------
[INFO]: embedding_vector:
 tf.Tensor(
[[-0.24626109 -0.24626109 -0.24626109 ... -0.21986282 -0.21986282
  -0.21986282]
 [-0.20474827 -0.20474827 -0.20474827 ... -0.20359749 -0.20359749
  -0.20359749]
 [-0.20505694 -0.20505694 -0.20505694 ... -0.24400447 -0.24400447
  -0.24400447]
 ...
 [-0.21973535 -0.21

------------------------------ 29 ------------------------------
[INFO]: embedding_vector:
 tf.Tensor(
[[-0.19414708 -0.19414708 -0.19414708 ... -0.16186176 -0.16186176
  -0.16186176]
 [-0.18264094 -0.18264094 -0.18264094 ... -0.18946269 -0.18946269
  -0.18946269]
 [-0.18857247 -0.18857247 -0.18857247 ... -0.19120839 -0.19120839
  -0.19120839]
 ...
 [-0.1658213  -0.1658213  -0.1658213  ... -0.19272026 -0.19272026
  -0.19272026]
 [-0.1939144  -0.1939144  -0.1939144  ... -0.1559137  -0.1559137
  -0.1559137 ]
 [-0.16430692 -0.16430692 -0.16430692 ... -0.17493583 -0.17493583
  -0.17493583]], shape=(65536, 40), dtype=float32)
------------------------------ 30 ------------------------------
[INFO]: embedding_vector:
 tf.Tensor(
[[-0.17045125 -0.17045125 -0.17045125 ... -0.15462823 -0.15462823
  -0.15462823]
 [-0.17679203 -0.17679203 -0.17679203 ... -0.16658922 -0.16658922
  -0.16658922]
 [-0.15218404 -0.15218404 -0.15218404 ... -0.15486553 -0.15486553
  -0.15486553]
 ...
 [-0.15468912 -0.154

------------------------------ 43 ------------------------------
[INFO]: embedding_vector:
 tf.Tensor(
[[-0.2105065  -0.2105065  -0.2105065  ... -0.18041237 -0.18041237
  -0.18041237]
 [-0.20118019 -0.20118019 -0.20118019 ... -0.26570445 -0.26570445
  -0.26570445]
 [-0.20779352 -0.20779352 -0.20779352 ... -0.18782339 -0.18782339
  -0.18782339]
 ...
 [-0.18462594 -0.18462594 -0.18462594 ... -0.19564806 -0.19564806
  -0.19564806]
 [-0.22160469 -0.22160469 -0.22160469 ... -0.18537885 -0.18537885
  -0.18537885]
 [-0.20659022 -0.20659022 -0.20659022 ... -0.22612928 -0.22612928
  -0.22612928]], shape=(65536, 40), dtype=float32)
------------------------------ 44 ------------------------------
[INFO]: embedding_vector:
 tf.Tensor(
[[-0.2109933  -0.2109933  -0.2109933  ... -0.21877787 -0.21877787
  -0.21877787]
 [-0.21374518 -0.21374518 -0.21374518 ... -0.16920957 -0.16920957
  -0.16920957]
 [-0.23200426 -0.23200426 -0.23200426 ... -0.20374995 -0.20374995
  -0.20374995]
 ...
 [-0.22763464 -0.22

In [13]:
# train SOK Demo Model, this command will print each iteration's embedding vector 
sok_results = test_sok_demo(args, init_tensors, *random_samples)

INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0', '/job:localhost/replica:0/task:0/device:GPU:1', '/job:localhost/replica:0/task:0/device:GPU:2', '/job:localhost/replica:0/task:0/device:GPU:3', '/job:localhost/replica:0/task:0/device:GPU:4', '/job:localhost/replica:0/task:0/device:GPU:5', '/job:localhost/replica:0/task:0/device:GPU:6', '/job:localhost/replica:0/task:0/device:GPU:7')
You are using the plugin with MirroredStrategy.
------------------------------ step  0 ------------------------------
INFO:tensorflow:batch_all_reduce: 2 all-reduces with algorithm = nccl, num_packs = 1
INFO:tensorflow:batch_all_reduce: 2 all-reduces with algorithm = nccl, num_packs = 1
[INFO]: embedding_vector
 PerReplica:{
  0: tf.Tensor(
[[1. 1. 1. ... 1. 1. 1.]
 [1. 1. 1. ... 1. 1. 1.]
 [1. 1. 1. ... 1. 1. 1.]
 ...
 [1. 1. 1. ... 1. 1. 1.]
 [1. 1. 1. ... 1. 1. 1.]
 [1. 1. 1. ... 1. 1. 1.]], shape=(8192, 40), dtype=float32),
  1: tf.Tensor(
[[1. 1. 1. ..

------------------------------ step  3 ------------------------------
[INFO]: embedding_vector
 PerReplica:{
  0: tf.Tensor(
[[0.7060577  0.7060577  0.7060577  ... 0.70395017 0.70395017 0.70395017]
 [0.7036203  0.7036203  0.7036203  ... 0.7027384  0.7027384  0.7027384 ]
 [0.7021664  0.7021664  0.7021664  ... 0.7022897  0.7022897  0.7022897 ]
 ...
 [0.7051804  0.7051804  0.7051804  ... 0.7032761  0.7032761  0.7032761 ]
 [0.70289946 0.70289946 0.70289946 ... 0.70413184 0.70413184 0.70413184]
 [0.7064773  0.7064773  0.7064773  ... 0.703575   0.703575   0.703575  ]], shape=(8192, 40), dtype=float32),
  1: tf.Tensor(
[[0.7029407  0.7029407  0.7029407  ... 0.70313156 0.70313156 0.70313156]
 [0.70317686 0.70317686 0.70317686 ... 0.70624536 0.70624536 0.70624536]
 [0.7043319  0.7043319  0.7043319  ... 0.7035043  0.7035043  0.7035043 ]
 ...
 [0.70132977 0.70132977 0.70132977 ... 0.70977783 0.70977783 0.70977783]
 [0.7031638  0.7031638  0.7031638  ... 0.7029564  0.7029564  0.7029564 ]
 [0.703057

------------------------------ step  6 ------------------------------
[INFO]: embedding_vector
 PerReplica:{
  0: tf.Tensor(
[[0.4105538  0.4105538  0.4105538  ... 0.42588177 0.42588177 0.42588177]
 [0.4197083  0.4197083  0.4197083  ... 0.41922885 0.41922885 0.41922885]
 [0.41254905 0.41254905 0.41254905 ... 0.41198918 0.41198918 0.41198918]
 ...
 [0.42641217 0.42641217 0.42641217 ... 0.41832227 0.41832227 0.41832227]
 [0.42001283 0.42001283 0.42001283 ... 0.413566   0.413566   0.413566  ]
 [0.4191956  0.4191956  0.4191956  ... 0.4227935  0.4227935  0.4227935 ]], shape=(8192, 40), dtype=float32),
  1: tf.Tensor(
[[0.4204505  0.4204505  0.4204505  ... 0.4150052  0.4150052  0.4150052 ]
 [0.42168778 0.42168778 0.42168778 ... 0.41749144 0.41749144 0.41749144]
 [0.41580355 0.41580355 0.41580355 ... 0.42963973 0.42963973 0.42963973]
 ...
 [0.42314714 0.42314714 0.42314714 ... 0.4265498  0.4265498  0.4265498 ]
 [0.41508466 0.41508466 0.41508466 ... 0.41742906 0.41742906 0.41742906]
 [0.418328

------------------------------ step  9 ------------------------------
[INFO]: embedding_vector
 PerReplica:{
  0: tf.Tensor(
[[0.1579948  0.1579948  0.1579948  ... 0.17363048 0.17363048 0.17363048]
 [0.14760572 0.14760572 0.14760572 ... 0.15634492 0.15634492 0.15634492]
 [0.15677038 0.15677038 0.15677038 ... 0.16316155 0.16316155 0.16316155]
 ...
 [0.16873369 0.16873369 0.16873369 ... 0.15639687 0.15639687 0.15639687]
 [0.16345012 0.16345012 0.16345012 ... 0.1576001  0.1576001  0.1576001 ]
 [0.16045508 0.16045508 0.16045508 ... 0.15251702 0.15251702 0.15251702]], shape=(8192, 40), dtype=float32),
  1: tf.Tensor(
[[0.15678135 0.15678135 0.15678135 ... 0.16477747 0.16477747 0.16477747]
 [0.16192615 0.16192615 0.16192615 ... 0.16698089 0.16698089 0.16698089]
 [0.15099943 0.15099943 0.15099943 ... 0.17474762 0.17474762 0.17474762]
 ...
 [0.16206431 0.16206431 0.16206431 ... 0.16256058 0.16256058 0.16256058]
 [0.1687254  0.1687254  0.1687254  ... 0.16451517 0.16451517 0.16451517]
 [0.154466

------------------------------ step  12 ------------------------------
[INFO]: embedding_vector
 PerReplica:{
  0: tf.Tensor(
[[-0.01844369 -0.01844369 -0.01844369 ... -0.03240526 -0.03240526
  -0.03240526]
 [-0.03888315 -0.03888315 -0.03888315 ... -0.02606202 -0.02606202
  -0.02606202]
 [-0.02816605 -0.02816605 -0.02816605 ... -0.02493167 -0.02493167
  -0.02493167]
 ...
 [-0.02413027 -0.02413027 -0.02413027 ... -0.04944995 -0.04944995
  -0.04944995]
 [-0.03621512 -0.03621512 -0.03621512 ... -0.01311048 -0.01311048
  -0.01311048]
 [-0.03537244 -0.03537244 -0.03537244 ... -0.03908113 -0.03908113
  -0.03908113]], shape=(8192, 40), dtype=float32),
  1: tf.Tensor(
[[-0.02457406 -0.02457406 -0.02457406 ... -0.00881258 -0.00881258
  -0.00881258]
 [-0.02927418 -0.02927418 -0.02927418 ... -0.03563254 -0.03563254
  -0.03563254]
 [-0.02485804 -0.02485804 -0.02485804 ... -0.02413274 -0.02413274
  -0.02413274]
 ...
 [-0.02219195 -0.02219195 -0.02219195 ... -0.02059275 -0.02059275
  -0.02059275]
 [

------------------------------ step  14 ------------------------------
[INFO]: embedding_vector
 PerReplica:{
  0: tf.Tensor(
[[-0.13456151 -0.13456151 -0.13456151 ... -0.12766093 -0.12766093
  -0.12766093]
 [-0.12349813 -0.12349813 -0.12349813 ... -0.12989113 -0.12989113
  -0.12989113]
 [-0.13824216 -0.13824216 -0.13824216 ... -0.12623492 -0.12623492
  -0.12623492]
 ...
 [-0.16102213 -0.16102213 -0.16102213 ... -0.12952195 -0.12952195
  -0.12952195]
 [-0.12476768 -0.12476768 -0.12476768 ... -0.12646726 -0.12646726
  -0.12646726]
 [-0.1300712  -0.1300712  -0.1300712  ... -0.13060434 -0.13060434
  -0.13060434]], shape=(8192, 40), dtype=float32),
  1: tf.Tensor(
[[-0.12628663 -0.12628663 -0.12628663 ... -0.12169008 -0.12169008
  -0.12169008]
 [-0.11757454 -0.11757454 -0.11757454 ... -0.12088549 -0.12088549
  -0.12088549]
 [-0.12744215 -0.12744215 -0.12744215 ... -0.11539056 -0.11539056
  -0.11539056]
 ...
 [-0.12925446 -0.12925446 -0.12925446 ... -0.13329597 -0.13329597
  -0.13329597]
 [

------------------------------ step  16 ------------------------------
[INFO]: embedding_vector
 PerReplica:{
  0: tf.Tensor(
[[-0.24625628 -0.24625628 -0.24625628 ... -0.21985784 -0.21985784
  -0.21985784]
 [-0.20474344 -0.20474344 -0.20474344 ... -0.20359266 -0.20359266
  -0.20359266]
 [-0.20505208 -0.20505208 -0.20505208 ... -0.2439996  -0.2439996
  -0.2439996 ]
 ...
 [-0.19627512 -0.19627512 -0.19627512 ... -0.21528088 -0.21528088
  -0.21528088]
 [-0.20826048 -0.20826048 -0.20826048 ... -0.21069205 -0.21069205
  -0.21069205]
 [-0.2158664  -0.2158664  -0.2158664  ... -0.20976496 -0.20976496
  -0.20976496]], shape=(8192, 40), dtype=float32),
  1: tf.Tensor(
[[-0.22018811 -0.22018811 -0.22018811 ... -0.20702025 -0.20702025
  -0.20702025]
 [-0.20266518 -0.20266518 -0.20266518 ... -0.20552881 -0.20552881
  -0.20552881]
 [-0.22961155 -0.22961155 -0.22961155 ... -0.22339785 -0.22339785
  -0.22339785]
 ...
 [-0.21654728 -0.21654728 -0.21654728 ... -0.22036771 -0.22036771
  -0.22036771]
 [-

------------------------------ step  18 ------------------------------
[INFO]: embedding_vector
 PerReplica:{
  0: tf.Tensor(
[[-0.27355644 -0.27355644 -0.27355644 ... -0.27461708 -0.27461708
  -0.27461708]
 [-0.2757353  -0.2757353  -0.2757353  ... -0.26505312 -0.26505312
  -0.26505312]
 [-0.2471411  -0.2471411  -0.2471411  ... -0.29370663 -0.29370663
  -0.29370663]
 ...
 [-0.2874976  -0.2874976  -0.2874976  ... -0.2775972  -0.2775972
  -0.2775972 ]
 [-0.2692095  -0.2692095  -0.2692095  ... -0.25273898 -0.25273898
  -0.25273898]
 [-0.2916608  -0.2916608  -0.2916608  ... -0.26685974 -0.26685974
  -0.26685974]], shape=(8192, 40), dtype=float32),
  1: tf.Tensor(
[[-0.24646106 -0.24646106 -0.24646106 ... -0.27069205 -0.27069205
  -0.27069205]
 [-0.2740635  -0.2740635  -0.2740635  ... -0.2720453  -0.2720453
  -0.2720453 ]
 [-0.27998734 -0.27998734 -0.27998734 ... -0.28767952 -0.28767952
  -0.28767952]
 ...
 [-0.26766163 -0.26766163 -0.26766163 ... -0.27459824 -0.27459824
  -0.27459824]
 [-0

------------------------------ step  20 ------------------------------
[INFO]: embedding_vector
 PerReplica:{
  0: tf.Tensor(
[[-0.29480737 -0.29480737 -0.29480737 ... -0.29964885 -0.29964885
  -0.29964885]
 [-0.28105026 -0.28105026 -0.28105026 ... -0.29397857 -0.29397857
  -0.29397857]
 [-0.31948215 -0.31948215 -0.31948215 ... -0.29435566 -0.29435566
  -0.29435566]
 ...
 [-0.29461217 -0.29461217 -0.29461217 ... -0.29549837 -0.29549837
  -0.29549837]
 [-0.32828438 -0.32828438 -0.32828438 ... -0.31999078 -0.31999078
  -0.31999078]
 [-0.30701736 -0.30701736 -0.30701736 ... -0.32273546 -0.32273546
  -0.32273546]], shape=(8192, 40), dtype=float32),
  1: tf.Tensor(
[[-0.31106097 -0.31106097 -0.31106097 ... -0.31862557 -0.31862557
  -0.31862557]
 [-0.3205421  -0.3205421  -0.3205421  ... -0.29235953 -0.29235953
  -0.29235953]
 [-0.29246277 -0.29246277 -0.29246277 ... -0.3035956  -0.3035956
  -0.3035956 ]
 ...
 [-0.29327172 -0.29327172 -0.29327172 ... -0.30096027 -0.30096027
  -0.30096027]
 [-

------------------------------ step  22 ------------------------------
[INFO]: embedding_vector
 PerReplica:{
  0: tf.Tensor(
[[-0.29692656 -0.29692656 -0.29692656 ... -0.29907715 -0.29907715
  -0.29907715]
 [-0.2902543  -0.2902543  -0.2902543  ... -0.31220055 -0.31220055
  -0.31220055]
 [-0.30377987 -0.30377987 -0.30377987 ... -0.30168673 -0.30168673
  -0.30168673]
 ...
 [-0.32135484 -0.32135484 -0.32135484 ... -0.32732773 -0.32732773
  -0.32732773]
 [-0.29787904 -0.29787904 -0.29787904 ... -0.29608577 -0.29608577
  -0.29608577]
 [-0.32090095 -0.32090095 -0.32090095 ... -0.29767215 -0.29767215
  -0.29767215]], shape=(8192, 40), dtype=float32),
  1: tf.Tensor(
[[-0.30389607 -0.30389607 -0.30389607 ... -0.289379   -0.289379
  -0.289379  ]
 [-0.27885956 -0.27885956 -0.27885956 ... -0.30369315 -0.30369315
  -0.30369315]
 [-0.29801762 -0.29801762 -0.29801762 ... -0.2984103  -0.2984103
  -0.2984103 ]
 ...
 [-0.29698035 -0.29698035 -0.29698035 ... -0.30005583 -0.30005583
  -0.30005583]
 [-0.

------------------------------ step  24 ------------------------------
[INFO]: embedding_vector
 PerReplica:{
  0: tf.Tensor(
[[-0.26579773 -0.26579773 -0.26579773 ... -0.28213546 -0.28213546
  -0.28213546]
 [-0.28979206 -0.28979206 -0.28979206 ... -0.24270838 -0.24270838
  -0.24270838]
 [-0.25450405 -0.25450405 -0.25450405 ... -0.28582194 -0.28582194
  -0.28582194]
 ...
 [-0.27448225 -0.27448225 -0.27448225 ... -0.28782874 -0.28782874
  -0.28782874]
 [-0.26438886 -0.26438886 -0.26438886 ... -0.2636644  -0.2636644
  -0.2636644 ]
 [-0.2883687  -0.2883687  -0.2883687  ... -0.29709715 -0.29709715
  -0.29709715]], shape=(8192, 40), dtype=float32),
  1: tf.Tensor(
[[-0.24874389 -0.24874389 -0.24874389 ... -0.2826122  -0.2826122
  -0.2826122 ]
 [-0.2706433  -0.2706433  -0.2706433  ... -0.30472493 -0.30472493
  -0.30472493]
 [-0.26901674 -0.26901674 -0.26901674 ... -0.2697509  -0.2697509
  -0.2697509 ]
 ...
 [-0.26536915 -0.26536915 -0.26536915 ... -0.26986572 -0.26986572
  -0.26986572]
 [-0.

------------------------------ step  26 ------------------------------
[INFO]: embedding_vector
 PerReplica:{
  0: tf.Tensor(
[[-0.24492475 -0.24492475 -0.24492475 ... -0.24990399 -0.24990399
  -0.24990399]
 [-0.26482874 -0.26482874 -0.26482874 ... -0.189196   -0.189196
  -0.189196  ]
 [-0.2322499  -0.2322499  -0.2322499  ... -0.25062007 -0.25062007
  -0.25062007]
 ...
 [-0.25369114 -0.25369114 -0.25369114 ... -0.2361034  -0.2361034
  -0.2361034 ]
 [-0.24962084 -0.24962084 -0.24962084 ... -0.21201885 -0.21201885
  -0.21201885]
 [-0.23945338 -0.23945338 -0.23945338 ... -0.23686072 -0.23686072
  -0.23686072]], shape=(8192, 40), dtype=float32),
  1: tf.Tensor(
[[-0.22326243 -0.22326243 -0.22326243 ... -0.21573405 -0.21573405
  -0.21573405]
 [-0.21413173 -0.21413173 -0.21413173 ... -0.26417392 -0.26417392
  -0.26417392]
 [-0.1993093  -0.1993093  -0.1993093  ... -0.22902411 -0.22902411
  -0.22902411]
 ...
 [-0.22004037 -0.22004037 -0.22004037 ... -0.21198806 -0.21198806
  -0.21198806]
 [-0.

------------------------------ step  28 ------------------------------
[INFO]: embedding_vector
 PerReplica:{
  0: tf.Tensor(
[[-0.18635729 -0.18635729 -0.18635729 ... -0.20042941 -0.20042941
  -0.20042941]
 [-0.19710803 -0.19710803 -0.19710803 ... -0.17423408 -0.17423408
  -0.17423408]
 [-0.15554178 -0.15554178 -0.15554178 ... -0.18868107 -0.18868107
  -0.18868107]
 ...
 [-0.19996485 -0.19996485 -0.19996485 ... -0.17833707 -0.17833707
  -0.17833707]
 [-0.18448445 -0.18448445 -0.18448445 ... -0.18585192 -0.18585192
  -0.18585192]
 [-0.18646042 -0.18646042 -0.18646042 ... -0.20021617 -0.20021617
  -0.20021617]], shape=(8192, 40), dtype=float32),
  1: tf.Tensor(
[[-0.21064489 -0.21064489 -0.21064489 ... -0.1908755  -0.1908755
  -0.1908755 ]
 [-0.19490421 -0.19490421 -0.19490421 ... -0.21328694 -0.21328694
  -0.21328694]
 [-0.21340495 -0.21340495 -0.21340495 ... -0.18713818 -0.18713818
  -0.18713818]
 ...
 [-0.1790587  -0.1790587  -0.1790587  ... -0.19395858 -0.19395858
  -0.19395858]
 [-

------------------------------ step  30 ------------------------------
[INFO]: embedding_vector
 PerReplica:{
  0: tf.Tensor(
[[-0.17044371 -0.17044371 -0.17044371 ... -0.15462027 -0.15462027
  -0.15462027]
 [-0.1767847  -0.1767847  -0.1767847  ... -0.16658153 -0.16658153
  -0.16658153]
 [-0.15217653 -0.15217653 -0.15217653 ... -0.15485828 -0.15485828
  -0.15485828]
 ...
 [-0.17875132 -0.17875132 -0.17875132 ... -0.20561591 -0.20561591
  -0.20561591]
 [-0.1772938  -0.1772938  -0.1772938  ... -0.17730264 -0.17730264
  -0.17730264]
 [-0.11076629 -0.11076629 -0.11076629 ... -0.19088434 -0.19088434
  -0.19088434]], shape=(8192, 40), dtype=float32),
  1: tf.Tensor(
[[-0.20619439 -0.20619439 -0.20619439 ... -0.1684539  -0.1684539
  -0.1684539 ]
 [-0.18725482 -0.18725482 -0.18725482 ... -0.16075353 -0.16075353
  -0.16075353]
 [-0.17245702 -0.17245702 -0.17245702 ... -0.16585478 -0.16585478
  -0.16585478]
 ...
 [-0.13223642 -0.13223642 -0.13223642 ... -0.14630026 -0.14630026
  -0.14630026]
 [-

------------------------------ step  32 ------------------------------
[INFO]: embedding_vector
 PerReplica:{
  0: tf.Tensor(
[[-0.12899607 -0.12899607 -0.12899607 ... -0.14400077 -0.14400077
  -0.14400077]
 [-0.1569424  -0.1569424  -0.1569424  ... -0.15274674 -0.15274674
  -0.15274674]
 [-0.12037688 -0.12037688 -0.12037688 ... -0.16365762 -0.16365762
  -0.16365762]
 ...
 [-0.17451555 -0.17451555 -0.17451555 ... -0.14207014 -0.14207014
  -0.14207014]
 [-0.16671592 -0.16671592 -0.16671592 ... -0.17051905 -0.17051905
  -0.17051905]
 [-0.15482403 -0.15482403 -0.15482403 ... -0.13039105 -0.13039105
  -0.13039105]], shape=(8192, 40), dtype=float32),
  1: tf.Tensor(
[[-0.15434495 -0.15434495 -0.15434495 ... -0.13361761 -0.13361761
  -0.13361761]
 [-0.15515858 -0.15515858 -0.15515858 ... -0.14111197 -0.14111197
  -0.14111197]
 [-0.14025939 -0.14025939 -0.14025939 ... -0.14732832 -0.14732832
  -0.14732832]
 ...
 [-0.14195338 -0.14195338 -0.14195338 ... -0.1728141  -0.1728141
  -0.1728141 ]
 [-

------------------------------ step  34 ------------------------------
[INFO]: embedding_vector
 PerReplica:{
  0: tf.Tensor(
[[-0.12835163 -0.12835163 -0.12835163 ... -0.1330441  -0.1330441
  -0.1330441 ]
 [-0.12473193 -0.12473193 -0.12473193 ... -0.14387807 -0.14387807
  -0.14387807]
 [-0.16320704 -0.16320704 -0.16320704 ... -0.13258144 -0.13258144
  -0.13258144]
 ...
 [-0.13822386 -0.13822386 -0.13822386 ... -0.15052137 -0.15052137
  -0.15052137]
 [-0.11483593 -0.11483593 -0.11483593 ... -0.1546197  -0.1546197
  -0.1546197 ]
 [-0.12905166 -0.12905166 -0.12905166 ... -0.14812905 -0.14812905
  -0.14812905]], shape=(8192, 40), dtype=float32),
  1: tf.Tensor(
[[-0.15970655 -0.15970655 -0.15970655 ... -0.12368432 -0.12368432
  -0.12368432]
 [-0.11259543 -0.11259543 -0.11259543 ... -0.14034893 -0.14034893
  -0.14034893]
 [-0.12332937 -0.12332937 -0.12332937 ... -0.08909948 -0.08909948
  -0.08909948]
 ...
 [-0.1182633  -0.1182633  -0.1182633  ... -0.10544099 -0.10544099
  -0.10544099]
 [-0

------------------------------ step  36 ------------------------------
[INFO]: embedding_vector
 PerReplica:{
  0: tf.Tensor(
[[-0.17245945 -0.17245945 -0.17245945 ... -0.13498092 -0.13498092
  -0.13498092]
 [-0.14281653 -0.14281653 -0.14281653 ... -0.13562274 -0.13562274
  -0.13562274]
 [-0.14343955 -0.14343955 -0.14343955 ... -0.16686746 -0.16686746
  -0.16686746]
 ...
 [-0.12131992 -0.12131992 -0.12131992 ... -0.12639962 -0.12639962
  -0.12639962]
 [-0.11779241 -0.11779241 -0.11779241 ... -0.1526514  -0.1526514
  -0.1526514 ]
 [-0.14451557 -0.14451557 -0.14451557 ... -0.13266848 -0.13266848
  -0.13266848]], shape=(8192, 40), dtype=float32),
  1: tf.Tensor(
[[-0.12822081 -0.12822081 -0.12822081 ... -0.14106819 -0.14106819
  -0.14106819]
 [-0.15334699 -0.15334699 -0.15334699 ... -0.14020355 -0.14020355
  -0.14020355]
 [-0.13633403 -0.13633403 -0.13633403 ... -0.11283325 -0.11283325
  -0.11283325]
 ...
 [-0.125033   -0.125033   -0.125033   ... -0.12299336 -0.12299336
  -0.12299336]
 [-

------------------------------ step  38 ------------------------------
[INFO]: embedding_vector
 PerReplica:{
  0: tf.Tensor(
[[-0.21158373 -0.21158373 -0.21158373 ... -0.1317352  -0.1317352
  -0.1317352 ]
 [-0.15868872 -0.15868872 -0.15868872 ... -0.16532277 -0.16532277
  -0.16532277]
 [-0.13352317 -0.13352317 -0.13352317 ... -0.13595678 -0.13595678
  -0.13595678]
 ...
 [-0.16263115 -0.16263115 -0.16263115 ... -0.18466619 -0.18466619
  -0.18466619]
 [-0.15793282 -0.15793282 -0.15793282 ... -0.17517017 -0.17517017
  -0.17517017]
 [-0.09809172 -0.09809172 -0.09809172 ... -0.19828922 -0.19828922
  -0.19828922]], shape=(8192, 40), dtype=float32),
  1: tf.Tensor(
[[-0.15617965 -0.15617965 -0.15617965 ... -0.16749883 -0.16749883
  -0.16749883]
 [-0.17097507 -0.17097507 -0.17097507 ... -0.13153398 -0.13153398
  -0.13153398]
 [-0.15691312 -0.15691312 -0.15691312 ... -0.14331326 -0.14331326
  -0.14331326]
 ...
 [-0.16439992 -0.16439992 -0.16439992 ... -0.19523953 -0.19523953
  -0.19523953]
 [-

------------------------------ step  40 ------------------------------
[INFO]: embedding_vector
 PerReplica:{
  0: tf.Tensor(
[[-0.14578068 -0.14578068 -0.14578068 ... -0.20067462 -0.20067462
  -0.20067462]
 [-0.13057902 -0.13057902 -0.13057902 ... -0.19506142 -0.19506142
  -0.19506142]
 [-0.20201227 -0.20201227 -0.20201227 ... -0.16477202 -0.16477202
  -0.16477202]
 ...
 [-0.14560255 -0.14560255 -0.14560255 ... -0.16252151 -0.16252151
  -0.16252151]
 [-0.16606383 -0.16606383 -0.16606383 ... -0.17239738 -0.17239738
  -0.17239738]
 [-0.19666556 -0.19666556 -0.19666556 ... -0.15619218 -0.15619218
  -0.15619218]], shape=(8192, 40), dtype=float32),
  1: tf.Tensor(
[[-0.17178753 -0.17178753 -0.17178753 ... -0.17795643 -0.17795643
  -0.17795643]
 [-0.19007666 -0.19007666 -0.19007666 ... -0.19053559 -0.19053559
  -0.19053559]
 [-0.14407048 -0.14407048 -0.14407048 ... -0.14322627 -0.14322627
  -0.14322627]
 ...
 [-0.20170252 -0.20170252 -0.20170252 ... -0.16819994 -0.16819994
  -0.16819994]
 [

------------------------------ step  42 ------------------------------
[INFO]: embedding_vector
 PerReplica:{
  0: tf.Tensor(
[[-0.19167188 -0.19167188 -0.19167188 ... -0.26264137 -0.26264137
  -0.26264137]
 [-0.21755669 -0.21755669 -0.21755669 ... -0.2098138  -0.2098138
  -0.2098138 ]
 [-0.19374077 -0.19374077 -0.19374077 ... -0.20081697 -0.20081697
  -0.20081697]
 ...
 [-0.18769276 -0.18769276 -0.18769276 ... -0.20685501 -0.20685501
  -0.20685501]
 [-0.17655183 -0.17655183 -0.17655183 ... -0.19016416 -0.19016416
  -0.19016416]
 [-0.15241548 -0.15241548 -0.15241548 ... -0.19887695 -0.19887695
  -0.19887695]], shape=(8192, 40), dtype=float32),
  1: tf.Tensor(
[[-0.1931665  -0.1931665  -0.1931665  ... -0.13349657 -0.13349657
  -0.13349657]
 [-0.19947068 -0.19947068 -0.19947068 ... -0.19171639 -0.19171639
  -0.19171639]
 [-0.22939122 -0.22939122 -0.22939122 ... -0.2026543  -0.2026543
  -0.2026543 ]
 ...
 [-0.19573733 -0.19573733 -0.19573733 ... -0.21818948 -0.21818948
  -0.21818948]
 [-0

------------------------------ step  44 ------------------------------
[INFO]: embedding_vector
 PerReplica:{
  0: tf.Tensor(
[[-0.2109896  -0.2109896  -0.2109896  ... -0.21877484 -0.21877484
  -0.21877484]
 [-0.21374257 -0.21374257 -0.21374257 ... -0.16920632 -0.16920632
  -0.16920632]
 [-0.23200105 -0.23200105 -0.23200105 ... -0.2037467  -0.2037467
  -0.2037467 ]
 ...
 [-0.20910196 -0.20910196 -0.20910196 ... -0.1786393  -0.1786393
  -0.1786393 ]
 [-0.19057699 -0.19057699 -0.19057699 ... -0.17450081 -0.17450081
  -0.17450081]
 [-0.24734297 -0.24734297 -0.24734297 ... -0.22391725 -0.22391725
  -0.22391725]], shape=(8192, 40), dtype=float32),
  1: tf.Tensor(
[[-0.21556352 -0.21556352 -0.21556352 ... -0.20855372 -0.20855372
  -0.20855372]
 [-0.22166187 -0.22166187 -0.22166187 ... -0.18645974 -0.18645974
  -0.18645974]
 [-0.26165748 -0.26165748 -0.26165748 ... -0.20556459 -0.20556459
  -0.20556459]
 ...
 [-0.22729544 -0.22729544 -0.22729544 ... -0.26249874 -0.26249874
  -0.26249874]
 [-0

------------------------------ step  46 ------------------------------
[INFO]: embedding_vector
 PerReplica:{
  0: tf.Tensor(
[[-0.26487082 -0.26487082 -0.26487082 ... -0.23970175 -0.23970175
  -0.23970175]
 [-0.2404138  -0.2404138  -0.2404138  ... -0.26857358 -0.26857358
  -0.26857358]
 [-0.24418813 -0.24418813 -0.24418813 ... -0.22789705 -0.22789705
  -0.22789705]
 ...
 [-0.23508665 -0.23508665 -0.23508665 ... -0.23829213 -0.23829213
  -0.23829213]
 [-0.22368917 -0.22368917 -0.22368917 ... -0.2507833  -0.2507833
  -0.2507833 ]
 [-0.2787284  -0.2787284  -0.2787284  ... -0.21240795 -0.21240795
  -0.21240795]], shape=(8192, 40), dtype=float32),
  1: tf.Tensor(
[[-0.28410178 -0.28410178 -0.28410178 ... -0.27136627 -0.27136627
  -0.27136627]
 [-0.24789992 -0.24789992 -0.24789992 ... -0.2731675  -0.2731675
  -0.2731675 ]
 [-0.2615593  -0.2615593  -0.2615593  ... -0.2276803  -0.2276803
  -0.2276803 ]
 ...
 [-0.2253224  -0.2253224  -0.2253224  ... -0.24861962 -0.24861962
  -0.24861962]
 [-0.

------------------------------ step  48 ------------------------------
[INFO]: embedding_vector
 PerReplica:{
  0: tf.Tensor(
[[-0.22812551 -0.22812551 -0.22812551 ... -0.2713868  -0.2713868
  -0.2713868 ]
 [-0.2600761  -0.2600761  -0.2600761  ... -0.25804752 -0.25804752
  -0.25804752]
 [-0.22659056 -0.22659056 -0.22659056 ... -0.24469978 -0.24469978
  -0.24469978]
 ...
 [-0.21432105 -0.21432105 -0.21432105 ... -0.23977128 -0.23977128
  -0.23977128]
 [-0.28816798 -0.28816798 -0.28816798 ... -0.26758283 -0.26758283
  -0.26758283]
 [-0.20406762 -0.20406762 -0.20406762 ... -0.2416647  -0.2416647
  -0.2416647 ]], shape=(8192, 40), dtype=float32),
  1: tf.Tensor(
[[-0.22131589 -0.22131589 -0.22131589 ... -0.25021574 -0.25021574
  -0.25021574]
 [-0.21980968 -0.21980968 -0.21980968 ... -0.24015644 -0.24015644
  -0.24015644]
 [-0.25869566 -0.25869566 -0.25869566 ... -0.24941725 -0.24941725
  -0.24941725]
 ...
 [-0.23439011 -0.23439011 -0.23439011 ... -0.22400156 -0.22400156
  -0.22400156]
 [-0

Finally, check the consistency of the embedding vectors obtained from TensorFlow and SOK.

In [14]:
if (len(sok_results) != len(tf_results)):
    raise ValueError("The length of sok results is not equal to that of TensorFlow.")
if (len(tf_results) != args["iter_num"]):
    raise ValueError("The length of embedding vectors: %d is not equal to iteration number: %d."
                    %(len(tf_results), args["iter_num"]))
    
for i, sok_vector in enumerate(sok_results):
    if args["gpu_num"] != 1:
        sok_vector = tf.stack(sok_vector.values, axis=0)
    tf.debugging.assert_near(tf.reshape(sok_vector,
                                        shape=[-1, tf.shape(sok_vector)[-1]]),
                             tf_results[i],
                             atol=1e-4,
                             rtol=1e-4,
                             message="The embedding vectors obtained from TF and SOK vary in iteration: %d" %i)
    
# If no exception is raised, the embedding vectors are consistent for all iterations.
print(("\n[INFO]: With MirroredStrategy, when %d GPUs are used, the embedding vectors obtained from TensorFlow "
       "and SOK are consistent for %d iterations")
      %(args["gpu_num"], args["iter_num"]))


[INFO]: With MirroredStrategy, when 8 GPUs are used, the embedding vectors obtained from TensorFlow and SOK are consistent for 50 iterations
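
The check above relies on `tf.debugging.assert_near`, which, according to the TensorFlow documentation, accepts a pair of elements when `|x - y| <= atol + rtol * |y|`. The following is a minimal pure-Python sketch of that criterion. The helper `is_near` and the sample values are hypothetical and serve only as illustration; they are not part of TensorFlow or SOK.

```python
def is_near(x, y, atol=1e-4, rtol=1e-4):
    """Element-wise closeness test: |x - y| <= atol + rtol * |y|."""
    return all(abs(a - b) <= atol + rtol * abs(b) for a, b in zip(x, y))

# Illustrative values only; they do not come from a real comparison run.
sok_row = [-0.22326243, -0.21573405]
tf_row = [-0.22326250, -0.21573400]

print(is_near(sok_row, tf_row))               # differences around 1e-7 pass
print(is_near(sok_row, [-0.2240, -0.2157]))   # a gap of about 7e-4 fails
```

With `atol=1e-4` and `rtol=1e-4`, values of magnitude around 0.2 may differ by roughly 1.2e-4 before the assertion fires, which leaves room for floating-point reordering between the two implementations.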


### Multi-node, Multi-GPUs synchronized training ###

**Restart the Jupyter notebook before running this section.**

First, specify the hyperparameters.

In [1]:
%reset -f

args = dict()

args["iter_num"] = 50                             # the number of training iterations
args["max_vocabulary_size_per_gpu"] = 1024
args["slot_num"] = 10                             # the number of feature fields in this embedding layer
args["max_nnz"] = 4                               # the maximum number of valid features in each slot
args["embedding_vec_size"] = 4                    # the dimension of embedding vectors
args["combiner"] = "mean"                         # the reduction combiner applied within each slot; one of [mean, sum]
args["global_batch_size"] = 65536                 # the global batch size across all GPUs
args["optimizer"] = "plugin_adam"                 # the optimizer used for training; one of [plugin_adam, adam, sgd]
args["ips"] = ["localhost", "localhost"]          # the IP address of each node. Here, different GPUs on a
                                                  # single node are used to simulate multiple nodes
args["worker_num"] = len(args["ips"])             # the number of workers in synchronized training
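
As a rough sanity check, the sizes implied by these hyperparameters can be worked out by hand. The sketch below assumes a machine with 16 GPUs split across the two simulated workers, as in the run recorded in this notebook; `total_gpu_num` is hard-coded here purely for illustration.

```python
args = {
    "global_batch_size": 65536,
    "slot_num": 10,
    "embedding_vec_size": 4,
    "max_vocabulary_size_per_gpu": 1024,
    "worker_num": 2,
}
total_gpu_num = 16  # assumption: the 16-GPU machine used for the recorded run

# Samples consumed per step by one worker process and by one GPU replica.
worker_batch_size = args["global_batch_size"] // args["worker_num"]
replica_batch_size = args["global_batch_size"] // total_gpu_num
# Total number of embedding rows across all GPUs.
total_vocabulary_size = args["max_vocabulary_size_per_gpu"] * total_gpu_num
# Width of one sample after the slots are flattened.
flattened_width = args["slot_num"] * args["embedding_vec_size"]

print(worker_batch_size, replica_batch_size, total_vocabulary_size, flattened_width)
# 32768 4096 16384 40
```

The per-replica batch size and flattened width agree with the `shape=(4096, 40)` tensors printed by the multi-node run later in this section.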

In [2]:
import sys, os, json
sys.path.append("../")
import sparse_operation_kit as sok
import tensorflow as tf
import numpy as np
# import utility python script
sys.path.append("../unit_test/test_scripts/")
import utils

[INFO]: sparse_operation_kit is imported


In [3]:
total_gpu_num = utils.get_local_gpu_count()
print("[INFO]: There are %d GPUs in total" %total_gpu_num)
if (total_gpu_num % args["worker_num"] != 0):
    raise RuntimeError("total_gpu_num: %d is not divisible by worker_num: %d" %(total_gpu_num, args["worker_num"]))
    
per_worker_gpu_num = total_gpu_num // args["worker_num"]
args["local_gpu_num"] = per_worker_gpu_num # the number of available GPUs in each process

[INFO]: There are 16 GPUs in total
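
The divisibility check matters because each simulated worker is later given a contiguous, equally sized block of GPU IDs through `CUDA_VISIBLE_DEVICES`. A minimal sketch of that mapping follows; the helper name `gpus_for_task` is hypothetical and is not part of SOK or the `utils` script.

```python
def gpus_for_task(task_id, per_worker_gpu_num):
    """Return the comma-separated GPU IDs assigned to one worker process."""
    first = per_worker_gpu_num * task_id
    return ",".join(str(first + i) for i in range(per_worker_gpu_num))

total_gpu_num, worker_num = 16, 2          # values from the recorded run
per_worker_gpu_num = total_gpu_num // worker_num

for task_id in range(worker_num):
    print(task_id, gpus_for_task(task_id, per_worker_gpu_num))
# 0 0,1,2,3,4,5,6,7
# 1 8,9,10,11,12,13,14,15
```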


Second, define the DNN models using TensorFlow and SOK.

In [4]:
class TfDemo(tf.keras.models.Model):
    def __init__(self, 
                 init_tensors, 
                 combiner, 
                 global_batch_size,
                 slot_num, 
                 embedding_vec_size,
                 **kwargs):
        super(TfDemo, self).__init__(**kwargs)
        self.combiner = combiner
        self.global_batch_size = global_batch_size
        self.slot_num = slot_num
        self.embedding_vec_size = embedding_vec_size

        self.init_tensors = init_tensors
        self.params = tf.Variable(initial_value=tf.concat(self.init_tensors, axis=0))

        self.dense_layer = tf.keras.layers.Dense(units=1, activation=None,
                                                 kernel_initializer="ones",
                                                 bias_initializer="zeros")

    def call(self, inputs, training=True):
        # [batchsize * slot_num, embedding_vec_size]
        embedding_vector = tf.nn.embedding_lookup_sparse(params=self.params, sp_ids=inputs,
                                                        sp_weights=None, combiner=self.combiner)

        # [batchsize, slot_num * embedding_vec_size]
        embedding_vector = tf.reshape(embedding_vector, shape=[self.global_batch_size, self.slot_num * self.embedding_vec_size])
        logit = self.dense_layer(embedding_vector)
        return logit, embedding_vector

In [5]:
class SOKDemo(tf.keras.models.Model):
    def __init__(self,
                 combiner,
                 max_vocabulary_size_per_gpu,
                 slot_num,
                 max_nnz,
                 embedding_vec_size, 
                 **kwargs):
        super(SOKDemo, self).__init__(**kwargs)

        self.combiner = combiner
        self.max_vocabulary_size_per_gpu = max_vocabulary_size_per_gpu
        self.slot_num = slot_num
        self.max_nnz = max_nnz
        self.embedding_vec_size = embedding_vec_size

        self.embedding_layer = sok.DistributedEmbedding(combiner=self.combiner,
                                                           max_vocabulary_size_per_gpu=self.max_vocabulary_size_per_gpu,
                                                           embedding_vec_size=self.embedding_vec_size,
                                                           slot_num=self.slot_num,
                                                           max_nnz=self.max_nnz)

        self.dense_layer = tf.keras.layers.Dense(units=1, activation=None,
                                                 kernel_initializer="ones",
                                                 bias_initializer="zeros")

    def call(self, inputs, training=True):
        # [batchsize, slot_num, embedding_vec_size]
        embedding_vector = self.embedding_layer(inputs, training=training)
        # [batchsize, slot_num * embedding_vec_size]
        embedding_vector = tf.reshape(embedding_vector, shape=[-1, self.slot_num * self.embedding_vec_size])
        # [batchsize, 1]
        logit = self.dense_layer(embedding_vector)
        return logit, embedding_vector

In [6]:
def test_tf_demo(args, init_tensors, *random_samples):
    dataset = utils.tf_dataset(*random_samples, batchsize=args["global_batch_size"], to_sparse_tensor=True, repeat=1)

    loss_fn = tf.keras.losses.BinaryCrossentropy(from_logits=True)

    tf_demo = TfDemo(init_tensors, args["combiner"], args["global_batch_size"], 
                     args["slot_num"], args["embedding_vec_size"])

    optimizer = utils.get_dense_optimizer(args["optimizer"])(learning_rate=0.1)

    @tf.function
    def _train_step(inputs, labels):
        with tf.GradientTape() as tape:
            logit, embedding_vector = tf_demo(inputs, training=True)
            loss = loss_fn(labels, logit)
        grads = tape.gradient(loss, tf_demo.trainable_variables)
        optimizer.apply_gradients(zip(grads, tf_demo.trainable_variables))
        return logit, embedding_vector

    tf_results = list()

    for i, (sparse_tensors, labels) in enumerate(dataset):
        print("-"*30, str(i), "-"*30)
        logit, embedding_vector = _train_step(sparse_tensors, labels)
        print("[INFO]: embedding_vector:\n", embedding_vector)
        tf_results.append(embedding_vector)

        # FIXME: the SOK training loop sleeps between steps, so sleep here
        # as well to keep both training loops structurally the same.
        import time
        time.sleep(0.2) # seconds

    return tf_results

Third, define the multi-node training loop for SOK.

In [7]:
def test_sok_demo(args, task_id, init_tensors, *random_samples):
    physical_devices = tf.config.list_physical_devices('GPU')
    print("[INFO]: physical_devices on task %d:" %task_id, physical_devices)
    
    port = 12345
    os.environ["TF_CONFIG"] = json.dumps({
        'cluster': {"worker": [args["ips"][i] + ":" + str(port + i) for i in range(args["worker_num"])] },
        'task': {"type": 'worker', "index": task_id}
    })
    strategy = tf.distribute.MultiWorkerMirroredStrategy()
    with strategy.scope():
        sok.Init(global_batch_size=args["global_batch_size"])

        sok_demo = SOKDemo(combiner=args["combiner"], 
                            max_vocabulary_size_per_gpu=args["max_vocabulary_size_per_gpu"],
                            slot_num=args["slot_num"], max_nnz=args["max_nnz"],
                            embedding_vec_size=args["embedding_vec_size"])

        emb_opt = utils.get_embedding_optimizer(args["optimizer"])(learning_rate=0.1)
        dense_opt = utils.get_dense_optimizer(args["optimizer"])(learning_rate=0.1)
    
    sok_saver = sok.Saver()
    sok_saver.load_embedding_values(sok_demo.embedding_layer.embedding_variable, init_tensors)

    loss_fn = tf.keras.losses.BinaryCrossentropy(from_logits=True, reduction=tf.keras.losses.Reduction.NONE)
    def _replica_loss(labels, logits):
        loss = loss_fn(labels, logits)
        return tf.nn.compute_average_loss(loss, global_batch_size=args["global_batch_size"])

    @tf.function
    def _train_step(inputs, labels):
        with tf.GradientTape() as tape:
            logit, embedding_vector = sok_demo(inputs, training=True)
            loss = _replica_loss(labels, logit)
        embedding_variables, other_variable = sok.split_embedding_variable_from_others(sok_demo.trainable_variables)
        grads, emb_grads = tape.gradient(loss, [other_variable, embedding_variables])
        if "plugin" not in args["optimizer"]:
            with sok.OptimizerScope(embedding_variables):
                emb_opt.apply_gradients(zip(emb_grads, embedding_variables),
                                        experimental_aggregate_gradients=False)
        else:
            emb_opt.apply_gradients(zip(emb_grads, embedding_variables),
                                    experimental_aggregate_gradients=False)
        dense_opt.apply_gradients(zip(grads, other_variable))
        return logit, embedding_vector

    sok_results = list()

    def _dataset_fn(input_context):
        replica_batch_size = input_context.get_per_replica_batch_size(args["global_batch_size"])
        dataset = utils.tf_dataset(*random_samples, batchsize=replica_batch_size, to_sparse_tensor=True, repeat=1)
        # Each worker has its own data source, so there is no need to shard the dataset.
        return dataset

    dataset = strategy.distribute_datasets_from_function(_dataset_fn)

    for i, (sparse_tensors, replica_labels) in enumerate(dataset):
        print("-" * 30, "step ", str(i), "-" * 30)
        logit, embedding_vector = strategy.run(_train_step, args=(sparse_tensors, replica_labels))
        print("[INFO]: embedding_vector\n", embedding_vector)
        sok_results.append(embedding_vector)
        # FIXME: when the forward computation is too fast, conflicts with
        # the data reader may occur and cause the program to hang.
        import time
        time.sleep(0.2) # seconds

    return sok_results
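
`MultiWorkerMirroredStrategy` discovers its peers through the `TF_CONFIG` environment variable, which the function above builds inline. The sketch below isolates that step to show the resulting JSON; `build_tf_config` is a hypothetical helper written for illustration, and the base port `12345` is simply the value used in this notebook. Because both simulated workers run on `localhost`, each one needs its own port.

```python
import json

def build_tf_config(ips, task_id, base_port=12345):
    """Build the TF_CONFIG JSON string for one worker (hypothetical helper)."""
    # Worker i listens on base_port + i so that workers sharing a host do not collide.
    workers = ["%s:%d" % (ip, base_port + i) for i, ip in enumerate(ips)]
    return json.dumps({
        "cluster": {"worker": workers},
        "task": {"type": "worker", "index": task_id},
    })

print(build_tf_config(["localhost", "localhost"], task_id=1))
# {"cluster": {"worker": ["localhost:12345", "localhost:12346"]}, "task": {"type": "worker", "index": 1}}
```

Every worker sees the same `cluster` entry; only the `task.index` differs between processes.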

Fourth, define the subprocess work function used to simulate multi-node synchronized training.

In [8]:
def compare_sok_with_tf(args, task_id):
    if (args["global_batch_size"] % args["local_gpu_num"] != 0):
        raise ValueError("global_batch_size: %d is not divisible by local_gpu_num: %d"
                            %(args["global_batch_size"], args["local_gpu_num"]))
    if (args["global_batch_size"] % args["worker_num"] != 0):
        raise ValueError("global_batch_size: %d is not divisible by worker_num: %d"
                            %(args["global_batch_size"], args["worker_num"]))

    # each worker generates a different dataset
    worker_batch_size = args["global_batch_size"] // args["worker_num"]
    random_samples_local = utils.generate_random_samples(num_of_samples=worker_batch_size * args["iter_num"],
                                                         vocabulary_size=args["local_gpu_num"] * args["max_vocabulary_size_per_gpu"] * args["worker_num"],
                                                         slot_num=args["slot_num"],
                                                         max_nnz=args["max_nnz"])
    utils.save_to_file(r"./random_samples_" + str(task_id) + r".file", *random_samples_local)

    # each worker generates the same init tensors, because each worker does the filtering by itself.
    init_tensors = utils.get_ones_tensor(max_vocab_size_per_gpu=args["max_vocabulary_size_per_gpu"],
                                            embedding_vec_size=args["embedding_vec_size"],
                                            num=args["local_gpu_num"] * args["worker_num"])

    sok_results_local = test_sok_demo(args, task_id, init_tensors, *random_samples_local)
    # save this worker's forward embedding vectors to a file
    utils.save_to_file(r"./sok_embedding_vectors_" + str(task_id) + r".file", *sok_results_local)

    # aggregate the datasets from all workers
    dataset_filenames = [r"./random_samples_" + str(task_id) + r".file"
                         for task_id in range(args["worker_num"])]
    random_samples_total = [list() for _ in range(args["iter_num"])]
    random_labels_total = [list() for _ in range(args["iter_num"])]
    local_batch_size = args["global_batch_size"] // args["worker_num"]
    for work_id in range(args["worker_num"]):
        samples, labels = utils.restore_from_file(dataset_filenames[work_id])
        for i in range(args["iter_num"]):
            random_samples_total[i].extend(samples[i * local_batch_size : (i + 1) * local_batch_size])
            random_labels_total[i].extend(labels[i * local_batch_size : (i + 1) * local_batch_size])
    random_samples_total = np.concatenate(random_samples_total, axis=0)
    random_labels_total = np.concatenate(random_labels_total, axis=0)

    tf_results = test_tf_demo(args, init_tensors, random_samples_total, random_labels_total)

    # aggregate the forward embedding vectors from all workers
    sok_results_filenames = [r"./sok_embedding_vectors_" + str(task_id) + r".file"
                             for task_id in range(args["worker_num"])]
    sok_results_total = list()
    for file_name in sok_results_filenames:
        sok_results_local = utils.restore_from_file(file_name)
        sok_results_total.append(sok_results_local)

    if (len(sok_results_total[0]) != len(tf_results)):
        raise ValueError("The length of results obtained from sok: %d is not equal to that of tensorflow: %d."
                        %(len(sok_results_total[0]), len(tf_results)))
    if (len(tf_results) != args["iter_num"]):
        raise ValueError("The length of embedding vectors: %d is not equal to iteration number: %d."
                         %(len(tf_results), args["iter_num"]))

    # for i, sok_vector in enumerate(sok_results_total):
    for i in range(args["iter_num"]):
        if args["local_gpu_num"] != 1:
            sok_vector = tf.concat([tf.concat(sok_results_total[task_id][i].values, axis=0)
                                    for task_id in range(args["worker_num"])], axis=0)
        else:
            sok_vector = tf.concat([sok_results_total[task_id][i]
                                    for task_id in range(args["worker_num"])], axis=0)
        tf.debugging.assert_near(tf.reshape(sok_vector, 
                                            shape=[-1, tf.shape(sok_vector)[-1]]),
                                 tf_results[i],
                                 atol=1e-4,
                                 rtol=1e-4)

    print(("\n[INFO]: With MultiWorkerMirroredStrategy, when %d GPUs are used for each node and %d GPUs in total, "
           "the embedding vectors obtained from TensorFlow and SOK are consistent for %d iterations")
          %(args["local_gpu_num"], args["local_gpu_num"] * args["worker_num"], args["iter_num"]))
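
The dataset aggregation in this function is easy to misread: for every iteration, each worker's slice of `local_batch_size` samples is appended in worker order, so the TensorFlow reference model sees the same global batch that the SOK workers processed together. A minimal pure-Python sketch of that ordering, using plain lists in place of the NumPy arrays returned by `utils.restore_from_file` (the helper `merge_worker_samples` is hypothetical):

```python
def merge_worker_samples(per_worker_samples, local_batch_size, iter_num):
    """Interleave per-worker sample lists into one list ordered by iteration."""
    merged = []
    for i in range(iter_num):
        # Within one iteration, append each worker's local batch in worker order.
        for samples in per_worker_samples:
            merged.extend(samples[i * local_batch_size:(i + 1) * local_batch_size])
    return merged

worker0 = ["a0", "a1", "a2", "a3"]   # two iterations with a local batch size of 2
worker1 = ["b0", "b1", "b2", "b3"]

print(merge_worker_samples([worker0, worker1], local_batch_size=2, iter_num=2))
# ['a0', 'a1', 'b0', 'b1', 'a2', 'a3', 'b2', 'b3']
```

Simply concatenating the two workers' files end to end would instead produce global batches made entirely of one worker's data, and the comparison against SOK would fail.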

Fifth, create CPU subprocesses to simulate multi-node synchronized training.

In [9]:
from multiprocessing import Process

processes = list()
for task_id in range(args["worker_num"]):
    available_gpus = ",".join([str(per_worker_gpu_num * task_id + i)
                              for i in range(per_worker_gpu_num)])
    print("[INFO]: on task: %d, its available GPUs are: %s" %(task_id, available_gpus))
    
    os.environ["CUDA_VISIBLE_DEVICES"] = available_gpus
    process = Process(target=compare_sok_with_tf, args=(args, task_id))
    process.start()
    processes.append(process)
    
    
for process in processes:
    if process.is_alive():
        process.join()

[INFO]: on task: 0, its available GPUs are: 0,1,2,3,4,5,6,7
[INFO]: on task: 1, its available GPUs are: 8,9,10,11,12,13,14,15
[INFO]: begin to generate random samples
[INFO]: begin to generate random samples
[INFO]: generated random samples
[INFO]: generated random samples
[INFO]: dumpped items to file ./random_samples_0.file
[INFO]: dumpped items to file ./random_samples_1.file
[INFO]: physical_devices on task 0: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:1', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:2', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:3', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:4', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:5', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:6', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:7', device_type='GPU')]
[INFO]: physical_devices on task 1: [PhysicalDevice(nam

}

------------------------------------------------------------  step step   11  ------------------------------------------------------------

[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[0.90112394 0.90112394 0.90112394 ... 0.90102184 0.90102184 0.90102184]
 [0.9010962  0.9010962  0.9010962  ... 0.9012493  0.9012493  0.9012493 ]
 [0.9010795  0.9010795  0.9010795  ... 0.90097725 0.90097725 0.90097725]
 ...
 [0.90110785 0.90110785 0.90110785 ... 0.90119267 0.90119267 0.90119267]
 [0.9010775  0.9010775  0.9010775  ... 0.90110064 0.90110064 0.90110064]
 [0.9011415  0.9011415  0.9011415  ... 0.90099823 0.90099823 0.90099823]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[0.9010893  0.9010893  0.9010893  ... 0.90109587 0.90109587 0.90109587]
 [0.901051   0.901051   0.901051   ... 0.9008249  0.9008249  0.9008249 ]
 [0.901079   0.901079   0.901079   ... 0.900967   0.900967   0.900967  ]
 ...
 [0.9012722  0.9012722  0.9012722  ... 0.9009889  0.90098

------------------------------------------------------------  step step   33  ------------------------------------------------------------

[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[0.704871   0.704871   0.704871   ... 0.7038018  0.7038018  0.7038018 ]
 [0.70504844 0.70504844 0.70504844 ... 0.70853865 0.70853865 0.70853865]
 [0.7068519  0.7068519  0.7068519  ... 0.70444393 0.70444393 0.70444393]
 ...
 [0.7054967  0.7054967  0.7054967  ... 0.70610404 0.70610404 0.70610404]
 [0.7049196  0.7049196  0.7049196  ... 0.7071366  0.7071366  0.7071366 ]
 [0.7025713  0.7025713  0.7025713  ... 0.7045839  0.7045839  0.7045839 ]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[0.70521504 0.70521504 0.70521504 ... 0.70445836 0.70445836 0.70445836]
 [0.7085757  0.7085757  0.7085757  ... 0.7083036  0.7083036  0.7083036 ]
 [0.70290124 0.70290124 0.70290124 ... 0.7059128  0.7059128  0.7059128 ]
 ...
 [0.70643246 0.70643246 0.70643246 ... 0.7033013  0.7033013 

------------------------------------------------------------  step step   55  ------------------------------------------------------------

[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[0.5122816  0.5122816  0.5122816  ... 0.5092242  0.5092242  0.5092242 ]
 [0.5185317  0.5185317  0.5185317  ... 0.5199945  0.5199945  0.5199945 ]
 [0.51819414 0.51819414 0.51819414 ... 0.5085467  0.5085467  0.5085467 ]
 ...
 [0.51714337 0.51714337 0.51714337 ... 0.5089298  0.5089298  0.5089298 ]
 [0.51046157 0.51046157 0.51046157 ... 0.5192856  0.5192856  0.5192856 ]
 [0.51097524 0.51097524 0.51097524 ... 0.5158346  0.5158346  0.5158346 ]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[0.5137605  0.5137605  0.5137605  ... 0.51073706 0.51073706 0.51073706]
 [0.5101888  0.5101888  0.5101888  ... 0.5171164  0.5171164  0.5171164 ]
 [0.511326   0.511326   0.511326   ... 0.51996773 0.51996773 0.51996773]
 ...
 [0.5091344  0.5091344  0.5091344  ... 0.5262461  0.5262461 

------------------------------------------------------------  step step   77  ------------------------------------------------------------

[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[0.3593893  0.3593893  0.3593893  ... 0.32317182 0.32317182 0.32317182]
 [0.33919325 0.33919325 0.33919325 ... 0.33115053 0.33115053 0.33115053]
 [0.32679847 0.32679847 0.32679847 ... 0.34565246 0.34565246 0.34565246]
 ...
 [0.35293257 0.35293257 0.35293257 ... 0.3350122  0.3350122  0.3350122 ]
 [0.335076   0.335076   0.335076   ... 0.34141397 0.34141397 0.34141397]
 [0.33428544 0.33428544 0.33428544 ... 0.35380754 0.35380754 0.35380754]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[0.33319855 0.33319855 0.33319855 ... 0.333704   0.333704   0.333704  ]
 [0.32769334 0.32769334 0.32769334 ... 0.34162462 0.34162462 0.34162462]
 [0.3408941  0.3408941  0.3408941  ... 0.34208474 0.34208474 0.34208474]
 ...
 [0.33087453 0.33087453 0.33087453 ... 0.3309713  0.3309713 

------------------------------------------------------------  step step   99  ------------------------------------------------------------

[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[0.15920895 0.15920895 0.15920895 ... 0.1765612  0.1765612  0.1765612 ]
 [0.14717862 0.14717862 0.14717862 ... 0.14821711 0.14821711 0.14821711]
 [0.17819886 0.17819886 0.17819886 ... 0.15838541 0.15838541 0.15838541]
 ...
 [0.16575342 0.16575342 0.16575342 ... 0.17809466 0.17809466 0.17809466]
 [0.1607973  0.1607973  0.1607973  ... 0.17552212 0.17552212 0.17552212]
 [0.15770957 0.15770957 0.15770957 ... 0.17104529 0.17104529 0.17104529]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[0.17559296 0.17559296 0.17559296 ... 0.1514681  0.1514681  0.1514681 ]
 [0.17546183 0.17546183 0.17546183 ... 0.18673763 0.18673763 0.18673763]
 [0.1680075  0.1680075  0.1680075  ... 0.14598344 0.14598344 0.14598344]
 ...
 [0.1590979  0.1590979  0.1590979  ... 0.17177951 0.17177951

------------------------------ step 11 ------------------------------
------------------------------ step 11 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[0.03646904 0.03646904 0.03646904 ... 0.05057871 0.05057871 0.05057871]
 [0.01363798 0.01363798 0.01363798 ... 0.02358386 0.02358386 0.02358386]
 [0.04044686 0.04044686 0.04044686 ... 0.03189035 0.03189035 0.03189035]
 ...
 [0.03002228 0.03002228 0.03002228 ... 0.021887   0.021887   0.021887  ]
 [0.03397012 0.03397012 0.03397012 ... 0.01494481 0.01494481 0.01494481]
 [0.02338313 0.02338313 0.02338313 ... 0.05983157 0.05983157 0.05983157]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.00091602 -0.00091602 -0.00091602 ...  0.03587131  0.03587131
   0.03587131]
 [ 0.0577733   0.0577733   0.0577733  ...  0.0308164   0.0308164
   0.0308164 ]
 [ 0.03908461  0.03908461  0.03908461 ...  0.03381071  0.03381071
   0.03381071]
 ...
 [ 0.00460153  0.00460153  0.004601

}
------------------------------ step 13 ------------------------------
------------------------------ step 13 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.08509585 -0.08509585 -0.08509585 ... -0.06106248 -0.06106248
  -0.06106248]
 [-0.07287173 -0.07287173 -0.07287173 ... -0.07836593 -0.07836593
  -0.07836593]
 [-0.06278088 -0.06278088 -0.06278088 ... -0.06677349 -0.06677349
  -0.06677349]
 ...
 [-0.06807541 -0.06807541 -0.06807541 ... -0.0852205  -0.0852205
  -0.0852205 ]
 [-0.06821102 -0.06821102 -0.06821102 ... -0.09162238 -0.09162238
  -0.09162238]
 [-0.05821199 -0.05821199 -0.05821199 ... -0.06037588 -0.06037588
  -0.06037588]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.06919853 -0.06919853 -0.06919853 ... -0.09260419 -0.09260419
  -0.09260419]
 [-0.08447115 -0.08447115 -0.08447115 ... -0.09773367 -0.09773367
  -0.09773367]
 [-0.00219292 -0.00219292 -0.00219292 ... -0.10390902 -0.10390902
  -0.

}

------------------------------ step 14 ------------------------------
------------------------------ step 14 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.15159969 -0.15159969 -0.15159969 ... -0.10377492 -0.10377492
  -0.10377492]
 [-0.12837213 -0.12837213 -0.12837213 ... -0.13060065 -0.13060065
  -0.13060065]
 [-0.11969067 -0.11969067 -0.11969067 ... -0.11593433 -0.11593433
  -0.11593433]
 ...
 [-0.10532669 -0.10532669 -0.10532669 ... -0.10428045 -0.10428045
  -0.10428045]
 [-0.13421696 -0.13421696 -0.13421696 ... -0.11592335 -0.11592335
  -0.11592335]
 [-0.1130067  -0.1130067  -0.1130067  ... -0.0979188  -0.0979188
  -0.0979188 ]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.10911784 -0.10911784 -0.10911784 ... -0.14132452 -0.14132452
  -0.14132452]
 [-0.1166123  -0.1166123  -0.1166123  ... -0.11295331 -0.11295331
  -0.11295331]
 [-0.11570926 -0.11570926 -0.11570926 ... -0.10146423 -0.10146423
  -0

}
------------------------------ step 15 ------------------------------
------------------------------ step 15 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.17213109 -0.17213109 -0.17213109 ... -0.20751715 -0.20751715
  -0.20751715]
 [-0.15845911 -0.15845911 -0.15845911 ... -0.18588838 -0.18588838
  -0.18588838]
 [-0.17115116 -0.17115116 -0.17115116 ... -0.18285856 -0.18285856
  -0.18285856]
 ...
 [-0.19632532 -0.19632532 -0.19632532 ... -0.15795359 -0.15795359
  -0.15795359]
 [-0.16868813 -0.16868813 -0.16868813 ... -0.18464336 -0.18464336
  -0.18464336]
 [-0.14961106 -0.14961106 -0.14961106 ... -0.14461923 -0.14461923
  -0.14461923]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.15291046 -0.15291046 -0.15291046 ... -0.16138437 -0.16138437
  -0.16138437]
 [-0.17295389 -0.17295389 -0.17295389 ... -0.13835643 -0.13835643
  -0.13835643]
 [-0.15934648 -0.15934648 -0.15934648 ... -0.12764432 -0.12764432
  -0

}

------------------------------ step 16 ------------------------------
------------------------------ step 16 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.20272204 -0.20272204 -0.20272204 ... -0.18685567 -0.18685567
  -0.18685567]
 [-0.19805965 -0.19805965 -0.19805965 ... -0.19816731 -0.19816731
  -0.19816731]
 [-0.20674944 -0.20674944 -0.20674944 ... -0.21804884 -0.21804884
  -0.21804884]
 ...
 [-0.18368706 -0.18368706 -0.18368706 ... -0.2332663  -0.2332663
  -0.2332663 ]
 [-0.21872976 -0.21872976 -0.21872976 ... -0.21831161 -0.21831161
  -0.21831161]
 [-0.19412181 -0.19412181 -0.19412181 ... -0.20590113 -0.20590113
  -0.20590113]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.22329587 -0.22329587 -0.22329587 ... -0.18126443 -0.18126443
  -0.18126443]
 [-0.15995698 -0.15995698 -0.15995698 ... -0.19179428 -0.19179428
  -0.19179428]
 [-0.22463968 -0.22463968 -0.22463968 ... -0.20243077 -0.20243077
  -0

}

------------------------------ step 17 ------------------------------
------------------------------ step 17 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.24873674 -0.24873674 -0.24873674 ... -0.24268483 -0.24268483
  -0.24268483]
 [-0.24838948 -0.24838948 -0.24838948 ... -0.23663697 -0.23663697
  -0.23663697]
 [-0.24255309 -0.24255309 -0.24255309 ... -0.2338368  -0.2338368
  -0.2338368 ]
 ...
 [-0.20830259 -0.20830259 -0.20830259 ... -0.23421246 -0.23421246
  -0.23421246]
 [-0.20424946 -0.20424946 -0.20424946 ... -0.26049536 -0.26049536
  -0.26049536]
 [-0.23810562 -0.23810562 -0.23810562 ... -0.20547834 -0.20547834
  -0.20547834]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.24138074 -0.24138074 -0.24138074 ... -0.25019297 -0.25019297
  -0.25019297]
 [-0.23530099 -0.23530099 -0.23530099 ... -0.24958771 -0.24958771
  -0.24958771]
 [-0.25100467 -0.25100467 -0.25100467 ... -0.22452983 -0.22452983
  -0

}

------------------------------ step 18 ------------------------------
------------------------------ step 18 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.2804451  -0.2804451  -0.2804451  ... -0.27070996 -0.27070996
  -0.27070996]
 [-0.2418918  -0.2418918  -0.2418918  ... -0.26375818 -0.26375818
  -0.26375818]
 [-0.28635767 -0.28635767 -0.28635767 ... -0.27068448 -0.27068448
  -0.27068448]
 ...
 [-0.24918133 -0.24918133 -0.24918133 ... -0.25032154 -0.25032154
  -0.25032154]
 [-0.27018005 -0.27018005 -0.27018005 ... -0.28020513 -0.28020513
  -0.28020513]
 [-0.27584416 -0.27584416 -0.27584416 ... -0.29467097 -0.29467097
  -0.29467097]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.2481695  -0.2481695  -0.2481695  ... -0.25757805 -0.25757805
  -0.25757805]
 [-0.27945656 -0.27945656 -0.27945656 ... -0.2730348  -0.2730348
  -0.2730348 ]
 [-0.26844326 -0.26844326 -0.26844326 ... -0.30953592 -0.30953592
  -0

}

------------------------------ step 19 ------------------------------
------------------------------ step 19 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.2831504  -0.2831504  -0.2831504  ... -0.27237654 -0.27237654
  -0.27237654]
 [-0.2748297  -0.2748297  -0.2748297  ... -0.27686292 -0.27686292
  -0.27686292]
 [-0.28103873 -0.28103873 -0.28103873 ... -0.31633413 -0.31633413
  -0.31633413]
 ...
 [-0.30422086 -0.30422086 -0.30422086 ... -0.33585098 -0.33585098
  -0.33585098]
 [-0.2257222  -0.2257222  -0.2257222  ... -0.30988258 -0.30988258
  -0.30988258]
 [-0.30158928 -0.30158928 -0.30158928 ... -0.28388608 -0.28388608
  -0.28388608]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.27902853 -0.27902853 -0.27902853 ... -0.3019514  -0.3019514
  -0.3019514 ]
 [-0.29415908 -0.29415908 -0.29415908 ... -0.32229516 -0.32229516
  -0.32229516]
 [-0.28913185 -0.28913185 -0.28913185 ... -0.3137501  -0.3137501
  -0.

}

------------------------------ step 20 ------------------------------
------------------------------ step 20 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.30985576 -0.30985576 -0.30985576 ... -0.3231176  -0.3231176
  -0.3231176 ]
 [-0.29728013 -0.29728013 -0.29728013 ... -0.33737975 -0.33737975
  -0.33737975]
 [-0.2938209  -0.2938209  -0.2938209  ... -0.29164016 -0.29164016
  -0.29164016]
 ...
 [-0.2805484  -0.2805484  -0.2805484  ... -0.27966797 -0.27966797
  -0.27966797]
 [-0.29280883 -0.29280883 -0.29280883 ... -0.29729682 -0.29729682
  -0.29729682]
 [-0.3285194  -0.3285194  -0.3285194  ... -0.32319343 -0.32319343
  -0.32319343]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.32337302 -0.32337302 -0.32337302 ... -0.30335224 -0.30335224
  -0.30335224]
 [-0.28154415 -0.28154415 -0.28154415 ... -0.31610832 -0.31610832
  -0.31610832]
 [-0.29092667 -0.29092667 -0.29092667 ... -0.27143994 -0.27143994
  -0

}

------------------------------ step 21 ------------------------------
------------------------------ step 21 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.31506187 -0.31506187 -0.31506187 ... -0.2859416  -0.2859416
  -0.2859416 ]
 [-0.36119527 -0.36119527 -0.36119527 ... -0.2941081  -0.2941081
  -0.2941081 ]
 [-0.3114394  -0.3114394  -0.3114394  ... -0.2891469  -0.2891469
  -0.2891469 ]
 ...
 [-0.2912076  -0.2912076  -0.2912076  ... -0.27944493 -0.27944493
  -0.27944493]
 [-0.2942837  -0.2942837  -0.2942837  ... -0.30679372 -0.30679372
  -0.30679372]
 [-0.31287378 -0.31287378 -0.31287378 ... -0.28504282 -0.28504282
  -0.28504282]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.28172114 -0.28172114 -0.28172114 ... -0.31627908 -0.31627908
  -0.31627908]
 [-0.29976985 -0.29976985 -0.29976985 ... -0.30754918 -0.30754918
  -0.30754918]
 [-0.30921984 -0.30921984 -0.30921984 ... -0.32110754 -0.32110754
  -0.3

}

------------------------------ step 22 ------------------------------
------------------------------ step 22 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.29189587 -0.29189587 -0.29189587 ... -0.27310795 -0.27310795
  -0.27310795]
 [-0.3180353  -0.3180353  -0.3180353  ... -0.29107964 -0.29107964
  -0.29107964]
 [-0.31912804 -0.31912804 -0.31912804 ... -0.31468034 -0.31468034
  -0.31468034]
 ...
 [-0.29230145 -0.29230145 -0.29230145 ... -0.3049745  -0.3049745
  -0.3049745 ]
 [-0.32237148 -0.32237148 -0.32237148 ... -0.27190417 -0.27190417
  -0.27190417]
 [-0.29042003 -0.29042003 -0.29042003 ... -0.3184242  -0.3184242
  -0.3184242 ]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.29485375 -0.29485375 -0.29485375 ... -0.2988013  -0.2988013
  -0.2988013 ]
 [-0.3040998  -0.3040998  -0.3040998  ... -0.30464122 -0.30464122
  -0.30464122]
 [-0.31793252 -0.31793252 -0.31793252 ... -0.30757752 -0.30757752
  -0.3

}

------------------------------ step 23 ------------------------------
------------------------------ step 23 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.2773705  -0.2773705  -0.2773705  ... -0.30401462 -0.30401462
  -0.30401462]
 [-0.29424414 -0.29424414 -0.29424414 ... -0.33663687 -0.33663687
  -0.33663687]
 [-0.2878427  -0.2878427  -0.2878427  ... -0.26258802 -0.26258802
  -0.26258802]
 ...
 [-0.27395588 -0.27395588 -0.27395588 ... -0.28050894 -0.28050894
  -0.28050894]
 [-0.2920748  -0.2920748  -0.2920748  ... -0.30809912 -0.30809912
  -0.30809912]
 [-0.29247415 -0.29247415 -0.29247415 ... -0.301642   -0.301642
  -0.301642  ]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.24894778 -0.24894778 -0.24894778 ... -0.24169579 -0.24169579
  -0.24169579]
 [-0.20627566 -0.20627566 -0.20627566 ... -0.2953134  -0.2953134
  -0.2953134 ]
 [-0.31063855 -0.31063855 -0.31063855 ... -0.30213937 -0.30213937
  -0.3

}

------------------------------ step 24 ------------------------------
------------------------------ step 24 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.30214956 -0.30214956 -0.30214956 ... -0.28784537 -0.28784537
  -0.28784537]
 [-0.27040073 -0.27040073 -0.27040073 ... -0.21122575 -0.21122575
  -0.21122575]
 [-0.2711837  -0.2711837  -0.2711837  ... -0.28761598 -0.28761598
  -0.28761598]
 ...
 [-0.21949835 -0.21949835 -0.21949835 ... -0.2801079  -0.2801079
  -0.2801079 ]
 [-0.2661655  -0.2661655  -0.2661655  ... -0.27831012 -0.27831012
  -0.27831012]
 [-0.2451719  -0.2451719  -0.2451719  ... -0.2671461  -0.2671461
  -0.2671461 ]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.26936445 -0.26936445 -0.26936445 ... -0.28623015 -0.28623015
  -0.28623015]
 [-0.2618669  -0.2618669  -0.2618669  ... -0.27220473 -0.27220473
  -0.27220473]
 [-0.27863866 -0.27863866 -0.27863866 ... -0.2914466  -0.2914466
  -0.2

}

------------------------------ step 25 ------------------------------
------------------------------ step 25 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.22815642 -0.22815642 -0.22815642 ... -0.2980778  -0.2980778
  -0.2980778 ]
 [-0.26630285 -0.26630285 -0.26630285 ... -0.25867778 -0.25867778
  -0.25867778]
 [-0.26226643 -0.26226643 -0.26226643 ... -0.15055262 -0.15055262
  -0.15055262]
 ...
 [-0.24374133 -0.24374133 -0.24374133 ... -0.2759994  -0.2759994
  -0.2759994 ]
 [-0.26303086 -0.26303086 -0.26303086 ... -0.1874293  -0.1874293
  -0.1874293 ]
 [-0.2601848  -0.2601848  -0.2601848  ... -0.26061606 -0.26061606
  -0.26061606]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.27897805 -0.27897805 -0.27897805 ... -0.252837   -0.252837
  -0.252837  ]
 [-0.25865078 -0.25865078 -0.25865078 ... -0.2638474  -0.2638474
  -0.2638474 ]
 [-0.24851376 -0.24851376 -0.24851376 ... -0.24314706 -0.24314706
  -0.2431

}

------------------------------ step 26 ------------------------------
------------------------------ step 26 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.2607072  -0.2607072  -0.2607072  ... -0.2349512  -0.2349512
  -0.2349512 ]
 [-0.20573275 -0.20573275 -0.20573275 ... -0.25033188 -0.25033188
  -0.25033188]
 [-0.21667746 -0.21667746 -0.21667746 ... -0.23003057 -0.23003057
  -0.23003057]
 ...
 [-0.23011558 -0.23011558 -0.23011558 ... -0.25690466 -0.25690466
  -0.25690466]
 [-0.1504552  -0.1504552  -0.1504552  ... -0.22690585 -0.22690585
  -0.22690585]
 [-0.19283053 -0.19283053 -0.19283053 ... -0.2500184  -0.2500184
  -0.2500184 ]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.32142004 -0.32142004 -0.32142004 ... -0.23066376 -0.23066376
  -0.23066376]
 [-0.24315184 -0.24315184 -0.24315184 ... -0.25197503 -0.25197503
  -0.25197503]
 [-0.23497252 -0.23497252 -0.23497252 ... -0.21015017 -0.21015017
  -0.

}

------------------------------ step 27 ------------------------------
------------------------------ step 27 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.2152215  -0.2152215  -0.2152215  ... -0.21556908 -0.21556908
  -0.21556908]
 [-0.18615165 -0.18615165 -0.18615165 ... -0.19948278 -0.19948278
  -0.19948278]
 [-0.19952047 -0.19952047 -0.19952047 ... -0.16427106 -0.16427106
  -0.16427106]
 ...
 [-0.2312809  -0.2312809  -0.2312809  ... -0.20466773 -0.20466773
  -0.20466773]
 [-0.2100859  -0.2100859  -0.2100859  ... -0.2244058  -0.2244058
  -0.2244058 ]
 [-0.20233741 -0.20233741 -0.20233741 ... -0.15502226 -0.15502226
  -0.15502226]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.24383268 -0.24383268 -0.24383268 ... -0.26544353 -0.26544353
  -0.26544353]
 [-0.21491563 -0.21491563 -0.21491563 ... -0.19598982 -0.19598982
  -0.19598982]
 [-0.19336319 -0.19336319 -0.19336319 ... -0.19559702 -0.19559702
  -0

}

------------------------------ step 28 ------------------------------
------------------------------ step 28 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.23929635 -0.23929635 -0.23929635 ... -0.16323856 -0.16323856
  -0.16323856]
 [-0.17133151 -0.17133151 -0.17133151 ... -0.17282742 -0.17282742
  -0.17282742]
 [-0.16150045 -0.16150045 -0.16150045 ... -0.17817633 -0.17817633
  -0.17817633]
 ...
 [-0.1776069  -0.1776069  -0.1776069  ... -0.20740712 -0.20740712
  -0.20740712]
 [-0.23252816 -0.23252816 -0.23252816 ... -0.14596501 -0.14596501
  -0.14596501]
 [-0.18228303 -0.18228303 -0.18228303 ... -0.18651243 -0.18651243
  -0.18651243]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.18025476 -0.18025476 -0.18025476 ... -0.18606366 -0.18606366
  -0.18606366]
 [-0.18055263 -0.18055263 -0.18055263 ... -0.20658748 -0.20658748
  -0.20658748]
 [-0.1846795  -0.1846795  -0.1846795  ... -0.19222277 -0.19222277
  -

}

------------------------------ step 29 ------------------------------
------------------------------ step 29 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.12298647 -0.12298647 -0.12298647 ... -0.16182289 -0.16182289
  -0.16182289]
 [-0.1405854  -0.1405854  -0.1405854  ... -0.19512956 -0.19512956
  -0.19512956]
 [-0.16487174 -0.16487174 -0.16487174 ... -0.1301983  -0.1301983
  -0.1301983 ]
 ...
 [-0.19809745 -0.19809745 -0.19809745 ... -0.16470237 -0.16470237
  -0.16470237]
 [-0.20869839 -0.20869839 -0.20869839 ... -0.15193966 -0.15193966
  -0.15193966]
 [-0.16768938 -0.16768938 -0.16768938 ... -0.19017078 -0.19017078
  -0.19017078]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.18428802 -0.18428802 -0.18428802 ... -0.19609754 -0.19609754
  -0.19609754]
 [-0.1722162  -0.1722162  -0.1722162  ... -0.18918143 -0.18918143
  -0.18918143]
 [-0.14371543 -0.14371543 -0.14371543 ... -0.1152166  -0.1152166
  -0.

}

------------------------------ step 30 ------------------------------
------------------------------ step 30 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.1294701  -0.1294701  -0.1294701  ... -0.16965991 -0.16965991
  -0.16965991]
 [-0.14464939 -0.14464939 -0.14464939 ... -0.15347195 -0.15347195
  -0.15347195]
 [-0.15044259 -0.15044259 -0.15044259 ... -0.17673033 -0.17673033
  -0.17673033]
 ...
 [-0.16566125 -0.16566125 -0.16566125 ... -0.16608994 -0.16608994
  -0.16608994]
 [-0.12975246 -0.12975246 -0.12975246 ... -0.11276378 -0.11276378
  -0.11276378]
 [-0.15185487 -0.15185487 -0.15185487 ... -0.20941734 -0.20941734
  -0.20941734]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.15287822 -0.15287822 -0.15287822 ... -0.17019215 -0.17019215
  -0.17019215]
 [-0.15238988 -0.15238988 -0.15238988 ... -0.14812392 -0.14812392
  -0.14812392]
 [-0.14903732 -0.14903732 -0.14903732 ... -0.1299049  -0.1299049
  -0

}

------------------------------ step 31 ------------------------------
------------------------------ step 31 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.15931132 -0.15931132 -0.15931132 ... -0.09288435 -0.09288435
  -0.09288435]
 [-0.15388979 -0.15388979 -0.15388979 ... -0.1853422  -0.1853422
  -0.1853422 ]
 [-0.17481256 -0.17481256 -0.17481256 ... -0.18880989 -0.18880989
  -0.18880989]
 ...
 [-0.1653611  -0.1653611  -0.1653611  ... -0.1108848  -0.1108848
  -0.1108848 ]
 [-0.1587308  -0.1587308  -0.1587308  ... -0.1094273  -0.1094273
  -0.1094273 ]
 [-0.15673286 -0.15673286 -0.15673286 ... -0.14230393 -0.14230393
  -0.14230393]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.2103738  -0.2103738  -0.2103738  ... -0.05281026 -0.05281026
  -0.05281026]
 [-0.1231436  -0.1231436  -0.1231436  ... -0.15711409 -0.15711409
  -0.15711409]
 [-0.14227468 -0.14227468 -0.14227468 ... -0.14452581 -0.14452581
  -0.1

}

------------------------------ step 32 ------------------------------
------------------------------ step 32 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.20027965 -0.20027965 -0.20027965 ... -0.16287258 -0.16287258
  -0.16287258]
 [-0.18181825 -0.18181825 -0.18181825 ... -0.1385417  -0.1385417
  -0.1385417 ]
 [-0.17322889 -0.17322889 -0.17322889 ... -0.14449993 -0.14449993
  -0.14449993]
 ...
 [-0.12445017 -0.12445017 -0.12445017 ... -0.14696571 -0.14696571
  -0.14696571]
 [-0.1529257  -0.1529257  -0.1529257  ... -0.1558251  -0.1558251
  -0.1558251 ]
 [-0.14323059 -0.14323059 -0.14323059 ... -0.13120644 -0.13120644
  -0.13120644]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.15145752 -0.15145752 -0.15145752 ... -0.11451367 -0.11451367
  -0.11451367]
 [-0.14372748 -0.14372748 -0.14372748 ... -0.10771458 -0.10771458
  -0.10771458]
 [-0.13902362 -0.13902362 -0.13902362 ... -0.14213273 -0.14213273
  -0.

}

------------------------------ step 33 ------------------------------
------------------------------ step 33 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.10811418 -0.10811418 -0.10811418 ... -0.13933046 -0.13933046
  -0.13933046]
 [-0.10307702 -0.10307702 -0.10307702 ... -0.0883117  -0.0883117
  -0.0883117 ]
 [-0.11718222 -0.11718222 -0.11718222 ... -0.12002757 -0.12002757
  -0.12002757]
 ...
 [-0.1045165  -0.1045165  -0.1045165  ... -0.15237564 -0.15237564
  -0.15237564]
 [-0.14010367 -0.14010367 -0.14010367 ... -0.1293545  -0.1293545
  -0.1293545 ]
 [-0.10953968 -0.10953968 -0.10953968 ... -0.11657615 -0.11657615
  -0.11657615]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.14208528 -0.14208528 -0.14208528 ... -0.08392645 -0.08392645
  -0.08392645]
 [-0.1137597  -0.1137597  -0.1137597  ... -0.11963835 -0.11963835
  -0.11963835]
 [-0.15079811 -0.15079811 -0.15079811 ... -0.14393857 -0.14393857
  -0.

}

------------------------------ step 34 ------------------------------
------------------------------ step 34 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.22062957 -0.22062957 -0.22062957 ... -0.1453096  -0.1453096
  -0.1453096 ]
 [-0.12506    -0.12506    -0.12506    ... -0.18492499 -0.18492499
  -0.18492499]
 [-0.13907273 -0.13907273 -0.13907273 ... -0.12085127 -0.12085127
  -0.12085127]
 ...
 [-0.09939612 -0.09939612 -0.09939612 ... -0.13110074 -0.13110074
  -0.13110074]
 [-0.14231876 -0.14231876 -0.14231876 ... -0.14577758 -0.14577758
  -0.14577758]
 [-0.12110004 -0.12110004 -0.12110004 ... -0.0771338  -0.0771338
  -0.0771338 ]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.12152126 -0.12152126 -0.12152126 ... -0.14669845 -0.14669845
  -0.14669845]
 [-0.15132736 -0.15132736 -0.15132736 ... -0.15247968 -0.15247968
  -0.15247968]
 [-0.11045301 -0.11045301 -0.11045301 ... -0.1825099  -0.1825099
  -0.1

}

------------------------------ step 35 ------------------------------
------------------------------ step 35 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.09385185 -0.09385185 -0.09385185 ... -0.1067581  -0.1067581
  -0.1067581 ]
 [-0.05937953 -0.05937953 -0.05937953 ... -0.1850739  -0.1850739
  -0.1850739 ]
 [-0.11532103 -0.11532103 -0.11532103 ... -0.05737102 -0.05737102
  -0.05737102]
 ...
 [-0.12228486 -0.12228486 -0.12228486 ... -0.18722478 -0.18722478
  -0.18722478]
 [-0.12122894 -0.12122894 -0.12122894 ... -0.23335305 -0.23335305
  -0.23335305]
 [-0.16181302 -0.16181302 -0.16181302 ... -0.13612027 -0.13612027
  -0.13612027]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.09822626 -0.09822626 -0.09822626 ... -0.11362479 -0.11362479
  -0.11362479]
 [-0.09400871 -0.09400871 -0.09400871 ... -0.12799564 -0.12799564
  -0.12799564]
 [-0.15371576 -0.15371576 -0.15371576 ... -0.15363872 -0.15363872
  -0.

}
------------------------------ step 36 ------------------------------
------------------------------ step 36 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.10140312 -0.10140312 -0.10140312 ... -0.12190683 -0.12190683
  -0.12190683]
 [-0.14488141 -0.14488141 -0.14488141 ... -0.19310501 -0.19310501
  -0.19310501]
 [-0.10653575 -0.10653575 -0.10653575 ... -0.19207132 -0.19207132
  -0.19207132]
 ...
 [-0.12733698 -0.12733698 -0.12733698 ... -0.14755762 -0.14755762
  -0.14755762]
 [-0.11590626 -0.11590626 -0.11590626 ... -0.13365832 -0.13365832
  -0.13365832]
 [-0.0955961  -0.0955961  -0.0955961  ... -0.13004635 -0.13004635
  -0.13004635]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.14837591 -0.14837591 -0.14837591 ... -0.14120592 -0.14120592
  -0.14120592]
 [-0.10556714 -0.10556714 -0.10556714 ... -0.1304493  -0.1304493
  -0.1304493 ]
 [-0.1181667  -0.1181667  -0.1181667  ... -0.22365448 -0.22365448
  -0.

}

------------------------------ step 37 ------------------------------
------------------------------ step 37 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.1162863  -0.1162863  -0.1162863  ... -0.16857524 -0.16857524
  -0.16857524]
 [-0.16121247 -0.16121247 -0.16121247 ... -0.09079751 -0.09079751
  -0.09079751]
 [-0.16843966 -0.16843966 -0.16843966 ... -0.12605645 -0.12605645
  -0.12605645]
 ...
 [-0.12389784 -0.12389784 -0.12389784 ... -0.15602641 -0.15602641
  -0.15602641]
 [-0.187013   -0.187013   -0.187013   ... -0.15555969 -0.15555969
  -0.15555969]
 [-0.17658691 -0.17658691 -0.17658691 ... -0.12691279 -0.12691279
  -0.12691279]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.11070486 -0.11070486 -0.11070486 ... -0.13534655 -0.13534655
  -0.13534655]
 [-0.06702974 -0.06702974 -0.06702974 ... -0.14633745 -0.14633745
  -0.14633745]
 [-0.10781243 -0.10781243 -0.10781243 ... -0.16639453 -0.16639453
  -

}

------------------------------ step 38 ------------------------------
------------------------------ step 38 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.17929408 -0.17929408 -0.17929408 ... -0.10324753 -0.10324753
  -0.10324753]
 [-0.14376704 -0.14376704 -0.14376704 ... -0.09163474 -0.09163474
  -0.09163474]
 [-0.12457985 -0.12457985 -0.12457985 ... -0.1715586  -0.1715586
  -0.1715586 ]
 ...
 [-0.16022491 -0.16022491 -0.16022491 ... -0.1301797  -0.1301797
  -0.1301797 ]
 [-0.13794477 -0.13794477 -0.13794477 ... -0.19071302 -0.19071302
  -0.19071302]
 [-0.12189702 -0.12189702 -0.12189702 ... -0.15505914 -0.15505914
  -0.15505914]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.14835885 -0.14835885 -0.14835885 ... -0.14584811 -0.14584811
  -0.14584811]
 [-0.19296244 -0.19296244 -0.19296244 ... -0.20370395 -0.20370395
  -0.20370395]
 [-0.09817441 -0.09817441 -0.09817441 ... -0.16246179 -0.16246179
  -0.

}

------------------------------ step 39 ------------------------------
------------------------------ step 39 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.15202442 -0.15202442 -0.15202442 ... -0.1766717  -0.1766717
  -0.1766717 ]
 [-0.16797516 -0.16797516 -0.16797516 ... -0.16682824 -0.16682824
  -0.16682824]
 [-0.12528767 -0.12528767 -0.12528767 ... -0.17113724 -0.17113724
  -0.17113724]
 ...
 [-0.08605936 -0.08605936 -0.08605936 ... -0.1791083  -0.1791083
  -0.1791083 ]
 [-0.18029553 -0.18029553 -0.18029553 ... -0.15319684 -0.15319684
  -0.15319684]
 [-0.14999515 -0.14999515 -0.14999515 ... -0.17306136 -0.17306136
  -0.17306136]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.13688268 -0.13688268 -0.13688268 ... -0.17159411 -0.17159411
  -0.17159411]
 [-0.16675717 -0.16675717 -0.16675717 ... -0.11563352 -0.11563352
  -0.11563352]
 [-0.20073703 -0.20073703 -0.20073703 ... -0.08668078 -0.08668078
  -0.

}

------------------------------ step 40 ------------------------------
------------------------------ step 40 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.15022647 -0.15022647 -0.15022647 ... -0.19310248 -0.19310248
  -0.19310248]
 [-0.1585796  -0.1585796  -0.1585796  ... -0.17682858 -0.17682858
  -0.17682858]
 [-0.17643423 -0.17643423 -0.17643423 ... -0.13770463 -0.13770463
  -0.13770463]
 ...
 [-0.17894666 -0.17894666 -0.17894666 ... -0.18342501 -0.18342501
  -0.18342501]
 [-0.1430084  -0.1430084  -0.1430084  ... -0.15073879 -0.15073879
  -0.15073879]
 [-0.16460475 -0.16460475 -0.16460475 ... -0.11946483 -0.11946483
  -0.11946483]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.1937725  -0.1937725  -0.1937725  ... -0.19084652 -0.19084652
  -0.19084652]
 [-0.16866586 -0.16866586 -0.16866586 ... -0.11656992 -0.11656992
  -0.11656992]
 [-0.14271729 -0.14271729 -0.14271729 ... -0.20634446 -0.20634446
  -

}
------------------------------ step 41 ------------------------------
------------------------------ step 41 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.1760145  -0.1760145  -0.1760145  ... -0.22521895 -0.22521895
  -0.22521895]
 [-0.20653461 -0.20653461 -0.20653461 ... -0.1810481  -0.1810481
  -0.1810481 ]
 [-0.16049288 -0.16049288 -0.16049288 ... -0.20318803 -0.20318803
  -0.20318803]
 ...
 [-0.20112321 -0.20112321 -0.20112321 ... -0.15485205 -0.15485205
  -0.15485205]
 [-0.22424416 -0.22424416 -0.22424416 ... -0.16043316 -0.16043316
  -0.16043316]
 [-0.13806295 -0.13806295 -0.13806295 ... -0.17790592 -0.17790592
  -0.17790592]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.16051768 -0.16051768 -0.16051768 ... -0.11951704 -0.11951704
  -0.11951704]
 [-0.19745642 -0.19745642 -0.19745642 ... -0.11727287 -0.11727287
  -0.11727287]
 [-0.14226203 -0.14226203 -0.14226203 ... -0.16445845 -0.16445845
  -0.

}

------------------------------ step 42 ------------------------------
------------------------------ step 42 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.22483402 -0.22483402 -0.22483402 ... -0.17522405 -0.17522405
  -0.17522405]
 [-0.19558303 -0.19558303 -0.19558303 ... -0.22876209 -0.22876209
  -0.22876209]
 [-0.16584177 -0.16584177 -0.16584177 ... -0.166182   -0.166182
  -0.166182  ]
 ...
 [-0.18065254 -0.18065254 -0.18065254 ... -0.19359903 -0.19359903
  -0.19359903]
 [-0.17426792 -0.17426792 -0.17426792 ... -0.1952105  -0.1952105
  -0.1952105 ]
 [-0.21963936 -0.21963936 -0.21963936 ... -0.19917208 -0.19917208
  -0.19917208]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.20120709 -0.20120709 -0.20120709 ... -0.20310523 -0.20310523
  -0.20310523]
 [-0.18586345 -0.18586345 -0.18586345 ... -0.1587748  -0.1587748
  -0.1587748 ]
 [-0.20012383 -0.20012383 -0.20012383 ... -0.22467914 -0.22467914
  -0.22

}

------------------------------ step 43 ------------------------------
------------------------------ step 43 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.17179495 -0.17179495 -0.17179495 ... -0.24620189 -0.24620189
  -0.24620189]
 [-0.18665339 -0.18665339 -0.18665339 ... -0.11395717 -0.11395717
  -0.11395717]
 [-0.22886798 -0.22886798 -0.22886798 ... -0.2358864  -0.2358864
  -0.2358864 ]
 ...
 [-0.2269699  -0.2269699  -0.2269699  ... -0.24852684 -0.24852684
  -0.24852684]
 [-0.14036813 -0.14036813 -0.14036813 ... -0.273029   -0.273029
  -0.273029  ]
 [-0.19380128 -0.19380128 -0.19380128 ... -0.21270566 -0.21270566
  -0.21270566]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.21667995 -0.21667995 -0.21667995 ... -0.2235429  -0.2235429
  -0.2235429 ]
 [-0.20624216 -0.20624216 -0.20624216 ... -0.24397548 -0.24397548
  -0.24397548]
 [-0.12362527 -0.12362527 -0.12362527 ... -0.14366567 -0.14366567
  -0.14

}

------------------------------------------------------------  step step   4444  ------------------------------------------------------------

[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.22073118 -0.22073118 -0.22073118 ... -0.1581519  -0.1581519
  -0.1581519 ]
 [-0.26119336 -0.26119336 -0.26119336 ... -0.18408181 -0.18408181
  -0.18408181]
 [-0.24590586 -0.24590586 -0.24590586 ... -0.19066226 -0.19066226
  -0.19066226]
 ...
 [-0.21409683 -0.21409683 -0.21409683 ... -0.18314117 -0.18314117
  -0.18314117]
 [-0.2452758  -0.2452758  -0.2452758  ... -0.19982082 -0.19982082
  -0.19982082]
 [-0.23960078 -0.23960078 -0.23960078 ... -0.27898702 -0.27898702
  -0.27898702]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.2671387  -0.2671387  -0.2671387  ... -0.25710332 -0.25710332
  -0.25710332]
 [-0.22090809 -0.22090809 -0.22090809 ... -0.25706467 -0.25706467
  -0.25706467]
 [-0.18384326 -0.18384326 -0.18384326 ... -0.21424799 -0.21424799
  -0

}
------------------------------------------------------------  step step   4545  ------------------------------------------------------------

[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.23986612 -0.23986612 -0.23986612 ... -0.22068065 -0.22068065
  -0.22068065]
 [-0.2466579  -0.2466579  -0.2466579  ... -0.20176785 -0.20176785
  -0.20176785]
 [-0.24697356 -0.24697356 -0.24697356 ... -0.2265924  -0.2265924
  -0.2265924 ]
 ...
 [-0.22941756 -0.22941756 -0.22941756 ... -0.2219078  -0.2219078
  -0.2219078 ]
 [-0.23275442 -0.23275442 -0.23275442 ... -0.10608855 -0.10608855
  -0.10608855]
 [-0.17004183 -0.17004183 -0.17004183 ... -0.27840263 -0.27840263
  -0.27840263]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.2017912  -0.2017912  -0.2017912  ... -0.17018594 -0.17018594
  -0.17018594]
 [-0.26617163 -0.26617163 -0.26617163 ... -0.26324862 -0.26324862
  -0.26324862]
 [-0.21274802 -0.21274802 -0.21274802 ... -0.23861536 -0.23861536
  -0.2

}

------------------------------------------------------------ step   step  46 ------------------------------46
 ------------------------------
[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.23559593 -0.23559593 -0.23559593 ... -0.2531055  -0.2531055
  -0.2531055 ]
 [-0.2075781  -0.2075781  -0.2075781  ... -0.29453123 -0.29453123
  -0.29453123]
 [-0.26525277 -0.26525277 -0.26525277 ... -0.2615906  -0.2615906
  -0.2615906 ]
 ...
 [-0.2326729  -0.2326729  -0.2326729  ... -0.21549165 -0.21549165
  -0.21549165]
 [-0.26436317 -0.26436317 -0.26436317 ... -0.25163788 -0.25163788
  -0.25163788]
 [-0.22864495 -0.22864495 -0.22864495 ... -0.2281408  -0.2281408
  -0.2281408 ]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.2935084  -0.2935084  -0.2935084  ... -0.19787465 -0.19787465
  -0.19787465]
 [-0.21954173 -0.21954173 -0.21954173 ... -0.2542009  -0.2542009
  -0.2542009 ]
 [-0.18746144 -0.18746144 -0.18746144 ... -0.27816102 -0.27816102
  -0.27

}

------------------------------------------------------------  step step   4747  ------------------------------------------------------------

[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.20925215 -0.20925215 -0.20925215 ... -0.18087082 -0.18087082
  -0.18087082]
 [-0.23435944 -0.23435944 -0.23435944 ... -0.21694446 -0.21694446
  -0.21694446]
 [-0.22611341 -0.22611341 -0.22611341 ... -0.25619924 -0.25619924
  -0.25619924]
 ...
 [-0.25600117 -0.25600117 -0.25600117 ... -0.18800856 -0.18800856
  -0.18800856]
 [-0.27133155 -0.27133155 -0.27133155 ... -0.26675808 -0.26675808
  -0.26675808]
 [-0.24803647 -0.24803647 -0.24803647 ... -0.24972898 -0.24972898
  -0.24972898]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.23235963 -0.23235963 -0.23235963 ... -0.22459935 -0.22459935
  -0.22459935]
 [-0.22244799 -0.22244799 -0.22244799 ... -0.27799368 -0.27799368
  -0.27799368]
 [-0.23044007 -0.23044007 -0.23044007 ... -0.248617   -0.248617
  -0.

}

------------------------------------------------------------  step step   4848  ------------------------------------------------------------

[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.31680614 -0.31680614 -0.31680614 ... -0.28701863 -0.28701863
  -0.28701863]
 [-0.23282227 -0.23282227 -0.23282227 ... -0.27554235 -0.27554235
  -0.27554235]
 [-0.2593755  -0.2593755  -0.2593755  ... -0.23999314 -0.23999314
  -0.23999314]
 ...
 [-0.23553501 -0.23553501 -0.23553501 ... -0.23144439 -0.23144439
  -0.23144439]
 [-0.20894432 -0.20894432 -0.20894432 ... -0.22256343 -0.22256343
  -0.22256343]
 [-0.1928213  -0.1928213  -0.1928213  ... -0.22651708 -0.22651708
  -0.22651708]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.19454062 -0.19454062 -0.19454062 ... -0.19287303 -0.19287303
  -0.19287303]
 [-0.18256648 -0.18256648 -0.18256648 ... -0.20586306 -0.20586306
  -0.20586306]
 [-0.28613934 -0.28613934 -0.28613934 ... -0.2544424  -0.2544424
  -0

}

------------------------------------------------------------  step step   4949  ------------------------------------------------------------

[INFO]: embedding_vector
[INFO]: embedding_vector
  PerReplica:{
  0: tf.Tensor(
[[-0.21762584 -0.21762584 -0.21762584 ... -0.22707182 -0.22707182
  -0.22707182]
 [-0.1870008  -0.1870008  -0.1870008  ... -0.25740552 -0.25740552
  -0.25740552]
 [-0.22921841 -0.22921841 -0.22921841 ... -0.21250372 -0.21250372
  -0.21250372]
 ...
 [-0.20703614 -0.20703614 -0.20703614 ... -0.26245713 -0.26245713
  -0.26245713]
 [-0.28084302 -0.28084302 -0.28084302 ... -0.25496072 -0.25496072
  -0.25496072]
 [-0.25558522 -0.25558522 -0.25558522 ... -0.27892774 -0.27892774
  -0.27892774]], shape=(4096, 40), dtype=float32),
  1: tf.Tensor(
[[-0.20606595 -0.20606595 -0.20606595 ... -0.19534972 -0.19534972
  -0.19534972]
 [-0.29221413 -0.29221413 -0.29221413 ... -0.21741131 -0.21741131
  -0.21741131]
 [-0.24992868 -0.24992868 -0.24992868 ... -0.28754765 -0.28754765
  -

}
[INFO]: dumpped items to file ./sok_embedding_vectors_0.file
[INFO]: dumpped items to file ./sok_embedding_vectors_1.file
[INFO] loadded from file ./random_samples_0.file
[INFO] loadded from file ./random_samples_0.file
[INFO] loadded from file ./random_samples_1.file
[INFO] loadded from file ./random_samples_1.file
------------------------------ 0 ------------------------------
------------------------------ 0 ------------------------------
[INFO]: embedding_vector:
 tf.Tensor(
[[1. 1. 1. ... 1. 1. 1.]
 [1. 1. 1. ... 1. 1. 1.]
 [1. 1. 1. ... 1. 1. 1.]
 ...
 [1. 1. 1. ... 1. 1. 1.]
 [1. 1. 1. ... 1. 1. 1.]
 [1. 1. 1. ... 1. 1. 1.]], shape=(65536, 40), dtype=float32)
[INFO]: embedding_vector:
 tf.Tensor(
[[1. 1. 1. ... 1. 1. 1.]
 [1. 1. 1. ... 1. 1. 1.]
 [1. 1. 1. ... 1. 1. 1.]
 ...
 [1. 1. 1. ... 1. 1. 1.]
 [1. 1. 1. ... 1. 1. 1.]
 [1. 1. 1. ... 1. 1. 1.]], shape=(65536, 40), dtype=float32)
------------------------------ 1 ------------------------------
[INFO]: embedding_vector:
 tf.

 [0.32692277 0.32692277 0.32692277 ... 0.32054782 0.32054782 0.32054782]], shape=(65536, 40), dtype=float32)
------------------------------ 7 ------------------------------
[INFO]: embedding_vector:
 tf.Tensor(
[[0.35938388 0.35938388 0.35938388 ... 0.32316637 0.32316637 0.32316637]
 [0.33918768 0.33918768 0.33918768 ... 0.33114496 0.33114496 0.33114496]
 [0.3267928  0.3267928  0.3267928  ... 0.34564686 0.34564686 0.34564686]
 ...
 [0.3233582  0.3233582  0.3233582  ... 0.32973903 0.32973903 0.32973903]
 [0.3314612  0.3314612  0.3314612  ... 0.33581683 0.33581683 0.33581683]
 [0.32692277 0.32692277 0.32692277 ... 0.32054782 0.32054782 0.32054782]], shape=(65536, 40), dtype=float32)
------------------------------ 8 ------------------------------
[INFO]: embedding_vector:
 tf.Tensor(
[[0.24688369 0.24688369 0.24688369 ... 0.26600617 0.26600617 0.26600617]
 [0.2480166  0.2480166  0.2480166  ... 0.2514321  0.2514321  0.2514321 ]
 [0.24435692 0.24435692 0.24435692 ... 0.24600014 0.24600014 0

  -0.1519821 ]], shape=(65536, 40), dtype=float32)
------------------------------ 14 ------------------------------
[INFO]: embedding_vector:
 tf.Tensor(
[[-0.1689989  -0.1689989  -0.1689989  ... -0.10777268 -0.10777268
  -0.10777268]
 [-0.13230377 -0.13230377 -0.13230377 ... -0.11240527 -0.11240527
  -0.11240527]
 [-0.12588634 -0.12588634 -0.12588634 ... -0.11294188 -0.11294188
  -0.11294188]
 ...
 [-0.12267219 -0.12267219 -0.12267219 ... -0.10800198 -0.10800198
  -0.10800198]
 [-0.1359811  -0.1359811  -0.1359811  ... -0.13542189 -0.13542189
  -0.13542189]
 [-0.13517365 -0.13517365 -0.13517365 ... -0.1519821  -0.1519821
  -0.1519821 ]], shape=(65536, 40), dtype=float32)
------------------------------ 15 ------------------------------
[INFO]: embedding_vector:
 tf.Tensor(
[[-0.16269596 -0.16269596 -0.16269596 ... -0.15430662 -0.15430662
  -0.15430662]
 [-0.1764975  -0.1764975  -0.1764975  ... -0.17849879 -0.17849879
  -0.17849879]
 [-0.152884   -0.152884   -0.152884   ... -0.16444835 -

  -0.21160543]], shape=(65536, 40), dtype=float32)
------------------------------ 27 ------------------------------
[INFO]: embedding_vector:
 ------------------------------ 28 ------------------------------
tf.Tensor(
[[-0.1761609  -0.1761609  -0.1761609  ... -0.23750332 -0.23750332
  -0.23750332]
 [-0.23230386 -0.23230386 -0.23230386 ... -0.17313842 -0.17313842
  -0.17313842]
 [-0.21483089 -0.21483089 -0.21483089 ... -0.17595741 -0.17595741
  -0.17595741]
 ...
 [-0.15503663 -0.15503663 -0.15503663 ... -0.20782913 -0.20782913
  -0.20782913]
 [-0.20479324 -0.20479324 -0.20479324 ... -0.1684965  -0.1684965
  -0.1684965 ]
 [-0.21528238 -0.21528238 -0.21528238 ... -0.21160541 -0.21160541
  -0.21160541]], shape=(65536, 40), dtype=float32)
[INFO]: embedding_vector:
 tf.Tensor(
[[-0.23930222 -0.23930222 -0.23930222 ... -0.16324466 -0.16324466
  -0.16324466]
 [-0.1713374  -0.1713374  -0.1713374  ... -0.17283335 -0.17283335
  -0.17283335]
 [-0.16150628 -0.16150628 -0.16150628 ... -0.17818265 -

  -0.08967178]], shape=(65536, 40), dtype=float32)
[INFO]: embedding_vector:
 tf.Tensor(
[[-0.13579781 -0.13579781 -0.13579781 ... -0.16283527 -0.16283527
  -0.16283527]
 [-0.16079614 -0.16079614 -0.16079614 ... -0.09351777 -0.09351777
  -0.09351777]
 [-0.11941607 -0.11941607 -0.11941607 ... -0.12348752 -0.12348752
  -0.12348752]
 ...
 [-0.07437872 -0.07437872 -0.07437872 ... -0.11933292 -0.11933292
  -0.11933292]
 [-0.14259903 -0.14259903 -0.14259903 ... -0.12376738 -0.12376738
  -0.12376738]
 [-0.12998739 -0.12998739 -0.12998739 ... -0.12171997 -0.12171997
  -0.12171997]], shape=(65536, 40), dtype=float32)
------------------------------ 34 ------------------------------
------------------------------[INFO]: embedding_vector:
  35 ------------------------------
tf.Tensor(
[[-0.13579783 -0.13579783 -0.13579783 ... -0.16283526 -0.16283526
  -0.16283526]
 [-0.16079615 -0.16079615 -0.16079615 ... -0.09351775 -0.09351775
  -0.09351775]
 [-0.11941606 -0.11941606 -0.11941606 ... -0.12348746 

  -0.2244833 ]], shape=(65536, 40), dtype=float32)------------------------------

[INFO]: embedding_vector:
 tf.Tensor(
[[-0.20925497 -0.20925497 -0.20925497 ... -0.18087214 -0.18087214
  -0.18087214]
 [-0.23436086 -0.23436086 -0.23436086 ... -0.21694495 -0.21694495
  -0.21694495]
 [-0.22611482 -0.22611482 -0.22611482 ... -0.2561999  -0.2561999
  -0.2561999 ]
 ...
 [-0.20812303 -0.20812303 -0.20812303 ... -0.23730367 -0.23730367
  -0.23730367]
 [-0.22107518 -0.22107518 -0.22107518 ... -0.19635016 -0.19635016
  -0.19635016]
 [-0.2128504  -0.2128504  -0.2128504  ... -0.29057276 -0.29057276
  -0.29057276]], shape=(65536, 40), dtype=float32)
------------------------------ 47 ------------------------------
[INFO]: embedding_vector:
 ------------------------------ 48 ------------------------------
tf.Tensor(
[[-0.2092549  -0.2092549  -0.2092549  ... -0.18087223 -0.18087223
  -0.18087223]
 [-0.23436081 -0.23436081 -0.23436081 ... -0.21694499 -0.21694499
  -0.21694499]
 [-0.22611476 -0.2261147

If no exceptions are raised and the embedding vectors obtained from TensorFlow and SOK are consistent, a message similar to the following one is printed.
```shell
"[INFO]: With MultiWorkerMirroredStrategy, when 8 GPUs are used for each node and 16 GPUs in total, the embedding vectors obtained from TensorFlow and SOK are consistent for 50 iterations"
```
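
The consistency check reported above can be pictured as an element-wise comparison within a small tolerance. The following is a minimal sketch, assuming plain NumPy arrays stand in for the per-iteration embedding vectors. The helper name `vectors_consistent` and the tolerance values are hypothetical and are not part of SOK or of this notebook's scripts.

```python
import numpy as np

def vectors_consistent(sok_vectors, tf_vectors, rtol=1e-4, atol=1e-4):
    """Return True if each pair of per-iteration arrays matches within tolerance.

    Hypothetical helper for illustration; the tolerances are assumptions.
    """
    # Both runs must have produced the same number of iterations.
    if len(sok_vectors) != len(tf_vectors):
        return False
    # Compare iteration by iteration: shapes must agree and values must be close.
    return all(
        a.shape == b.shape and np.allclose(a, b, rtol=rtol, atol=atol)
        for a, b in zip(sok_vectors, tf_vectors)
    )

# Toy data standing in for the embedding vectors of three iterations.
rng = np.random.default_rng(0)
reference = [rng.standard_normal((8, 4)).astype(np.float32) for _ in range(3)]
# A tiny perturbation, similar in size to float32 rounding differences.
perturbed = [x + 1e-6 for x in reference]

print(vectors_consistent(reference, perturbed))
```

A tolerance-based comparison is used here instead of exact equality because the two runs may differ in the last few digits, as the printed values above suggest.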

## Documents ##
Detailed usage samples and API reference documents are being written and will be available in the near future.