# Introduction

This notebook contains the same content as "criteo_keras.py", but in interactive notebook form.

The dataset used here is the Criteo click-log dataset, preprocessed on Spark by the DLRM ETL job (https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow2/Recommendation/DLRM_and_DCNv2/preproc).

We provide a small data sample in the `sample_data` folder.

After the DLRM ETL the data still has 40 columns: 1 label column and 39 numerical feature columns (13 continuous and 26 categorical, with the categorical values already encoded as integers).
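
As a quick sanity check, the expected layout can be written down and compared against the columns of whatever data you load. The sketch below is only illustrative: the column names (`clicked`, `i0`..`i12`, `c0`..`c25`) follow the constants defined later in this notebook, and `check_schema` is a hypothetical helper, not part of the DLRM ETL.

```python
# Expected post-ETL layout: 1 label + 13 continuous + 26 categorical = 40 columns.
EXPECTED_LABEL = ['clicked']
EXPECTED_CONTINUOUS = [f'i{i}' for i in range(13)]
EXPECTED_CATEGORICAL = [f'c{c}' for c in range(26)]
EXPECTED_COLUMNS = EXPECTED_LABEL + EXPECTED_CONTINUOUS + EXPECTED_CATEGORICAL


def check_schema(columns):
    """Return (missing, unexpected) column names relative to the expected layout.

    Hypothetical helper for illustration; `columns` is any iterable of names,
    for example `df.columns` of a Spark or pandas DataFrame.
    """
    actual = set(columns)
    expected = set(EXPECTED_COLUMNS)
    missing = sorted(expected - actual)
    unexpected = sorted(actual - expected)
    return missing, unexpected


missing, unexpected = check_schema(EXPECTED_COLUMNS)
print(len(EXPECTED_COLUMNS), missing, unexpected)
```

If you bring your own preprocessed data, a non-empty `missing` or `unexpected` list is a hint that the model inputs need to be adapted.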

You can either follow the same routine to prepare the data, or use your own preprocessed data and change the DL model accordingly.

`Please note: The following demo is written for a DGX-2 machine (with V100 GPUs).` We optimized the whole workflow on DGX-2, and it is not guaranteed to run successfully on other types of machines.

### Import necessary libraries

In [1]:
import argparse
import math
import pprint
import sys
# This needs to happen first to avoid pyarrow serialization errors.
from pyspark.sql import SparkSession

# Make sure pyarrow is referenced before anything else to avoid segfault due to conflict
# with TensorFlow libraries.  Use `pa` package reference to ensure it's loaded before
# functions like `deserialize_model` which are implemented at the top level.
# See https://jira.apache.org/jira/browse/ARROW-3346
import pyarrow as pa

import horovod
import horovod.tensorflow.keras as hvd
import tensorflow as tf
from horovod.spark.common.backend import SparkBackend
from tensorflow.keras.layers import BatchNormalization, Input, Embedding, Concatenate, Dense, Flatten
from tensorflow.keras.layers.experimental.preprocessing import CategoryEncoding

### Set some macros

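The `xxx_DATALOADER` constants select which dataloader to use.
The `xxx_COLUMNS` constants name the feature and label columns. They are used to build the model inputs and by both dataloaders; `LABEL_COLUMNS` is used only by the NVTabular dataloader.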

In [1]:
PETASTORM_DATALOADER = 'petastorm'
NVTABULAR_DATALOADER = 'nvtabular'

CONTINUOUS_COLUMNS = [f'i{i}' for i in range(13)]
CATEGORICAL_COLUMNS = [f'c{c}' for c in range(26)]
ALL_COLUMNS = CONTINUOUS_COLUMNS + CATEGORICAL_COLUMNS
LABEL_COLUMNS = ['clicked']

`dimensions` holds the number of distinct values in each categorical column after the DLRM ETL.

It is used to size the one-hot and embedding layers of our model.

In [3]:
def get_category_dimensions(spark, data_dir):
    df = spark.read.csv(f'{data_dir}/dimensions/*.csv', header=True).toPandas()
    dimensions = df.to_dict('records')[0]
    pprint.pprint(dimensions)
    return dimensions
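
Because the CSV is read without schema inference, the values in `dimensions` are strings, as the printed output of this notebook shows. A minimal sketch of turning them into integer vocabulary sizes, assuming one extra slot on top of the distinct-value count as `build_model` does. The sample values are copied from the output of this notebook, and `to_vocab_sizes` is a hypothetical helper:

```python
def to_vocab_sizes(dimensions):
    """Convert {'c5': '3', ...} into {'c5': 4, ...} (distinct count + 1, as int)."""
    return {name: int(count) + 1 for name, count in dimensions.items()}


# Four entries copied from the dimensions printed by this notebook.
sample = {'c5': '3', 'c8': '62', 'c12': '10', 'c6': '7104'}
vocab_sizes = to_vocab_sizes(sample)
print(vocab_sizes)
```

With 8-dimensional embeddings these sizes give 63 * 8 = 504 parameters for `c8` and 7105 * 8 = 56840 for `c6`, which matches the model summary in the output of this notebook.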

### Build the model

The model consists of two main parts: embedding layers and fully connected layers.

In [4]:
def build_model(dimensions, args):
    
    inputs = {
        **{i: Input(shape=(1,), name=i, dtype=tf.float32) for i in CONTINUOUS_COLUMNS},
        **{c: Input(shape=(1,), name=c, dtype=tf.int32) for c in CATEGORICAL_COLUMNS}
    }

    one_hots = []
    embeddings = []
    for c in CATEGORICAL_COLUMNS:
        dimension = int(dimensions[c]) + 1
        # Alternative threshold: dimension <= 128; a smaller value is used for the demo.
        if dimension <= 8:
            one_hots.append(CategoryEncoding(num_tokens=dimension, name=f'one_hot_{c}')(inputs[c]))
        else:
            # Alternative sizing rule: embedding_size = int(math.floor(0.6 * dimension ** 0.25));
            # a fixed size is used here to keep the demo model small.
            embedding_size = 8
            embeddings.append(Embedding(input_dim=dimension,
                                        output_dim=embedding_size,
                                        input_length=1,
                                        name=f'embedding_{c}')(inputs[c]))

    x = Concatenate(name='embeddings_concat')(embeddings)
    x = Flatten(name='embeddings_flatten')(x)
    x = Concatenate(name='inputs_concat')([x] + one_hots + [inputs[i] for i in CONTINUOUS_COLUMNS])
    x = BatchNormalization()(x)
    x = Dense(64, activation='relu')(x)
    x = BatchNormalization()(x)
    x = Dense(64, activation='relu')(x)
    x = BatchNormalization()(x)
    x = Dense(64, activation='relu')(x)
    x = BatchNormalization()(x)
    x = Dense(32, activation='relu')(x)
    output = Dense(1, activation='sigmoid', name='output')(x)
    model = tf.keras.Model(inputs=[inputs[c] for c in ALL_COLUMNS], outputs=output)
    if hvd.rank() == 0:
        model.summary()

    opt = tf.keras.optimizers.Adam(learning_rate=args.learning_rate)
    opt = hvd.DistributedOptimizer(opt)
    model.compile(optimizer=opt, loss='binary_crossentropy', metrics=[tf.keras.metrics.AUC()])

    return model
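
To see where the input width of the first dense layer comes from, the sketch below repeats the one-hot versus embedding split from `build_model` in plain Python, without TensorFlow. The `dimensions` values are copied from the output of this notebook; `plan_inputs` is a hypothetical helper for illustration only.

```python
ONE_HOT_MAX = 8      # same threshold as in build_model
EMBEDDING_SIZE = 8   # same fixed embedding size as in build_model
NUM_CONTINUOUS = 13  # i0..i12

# Distinct-value counts as printed by get_category_dimensions in this notebook.
dimensions = {
    'c0': '7912888', 'c1': '33822', 'c2': '17138', 'c3': '7338', 'c4': '20045',
    'c5': '3', 'c6': '7104', 'c7': '1381', 'c8': '62', 'c9': '5554113',
    'c10': '582468', 'c11': '245827', 'c12': '10', 'c13': '2208', 'c14': '10666',
    'c15': '103', 'c16': '3', 'c17': '967', 'c18': '14', 'c19': '8165895',
    'c20': '2675939', 'c21': '7156452', 'c22': '302515', 'c23': '12021',
    'c24': '96', 'c25': '34',
}


def plan_inputs(dimensions):
    """Return (one_hot_width, embedding_width) after flattening and concatenation."""
    one_hot_width = 0
    embedding_width = 0
    for count in dimensions.values():
        dimension = int(count) + 1
        if dimension <= ONE_HOT_MAX:
            one_hot_width += dimension        # one output unit per token
        else:
            embedding_width += EMBEDDING_SIZE  # one fixed-size vector per column
    return one_hot_width, embedding_width


one_hot_width, embedding_width = plan_inputs(dimensions)
total_width = embedding_width + one_hot_width + NUM_CONTINUOUS
print(embedding_width, one_hot_width, total_width)
```

Only `c5` and `c16` fall under the one-hot threshold, so the widths are 192 + 8 + 13 = 213, in agreement with the `embeddings_flatten` and `inputs_concat` shapes in the model summary.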

### Set train function

`train_fn` is the function that runs on every Horovod worker (a Spark executor in our case).

We set `CUDA_VISIBLE_DEVICES` so that each worker sees only its own GPU and the workers do not overlap.
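
The pinning step can be sketched in isolation as follows. `pin_first_gpu` is a hypothetical helper, and the device IDs passed in stand in for whatever `get_available_devices()` returns on a worker. The variable has to be set before TensorFlow first touches the GPU, which is why `train_fn` sets it before calling `hvd.init()` and building the model.

```python
import os


def pin_first_gpu(devices, environ=os.environ):
    """Restrict this process to the first assigned GPU, if any.

    `devices` is a list of device ID strings such as ['3'] (hypothetical input).
    Returns the chosen device ID, or None when no GPU was assigned.
    """
    if not devices:
        return None
    environ['CUDA_VISIBLE_DEVICES'] = str(devices[0])
    return environ['CUDA_VISIBLE_DEVICES']


env = {}  # a plain dict, so the example does not touch the real environment
print(pin_first_gpu(['3', '7'], env), env)
```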

In [5]:
def train_fn(dimensions, train_rows, val_rows, args):
    # Make sure pyarrow is referenced before anything else to avoid segfault due to conflict
    # with TensorFlow libraries.  Use `pa` package reference to ensure it's loaded before
    # functions like `deserialize_model` which are implemented at the top level.
    # See https://jira.apache.org/jira/browse/ARROW-3346
    pa

    import atexit
    import horovod.tensorflow.keras as hvd
    from horovod.spark.task import get_available_devices
    import os
    import tempfile
    import tensorflow as tf
    import tensorflow.keras.backend as K
    import shutil

    gpus = get_available_devices()
    if gpus:
        os.environ['CUDA_VISIBLE_DEVICES'] = gpus[0]
    if args.dataloader == NVTABULAR_DATALOADER:
        os.environ['TF_MEMORY_ALLOCATION'] = '0.85'
        from nvtabular.loader.tensorflow import KerasSequenceLoader

    # Horovod: initialize Horovod inside the trainer.
    hvd.init()

    # Build and compile a fresh model; build_model wraps the optimizer in hvd.DistributedOptimizer.
    model = build_model(dimensions, args)

    # Horovod: adjust learning rate based on number of processes.
    scaled_lr = K.get_value(model.optimizer.lr) * hvd.size()
    K.set_value(model.optimizer.lr, scaled_lr)

    # Horovod: print summary logs on the first worker.
    verbose = 1 if hvd.rank() == 0 else 0

    callbacks = [
        # Horovod: broadcast initial variable states from rank 0 to all other processes.
        # This is necessary to ensure consistent initialization of all workers when
        # training is started with random weights or restored from a checkpoint.
        hvd.callbacks.BroadcastGlobalVariablesCallback(root_rank=0),

        # Horovod: average metrics among workers at the end of every epoch.
        #
        # Note: This callback must be in the list before the ReduceLROnPlateau,
        # TensorBoard, or other metrics-based callbacks.
        hvd.callbacks.MetricAverageCallback(),

        # Horovod: using `lr = 1.0 * hvd.size()` from the very beginning leads to worse final
        # accuracy. Scale the learning rate `lr = 1.0` ---> `lr = 1.0 * hvd.size()` during
        # the first five epochs. See https://arxiv.org/abs/1706.02677 for details.
        hvd.callbacks.LearningRateWarmupCallback(initial_lr=scaled_lr, warmup_epochs=5, verbose=verbose),

        # Reduce LR if val_auc has not improved for 10 epochs, and stop training
        # if it has not improved for 20 epochs. Higher AUC is better, so use mode='max'.
        tf.keras.callbacks.ReduceLROnPlateau(monitor='val_auc', mode='max', patience=10, verbose=verbose),
        tf.keras.callbacks.EarlyStopping(monitor='val_auc', mode='max', patience=20, verbose=verbose),
        tf.keras.callbacks.TerminateOnNaN(),

        # Log Tensorboard events.
        tf.keras.callbacks.TensorBoard(log_dir=args.logs_dir, write_steps_per_second=True, update_freq=10)
    ]

    # Horovod: save checkpoints only on the first worker to prevent other workers from corrupting them.
    if hvd.rank() == 0:
        ckpt_dir = tempfile.mkdtemp()
        ckpt_file = os.path.join(ckpt_dir, 'checkpoint.h5')
        atexit.register(lambda: shutil.rmtree(ckpt_dir))
        # Higher AUC is better, so keep the checkpoint with the maximum val_auc.
        callbacks.append(tf.keras.callbacks.ModelCheckpoint(
            ckpt_file, monitor='val_auc', mode='max', save_best_only=True))

    if args.dataloader == PETASTORM_DATALOADER:
        from petastorm import make_batch_reader
        from petastorm.tf_utils import make_petastorm_dataset

        # Make Petastorm readers.
        with make_batch_reader(f'{args.data_dir}/train',
                               num_epochs=None,
                               cur_shard=hvd.rank(),
                               shard_count=hvd.size(),
                               hdfs_driver='libhdfs') as train_reader:
            with make_batch_reader(f'{args.data_dir}/val',
                                   num_epochs=None,
                                   cur_shard=hvd.rank(),
                                   shard_count=hvd.size(),
                                   hdfs_driver='libhdfs') as val_reader:
                # Convert readers to tf.data.Dataset.
                train_ds = make_petastorm_dataset(train_reader) \
                    .unbatch() \
                    .shuffle(10 * args.batch_size) \
                    .batch(args.batch_size) \
                    .map(lambda x: (tuple(getattr(x, c) for c in ALL_COLUMNS), x.clicked))

                val_ds = make_petastorm_dataset(val_reader) \
                    .unbatch() \
                    .batch(args.batch_size) \
                    .map(lambda x: (tuple(getattr(x, c) for c in ALL_COLUMNS), x.clicked))

                history = model.fit(train_ds,
                                    validation_data=val_ds,
                                    steps_per_epoch=int(train_rows / args.batch_size / hvd.size()),
                                    validation_steps=int(val_rows / args.batch_size / hvd.size()),
                                    callbacks=callbacks,
                                    verbose=verbose,
                                    epochs=args.epochs)

    else:
        import cupy

        def seed_fn():
            """
            Generate consistent dataloader shuffle seeds across workers.
            Reseeds each worker's dataloader each epoch to get a fresh shuffle
            that's consistent across workers.
            """
            min_int, max_int = tf.int32.limits
            max_rand = max_int // hvd.size()
            # Generate a seed fragment on each worker
            seed_fragment = cupy.random.randint(0, max_rand).get()
            # Aggregate seed fragments from all Horovod workers
            seed_tensor = tf.constant(seed_fragment)
            reduced_seed = hvd.allreduce(seed_tensor, name="shuffle_seed", op=hvd.Sum)
            return reduced_seed % max_rand

        train_ds = KerasSequenceLoader(
            f'{args.data_dir}/train',
            batch_size=args.batch_size,
            label_names=LABEL_COLUMNS,
            cat_names=CATEGORICAL_COLUMNS,
            cont_names=CONTINUOUS_COLUMNS,
            engine="parquet",
            shuffle=True,
            buffer_size=0.06,  # a value below 1 is a fraction of GPU memory used per buffered chunk
            parts_per_chunk=1,
            global_size=hvd.size(),
            global_rank=hvd.rank(),
            seed_fn=seed_fn)

        val_ds = KerasSequenceLoader(
            f'{args.data_dir}/val',
            batch_size=args.batch_size,
            label_names=LABEL_COLUMNS,
            cat_names=CATEGORICAL_COLUMNS,
            cont_names=CONTINUOUS_COLUMNS,
            engine="parquet",
            shuffle=False,
            buffer_size=0.06,  # a value below 1 is a fraction of GPU memory used per buffered chunk
            parts_per_chunk=1,
            global_size=hvd.size(),
            global_rank=hvd.rank())

        history = model.fit(train_ds,
                            validation_data=val_ds,
                            steps_per_epoch=int(train_rows / args.batch_size / hvd.size()),
                            validation_steps=int(val_rows / args.batch_size / hvd.size()),
                            callbacks=callbacks,
                            verbose=verbose,
                            epochs=args.epochs)

    if hvd.rank() == 0:
        return history.history
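
Two quantities in `train_fn` depend on the number of workers: the learning rate is multiplied by `hvd.size()`, and the number of steps per epoch is divided by it, because every worker reads only its own shard of the data. A worked example in plain Python, using the row counts printed in the output of this notebook and the arguments passed in `main()` (16 workers, batch size 65535, learning rate 0.001); `per_worker_steps` is a hypothetical helper:

```python
def per_worker_steps(rows, batch_size, num_workers):
    """Whole batches each worker processes per epoch; the remainder is dropped."""
    return int(rows / batch_size / num_workers)


num_workers = 16     # --num-proc
batch_size = 65535   # --batch-size
base_lr = 0.001      # --learning-rate

train_steps = per_worker_steps(4195197692, batch_size, num_workers)
val_steps = per_worker_steps(89137318, batch_size, num_workers)
scaled_lr = base_lr * num_workers  # same scaling as K.get_value(lr) * hvd.size()

print(train_steps, val_steps, scaled_lr)
```

This gives 4000 training steps and 85 validation steps per worker per epoch, with the learning rate scaled from 0.001 to 0.016.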

### Wrapper function to train

Here we call `horovod.spark.run` to start the training processes with Horovod on Spark.

In [6]:
def train(dimensions, train_rows, val_rows, args):
    # Horovod: run training.
    history = horovod.spark.run(train_fn,
                                args=(dimensions, train_rows, val_rows, args),
                                num_proc=args.num_proc,
                                extra_mpi_args='-mca btl_tcp_if_include enp134s0f0 -x NCCL_IB_GID_INDEX=3',
                                stdout=sys.stdout,
                                stderr=sys.stderr,
                                verbose=2,
                                nics={},
                                prefix_output_with_timestamp=True)[0]

    best_val_loss = min(history['val_loss'])
    print('Best Loss: %f' % best_val_loss)

## Use NVTabular

Here we set `--dataloader` to `nvtabular` so that training uses the NVTabular dataloader.

In [7]:
def main():
    parser = argparse.ArgumentParser(description='Criteo Spark Keras Training Example',
                                     formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser.add_argument('--data-dir', default='file:///opt/data/criteo/parquet',
                        help='location of the transformed Criteo dataset in Parquet format')
    parser.add_argument('--logs-dir', default='/opt/experiments/criteo', help='location of TensorFlow logs')
    parser.add_argument('--dataloader', default=PETASTORM_DATALOADER,
                        choices=[PETASTORM_DATALOADER, NVTABULAR_DATALOADER],
                        help='dataloader to use')
    parser.add_argument('--num-proc', type=int, default=1, help='number of worker processes for training')
    parser.add_argument('--learning-rate', type=float, default=0.0001, help='initial learning rate')
    parser.add_argument('--batch-size', type=int, default=64 * 1024, help='batch size')
    parser.add_argument('--epochs', type=int, default=3, help='number of epochs to train')
    parser.add_argument('--local-checkpoint-file', default='checkpoint', help='model checkpoint')
    args = parser.parse_args(args=['--num-proc', '16', '--data-dir', 'file:///raid/spark-team/criteo/parquet',
                                   '--dataloader', 'nvtabular', '--learning-rate', '0.001',
                                   '--batch-size', '65535', '--epochs', '1', '--logs-dir', 'tf_logs',
                                   '--local-checkpoint-file', 'ckpt_file'])

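    # Reuse the active SparkSession if the kernel already provides one; otherwise create it.
    spark = SparkSession.builder.getOrCreate()
    dimensions = get_category_dimensions(spark, args.data_dir)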

    train_df = spark.read.parquet(f'{args.data_dir}/train')
    val_df = spark.read.parquet(f'{args.data_dir}/val')
    test_df = spark.read.parquet(f'{args.data_dir}/test')
    train_rows, val_rows, test_rows = train_df.count(), val_df.count(), test_df.count()
    print('Training: %d' % train_rows)
    print('Validation: %d' % val_rows)
    print('Test: %d' % test_rows)

    train(dimensions, train_rows, val_rows, args)

    spark.stop()
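
Passing an explicit list to `parse_args` is what makes this work inside a notebook: with no argument, `argparse` reads `sys.argv`, which in a Jupyter kernel holds the kernel's own launch arguments. A reduced sketch with only two of the options above:

```python
import argparse

parser = argparse.ArgumentParser(description='parse_args with an explicit argument list')
parser.add_argument('--dataloader', default='petastorm', choices=['petastorm', 'nvtabular'])
parser.add_argument('--batch-size', type=int, default=64 * 1024)

# An explicit list replaces sys.argv; options left out keep their defaults.
args = parser.parse_args(args=['--dataloader', 'nvtabular'])
print(args.dataloader, args.batch_size)
```

Dashes in option names become underscores in the attribute names, which is why the code reads `args.batch_size` and `args.data_dir`.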

In [8]:
main()

21/09/06 09:03:44 WARN package: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'.
                                                                                

{'c0': '7912888',
 'c1': '33822',
 'c10': '582468',
 'c11': '245827',
 'c12': '10',
 'c13': '2208',
 'c14': '10666',
 'c15': '103',
 'c16': '3',
 'c17': '967',
 'c18': '14',
 'c19': '8165895',
 'c2': '17138',
 'c20': '2675939',
 'c21': '7156452',
 'c22': '302515',
 'c23': '12021',
 'c24': '96',
 'c25': '34',
 'c3': '7338',
 'c4': '20045',
 'c5': '3',
 'c6': '7104',
 'c7': '1381',
 'c8': '62',
 'c9': '5554113'}


                                                                                

Training: 4195197692
Validation: 89137318
Test: 89137319


[Stage 11:>                                                       (0 + 16) / 16]

Checking whether extension tensorflow was built with MPI.
Extension tensorflow was built with MPI.
mpirun --allow-run-as-root --tag-output -np 16 -H dgx2h0194-a1adff968d508e8d1142986f3e2c42dc:16 -bind-to none -map-by slot -mca pml ob1 -mca btl ^openib --timestamp-output      -mca btl_tcp_if_include enp134s0f0 -x NCCL_IB_GID_INDEX=3 -x NCCL_DEBUG=INFO -mca plm_rsh_agent "/home/ngc-auth-ldap-allxu/miniconda3/bin/python -m horovod.spark.driver.mpirun_rsh gAWVcAEAAAAAAAB9lCiMAmxvlF2UjAkxMjcuMC4wLjGUTWIShpRhjAdlbnA1M3MwlF2UjAwxMC4xNDguMzAuNTmUTWIShpRhjAdlbnA1OHMwlF2UjAwxMC4xNDguOTQuNTmUTWIShpRhjAdlbnA4OHMwlF2UjAwxMC4xNDkuMzAuMzSUTWIShpRhjAdlbnA5M3MwlF2UjAwxMC4xNDkuOTQuNTeUTWIShpRhjAplbnAxMzRzMGYwlF2UjAsxMC4xNTAuMzAuMpRNYhKGlGGMCGVucDE4NHMwlF2UjA0xMC4xNDguMTU4LjU5lE1iEoaUYYwIZW5wMTg5czCUXZSMDTEwLjE0OC4yMjIuNTmUTWIShpRhjAhlbnAyMjVzMJRdlIwNMTAuMTQ5LjE1OC41N5RNYhKGlGGMCGVucDIzMHMwlF2UjA0xMC4xNDkuMjIyLjU3lE1iEoaUYYwHZG9ja2VyMJRdlIwKMTcyLjE3LjAuMZRNYhKGlGF1Lg== gAWVAwMAAAAAAACMI2hvcm92b2QucnVubmV

Mon Sep  6 09:03:56 2021[1,9]<stdout>:Changing cwd from /home/ngc-auth-ldap-allxu to /raid/spark-team/allen-dlrm/spark-3.1.2-bin-hadoop3.2/work/app-20210906090316-0000/1
Mon Sep  6 09:03:56 2021[1,12]<stdout>:Changing cwd from /home/ngc-auth-ldap-allxu to /raid/spark-team/allen-dlrm/spark-3.1.2-bin-hadoop3.2/work/app-20210906090316-0000/1


Mon Sep  6 09:04:16 2021[1,4]<stderr>:2021-09-06 09:04:16.101560: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
Mon Sep  6 09:04:16 2021[1,4]<stderr>:To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Mon Sep  6 09:04:16 2021[1,2]<stderr>:2021-09-06 09:04:16.101685: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
Mon Sep  6 09:04:16 2021[1,2]<stderr>:To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Mon Sep  6 09:04:16 2021[1,5]<stderr>:2021-09-06 09:04:16.101662: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is opt

Mon Sep  6 09:04:18 2021[1,11]<stderr>:2021-09-06 09:04:18.332073: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1504] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 27633 MB memory:  -> device: 0, name: Tesla V100-SXM3-32GB-H, pci bus id: 0000:3b:00.0, compute capability: 7.0
Mon Sep  6 09:04:18 2021[1,4]<stderr>:2021-09-06 09:04:18.333252: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1504] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 27633 MB memory:  -> device: 0, name: Tesla V100-SXM3-32GB-H, pci bus id: 0000:e2:00.0, compute capability: 7.0
Mon Sep  6 09:04:18 2021[1,3]<stderr>:2021-09-06 09:04:18.390624: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1504] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 27633 MB memory:  -> device: 0, name: Tesla V100-SXM3-32GB-H, pci bus id: 0000:e7:00.0, compute capability: 7.0
Mon Sep  6 09:04:18 2021[1,7]<stderr>:2021-09-06 09:04:18.391931: I tensorflow/core/common_runtime/

Mon Sep  6 09:04:19 2021[1,11]<stderr>:2021-09-06 09:04:19.064564: I tensorflow/core/profiler/lib/profiler_session.cc:131] Profiler session initializing.
Mon Sep  6 09:04:19 2021[1,11]<stderr>:2021-09-06 09:04:19.064584: I tensorflow/core/profiler/lib/profiler_session.cc:146] Profiler session started.
Mon Sep  6 09:04:19 2021[1,11]<stderr>:2021-09-06 09:04:19.064626: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1614] Profiler found 1 GPUs
Mon Sep  6 09:04:19 2021[1,3]<stderr>:2021-09-06 09:04:19.114647: I tensorflow/core/profiler/lib/profiler_session.cc:131] Profiler session initializing.
Mon Sep  6 09:04:19 2021[1,3]<stderr>:2021-09-06 09:04:19.114675: I tensorflow/core/profiler/lib/profiler_session.cc:146] Profiler session started.
Mon Sep  6 09:04:19 2021[1,3]<stderr>:2021-09-06 09:04:19.114721: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1614] Profiler found 1 GPUs
Mon Sep  6 09:04:19 2021[1,7]<stderr>:2021-09-06 09:04:19.130694: I tensorflow/core/profiler/li

Mon Sep  6 09:04:19 2021[1,0]<stdout>:Model: "model"
Mon Sep  6 09:04:19 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 09:04:19 2021[1,0]<stdout>:Layer (type)                    Output Shape         Param #     Connected to                     
Mon Sep  6 09:04:19 2021[1,0]<stdout>:c0 (InputLayer)                 [(None, 1)]          0           []                               
Mon Sep  6 09:04:19 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 09:04:19 2021[1,0]<stdout>:c1 (InputLayer)                 [(None, 1)]          0           []                               
Mon Sep  6 09:04:19 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 09:04:19 2021[1,0]<stdout>:c2 (InputLayer)                 [(None, 1)]          0           []                    

Mon Sep  6 09:04:19 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 09:04:19 2021[1,0]<stdout>:embedding_c6 (Embedding)        (None, 1, 8)         56840       ['c6[0][0]']                     
Mon Sep  6 09:04:19 2021[1,0]<stdout>:__________________________________________________________________________________________________




Mon Sep  6 09:04:19 2021[1,0]<stdout>:embedding_c7 (Embedding)        (None, 1, 8)         11056       ['c7[0][0]']                     
Mon Sep  6 09:04:19 2021[1,0]<stdout>:__________________________________________________________________________________________________


Mon Sep  6 09:04:19 2021[1,0]<stderr>:2021-09-06 09:04:19.311956: I tensorflow/core/profiler/lib/profiler_session.cc:131] Profiler session initializing.


Mon Sep  6 09:04:19 2021[1,0]<stdout>:embedding_c8 (Embedding)        (None, 1, 8)         504         ['c8[0][0]']                     


Mon Sep  6 09:04:19 2021[1,0]<stderr>:2021-09-06 09:04:19.311980: I tensorflow/core/profiler/lib/profiler_session.cc:146] Profiler session started.


Mon Sep  6 09:04:19 2021[1,0]<stdout>:__________________________________________________________________________________________________


Mon Sep  6 09:04:19 2021[1,0]<stderr>:2021-09-06 09:04:19.312035: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1614] Profiler found 1 GPUs


Mon Sep  6 09:04:19 2021[1,0]<stdout>:embedding_c9 (Embedding)        (None, 1, 8)         44432912    ['c9[0][0]']                     
Mon Sep  6 09:04:19 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 09:04:19 2021[1,0]<stdout>:embedding_c10 (Embedding)       (None, 1, 8)         4659752     ['c10[0][0]']                    
Mon Sep  6 09:04:19 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 09:04:19 2021[1,0]<stdout>:embedding_c11 (Embedding)       (None, 1, 8)         1966624     ['c11[0][0]']                    
Mon Sep  6 09:04:19 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 09:04:19 2021[1,0]<stdout>:embedding_c12 (Embedding)       (None, 1, 8)         88          ['c12[0][0]']                    
Mon Sep  6 09:04:19 2021[1,0]<stdout>:___

Mon Sep  6 09:04:19 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 09:04:19 2021[1,0]<stdout>:embeddings_flatten (Flatten)    (None, 192)          0           ['embeddings_concat[0][0]']      




Mon Sep  6 09:04:19 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 09:04:19 2021[1,0]<stdout>:one_hot_c5 (CategoryEncoding)   (None, 4)            0           ['c5[0][0]']                     




Mon Sep  6 09:04:19 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 09:04:19 2021[1,0]<stdout>:one_hot_c16 (CategoryEncoding)  (None, 4)            0           ['c16[0][0]']                    


Mon Sep  6 09:04:19 2021[1,10]<stderr>:2021-09-06 09:04:19.331721: I tensorflow/core/profiler/lib/profiler_session.cc:131] Profiler session initializing.


Mon Sep  6 09:04:19 2021[1,0]<stdout>:__________________________________________________________________________________________________


Mon Sep  6 09:04:19 2021[1,10]<stderr>:2021-09-06 09:04:19.331743: I tensorflow/core/profiler/lib/profiler_session.cc:146] Profiler session started.
Mon Sep  6 09:04:19 2021[1,10]<stderr>:2021-09-06 09:04:19.331792: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1614] Profiler found 1 GPUs


Mon Sep  6 09:04:19 2021[1,0]<stdout>:i0 (InputLayer)                 [(None, 1)]          0           []                               
Mon Sep  6 09:04:19 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 09:04:19 2021[1,0]<stdout>:i1 (InputLayer)                 [(None, 1)]          0           []                               
Mon Sep  6 09:04:19 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 09:04:19 2021[1,0]<stdout>:i2 (InputLayer)                 [(None, 1)]          0           []                               
Mon Sep  6 09:04:19 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 09:04:19 2021[1,0]<stdout>:i3 (InputLayer)                 [(None, 1)]          0           []                               
Mon Sep  6 09:04:19 2021[1,0]<stdout>:___



Mon Sep  6 09:04:19 2021[1,0]<stdout>:i12 (InputLayer)                [(None, 1)]          0           []                               




Mon Sep  6 09:04:19 2021[1,0]<stdout>:__________________________________________________________________________________________________


Mon Sep  6 09:04:19 2021[1,6]<stderr>:2021-09-06 09:04:19.341050: I tensorflow/core/profiler/lib/profiler_session.cc:131] Profiler session initializing.


Mon Sep  6 09:04:19 2021[1,0]<stdout>:inputs_concat (Concatenate)     (None, 213)          0           ['embeddings_flatten[0][0]',     


Mon Sep  6 09:04:19 2021[1,6]<stderr>:2021-09-06 09:04:19.341072: I tensorflow/core/profiler/lib/profiler_session.cc:146] Profiler session started.


Mon Sep  6 09:04:19 2021[1,0]<stdout>:                                                                  'one_hot_c5[0][0]',             


Mon Sep  6 09:04:19 2021[1,6]<stderr>:2021-09-06 09:04:19.341122: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1614] Profiler found 1 GPUs


Mon Sep  6 09:04:19 2021[1,0]<stdout>:                                                                  'one_hot_c16[0][0]',            
Mon Sep  6 09:04:19 2021[1,0]<stdout>:                                                                  'i0[0][0]',                     
Mon Sep  6 09:04:19 2021[1,0]<stdout>:                                                                  'i1[0][0]',                     
Mon Sep  6 09:04:19 2021[1,0]<stdout>:                                                                  'i2[0][0]',                     
Mon Sep  6 09:04:19 2021[1,0]<stdout>:                                                                  'i3[0][0]',                     
Mon Sep  6 09:04:19 2021[1,0]<stdout>:                                                                  'i4[0][0]',                     
Mon Sep  6 09:04:19 2021[1,0]<stdout>:                                                                  'i5[0][0]',                     
Mon Sep  6 09:04:19 2021[1,0]<stdout>:   

Mon Sep  6 09:04:19 2021[1,14]<stderr>:2021-09-06 09:04:19.369461: I tensorflow/core/profiler/lib/profiler_session.cc:131] Profiler session initializing.
Mon Sep  6 09:04:19 2021[1,14]<stderr>:2021-09-06 09:04:19.369483: I tensorflow/core/profiler/lib/profiler_session.cc:146] Profiler session started.
Mon Sep  6 09:04:19 2021[1,14]<stderr>:2021-09-06 09:04:19.369533: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1614] Profiler found 1 GPUs
Mon Sep  6 09:04:19 2021[1,2]<stderr>:2021-09-06 09:04:19.714971: I tensorflow/core/profiler/lib/profiler_session.cc:164] Profiler session tear down.
Mon Sep  6 09:04:19 2021[1,2]<stderr>:2021-09-06 09:04:19.719251: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1748] CUPTI activity buffer flushed
Mon Sep  6 09:04:19 2021[1,5]<stderr>:2021-09-06 09:04:19.726330: I tensorflow/core/profiler/lib/profiler_session.cc:164] Profiler session tear down.
Mon Sep  6 09:04:19 2021[1,5]<stderr>:2021-09-06 09:04:19.730619: I tensorflow/core/prof

Mon Sep  6 09:04:45 2021[1,0]<stdout>:dgx2h0194:79712:80584 [0] NCCL INFO Bootstrap : Using enp53s0:10.148.30.59<0>
Mon Sep  6 09:04:45 2021[1,0]<stdout>:dgx2h0194:79712:80584 [0] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so), using internal implementation
Mon Sep  6 09:04:46 2021[1,0]<stdout>:dgx2h0194:79712:80584 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/RoCE [1]mlx5_1:1/RoCE [2]mlx5_2:1/RoCE [3]mlx5_3:1/RoCE [4]mlx5_4:1/RoCE [5]mlx5_5:1/RoCE [6]mlx5_6:1/RoCE [7]mlx5_7:1/RoCE [8]mlx5_8:1/RoCE [9]mlx5_9:1/RoCE ; OOB enp53s0:10.148.30.59<0>
Mon Sep  6 09:04:46 2021[1,0]<stdout>:dgx2h0194:79712:80584 [0] NCCL INFO Using network IB
Mon Sep  6 09:04:46 2021[1,0]<stdout>:NCCL version 2.10.3+cuda11.0
Mon Sep  6 09:04:46 2021[1,10]<stdout>:dgx2h0194:79743:80602 [0] NCCL INFO Bootstrap : Using enp53s0:10.148.30.59<0>
Mon Sep  6 09:04:46 2021[1,10]<stdout>:dgx2h0194:79743:80602 [0] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so), using internal implementation
Mon Sep  6 09:04:

Mon Sep  6 09:04:46 2021[1,11]<stdout>:dgx2h0194:79744:80434 [0] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so), using internal implementation
Mon Sep  6 09:04:46 2021[1,12]<stdout>:dgx2h0194:79745:80741 [0] NCCL INFO Bootstrap : Using enp53s0:10.148.30.59<0>
Mon Sep  6 09:04:46 2021[1,12]<stdout>:dgx2h0194:79745:80741 [0] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so), using internal implementation
Mon Sep  6 09:04:46 2021[1,3]<stdout>:dgx2h0194:79727:80587 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/RoCE [1]mlx5_1:1/RoCE [2]mlx5_2:1/RoCE [3]mlx5_3:1/RoCE [4]mlx5_4:1/RoCE [5]mlx5_5:1/RoCE [6]mlx5_6:1/RoCE [7]mlx5_7:1/RoCE [8]mlx5_8:1/RoCE [9]mlx5_9:1/RoCE ; OOB enp53s0:10.148.30.59<0>
Mon Sep  6 09:04:46 2021[1,3]<stdout>:dgx2h0194:79727:80587 [0] NCCL INFO Using network IB
Mon Sep  6 09:04:46 2021[1,11]<stdout>:dgx2h0194:79744:80434 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/RoCE [1]mlx5_1:1/RoCE [2]mlx5_2:1/RoCE [3]mlx5_3:1/RoCE [4]mlx5_4:1/RoCE [5]mlx5_5:1/RoCE [6]mlx5

Mon Sep  6 09:04:57 2021[1,13]<stdout>:dgx2h0194:79746:80607 [0] NCCL INFO Trees [0] 1/-1/-1->13->9 [1] 1/-1/-1->13->9 [2] 1/-1/-1->13->9 [3] 1/-1/-1->13->9 [4] 1/-1/-1->13->9 [5] 1/-1/-1->13->9 [6] 1/-1/-1->13->9 [7] 1/-1/-1->13->9 [8] 1/-1/-1->13->9 [9] 1/-1/-1->13->9 [10] 1/-1/-1->13->9 [11] 1/-1/-1->13->9
Mon Sep  6 09:04:57 2021[1,13]<stdout>:dgx2h0194:79746:80607 [0] NCCL INFO Setting affinity for GPU 4 to ff,ffff0000,00ffffff
Mon Sep  6 09:04:57 2021[1,12]<stdout>:dgx2h0194:79745:80741 [0] NCCL INFO Trees [0] 6/-1/-1->12->11 [1] 6/-1/-1->12->11 [2] 6/-1/-1->12->11 [3] 6/-1/-1->12->11 [4] 6/-1/-1->12->11 [5] 6/-1/-1->12->11 [6] 6/-1/-1->12->11 [7] 6/-1/-1->12->11 [8] 6/-1/-1->12->11 [9] 6/-1/-1->12->11 [10] 6/-1/-1->12->11 [11] 6/-1/-1->12->11
Mon Sep  6 09:04:57 2021[1,12]<stdout>:dgx2h0194:79745:80741 [0] NCCL INFO Setting affinity for GPU 2 to ff,ffff0000,00ffffff
Mon Sep  6 09:04:57 2021[1,15]<stdout>:dgx2h0194:79748:80424 [0] NCCL INFO Trees [0] 4/-1/-1->15->3 [1] 4/-1/-1->1

Mon Sep  6 09:04:57 2021[1,9]<stdout>:dgx2h0194:79742:80744 [0] NCCL INFO Setting affinity for GPU 5 to ff,ffff0000,00ffffff
Mon Sep  6 09:04:57 2021[1,10]<stdout>:dgx2h0194:79743:80602 [0] NCCL INFO Trees [0] 9/-1/-1->10->6 [1] 9/-1/-1->10->6 [2] 9/-1/-1->10->6 [3] 9/-1/-1->10->6 [4] 9/-1/-1->10->6 [5] 9/-1/-1->10->6 [6] 9/-1/-1->10->6 [7] 9/-1/-1->10->6 [8] 9/-1/-1->10->6 [9] 9/-1/-1->10->6 [10] 9/-1/-1->10->6 [11] 9/-1/-1->10->6
Mon Sep  6 09:04:57 2021[1,10]<stdout>:dgx2h0194:79743:80602 [0] NCCL INFO Setting affinity for GPU 7 to ff,ffff0000,00ffffff
Mon Sep  6 09:04:57 2021[1,15]<stdout>:dgx2h0194:79748:80424 [0] NCCL INFO Channel 00 : 15[e5000] -> 4[e2000] via P2P/IPC
Mon Sep  6 09:04:57 2021[1,6]<stdout>:dgx2h0194:79739:80608 [0] NCCL INFO Channel 00 : 6[5c000] -> 10[5e000] via P2P/IPC
Mon Sep  6 09:04:57 2021[1,8]<stdout>:dgx2h0194:79741:80747 [0] NCCL INFO Channel 00 : 8[34000] -> 11[3b000] via P2P/IPC
Mon Sep  6 09:04:57 2021[1,9]<stdout>:dgx2h0194:79742:80744 [0] NCCL INFO 

Mon Sep  6 09:04:57 2021[1,15]<stdout>:dgx2h0194:79748:80424 [0] NCCL INFO Channel 09 : 15[e5000] -> 4[e2000] via P2P/IPC
Mon Sep  6 09:04:57 2021[1,6]<stdout>:dgx2h0194:79739:80608 [0] NCCL INFO Channel 09 : 6[5c000] -> 10[5e000] via P2P/IPC
Mon Sep  6 09:04:57 2021[1,9]<stdout>:dgx2h0194:79742:80744 [0] NCCL INFO Channel 09 : 9[59000] -> 13[57000] via P2P/IPC
Mon Sep  6 09:04:57 2021[1,8]<stdout>:dgx2h0194:79741:80747 [0] NCCL INFO Channel 09 : 8[34000] -> 11[3b000] via P2P/IPC
Mon Sep  6 09:04:57 2021[1,11]<stdout>:dgx2h0194:79744:80434 [0] NCCL INFO Channel 09 : 11[3b000] -> 12[39000] via P2P/IPC
Mon Sep  6 09:04:57 2021[1,14]<stdout>:dgx2h0194:79747:80615 [0] NCCL INFO Channel 09 : 14[e0000] -> 0[36000] via P2P/IPC
Mon Sep  6 09:04:57 2021[1,2]<stdout>:dgx2h0194:79722:80271 [0] NCCL INFO Channel 09 : 2[b7000] -> 7[b9000] via P2P/IPC
Mon Sep  6 09:04:57 2021[1,15]<stdout>:dgx2h0194:79748:80424 [0] NCCL INFO Channel 10 : 15[e5000] -> 4[e2000] via P2P/IPC
Mon Sep  6 09:04:57 2021[1,6

Mon Sep  6 09:04:58 2021[1,12]<stdout>:dgx2h0194:79745:80741 [0] NCCL INFO Channel 07 : 12[39000] -> 6[5c000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,10]<stdout>:dgx2h0194:79743:80602 [0] NCCL INFO Channel 04 : 10[5e000] -> 9[59000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,12]<stdout>:dgx2h0194:79745:80741 [0] NCCL INFO Channel 08 : 12[39000] -> 6[5c000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,10]<stdout>:dgx2h0194:79743:80602 [0] NCCL INFO Channel 05 : 10[5e000] -> 9[59000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,12]<stdout>:dgx2h0194:79745:80741 [0] NCCL INFO Channel 09 : 12[39000] -> 6[5c000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,10]<stdout>:dgx2h0194:79743:80602 [0] NCCL INFO Channel 06 : 10[5e000] -> 9[59000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,7]<stdout>:dgx2h0194:79740:80605 [0] NCCL INFO Channel 00 : 7[b9000] -> 3[e7000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,12]<stdout>:dgx2h0194:79745:80741 [0] NCCL INFO Channel 10 : 12[39000] -> 6[5c000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1

Mon Sep  6 09:04:58 2021[1,5]<stdout>:dgx2h0194:79738:80437 [0] NCCL INFO Channel 10 : 5[bc000] -> 2[b7000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,11]<stdout>:dgx2h0194:79744:80434 [0] NCCL INFO Channel 07 : 11[3b000] -> 8[34000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,6]<stdout>:dgx2h0194:79739:80608 [0] NCCL INFO Channel 01 : 6[5c000] -> 12[39000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,5]<stdout>:dgx2h0194:79738:80437 [0] NCCL INFO Channel 11 : 5[bc000] -> 2[b7000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,11]<stdout>:dgx2h0194:79744:80434 [0] NCCL INFO Channel 08 : 11[3b000] -> 8[34000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,6]<stdout>:dgx2h0194:79739:80608 [0] NCCL INFO Channel 02 : 6[5c000] -> 12[39000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,12]<stdout>:dgx2h0194:79745:80741 [0] NCCL INFO Connected all rings
Mon Sep  6 09:04:58 2021[1,11]<stdout>:dgx2h0194:79744:80434 [0] NCCL INFO Channel 09 : 11[3b000] -> 8[34000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,6]<stdout>:dgx2h0194:79739:806

Mon Sep  6 09:04:58 2021[1,13]<stdout>:dgx2h0194:79746:80607 [0] NCCL INFO Channel 08 : 13[57000] -> 9[59000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,1]<stdout>:dgx2h0194:79717:80738 [0] NCCL INFO Channel 11 : 1[be000] -> 13[57000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,15]<stdout>:dgx2h0194:79748:80424 [0] NCCL INFO Channel 11 : 15[e5000] -> 3[e7000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,14]<stdout>:dgx2h0194:79747:80615 [0] NCCL INFO Channel 02 : 14[e0000] -> 4[e2000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,12]<stdout>:dgx2h0194:79745:80741 [0] NCCL INFO Channel 11 : 12[39000] -> 11[3b000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,9]<stdout>:dgx2h0194:79742:80744 [0] NCCL INFO Connected all rings
Mon Sep  6 09:04:58 2021[1,13]<stdout>:dgx2h0194:79746:80607 [0] NCCL INFO Channel 09 : 13[57000] -> 9[59000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,14]<stdout>:dgx2h0194:79747:80615 [0] NCCL INFO Channel 03 : 14[e0000] -> 4[e2000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,13]<stdout>:dgx2h0194:797

Mon Sep  6 09:04:58 2021[1,10]<stdout>:dgx2h0194:79743:80602 [0] NCCL INFO Channel 02 : 10[5e000] -> 6[5c000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,4]<stdout>:dgx2h0194:79732:80427 [0] NCCL INFO Channel 06 : 4[e2000] -> 15[e5000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,10]<stdout>:dgx2h0194:79743:80602 [0] NCCL INFO Channel 03 : 10[5e000] -> 6[5c000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,5]<stdout>:dgx2h0194:79738:80437 [0] NCCL INFO Channel 00 : 5[bc000] -> 1[be000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,4]<stdout>:dgx2h0194:79732:80427 [0] NCCL INFO Channel 07 : 4[e2000] -> 15[e5000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,10]<stdout>:dgx2h0194:79743:80602 [0] NCCL INFO Channel 04 : 10[5e000] -> 6[5c000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,5]<stdout>:dgx2h0194:79738:80437 [0] NCCL INFO Channel 01 : 5[bc000] -> 1[be000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,4]<stdout>:dgx2h0194:79732:80427 [0] NCCL INFO Channel 08 : 4[e2000] -> 15[e5000] via P2P/IPC
Mon Sep  6 09:04:58 2021[1,10]<

Mon Sep  6 09:04:59 2021[1,1]<stdout>:dgx2h0194:79717:80738 [0] NCCL INFO 12 coll channels, 16 p2p channels, 16 p2p channels per peer
Mon Sep  6 09:04:59 2021[1,7]<stdout>:dgx2h0194:79740:80605 [0] NCCL INFO Channel 06 : 7[b9000] -> 2[b7000] via P2P/IPC
Mon Sep  6 09:04:59 2021[1,3]<stdout>:dgx2h0194:79727:80587 [0] NCCL INFO Connected all trees
Mon Sep  6 09:04:59 2021[1,3]<stdout>:dgx2h0194:79727:80587 [0] NCCL INFO threadThresholds 8/8/64 | 128/8/64 | 8/8/512
Mon Sep  6 09:04:59 2021[1,7]<stdout>:dgx2h0194:79740:80605 [0] NCCL INFO Channel 07 : 7[b9000] -> 2[b7000] via P2P/IPC
Mon Sep  6 09:04:59 2021[1,7]<stdout>:dgx2h0194:79740:80605 [0] NCCL INFO Channel 08 : 7[b9000] -> 2[b7000] via P2P/IPC
Mon Sep  6 09:04:59 2021[1,3]<stdout>:dgx2h0194:79727:80587 [0] NCCL INFO 12 coll channels, 16 p2p channels, 16 p2p channels per peer
Mon Sep  6 09:04:59 2021[1,7]<stdout>:dgx2h0194:79740:80605 [0] NCCL INFO Channel 09 : 7[b9000] -> 2[b7000] via P2P/IPC
Mon Sep  6 09:04:59 2021[1,7]<stdout>:d

Mon Sep  6 09:05:08 2021[1,3]<stderr>:2021-09-06 09:05:08.906524: I tensorflow/core/profiler/lib/profiler_session.cc:131] Profiler session initializing.
Mon Sep  6 09:05:08 2021[1,3]<stderr>:2021-09-06 09:05:08.906568: I tensorflow/core/profiler/lib/profiler_session.cc:146] Profiler session started.
Mon Sep  6 09:05:08 2021[1,9]<stderr>:2021-09-06 09:05:08.907713: I tensorflow/core/profiler/lib/profiler_session.cc:131] Profiler session initializing.
Mon Sep  6 09:05:08 2021[1,9]<stderr>:2021-09-06 09:05:08.907738: I tensorflow/core/profiler/lib/profiler_session.cc:146] Profiler session started.
Mon Sep  6 09:05:08 2021[1,11]<stderr>:2021-09-06 09:05:08.915421: I tensorflow/core/profiler/lib/profiler_session.cc:131] Profiler session initializing.
Mon Sep  6 09:05:08 2021[1,11]<stderr>:2021-09-06 09:05:08.915453: I tensorflow/core/profiler/lib/profiler_session.cc:146] Profiler session started.
Mon Sep  6 09:05:08 2021[1,1]<stderr>:2021-09-06 09:05:08.920370: I tensorflow/core/profiler/li

   1/4000 [..............................] - ETA: 31:27:06 - loss: 0.9112 - auc: 0.4892

Mon Sep  6 09:05:10 2021[1,11]<stderr>:2021-09-06 09:05:10.342218: I tensorflow/core/profiler/lib/profiler_session.cc:66] Profiler session collecting data.
Mon Sep  6 09:05:10 2021[1,8]<stderr>:2021-09-06 09:05:10.342267: I tensorflow/core/profiler/lib/profiler_session.cc:66] Profiler session collecting data.
Mon Sep  6 09:05:10 2021[1,1]<stderr>:2021-09-06 09:05:10.342284: I tensorflow/core/profiler/lib/profiler_session.cc:66] Profiler session collecting data.
Mon Sep  6 09:05:10 2021[1,14]<stderr>:2021-09-06 09:05:10.342547: I tensorflow/core/profiler/lib/profiler_session.cc:66] Profiler session collecting data.
Mon Sep  6 09:05:10 2021[1,0]<stderr>:2021-09-06 09:05:10.342667: I tensorflow/core/profiler/lib/profiler_session.cc:66] Profiler session collecting data.
Mon Sep  6 09:05:10 2021[1,11]<stderr>:2021-09-06 09:05:10.343662: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1748] CUPTI activity buffer flushed
Mon Sep  6 09:05:10 2021[1,1]<stderr>:2021-09-06 09:05:10.343704

Mon Sep  6 09:05:10 2021[1,5]<stderr>:
Mon Sep  6 09:05:10 2021[1,9]<stderr>:2021-09-06 09:05:10.468653: I tensorflow/core/profiler/rpc/client/save_profile.cc:136] Creating directory: tf_logs/plugins/profile/2021_09_06_09_05_10
Mon Sep  6 09:05:10 2021[1,9]<stderr>:
Mon Sep  6 09:05:10 2021[1,6]<stderr>:2021-09-06 09:05:10.469858: I tensorflow/core/profiler/rpc/client/save_profile.cc:136] Creating directory: tf_logs/plugins/profile/2021_09_06_09_05_10
Mon Sep  6 09:05:10 2021[1,6]<stderr>:
Mon Sep  6 09:05:10 2021[1,15]<stderr>:2021-09-06 09:05:10.471366: I tensorflow/core/profiler/rpc/client/save_profile.cc:136] Creating directory: tf_logs/plugins/profile/2021_09_06_09_05_10
Mon Sep  6 09:05:10 2021[1,15]<stderr>:
Mon Sep  6 09:05:10 2021[1,4]<stderr>:2021-09-06 09:05:10.477411: I tensorflow/core/profiler/rpc/client/save_profile.cc:136] Creating directory: tf_logs/plugins/profile/2021_09_06_09_05_10
Mon Sep  6 09:05:10 2021[1,4]<stderr>:
Mon Sep  6 09:05:10 2021[1,14]<stderr>:2021-09-

Mon Sep  6 09:05:10 2021[1,2]<stderr>:Dumped tool data for overview_page.pb to tf_logs/plugins/profile/2021_09_06_09_05_10/dgx2h0194.overview_page.pb
Mon Sep  6 09:05:10 2021[1,2]<stderr>:Dumped tool data for input_pipeline.pb to tf_logs/plugins/profile/2021_09_06_09_05_10/dgx2h0194.input_pipeline.pb
Mon Sep  6 09:05:10 2021[1,2]<stderr>:Dumped tool data for tensorflow_stats.pb to tf_logs/plugins/profile/2021_09_06_09_05_10/dgx2h0194.tensorflow_stats.pb
Mon Sep  6 09:05:10 2021[1,2]<stderr>:Dumped tool data for kernel_stats.pb to tf_logs/plugins/profile/2021_09_06_09_05_10/dgx2h0194.kernel_stats.pb
Mon Sep  6 09:05:10 2021[1,2]<stderr>:
Mon Sep  6 09:05:10 2021[1,13]<stderr>:2021-09-06 09:05:10.555408: I tensorflow/core/profiler/rpc/client/capture_profile.cc:251] Creating directory: tf_logs/plugins/profile/2021_09_06_09_05_10
Mon Sep  6 09:05:10 2021[1,13]<stderr>:Dumped tool data for xplane.pb to tf_logs/plugins/profile/2021_09_06_09_05_10/dgx2h0194.xplane.pb
Mon Sep  6 09:05:10 2021[

Mon Sep  6 09:05:10 2021[1,11]<stderr>:2021-09-06 09:05:10.585794: I tensorflow/core/profiler/rpc/client/save_profile.cc:136] Creating directory: tf_logs/plugins/profile/2021_09_06_09_05_10
Mon Sep  6 09:05:10 2021[1,11]<stderr>:
Mon Sep  6 09:05:10 2021[1,15]<stderr>:2021-09-06 09:05:10.586581: I tensorflow/core/profiler/rpc/client/save_profile.cc:136] Creating directory: tf_logs/plugins/profile/2021_09_06_09_05_10
Mon Sep  6 09:05:10 2021[1,15]<stderr>:
Mon Sep  6 09:05:10 2021[1,3]<stderr>:2021-09-06 09:05:10.587215: I tensorflow/core/profiler/rpc/client/save_profile.cc:142] Dumped gzipped tool data for memory_profile.json.gz to tf_logs/plugins/profile/2021_09_06_09_05_10/dgx2h0194.memory_profile.json.gz
Mon Sep  6 09:05:10 2021[1,12]<stderr>:2021-09-06 09:05:10.589198: I tensorflow/core/profiler/rpc/client/capture_profile.cc:251] Creating directory: tf_logs/plugins/profile/2021_09_06_09_05_10
Mon Sep  6 09:05:10 2021[1,12]<stderr>:Dumped tool data for xplane.pb to tf_logs/plugins/p

Mon Sep  6 09:05:10 2021[1,8]<stderr>:
Mon Sep  6 09:05:10 2021[1,8]<stderr>:2021-09-06 09:05:10.609086: I tensorflow/core/profiler/rpc/client/save_profile.cc:142] Dumped gzipped tool data for memory_profile.json.gz to tf_logs/plugins/profile/2021_09_06_09_05_10/dgx2h0194.memory_profile.json.gz
Mon Sep  6 09:05:10 2021[1,10]<stderr>:2021-09-06 09:05:10.609124: I tensorflow/core/profiler/rpc/client/save_profile.cc:142] Dumped gzipped tool data for memory_profile.json.gz to tf_logs/plugins/profile/2021_09_06_09_05_10/dgx2h0194.memory_profile.json.gz
Mon Sep  6 09:05:10 2021[1,1]<stderr>:2021-09-06 09:05:10.609444: I tensorflow/core/profiler/rpc/client/save_profile.cc:142] Dumped gzipped tool data for memory_profile.json.gz to tf_logs/plugins/profile/2021_09_06_09_05_10/dgx2h0194.memory_profile.json.gz
Mon Sep  6 09:05:10 2021[1,8]<stderr>:2021-09-06 09:05:10.614234: I tensorflow/core/profiler/rpc/client/capture_profile.cc:251] Creating directory: tf_logs/plugins/profile/2021_09_06_09_05_

   5/4000 [..............................] - ETA: 3:15:09 - loss: 0.8028 - auc: 0.4982



 212/4000 [>.............................] - ETA: 15:58 - loss: 0.1657 - auc: 0.6865

[Stage 11:>                                                       (0 + 16) / 16]

 513/4000 [==>...........................] - ETA: 12:52 - loss: 0.1432 - auc: 0.7375

[Stage 11:>                                                       (0 + 16) / 16]

 799/4000 [====>.........................] - ETA: 11:25 - loss: 0.1371 - auc: 0.7529


Best Loss: 0.127242
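
When Keras progress output passes through `mpirun --tag-output --timestamp-output`, rank tags and timestamps can land in the middle of a progress line. The following is a minimal sketch for pulling the step, ETA, loss and AUC out of such a raw line. The sample string is copied from the raw output of this run; `parse_progress` and the two regular expressions are hypothetical helpers, not part of the notebook.

```python
import re

# Step counter at the start of a Keras progress line, e.g. " 212/4000 [".
PROGRESS = re.compile(r'^\s*(\d+)/(\d+) \[')
# Metrics section; anything fused after the AUC value is ignored.
METRICS = re.compile(r'ETA: ([\d:]+) - loss: (\d+\.\d+) - auc: (\d+\.\d+)')


def parse_progress(line):
    """Return step/total/eta/loss/auc for a progress line, or None otherwise."""
    step = PROGRESS.search(line)
    metrics = METRICS.search(line)
    if not (step and metrics):
        return None
    return {
        'step': int(step.group(1)),
        'total': int(step.group(2)),
        'eta': metrics.group(1),
        'loss': float(metrics.group(2)),
        'auc': float(metrics.group(3)),
    }


raw = (' 212/4000 [>.............................]'
       'Mon Sep  6 09:05:52 2021[1,0]<stdout>: - ETA: 15:58 - loss: 0.1657 '
       '- auc: 0.6865Mon Sep  6 09:05:52 2021[1,0]<stdout')
print(parse_progress(raw))
```

Lines that are not progress lines, such as the NCCL or profiler messages, make `parse_progress` return `None`, so the helper can be applied to every line of the captured output.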


## Use Petastorm

Here we set `--dataloader` to `petastorm` so that training uses the Petastorm data loader.
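
As a minimal sketch of how this switch is parsed, the snippet below isolates the `--dataloader` flag. The constant values and the `add_argument` call mirror the notebook; `parse_dataloader` is a hypothetical helper added only for illustration.

```python
import argparse

# Same values as the constants defined in the "set some macros" cell.
PETASTORM_DATALOADER = 'petastorm'
NVTABULAR_DATALOADER = 'nvtabular'


def parse_dataloader(argv):
    """Parse only the --dataloader flag from an explicit argument list."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--dataloader', default=PETASTORM_DATALOADER,
                        choices=[PETASTORM_DATALOADER, NVTABULAR_DATALOADER],
                        help='dataloader to use')
    # Passing args=... keeps argparse from reading the notebook kernel's sys.argv.
    return parser.parse_args(args=argv).dataloader


print(parse_dataloader(['--dataloader', 'petastorm']))
print(parse_dataloader(['--dataloader', 'nvtabular']))
```

Because `choices` is set, any other value makes argparse exit with a usage error instead of silently falling back to a default.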

In [9]:
def main_petastorm():
    """Train on the Criteo Parquet data using the Petastorm data loader."""
    parser = argparse.ArgumentParser(description='Criteo Spark Keras Training Example',
                                     formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser.add_argument('--data-dir', default='file:///opt/data/criteo/parquet',
                        help='location of the transformed Criteo dataset in Parquet format')
    parser.add_argument('--logs-dir', default='/opt/experiments/criteo', help='location of TensorFlow logs')
    parser.add_argument('--dataloader', default=PETASTORM_DATALOADER,
                        choices=[PETASTORM_DATALOADER, NVTABULAR_DATALOADER],
                        help='dataloader to use')
    parser.add_argument('--num-proc', type=int, default=1, help='number of worker processes for training')
    parser.add_argument('--learning-rate', type=float, default=0.0001, help='initial learning rate')
    parser.add_argument('--batch-size', type=int, default=64 * 1024, help='batch size')
    parser.add_argument('--epochs', type=int, default=3, help='number of epochs to train')
    parser.add_argument('--local-checkpoint-file', default='checkpoint', help='model checkpoint')
    # Pass the arguments explicitly: inside a notebook, sys.argv belongs to the kernel.
    args = parser.parse_args(args=['--num-proc', '16', '--data-dir', 'file:///raid/spark-team/criteo/parquet',
                                   '--dataloader', 'petastorm', '--learning-rate', '0.001',
                                   '--batch-size', '65535', '--epochs', '1', '--logs-dir', 'tf_logs',
                                   '--local-checkpoint-file', 'ckpt_file'])

    # Per-column category dimensions for the categorical features.
    dimensions = get_category_dimensions(spark, args.data_dir)

    # Count the rows of each split; the train and validation counts are passed to train().
    train_df = spark.read.parquet(f'{args.data_dir}/train')
    val_df = spark.read.parquet(f'{args.data_dir}/val')
    test_df = spark.read.parquet(f'{args.data_dir}/test')
    train_rows, val_rows, test_rows = train_df.count(), val_df.count(), test_df.count()
    print('Training: %d' % train_rows)
    print('Validation: %d' % val_rows)
    print('Test: %d' % test_rows)

    train(dimensions, train_rows, val_rows, args)

    # Stop the Spark session once training has finished.
    spark.stop()

In [10]:
main_petastorm()

21/09/06 11:44:44 WARN package: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'.
                                                                                

{'c0': '7912888',
 'c1': '33822',
 'c10': '582468',
 'c11': '245827',
 'c12': '10',
 'c13': '2208',
 'c14': '10666',
 'c15': '103',
 'c16': '3',
 'c17': '967',
 'c18': '14',
 'c19': '8165895',
 'c2': '17138',
 'c20': '2675939',
 'c21': '7156452',
 'c22': '302515',
 'c23': '12021',
 'c24': '96',
 'c25': '34',
 'c3': '7338',
 'c4': '20045',
 'c5': '3',
 'c6': '7104',
 'c7': '1381',
 'c8': '62',
 'c9': '5554113'}
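
The dimensions printed above line up with the `Param #` column of the embedding layers in the model summary printed later in this output. A small worked check, assuming each embedding table has `dimension + 1` rows and a width of 8 (the width is read off the `(None, 1, 8)` output shape). This relationship is inferred from the printed numbers, not taken from the model code, and `embedding_params` is a hypothetical helper.

```python
EMBEDDING_WIDTH = 8  # from the "(None, 1, 8)" output shape in the model summary

# A few entries copied from the dictionary above (the values are strings there).
dimensions = {'c0': '7912888', 'c1': '33822', 'c2': '17138', 'c12': '10'}


def embedding_params(dimension, width=EMBEDDING_WIDTH):
    """Parameter count of an embedding table with (dimension + 1) rows (assumption)."""
    return (int(dimension) + 1) * width


for name, dim in dimensions.items():
    print(f'embedding_{name}: {embedding_params(dim)}')
# embedding_c0: 63303112
# embedding_c1: 270584
# embedding_c2: 137112
# embedding_c12: 88
```

These four values match the `embedding_c0`, `embedding_c1`, `embedding_c2` and `embedding_c12` rows of the summary, which also shows that the few very high-cardinality columns account for most of the model's parameters.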


                                                                                

Training: 4195197692
Validation: 89137318
Test: 89137319
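
As a rough sanity check on these counts, the sketch below estimates the number of training steps per epoch from the arguments passed in `main_petastorm()`. It assumes each of the 16 workers consumes one batch per step and that the step count is obtained by floor division; the training script's actual formula is not shown in this section, so treat this as an estimate.

```python
train_rows = 4195197692  # "Training:" count printed above
batch_size = 65535       # --batch-size passed in main_petastorm()
num_proc = 16            # --num-proc passed in main_petastorm()

# Rows consumed per step across all workers (assumption: one batch per worker).
global_batch = batch_size * num_proc

# Assumption: floor division, so the final partial batch is dropped.
steps_per_epoch = train_rows // global_batch

print(global_batch)     # 1048560
print(steps_per_epoch)  # 4000
```

The result agrees with the `/4000` total shown in the Keras progress bar of the earlier run in this notebook.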


[Stage 11:>                                                       (0 + 16) / 16]

Checking whether extension tensorflow was built with MPI.
Extension tensorflow was built with MPI.
mpirun --allow-run-as-root --tag-output -np 16 -H dgx2h0194-a1adff968d508e8d1142986f3e2c42dc:16 -bind-to none -map-by slot -mca pml ob1 -mca btl ^openib --timestamp-output      -mca btl_tcp_if_include enp134s0f0 -x NCCL_IB_GID_INDEX=3 -x NCCL_DEBUG=INFO -mca plm_rsh_agent "/home/ngc-auth-ldap-allxu/miniconda3/bin/python -m horovod.spark.driver.mpirun_rsh gAWVcAEAAAAAAAB9lCiMAmxvlF2UjAkxMjcuMC4wLjGUTU5+hpRhjAdlbnA1M3MwlF2UjAwxMC4xNDguMzAuNTmUTU5+hpRhjAdlbnA1OHMwlF2UjAwxMC4xNDguOTQuNTmUTU5+hpRhjAdlbnA4OHMwlF2UjAwxMC4xNDkuMzAuMzSUTU5+hpRhjAdlbnA5M3MwlF2UjAwxMC4xNDkuOTQuNTeUTU5+hpRhjAplbnAxMzRzMGYwlF2UjAsxMC4xNTAuMzAuMpRNTn6GlGGMCGVucDE4NHMwlF2UjA0xMC4xNDguMTU4LjU5lE1OfoaUYYwIZW5wMTg5czCUXZSMDTEwLjE0OC4yMjIuNTmUTU5+hpRhjAhlbnAyMjVzMJRdlIwNMTAuMTQ5LjE1OC41N5RNTn6GlGGMCGVucDIzMHMwlF2UjA0xMC4xNDkuMjIyLjU3lE1OfoaUYYwHZG9ja2VyMJRdlIwKMTcyLjE3LjAuMZRNTn6GlGF1Lg== gAWVAwMAAAAAAACMI2hvcm92b2QucnVubmV

Mon Sep  6 11:44:58 2021[1,12]<stdout>:Changing cwd from /home/ngc-auth-ldap-allxu to /raid/spark-team/allen-dlrm/spark-3.1.2-bin-hadoop3.2/work/app-20210906114426-0004/1
Mon Sep  6 11:44:58 2021[1,13]<stdout>:Changing cwd from /home/ngc-auth-ldap-allxu to /raid/spark-team/allen-dlrm/spark-3.1.2-bin-hadoop3.2/work/app-20210906114426-0004/1


Mon Sep  6 11:45:19 2021[1,14]<stderr>:2021-09-06 11:45:19.644980: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
Mon Sep  6 11:45:19 2021[1,14]<stderr>:To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Mon Sep  6 11:45:19 2021[1,5]<stderr>:2021-09-06 11:45:19.658879: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
Mon Sep  6 11:45:19 2021[1,5]<stderr>:To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Mon Sep  6 11:45:19 2021[1,2]<stderr>:2021-09-06 11:45:19.677330: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is o

Mon Sep  6 11:45:23 2021[1,7]<stderr>:2021-09-06 11:45:23.107462: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1504] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 30964 MB memory:  -> device: 0, name: Tesla V100-SXM3-32GB-H, pci bus id: 0000:5c:00.0, compute capability: 7.0
Mon Sep  6 11:45:23 2021[1,6]<stderr>:2021-09-06 11:45:23.108097: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1504] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 30964 MB memory:  -> device: 0, name: Tesla V100-SXM3-32GB-H, pci bus id: 0000:b7:00.0, compute capability: 7.0
Mon Sep  6 11:45:23 2021[1,10]<stderr>:2021-09-06 11:45:23.112989: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1504] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 30964 MB memory:  -> device: 0, name: Tesla V100-SXM3-32GB-H, pci bus id: 0000:e0:00.0, compute capability: 7.0
Mon Sep  6 11:45:23 2021[1,3]<stderr>:2021-09-06 11:45:23.203766: I tensorflow/core/common_runtime/

Mon Sep  6 11:45:23 2021[1,10]<stderr>:2021-09-06 11:45:23.941176: I tensorflow/core/profiler/lib/profiler_session.cc:131] Profiler session initializing.
Mon Sep  6 11:45:23 2021[1,10]<stderr>:2021-09-06 11:45:23.941195: I tensorflow/core/profiler/lib/profiler_session.cc:146] Profiler session started.
Mon Sep  6 11:45:23 2021[1,10]<stderr>:2021-09-06 11:45:23.941240: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1614] Profiler found 1 GPUs
Mon Sep  6 11:45:23 2021[1,5]<stderr>:2021-09-06 11:45:23.950215: I tensorflow/core/profiler/lib/profiler_session.cc:164] Profiler session tear down.
Mon Sep  6 11:45:23 2021[1,5]<stderr>:2021-09-06 11:45:23.952931: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1748] CUPTI activity buffer flushed
Mon Sep  6 11:45:23 2021[1,6]<stderr>:2021-09-06 11:45:23.984430: I tensorflow/core/profiler/lib/profiler_session.cc:131] Profiler session initializing.
Mon Sep  6 11:45:23 2021[1,6]<stderr>:2021-09-06 11:45:23.984453: I tensorflow/core/p

Mon Sep  6 11:45:24 2021[1,0]<stdout>:Model: "model"
Mon Sep  6 11:45:24 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 11:45:24 2021[1,0]<stdout>:Layer (type)                    Output Shape         Param #     Connected to                     
Mon Sep  6 11:45:24 2021[1,0]<stdout>:c0 (InputLayer)                 [(None, 1)]          0           []                               
Mon Sep  6 11:45:24 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 11:45:24 2021[1,0]<stdout>:c1 (InputLayer)                 [(None, 1)]          0           []                               
Mon Sep  6 11:45:24 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 11:45:24 2021[1,0]<stdout>:c2 (InputLayer)                 [(None, 1)]          0           []                    



Mon Sep  6 11:45:24 2021[1,0]<stdout>:c25 (InputLayer)                [(None, 1)]          0           []                               
Mon Sep  6 11:45:24 2021[1,0]<stdout>:__________________________________________________________________________________________________


Mon Sep  6 11:45:24 2021[1,1]<stderr>:2021-09-06 11:45:24.144886: I tensorflow/core/profiler/lib/profiler_session.cc:131] Profiler session initializing.


Mon Sep  6 11:45:24 2021[1,0]<stdout>:embedding_c0 (Embedding)        (None, 1, 8)         63303112    ['c0[0][0]']                     


Mon Sep  6 11:45:24 2021[1,1]<stderr>:2021-09-06 11:45:24.144909: I tensorflow/core/profiler/lib/profiler_session.cc:146] Profiler session started.
Mon Sep  6 11:45:24 2021[1,1]<stderr>:2021-09-06 11:45:24.144956: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1614] Profiler found 1 GPUs


Mon Sep  6 11:45:24 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 11:45:24 2021[1,0]<stdout>:embedding_c1 (Embedding)        (None, 1, 8)         270584      ['c1[0][0]']                     
Mon Sep  6 11:45:24 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 11:45:24 2021[1,0]<stdout>:embedding_c2 (Embedding)        (None, 1, 8)         137112      ['c2[0][0]']                     
Mon Sep  6 11:45:24 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 11:45:24 2021[1,0]<stdout>:embedding_c3 (Embedding)        (None, 1, 8)         58712       ['c3[0][0]']                     
Mon Sep  6 11:45:24 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 11:45:24 2021[1,0]<stdout>:emb



Mon Sep  6 11:45:24 2021[1,0]<stdout>:__________________________________________________________________________________________________


Mon Sep  6 11:45:24 2021[1,0]<stderr>:2021-09-06 11:45:24.151145: I tensorflow/core/profiler/lib/profiler_session.cc:131] Profiler session initializing.


Mon Sep  6 11:45:24 2021[1,0]<stdout>:embedding_c11 (Embedding)       (None, 1, 8)         1966624     ['c11[0][0]']                    


Mon Sep  6 11:45:24 2021[1,0]<stderr>:2021-09-06 11:45:24.151167: I tensorflow/core/profiler/lib/profiler_session.cc:146] Profiler session started.


Mon Sep  6 11:45:24 2021[1,0]<stdout>:__________________________________________________________________________________________________


Mon Sep  6 11:45:24 2021[1,0]<stderr>:2021-09-06 11:45:24.151215: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1614] Profiler found 1 GPUs


Mon Sep  6 11:45:24 2021[1,0]<stdout>:embedding_c12 (Embedding)       (None, 1, 8)         88          ['c12[0][0]']                    
Mon Sep  6 11:45:24 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 11:45:24 2021[1,0]<stdout>:embedding_c13 (Embedding)       (None, 1, 8)         17672       ['c13[0][0]']                    
Mon Sep  6 11:45:24 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 11:45:24 2021[1,0]<stdout>:embedding_c14 (Embedding)       (None, 1, 8)         85336       ['c14[0][0]']                    
Mon Sep  6 11:45:24 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 11:45:24 2021[1,0]<stdout>:embedding_c15 (Embedding)       (None, 1, 8)         832         ['c15[0][0]']                    
Mon Sep  6 11:45:24 2021[1,0]<stdout>:___

Mon Sep  6 11:45:24 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 11:45:24 2021[1,0]<stdout>:i0 (InputLayer)                 [(None, 1)]          0           []                               
Mon Sep  6 11:45:24 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 11:45:24 2021[1,0]<stdout>:i1 (InputLayer)                 [(None, 1)]          0           []                               
Mon Sep  6 11:45:24 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 11:45:24 2021[1,0]<stdout>:i2 (InputLayer)                 [(None, 1)]          0           []                               
Mon Sep  6 11:45:24 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 11:45:24 2021[1,0]<stdout>:i3 

Mon Sep  6 11:45:24 2021[1,0]<stdout>:rmalization)                                                                                      
Mon Sep  6 11:45:24 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 11:45:24 2021[1,0]<stdout>:dense_3 (Dense)                 (None, 32)           2080        ['batch_normalization_3[0][0]']  
Mon Sep  6 11:45:24 2021[1,0]<stdout>:__________________________________________________________________________________________________
Mon Sep  6 11:45:24 2021[1,0]<stdout>:output (Dense)                  (None, 1)            33          ['dense_3[0][0]']                
Mon Sep  6 11:45:24 2021[1,0]<stdout>:Total params: 261,698,789
Mon Sep  6 11:45:24 2021[1,0]<stdout>:Trainable params: 261,697,979
Mon Sep  6 11:45:24 2021[1,0]<stdout>:Non-trainable params: 810
Mon Sep  6 11:45:24 2021[1,0]<stdout>:_________________________________________________________________________________

Mon Sep  6 11:45:24 2021[1,12]<stderr>:2021-09-06 11:45:24.193332: I tensorflow/core/profiler/lib/profiler_session.cc:131] Profiler session initializing.
Mon Sep  6 11:45:24 2021[1,12]<stderr>:2021-09-06 11:45:24.193353: I tensorflow/core/profiler/lib/profiler_session.cc:146] Profiler session started.
Mon Sep  6 11:45:24 2021[1,12]<stderr>:2021-09-06 11:45:24.193402: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1614] Profiler found 1 GPUs
Mon Sep  6 11:45:24 2021[1,14]<stderr>:2021-09-06 11:45:24.238850: I tensorflow/core/profiler/lib/profiler_session.cc:164] Profiler session tear down.
Mon Sep  6 11:45:24 2021[1,14]<stderr>:2021-09-06 11:45:24.241542: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1748] CUPTI activity buffer flushed
Mon Sep  6 11:45:24 2021[1,9]<stderr>:2021-09-06 11:45:24.250555: I tensorflow/core/profiler/lib/profiler_session.cc:131] Profiler session initializing.
Mon Sep  6 11:45:24 2021[1,9]<stderr>:2021-09-06 11:45:24.250576: I tensorflow/core

Mon Sep  6 11:45:26 2021[1,13]<stderr>:Cause: could not parse the source code of <function train_fn.<locals>.<lambda> at 0x7f3cc12349d0>: no matching AST found
Mon Sep  6 11:45:26 2021[1,14]<stderr>:Cause: could not parse the source code of <function train_fn.<locals>.<lambda> at 0x7f7bf4174040>: no matching AST found
Mon Sep  6 11:45:26 2021[1,2]<stderr>:Cause: could not parse the source code of <function train_fn.<locals>.<lambda> at 0x7f53c4044160>: no matching AST found
Mon Sep  6 11:45:27 2021[1,11]<stderr>:Cause: could not parse the source code of <function train_fn.<locals>.<lambda> at 0x7f27cc054280>: no matching AST found
Mon Sep  6 11:45:27 2021[1,13]<stderr>:Cause: could not parse the source code of <function train_fn.<locals>.<lambda> at 0x7f3c04103280>: no matching AST found
Mon Sep  6 11:45:27 2021[1,7]<stderr>:Cause: could not parse the source code of <function train_fn.<locals>.<lambda> at 0x7fdbaa897550>: no matching AST found
Mon Sep  6 11:45:27 2021[1,6]<stderr>:Caus

Mon Sep  6 11:45:28 2021[1,12]<stderr>:Cause: could not parse the source code of <function train_fn.<locals>.<lambda> at 0x7fce00127ca0>: no matching AST found
Mon Sep  6 11:45:28 2021[1,8]<stderr>:Cause: could not parse the source code of <function train_fn.<locals>.<lambda> at 0x7fd7b01be430>: no matching AST found
Mon Sep  6 11:45:28 2021[1,4]<stderr>:Cause: could not parse the source code of <function train_fn.<locals>.<lambda> at 0x7f2a840e49d0>: no matching AST found
Mon Sep  6 11:45:28 2021[1,3]<stderr>:Cause: could not parse the source code of <function train_fn.<locals>.<lambda> at 0x7f3254018280>: no matching AST found
Mon Sep  6 11:45:28 2021[1,1]<stderr>:Cause: could not parse the source code of <function train_fn.<locals>.<lambda> at 0x7fd91404c280>: no matching AST found
Mon Sep  6 11:45:28 2021[1,12]<stderr>:Cause: could not parse the source code of <function train_fn.<locals>.<lambda> at 0x7fcdac026280>: no matching AST found
Mon Sep  6 11:45:28 2021[1,15]<stderr>:Cause

Mon Sep  6 11:45:37 2021[1,3]<stderr>:  column_as_pandas = column.data.chunks[0].to_pandas()
Mon Sep  6 11:45:37 2021[1,1]<stderr>:  column_as_pandas = column.data.chunks[0].to_pandas()
Mon Sep  6 11:45:37 2021[1,12]<stderr>:  column_as_pandas = column.data.chunks[0].to_pandas()
Mon Sep  6 11:45:37 2021[1,15]<stderr>:  column_as_pandas = column.data.chunks[0].to_pandas()
Mon Sep  6 11:45:37 2021[1,9]<stderr>:  column_as_pandas = column.data.chunks[0].to_pandas()
Mon Sep  6 11:45:46 2021[1,14]<stderr>:2021-09-06 11:45:46.163581: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:338] Filling up shuffle buffer (this may take a while): 633106 of 655350
Mon Sep  6 11:45:46 2021[1,2]<stderr>:2021-09-06 11:45:46.173808: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:338] Filling up shuffle buffer (this may take a while): 625961 of 655350
Mon Sep  6 11:45:46 2021[1,11]<stderr>:2021-09-06 11:45:46.470637: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:338] Filling up shuffle buff

Mon Sep  6 11:45:53 2021[1,0]<stdout>:dgx2h0194:41886:42730 [0] NCCL INFO Bootstrap : Using enp53s0:10.148.30.59<0>
Mon Sep  6 11:45:53 2021[1,0]<stdout>:dgx2h0194:41886:42730 [0] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so), using internal implementation
Mon Sep  6 11:45:53 2021[1,0]<stdout>:dgx2h0194:41886:42730 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/RoCE [1]mlx5_1:1/RoCE [2]mlx5_2:1/RoCE [3]mlx5_3:1/RoCE [4]mlx5_4:1/RoCE [5]mlx5_5:1/RoCE [6]mlx5_6:1/RoCE [7]mlx5_7:1/RoCE [8]mlx5_8:1/RoCE [9]mlx5_9:1/RoCE ; OOB enp53s0:10.148.30.59<0>
Mon Sep  6 11:45:53 2021[1,0]<stdout>:dgx2h0194:41886:42730 [0] NCCL INFO Using network IB
Mon Sep  6 11:45:53 2021[1,0]<stdout>:NCCL version 2.10.3+cuda11.0
Mon Sep  6 11:45:53 2021[1,6]<stdout>:dgx2h0194:41892:42733 [0] NCCL INFO Bootstrap : Using enp53s0:10.148.30.59<0>
Mon Sep  6 11:45:53 2021[1,11]<stdout>:dgx2h0194:41897:42858 [0] NCCL INFO Bootstrap : Using enp53s0:10.148.30.59<0>
Mon Sep  6 11:45:53 2021[1,6]<stdout>:dgx2h0194:41892

Mon Sep  6 11:45:53 2021[1,10]<stdout>:dgx2h0194:41896:42859 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/RoCE [1]mlx5_1:1/RoCE [2]mlx5_2:1/RoCE [3]mlx5_3:1/RoCE [4]mlx5_4:1/RoCE [5]mlx5_5:1/RoCE [6]mlx5_6:1/RoCE [7]mlx5_7:1/RoCE [8]mlx5_8:1/RoCE [9]mlx5_9:1/RoCE ; OOB enp53s0:10.148.30.59<0>
Mon Sep  6 11:45:53 2021[1,10]<stdout>:dgx2h0194:41896:42859 [0] NCCL INFO Using network IB
Mon Sep  6 11:45:53 2021[1,15]<stdout>:dgx2h0194:41901:42857 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/RoCE [1]mlx5_1:1/RoCE [2]mlx5_2:1/RoCE [3]mlx5_3:1/RoCE [4]mlx5_4:1/RoCE [5]mlx5_5:1/RoCE [6]mlx5_6:1/RoCE [7]mlx5_7:1/RoCE [8]mlx5_8:1/RoCE [9]mlx5_9:1/RoCE ; OOB enp53s0:10.148.30.59<0>
Mon Sep  6 11:45:53 2021[1,15]<stdout>:dgx2h0194:41901:42857 [0] NCCL INFO Using network IB
Mon Sep  6 11:45:53 2021[1,2]<stdout>:dgx2h0194:41888:42361 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/RoCE [1]mlx5_1:1/RoCE [2]mlx5_2:1/RoCE [3]mlx5_3:1/RoCE [4]mlx5_4:1/RoCE [5]mlx5_5:1/RoCE [6]mlx5_6:1/RoCE [7]mlx5_7:1/RoCE [8]mlx5_8:1

Mon Sep  6 11:45:58 2021[1,0]<stdout>:dgx2h0194:41886:42730 [0] NCCL INFO Channel 10/12 :    0  13   1  14   5  12   7  11   2   9   3  10   4   8   6  15
Mon Sep  6 11:45:58 2021[1,0]<stdout>:dgx2h0194:41886:42730 [0] NCCL INFO Channel 11/12 :    0  13   1  14   5  12   7  11   2   9   3  10   4   8   6  15
Mon Sep  6 11:45:58 2021[1,0]<stdout>:dgx2h0194:41886:42730 [0] NCCL INFO Trees [0] 13/-1/-1->0->-1 [1] 13/-1/-1->0->-1 [2] 13/-1/-1->0->-1 [3] 13/-1/-1->0->-1 [4] 13/-1/-1->0->-1 [5] 13/-1/-1->0->-1 [6] 13/-1/-1->0->-1 [7] 13/-1/-1->0->-1 [8] 13/-1/-1->0->-1 [9] 13/-1/-1->0->-1 [10] 13/-1/-1->0->-1 [11] 13/-1/-1->0->-1
Mon Sep  6 11:45:58 2021[1,0]<stdout>:dgx2h0194:41886:42730 [0] NCCL INFO Setting affinity for GPU 1 to ff,ffff0000,00ffffff
Mon Sep  6 11:45:58 2021[1,6]<stdout>:dgx2h0194:41892:42733 [0] NCCL INFO Trees [0] 15/-1/-1->6->8 [1] 15/-1/-1->6->8 [2] 15/-1/-1->6->8 [3] 15/-1/-1->6->8 [4] 15/-1/-1->6->8 [5] 15/-1/-1->6->8 [6] 15/-1/-1->6->8 [7] 15/-1/-1->6->8 [8] 15/-1/-

Mon Sep  6 11:45:58 2021[1,14]<stdout>:dgx2h0194:41900:42734 [0] NCCL INFO Channel 04 : 14[3b000] -> 5[59000] via P2P/IPC
Mon Sep  6 11:45:58 2021[1,7]<stdout>:dgx2h0194:41893:42743 [0] NCCL INFO Channel 04 : 7[5c000] -> 11[5e000] via P2P/IPC
Mon Sep  6 11:45:58 2021[1,15]<stdout>:dgx2h0194:41901:42857 [0] NCCL INFO Channel 04 : 15[b9000] -> 0[36000] via P2P/IPC
Mon Sep  6 11:45:58 2021[1,4]<stdout>:dgx2h0194:41890:42739 [0] NCCL INFO Channel 04 : 4[bc000] -> 8[be000] via P2P/IPC
Mon Sep  6 11:45:58 2021[1,3]<stdout>:dgx2h0194:41889:42722 [0] NCCL INFO Channel 04 : 3[e2000] -> 10[e0000] via P2P/IPC
Mon Sep  6 11:45:58 2021[1,6]<stdout>:dgx2h0194:41892:42733 [0] NCCL INFO Channel 04 : 6[b7000] -> 15[b9000] via P2P/IPC
Mon Sep  6 11:45:58 2021[1,13]<stdout>:dgx2h0194:41899:42721 [0] NCCL INFO Channel 05 : 13[34000] -> 1[39000] via P2P/IPC
Mon Sep  6 11:45:58 2021[1,14]<stdout>:dgx2h0194:41900:42734 [0] NCCL INFO Channel 05 : 14[3b000] -> 5[59000] via P2P/IPC
Mon Sep  6 11:45:58 2021[1,7]

Mon Sep  6 11:45:58 2021[1,5]<stdout>:dgx2h0194:41891:42723 [0] NCCL INFO Channel 07 : 5[59000] -> 12[57000] via P2P/IPC
Mon Sep  6 11:45:58 2021[1,2]<stdout>:dgx2h0194:41888:42361 [0] NCCL INFO Channel 06 : 2[e5000] -> 9[e7000] via P2P/IPC
Mon Sep  6 11:45:58 2021[1,5]<stdout>:dgx2h0194:41891:42723 [0] NCCL INFO Channel 08 : 5[59000] -> 12[57000] via P2P/IPC
Mon Sep  6 11:45:58 2021[1,2]<stdout>:dgx2h0194:41888:42361 [0] NCCL INFO Channel 07 : 2[e5000] -> 9[e7000] via P2P/IPC
Mon Sep  6 11:45:58 2021[1,5]<stdout>:dgx2h0194:41891:42723 [0] NCCL INFO Channel 09 : 5[59000] -> 12[57000] via P2P/IPC
Mon Sep  6 11:45:58 2021[1,2]<stdout>:dgx2h0194:41888:42361 [0] NCCL INFO Channel 08 : 2[e5000] -> 9[e7000] via P2P/IPC
Mon Sep  6 11:45:58 2021[1,5]<stdout>:dgx2h0194:41891:42723 [0] NCCL INFO Channel 10 : 5[59000] -> 12[57000] via P2P/IPC
Mon Sep  6 11:45:58 2021[1,2]<stdout>:dgx2h0194:41888:42361 [0] NCCL INFO Channel 09 : 2[e5000] -> 9[e7000] via P2P/IPC
Mon Sep  6 11:45:58 2021[1,5]<stdout

Mon Sep  6 11:45:58 2021[1,10]<stdout>:dgx2h0194:41896:42859 [0] NCCL INFO Channel 09 : 10[e0000] -> 4[bc000] via P2P/IPC
Mon Sep  6 11:45:58 2021[1,12]<stdout>:dgx2h0194:41898:42413 [0] NCCL INFO Channel 00 : 12[57000] -> 7[5c000] via P2P/IPC
Mon Sep  6 11:45:58 2021[1,10]<stdout>:dgx2h0194:41896:42859 [0] NCCL INFO Channel 10 : 10[e0000] -> 4[bc000] via P2P/IPC
Mon Sep  6 11:45:58 2021[1,12]<stdout>:dgx2h0194:41898:42413 [0] NCCL INFO Channel 01 : 12[57000] -> 7[5c000] via P2P/IPC
Mon Sep  6 11:45:58 2021[1,10]<stdout>:dgx2h0194:41896:42859 [0] NCCL INFO Channel 11 : 10[e0000] -> 4[bc000] via P2P/IPC
Mon Sep  6 11:45:58 2021[1,12]<stdout>:dgx2h0194:41898:42413 [0] NCCL INFO Channel 02 : 12[57000] -> 7[5c000] via P2P/IPC
Mon Sep  6 11:45:58 2021[1,12]<stdout>:dgx2h0194:41898:42413 [0] NCCL INFO Channel 03 : 12[57000] -> 7[5c000] via P2P/IPC
Mon Sep  6 11:45:58 2021[1,9]<stdout>:dgx2h0194:41895:42860 [0] NCCL INFO Channel 00 : 9[e7000] -> 3[e2000] via P2P/IPC
Mon Sep  6 11:45:58 2021[1

Mon Sep  6 11:45:59 2021[1,0]<stdout>:dgx2h0194:41886:42730 [0] NCCL INFO Connected all trees
Mon Sep  6 11:45:59 2021[1,0]<stdout>:dgx2h0194:41886:42730 [0] NCCL INFO threadThresholds 8/8/64 | 128/8/64 | 8/8/512
Mon Sep  6 11:45:59 2021[1,4]<stdout>:dgx2h0194:41890:42739 [0] NCCL INFO Channel 06 : 4[bc000] -> 10[e0000] via P2P/IPC
Mon Sep  6 11:45:59 2021[1,14]<stdout>:dgx2h0194:41900:42734 [0] NCCL INFO Channel 03 : 14[3b000] -> 1[39000] via P2P/IPC
Mon Sep  6 11:45:59 2021[1,5]<stdout>:dgx2h0194:41891:42723 [0] NCCL INFO Channel 01 : 5[59000] -> 14[3b000] via P2P/IPC
Mon Sep  6 11:45:59 2021[1,4]<stdout>:dgx2h0194:41890:42739 [0] NCCL INFO Channel 07 : 4[bc000] -> 10[e0000] via P2P/IPC
Mon Sep  6 11:45:59 2021[1,14]<stdout>:dgx2h0194:41900:42734 [0] NCCL INFO Channel 04 : 14[3b000] -> 1[39000] via P2P/IPC
Mon Sep  6 11:45:59 2021[1,0]<stdout>:dgx2h0194:41886:42730 [0] NCCL INFO 12 coll channels, 16 p2p channels, 16 p2p channels per peer
Mon Sep  6 11:45:59 2021[1,5]<stdout>:dgx2h019

Mon Sep  6 11:45:59 2021[1,6]<stdout>:dgx2h0194:41892:42733 [0] NCCL INFO Channel 00 : 6[b7000] -> 8[be000] via P2P/IPC
Mon Sep  6 11:45:59 2021[1,1]<stdout>:dgx2h0194:41887:42486 [0] NCCL INFO Channel 03 : 1[39000] -> 13[34000] via P2P/IPC
Mon Sep  6 11:45:59 2021[1,3]<stdout>:dgx2h0194:41889:42722 [0] NCCL INFO Channel 10 : 3[e2000] -> 9[e7000] via P2P/IPC
Mon Sep  6 11:45:59 2021[1,10]<stdout>:dgx2h0194:41896:42859 [0] NCCL INFO Channel 09 : 10[e0000] -> 3[e2000] via P2P/IPC
Mon Sep  6 11:45:59 2021[1,6]<stdout>:dgx2h0194:41892:42733 [0] NCCL INFO Channel 01 : 6[b7000] -> 8[be000] via P2P/IPC
Mon Sep  6 11:45:59 2021[1,1]<stdout>:dgx2h0194:41887:42486 [0] NCCL INFO Channel 04 : 1[39000] -> 13[34000] via P2P/IPC
Mon Sep  6 11:45:59 2021[1,3]<stdout>:dgx2h0194:41889:42722 [0] NCCL INFO Channel 11 : 3[e2000] -> 9[e7000] via P2P/IPC
Mon Sep  6 11:45:59 2021[1,10]<stdout>:dgx2h0194:41896:42859 [0] NCCL INFO Channel 10 : 10[e0000] -> 3[e2000] via P2P/IPC
Mon Sep  6 11:45:59 2021[1,6]<stdo

Mon Sep  6 11:45:59 2021[1,2]<stdout>:dgx2h0194:41888:42361 [0] NCCL INFO Channel 11 : 2[e5000] -> 11[5e000] via P2P/IPC
Mon Sep  6 11:45:59 2021[1,9]<stdout>:dgx2h0194:41895:42860 [0] NCCL INFO Channel 10 : 9[e7000] -> 2[e5000] via P2P/IPC
Mon Sep  6 11:45:59 2021[1,8]<stdout>:dgx2h0194:41894:42489 [0] NCCL INFO Channel 01 : 8[be000] -> 4[bc000] via P2P/IPC
Mon Sep  6 11:45:59 2021[1,9]<stdout>:dgx2h0194:41895:42860 [0] NCCL INFO Channel 11 : 9[e7000] -> 2[e5000] via P2P/IPC
Mon Sep  6 11:45:59 2021[1,8]<stdout>:dgx2h0194:41894:42489 [0] NCCL INFO Channel 02 : 8[be000] -> 4[bc000] via P2P/IPC
Mon Sep  6 11:45:59 2021[1,12]<stdout>:dgx2h0194:41898:42413 [0] NCCL INFO Connected all trees
Mon Sep  6 11:45:59 2021[1,12]<stdout>:dgx2h0194:41898:42413 [0] NCCL INFO threadThresholds 8/8/64 | 128/8/64 | 8/8/512
Mon Sep  6 11:45:59 2021[1,8]<stdout>:dgx2h0194:41894:42489 [0] NCCL INFO Channel 03 : 8[be000] -> 4[bc000] via P2P/IPC
Mon Sep  6 11:45:59 2021[1,8]<stdout>:dgx2h0194:41894:42489 [0] 

Mon Sep  6 11:45:59 2021[1,1]<stdout>:dgx2h0194:41887:42486 [0] NCCL INFO comm 0x7fda3462aef0 rank 1 nranks 16 cudaDev 0 busId 39000 - Init COMPLETE
Mon Sep  6 11:45:59 2021[1,6]<stdout>:dgx2h0194:41892:42733 [0] NCCL INFO comm 0x7f257462aab0 rank 6 nranks 16 cudaDev 0 busId b7000 - Init COMPLETE
Mon Sep  6 11:45:59 2021[1,2]<stdout>:dgx2h0194:41888:42361 [0] NCCL INFO comm 0x7f54e462ac10 rank 2 nranks 16 cudaDev 0 busId e5000 - Init COMPLETE
Mon Sep  6 11:45:59 2021[1,8]<stdout>:dgx2h0194:41894:42489 [0] NCCL INFO comm 0x7fd8c862b120 rank 8 nranks 16 cudaDev 0 busId be000 - Init COMPLETE
Mon Sep  6 11:45:59 2021[1,3]<stdout>:dgx2h0194:41889:42722 [0] NCCL INFO comm 0x7f33b062a600 rank 3 nranks 16 cudaDev 0 busId e2000 - Init COMPLETE
Mon Sep  6 11:45:59 2021[1,5]<stdout>:dgx2h0194:41891:42723 [0] NCCL INFO comm 0x7fcf1462aa00 rank 5 nranks 16 cudaDev 0 busId 59000 - Init COMPLETE
Mon Sep  6 11:45:59 2021[1,9]<stdout>:dgx2h0194:41895:42860 [0] NCCL INFO comm 0x7fb7d062a3c0 rank 9 nrank

Mon Sep  6 11:46:11 2021[1,11]<stderr>:2021-09-06 11:46:11.401104: I tensorflow/core/profiler/lib/profiler_session.cc:131] Profiler session initializing.
Mon Sep  6 11:46:11 2021[1,11]<stderr>:2021-09-06 11:46:11.401164: I tensorflow/core/profiler/lib/profiler_session.cc:146] Profiler session started.
Mon Sep  6 11:46:11 2021[1,3]<stderr>:2021-09-06 11:46:11.440290: I tensorflow/core/profiler/lib/profiler_session.cc:131] Profiler session initializing.
Mon Sep  6 11:46:11 2021[1,3]<stderr>:2021-09-06 11:46:11.440344: I tensorflow/core/profiler/lib/profiler_session.cc:146] Profiler session started.
Mon Sep  6 11:46:11 2021[1,9]<stderr>:2021-09-06 11:46:11.442416: I tensorflow/core/profiler/lib/profiler_session.cc:131] Profiler session initializing.
Mon Sep  6 11:46:11 2021[1,9]<stderr>:2021-09-06 11:46:11.442453: I tensorflow/core/profiler/lib/profiler_session.cc:146] Profiler session started.
Mon Sep  6 11:46:11 2021[1,5]<stderr>:2021-09-06 11:46:11.445379: I tensorflow/core/profiler/li

Mon Sep  6 11:45:59 2021[1,0]<stdout>:   1/4000 [..............................] - ETA: 34:42:44 - loss: 1.3348 - auc: 0.5421

Mon Sep  6 11:46:12 2021[1,5]<stderr>:2021-09-06 11:46:12.494098: I tensorflow/core/profiler/lib/profiler_session.cc:66] Profiler session collecting data.
Mon Sep  6 11:46:12 2021[1,9]<stderr>:2021-09-06 11:46:12.494182: I tensorflow/core/profiler/lib/profiler_session.cc:66] Profiler session collecting data.
Mon Sep  6 11:46:12 2021[1,12]<stderr>:2021-09-06 11:46:12.494154: I tensorflow/core/profiler/lib/profiler_session.cc:66] Profiler session collecting data.
Mon Sep  6 11:46:12 2021[1,15]<stderr>:2021-09-06 11:46:12.494339: I tensorflow/core/profiler/lib/profiler_session.cc:66] Profiler session collecting data.
Mon Sep  6 11:46:12 2021[1,8]<stderr>:2021-09-06 11:46:12.494413: I tensorflow/core/profiler/lib/profiler_session.cc:66] Profiler session collecting data.
Mon Sep  6 11:46:12 2021[1,3]<stderr>:2021-09-06 11:46:12.494461: I tensorflow/core/profiler/lib/profiler_session.cc:66] Profiler session collecting data.
Mon Sep  6 11:46:12 2021[1,1]<stderr>:2021-09-06 11:46:12.494477: I 

Mon Sep  6 11:46:12 2021[1,4]<stderr>:2021-09-06 11:46:12.908955: I tensorflow/core/profiler/lib/profiler_session.cc:164] Profiler session tear down.
Mon Sep  6 11:46:12 2021[1,5]<stderr>:2021-09-06 11:46:12.909348: I tensorflow/core/profiler/lib/profiler_session.cc:164] Profiler session tear down.
Mon Sep  6 11:46:12 2021[1,15]<stderr>:2021-09-06 11:46:12.911988: I tensorflow/core/profiler/lib/profiler_session.cc:164] Profiler session tear down.
Mon Sep  6 11:46:12 2021[1,6]<stderr>:2021-09-06 11:46:12.913338: I tensorflow/core/profiler/lib/profiler_session.cc:164] Profiler session tear down.
Mon Sep  6 11:46:12 2021[1,8]<stderr>:2021-09-06 11:46:12.919742: I tensorflow/core/profiler/lib/profiler_session.cc:164] Profiler session tear down.
Mon Sep  6 11:46:12 2021[1,1]<stderr>:2021-09-06 11:46:12.921926: I tensorflow/core/profiler/lib/profiler_session.cc:164] Profiler session tear down.
Mon Sep  6 11:46:12 2021[1,10]<stderr>:2021-09-06 11:46:12.971783: I tensorflow/core/profiler/rpc/c

Mon Sep  6 11:46:13 2021[1,3]<stderr>:2021-09-06 11:46:13.705164: I tensorflow/core/profiler/rpc/client/save_profile.cc:136] Creating directory: tf_logs/plugins/profile/2021_09_06_11_46_12
Mon Sep  6 11:46:13 2021[1,3]<stderr>:
Mon Sep  6 11:46:13 2021[1,2]<stderr>:2021-09-06 11:46:13.727985: I tensorflow/core/profiler/rpc/client/save_profile.cc:136] Creating directory: tf_logs/plugins/profile/2021_09_06_11_46_12
Mon Sep  6 11:46:13 2021[1,2]<stderr>:
Mon Sep  6 11:46:13 2021[1,14]<stderr>:2021-09-06 11:46:13.770514: I tensorflow/core/profiler/rpc/client/save_profile.cc:142] Dumped gzipped tool data for trace.json.gz to tf_logs/plugins/profile/2021_09_06_11_46_12/dgx2h0194.trace.json.gz
Mon Sep  6 11:46:13 2021[1,0]<stderr>:2021-09-06 11:46:13.770742: I tensorflow/core/profiler/rpc/client/save_profile.cc:136] Creating directory: tf_logs/plugins/profile/2021_09_06_11_46_12
Mon Sep  6 11:46:13 2021[1,0]<stderr>:
Mon Sep  6 11:46:13 2021[1,0]<stderr>:2021-09-06 11:46:13.777869: I tensorfl

Mon Sep  6 11:46:14 2021[1,4]<stderr>:2021-09-06 11:46:14.031628: I tensorflow/core/profiler/rpc/client/save_profile.cc:142] Dumped gzipped tool data for trace.json.gz to tf_logs/plugins/profile/2021_09_06_11_46_12/dgx2h0194.trace.json.gz
Mon Sep  6 11:46:14 2021[1,13]<stderr>:2021-09-06 11:46:14.035052: I tensorflow/core/profiler/rpc/client/save_profile.cc:142] Dumped gzipped tool data for memory_profile.json.gz to tf_logs/plugins/profile/2021_09_06_11_46_12/dgx2h0194.memory_profile.json.gz
Mon Sep  6 11:46:14 2021[1,13]<stderr>:2021-09-06 11:46:14.044101: I tensorflow/core/profiler/rpc/client/capture_profile.cc:251] Creating directory: tf_logs/plugins/profile/2021_09_06_11_46_12
Mon Sep  6 11:46:14 2021[1,13]<stderr>:Dumped tool data for xplane.pb to tf_logs/plugins/profile/2021_09_06_11_46_12/dgx2h0194.xplane.pb
Mon Sep  6 11:46:14 2021[1,13]<stderr>:Dumped tool data for overview_page.pb to tf_logs/plugins/profile/2021_09_06_11_46_12/dgx2h0194.overview_page.pb
Mon Sep  6 11:46:14 20

Mon Sep  6 11:46:14 2021[1,15]<stderr>:
Mon Sep  6 11:46:14 2021[1,11]<stderr>:2021-09-06 11:46:14.192490: I tensorflow/core/profiler/rpc/client/capture_profile.cc:251] Creating directory: tf_logs/plugins/profile/2021_09_06_11_46_12
Mon Sep  6 11:46:14 2021[1,11]<stderr>:Dumped tool data for xplane.pb to tf_logs/plugins/profile/2021_09_06_11_46_12/dgx2h0194.xplane.pb
Mon Sep  6 11:46:14 2021[1,11]<stderr>:Dumped tool data for overview_page.pb to tf_logs/plugins/profile/2021_09_06_11_46_12/dgx2h0194.overview_page.pb
Mon Sep  6 11:46:14 2021[1,11]<stderr>:Dumped tool data for input_pipeline.pb to tf_logs/plugins/profile/2021_09_06_11_46_12/dgx2h0194.input_pipeline.pb
Mon Sep  6 11:46:14 2021[1,11]<stderr>:Dumped tool data for tensorflow_stats.pb to tf_logs/plugins/profile/2021_09_06_11_46_12/dgx2h0194.tensorflow_stats.pb
Mon Sep  6 11:46:14 2021[1,11]<stderr>:Dumped tool data for kernel_stats.pb to tf_logs/plugins/profile/2021_09_06_11_46_12/dgx2h0194.kernel_stats.pb
Mon Sep  6 11:46:14 

Mon Sep  6 11:46:16 2021[1,0]<stdout>:   5/4000 [..............................] - ETA: 4:44:11 - loss: 1.1971 - auc: 0.5447
Mon Sep  6 11:46:45 2021[1,0]<stdout>:  11/4000 [..............................] - ETA: 5:01:18 - loss: 0.9724 - auc: 0.5378

[Stage 11:>                                                       (0 + 16) / 16]

Mon Sep  6 11:47:41 2021[1,0]<stdout>:  21/4000 [..............................] - ETA: 5:37:25 - loss: 0.7117 - auc: 0.5280
Mon Sep  6 11:48:44 2021[1,0]<stdout>:  31/4000 [..............................] - ETA: 6:02:07 - loss: 0.5560 - auc: 0.5265
Mon Sep  6 11:49:44 2021[1,0]<stdout>:  41/4000 [..............................] - ETA: 6:10:48 - loss: 0.4611 - auc: 0.5275
Mon Sep  6 11:50:42 2021[1,0]<stdout>:  50/4000 [..............................] - ETA: 6:20:05 - loss: 0.4050 - auc: 0.5309
Mon Sep  6 11:51:41 2021[1,0]<stdout>:  59/4000 [..............................] - ETA: 6:26:39 - loss: 0.3650 - auc: 0.5373
Mon Sep  6 11:52:44 2021[1,0]<stdout>:  68/4000 [..............................] - ETA: 6:35:30 - loss: 0.3352 - auc: 0.5462
Mon Sep  6 11:53:40 2021[1,0]<stdout>:  76/4000 [..............................] - ETA: 6:41:37 - loss: 0.3143 - auc: 0.5561
Mon Sep  6 11:54:38 2021[1,0]<stdout>:  84/4000 [..............................] - ETA: 6:48:12 - loss: 0.2972 - auc: 0.5662
Mon Sep  6 11:55:44 2021[1,0]<stdout>:  93/4000 [..............................] - ETA: 6:53:56 - loss: 0.2815 - auc: 0.5771
Mon Sep  6 11:56:37 2021[1,0]<stdout>: 100/4000 [..............................] - ETA: 6:58:55 - loss: 0.2710 - auc: 0.5854
Mon Sep  6 11:57:39 2021[1,0]<stdout>: 108/4000 [..............................] - ETA: 7:04:18 - loss: 0.2609 - auc: 0.5939
Mon Sep  6 11:58:42 2021[1,0]<stdout>: 116/4000 [..............................] - ETA: 7:09:03 - loss: 0.2521 - auc: 0.6023
Mon Sep  6 11:59:37 2021[1,0]<stdout>: 123/4000 [..............................] - ETA: 7:13:04 - loss: 0.2452 - auc: 0.6090
Mon Sep  6 12:00:42 2021[1,0]<stdout>: 131/4000 [..............................] - ETA: 7:17:40 - loss: 0.2382 - auc: 0.6162
Mon Sep  6 12:01:39 2021[1,0]<stdout>: 138/4000 [>.............................] - ETA: 7:21:24 - loss: 0.2328 - auc: 0.6220
Mon Sep  6 12:02:37 2021[1,0]<stdout>: 145/4000 [>.............................] - ETA: 7:25:19 - loss: 0.2279 - auc: 0.6274
Mon Sep  6 12:03:37 2021[1,0]<stdout>: 152/4000 [>.............................] - ETA: 7:29:15 - loss: 0.2233 - auc: 0.6327
Mon Sep  6 12:04:36 2021[1,0]<stdout>: 159/4000 [>.............................] - ETA: 7:32:18 - loss: 0.2192 - auc: 0.6377
Mon Sep  6 12:05:43 2021[1,0]<stdout>: 167/4000 [>.............................] - ETA: 7:35:26 - loss: 0.2149 - auc: 0.6427
Mon Sep  6 12:06:42 2021[1,0]<stdout>: 174/4000 [>.............................] - ETA: 7:38:11 - loss: 0.2115 - auc: 0.6469
Mon Sep  6 12:07:42 2021[1,0]<stdout>: 181/4000 [>.............................] - ETA: 7:40:34 - loss: 0.2084 - auc: 0.6507
Mon Sep  6 12:08:41 2021[1,0]<stdout>: 188/4000 [>.............................] - ETA: 7:42:41 - loss: 0.2055 - auc: 0.6545
Mon Sep  6 12:09:40 2021[1,0]<stdout>: 195/4000 [>.............................] - ETA: 7:44:22 - loss: 0.2027 - auc: 0.6581

Mon Sep  6 12:10:40 2021[1,0]<stdout>: 202/4000 [>.............................] - ETA: 7:46:12 - loss: 0.2002 - auc: 0.6614
Mon Sep  6 12:11:40 2021[1,0]<stdout>: 209/4000 [>.............................] - ETA: 7:47:50 - loss: 0.1978 - auc: 0.6646
Mon Sep  6 12:12:39 2021[1,0]<stdout>: 216/4000 [>.............................] - ETA: 7:49:20 - loss: 0.1956 - auc: 0.6675
Mon Sep  6 12:13:40 2021[1,0]<stdout>: 223/4000 [>.............................] - ETA: 7:50:49 - loss: 0.1935 - auc: 0.6702
Mon Sep  6 12:14:38 2021[1,0]<stdout>: 230/4000 [>.............................] - ETA: 7:51:37 - loss: 0.1916 - auc: 0.6730
Mon Sep  6 12:15:38 2021[1,0]<stdout>: 237/4000 [>.............................] - ETA: 7:52:41 - loss: 0.1897 - auc: 0.6754
Mon Sep  6 12:16:38 2021[1,0]<stdout>: 244/4000 [>.............................] - ETA: 7:53:46 - loss: 0.1880 - auc: 0.6779
Mon Sep  6 12:17:38 2021[1,0]<stdout>: 251/4000 [>.............................] - ETA: 7:54:34 - loss: 0.1864 - auc: 0.6802
Mon Sep  6 12:18:37 2021[1,0]<stdout>: 258/4000 [>.............................] - ETA: 7:55:03 - loss: 0.1848 - auc: 0.6824
Mon Sep  6 12:19:37 2021[1,0]<stdout>: 265/4000 [>.............................] - ETA: 7:55:43 - loss: 0.1832 - auc: 0.6846
Mon Sep  6 12:20:37 2021[1,0]<stdout>: 272/4000 [=>............................] - ETA: 7:56:22 - loss: 0.1818 - auc: 0.6867
Mon Sep  6 12:21:38 2021[1,0]<stdout>: 279/4000 [=>............................] - ETA: 7:57:08 - loss: 0.1804 - auc: 0.6887
Mon Sep  6 12:22:38 2021[1,0]<stdout>: 286/4000 [=>............................] - ETA: 7:57:26 - loss: 0.1791 - auc: 0.6906
Mon Sep  6 12:23:38 2021[1,0]<stdout>: 293/4000 [=>............................] - ETA: 7:57:59 - loss: 0.1779 - auc: 0.6924
Mon Sep  6 12:24:40 2021[1,0]<stdout>: 300/4000 [=>............................] - ETA: 7:58:35 - loss: 0.1767 - auc: 0.6941
Mon Sep  6 12:25:40 2021[1,0]<stdout>: 307/4000 [=>............................] - ETA: 7:58:54 - loss: 0.1756 - auc: 0.6957
Mon Sep  6 12:26:40 2021[1,0]<stdout>: 314/4000 [=>............................] - ETA: 7:59:03 - loss: 0.1745 - auc: 0.6973
Mon Sep  6 12:27:40 2021[1,0]<stdout>: 321/4000 [=>............................] - ETA: 7:59:04 - loss: 0.1735 - auc: 0.6988
Mon Sep  6 12:28:40 2021[1,0]<stdout>: 328/4000 [=>............................] - ETA: 7:59:18 - loss: 0.1726 - auc: 0.7003
Mon Sep  6 12:29:42 2021[1,0]<stdout>: 335/4000 [=>............................] - ETA: 7:59:35 - loss: 0.1716 - auc: 0.7017
Mon Sep  6 12:30:42 2021[1,0]<stdout>: 342/4000 [=>............................] - ETA: 7:59:39 - loss: 0.1707 - auc: 0.7031
Mon Sep  6 12:31:42 2021[1,0]<stdout>: 349/4000 [=>............................] - ETA: 7:59:34 - loss: 0.1698 - auc: 0.7044
Mon Sep  6 12:32:43 2021[1,0]<stdout>: 356/4000 [=>............................] - ETA: 7:59:36 - loss: 0.1689 - auc: 0.7058
Mon Sep  6 12:33:36 2021[1,0]<stdout>: 362/4000 [=>............................] - ETA: 7:59:43 - loss: 0.1683 - auc: 0.7069
Mon Sep  6 12:34:37 2021[1,0]<stdout>: 369/4000 [=>............................] - ETA: 7:59:48 - loss: 0.1674 - auc: 0.7082

Mon Sep  6 12:35:38 2021[1,0]<stdout>: 376/4000 [=>............................]Mon Sep  6 12:35:38 2021[1,0]<stdout>: - ETA: 7:59:40 - loss: 0.1667 - auc: 0.7093Mon Sep  6 12:35:46 2021[1,0]<stdoutMon Sep  6 12:35:46 2021[1,0]<stdout>:

[Stage 11:>                                                       (0 + 16) / 16]

Mon Sep  6 12:36:37 2021[1,0]<stdout>: 383/4000 [=>............................]Mon Sep  6 12:36:37 2021[1,0]<stdout>: - ETA: 7:59:26 - loss: 0.1660 - auc: 0.7106Mon Sep  6 12:36:46 2021[1,0]<stdoutMon Sep  6 12:36:46 2021[1,0]<stdout>:

[Stage 11:>                                                       (0 + 16) / 16]

Mon Sep  6 12:37:39 2021[1,0]<stdout>: 390/4000 [=>............................]Mon Sep  6 12:37:39 2021[1,0]<stdout>: - ETA: 7:59:27 - loss: 0.1653 - auc: 0.7117Mon Sep  6 12:37:48 2021[1,0]<stdoutMon Sep  6 12:37:39 2021[1,0]<stdout>:

[Stage 11:>                                                       (0 + 16) / 16]

Mon Sep  6 12:38:40 2021[1,0]<stdout>: 397/4000 [=>............................]Mon Sep  6 12:38:40 2021[1,0]<stdout>: - ETA: 7:59:20 - loss: 0.1646 - auc: 0.7127Mon Sep  6 12:38:49 2021[1,0]<stdoutMon Sep  6 12:38:49 2021[1,0]<stdout>:

[Stage 11:>                                                       (0 + 16) / 16]

Mon Sep  6 12:39:42 2021[1,0]<stdout>: 404/4000 [==>...........................] - ETA: 7:59:13 - loss: 0.1640 - auc: 0.7137
Mon Sep  6 12:40:42 2021[1,0]<stdout>: 411/4000 [==>...........................] - ETA: 7:58:54 - loss: 0.1633 - auc: 0.7148
Mon Sep  6 12:41:44 2021[1,0]<stdout>: 418/4000 [==>...........................] - ETA: 7:58:49 - loss: 0.1627 - auc: 0.7157
Mon Sep  6 12:42:36 2021[1,0]<stdout>: 424/4000 [==>...........................] - ETA: 7:58:35 - loss: 0.1622 - auc: 0.7165
Mon Sep  6 12:43:36 2021[1,0]<stdout>: 431/4000 [==>...........................] - ETA: 7:58:09 - loss: 0.1617 - auc: 0.7174
Mon Sep  6 12:44:39 2021[1,0]<stdout>: 438/4000 [==>...........................] - ETA: 7:58:06 - loss: 0.1611 - auc: 0.7183
Mon Sep  6 12:45:40 2021[1,0]<stdout>: 445/4000 [==>...........................] - ETA: 7:57:50 - loss: 0.1606 - auc: 0.7192
Mon Sep  6 12:46:42 2021[1,0]<stdout>: 452/4000 [==>...........................] - ETA: 7:57:34 - loss: 0.1601 - auc: 0.7201
Mon Sep  6 12:47:44 2021[1,0]<stdout>: 459/4000 [==>...........................] - ETA: 7:57:19 - loss: 0.1596 - auc: 0.7209
Mon Sep  6 12:48:36 2021[1,0]<stdout>: 465/4000 [==>...........................] - ETA: 7:56:58 - loss: 0.1591 - auc: 0.7215
Mon Sep  6 12:49:37 2021[1,0]<stdout>: 472/4000 [==>...........................] - ETA: 7:56:33 - loss: 0.1587 - auc: 0.7223
Mon Sep  6 12:50:36 2021[1,0]<stdout>: 479/4000 [==>...........................] - ETA: 7:55:58 - loss: 0.1582 - auc: 0.7231
Mon Sep  6 12:51:37 2021[1,0]<stdout>: 486/4000 [==>...........................] - ETA: 7:55:26 - loss: 0.1577 - auc: 0.7239
Mon Sep  6 12:52:38 2021[1,0]<stdout>: 493/4000 [==>...........................] - ETA: 7:55:02 - loss: 0.1573 - auc: 0.7246
Mon Sep  6 12:53:40 2021[1,0]<stdout>: 500/4000 [==>...........................] - ETA: 7:54:37 - loss: 0.1568 - auc: 0.7253
Mon Sep  6 12:54:40 2021[1,0]<stdout>: 507/4000 [==>...........................] - ETA: 7:54:04 - loss: 0.1564 - auc: 0.7261
Mon Sep  6 12:55:40 2021[1,0]<stdout>: 514/4000 [==>...........................] - ETA: 7:53:27 - loss: 0.1560 - auc: 0.7268
Mon Sep  6 12:56:40 2021[1,0]<stdout>: 521/4000 [==>...........................] - ETA: 7:52:50 - loss: 0.1556 - auc: 0.7274
Mon Sep  6 12:57:39 2021[1,0]<stdout>: 528/4000 [==>...........................] - ETA: 7:52:08 - loss: 0.1553 - auc: 0.7281
Mon Sep  6 12:58:40 2021[1,0]<stdout>: 535/4000 [===>..........................] - ETA: 7:51:35 - loss: 0.1549 - auc: 0.7287
Mon Sep  6 12:59:41 2021[1,0]<stdout>: 542/4000 [===>..........................] - ETA: 7:51:05 - loss: 0.1545 - auc: 0.7293
Mon Sep  6 13:00:42 2021[1,0]<stdout>: 549/4000 [===>..........................] - ETA: 7:50:26 - loss: 0.1542 - auc: 0.7299
Mon Sep  6 13:01:41 2021[1,0]<stdout>: 556/4000 [===>..........................] - ETA: 7:49:40 - loss: 0.1538 - auc: 0.7305
Mon Sep  6 13:02:41 2021[1,0]<stdout>: 563/4000 [===>..........................] - ETA: 7:49:01 - loss: 0.1535 - auc: 0.7311
Mon Sep  6 13:03:40 2021[1,0]<stdout>: 570/4000 [===>..........................] - ETA: 7:48:17 - loss: 0.1531 - auc: 0.7316
Mon Sep  6 13:04:38 2021[1,0]<stdout>: 577/4000 [===>..........................] - ETA: 7:47:19 - loss: 0.1528 - auc: 0.7322
Mon Sep  6 13:05:42 2021[1,0]<stdout>: 585/4000 [===>..........................] - ETA: 7:46:07 - loss: 0.1524 - auc: 0.7328
Mon Sep  6 13:06:43 2021[1,0]<stdout>: 592/4000 [===>..........................] - ETA: 7:45:28 - loss: 0.1521 - auc: 0.7333
Mon Sep  6 13:07:43 2021[1,0]<stdout>: 599/4000 [===>..........................] - ETA: 7:44:49 - loss: 0.1518 - auc: 0.7338
Mon Sep  6 13:08:44 2021[1,0]<stdout>: 606/4000 [===>..........................] - ETA: 7:44:09 - loss: 0.1515 - auc: 0.7343
Mon Sep  6 13:09:43 2021[1,0]<stdout>: 613/4000 [===>..........................] - ETA: 7:43:22 - loss: 0.1513 - auc: 0.7348
Mon Sep  6 13:10:41 2021[1,0]<stdout>: 620/4000 [===>..........................] - ETA: 7:42:25 - loss: 0.1510 - auc: 0.7353
Mon Sep  6 13:11:38 2021[1,0]<stdout>: 627/4000 [===>..........................] - ETA: 7:41:25 - loss: 0.1507 - auc: 0.7358
Mon Sep  6 13:12:44 2021[1,0]<stdout>: 635/4000 [===>..........................] - ETA: 7:40:24 - loss: 0.1504 - auc: 0.7363
Mon Sep  6 13:13:44 2021[1,0]<stdout>: 642/4000 [===>..........................] - ETA: 7:39:39 - loss: 0.1502 - auc: 0.7367
Mon Sep  6 13:14:45 2021[1,0]<stdout>: 649/4000 [===>..........................] - ETA: 7:38:58 - loss: 0.1499 - auc: 0.7372
Mon Sep  6 13:15:45 2021[1,0]<stdout>: 656/4000 [===>..........................] - ETA: 7:38:14 - loss: 0.1497 - auc: 0.7377
Mon Sep  6 13:16:44 2021[1,0]<stdout>: 663/4000 [===>..........................] - ETA: 7:37:26 - loss: 0.1494 - auc: 0.7381
Mon Sep  6 13:17:36 2021[1,0]<stdout>: 669/4000 [====>.........................] - ETA: 7:36:48 - loss: 0.1492 - auc: 0.7385
Mon Sep  6 13:18:38 2021[1,0]<stdout>: 676/4000 [====>.........................] - ETA: 7:36:12 - loss: 0.1490 - auc: 0.7389
Mon Sep  6 13:19:45 2021[1,0]<stdout>: 684/4000 [====>.........................] - ETA: 7:35:13 - loss: 0.1487 - auc: 0.7394
Mon Sep  6 13:20:45 2021[1,0]<stdout>: 691/4000 [====>.........................] - ETA: 7:34:27 - loss: 0.1485 - auc: 0.7398
Mon Sep  6 13:21:43 2021[1,0]<stdout>: 698/4000 [====>.........................] - ETA: 7:33:32 - loss: 0.1483 - auc: 0.7402
Mon Sep  6 13:22:41 2021[1,0]<stdout>: 705/4000 [====>.........................] - ETA: 7:32:35 - loss: 0.1481 - auc: 0.7406
Mon Sep  6 13:23:46 2021[1,0]<stdout>: 713/4000 [====>.........................] - ETA: 7:31:21 - loss: 0.1478 - auc: 0.7410
Mon Sep  6 13:24:43 2021[1,0]<stdout>: 720/4000 [====>.........................] - ETA: 7:30:24 - loss: 0.1476 - auc: 0.7414
Mon Sep  6 13:25:43 2021[1,0]<stdout>: 727/4000 [====>.........................] - ETA: 7:29:35 - loss: 0.1474 - auc: 0.7417
Mon Sep  6 13:26:44 2021[1,0]<stdout>: 734/4000 [====>.........................] - ETA: 7:28:51 - loss: 0.1472 - auc: 0.7421
Mon Sep  6 13:27:37 2021[1,0]<stdout>: 740/4000 [====>.........................] - ETA: 7:28:19 - loss: 0.1470 - auc: 0.7424
Mon Sep  6 13:28:38 2021[1,0]<stdout>: 747/4000 [====>.........................] - ETA: 7:27:33 - loss: 0.1469 - auc: 0.7427
Mon Sep  6 13:29:39 2021[1,0]<stdout>: 754/4000 [====>.........................] - ETA: 7:26:52 - loss: 0.1467 - auc: 0.7431
Mon Sep  6 13:30:38 2021[1,0]<stdout>: 761/4000 [====>.........................] - ETA: 7:25:59 - loss: 0.1465 - auc: 0.7434
Mon Sep  6 13:31:38 2021[1,0]<stdout>: 768/4000 [====>.........................] - ETA: 7:25:08 - loss: 0.1463 - auc: 0.7437
Mon Sep  6 13:32:37 2021[1,0]<stdout>: 775/4000 [====>.........................] - ETA: 7:24:18 - loss: 0.1461 - auc: 0.7441
Mon Sep  6 13:33:37 2021[1,0]<stdout>: 782/4000 [====>.........................] - ETA: 7:23:28 - loss: 0.1459 - auc: 0.7444
Mon Sep  6 13:34:37 2021[1,0]<stdout>: 789/4000 [====>.........................] - ETA: 7:22:40 - loss: 0.1458 - auc: 0.7447
Mon Sep  6 13:35:45 2021[1,0]<stdout>: 797/4000 [====>.........................] - ETA: 7:21:40 - loss: 0.1456 - auc: 0.7451
Mon Sep  6 13:36:38 2021[1,0]<stdout>: 803/4000 [=====>........................] - ETA: 7:21:02 - loss: 0.1454 - auc: 0.7454
Mon Sep  6 13:37:46 2021[1,0]<stdout>: 811/4000 [=====>........................] - ETA: 7:20:03 - loss: 0.1452 - auc: 0.7457
Mon Sep  6 13:38:45 2021[1,0]<stdout>: 818/4000 [=====>........................] - ETA: 7:19:11 - loss: 0.1451 - auc: 0.7460
Mon Sep  6 13:39:43 2021[1,0]<stdout>: 825/4000 [=====>........................] - ETA: 7:18:13 - loss: 0.1449 - auc: 0.7463
Mon Sep  6 13:40:40 2021[1,0]<stdout>: 832/4000 [=====>........................] - ETA: 7:17:10 - loss: 0.1448 - auc: 0.7466
Mon Sep  6 13:41:39 2021[1,0]<stdout>: 839/4000 [=====>........................] - ETA: 7:16:15 - loss: 0.1446 - auc: 0.7469
Mon Sep  6 13:42:39 2021[1,0]<stdout>: 846/4000 [=====>........................] - ETA: 7:15:25 - loss: 0.1444 - auc: 0.7472
Mon Sep  6 13:43:44 2021[1,0]<stdout>: 854/4000 [=====>........................] - ETA: 7:14:13 - loss: 0.1443 - auc: 0.7475
Mon Sep  6 13:44:42 2021[1,0]<stdout>: 861/4000 [=====>........................] - ETA: 7:13:19 - loss: 0.1441 - auc: 0.7478
Mon Sep  6 13:45:43 2021[1,0]<stdout>: 868/4000 [=====>........................] - ETA: 7:12:32 - loss: 0.1440 - auc: 0.7481
Mon Sep  6 13:46:43 2021[1,0]<stdout>: 875/4000 [=====>........................] - ETA: 7:11:40 - loss: 0.1438 - auc: 0.7483
Mon Sep  6 13:47:43 2021[1,0]<stdout>: 882/4000 [=====>........................] - ETA: 7:10:49 - loss: 0.1437 - auc: 0.7486
Mon Sep  6 13:48:44 2021[1,0]<stdout>: 889/4000 [=====>........................] - ETA: 7:10:00 - loss: 0.1435 - auc: 0.7489
Mon Sep  6 13:49:43 2021[1,0]<stdout>: 896/4000 [=====>........................] - ETA: 7:09:06 - loss: 0.1434 - auc: 0.7491
Mon Sep  6 13:50:41 2021[1,0]<stdout>: 903/4000 [=====>........................] - ETA: 7:08:08 - loss: 0.1433 - auc: 0.7494
Mon Sep  6 13:51:39 2021[1,0]<stdout>: 910/4000 [=====>........................] - ETA: 7:07:11 - loss: 0.1431 - auc: 0.7496
Mon Sep  6 13:52:41 2021[1,0]<stdout>: 917/4000 [=====>........................] - ETA: 7:06:25 - loss: 0.1430 - auc: 0.7498
Mon Sep  6 13:53:42 2021[1,0]<stdout>: 924/4000 [=====>........................] - ETA: 7:05:35 - loss: 0.1429 - auc: 0.7501
Mon Sep  6 13:54:43 2021[1,0]<stdout>: 931/4000 [=====>........................] - ETA: 7:04:47 - loss: 0.1427 - auc: 0.7503



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



[Stage 11:>                                                       (0 + 16) / 16]



                                                                                

Best Loss: 0.128178


### Conclusion

Cell [8] (NVTabular) and cell [10] (Petastorm) report the total training time for each dataloader:

- NVTabular: 845s
- Petastorm: 33843s

So the speedup of NVTabular over Petastorm is 33843 / 845 = `40.05`
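
The ratio can be recomputed with a minimal sketch like the one below. It only hard-codes the two training times reported in this conclusion. The names `training_seconds` and `speedup` are hypothetical helpers introduced for illustration and are not part of `criteo_keras.py`.

```python
# Total training times in seconds, copied from the conclusion of this notebook.
# The dictionary and the helper below are illustrative names, not notebook code.
training_seconds = {"NVTabular": 845, "Petastorm": 33843}


def speedup(baseline_s, candidate_s):
    """Return how many times faster the candidate is than the baseline."""
    return baseline_s / candidate_s


ratio = speedup(training_seconds["Petastorm"], training_seconds["NVTabular"])
print(f"speedup: {ratio:.2f}x")
```

If the notebook is rerun on different hardware, replacing the two values with the newly measured times gives the corresponding ratio for that run.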