
Keras & data.Dataset : "Your dataset iterator ran out of data" #25254

Closed
andsteing opened this issue Jan 28, 2019 · 27 comments
Labels
comp:data (tf.data related issues) · comp:keras (Keras related issues) · type:bug (Bug)

Comments

@andsteing

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Colab
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: n/a
  • TensorFlow installed from (source or binary): Colab
  • TensorFlow version (use command below): 1.12
  • Python version: 3.6.7
  • Bazel version (if compiling from source): n/a
  • GCC/Compiler version (if compiling from source): n/a
  • CUDA/cuDNN version: n/a
  • GPU model and memory: n/a

Describe the current behavior
Keras model.fit() does not reset the validation dataset iterator between epochs. Thus, when validation_steps < validation_dataset_size / batch_size is specified, every evaluation is performed on a different set of examples.

Describe the expected behavior
I would expect model.fit() to restart from the beginning of the validation dataset after every epoch of training. This way the validation dataset could be used without .repeat(), and the evaluation would be performed on the same set of examples.

Code to reproduce the issue
https://colab.research.google.com/drive/1UjKNbX38UC4EG6EPm6xLzQ1AmFV8HWe5

Other info / logs

WARNING:tensorflow:Your dataset iterator ran out of data interrupting testing. Make sure that your dataset can generate at least `steps` batches (in this case, 100 batches). You may need to use the repeat() function when building your dataset.
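
A minimal sketch of the setup described above (the model, dataset sizes, and step counts are hypothetical placeholders, not the contents of the linked Colab):

import tensorflow as tf

def make_ds(n):
    # n examples of a dummy 4-feature input with a scalar label, batched by 10
    return tf.data.Dataset.range(n).map(
        lambda i: (tf.ones([4]), tf.zeros([1]))).batch(10)

train_ds = make_ds(1000).repeat()
val_ds = make_ds(1000)  # 100 validation batches, intentionally without .repeat()

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.compile(optimizer="sgd", loss="mse")

# validation_steps=50 < 100 available batches: per the report, the validation
# iterator is not reset between epochs, so each epoch evaluates a different
# 50-batch slice until the iterator runs out and the warning above appears.
model.fit(train_ds, steps_per_epoch=100, epochs=5,
          validation_data=val_ds, validation_steps=50)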
@ghost

ghost commented Jan 29, 2019

I think it would be nice to have tf.keras treat the dataset running out of data as the end of an epoch; that would make steps_per_epoch and validation_steps unneeded.

@jvishnuvardhan jvishnuvardhan self-assigned this Jan 29, 2019
@jvishnuvardhan jvishnuvardhan added comp:keras Keras related issues comp:data tf.data related issues type:bug Bug labels Jan 29, 2019
@jvishnuvardhan jvishnuvardhan added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Jan 29, 2019
@suphoff
Contributor

suphoff commented Feb 1, 2019

As a quick workaround, the validation dataset could be trimmed to validation_steps * batch_size examples using .take() before .repeat().
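
A sketch of this workaround (train_ds, raw_val_ds, the model, and the sizes are placeholders; .take() is applied before batching, so it counts individual examples):

batch_size = 32
validation_steps = 100

# Keep exactly validation_steps * batch_size examples, then repeat so the
# iterator never runs out; every epoch evaluates the same fixed subset.
val_ds = (raw_val_ds
          .take(validation_steps * batch_size)
          .repeat()
          .batch(batch_size))

model.fit(train_ds, epochs=10,
          validation_data=val_ds,
          validation_steps=validation_steps)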

@mrry mrry assigned omalleyt12 and unassigned mrry Feb 21, 2019
@mrry
Contributor

mrry commented Feb 21, 2019

Reassigning this to @omalleyt12, since I think he has been improving the validation path lately, and I believe this feature would need to be implemented at the Keras level (but feel free to reassign, Tom!).

@tensorflowbutler tensorflowbutler removed the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Feb 22, 2019
@omalleyt12
Contributor

We now allow users to not pass validation_steps or steps_per_epoch for datasets, as in @cassianocasagrande's suggestion.

@felixnext

felixnext commented Mar 22, 2019

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: n/a
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 1.13 / 2.0-alpha
  • Python version: 3.5.2
  • Bazel version (if compiling from source): n/a
  • GCC/Compiler version (if compiling from source): n/a
  • CUDA/cuDNN version: 10.0/7.5
  • GPU model and memory: 1080 Ti / 32 GB

I still encounter similar problems with TF 2.0 alpha and TF 1.13.

TF 1.13

gen_train = # ... generator function ...
gen_test = # ... generator function ...
# creation of datasets
types = (tf.float32, tf.int32)
shapes = ((512, 512, 3), (2,))
ds_train = tf.data.Dataset.from_generator(lambda: gen_train, types, shapes).shuffle(1000).repeat().batch(32)
ds_test = tf.data.Dataset.from_generator(lambda: gen_test, types, shapes).shuffle(100).repeat().batch(32)

# usage in model
model.fit(ds_train, steps_per_epoch=188, validation_data=ds_test, validation_steps=20, epochs=10, verbose=True, callbacks=[visualize, tensorboard])

gen_train provides tuples of an image and a one-hot vector. steps_per_epoch is set to the exact number of batches in the dataset.
However, once I reach batch 156 (the one where the dataset would need to load the next iteration for shuffling), the system stalls: Python shows medium CPU usage (25-35%) and there is no progress in training at all.

TF 2.0

gen_train = # ... generator function ...
gen_test = # ... generator function ...
# creation of datasets
types = (tf.float32, tf.int32)
shapes = ((512, 512, 3), (2,))
ds_train = tf.data.Dataset.from_generator(lambda: gen_train, types, shapes).shuffle(1000).batch(32)
ds_test = tf.data.Dataset.from_generator(lambda: gen_test, types, shapes).shuffle(100).batch(32)

# usage in model
model.fit(ds_train, validation_data=ds_test, epochs=10, verbose=True, callbacks=[visualize, tensorboard])

In this case the system completes the first epoch and the evaluation. However, at the beginning of the second epoch I get the following warning:

W0322 18:36:04.919457 140678915827456 training_generator.py:228] Your dataset ran out of data; interrupting training. Make sure that your dataset can generate at least `steps_per_epoch * epochs` batches (in this case, 1880 batches). You may need to use the repeat() function when building your dataset.

If I use .repeat() on the datasets (and provide the steps_per_epoch and validation_steps args), I run into the same problem as with TF 1.13.

@omalleyt12 Am I right in assuming that the change is already included in the tf-2.0-alpha release? (TF 1.13 raises an error if I do not provide steps_per_epoch and validation_steps, while TF 2 does not.)

@omalleyt12
Contributor

I'm guessing the generator runs out of data; I don't think repeat() works with generators.

@felixnext

Thanks for the note.
It actually does work, but the lambda function has to create the generator. So the code would look like this:

# creation of datasets
types = (tf.float32, tf.int32)
shapes = ((512, 512, 3), (2,))
ds_train = tf.data.Dataset.from_generator(lambda: fct_to_create_train_gen(), types, shapes).shuffle(1000).batch(32)
ds_test = tf.data.Dataset.from_generator(lambda: fct_to_create_test_gen(), types, shapes).shuffle(100).batch(32)

# usage in model
model.fit(ds_train, validation_data=ds_test, epochs=10, verbose=True, callbacks=[visualize, tensorboard])

An alternative would be to create a generator that loops infinitely over the data, but that would require providing steps_per_epoch and validation_steps.
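
A sketch of that alternative (reusing the hypothetical fct_to_create_train_gen factory and the types/shapes definitions from the snippets above):

def infinite_train_gen():
    # Loop over the underlying data forever.
    while True:
        for example in fct_to_create_train_gen():
            yield example

ds_train = (tf.data.Dataset
            .from_generator(infinite_train_gen, types, shapes)
            .shuffle(1000)
            .batch(32))

# Because the stream never ends, the number of batches per epoch must be given.
model.fit(ds_train, steps_per_epoch=188, epochs=10)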

@ahmedanis03

ahmedanis03 commented Jul 8, 2019

I think this bug still exists when used in multi-worker distributed mode.
I launched workers using Kubernetes; the first epoch ran correctly, then I got the error message below. If I use steps_per_epoch and repeat(), everything works fine.

TF version: 2.0.0-beta1

2019-07-07 06:48:48.821000: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:111] Filling up shuffle buffer (this may take a while): 28884 of 100000000
2019-07-07 06:48:49.144075: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:162] Shuffle buffer filled.
2019-07-07 06:48:49.161736: E tensorflow/core/common_runtime/ring_alg.cc:279] Aborting RingReduce with Out of range: [_Derived_]End of sequence
         [[{{node IteratorGetNext}}]]
2019-07-07 06:48:49.161820: W tensorflow/core/common_runtime/base_collective_executor.cc:216] BaseCollectiveExecutor::StartAbort Out of range: [_Derived_]End of sequence
         [[{{node IteratorGetNext}}]]
2019-07-07 06:48:49.162250: E tensorflow/core/common_runtime/ring_alg.cc:279] Aborting RingReduce with Out of range: [_Derived_]End of sequence
         [[{{node IteratorGetNext}}]]
2019-07-07 06:48:49.162329: W tensorflow/core/common_runtime/base_collective_executor.cc:216] BaseCollectiveExecutor::StartAbort Out of range: [_Derived_]End of sequence
         [[{{node IteratorGetNext}}]]
2019-07-07 06:48:49.163848: W tensorflow/core/framework/op_kernel.cc:1546] OP_REQUIRES failed at collective_ops.cc:223 : Out of range: [_Derived_]End of sequence
         [[{{node IteratorGetNext}}]]
2019-07-07 06:48:49.163910: W tensorflow/core/common_runtime/base_collective_executor.cc:216] BaseCollectiveExecutor::StartAbort Out of range: [_Derived_]End of sequence
         [[{{node IteratorGetNext}}]]
         [[metrics/accuracy/div_no_nan/allreduce_1/CollectiveReduce]]
2019-07-07 06:48:49.164582: W tensorflow/core/framework/op_kernel.cc:1546] OP_REQUIRES failed at collective_ops.cc:223 : Out of range: [_Derived_]End of sequence
         [[{{node IteratorGetNext}}]]
2019-07-07 06:48:49.169979: E tensorflow/core/common_runtime/ring_alg.cc:279] Aborting RingReduce with Out of range: [_Derived_]End of sequence
         [[{{node IteratorGetNext}}]]
2019-07-07 06:48:49.170069: W tensorflow/core/common_runtime/base_collective_executor.cc:216] BaseCollectiveExecutor::StartAbort Out of range: [_Derived_]End of sequence
         [[{{node IteratorGetNext}}]]
2019-07-07 06:48:49.170203: W tensorflow/core/framework/op_kernel.cc:1546] OP_REQUIRES failed at collective_ops.cc:223 : Out of range: [_Derived_]End of sequence
         [[{{node IteratorGetNext}}]]
2019-07-07 06:48:49.179226: E tensorflow/core/common_runtime/ring_alg.cc:279] Aborting RingReduce with Out of range: [_Derived_]End of sequence
         [[{{node IteratorGetNext}}]]
2019-07-07 06:48:49.179355: W tensorflow/core/common_runtime/base_collective_executor.cc:216] BaseCollectiveExecutor::StartAbort Out of range: [_Derived_]End of sequence
         [[{{node IteratorGetNext}}]]
2019-07-07 06:48:49.179457: W tensorflow/core/framework/op_kernel.cc:1546] OP_REQUIRES failed at collective_ops.cc:223 : Out of range: [_Derived_]End of sequence
         [[{{node IteratorGetNext}}]]
W0707 06:48:49.193020 140662389356352 training_arrays.py:309] Your dataset ran out of data; interrupting training. Make sure that your dataset can generate at least `steps_per_epoch * epochs` batches (in this case, 1404 batches). You may need to use the repeat() function when building your dataset.
train.py

from __future__ import absolute_import, division, print_function, unicode_literals
import tensorflow_datasets as tfds
import tensorflow as tf
import argparse
import json
import os
tfds.disable_progress_bar()


parser = argparse.ArgumentParser(description='Welcome')
parser.add_argument('-workers', type=str, default='dummy')
parser.add_argument('-type', type=str, default='dummy')
parser.add_argument('-index', type=str, default='dummy')

args = parser.parse_args()

os.environ['TF_CONFIG'] = json.dumps({
    'cluster': {
        'worker': args.workers.split(','),
    },
    'task': {'type': args.type, 'index': int(args.index)}
})


strategy = tf.distribute.experimental.MultiWorkerMirroredStrategy()

BUFFER_SIZE = 10000

# Scaling MNIST data from (0, 255] to (0., 1.]
def scale(image, label):
  image = tf.cast(image, tf.float32)
  image /= 255
  return image, label

datasets, info = tfds.load(name='mnist',
                           with_info=True,
                           as_supervised=True)

train_datasets_unbatched = datasets['train'].map(scale).shuffle(BUFFER_SIZE)
#train_datasets = train_datasets_unbatched.batch(BATCH_SIZE)

def build_and_compile_cnn_model():
  model = tf.keras.Sequential([
      tf.keras.layers.Conv2D(32, 3, activation='relu', input_shape=(28, 28, 1)),
      tf.keras.layers.MaxPooling2D(),
      tf.keras.layers.Flatten(),
      tf.keras.layers.Dense(64, activation='relu'),
      tf.keras.layers.Dense(10, activation='softmax')
  ])
  model.compile(
      loss=tf.keras.losses.sparse_categorical_crossentropy,
      optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),
      metrics=['accuracy'])
  return model



NUM_WORKERS = 2
# Here the batch size scales up by number of workers since 
# `tf.data.Dataset.batch` expects the global batch size. Previously we used 64, 
# and now this becomes 128.
GLOBAL_BATCH_SIZE = 64 * NUM_WORKERS
train_datasets = train_datasets_unbatched.batch(GLOBAL_BATCH_SIZE, drop_remainder=True)
with strategy.scope():
  multi_worker_model = build_and_compile_cnn_model()
multi_worker_model.fit(train_datasets, epochs=3)

@benhe2011

Yes, even running the "Multi-worker distributed training with Keras" code example from the official TensorFlow documentation produces this error. How do you get this to work with data loaded from MNIST, for example? This is the code example I was talking about, and it's identical to the one shown by ahmedanis03: https://www.tensorflow.org/beta/tutorials/distribute/multi_worker_with_keras.

@edwardyehuang
Contributor

same issue

@snowkylin

same for me.

@Path-A

Path-A commented Jul 23, 2019

I am having the same issue as @ahmedanis03 and @benhe2011, but with a single-machine, two-GPU setup. I modified the old code from the multi_gpu_model documentation and used the required MirroredStrategy.

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10.0.18362 Build 18362
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: n/a
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 2.0.0-beta1
  • Python version: 3.6.8
  • Bazel version (if compiling from source): n/a
  • GCC/Compiler version (if compiling from source): n/a
  • CUDA/cuDNN version: 10.0/7.6.1
  • GPU model and memory: 2x MSI GeForce RTX 2080 Ti GAMING X TRIO (no NVlink)
import tensorflow as tf
from tensorflow.keras.applications import Xception
import numpy as np

num_samples = 1000
height = 224
width = 224
num_classes = 1000

strategy = tf.distribute.MirroredStrategy(devices=["/gpu:0", "/gpu:1"], 
                                          cross_device_ops=tf.distribute.HierarchicalCopyAllReduce())
with strategy.scope():
    parallel_model = Xception(weights=None,
                              input_shape=(height, width, 3),
                              classes=num_classes)
    parallel_model.compile(loss='categorical_crossentropy', optimizer='rmsprop')

# Generate dummy data.
x = np.random.random((num_samples, height, width, 3))
y = np.random.random((num_samples, num_classes))

parallel_model.summary()
# This `fit` call will be distributed on 2 GPUs.
# Since the batch size is 64, each GPU will process 32 samples.
parallel_model.fit(x, y, epochs=10, batch_size=64)

It runs perfectly until the final epoch. It always ends with:

Epoch 8/10
16/16 [==============================] - 5s 334ms/step - loss: 3590.6503
Epoch 9/10
16/16 [==============================] - 5s 332ms/step - loss: 3597.1092
Epoch 10/10
12/16 [=====================>........] - ETA: 1s - loss: 3603.6067
W0723 14:30:47.582621   232 training_arrays.py:309] Your dataset ran out of data; 
interrupting training. Make sure that your dataset can generate at least `steps_per_epoch * epochs` 
batches (in this case, 160 batches). You may need to use the repeat() function when 
building your dataset.
12/16 [=====================>........] - ETA: 1s - loss: 3603.6067

This is only an issue with MirroredStrategy. When I train on a single GPU, there is no issue.

Edit: I'm also getting this output:

2019-07-23 16:52:04.173982: W tensorflow/core/common_runtime/base_collective_executor.cc:216] BaseCollectiveExecutor::StartAbort Out of range: End of sequence
         [[{{node IteratorGetNext_1}}]]
         [[GroupCrossDeviceControlEdges_0/RMSprop/RMSprop/update_0/Const/_355]]
2019-07-23 16:52:04.183310: W tensorflow/core/common_runtime/base_collective_executor.cc:216] BaseCollectiveExecutor::StartAbort Out of range: End of sequence
         [[{{node IteratorGetNext_1}}]]
         [[Identity_1/_376]]
2019-07-23 16:52:04.189139: W tensorflow/core/common_runtime/base_collective_executor.cc:216] BaseCollectiveExecutor::StartAbort Out of range: End of sequence
         [[{{node IteratorGetNext_1}}]]

TEMPORARY SOLUTION: I converted the numpy arrays to a tf Dataset and used .repeat() while providing the proper number of steps per epoch within fit:

train_dataset = tf.data.Dataset.from_tensor_slices((x, y))

BATCH_SIZE = 64
BUFFER_SIZE = 10000

train_dataset = train_dataset.shuffle(BUFFER_SIZE).repeat().batch(BATCH_SIZE)

# Steps per epoch should cover all samples (round up for a partial final batch).
if num_samples % BATCH_SIZE != 0:
    parallel_steps = num_samples // BATCH_SIZE + 1
else:
    parallel_steps = num_samples // BATCH_SIZE

# This `fit` call will be distributed on 2 GPUs.
# Since the batch size is 64, each GPU will process 32 samples.
parallel_model.fit(train_dataset, epochs=10, steps_per_epoch=parallel_steps)

@ghost

ghost commented Sep 2, 2019

Any news on this topic?
I have the same issue.

@samkitjain

I am having a similar issue. @Path-A's solution works like a charm, thanks! I tried it and can confirm it works.

@kkanellis

kkanellis commented Oct 12, 2019

I had a similar issue, and setting drop_remainder=True in the tf.data.Dataset.batch method worked for me.
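
A sketch of that suggestion (the dataset, sizes, and model are placeholders): dropping the final partial batch keeps the batch count per pass exact, so the iterator doesn't come up one batch short.

ds = dataset.shuffle(10000).batch(64, drop_remainder=True)
model.fit(ds, epochs=10)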

@menatte

menatte commented Dec 9, 2019

Any updates? I experience the same with MultiWorkerMirroredStrategy.

.repeat() plus setting the correct number of steps works as a workaround. Nevertheless, it'd be nicer if one could just use the entire validation set without having to provide that number.

@davidlrobinson

I get the same issue when using tensorflow.keras.preprocessing.image.ImageDataGenerator with model.fit() and specifying a steps_per_epoch greater than the length of the generator.
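
A sketch of that constraint (the directory path and image size are placeholders): with a Keras image generator, steps_per_epoch should stay at or below len(generator), the number of batches it yields per pass.

from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rescale=1.0 / 255)
train_gen = datagen.flow_from_directory(
    "data/train", target_size=(224, 224), batch_size=32)

# len(train_gen) == ceil(num_images / batch_size)
model.fit(train_gen, steps_per_epoch=len(train_gen), epochs=10)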

@wang1ang

wang1ang commented Apr 7, 2020

Got the same issue on TF 2.1.0

@jackd
Contributor

jackd commented Apr 26, 2020

This still occurs in 2.2 (tf-nightly) if your epochs have varying lengths. I accept this is a rare occurrence, but e.g. for graph neural networks, computation / memory requirements are generally dependent on the number of nodes, so batches can have dynamic sizes to accommodate this, which can lead to slightly varying numbers of batches per epoch. If epochs beyond the first are even one step shorter than the first epoch, this issue still arises.

@liqinglin54951

liqinglin54951 commented Sep 18, 2020

Solution: put .repeat(epochs) before .batch(batch_size).

https://blog.csdn.net/Linli522362242/article/details/108396485
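
A sketch of that ordering (x, y, epochs, and batch_size are placeholders): calling repeat() before batch() lets a final partial pass be topped up from the next repetition, so no short batch cuts the epoch off early.

ds = (tf.data.Dataset.from_tensor_slices((x, y))
      .shuffle(1000)
      .repeat(epochs)      # repeat first ...
      .batch(batch_size))  # ... then batch

model.fit(ds, epochs=epochs, steps_per_epoch=len(x) // batch_size)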

@sorushsaghari

import numpy as np
from pandas import read_csv

def gen():
    df = read_csv(train_file_path,
                  dtype=data_types, chunksize=2048)
    for d in df:
        y = d['like_gt']
        d.drop(['tweet_id', 'engaging_id', 'reply_gt', 'retweet_gt',
                'quote_gt', 'like_gt'], axis=1, inplace=True)

        yield np.asarray(d), np.asarray(y)

model.fit(gen(), batch_size=64, epochs=5)

What's the problem with this code? It skips all epochs after the first.
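
A sketch that applies the generator-factory pattern discussed earlier in the thread to this snippet (column names, dtypes, and chunk size are taken from the code above): from_generator re-creates the generator on every pass, so fit() can run more than one epoch.

import tensorflow as tf

def make_gen():
    for d in read_csv(train_file_path, dtype=data_types, chunksize=2048):
        y = d['like_gt']
        d = d.drop(['tweet_id', 'engaging_id', 'reply_gt', 'retweet_gt',
                    'quote_gt', 'like_gt'], axis=1)
        yield np.asarray(d, dtype=np.float32), np.asarray(y, dtype=np.float32)

# Each yielded chunk already acts as a batch of up to 2048 rows.
ds = tf.data.Dataset.from_generator(
    make_gen,
    output_types=(tf.float32, tf.float32),
    output_shapes=((None, None), (None,)))

model.fit(ds, epochs=5)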

@eromoe

eromoe commented Nov 24, 2020

tf.data.Dataset.from_generator has a big problem when used with the functional API:
it force-converts a list of tensors into a single tensor.
For example, I have a multi-input model:

import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

data = pd.DataFrame(np.random.uniform(size=(1000,3)), columns=['Sales', 'SalesDiff7', 'SalesAggMean7'])

multi_inputs = []
multi_outputs = []
window_size = 1

for i in range(data.shape[1]):
    ti = keras.Input(shape=(window_size, 1), name=f't{i}')
    tlstm = layers.LSTM(32)(ti)
    tp = keras.layers.Dense(units=1)(tlstm)
    multi_inputs.append(ti)
    multi_outputs.append(tp)
    
r = tf.concat(multi_outputs, -1)
c = keras.layers.Flatten()(r)
result = keras.layers.Dense(units=1)(c)

model = keras.Model(
    inputs=multi_inputs,
    outputs=result,
)

Here, the model needs a list of 3 tensors as input.

But Dataset.map returns a single tensor rather than a list:

def split_multi_window(features):
  inputs = features[:, slice(0, 1, None), :]
  inputs = tf.split(inputs, num_or_size_splits=features.shape[-1], axis=len(features.shape)-1)
    
  labels = features[:, slice(1, None, None), slice(0,1, None) ]
  return inputs, labels

data = pd.DataFrame(np.random.uniform(size=(1000,3)), columns=['Sales', 'SalesDiff7', 'SalesAggMean7']).to_numpy()
ds = tf.keras.preprocessing.timeseries_dataset_from_array(
  data=data,
  targets=None,
  sequence_length=1,
  sequence_stride=1,
  shuffle=True,
  batch_size=32,)

ds2  = ds.map(split_multi_window)
ds2
#  <MapDataset shapes: ((3, None, None, 1), (None, None, 1)), types: (tf.float64, tf.float64)>

Trying to reformat the dataset:

ds2 = tf.data.Dataset.from_generator(lambda : ((list(x), y) for x, y in ds2), (list, tf.float32))

raises the error: TypeError: Cannot convert value <class 'list'> to a TensorFlow DType.

@mevol

mevol commented Mar 4, 2021

I think the issue with not having enough data for the last batch still exists. For me it happens when running model.predict(), as my data is passed in through a generator. Here is my problem:

len(X_test) = 567
batch_size = 5
number of steps = 567 / 5 = 113.4

With this calculation I get 113 steps, so the 0.4 remainder, i.e. the last two samples, is ignored. This causes the warning below, which is correct:

113/113 [============================>.] - ETA: 0sWARNING:tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least steps_per_epoch * epochs batches (in this case, 113.4 batches). You may need to use the repeat() function when building your dataset.
113/113 [============================>.] - 7s 59ms/step

When I increase the step count to 114 by applying steps=math.ceil(len(X_test) / batch_size),
the new step count is still not honoured and only 113 of 114 steps are executed. See the warning below:

113/114 [============================>.] - ETA: 0sWARNING:tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least steps_per_epoch * epochs batches (in this case, 114 batches). You may need to use the repeat() function when building your dataset.
113/114 [============================>.] - 7s 60ms/step

I don't see these problems when running model.fit() with steps_per_epoch and validation_steps set.

I am using Singularity container: tensorflow_2.3.2-gpu-jupyter.sif

@samijaba

I got the same issue training RetinaNet. I resolved it by calling .repeat() on the datasets passed to fit():

train_steps_per_epoch = dataset_info.splits["train[:95%]"].num_examples // batch_size
valid_steps_per_epoch = dataset_info.splits["train[95%:]"].num_examples // batch_size

train_steps = 4 * 100000
epochs = train_steps // train_steps_per_epoch
print(valid_steps_per_epoch)

import random

random.seed(10)
history = model.fit(
    train_dataset.repeat(),
    batch_size=batch_size,
    validation_data=val_dataset.repeat(),
    steps_per_epoch=train_steps_per_epoch,
    validation_steps=valid_steps_per_epoch,
    epochs=epochs,
    callbacks=callbacks_list,
    verbose=1,
)

@zhenghh04

I encountered the same issue. I had a Python generator gen and called

tf.data.Dataset.from_generator(lambda: gen, ....)

which does not support repeat. But if I define the lambda outside the call, as in

gen_lambda = lambda: gen
tf.data.Dataset.from_generator(gen_lambda, ....)

the issue is solved; that dataset supports the repeat method.

@samijaba

The issue was resolved in my case. The problem was that my dataset contained some images with no labels: the number of generated samples was smaller than the size of the set, so the generator tried to generate data that doesn't exist!

@jayagami

jayagami commented Oct 10, 2021

Providing a case that may help somebody.

I created a dataset with image_dataset_from_directory, initialized with batch_size.
Then I passed steps_per_epoch and validation_steps into model.fit, and what I got was early stopping.

The right way is to remove the steps_per_epoch and validation_steps parameters from model.fit, because the number of steps is already determined by the batch_size passed to image_dataset_from_directory (see the sketch below).

Chaos......
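
A sketch of that setup ("data/train", "data/val", and the image size are placeholders): image_dataset_from_directory already yields batches, so fit() infers the number of steps from the dataset itself.

train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    "data/train", image_size=(224, 224), batch_size=32)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    "data/val", image_size=(224, 224), batch_size=32)

# No steps_per_epoch / validation_steps: each epoch runs until the dataset is exhausted.
model.fit(train_ds, validation_data=val_ds, epochs=10)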
