Use a data generator with the federated framework when training on a large dataset #793

Closed
zm17943 opened this issue Jan 21, 2020 · 9 comments

@zm17943

zm17943 commented Jan 21, 2020

Hi!

I was very glad to adapt my own data and model to the federated interfaces, and the training converged!

Now I am stuck on an issue: in an image classification task, the whole dataset is extremely large, so it cannot be stored in a single federated_train_data object or loaded into memory all at once. I need to load the data from the hard disk into memory in batches on the fly, the way one would use Keras model.fit_generator instead of model.fit when dealing with large data.

As I understand it, in the iterative_process shown in the image classification tutorial, the model is fit on a fixed set of data. Is there any way to adjust the code so that it fits on a data generator? I have looked into the source code but am still quite confused. I would be incredibly grateful for any hints.
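For concreteness, here is a minimal sketch of what streaming from disk could look like, assuming a roughly 2020-era TF/TFF API. The image shape, client paths, and the stand-in generator are hypothetical; the point is that iterative_process.next only needs a list of tf.data.Dataset objects, and a dataset built with tf.data.Dataset.from_generator pulls examples lazily instead of holding everything in memory:

```python
import numpy as np
import tensorflow as tf

IMG_SHAPE = (64, 64, 3)  # assumed image size, purely for illustration

def make_client_dataset(client_dir, batch_size=20):
    """Builds a tf.data.Dataset that streams one client's examples lazily."""
    def gen():
        # Stand-in for reading and decoding image files from `client_dir`;
        # in practice each iteration would load one example from disk.
        for _ in range(100):
            yield np.random.rand(*IMG_SHAPE).astype("float32"), np.int32(0)

    return (tf.data.Dataset.from_generator(
                gen,
                output_types=(tf.float32, tf.int32),
                output_shapes=(IMG_SHAPE, ()))
            .batch(batch_size))

# federated_train_data is just a list of per-client datasets; nothing here
# forces the whole corpus into memory at once.
federated_train_data = [make_client_dataset(f"clients/{i}") for i in range(2)]
# state, metrics = iterative_process.next(state, federated_train_data)
```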

@jkr26
Collaborator

jkr26 commented Jan 22, 2020

Thanks for asking on SO! Dropping a link here, will close when there is an accepted answer.

@jkr26 jkr26 self-assigned this Jan 22, 2020
@zm17943
Author

zm17943 commented Feb 10, 2020

Hi! Thanks so much for your reply on SO! Unfortunately I still can't work it out. Do you know any way to change the model training process on local clients?

@ZacharyGarrett
Collaborator

@zm17943 could you take a look at the example in this StackOverflow answer? It does not load all clients at once; only the clients participating in one round of computation are used at a time, and each tf.data.Dataset also loads only a subset of its data, as needed, for training.
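Roughly, the per-round pattern described there might look like the sketch below (the client IDs, the placeholder loader, and the round count are made up for illustration); only the sampled clients' datasets are constructed in any given round:

```python
import random
import numpy as np
import tensorflow as tf

CLIENTS_PER_ROUND = 5
all_client_ids = [f"client_{i}" for i in range(100)]  # placeholder IDs

def make_client_dataset(client_id, batch_size=20):
    # Placeholder loader: in practice this would read `client_id`'s files from disk.
    def gen():
        for _ in range(50):
            yield np.random.rand(64, 64, 3).astype("float32"), np.int32(0)
    return tf.data.Dataset.from_generator(
        gen,
        output_types=(tf.float32, tf.int32),
        output_shapes=((64, 64, 3), ())).batch(batch_size)

# state = iterative_process.initialize()
for round_num in range(10):
    # Only the sampled clients' data is materialized for this round.
    sampled = random.sample(all_client_ids, CLIENTS_PER_ROUND)
    round_data = [make_client_dataset(cid) for cid in sampled]
    # state, metrics = iterative_process.next(state, round_data)
```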

@zm17943
Author

zm17943 commented Mar 2, 2020

Thank you! I have looked into the StackOverflow answer and adjusted my code to load one client at a time. However, I am still unsure about real-time data augmentation; for example, can I use tf.data.Dataset.from_generator to load data into TFF?
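One possible shape for this, as an illustrative sketch rather than a confirmed recipe (example_generator and the image shape are hypothetical): build the dataset with tf.data.Dataset.from_generator and attach the augmentation with .map so it runs as the data streams:

```python
import numpy as np
import tensorflow as tf

def example_generator(client_id):
    # Placeholder for a generator that reads this client's images from disk.
    for _ in range(50):
        yield np.random.rand(64, 64, 3).astype("float32"), np.int32(0)

def augment(image, label):
    # Real-time augmentation applied as examples stream through the pipeline.
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_brightness(image, max_delta=0.1)
    return image, label

def make_augmented_dataset(client_id, batch_size=20, batches_per_round=10):
    ds = tf.data.Dataset.from_generator(
        lambda: example_generator(client_id),
        output_types=(tf.float32, tf.int32),
        output_shapes=((64, 64, 3), ()))
    # .take(...) keeps the dataset finite, which matters for TFF training.
    return ds.map(augment).batch(batch_size).take(batches_per_round)
```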

@aslfu

aslfu commented Mar 2, 2020

Hi, I tried to use tf.data.Dataset.from_generator to train a federated model, but this step took forever:

iterative_process.next

I tried reducing the batch_size and the number of trainable parameters to speed it up, but it still hangs. How can I diagnose the training process?

@zm17943
Author

zm17943 commented Mar 2, 2020

> Hi, I tried to use tf.data.Dataset.from_generator to train a federated model, but this step took forever: iterative_process.next. I tried reducing the batch_size and the number of trainable parameters to speed it up, but it still hangs. How can I diagnose the training process?

I have exactly the same issue!

@jkr26
Collaborator

jkr26 commented Mar 2, 2020

One thing that I might investigate here: try adding a ds.take(1) to your dataset constructors, or raise a StopIteration exception from your generator after yielding e.g. a single element.

If TFF is given an infinite tf.data.Dataset, it will likely reduce forever, continually pulling elements out of the dataset.

I am thinking this way because if your generator never raises StopIteration, I believe from_generator will treat the dataset as infinite. The docs just linked reference Python's iterator protocol, which states:

If there are no further items, raise the StopIteration exception.

which therefore implies: if there is no StopIteration, there are further items.
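An illustrative sketch of both suggestions (the generator here is a made-up stand-in for whatever is actually feeding from_generator):

```python
import numpy as np
import tensorflow as tf

def my_infinite_generator():
    # Placeholder for a generator that never terminates.
    while True:
        yield np.random.rand(64, 64, 3).astype("float32"), np.int32(0)

ds = tf.data.Dataset.from_generator(
    my_infinite_generator,
    output_types=(tf.float32, tf.int32),
    output_shapes=((64, 64, 3), ()))

# Option 1: bound the dataset itself, regardless of what the generator does.
finite_ds = ds.batch(20).take(10)  # TFF stops reducing after 10 batches

# Option 2: make the generator itself finite.
def finite_generator(max_examples=100):
    for i, example in enumerate(my_infinite_generator()):
        if i >= max_examples:
            return  # ends iteration (raises StopIteration under the hood)
        yield example
```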

@aslfu

aslfu commented Mar 3, 2020

Yes, I am using ImageDataGenerator to create my generator, and it produces batches indefinitely.
How can I add StopIteration to ImageDataGenerator? Or do I need to use a different generator?
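One possible workaround, sketched under the assumption of a directory-per-class layout and a 2020-era Keras API (the path and batch counts are placeholders): wrap the Keras iterator in a plain Python generator that stops after a fixed number of batches, so the resulting tf.data.Dataset is finite:

```python
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

STEPS_PER_ROUND = 10  # how many batches one client contributes per round

def make_client_dataset(client_dir):
    # `client_dir` is a placeholder path with one subdirectory per class.
    keras_iter = ImageDataGenerator(rescale=1.0 / 255).flow_from_directory(
        client_dir, target_size=(64, 64), batch_size=20)

    def bounded_generator():
        # ImageDataGenerator yields batches forever, so stop explicitly
        # after a fixed number of batches; returning ends the iteration.
        for _ in range(STEPS_PER_ROUND):
            yield next(keras_iter)

    return tf.data.Dataset.from_generator(
        bounded_generator,
        output_types=(tf.float32, tf.float32),
        output_shapes=((None, 64, 64, 3), (None, None)))
```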

@jkr26
Collaborator

jkr26 commented Mar 4, 2020

We seem to be getting this question through multiple channels, so for ease of discoverability we would prefer to consolidate on Stack Overflow. Please see the discussion here, and open a question there if that does not suit your needs.

Thanks!

@jkr26 jkr26 closed this as completed Mar 6, 2020