Multi io #30

atremblay · 2019-04-26T16:50:03Z

The core modification was to create a new TensorData that would recursively fetch the proper index value and a function that can aggregate batches of output as one big output.

I disabled one test test_disable_batch_size_warning as I am not sure if it's required anymore. I modified _get_batch_size to account for this new input/output structure, but I am not sure what the original intention was.

freud14 · 2019-04-27T20:49:01Z

Hi,
From what I could see, here are a few comments. I might do more later.

Since we put back x, y, we should put back validation_x, validation_y in fit_generator.
Could you remove type documentation in the *_on_batch and *_generator methods since we don't constrain them anyway.
In TensorDataset, there must be cleaner way than with nonlocal to assert the length of the tensors (like return the length in the recursion and asserting that the lengths in the list/tuple are all the same for instance).
In the test for TensorDataset, could you test with not only zeros?

Thank you.
Frédérik

atremblay · 2019-04-29T12:50:20Z

Since we put back x, y, we should put back validation_x, validation_y in fit_generator.

in fit_generator? Do you mean fit? If so, Keras has validation_data that contains a tuple of (x, y). I can change it to validation_{x|y}, we don't have to stick to Keras, but keeping as many similarities as possible is nice.

Could you remove type documentation in the *_on_batch and *_generator methods since we don't constrain them anyway.

We constrain it a little bit, no dictionaries. But it is getting a bit too verbose. I'll remove it. Should I remove it in fit as well?

In TensorDataset, there must be cleaner way than with nonlocal to assert the length of the tensors (like return the length in the recursion and asserting that the lengths in the list/tuple are all the same for instance).

I didn't see an elegant way the first time around to make sure everything everywhere has the same batch size, but I'll look into it again.

In the test for TensorDataset, could you test with not only zeros?

Sure thing.

freud14 · 2019-04-29T15:58:08Z

Since we put back x, y, we should put back validation_x, validation_y in fit_generator.

in fit_generator? Do you mean fit? If so, Keras has validation_data that contains a tuple of (x, y). I can change it to validation_{x|y}, we don't have to stick to Keras, but keeping as many similarities as possible is nice.

Yeah, I meant fit. Hmmm, let me think about it.

Could you remove type documentation in the *_on_batch and *_generator methods since we don't constrain them anyway.

We constrain it a little bit, no dictionaries. But it is getting a bit too verbose. I'll remove it. Should I remove it in fit as well?

We constrain the types in fit, evaluate and predict but we do not in the *_on_batch and *_generator methods as far as I can tell.

In TensorDataset, there must be cleaner way than with nonlocal to assert the length of the tensors (like return the length in the recursion and asserting that the lengths in the list/tuple are all the same for instance).

I didn't see an elegant way the first time around to make sure everything everywhere has the same batch size, but I'll look into it again.

The change you did is fine.

atremblay · 2019-04-29T17:04:05Z

Could you remove type documentation in the *_on_batch and *_generator methods since we don't constrain them anyway.

We constrain it a little bit, no dictionaries. But it is getting a bit too verbose. I'll remove it. Should I remove it in fit as well?

We constrain the types in fit, evaluate and predict but we do not in the *_on_batch and *_generator methods as far as I can tell.

Do we directly constrain it? I may be having a major brain fart, I don't see where it's done. Those types are assumed, but not directly enforced, are they?

freud14 · 2019-04-29T19:07:56Z

Do we directly constrain it? I may be having a major brain fart, I don't see where it's done. Those types are assumed, but not directly enforced, are they?

In the fit, evaluate and predict methods, your _concat function constrains (or assumes if you will) y and TensorDataset constrains x. In the other methods, there are no constraints.

atremblay · 2019-04-29T19:43:02Z

Ahh gotcha! Ok yeah makes sense.

freud14 · 2019-04-29T20:18:19Z

Hi,
Few last points (I think):

In the doc, could you specify that validation_data is a tuple (x, y) and say that they are the same thing as x and y previously described? Right now, the type documentation for validation data is different than x and y.
In predict_generator, could you specify that the generator should not yield the y contrary to the other *_generator methods ? This is because this confuses people when they try to pass a generator that yields (x, y) and there is an error.
In tests/test_utils.py, could you remove the automatically added comment header?
In TensorDataset, just put self._len = _rabbit_hole(self.tensors) instead of two lines.

Thank you.
Frédérik Paradis

atremblay · 2019-04-29T21:47:26Z

Sure thing. And damn vim plugin, a bit too eager on the file header.

freud14 · 2019-04-30T13:41:44Z

Pull request merged! I might be a good idea to try make a pull request to PyTorch with the extended version of the TensorDataset class.

Merci bien pour ta collaboration.

Alexis Tremblay added 7 commits April 24, 2019 08:00

Splitting input and output in all methods and functions. Tests Passed

4f9b9bb

More tests

dce5af0

MOAR tests

4d1e795

Test for TensorDataset

7b40ef1

Documentation change and pylint

81c9c50

add _concat tests

25045c9

A bit of pylint

699a2a7

change some tests in TensorDataset to include ranges

677cddf

Alexis Tremblay added 2 commits April 29, 2019 09:28

Change TensorDataset length assertion

a1563ec

Change Tensordataset tests for real this time

ba3f323

Alexis Tremblay added 2 commits April 29, 2019 15:34

Chance documentation

8d7ddbf

Pylint

8a9cc4c

Following comment on PR

de5a08f

freud14 merged commit 804cb97 into GRAAL-Research:master Apr 30, 2019

atremblay deleted the multi_IO branch April 30, 2019 13:41

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Multi io #30

Multi io #30

atremblay commented Apr 26, 2019

freud14 commented Apr 27, 2019

atremblay commented Apr 29, 2019

freud14 commented Apr 29, 2019 •

edited

atremblay commented Apr 29, 2019

freud14 commented Apr 29, 2019

atremblay commented Apr 29, 2019

freud14 commented Apr 29, 2019

atremblay commented Apr 29, 2019

freud14 commented Apr 30, 2019

Multi io #30

Multi io #30

Conversation

atremblay commented Apr 26, 2019

freud14 commented Apr 27, 2019

atremblay commented Apr 29, 2019

freud14 commented Apr 29, 2019 • edited

atremblay commented Apr 29, 2019

freud14 commented Apr 29, 2019

atremblay commented Apr 29, 2019

freud14 commented Apr 29, 2019

atremblay commented Apr 29, 2019

freud14 commented Apr 30, 2019

freud14 commented Apr 29, 2019 •

edited