This repository has been archived by the owner on Jan 19, 2019. It is now read-only.

Multi input time distributed #207

Merged: 8 commits into allenai:master from multi_input_time_distributed on Feb 20, 2017

Conversation

BeckySharp (Contributor):

I'd love you to:

  • check that the tests for the call method are sufficient to really make sure things are ok

  • check the compute_mask method -- I tried to borrow from the TimeDistributedWithMask class and extend it in parallel with how Matt helped me extend the call method of the multi-input one

  • help me write a test for the compute_mask (right now, it's a bunch of copied code from other tests that I half modified before quitting)

  • make sure I didn't ruin things with the move to wrappers/ with separate files...

@@ -0,0 +1,16 @@
from keras.layers import Layer

matt-gardner (Contributor):

No need to put debug in the name. Just call this output_mask.py.

from keras import backend as K
from deep_qa.layers.wrappers.fixed_time_distributed import FixedTimeDistributed


matt-gardner (Contributor):

Only two blank lines here.

@@ -0,0 +1,37 @@
from keras import backend as K
from keras.layers import TimeDistributed

class FixedTimeDistributed(TimeDistributed):

matt-gardner (Contributor):

I think we should call this TimeDistributed, so that if/when Keras fixes its class, we just have to change an import in the model code (does that make sense?). I've been meaning to do this myself, but haven't yet. So can you do the import above as from keras.layers import TimeDistributed as KerasTimeDistributed, and call this class TimeDistributed? This is similar to how we did the Highway layer. And then you'd want to change the name of the file, too.
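
A minimal sketch of the renaming being suggested here, assuming the existing fixes simply move into the renamed class (the docstring and body below are illustrative):

```python
from keras.layers import TimeDistributed as KerasTimeDistributed


class TimeDistributed(KerasTimeDistributed):
    """
    Our fixed version of Keras's TimeDistributed.  If/when Keras fixes
    the upstream class, model code only has to change this import.
    """
    pass  # the fixes from FixedTimeDistributed would live here
```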



class MultiInputTimeDistributed(FixedTimeDistributed):
r"""

matt-gardner (Contributor):

You don't need this r here if you're not doing math.

class MultiInputTimeDistributed(FixedTimeDistributed):
r"""
This class extends the previous TimeDistributed fix (i.e., ``FixedTimeDistributed``)
to allow for inputs that are lists of tensors. Currently, while masking is supported
masking (i.e., if any of the inputs has a mask) is not supported.

matt-gardner (Contributor):

I don't understand what this second sentence is saying.

Inputs:
- probabilities, shape ``(batch, ..., N, ...)``,

matt-gardner (Contributor):

Looks like the rest of this docstring needs updating.
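
For context, a hedged sketch of the usage this class is meant to enable; the inner layer, shapes, and import path below are illustrative assumptions, not taken from the PR:

```python
from keras.layers import Input, Lambda
from deep_qa.layers.wrappers.multi_input_time_distributed import MultiInputTimeDistributed  # assumed path

# Two inputs sharing a leading time dimension (here 5) to distribute over.
first_input = Input(shape=(5, 10))   # (batch, time, N)
second_input = Input(shape=(5, 10))  # (batch, time, N)

# The wrapper should apply the inner layer once per time step, passing it
# the list [(batch, N), (batch, N)] instead of a single tensor.
inner_layer = Lambda(lambda inputs: inputs[0] + inputs[1], output_shape=(10,))
output = MultiInputTimeDistributed(inner_layer)([first_input, second_input])
```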

@overrides
def build(self, input_shape):
num_inputs = len(input_shape)
input_specs = []

matt-gardner (Contributor):

You can just do self.input_spec = [] here, and avoid the assignment on line 42.
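
In other words, something like this (a sketch of the method under review; the InputSpec loop is an assumption about what the rest of the method does):

```python
# Inside MultiInputTimeDistributed:
@overrides
def build(self, input_shape):
    num_inputs = len(input_shape)
    # Assign to self.input_spec directly, rather than building a local
    # input_specs list and copying it over at the end of the method.
    self.input_spec = []
    for shape in input_shape:
        self.input_spec.append(InputSpec(shape=shape))
```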


@overrides
def build(self, input_shape):
num_inputs = len(input_shape)

matt-gardner (Contributor):

You should check for tuple vs. list here, like you do in get_output_shape_for.
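
Something along these lines (a sketch; the error message comes from the snippet that follows, and the rest mirrors the check described in get_output_shape_for):

```python
@overrides
def build(self, input_shape):
    # Accept a single shape tuple or a list of shape tuples, mirroring
    # the check in get_output_shape_for; reject anything else.
    if isinstance(input_shape, tuple):
        input_shape = [input_shape]
    elif not isinstance(input_shape, list):
        raise Exception("Invalid input_shape type.")
    num_inputs = len(input_shape)
```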

raise Exception("Invalid input_shape type.")

@overrides
def compute_mask(self, x, input_mask=None):

matt-gardner (Contributor):

Actually, I think I would just recommend punting on this for now, because we should put the call and other code directly in our TimeDistributed class. Then each of our subclasses that handle masks differently can incorporate the multi-input into their compute_mask methods. The trouble with having a single implementation here is that different subclasses do different mask computations, so you can't really have a single general method.

If you want to include this in this PR, instead of punting on it, you can probably move what you have here to the TimeDistributedWithMask class.
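
A rough sketch of the restructuring being suggested (class bodies are illustrative, not the PR's actual code):

```python
from keras.layers import TimeDistributed as KerasTimeDistributed


class TimeDistributed(KerasTimeDistributed):
    # The multi-input call() and build() logic lives here, in the class
    # everything else subclasses.  There is no single general mask
    # computation, so the base class just drops masks.
    def compute_mask(self, x, input_mask=None):
        return None


class TimeDistributedWithMask(TimeDistributed):
    def compute_mask(self, x, input_mask=None):
        # Each mask-aware subclass folds the multi-input handling into
        # its own compute_mask, since subclasses compute masks differently.
        ...
```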

@matt-gardner (Contributor):

Thanks for working on this, it's a really helpful change.

@matt-gardner (Contributor):

I'm going to try to finish this now, because it'd be easier for me than for you, and so you can get back to doing model design sooner. If I finish this fast enough, I'll work on the suggestions I made to your other PR, too. You've gotten them most of the way there, so it shouldn't take me too long to finish it.

@BeckySharp (Contributor, Author) commented Feb 19, 2017 via email.

@matt-gardner (Contributor):

No need to apologize for not working on a Saturday. It's not expected at all. I'm just trying to help you get back to stuff that's more interesting for you. Improving the library probably isn't the most exciting thing for you to be working on.

Rebecca Sharp and others added 5 commits on February 19, 2017 at 07:07. One commit message reads:
This makes the API cleaner; you can just use TimeDistributed, and it
will handle all of the inputs that you want.  I also moved a couple of
other files around.
@matt-gardner (Contributor):

Ok, I think this is done. It turned out to be more complicated than I originally thought (which makes me happy that I could do it for you, because it would have taken quite a bit of back-and-forth to figure out the issues and their solutions, and that would have just been a distraction from your internship project). But it should work now: you should be able to just use our TimeDistributed class on multiple inputs, with masks, and it should just work.
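
In other words, usage should now look something like this (the import path and layer names are assumptions for illustration):

```python
from deep_qa.layers.wrappers import TimeDistributed  # assumed import path

# One wrapper for every case: single or multiple inputs, masked or not.
wrapped = TimeDistributed(inner_layer)         # inner_layer as in the earlier sketch
output = wrapped([first_input, second_input])  # masks should be handled automatically
```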

@matt-gardner (Contributor) commented Feb 19, 2017:

And I'm going to add you as a reviewer to the PR - can you look over the changes I made? Oh, I can't add you as a reviewer, because you submitted the PR... But you can still look it over =).

@BeckySharp (Contributor, Author):

Thanks again for all your help on this! Looks great to me, and it seems like you cleaned it all up by combining use cases, but then again, take my input with the appropriately sized grain of salt!

@BeckySharp BeckySharp merged commit 241e5d1 into allenai:master Feb 20, 2017
@BeckySharp BeckySharp deleted the multi_input_time_distributed branch February 24, 2017 19:09