Multi input time distributed #207
Multi input time distributed #207
Conversation
@@ -0,0 +1,16 @@ | |||
from keras.layers import Layer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to put debug
in the name. Just call this output_mask.py
.
from keras import backend as K | ||
from deep_qa.layers.wrappers.fixed_time_distributed import FixedTimeDistributed | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only two blank lines here.
@@ -0,0 +1,37 @@ | |||
from keras import backend as K | |||
from keras.layers import TimeDistributed | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two blank lines here. (https://www.python.org/dev/peps/pep-0008/#blank-lines)
from keras import backend as K | ||
from keras.layers import TimeDistributed | ||
|
||
class FixedTimeDistributed(TimeDistributed): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should call this TimeDistributed
, so that if/when Keras fixes its class, we just have to change an import in the model code (does that make sense?). I've been meaning to do this myself, but haven't yet. So can you do the import above as from keras.layers import TimeDistributed as KerasTimeDistributed
, and call this class TimeDistributed
? This is similar to how we did the Highway
layer. And then you'd want to change the name of the file, too.
|
||
|
||
class MultiInputTimeDistributed(FixedTimeDistributed): | ||
r""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need this r
here if you're not doing math.
class MultiInputTimeDistributed(FixedTimeDistributed): | ||
r""" | ||
This class extends the previous TimeDistributed fix (i.e., ``FixedTimeDistributed``) | ||
to allow for inputs that are lists of tensors. Currently, while masking is supported masking (i.e., if any of the inputs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what this second sentence is saying.
has a mask) is not supported. | ||
|
||
Inputs: | ||
- probabilities, shape ``(batch, ..., N, ...)``, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like the rest of this docstring needs updating.
@overrides | ||
def build(self, input_shape): | ||
num_inputs = len(input_shape) | ||
input_specs = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just do self.input_spec = []
here, and avoid the assignment on line 42.
|
||
@overrides | ||
def build(self, input_shape): | ||
num_inputs = len(input_shape) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should check for tuple vs. list here, like you do in get_output_shape_for
.
raise Exception("Invalid input_shape type.") | ||
|
||
@overrides | ||
def compute_mask(self, x, input_mask=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I think I would just recommend punting on this for now, because we should put the call
and other code directly in our TimeDistributed
class. Then each of our subclasses that handle masks differently can incorporate the multi-input into their compute_mask
methods. The trouble with having a single implementation here is that different subclasses do different mask computations, so you can't really have a single general method.
If you want to include this in this PR, instead of punting on it, you can probably move what you have here to the TimeDistributedWithMask
class.
Thanks for working on this, it's a really helpful change. |
I'm going to try to finish this now, because it'd be easier for me than for you, and so you can get back to doing model design sooner. If I finish this fast enough, I'll work on the suggestions I made to your other PR, too. You've gotten them most of the way there, so it shouldn't take me too long to finish it. |
sorry I didn't do this yet -- just wasn't working on ai2 stuff today
…On Sat, Feb 18, 2017 at 8:50 PM, Matt Gardner ***@***.***> wrote:
I'm going to try to finish this now, because it'd be easier for me than
for you, and so you can get back to doing model design sooner. If I finish
this fast enough, I'll work on the suggestions I made to your other PR,
too. You've gotten them most of the way there, so it shouldn't take me too
long to finish it.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#207 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AFInifTthoQaYZl8J7vcMzRkYqK0l4cMks5rd8n6gaJpZM4MEqEw>
.
|
No need to apologize for not working on a Saturday. It's not expected at all. I'm just trying to help you get back to stuff that's more interesting for you. Improving the library probably isn't the most exciting thing for you to be working on. |
This makes the API cleaner; you can just use TimeDistributed, and it will handle all of the inputs that you want. I also moved a couple of other files around.
63acfc1
to
59f97da
Compare
Ok, I think this is done. It turned out it was more complicated than I originally thought it was (which makes me happy that I could do it for you, because it would have taken quite a bit of back-and-forth to figure out the issues and the solutions to them, and that would have just been a distraction for your internship project). But it should work now; you should be able to just use our |
And I'm going to add you as a reviewer to the PR - can you look over the changes I made? Oh, I can't add you as a reviewer, because you submitted the PR... But you can still look it over =). |
thanks again for all your help on this! looks great to me, and seems like you cleaned it all up by combining use cases, but then again, take my input with the appropriately sized grain of salt! |
I'd love you to:
check that the tests for the call method are sufficient to really make sure things are ok
check the compute_mask method -- I tried to borrow from the TD with Mask class and extend it parallel to how Mark helped me extend the call method of the multi Input one
help me write a test for the compute_mask (right now, it's a bunch of copied code from other tests that I half modified before quitting)
make sure I didn't ruin things with the move to wrappers/ with separate files...