
Conversation

@EndingCredits
Contributor

Added the following layers (and tests), each sketched in code after this list:

* global_pool_1d: Not much to this, just a simple max/mean reduce across all vectors in a sequence. Useful for getting a single vector representation of a sequence/set; it can also be thought of as a kind of global attention.

* linear_set_layer: A simple 1d convolution across all elements in a sequence (similar layers are already used in the repo), with the added ability to parameterise the transformation with a (learned) sequence 'context' vector. Currently this is done by simply concatenating the inputs with the context.

* ravanbakhsh_set_layer: The layer type used in https://arxiv.org/abs/1611.04500.
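
A minimal sketch of the three layers, assuming TF 1.x-style APIs (tf.layers.dense, tf.reduce_max, etc.); the names match the PR but the exact signatures and masking details here are illustrative, not the committed implementations:

```python
import tensorflow as tf

def global_pool_1d(inputs, pooling_type="MAX", mask=None):
  """Reduce [batch, length, depth] to [batch, depth] by max or mean."""
  if mask is not None:
    mask = tf.expand_dims(mask, axis=2)  # [batch, length] -> [batch, length, 1]
  if pooling_type == "MAX":
    if mask is not None:
      # Push masked positions far down so they never win the max.
      inputs += (mask - 1.0) * 1e9
    return tf.reduce_max(inputs, axis=1)
  # Mean pooling: sum over valid positions and divide by their count.
  if mask is not None:
    inputs *= mask
    return tf.reduce_sum(inputs, axis=1) / tf.reduce_sum(mask, axis=1)
  return tf.reduce_mean(inputs, axis=1)

def linear_set_layer(depth, inputs, context=None):
  """Position-wise linear map (a 1x1 conv); optionally conditioned on a
  per-sequence context vector by concatenating it to every element."""
  if context is not None:
    # Broadcast the [batch, depth_c] context across the length dimension.
    context = tf.tile(tf.expand_dims(context, 1),
                      [1, tf.shape(inputs)[1], 1])
    inputs = tf.concat([inputs, context], axis=2)
  return tf.layers.dense(inputs, depth, activation=tf.nn.relu)

def ravanbakhsh_set_layer(depth, inputs):
  """Permutation-equivariant layer from arXiv:1611.04500: each element is
  shifted by the set-wide max before a shared linear map."""
  pooled = tf.reduce_max(inputs, axis=1, keep_dims=True)
  return tf.layers.dense(inputs - pooled, depth, activation=tf.nn.tanh)
```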

These layers are permutation invariant and hence suitable for applications where inputs are given as sets, but they may be useful for sequence tasks as well.
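
As a quick sanity check of the invariance claim, permuting the elements of the input should leave the pooled representation unchanged; a small illustrative test using the sketch above (again TF 1.x style):

```python
import numpy as np

x = np.random.randn(2, 5, 4).astype(np.float32)
perm = np.random.permutation(5)

inputs = tf.placeholder(tf.float32, [None, 5, 4])
pooled = global_pool_1d(inputs)  # max-pooling head, no mask

with tf.Session() as sess:
  a = sess.run(pooled, {inputs: x})
  b = sess.run(pooled, {inputs: x[:, perm, :]})
  print(np.allclose(a, b))  # True: pooling ignores element order
```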

Also added an example model, transformer_alt, which replaces the self-attention layers in the transformer with two different kinds of modules composed of the layers described above. I have no idea how well it performs (although similar architectures I tested previously seemed to do only slightly worse than the full transformer).
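
For a rough idea of what such a module could look like, here is a hypothetical composition (not the PR's actual transformer_alt code): a self-attention sublayer is replaced by set layers conditioned on a globally pooled summary, so every position still sees global context without pairwise attention:

```python
def composed_set_block(inputs, hidden_depth, output_depth, mask=None):
  # Hypothetical block, for illustration only.
  # Summarise the whole sequence into one context vector...
  context = global_pool_1d(inputs, pooling_type="MEAN", mask=mask)
  # ...then transform each element conditioned on that global summary.
  hidden = linear_set_layer(hidden_depth, inputs, context=context)
  return linear_set_layer(output_depth, hidden)
```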

N.B.: I have not extensively tested these layers in t2t, so it's entirely possible that they're not perfectly functional (particularly with respect to masking).

Also changed two lines in the tests to silence numpy warnings.

@lukaszkaiser (Contributor) left a comment

Some small changes might be needed in another PR, but this looks good enough for merging.

@lukaszkaiser lukaszkaiser merged commit d827bb2 into tensorflow:master Jul 11, 2017
@lukaszkaiser
Contributor

Thanks! Let's get the model clear and training soon.

