Incorporate information about earlier papers using three-way interactions #52

Closed
vdumoulin opened this issue Nov 24, 2017 · 9 comments

Comments

@vdumoulin
Contributor
No description provided.

@vdumoulin
Contributor Author

Roland Memisevic worked on several papers that touch on the idea of gating and three-way interactions.

We should ask him for feedback on that.

@vdumoulin
Contributor Author

His recommendation is to read his review paper as well as the gated softmax paper.

@ethanjperez
Contributor

ethanjperez commented Jan 25, 2018

"Gated Softmax Classification" uses bilinear transformations to represent the three-way interaction between an input (x), latent binary hidden variables (h), and output classes (y). To predict the pre-softmax score of y, you marginalize over all possible combinations of h to combine the various pre-softmax scores given by a class-specific bilinear transformation: (h^T)(W_y)(x). You can use some nice math to do the exact calculation tractably, and you can factor the weights used for the bilinear transformations to significantly decrease the number of parameters and increase regularization. The model gets good results on MNIST-like tasks.

The method seems like an interesting way to use bilinear transformations to condition computation on latent variables (rather than on side information or self-information). It seems worth a sentence/phrase/citation, but not much more at the moment. However, I will also read the other papers you list here; if I find a whole class of methods that condition on latent variables similarly, then we could consider making a subsection on this kind of approach.

Let me know if you want me to make a pull request for this change to the article, or if it's easier for you to incorporate it yourself directly.

@vdumoulin
Contributor Author

It would be more convenient for me if you made the PR and I reviewed it. Thanks!

@ethanjperez
Contributor

Okay, sounds good! I'll make one.

@ethanjperez
Contributor

I made a few notes on Roland's review paper for the portions I've read so far; I'll incorporate them (along with my Gated Softmax notes) into a pull request once I finish reading the review. Here are the notes I have so far (more for myself than anything):

  • "The idea of using multiplicative interactions in vision was introduced about 30 years ago under the terms “mapping units” [1] and “dynamic mappings” [2]."
  • "Our analysis... predicts that the use of squaring non-linearities or multiplicative interactions will be essential in any tasks that involve representing relations."
  • Multiplicative interactions are useful in learning relationships as they are natural in identifying "matches" (think dot product of two feature vectors, logical AND/XNOR, (covariance matrix?) etc.). In contrast, additive interactions are perhaps more natural for carrying out different roles, such as content detection, feature aggregation, logical OR, etc. [Me: Perhaps using both lets you have the best of both worlds?]
  • Gated autoencoders [18], [19] reconstruct an input as a function of some conditioning input.
  • Gated Boltzmann Machine [16] is an RBM whose parameters (and energy function and normalization constant) are a function of some conditioning input. Samples are drawn from conditional distributions, and the model trains in a similar way to an RBM.
  • Emphasizes that the symmetry of multiplicative interactions allows multiple interpretations of what computation is happening (i.e. [latents conditioning transformation over another input] vs. [another input conditioning transformation over latents])
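
To make the gated autoencoder note concrete, here is a minimal numpy sketch of a factored gated autoencoder forward pass (the shapes, the sigmoid on the mapping units, and the names are my assumptions rather than anything taken verbatim from the review):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_autoencoder_forward(x, y, U, V, W):
    """One forward pass of a factored gated autoencoder.

    Reconstructs y conditioned on x through multiplicative (three-way)
    interactions between x, y, and the mapping units h.

    Assumed shapes: x (Dx,), y (Dy,), U (F, Dx), V (F, Dy), W (H, F).
    """
    fx = U @ x                      # factor responses for x
    fy = V @ y                      # factor responses for y
    h = sigmoid(W @ (fx * fy))      # mapping units detect "matches" between fx and fy
    y_hat = V.T @ ((W.T @ h) * fx)  # reconstruct y, gated by x through the factors
    return h, y_hat
```

Swapping the roles of x and y in the reconstruction line (x_hat = U.T @ ((W.T @ h) * fy)) is exactly the symmetry point from the last bullet: the same three-way weights let either input be read as gating a transformation of the other.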

@vdumoulin vdumoulin changed the title Are there older papers that use gating as a side-information fusion mechanism? Incorporate information about earlier papers using three-way interactions Mar 14, 2018
@vdumoulin
Contributor Author

@ethanjperez, as I mentioned to Florian in issue #92, I think we can avoid making the text heavier by integrating those citations into a bibliographic note in the appendix (see, e.g., the CTC article and the relevant portion of its source code).

@ethanjperez
Contributor

ethanjperez commented Apr 3, 2018

@vdumoulin Did we decide to leave the note on Biological Plausibility out? Roland's review paper has this interesting note: "From a biological point of view, multiplicative interactions may also be viewed as a conceptually simple approximation to more complex dendritic computations [60] than the common neuron abstraction used in practically all deep learning models."

[60] K. A. Archie and B. W. Mel, “A model for intradendritic computation of binocular disparity,” Nature Neuroscience, vol. 3, no. 1, pp. 54–63, Jan. 2000.

@vdumoulin
Contributor Author

I wouldn't feel comfortable defending that connection, as I'm not familiar enough with that literature. I think we should leave biological plausibility out.
