
Add samediff layer support (forward only, no output layers) #4675

Merged
merged 34 commits into master from ab_samedifflayer_fwd on Feb 21, 2018

Conversation

@AlexDBlack (Contributor) commented Feb 19, 2018

Starting from #4426: this removes the backprop components and output/loss layers... I'll tackle those components separately.

Adds support for SameDiff layers:

  • Forward pass only for now, and non-output layers only. Backprop is partially implemented, but disabled for now.
  • I've tried to keep everything to a single class: implementing a SameDiff layer requires only extending BaseSameDiffLayer, and that's it (no separate implementation or param initializer classes, for example).
    • 2 complete examples of samediff layers can be found in this PR (both in the test directory) - SameDiffConv and SameDiffDense. Param initialization and forward pass match the DL4J layers.
    • See also MinimalSameDiffDense for a minimal version (the fewest methods required to be implemented); a rough sketch of that shape follows this list.
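For a concrete sense of the single-class approach, here is a rough sketch of what a minimal dense samediff layer might look like under this API. This is not the actual MinimalSameDiffDense from the PR: defineLayer matches the signature quoted later in this thread, but the weightKeys/biasKeys/paramShapes hooks are assumptions based on the method names discussed in the review below.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;

// Sketch only -- the parameter hooks (weightKeys/biasKeys/paramShapes) are
// assumed from the review discussion below, not the PR's exact signatures.
public class SketchSameDiffDense extends BaseSameDiffLayer {

    private final int nIn;
    private final int nOut;

    public SketchSameDiffDense(int nIn, int nOut) {
        this.nIn = nIn;
        this.nOut = nOut;
    }

    @Override
    public List<String> defineLayer(SameDiff sameDiff, SDVariable layerInput,
                                    Map<String, SDVariable> paramTable) {
        SDVariable w = paramTable.get("W");
        SDVariable b = paramTable.get("b");
        // out = sigmoid(input * W + b); activation fixed here for brevity
        SDVariable out = sameDiff.sigmoid(layerInput.mmul(w).add(b));
        return Collections.singletonList(out.getVarName());
    }

    // Assumed hooks telling DL4J which parameters this layer owns:
    public List<String> weightKeys() { return Collections.singletonList("W"); }
    public List<String> biasKeys()   { return Collections.singletonList("b"); }

    public Map<String, int[]> paramShapes() {
        Map<String, int[]> shapes = new LinkedHashMap<>();
        shapes.put("W", new int[]{nIn, nOut});
        shapes.put("b", new int[]{1, nOut});
        return shapes;
    }
}
```

Parameter initialization (initializeParams) is omitted here; the real examples in the PR initialize params to match the corresponding DL4J layers.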

@AlexDBlack (Contributor, Author) commented:

Something we will want to support: importing a layer definition (serialized SameDiff, TF, etc) into a DL4J layer. The current layer API may not easily support this in all cases.

@wmeddie (Contributor) left a review:

Looks pretty good.

One thing I'd like to see is minimizing the number of functions you need to override.

One way we could do that is to define helper methods in BaseSameDiffLayer that you could use when creating the graph. These methods could save the parameter attributes and handle calculating the values for weightKeys, biasKeys, paramShapes, and initializeParams.

For example, we might be able to define an optional constructor (or overridable method) that takes a pair of arrays of SDVariables for weights and biases and use their sizes to calculate the shapes. The Chainer API takes that approach.

I'm not familiar with when the keys and shapes are required though, so it may not be possible to implement SameDiffLayer that way.
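To make that proposal concrete, here is a hypothetical sketch of such a base-class constructor. Every name and signature below is invented for illustration; it is not code from the PR, just the shape of the Chainer-style idea described above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

import org.nd4j.autodiff.samediff.SDVariable;

// Hypothetical: derive parameter keys and shapes from the variables themselves,
// so subclasses would not need to override weightKeys()/biasKeys()/paramShapes().
public abstract class SketchSameDiffLayerBase {

    protected final List<String> weightKeys = new ArrayList<>();
    protected final List<String> biasKeys = new ArrayList<>();
    protected final Map<String, int[]> paramShapes = new LinkedHashMap<>();

    protected SketchSameDiffLayerBase(SDVariable[] weights, SDVariable[] biases) {
        for (SDVariable w : weights) {
            weightKeys.add(w.getVarName());
            paramShapes.put(w.getVarName(), w.getShape());
        }
        for (SDVariable b : biases) {
            biasKeys.add(b.getVarName());
            paramShapes.put(b.getVarName(), b.getShape());
        }
    }
}
```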

//No op constructor for Jackson
}

public abstract List<String> defineLayer(SameDiff sameDiff, SDVariable layerInput, Map<String,SDVariable> paramTable);
@wmeddie (Contributor) commented Feb 20, 2018 on the defineLayer declaration above:

I felt this would be better defined as a List<SDVariable> instead of List<String>. That way you aren't just throwing away objects, and it's impossible to make a typo.
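In other words, the abstract method would become something like the following (a sketch of the suggestion, not code from the PR):

```java
// Return the created output variables directly rather than their names,
// so nothing can be mistyped or silently thrown away.
public abstract List<SDVariable> defineLayer(SameDiff sameDiff, SDVariable layerInput,
                                             Map<String, SDVariable> paramTable);
```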

@wmeddie (Contributor) replied:

Here's an idea of how it could work (and a working unit test):

https://gist.github.com/wmeddie/809c602311503a2ebb2d2cd957daf09e

@AlexDBlack merged commit f9c10b7 into master on Feb 21, 2018
@AlexDBlack deleted the ab_samedifflayer_fwd branch on February 21, 2018 at 03:30