
Add samediff layer support (forward only, no output layers) #4675

Merged

AlexDBlack merged 34 commits into master from ab_samedifflayer_fwd on Feb 21, 2018

Conversation

2 participants
AlexDBlack (Member) commented Feb 19, 2018

Starting from #4426: this removes the backprop components and output/loss layers from that PR; I'll tackle those components separately.

Adds support for SameDiff layers:

  • Forward pass only for now, and non-output layers only. Backprop is partially implemented, but disabled for now.
  • I've tried to keep everything in a single class: implementing a SameDiff layer requires only extending BaseSameDiffLayer, and that's it (no separate implementation or parameter initializer classes, for example).
    • Two complete examples of SameDiff layers can be found in this PR (both in the test directory): SameDiffConv and SameDiffDense. Parameter initialization and the forward pass match the corresponding DL4J layers.
    • See also MinimalSameDiffDense for a minimal version (the fewest methods required to be implemented); a rough sketch in that spirit follows this list.
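
For orientation, here is a rough sketch of what a minimal layer in the spirit of MinimalSameDiffDense might look like. This is an illustration under assumptions, not the merged code: defineLayer's signature is taken from the excerpt reviewed later in this thread, while weightKeys, biasKeys, and paramShapes are method names mentioned in the review below; their exact signatures (and the SDVariable accessors used) are guesses.

    import java.util.*;

    import org.nd4j.autodiff.samediff.SDVariable;
    import org.nd4j.autodiff.samediff.SameDiff;

    // Sketch only: defineLayer matches the signature quoted later in this
    // thread; weightKeys/biasKeys/paramShapes are named in the review below,
    // but their exact signatures here are assumptions.
    public class SketchSameDiffDense extends BaseSameDiffLayer {
        private final int nIn;
        private final int nOut;

        public SketchSameDiffDense(int nIn, int nOut) {
            this.nIn = nIn;
            this.nOut = nOut;
        }

        @Override
        public List<String> defineLayer(SameDiff sameDiff, SDVariable layerInput,
                                        Map<String, SDVariable> paramTable) {
            // out = input * W + b (identity activation, for brevity)
            SDVariable w = paramTable.get("W");
            SDVariable b = paramTable.get("b");
            SDVariable out = sameDiff.mmul(layerInput, w).add(b);
            return Collections.singletonList(out.getVarName());
        }

        // Parameter metadata the base class would use for initialization;
        // custom weight init (initializeParams) is omitted here.
        public List<String> weightKeys() { return Collections.singletonList("W"); }

        public List<String> biasKeys() { return Collections.singletonList("b"); }

        public Map<String, int[]> paramShapes() {
            Map<String, int[]> shapes = new LinkedHashMap<>();
            shapes.put("W", new int[]{nIn, nOut});
            shapes.put("b", new int[]{1, nOut});
            return shapes;
        }
    }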

AlexDBlack referenced this pull request on Feb 19, 2018: Add SameDiff Layer #4426 (Closed)

AlexDBlack (Member) commented Feb 19, 2018

Something we will want to support: importing a layer definition (serialized SameDiff, TF, etc.) into a DL4J layer. The current layer API may not easily support this in all cases.

wmeddie (Member) left a comment

Looks pretty good.

One thing I'd like to see is minimizing the number of functions you need to override.

One way we can do that is to define helper methods in BaseSameDiffLayer that you could use when creating the graph. These methods could save the parameter attributes and take care of calculating the values for weightKeys, biasKeys, paramShapes, and initializeParams.

For example, we might be able to define an optional constructor (or overridable method) that takes a pair of arrays of SDVariables for weights and biases and uses their sizes to calculate the shapes (see the sketch after this comment). The Chainer API takes that approach.

I'm not familiar with when the keys and shapes are required though, so it may not be possible to implement SameDiffLayer that way.
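
As a rough illustration of that idea (all class and method names here are hypothetical, not from this PR), a base-class constructor could derive the keys and shapes from the variables themselves, so subclasses wouldn't compute them by hand:

    import java.util.*;

    import org.nd4j.autodiff.samediff.SDVariable;

    // Hypothetical sketch of the suggestion above. getVarName()/getShape()
    // are assumed SDVariable accessors for the variable's name and shape.
    public abstract class ParamAwareSameDiffLayer {
        private final List<String> weightKeys = new ArrayList<>();
        private final List<String> biasKeys = new ArrayList<>();
        private final Map<String, long[]> paramShapes = new LinkedHashMap<>();

        protected ParamAwareSameDiffLayer(SDVariable[] weights, SDVariable[] biases) {
            // Chainer-style: infer keys and shapes from the variables, so
            // subclasses don't override weightKeys/biasKeys/paramShapes by hand
            for (SDVariable w : weights) {
                weightKeys.add(w.getVarName());
                paramShapes.put(w.getVarName(), w.getShape());
            }
            for (SDVariable b : biases) {
                biasKeys.add(b.getVarName());
                paramShapes.put(b.getVarName(), b.getShape());
            }
        }

        public List<String> weightKeys() { return weightKeys; }
        public List<String> biasKeys() { return biasKeys; }
        public Map<String, long[]> paramShapes() { return paramShapes; }
    }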

The inline review comments below refer to this excerpt:

        //No-op constructor for Jackson
    }

    public abstract List<String> defineLayer(SameDiff sameDiff, SDVariable layerInput, Map<String,SDVariable> paramTable);

wmeddie (Member) commented Feb 20, 2018

I felt this would be better defined as a List<SDVariable> instead of List<String>. That way you aren't just throwing away objects, and it's impossible to make a typo.
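
Concretely (illustrative only), the change would be from returning variable names to returning the variables themselves:

    // As declared in the excerpt above: implementations return variable names
    public abstract List<String> defineLayer(SameDiff sameDiff, SDVariable layerInput, Map<String,SDVariable> paramTable);

    // Suggested alternative: return the SDVariables directly, so a mistyped
    // name can't silently break the lookup
    public abstract List<SDVariable> defineLayer(SameDiff sameDiff, SDVariable layerInput, Map<String,SDVariable> paramTable);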

wmeddie (Member) commented Feb 20, 2018

Here's an idea of how it could work (with a working unit test):

https://gist.github.com/wmeddie/809c602311503a2ebb2d2cd957daf09e

AlexDBlack force-pushed the ab_samedifflayer_fwd branch from 01a51a8 to 79af7a8 on Feb 20, 2018

AlexDBlack merged commit f9c10b7 into master on Feb 21, 2018

1 of 3 checks passed:

  • codeclimate: 11 issues to fix
  • continuous-integration/jenkins/pr-merge: This commit is being built
  • continuous-integration/travis-ci/pr: The Travis CI build passed

AlexDBlack deleted the ab_samedifflayer_fwd branch on Feb 21, 2018
