New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New functionality (epoch tracking and constraints) #3957

Merged
merged 9 commits into from Aug 31, 2017

Conversation

Projects
None yet
3 participants
@AlexDBlack
Copy link
Member

AlexDBlack commented Aug 28, 2017

Should be good to go.

  • Adds epoch tracking: #3894
  • Adds constraints (API + implementations) #3898 #2505

@maxpumperla maxpumperla changed the title New functionality New functionality (epoch tracking and constraints) Aug 29, 2017

@huitseeker
Copy link
Contributor

huitseeker left a comment

Nice & clean, looking forward ot the test pass on constraints.

*
* The current epoch count can be obtained using {@code ComputationGraph.getConfiguration().getEpochCount()}
*/
public void incrementEpochCount(){

This comment has been minimized.

@huitseeker

huitseeker Aug 29, 2017

Contributor

Is this thread-safe, could we envision a situation with concurrent setEpochCount calls ?

This comment has been minimized.

@AlexDBlack

AlexDBlack Aug 30, 2017

Member

No, it's not thread safe - but then nothing in MultiLayerNetwork or ComputationGraph is.
Setting can be done directly on the configuration objects.

@JsonTypeInfo(use = JsonTypeInfo.Id.CLASS, include = JsonTypeInfo.As.PROPERTY, property = "@class")
public interface LayerConstraint {

void applyConstraint(Layer layer, int iteration, int epoch);

This comment has been minimized.

@huitseeker

huitseeker Aug 29, 2017

Contributor

I see where the call would come in the optimizer, but again it seems dangerous that this could be applied less than once (or more than once, depending on the constraint) per pass. Now that we're tracking epochs, could we have some epoch checking to make sure two successive calls are idempotent and less than one per epoch log a warning?

This comment has been minimized.

@AlexDBlack

AlexDBlack Aug 30, 2017

Member

The way I'll be implementing it is that it'll get called once per iteration, after updates have been applied - it won't be possible to apply it more than once per iteration (and, even if it does: all of the implementations will give the same result with multiple sequential appliacions.
The epoch (and, iteration) number is just informational, to allow flexibility for users to have behaviour that depends on the number of epochs passed.

@AlexDBlack AlexDBlack requested review from huitseeker and maxpumperla Aug 30, 2017

@maxpumperla
Copy link
Contributor

maxpumperla left a comment

I'm a little confused by the use of isBias in apply, maybe you can clarify, other than that LGTM 👍.

ParamInitializer i = layer.conf().getLayer().initializer();
for(Map.Entry<String,INDArray> e : paramTable.entrySet()){
if(applyToWeights && i.isWeightParam(e.getKey()) || applyToBiases && i.isBiasParam(e.getKey())){
apply(e.getValue(), i.isBiasParam(e.getKey()));

This comment has been minimized.

@maxpumperla

maxpumperla Aug 30, 2017

Contributor

maybe we can replace i.isBiasParam(e.getKey() here simply with true?! It took me a while to realise that it doesn't matter what to use here, but this seems to indicate it has specifically to to with bias, which it doesn't

This comment has been minimized.

@AlexDBlack

AlexDBlack Aug 31, 2017

Member

No - some parameters are neither weights or biases... Consider for example batch normalization statistics, which are technically parameters in DL4J, but neither weights or biases.

*
* @param maxNorm Maximum L2 value
* @param dimensions Dimensions to apply to. For DenseLayer, OutputLayer, RnnOutputLayer, LSTM, etc: this should
* be dimension 1. For CNNs, this should be dimensions [1,2,3] correspending to last 3 of

This comment has been minimized.

@maxpumperla

maxpumperla Aug 30, 2017

Contributor

typo: corresponding

This comment has been minimized.

@AlexDBlack
* @param max Maximum L2 value
* @param min Minimum L2 value
* @param dimensions Dimensions to apply to. For DenseLayer, OutputLayer, RnnOutputLayer, LSTM, etc: this should
* be dimension 1. For CNNs, this should be dimensions [1,2,3] correspending to last 3 of

This comment has been minimized.

@maxpumperla

maxpumperla Aug 30, 2017

Contributor

correspending => corresponding. I think this is "everywhere".

}
}

public abstract void apply(INDArray param, boolean isBias);

This comment has been minimized.

@maxpumperla

maxpumperla Aug 30, 2017

Contributor

afaict isBias is never actually used in the constraints you implemented. Why do we have it?

This comment has been minimized.

@AlexDBlack

AlexDBlack Aug 31, 2017

Member

Removed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment