Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generic cleanup rest of framework, activations and initializers #231

Open
wants to merge 24 commits into
base: master
Choose a base branch
from

Conversation

JimClarke5
Copy link
Contributor

This PR cleans up the generics in activations and initializers.
Basically the generic was removed from the class declaration and moved to the call() method.

For example, Activation:

public abstract class Activation {
....
    public abstract <T extends TNumber> Operand<T> call(Operand<T> input);
}

and Initializer:

public interface Initializer {
    <T extends TType> Operand<T> call(Operand<TInt64> dims, Class<T> type);
}

Copy link
Collaborator

@karllessard karllessard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @JimClarke5 , I've left a few minor comments on this.

}

/**
* Gets the calculation operation for the activation.
*
* @param input the input tensor
* @return The operand for the activation
* @param <T> the data type of the input and result
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you reorder these tags so that the @param are all before @return? I think the generic parameter should be the first one, I'm not sure if there is a standard for this. Or, if you prefer, I would also be totally comfortable not documenting at all the generic parameter of these method, which is quite explicit. Your choice.

This comment applies to other places in that PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did find this document from Oracle, How to Write Doc Comments for the Javadoc Tool.
@param should be before @return, but it doesn't mention anything about ordering the generic within the params.
It just says Multiple @param tags should be listed in argument-declaration order. This makes it easier to visually match the list to the declaration.

I will fix by moving the @param<T> to before the @return

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Javadoc will warn on any missing parameters, generic or otherwise, so I think it's better to have all parameters documented.

public <T extends TType> Operand<T> call(Operand<TInt64> dims, Class<T> type) {
if (!TNumber.class.isAssignableFrom(type)) {
throw new IllegalArgumentException("Tensor type must be numeric: " + type.getSimpleName());
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just double-checking if that's correct, the initial generic at the class level was bound to TFloating while now it is TNumber.

Same thing for a few other initializers below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the classes were TFloating, some were TNumber or TType for Initializer. If I change the overloaded method to TFloating, the compiler would complain about not overloading the abstract method. I am open to suggestions. Right now I throw IllegalArgumentException which is not optimal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed to the generic on the base class as @karllessard suggested. This eliminates this issue for the most part.
There is still a runtime check for Zeros and Ones to check if TNumber or TBool.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(See my question elsewhere about adding a superclass TNumberOrBool.)

if (!TNumber.class.isAssignableFrom(type) && type != TBool.class) {
throw new IllegalArgumentException("Tensor type must be numeric or boolean: " + type.getSimpleName());
throw new IllegalArgumentException(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this enforce that the calling type is the same as valueType?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's casting to type. Isn't that good enough?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well if you've constructed it with a double but call it with TInt32 that sounds like a programmer error to me and so we might want to flag that with a runtime exception.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other option is to change the value to an Operand, and use that type.

Copy link
Contributor Author

@JimClarke5 JimClarke5 Mar 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed the value to Operand, the only constraint is that this must be a scalar (added throws IllegalArgumentException in CTOR).

The TF Python Constant has the constraint that the value must be "castable" to the type defined in the call method. So the way I have it matches TF Python. Do we want to create a method in NDArray to detect uncastable pairs?
I think the only noncastable operation is to/from TString.
(Throws org.tensorflow.exceptions.TFUnimplementedException: Cast string to int32 is not supported)

Also, TF Python supports a Constant for TString, but I cannot figure out how to create an Operand from a String[][] with tf.constant(), so the Unit Test case does:

       Shape shape = Shape.of(2, 2);
       String[][] expected = {
                {"Java Test", "Java Test"},
                {"Java Test", "Java Test"},
        };
        // There is no tf.constant(String[][]).
        Operand<TString> expectedOp = org.tensorflow.op.core.Constant.tensorOf(tf.scope(), expected);
        Constant instance = new Constant(tf, tf.constant("Java Test"));
        Operand<TInt32> result = instance.call(tf.constant(shape), TInt32.class);
       

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's probably ok. It seems unlikely that we'll need to initialise something with a String rather than just define it elsewhere.

*/
Operand<T> call(Operand<TInt64> dims, Class<T> type);
<T extends TType> Operand<T> call(Operand<TInt64> dims, Class<T> type);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we make this TNumber instead? Seems like that would reduce a lot of issues, and only lose the boolean case (which I'm unclear how much is used).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do that and have the calling program cast it to TBool if desired. It only effects Constant and Ones.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would be fine with that. @karllessard @deansher opinions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me.

Going a little further afield: could we / should we make TBool a subtype of TNumber? When we discuss constraining a tensor type to TNumber, we often end up with "yeah, but TBool".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about "NotTString" :-)?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The earliest versions of Tribuo had BNumber which was a boolean box that implemented java.lang.Number, but we refactored that out because it was more trouble than it was worth. As such I don't think it's the best idea, especially as TNumber lines up fairly well with things you'd put java.lang.Number on in Java.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is anyone besides me tempted to add a superclass of TNumber and TBool titled (perhaps, in a burst of creativity) TNumberOrBool?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that might be nicer for us, but do we want users to have TNumberOrBool in their code? Because it looks a little messy.

Copy link
Collaborator

@Craigacp Craigacp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I've only got a few things that need discussion or changing, but this is a lot nicer than the previous version and I think it's good to go now.

tf.math.tanh(
tf.math.mul(
// sqrt(2.0 / PI)
cast(tf, tf.constant(0.7978845608028654), input.type()),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't this one pulled out like the others?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was mainly for debugging and keeping the parts of the equation manageable. I will change this one and add one for the constant "three".

BTW: It would be nice if we could pass a type to tf.constant, something liketf.constant(3, input.dtype())to return the correct type.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like a good extension to have. That should be fairly straightforward.

features / math_ops.cast(1.4142135623730951, features.dtype)))
*/
return tf.math.mul(
cast(tf, tf.constant(0.5), input.type()),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe hoist this and the one below out of the if statement and use local variables?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

if (!TNumber.class.isAssignableFrom(type) && type != TBool.class) {
throw new IllegalArgumentException(
"Tensor type must be numeric or boolean: " + type.getSimpleName());
}
return tf.fill(dims, tf.dtypes.cast(tf.constant(1.0), type));

return cast(tf, tf.fill(dims, tf.constant(1)), type);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This switched the order of the fill and the cast. Is there a reason to prefer one way over the other?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if it make a difference, but might be faster to fill with the right type first. I will fix.

tf.math.mul(distOp, tf.dtypes.cast(tf.constant(this.stddev), distOp.type()));
return cast(tf, tf.math.add(op, tf.dtypes.cast(tf.constant(mean), distOp.type())), type);
Operand<U> distOp = tf.random.statelessRandomNormal(dims, tf.constant(seeds), type);
Operand<U> op = tf.math.mul(distOp, cast(tf, tf.constant(this.stddev), type));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed this initially, but is there a reason we don't check that the standard deviation is positive in the constructor?

@SuppressWarnings("unchecked")
Class<TNumber> nType = (Class<TNumber>) type;
Operand<TNumber> distOp;
public <U extends TNumber> Operand<U> call(Operand<TInt64> dims, Class<U> type) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to RandomNormal is there a reason we don't check that the minVal is less than the maxVal on construction?

type);
Operand<U> distOp = tf.random.statelessTruncatedNormal(dims, tf.constant(seeds), type);
return tf.math.add(
tf.math.mul(distOp, cast(tf, tf.constant(stddev), distOp.type())),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code shape here is different to RandomNormal. Can they be the same but only call a different tf.random function?

@Test
public void testCallFloat() {
float[][] input = {
{0.22805803f, 0.60407318f, 0.91519962f, 0.35643331f, 0.28702669f},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we test some negative values here? Specifically in the region -2,0 where it's negative, and something more negative like -10 where it should be approximately zero.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants