
add cast floatX to avoid theano warning that cuda array can not cast from float64 or int64 #606

Closed · udibr wants to merge 11 commits
Conversation

@udibr (author) commented Apr 30, 2015

add cast floatX to avoid theano warning that cuda array can not cast from float64 or int64
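(For context: a minimal sketch, with hypothetical variable names, of the kind of explicit floatX cast the PR describes; it assumes floatX = theano.config.floatX and Theano's default upcasting rules.)

import theano
from theano import tensor

floatX = theano.config.floatX    # 'float32' on a typical GPU setup

y = tensor.lvector('y')          # int64 targets
y_hat = tensor.matrix('y_hat')   # floatX predictions

# int64 / float32 upcasts to float64 in Theano, and the old CUDA
# backend cannot handle float64, hence the warning this PR avoids
# by casting both operands to floatX explicitly.
errors = tensor.neq(y, y_hat.argmax(axis=1))
rate = tensor.sum(errors).astype(floatX) / y.shape[0].astype(floatX)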

@rizar (Contributor) commented May 1, 2015

How do I reproduce such a warning? I ran MNIST example on my GPU and everything was clean.

@udibr (author) commented May 13, 2015

Add Xavier weight initialization and BinaryMisclassificationRate, which is similar to MisclassificationRate but works with a binary (scalar) target.

@rizar (Contributor) commented May 14, 2015

@udibr, as I said in my previous message, I cannot reproduce the warning you report by running a MNIST demo. Can you shed some light on when it actually happens?

rizar mentioned this pull request May 15, 2015
class Xavier(NdarrayInitialization):
    """Initialize with Gaussian distribution with Xavier parameters.

    Use the following gaussian parameters: mean=0 and var=scale/Nin
Contributor:

Apparently you forgot a square root here. Also we would need a reference to the paper here.

udibr (author):

I've added a reference (and changed var to std so as not to confuse the reader). However, the original paper uses a uniform distribution and I am using a Gaussian, so I can either change my code, stop calling it Xavier, or keep it as is (most implementations use a Gaussian).
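(A minimal sketch of what the class under review could look like with the square root added, assuming blocks' NdarrayInitialization interface and theano.config.floatX; the scale parameter and the Gaussian choice follow this thread, while Glorot & Bengio (2010) originally use a uniform distribution.)

import numpy
import theano
from blocks.initialization import NdarrayInitialization

class Xavier(NdarrayInitialization):
    """Gaussian initialization with mean=0 and std=sqrt(scale/Nin)."""
    def __init__(self, scale=1.0):
        self.scale = scale

    def generate(self, rng, shape):
        n_in = shape[0]                      # fan-in
        std = numpy.sqrt(self.scale / n_in)  # the square root noted above
        return rng.normal(0, std, size=shape).astype(theano.config.floatX)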

@rizar (Contributor) commented May 15, 2015

Because this is sort of urgent, I opened a new PR, #635, for this issue to be merged right away. However, you can continue working on this one and we will be happy to merge it later, when it is ready.

@@ -58,7 +58,8 @@ def cost_matrix(self, y, y_hat):
 class CategoricalCrossEntropy(Cost):
     @application(outputs=["cost"])
     def apply(self, y, y_hat):
-        cost = tensor.nnet.categorical_crossentropy(y_hat, y).mean()
+        cost = tensor.nnet.categorical_crossentropy(y_hat, y).mean(
+            acc_dtype=floatX)
Contributor:

Why do you think we need this?

udibr (author):

It resolved, for me, the same error messages I got in MisclassificationRate regarding float64 and CUDA. I guess this happens when y is int64.

Contributor:

I think it will be fixed with Theano/Theano#2913.

I also checked that, given a float32 first argument and an int64 second argument, it returns float32, which is just what we need:

In [5]: tensor.nnet.categorical_crossentropy(numpy.array([[0.5, 0.5], [0.5, 0.5]], dtype='float32'), numpy.array([0, 1], dtype='int64')).eval()
Out[5]: array([ 0.69314718,  0.69314718], dtype=float32)
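(For illustration, a sketch with hypothetical variables of where acc_dtype enters: the per-example cross-entropy is already float32, but mean() accumulates float32 inputs in float64 by default, which the old CUDA backend cannot run; assumes floatX = theano.config.floatX.)

import theano
from theano import tensor

floatX = theano.config.floatX

y = tensor.lvector('y')          # int64 class labels
y_hat = tensor.matrix('y_hat')   # floatX probabilities

# Forcing the accumulator dtype keeps the whole graph in floatX.
cost = tensor.nnet.categorical_crossentropy(y_hat, y).mean(acc_dtype=floatX)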

@rizar (Contributor) commented May 17, 2015

Xavier initialization and the fix in reverse_words look good to me, but please rebase the PR against the latest master (see our development docs at http://blocks.readthedocs.org/en/latest/development/pull_request.html for instructions).

@@ -223,3 +223,33 @@ def generate(self, rng, shape):
                                       replace=False)
             weights[i, random_indices] = values[i]
         return weights
+
+
+class Xavier(NdarrayInitialization):
Contributor:

"Xavier" is the inventor's first name, so it's kind of weird to call it that. "Glorot" or "GlorotBengio" might be a better name...

    def apply(self, y, y_hat):
        # Here we have to cast both operands to floatX explicitly,
        # because int64 / float32 = float64 in Theano, unfortunately.
        return (tensor.sum(tensor.neq(y, y_hat > 0.5)).astype(floatX) /
                y.shape[0].astype(floatX))
Contributor:

This is almost a copy of MisclassificationRate, which is not nice.

You could add it as a second application method on MisclassificationRate, e.g. apply_binary, with the code they share moved to a private method.
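(A sketch of that suggestion; apply_binary is the name proposed above, while _error_rate is an illustrative name for the shared private method.)

import theano
from theano import tensor
from blocks.bricks.base import application
from blocks.bricks.cost import Cost

floatX = theano.config.floatX

class MisclassificationRate(Cost):
    @application(outputs=["error_rate"])
    def apply(self, y, y_hat):
        # Categorical case: compare labels against the argmax prediction.
        return self._error_rate(y, y_hat.argmax(axis=1))

    @application(outputs=["error_rate"])
    def apply_binary(self, y, y_hat):
        # Binary case: threshold the scalar prediction at 0.5.
        return self._error_rate(y, y_hat > 0.5)

    def _error_rate(self, y, predicted):
        # Cast both operands to floatX: int64 / float32 = float64 in Theano.
        return (tensor.sum(tensor.neq(y, predicted)).astype(floatX) /
                y.shape[0].astype(floatX))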

udibr (author):

I thought about doing it that way too, but then I noticed that the same file has both BinaryCrossEntropy and CategoricalCrossEntropy, so I thought it would be more consistent to have separate misclassification rates as well.

Contributor:

Good point. I do not think there is much sense in having them as separate bricks. @bartvm, @dwf, what do you think?

@rizar (Contributor) commented May 23, 2015

It seems like your pull request contains your whole master branch. Can you make it a bit clearer which of these changes you actually want to contribute to Blocks? It would also be better to have separate PRs for the initialization scheme, Adagrad, etc.

@udibr (author) commented May 24, 2015

You are right. I should have a separate branch for each feature instead of putting everything on master. I am closing this pull request and will try to fix this.

udibr closed this May 24, 2015