Suggestion to fix the TypeError issue #994 on Windows. #1550

notoraptor · 2016-11-24T22:05:32Z

Hello ! This is a working suggestion (and some remarks) to fix the issue 994 related to theano TypeError #994 .

I fixed the issue on my computer (Python 2.7.12 |Continuum Analytics, Inc.| (default, Jun 29 2016, 11:07:13) [MSC v.1500 64 bit (AMD64)], OS Windows-8.1-6.3.9600). As test code, I extracted the python code from the notebook of @napsternxg : https://gist.github.com/notoraptor/75fd0db142a21710a11b7a2e17f38cba

After further investigations, I found that the error comes from 2 problems.

First, the pymc3 function delta_logp, which is in fact a theano function, expects int64 arrays as input, but receives int32 arrays on Windows. These arrays are q and q0 passed here: https://github.com/pymc-devs/pymc3/blob/master/pymc3/step_methods/metropolis.py#L124 :

 q_new = metrop_select(self.delta_logp(q, q0), q, q0)

q and q0 are casted in previous lines with .astype(int), but on Windows, the Python type int is treated as an int32 by NumPy.

The second problem is in the function delta_logp. When it generates the associated theano function, it deactivates the input verification done in Theano by setting trust_input to true: https://github.com/pymc-devs/pymc3/blob/master/pymc3/step_methods/metropolis.py#L434

f.trust_input = True

As mentionned in theano documentation: https://github.com/Theano/Theano/blob/master/theano/compile/function_module.py

A Function instance have a ``trust_input`` field that default to
False. When True, we don't do extra check of the input to give
better error message. In some case, python code will still return
the good results if you pass a python or numpy scalar instead of a
numpy tensor.  C code should raise an error if you pass an object
of the wrong type.

Therefore, it is the C code that returns the TypeError when it detects the int32 vs int64 mismatch.

So, there is two solutions: either cast explicitely q and q0 to int64, or remove the line with f.trust_input = True in function delta_logp. In this pull request I suggest the later solution, as the type of the inputs of delta_logp does not seem easy to guarantee in any case (that is, is it always int64?).

@nouiz @lamblin !

Errors occurs because some ndarrays with dtype=int32 are passed to the function delta_logp which expects ndarrays with dtype=int64 as inputs. Inputs are casted with `ndarray.astype(int)`, but on Windows, the Python 'int' type is treated as a 32-bit integer by NumPy. I suggest to remove the line with `f.trust_input = True` to ensure that Theano always checks the inputs and casts it if necessary and when possible. This fix resolves the issue on my computer: ``` Python version: 2.7.12 |Continuum Analytics, Inc.| (default, Jun 29 2016, 11:07:13) [MSC v.1500 64 bit (AMD64)] OS version: Windows-8.1-6.3.9600 ```

twiecki · 2016-11-25T10:07:46Z

I think we set trust_input for speed so if there's another alternative I think that would be better. But I agree that it's a nasty problem.

twiecki · 2016-11-25T13:07:38Z

What happens if you do 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1], dtype=np.int32), value=-999,)?

twiecki · 2016-11-25T13:10:17Z

I would prefer your option 1 I think.

nouiz · 2016-11-25T13:59:13Z

Just a comment. For efficiency, I think option 1 should be done. That way only 1 cast will be done. Only not using trust input will make Theano cast the input again at the start of the function if the dtype isn't the right one. But I would also remove the trust_input except if it give super big speed up. Don't forget, it took 9 mounts to find that problem in pymc3. So is the speed up worth the debugging trouble?

…

On Fri, Nov 25, 2016 at 8:10 AM, Thomas Wiecki ***@***.***> wrote: I would prefer your option 1 I think. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#1550 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AALC-4Dh6hr3tRjFFH9D901mBhbfzZFsks5rBt47gaJpZM4K7-7K> .

twiecki · 2016-11-25T14:02:49Z

@nouiz You do have a point. Perhaps we make this configurable and default to False.

twiecki · 2016-11-25T14:08:59Z

@nouiz I suppose there is no flag which would have ensured that the dtypes are correctly passed. I.e. not casting automatically if it's the wrong dtype, but raising an error.

nouiz · 2016-11-25T14:28:01Z

exact. Could be a nice feature request, but it isn't existing.

…

On Fri, Nov 25, 2016 at 9:09 AM, Thomas Wiecki ***@***.***> wrote: @nouiz <https://github.com/nouiz> I suppose there is no flag which would have ensured that the dtypes are correctly passed. I.e. not casting automatically if it's the wrong dtype, but raising an error. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#1550 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AALC--CQVAgCZSSDaEph5zZ0qFQVdl1iks5rBuv9gaJpZM4K7-7K> .

notoraptor · 2016-11-25T16:48:10Z

Hello!

Aftter further investigations related to solution 1 (casting q and q0 to int64), it appears that the input type of delta_logp depends on the types of the argument vars passed to Metropolis.__init__() function, that is, in example code: https://gist.github.com/notoraptor/75fd0db142a21710a11b7a2e17f38cba#file-pymc-disaster-py-L74 :

step2 = pm.Metropolis([switchpoint, disasters.missing_values[0]] )

switchpoint and disasters.missing_values[0] have both dtype=int64 because switchpoint and disasters are created as DiscreteUniform and Poisson instances, both with default dtype (set to int64 in Discrete super-class). Then the delta_logp function is created based on these vars.

But the dtype of Discrete instances can be changed, and therefore, if an user does not use int64, some new errors appears.

For example, currently (with trust_input removed but q and q0 not yet casted), the example code works, but if I replace:

switchpoint = pm.DiscreteUniform('switchpoint', lower=year.min(), upper=year.max(), testval=1900)
# ...
disasters = pm.Poisson('disasters', rate, observed=disaster_data)

with

switchpoint = pm.DiscreteUniform('switchpoint', lower=year.min(), upper=year.max(), testval=1900, dtype='float32')
# ...
disasters = pm.Poisson('disasters', rate, observed=disaster_data, dtype='float32')

Then I got the following error:

  0%|          | 0/10000 [00:00<?, ?it/s]Traceback (most recent call last):
  File "PyMC+Disaster-test.py", line 78, in <module>
    trace = pm.sample(10000, step=[step1, step2])
  File "c:\users\hppc\mila\dev\git\pymc3\pymc3\sampling.py", line 175, in sample
    return sample_func(**sample_args)
  File "c:\users\hppc\mila\dev\git\pymc3\pymc3\sampling.py", line 185, in _sample
    for strace in sampling:
  File "C:\donnees\programmes\Anaconda3\envs\devPython2\lib\site-packages\tqdm\_tqdm.py", line 816, in __iter__
    for obj in iterable:
  File "c:\users\hppc\mila\dev\git\pymc3\pymc3\sampling.py", line 267, in _iter_sample
    point = step.step(point)
  File "c:\users\hppc\mila\dev\git\pymc3\pymc3\step_methods\compound.py", line 16, in step
    point = method.step(point)
  File "c:\users\hppc\mila\dev\git\pymc3\pymc3\step_methods\compound.py", line 16, in step
    point = method.step(point)
  File "c:\users\hppc\mila\dev\git\pymc3\pymc3\step_methods\arraystep.py", line 142, in step
    apoint = self.astep(bij.map(point))
  File "c:\users\hppc\mila\dev\git\pymc3\pymc3\step_methods\metropolis.py", line 124, in astep
    q_new = metrop_select(self.delta_logp(q, q0), q, q0)
  File "c:\users\hppc\mila\dev\git\theano\theano\compile\function_module.py", line 788, in __call__
    allow_downcast=s.allow_downcast)
  File "c:\users\hppc\mila\dev\git\theano\theano\tensor\type.py", line 140, in filter
    raise TypeError(err_msg)
TypeError: Bad input argument to theano function with name "c:\users\hppc\mila\dev\git\pymc3\pymc3\step_methods\metropolis.py:433" at index 0 (0-based).  
Backtrace when that variable is created:

  File "PyMC+Disaster-test.py", line 76, in <module>
    step2 = pm.Metropolis([switchpoint, disasters.missing_values[0]] )
  File "c:\users\hppc\mila\dev\git\pymc3\pymc3\step_methods\arraystep.py", line 60, in __new__
    step.__init__([var], *args, **kwargs)
  File "c:\users\hppc\mila\dev\git\pymc3\pymc3\step_methods\metropolis.py", line 98, in __init__
    self.delta_logp = delta_logp(model.logpt, vars, shared)
  File "c:\users\hppc\mila\dev\git\pymc3\pymc3\step_methods\metropolis.py", line 429, in delta_logp
    inarray1 = tensor_type('inarray1')
TensorType(float32, vector) cannot store a value of dtype float64 without risking loss of precision. If you do not mind this loss, you can: 1) explicitly cast your data to float32, or 2) set "allow_input_downcast=True" when calling "function". Value: "array([ 1.39761976,  1.55502846])"

I suggest to hold the dtype of Discrete classes to int64. That is, raise a TypeError in Discrete super-class constructor if dtype != int64.

Other modifications still include to remove trust_input, and to cast q and q0 to int64 only when Metropolis handles a Discrete model.

Could it be a good solution, @twiecki ?

@nouiz

notoraptor · 2016-11-25T16:50:56Z

PS: Perhaps the dtype of Continuous classes (float64) should also be forced (with a TypeError raised if any other dtype is specified) ?

twiecki · 2016-11-28T05:12:49Z

@notoraptor

I suggest to hold the dtype of Discrete classes to int64. That is, raise a TypeError in Discrete super-class constructor if dtype != int64.

Yes, I think that's the right solution here. Do you want to implement that?

In regards to float64 -- I think forcing that would not allow PyMC3 to run on the GPU which requires float32 to be used. See #1338.

However, we could just check that the dtype matches to whatever default we want to use (and perhaps automatically cast it).

Arguments for delta_logp in Metropolis.astep() are now casted to 'int64' when Metropolis handles a discrete model.

notoraptor · 2016-11-28T14:14:14Z

@twiecki Ok ! So I just update the PR with int64 corrections only.

Updated.

twiecki · 2016-11-29T10:29:38Z

Thanks @notoraptor!

* Suggestion to fix the TypeError issue pymc-devs#994 on Windows. Errors occurs because some ndarrays with dtype=int32 are passed to the function delta_logp which expects ndarrays with dtype=int64 as inputs. Inputs are casted with `ndarray.astype(int)`, but on Windows, the Python 'int' type is treated as a 32-bit integer by NumPy. I suggest to remove the line with `f.trust_input = True` to ensure that Theano always checks the inputs and casts it if necessary and when possible. This fix resolves the issue on my computer: ``` Python version: 2.7.12 |Continuum Analytics, Inc.| (default, Jun 29 2016, 11:07:13) [MSC v.1500 64 bit (AMD64)] OS version: Windows-8.1-6.3.9600 ``` * Discrete super-class raises TypeError if dtype != 'int64'. Arguments for delta_logp in Metropolis.astep() are now casted to 'int64' when Metropolis handles a discrete model.

Discrete super-class raises TypeError if dtype != 'int64'.

89d369d

Arguments for delta_logp in Metropolis.astep() are now casted to 'int64' when Metropolis handles a discrete model.

twiecki merged commit a7410c2 into pymc-devs:master Nov 29, 2016

twiecki mentioned this pull request Dec 5, 2016

Coal Mining: Disaster model on Getting started gives TypeError: expected type_num 9 (NPY_INT64) got 7 #994

Closed

notoraptor mentioned this pull request Feb 17, 2017

Clarify error message related to expected kern_shape specs (1D, int64) in GpuDnnConvDesc Theano/Theano#5568

Merged

ColCarroll mentioned this pull request Apr 3, 2018

Fixing bug when working with discrete and continuous variables using Metropolis step with blocked=True #2918

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Suggestion to fix the TypeError issue #994 on Windows. #1550

Suggestion to fix the TypeError issue #994 on Windows. #1550

notoraptor commented Nov 24, 2016 •

edited

twiecki commented Nov 25, 2016

twiecki commented Nov 25, 2016

twiecki commented Nov 25, 2016

nouiz commented Nov 25, 2016 via email

twiecki commented Nov 25, 2016

twiecki commented Nov 25, 2016

nouiz commented Nov 25, 2016 via email

notoraptor commented Nov 25, 2016

notoraptor commented Nov 25, 2016

twiecki commented Nov 28, 2016

notoraptor commented Nov 28, 2016

twiecki commented Nov 29, 2016

Suggestion to fix the TypeError issue #994 on Windows. #1550

Suggestion to fix the TypeError issue #994 on Windows. #1550

Conversation

notoraptor commented Nov 24, 2016 • edited

twiecki commented Nov 25, 2016

twiecki commented Nov 25, 2016

twiecki commented Nov 25, 2016

nouiz commented Nov 25, 2016 via email

twiecki commented Nov 25, 2016

twiecki commented Nov 25, 2016

nouiz commented Nov 25, 2016 via email

notoraptor commented Nov 25, 2016

notoraptor commented Nov 25, 2016

twiecki commented Nov 28, 2016

notoraptor commented Nov 28, 2016

twiecki commented Nov 29, 2016

notoraptor commented Nov 24, 2016 •

edited