Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dtype "f64" silently results in "float32" #5790

Closed
cdeil opened this issue Apr 23, 2015 · 11 comments · Fixed by #5840
Closed

dtype "f64" silently results in "float32" #5790

cdeil opened this issue Apr 23, 2015 · 11 comments · Fixed by #5840

Comments

@cdeil
Copy link

cdeil commented Apr 23, 2015

I've recently been bitten by this:

>>> import numpy as np
>>> np.dtype('f64')
dtype('float32')
>>> np.__version__
'1.9.2'

Some proposals what to change:

  1. Warning for 'f64' input?
  2. Error for 'f64' input?
  3. Make 'f64' result in 'float64'

I know change is difficult for numpy because of backwards-compatibility concerns.
Hopefully it's possible to be more strict here ... in my code I was getting slightly incorrect results for half a year because I was using 32-bit floats where I know I needed to use 64-bit floats, but for some reason wrote "f64" instead of "float64".

@rgommers
Copy link
Member

I would consider this a bug, so no need to worry about backwards compat. First thought: an error makes sense here.

@jaimefrio
Copy link
Member

The correct code for float64 is 'f8', you have to give it the number of bytes, not bits.

But the behavior does seem to be wrong, it seems to be checking the letter, and if the number doesn't make sense, it just returns the default type:

>>> np.dtype('f')
dtype('float32')
>>> np.dtype('f8')
dtype('float64')
>>> np.dtype('f123')
dtype('float32')
>>> np.dtype('i')
dtype('int32')
>>> np.dtype('i8')
dtype('int64')
>>> np.dtype('i123')
dtype('int32')

I agree with Ralf that raising an error makes all the sense in the world here.

@cdeil
Copy link
Author

cdeil commented Apr 23, 2015

+1 to raising an error.

If it's a simple change to make I could attach a commit here.
Is it? In which file / function?

@jaimefrio
Copy link
Member

The relevant code is here. It seems there has been a deprecation warning in place since 1.7:

>>> import numpy as np
>>> import warnings
>>> warnings.simplefilter('always')
>>> np = np.dtype('f64')
__main__:1: DeprecationWarning: Specified size is invalid for this data type.
Size will be ignored in NumPy 1.7 but may throw an exception in future versions.

Has the time come to turn this into an error in 1.10? I guess discussing on the list would be in order. Can you send a message to the mailing list, Christoph?

@charris
Copy link
Member

charris commented Apr 23, 2015

Raising an error looks like the right thing. Because it has been deprecated, I don't think list discussion is needed.

@cdeil
Copy link
Author

cdeil commented Apr 23, 2015

Looking at

PyArray_TypestrConvert(int itemsize, int gentype)

it's not clear at all to me what needs to be done.

Could someone else here please take care of this change (or discussion if needed)?

Thanks!

@jaimefrio
Copy link
Member

I'll look into it today or tomorrow.

@cdeil
Copy link
Author

cdeil commented May 4, 2015

@jaimefrio - ping

@jaimefrio
Copy link
Member

There's a tricky thing here, which has me swamped with test errors in trying to fix this... If you look at the code I linked above, one of the behaviors that is explicitly deprecated is using codes 'O4' or 'O8' for object arrays. Getting rid of the deprecation warning, including some special casing to handle pickling that should still accept, is not much of a problem. The issue is that, with current master:

In [5]: np.dtype('O').str
Out[5]: '|O8'

That is, we have deprecated, and want to turn into an error, a dtype string descriptor... that is the one dtype.str is producing!

I see 3 ways to move forward on this:

  1. Leave the object dtypes as is, and turn into errors all other dtypes only.
  2. Turn into errors all dtypes except object, change the string representation of object dtypes to remove the trailing size, but leave the object dtypes interpretation as is, still accepting the trailing size descriptor.
  3. Turn all dtypes to errors and remove the trailing size from object dtype descriptors.

Doing 1 is just punting on the object behavior, not my favorite. Doing 3 may be a little too aggressive, although it is the state we wish to one day arrive at, and what I am leaning towards right now. Doing 2 now, and turning trailing sizes on object dtypes into errors in the next release may be the aristotelian golden mean.

Thoughts are welcome!

@mhvk
Copy link
Contributor

mhvk commented May 5, 2015

FWIW: I'd split the solution into two, one implementing (1) and thus solving the present issue, the other going after the object dtypes (where I am not completely sure what the best solution would be; probably your (2) first, then (3), just to be sure, though maybe directly going to (3) is fine).

@njsmith
Copy link
Member

njsmith commented May 5, 2015

Sounds to me like you've answered your question :-).

No reason to delay the deprecation->changes for non-O dtypes. For O we
definitely should stop emitting those integers regardless, since they are
truly meaningless. Given the situation, it would make sense to be a but
lenient and keep accepting O4 and O8 with warnings for now, so they'll take
another release cycle to catch up with the other dtypes. (As a small
refinement we could start erroring on all O sizes that aren't 4 or 8 now.)
But the main thing is that in a few releases we eventually stabilize on the
right thing, I.e. all meaningless size specifiers are errors.
On May 5, 2015 6:27 AM, "Jaime" notifications@github.com wrote:

There's a tricky thing here, which has me swamped with test errors in
trying to fix this... If you look at the code I linked above, one of the
behaviors that is explicitly deprecated is using codes 'O4' or 'O8' for
object arrays. Getting rid of the deprecation warning, including some
special casing to handle pickling that should still accept, is not much of
a problem. The issue is that, with current master:

In [5]: np.dtype('O').str
Out[5]: '|O8'

That is, we have deprecated, and want to turn into an error, a dtype
string descriptor... that is the one dtype.str is producing!

I see 3 ways to move forward on this:

  1. Leave the object dtypes as is, and turn into errors all other
    dtypes only.
  2. Turn into errors all dtypes except object, change the string
    representation of object dtypes to remove the trailing size, but leave the
    object dtypes interpretation as is, still accepting the trailing size
    descriptor.
  3. Turn all dtypes to errors and remove the trailing size from object
    dtype descriptors.

Doing 1 is just punting on the object behavior, not my favorite. Doing 3
may be a little too aggressive, although it is the state we wish to one day
arrive at, and what I am leaning towards right now. Doing 2 now, and
turning trailing sizes on object dtypes into errors in the next release may
be the aristotelian golden mean.

Thoughts are welcome!


Reply to this email directly or view it on GitHub
#5790 (comment).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants