[Python/C++] Array constructor should not truncate floats when casting to int #19229

Closed
asfimport opened this issue Jul 16, 2018 · 5 comments

@asfimport

I would expect the following code to raise instead of truncating the float:

In [4]: pa.array([1.9], type=pa.int8())
Out[4]:
<pyarrow.lib.Int8Array object at 0x113455e58>
[
  1
]

Reporter: Florian Jetter / @fjetter

Note: This issue was originally created as ARROW-2856. Please see the migration documentation for further details.
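
For callers who want the strict behaviour requested here, one option is to validate the input before it reaches the constructor. The sketch below is plain Python; `require_integral` is a hypothetical helper written for illustration, not part of PyArrow, and the commented-out `pa.array` call only shows where it might slot in.

```python
def require_integral(values):
    """Return values with floats converted to ints, raising on any
    float that is not a whole number.

    Hypothetical helper for illustration; not a PyArrow API.
    """
    out = []
    for v in values:
        if isinstance(v, float):
            # is_integer() is False for fractional values, NaN and infinity
            if not v.is_integer():
                raise ValueError(f"{v!r} cannot be represented exactly as an integer")
            v = int(v)
        out.append(v)
    return out


print(require_integral([1.0, 2, 3.0]))  # [1, 2, 3]
# require_integral([1.9]) raises ValueError instead of yielding 1
# Assumed usage: pa.array(require_integral(data), type=pa.int8())
```

This keeps the strictness in user code, so the constructor itself can stay as permissive as NumPy.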

@asfimport

Antoine Pitrou / @pitrou:
This is consistent with NumPy behaviour. I would not expect PyArrow to be stricter than NumPy here:

>>> np.int8([1.1])
array([1], dtype=int8)

@asfimport

Wes McKinney / @wesm:
I'm inclined to agree with @pitrou on this. Raising could also expose users to spurious errors caused by small floating-point inaccuracies.
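
To illustrate the floating-point concern, here is a small plain-Python sketch (it does not use PyArrow or NumPy). The expression is mathematically equal to 8, but its double-precision result falls just short, so a cast that raised on any fractional part would reject it.

```python
value = (0.1 + 0.7) * 10   # mathematically 8

print(value)               # 7.999999999999999
print(value.is_integer())  # False: a strict cast would raise here
print(int(value))          # 7: truncation drops almost a whole unit
print(round(value))        # 8: rounding first is one way to absorb the error
```

The same value also shows the other side of the trade-off: truncation turns it into 7 without any warning, so callers who care about the result may want to round explicitly before casting.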

@asfimport

Wes McKinney / @wesm:
NumPy does raise on NaN, which aligns with the work happening in ARROW-2806:

>>> np.array([np.nan], dtype='i8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: cannot convert float NaN to integer

@asfimport

Antoine Pitrou / @pitrou:
Yes, we raise on NaN as well:

>>> pa.array([float('nan')], type=pa.int8())
Traceback (most recent call last):
  File "<ipython-input-3-357aa3c7f8d3>", line 1, in <module>
    pa.array([float('nan')], type=pa.int8())
  File "pyarrow/array.pxi", line 186, in pyarrow.lib.array
    return _sequence_to_array(obj, size, type, pool, from_pandas)
  File "pyarrow/array.pxi", line 40, in pyarrow.lib._sequence_to_array
    check_status(
  File "pyarrow/error.pxi", line 81, in pyarrow.lib.check_status
    raise ArrowInvalid(message)
ArrowInvalid: ../src/arrow/python/builtin_convert.cc:920 code: AppendPySequence(seq, size, real_type, builder.get(), from_pandas)
../src/arrow/python/iterators.h:60 code: func(value)
../src/arrow/python/builtin_convert.cc:454 code: internal::CIntFromPython(obj, &value)
../src/arrow/python/helpers.cc:259 code: CheckPyError()
cannot convert float NaN to integer
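
The final message in the traceback above is the same one CPython's built-in `int()` produces for NaN. The `CIntFromPython` and `CheckPyError` frames suggest a Python-level exception is being propagated, though that reading is an inference from the traceback rather than something confirmed in this thread. A minimal plain-Python sketch of the built-in behaviour:

```python
# Neither NaN nor infinity has an integer representation, so int() raises.
for special in (float("nan"), float("inf")):
    try:
        int(special)
    except (ValueError, OverflowError) as exc:
        print(type(exc).__name__, exc)

# ValueError cannot convert float NaN to integer
# OverflowError cannot convert float infinity to integer
```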

@asfimport

Antoine Pitrou / @pitrou:
I'm rejecting the issue. Please re-open if you disagree.
