In io.ascii, fall back to string if integers are too large #2234

Merged on Mar 27, 2014 (7 commits)
4 changes: 4 additions & 0 deletions CHANGES.rst
@@ -194,6 +194,10 @@ Bug Fixes

- ``astropy.io.ascii``

  - When reading a table with values that generate an overflow error
    during type conversion (e.g. overflowing the native C long type), fall
    through to using string. Previously this generated an exception [#2234].

- ``astropy.io.fits``

- ``astropy.io.misc``
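
To illustrate the kind of overflow this entry refers to, here is a minimal standard-library sketch (it is not astropy code). Python integers have arbitrary precision, but storing one in a native C long can fail. The `array` module is used here only as a convenient stand-in for a fixed-width C long container, and the 30-digit value is an arbitrary example.

```python
import array

# A 30-digit integer: far beyond any 32- or 64-bit C long.
big = int("1" * 30)

try:
    # "l" is the array type code for the platform's native C long.
    array.array("l", [big])
    overflowed = False
except OverflowError:
    # Raised because the Python int cannot be stored in a C long.
    overflowed = True

print("overflowed:", overflowed)
```

The change in this pull request treats this class of error as a signal to keep the column as strings rather than letting the exception propagate.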
8 changes: 8 additions & 0 deletions astropy/io/ascii/core.py
@@ -16,10 +16,12 @@
import itertools
import functools
import numpy
import warnings

from ...extern import six
from ...extern.six.moves import zip
from ...extern.six.moves import cStringIO as StringIO
from ...utils.exceptions import AstropyWarning

from ...table import Table
from ...utils.data import get_readable_fileobj
@@ -642,6 +644,12 @@ def _convert_vals(self, cols):
                    col.type = converter_type
                except (TypeError, ValueError):
                    col.converters.pop(0)
                except OverflowError:
                    # Overflow during conversion (most likely an int that doesn't fit in native C long).
                    # Put string at the top of the converters list for the next while iteration.
                    warnings.warn("OverflowError converting to {0} for column {1}, using string instead."
                                  .format(converter_type.__name__, col.name), AstropyWarning)
                    col.converters.insert(0, convert_numpy(numpy.str))
                except IndexError:
                    raise ValueError('Column %s failed to convert' % col.name)
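
The control flow in this hunk can be sketched in isolation. The following is a simplified, standard-library-only illustration and not the astropy implementation: `to_c_long`, `to_float`, `to_str`, `convert_column` and `C_LONG_MAX` are hypothetical names, and the 64-bit limit is an assumption standing in for the overflow that the real numpy-based converters would raise.

```python
import warnings

C_LONG_MAX = 2**63 - 1  # assumption: a 64-bit native C long


def to_c_long(vals):
    """Hypothetical int converter that mimics a fixed-width C long."""
    out = [int(v) for v in vals]  # ValueError for non-integer text
    for v in out:
        if not -C_LONG_MAX - 1 <= v <= C_LONG_MAX:
            raise OverflowError("value does not fit in C long")
    return out


def to_float(vals):
    return [float(v) for v in vals]


def to_str(vals):
    return [str(v) for v in vals]


def convert_column(name, vals):
    """Try converters in order; on overflow, go straight to string."""
    converters = [to_c_long, to_float, to_str]
    while True:
        try:
            converter = converters[0]
            return converter.__name__, converter(vals)
        except (TypeError, ValueError):
            # Wrong type for this converter: drop it and try the next one.
            converters.pop(0)
        except OverflowError:
            # Put the string converter first for the next loop iteration,
            # so the float converter is never tried.
            warnings.warn("OverflowError converting to {0} for column {1}, "
                          "using string instead."
                          .format(converter.__name__, name))
            converters.insert(0, to_str)
        except IndexError:
            raise ValueError("Column %s failed to convert" % name)


print(convert_column("a", ["1" * 30]))
```

Inserting the string converter at the front, rather than popping the failed converter, is what keeps an oversized integer from being picked up by the float converter and losing digits.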

17 changes: 17 additions & 0 deletions astropy/io/ascii/tests/test_read.py
@@ -11,11 +11,28 @@
from ... import ascii as asciitable # TODO: delete this line, use ascii.*
from ... import ascii
from ....table import Table
from distutils import version

from .common import (raises, assert_equal, assert_almost_equal,
                     assert_true, setup_function, teardown_function)
from ....tests.helper import pytest

_NUMPY_VERSION = version.LooseVersion(np.__version__)

def test_convert_overflow():
    """
    Test reading an extremely large integer, which falls through to
    string due to an overflow error (#2234).
    """
    # Before Numpy 1.6 the exception from np.array(['1' * 10000], dtype=np.int)
    # is exactly the same as np.array(['abc'], dtype=np.int). In this case
    # it falls through to float, so we just accept this as a known issue for
    # numpy < 1.6.
    expected_kind = ('f',) if _NUMPY_VERSION < version.LooseVersion('1.6') else ('S', 'U')
    dat = ascii.read(['a', '1' * 10000], format='basic', guess=False)
    assert dat['a'].dtype.kind in expected_kind


def test_guess_with_names_arg():
"""
Make sure reading a table with guess=True gives the expected result when
12 changes: 12 additions & 0 deletions docs/known_issues.rst
@@ -184,3 +184,15 @@ One workaround is to install the ``bsddb3`` module.
.. [#] Continuum `says
   <https://groups.google.com/a/continuum.io/forum/#!topic/anaconda/mCQL6fVx55A>`_
   this will be fixed in their next Python build.


Very long integers in ASCII tables silently converted to float for Numpy 1.5
----------------------------------------------------------------------------

With Numpy 1.5, reading an ASCII table that contains integers too large to
fit into the machine's native C long type converts the values to float
without any warning. This is due to the behavior of `numpy.array` and cannot
easily be worked around, so we recommend upgrading to a newer version of
Numpy. With Numpy >= 1.6 a warning is issued and the values are read as
strings, which preserves all of the information.
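
As a rough illustration of why the string fallback preserves information that a float conversion would lose, consider the following standard-library sketch. The two 23-digit values are made-up examples, not data from any real table.

```python
# Two distinct integers, both too large for a 64-bit C long.
raw = ["12345678901234567890123", "12345678901234567890124"]

# Converting to float keeps only about 15-17 significant digits,
# so the two values become indistinguishable.
as_float = [float(s) for s in raw]

# Keeping the strings and converting to Python int afterwards is exact,
# because Python integers have arbitrary precision.
as_int = [int(s) for s in raw]

print(as_float[0] == as_float[1])
print(as_int[1] - as_int[0])
```

A column that has been read as strings can be converted this way after the fact when exact integer values are needed.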