lib.maybe_convert_objects will fail on uint64 values that exceed int64 max #4471

wesm · 2013-08-05T22:58:49Z

xref #11440 for additional tests

Observed in the wild. cc @blais

jtratner · 2013-09-15T14:13:00Z

@jreback @wesm objects that would pass this wouldn't be compatible with anything but integers that are all > 0, right? Float, float128, complex, and longdouble all lose precision.
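A minimal sketch of the precision concern for the float64 case only, assuming NumPy is available. float64 carries a 53-bit significand, so an integer near the top of the uint64 range cannot round-trip through it. The variable names are illustrative and do not come from pandas.

```python
import numpy as np

# Largest value representable as uint64.
big = 2**64 - 1

# Stored as uint64, the value is kept exactly.
as_uint = np.uint64(big)

# Stored as float64, the value is rounded to the nearest representable
# double, which is 2**64 here.
as_float = np.float64(big)

print(int(as_uint) == big)   # exact round trip
print(int(as_float) == big)  # rounded, so no longer equal
```

This says nothing about extended-precision types, whose width depends on the platform.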

jtratner · 2013-09-15T14:14:52Z

I'm wrong, float128 (which I think is the same as longdouble) can work...

cpcloud · 2013-09-15T14:16:26Z

float128 doesn't have to be long double... long double could be 64 bits... it's an implementation detail but it ends up being what you expect most of the time...

cpcloud · 2013-09-15T14:17:10Z

same applies to int64 etc ... e.g., long is 32 bits on 32 bit arch and 64 on 64-bit arch

jtratner · 2013-09-15T14:17:46Z

@cpcloud so what's the right dtype to contain something that's uint64 and greater than what int64 can handle? - this SO answer claims float128 is 'a mess'. http://stackoverflow.com/questions/9062562/what-is-the-internal-precision-of-numpy-float128

jtratner · 2013-09-15T14:18:18Z

i.e., just allow uint64_t size and go from there? and then disallow with anything that's not an actual integer > 0?

cpcloud · 2013-09-15T14:23:08Z

I'm not sure why this is happening ... uint64 should hold values up to 2 * INT64_MAX + 1... I think allowing uint64 is probably the way to go... not sure I follow the second question.

jtratner · 2013-09-15T14:40:04Z

@cpcloud in convert_objects, if you can't fit everything into the same container, then it doesn't work. This is why uint64 doesn't work:

        elif util.is_integer_object(val):
            seen_int = 1
            floats[i] = <float64_t> val
            complexes[i] = <double complex> val
            if not seen_null:
                try:
                    ints[i] = val
                except OverflowError:
                    seen_object = 1
                    break
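The overflow path in the excerpt above can be reproduced outside Cython. The following is a rough sketch, assuming NumPy, in which `fits_int64_buffer` is a hypothetical helper (not a pandas function) that mimics the `ints[i] = val` assignment into an int64 buffer.

```python
import numpy as np

INT64_MAX = int(np.iinfo(np.int64).max)  # 2**63 - 1


def fits_int64_buffer(value):
    """Return True if a Python int can be stored in an int64 buffer.

    Hypothetical helper mirroring the try/except around `ints[i] = val`.
    """
    buf = np.empty(1, dtype=np.int64)
    try:
        buf[0] = value
    except OverflowError:
        # Same failure that sets seen_object = 1 in the excerpt.
        return False
    return True


too_big = INT64_MAX + 1  # first value that no longer fits in int64

print(fits_int64_buffer(INT64_MAX))
print(fits_int64_buffer(too_big))

# A uint64 buffer accepts the same value without error.
ubuf = np.empty(1, dtype=np.uint64)
ubuf[0] = too_big
print(int(ubuf[0]) == too_big)
```

Because only an int64 buffer is tried, any value above `2**63 - 1` sends the whole array down the object path, even though a uint64 buffer could hold it.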

jtratner · 2013-09-15T14:40:26Z

it's not hard to set this up, I just wanted to clarify I had the right idea...going to fix it now.

jtratner · 2013-09-15T14:42:01Z

@cpcloud what I mean by the second question is what should be returned from this:

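    import sys
    import numpy as np
    import pandas.lib as lib

    # sys.maxint + 5 exceeds the int64 maximum on a 64-bit build
    arr = np.array([-5, sys.maxint + 5, 3], dtype=object)
    lib.maybe_convert_objects(arr)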

It should be object, right? Otherwise the -5 becomes gobbledygook.
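One way to picture the rule being proposed here, as a sketch rather than the actual pandas implementation: use int64 when everything fits, uint64 when all values are non-negative and fit, and object otherwise. `infer_integer_dtype` and `convert_integers` are hypothetical names introduced only for illustration.

```python
import numpy as np

INT64_MIN = -2**63
INT64_MAX = 2**63 - 1
UINT64_MAX = 2**64 - 1


def infer_integer_dtype(values):
    """Pick a dtype for a non-empty sequence of Python ints (hypothetical)."""
    lo, hi = min(values), max(values)
    if INT64_MIN <= lo and hi <= INT64_MAX:
        return np.dtype(np.int64)
    if lo >= 0 and hi <= UINT64_MAX:
        return np.dtype(np.uint64)
    # Negative values mixed with values above the int64 maximum
    # (or anything beyond uint64) cannot share an integer dtype.
    return np.dtype(object)


def convert_integers(values):
    """Build an array using the dtype chosen above (hypothetical)."""
    return np.array(values, dtype=infer_integer_dtype(values))


mixed = convert_integers([-5, 2**63 + 4, 3])
print(mixed.dtype)  # object, so -5 is preserved as-is
print(convert_integers([1, 2**63 + 4]).dtype)
print(convert_integers([-5, 3]).dtype)
```

Falling back to object for the mixed case keeps every element as its original Python int, which is the behaviour asked about above.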

jtratner · 2013-09-15T15:29:06Z

Well, this is mostly useless anyways, because BlockManager converts uint64 to object internally in form_block:

        elif issubclass(v.dtype.type, np.integer):
            if v.dtype == np.uint64:
                # HACK #2355 definite overflow
                if (v > 2 ** 63 - 1).any():
                    object_items.append((i, k, v))
                    continue
            int_items.append((i, k, v))

So we need an unsigned int type or something in the block manager
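The overflow check from the excerpt above can be tried in isolation. This sketch assumes NumPy; `exceeds_int64` is a hypothetical name, and the threshold is written as an explicit `np.uint64` scalar (a small variation on the quoted code) so that the comparison stays in unsigned integer arithmetic.

```python
import numpy as np


def exceeds_int64(v):
    """Return True if a uint64 array holds any value above the int64 maximum.

    Hypothetical helper modelled on the `(v > 2 ** 63 - 1).any()` check.
    """
    if v.dtype != np.uint64:
        return False
    return bool((v > np.uint64(2**63 - 1)).any())


small = np.array([1, 2], dtype=np.uint64)
large = np.array([1, 2**63 + 4], dtype=np.uint64)

print(exceeds_int64(small))  # would stay an integer block
print(exceeds_int64(large))  # would be routed to object_items
```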

jtratner · 2013-09-15T15:31:35Z

Anyways, working version of lib.maybe_convert_objects here: https://github.com/jtratner/pandas/tree/GH4471_fix_uint64_maybe_convert_objects

pwaller · 2016-05-16T18:17:13Z

I keep hitting this while importing a dataset which has uint64's in it. Is there anything I can do to help it along, given that someone already made a patch but it didn't get in?

jreback · 2016-05-16T18:18:41Z

where's the patch?

pwaller · 2016-05-17T08:39:40Z

@jreback see @jtratner's comment above. Is the patch unsuitable or is it just that it wasn't shepherded into master?

jreback · 2016-05-17T09:47:07Z

that's 2 years old - if someone wants to cherry-pick it and submit it, then we can look

DrRibosome · 2016-11-19T14:30:36Z

also hitting this bug - just wondering if the fix is in progress, or if interest is simply too low

jreback · 2016-11-19T14:38:56Z

well, we need someone motivated to push a fix

Adds handling for `uint64` objects during conversion. When negative numbers and `uint64` are detected, we then convert the result to `object`. Picks up where pandas-dev#4845 left off. Closes pandas-dev#4471.

Author: gfyoung <gfyoung17@gmail.com>

Closes pandas-dev#14916 from gfyoung/convert-objects-uint64 and squashes the following commits:

ed325cd [gfyoung] BUG: Convert uint64 in maybe_convert_objects

ghost assigned jtratner Sep 15, 2013

jtratner mentioned this issue Sep 15, 2013

BUG: Make lib.maybe_convert_objects work with uint64 #4845

Closed

jreback modified the milestones: Someday, 0.14.0 Feb 18, 2014

teto mentioned this issue Oct 27, 2015

OverflowError when loading uint64 from csv #11440

Closed

jreback modified the milestones: Next Major Release, Someday Oct 27, 2015

jreback added Difficulty Intermediate labels Oct 27, 2015

jreback mentioned this issue Dec 15, 2015

pd.DataFrame converts np.uint64 greater than 2**63-1 to objects #11846

Closed

DrRibosome unassigned jtratner Nov 19, 2016

jreback mentioned this issue Nov 23, 2016

Series.unique converts uint64 to int64 (with overflow) #14721

Closed

chris-b1 mentioned this issue Dec 14, 2016

DataFrame constructor converts uint64 series to object series #14881

Closed

gfyoung mentioned this issue Dec 19, 2016

BUG: Convert uint64 in maybe_convert_objects #14916

Closed

jreback modified the milestones: 0.20.0, Next Major Release Dec 19, 2016

jreback closed this as completed in 0c52813 Dec 20, 2016
