bincount does not accept input of type > N.uint16 (Trac #225) #823

Open
numpy-gitbot opened this issue Oct 19, 2012 · 10 comments

Comments

@numpy-gitbot

Original ticket http://projects.scipy.org/numpy/ticket/225 on 2006-08-03 by @stefanv, assigned to @teoliphant.

Under r2944:

In [22]: N.bincount(N.array([1],N.uint16))
Out[22]: array([0, 1])

In [23]: N.bincount(N.array([1],N.uint32))
---------------------------------------------------------------------------
exceptions.TypeError                                 Traceback (most recent call last)

/home/stefan/work/scipy/hough/<ipython console>

TypeError: array cannot be safely cast to required type
@numpy-gitbot
Author

@stefanv wrote on 2006-08-03

Can be improved by changing to NPY_UINTP, but that still won't allow 64-bit integers. Is there a way to easily fix this for 32-bit systems?

@numpy-gitbot
Author

Milestone changed to 1.1 by @alberts on 2007-05-12

@numpy-gitbot
Author

Milestone changed to Unscheduled by @cournape on 2009-03-02

@numpy-gitbot
Author

@mwiebe wrote on 2011-03-23

This function still errors (with uint64 on a 64-bit machine). Bincount is a candidate to convert to using the iterator with buffering, since currently it will cause a copy if the input isn't contiguous and the right type.

@numpy-gitbot
Author

@bsouthey wrote on 2011-05-09

My C is not very good, but hopefully someone with better C can correct the patch. The less obvious change (at least to me) was that PyArray_ContiguousFromAny was being called with the signed type PyArray_INTP rather than the unsigned PyArray_UINTP.

According to the comment on the bincount code in "numpy/lib/src/_compiled_base.c", the first argument must be an array of non-negative integers (relevant lines):

 * bincount accepts one, two or three arguments. The first is an array of
 * non-negative integers The second, if present, is an array of weights,

Consequently I changed bincount to use unsigned ints instead of signed ints.

The patch is incorrect if the mxx and mnx functions can be used outside of bincount. In that case new functions would have to be declared.

@numpy-gitbot
Author

Attachment added by @bsouthey on 2011-05-09: 0001-bincount-unsigned-ints.patch

@numpy-gitbot
Author

Attachment added by @bsouthey on 2011-05-09: 0002-Redo-bincount-signed-int-change.patch

@numpy-gitbot
Author

@bsouthey wrote on 2011-05-09

Okay, it still was not that simple! I had to change one PyObject variable into a PyArrayObject. This allowed me to get the input dtype for PyArray_ContiguousFromAny. Both patches need to be applied because I do not know how to produce a single patch with git without redoing everything - sorry!

@charris charris added the Patch label Feb 18, 2014
jaimefrio added a commit to jaimefrio/numpy that referenced this issue Feb 19, 2014
This adds an axis argument to np.bincount, which can now take multidimensional
arrays and do the counting over an arbitrary number of axes. `axis` defaults
to `None`, which does the counting over all the axes, i.e. over the flattened
array.

The shape of the output is computed by removing from `list` (the input array)
all dimensions in `axis`, and appending a dimension of length
`max(max(list) + 1, minlength)` to the end. `out[..., n]` will hold the number
of occurrences of `n` at the given position over the selected axes.

If a `weights` argument is provided, its shape is broadcasted with `list`
before removing the axes. In this case, `axis` refers to the axes of `list`
*before* broadcasting, and `out[..., n]` will hold the sum of `weights` over
the selected axes, at all positions where `list` takes the value `n`.

The general case is handled with nested iterators, but shortcuts without
having to set up an iterator are provided for 1D cases, with no performance
loss against the previous version.

As a plus, this PR also solves numpy#823, by providing specialized functions for
all integer types to find the max. There are also specialized functions for
all integer types for counting and doing weighted counting.
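
The output shape described in this commit message can be illustrated in pure NumPy. This is only a sketch of the semantics, not the C implementation from the commit: `bincount_axis` is a hypothetical name, weights are left out, and it relies on the existing 1D `np.bincount` by giving each output position its own block of bins.

```python
import numpy as np


def bincount_axis(x, axis=None, minlength=0):
    """Hypothetical helper: count over `axis`, bins go on a new last axis."""
    x = np.asarray(x, dtype=np.intp)
    if x.size == 0:
        raise ValueError("empty input is not handled in this sketch")
    if x.min() < 0:
        raise ValueError("input must be non-negative")
    if axis is None:
        axis = tuple(range(x.ndim))
    elif isinstance(axis, int):
        axis = (axis,)
    axis = tuple(a % x.ndim for a in axis)
    keep = [a for a in range(x.ndim) if a not in axis]

    # Move the counted axes to the end and flatten them into one.
    moved = np.transpose(x, keep + list(axis))
    kept_shape = moved.shape[:len(keep)]
    n_rows = int(np.prod(kept_shape))
    rows = moved.reshape(n_rows, -1)

    # Shift each row into its own block of bins so one 1D bincount suffices.
    n_bins = max(int(x.max()) + 1, minlength)
    offsets = np.arange(n_rows)[:, None] * n_bins
    counts = np.bincount((rows + offsets).ravel(), minlength=n_rows * n_bins)
    return counts.reshape(kept_shape + (n_bins,))


x = np.array([[0, 1, 1],
              [2, 2, 2]])
per_row = bincount_axis(x, axis=1)   # shape (2, 3): one histogram per row
overall = bincount_axis(x)           # same as np.bincount(x.ravel())
```

With `axis=1` the result has shape `(2, 3)`, and `per_row[i, n]` is the number of times `n` occurs in row `i`, matching the `out[..., n]` description above.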
jaimefrio added a commit to jaimefrio/numpy that referenced this issue Feb 26, 2014
Gives array-likes that cannot be cast to np.intp a second chance, by
comparing their entries one by one to NPY_INTP_MAX and forcing the cast if
none exceeds it. Fixes numpy#823.
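
The "second chance" idea can be approximated from Python as a workaround. The sketch below is not the code from this commit: `bincount_checked` is a hypothetical wrapper that compares the largest entry against the `np.intp` maximum and only then forces the cast that `np.bincount` refuses to do on its own.

```python
import numpy as np


def bincount_checked(values, minlength=0):
    """Hypothetical wrapper: bincount for unsigned input that fits in intp."""
    arr = np.asarray(values)
    needs_check = (arr.dtype.kind == "u"
                   and not np.can_cast(arr.dtype, np.intp, casting="safe"))
    if needs_check:
        # Compare as uint64 so the check is exact (no float rounding).
        limit = np.uint64(np.iinfo(np.intp).max)
        if arr.size and arr.max() > limit:
            raise ValueError("value too large to be used as a bin index")
        # Every entry fits, so the otherwise unsafe cast loses nothing.
        arr = arr.astype(np.intp, casting="unsafe")
    return np.bincount(arr, minlength=minlength)


counts = bincount_checked(np.array([0, 2, 2], dtype=np.uint64), minlength=4)
```

Checking only the maximum is enough here because unsigned input cannot contain negative values.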
@OmerJog

OmerJog commented Feb 6, 2019

Is this still an issue?

@endolith
Contributor

endolith commented Apr 17, 2019

@OmerJog The original example doesn't fail:

np.bincount(np.array([1], np.uint32))
Out[1]: array([0, 1], dtype=int64)

but this does:

np.bincount(np.array([1], np.uint64))
Traceback (most recent call last):

  File "<ipython-input-2-850f8c15a097>", line 1, in <module>
    np.bincount(np.array([1], np.uint64))

TypeError: Cannot cast array data from dtype('uint64') to dtype('int64') according to the rule 'safe'
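
The difference between the two cases follows from the 'safe' casting rule named in the error message. A quick check, assuming (as the message suggests) that the input is cast to `np.intp`, which is `int64` on 64-bit platforms:

```python
import numpy as np

# For each unsigned type, ask whether every value fits in np.intp.
safe = {
    np.dtype(dt).name: np.can_cast(dt, np.intp, casting="safe")
    for dt in (np.uint16, np.uint32, np.uint64)
}

for name, ok in safe.items():
    print(name, "->", np.dtype(np.intp).name, "safe:", ok)
```

`uint64` can never be cast safely to a signed type of the same width, while `uint32` is safe only where `np.intp` is 64 bits wide, which would explain why the original report failed on a 32-bit system.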

pranaysy added a commit to pranaysy/entropy that referenced this issue Mar 24, 2021
Reasons:
 - np.bincount can't work with uint64 safely
    open issues:
    numpy/numpy#17760
    numpy/numpy#823
    numpy/numpy#4384

 - Not easy to exceed the largest 32-bit uint: 4294967295

 - UTF-8 can have 4 bytes or 32 bits max

 - Platform-independence is not the goal since other parts of the
   module have 64-bit floats
pranaysy added a commit to pranaysy/entropy that referenced this issue Mar 25, 2021
 - The function entropy.lziv_complexity now converts all valid input
   types to arrays of type np.uint32 for fast computation with Numba.

 - For strings and list of strings, utf-8 numeric representations
   are used for integer arrays using built-in ord

 - Normalization uses np.bincount instead of set on numpy arrays, giving a
   significant speedup on large arrays. A major reason for using 32-bit
   rather than 64-bit uints is that np.bincount can't work with uint64
   safely. Several open issues about this:
      numpy/numpy#17760
      numpy/numpy#823
      numpy/numpy#4384
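
A rough illustration of the approach this commit message describes, counting distinct symbols with np.bincount on a uint32 array rather than with set. This is not the code from pranaysy/entropy: `to_uint32` and `alphabet_size` are hypothetical names, and the sketch assumes a 64-bit platform where uint32 casts safely to np.intp.

```python
import numpy as np


def to_uint32(seq):
    """Hypothetical helper: map a string or integer sequence to uint32."""
    if isinstance(seq, str):
        # ord gives one integer code point per character.
        seq = [ord(ch) for ch in seq]
    return np.asarray(seq, dtype=np.uint32)


def alphabet_size(seq):
    """Number of distinct symbols, via bincount instead of len(set(...))."""
    counts = np.bincount(to_uint32(seq))
    return int(np.count_nonzero(counts))


n_text = alphabet_size("abracadabra")
n_ints = alphabet_size([0, 1, 1, 0, 3])
```

Note that np.bincount allocates one bin per integer up to the largest value, so this trades memory for speed when the values are large.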
pranaysy added a commit to pranaysy/entropy that referenced this issue Mar 29, 2021
pranaysy added a commit to pranaysy/entropy that referenced this issue Mar 29, 2021
pranaysy added a commit to pranaysy/antropy that referenced this issue Mar 29, 2021
 - The function entropy.lziv_complexity now converts all valid input
   types to arrays of type np.uint32 for fast computation with Numba.

 - For strings and list of strings, utf-8 numeric representations
   are used for integer arrays using built-in ord

 - Normalization uses np.bincount instead of set on numpy arrays, giving a
   significant speedup on large arrays. A major reason for using 32-bit
   rather than 64-bit uints is that np.bincount can't work with uint64
   safely. Several open issues about this:
    numpy/numpy#17760
    numpy/numpy#823
    numpy/numpy#4384