
enhancements numpy getitem

DagSverreSeljebotn edited this page May 25, 2008 · 2 revisions

The getitem operator

This page prototypes __getitem__ using the Python interface, introducing whatever fancy features we'd need to pull that off. If that won't work, we'll roll our own __cgetitem__ with a more easily compilable interface, at least for now.

How Python getitem works

>>> class A:
...     def __getitem__(self, idx): print(repr(idx))
>>> a = A()
>>> a[2]
2
>>> a[2,3]
(2, 3)
>>> a[...,4]
(Ellipsis, 4)
>>> a[1:2]
slice(1, 2, None)
>>> a[1:2:-1, ..., 4]
(slice(1, 2, -1), Ellipsis, 4)
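
One detail of the session above matters for the prototype further down: a single index arrives as a bare object, not as a 1-tuple, which is why the prototype needs separate int and tuple branches. A minimal Python 3 sketch of that, where the class name `Recorder` is purely illustrative:

```python
class Recorder:
    """Returns the index object unchanged so that it can be inspected."""

    def __getitem__(self, idx):
        return idx


r = Recorder()
single = r[2]              # a bare int, not wrapped in a tuple
pair = r[2, 3]             # a tuple of ints
one_tuple = r[2,]          # the trailing comma makes this a 1-tuple
mixed = r[1:2:-1, ..., 4]  # slices and Ellipsis are passed through as objects
```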


Comment legend:
  • ECTO - "Easily Compile-Time Optimizable", provided that the function is inlined.
  • CTO - Possibly compile-time optimizable with a not-too-complex visitor that simulates a Python interpreter for compile-time-known expressions and statements. That is, a CTO comment below means that the values involved will be known at compile time in the regular use case (and that leaving the code unoptimized is the right thing to do when they are not).
  • DBA - "Disappears Because of Assumptions".
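
To make the ECTO and CTO labels concrete, here is a hand-written Python sketch of what such an optimization would amount to for a call like a[i, j]. It is not output from any compiler. The function names `getitem_offset` and `getitem_offset_2d` are hypothetical, and strides are assumed to be byte offsets per axis.

```python
def getitem_offset(strides, index):
    """Generic form: branches on the runtime type of `index`."""
    if isinstance(index, int):           # ECTO: the type is known at the call site
        return strides[0] * index
    offset = 0
    for axis, item in enumerate(index):  # CTO: tuple length and item types are known
        offset += strides[axis] * item
    return offset


def getitem_offset_2d(strides, i, j):
    """What the generic form reduces to once `index` is known to be (i, j)."""
    return strides[0] * i + strides[1] * j
```

Once the isinstance test is resolved and the loop is unrolled, only the arithmetic remains, which is the outcome the legend has in mind.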

I'll always treat self.data as the underlying C buffer rather than the Python attribute; this issue is somewhat orthogonal, and anyway I'd guess that [] would lead to the C "overload" getting selected.

This is inside a cdef class ndarray block in a pxd file.


# Use "generic" (parameter polymorphism) in order to avoid certain typing issues
# that will be explained as we go. "index" will have to be compile-time optimized
# anyway, so we might as well make it object.
# "self" is generic because in my parameter polymorphism draft, I propose that
# assumptions get carried over when "generic" is used. I.e., using "generic"
# for self makes sure that one instance of the method is created per
# assumption combination. (This is, I suppose, only really useful if the method
# is not inlined though. So we might make it "object" and rely on inlining
# optimization instead *shrug*)

cdef generic __getitem__(generic self, object index):
    if isinstance(index, int): # ECTO
        if self.ndim != 1: # DBA
            return (<object>self)[index] # gets a slice
        # Somehow use an assumption, self.dtype, in a type context. If the dtype assumption
        # is not made then this will raise a compile-time error.
        # Since the user is "probably" assigning this to a variable of the right type,
        # the "generic" return value will be the correct one (or if not, the return statement
        # will turn into raising the correct coercion error)
        return (<self.dtype*>(self.data + self.strides[0] * index))[0]
    else: # ECTO, is a tuple
        # All of the below is CTO, but definitely the hardest part
        offset = 0
        for idx, item in enumerate(index):
            if item.__class__ is slice or item is Ellipsis: break
            else: offset += self.strides[idx] * item
        else:
            # No break: every item was a plain index, so we can use direct access
            return (<self.dtype*>(self.data + offset))[0]
        # Encountered break, fall back to Python
        return (<object>self)[index]
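
The control flow of the prototype can be tried out in plain Python 3. The following is only a model under stated assumptions, not the proposed Cython code. `StridedBuffer` and its `_read` and `_fallback` helpers are hypothetical names. A `bytes` object read with `struct` stands in for the C buffer, and the layout is assumed to be C-contiguous doubles. The `len(index) == self.ndim` guard is an addition that the prototype does not have.

```python
import struct


class StridedBuffer:
    """Hypothetical stand-in for the C-level view of an ndarray."""

    def __init__(self, values, shape, fmt="d"):
        self.fmt = fmt
        self.shape = tuple(shape)
        self.ndim = len(self.shape)
        self.data = struct.pack("%d%s" % (len(values), fmt), *values)
        # C-contiguous strides, in bytes, computed from the last axis backwards.
        strides = []
        step = struct.calcsize(fmt)
        for extent in reversed(self.shape):
            strides.append(step)
            step *= extent
        self.strides = tuple(reversed(strides))

    def _read(self, offset):
        # Models (<self.dtype*>(self.data + offset))[0].
        return struct.unpack_from(self.fmt, self.data, offset)[0]

    def _fallback(self, index):
        # Models (<object>self)[index]; general indexing is out of scope here.
        raise NotImplementedError("general indexing: %r" % (index,))

    def __getitem__(self, index):
        if isinstance(index, int):
            if self.ndim != 1:
                return self._fallback(index)
            return self._read(self.strides[0] * index)
        offset = 0
        for axis, item in enumerate(index):
            if isinstance(item, slice) or item is Ellipsis:
                break
            offset += self.strides[axis] * item
        else:
            # Extra guard (assumption): only a full index addresses one element.
            if len(index) == self.ndim:
                return self._read(offset)
        return self._fallback(index)


a = StridedBuffer([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], (2, 3))
v = StridedBuffer([10.0, 20.0, 30.0], (3,))
```

Here a[1, 2] takes the direct path, while a[0, 1:2] and a[..., 1] hit the break and end up in the fallback. Negative indices and bounds checks are deliberately left out of the model.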