Skip to content

BUG: Setting an object array via slicing with nested lists invokes the objects __array__ methods unnecessarily #24336

@chriseclectic

Description

@chriseclectic

Describe the issue:

I have been running into issues with trying to use Numpy object arrays to contruct ND-Arrays of Python objects where I want to take advantage of the ND-indexing and other array features, but the values of the array are Python objects, which themselves have an __array__ method defined, but the __array__ method is inefficient and should be avoided.

In these cases I am not able to construct an object array from a nested list via slicing without invoking the objects __array__ method.

A simplified example of where this is a problem is as follows:

Consider a skeleton implementation of a Python class for N-qubit Pauli matrices. These objects have an efficient representation (for simplicity I will just use the length N string labels), but can also be represented inefficiently as shape (2 ** N, 2 **N) complex matrices. Suppose this class defines an __array__ method to convert to these inefficient complex arrays like shown in the bellow code example.

Now suppose I want to construct an object ND-array of Paulis. The simplest case would be

arr1 = np.empty([1], dtype=object)
arr1[0] = Pauli("X")

This does not invoke the Pauli.__array__ method and returns array([Pauli(X)], dtype=object).

However if I try and set via slicing

arr2 = np.empty([1], dtype=object)
arr2[:] = [Pauli("X")]
# Prints: Pauli.__array__ called

This invokes the array method, even though the returned array is an object array with the original objects as values (np.all(arr1 == arr2) is True).

Complete code shown bellow, where you can see where this becomes a problem for example by setting nq=20 which would require ~17TB memory per Pauli for each __array__ return.

Reproduce the code example:

import numpy as np

class Pauli:
    """Basic N-qubit Pauli"""

    _mats = {
        "I": np.eye(2, dtype=complex),
        "X": np.array([[0, 1], [1, 0]], dtype=complex),
        "Y": np.array([[0, -1j], [0, 1j]], dtype=complex),
        "Z": np.array([[1, 0], [0, -1]], dtype=complex),
    }

    def __init__(self, label):
        """Initialize a Pauli from a string label"""
        self.label = label
     
    def __repr__(self):
        return f"Pauli({self.label})"
    
    def to_matrix(self):
        """Convert to a (2**N, 2**N) complex matrix"""
        ret = np.eye(1, dtype=complex)
        for i in self.label:
            ret = np.kron(self._mats[i], ret)
        return ret

    def __array__(self, dtype=None):
        # Print to know when the `__array__` method is invoked
        print("Pauli.__array__ called")
        mat = self.to_matrix()
        if dtype:
            mat = mat.astype(dtype)
        return mat

    def __eq__(self, other):
        if isinstance(other, Pauli):
            return self.label == other.label
        return False

# Create an array-like nested list of Pauli objects
nq = 1
paulis = [[Pauli(nq * "I"), Pauli(nq * "X")],
          [Pauli(nq * "Y"), Pauli(nq * "Z")]]

# Initialize and empty array and set values to nested list elements
arr = np.empty([2, 2], dtype=object)
arr[:] = paulis

# This print `Pauli.__array__ called` 4 times, so invokes it for each Pauli
# Still returns an object array with the `Pauli` objects as values.

Error message:

No response

Runtime information:

My main numpy runtime is

1.23.5
3.9.13 | packaged by conda-forge | (main, May 27 2022, 17:01:00) 
[Clang 13.0.1 ]

I also tried on Numpy 1.25.2 and get same results.

Context for the issue:

This is probably quite a specific edge case, but it is something I've been struggling with for awhile and have had to come up with various messy work arounds to avoid the issue.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions