Bug: unclear handling of arrays of different shape with to_arrays #1380

@arturoptophys

Bug Report

Description

to_arrays() cannot combine arrays of different shapes when they differ along the second axis, but it does handle arrays of different sizes when they differ along the first axis.

Reproducibility

import numpy as np
import datajoint as dj

dj.conn()  # using datajoint.json
schema = dj.Schema("TEST")


@schema
class RandomNumbers(dj.Manual):
    definition = """
    idx: int32
    ---
    values1: <blob>
    values2: <blob>
    """
for idx, seed in enumerate([42, 7, 123]):
    RandomNumbers.insert1(
        {
            "idx": idx,
            # values1 shapes: (100,), (100, 1), (100, 2) -> differ along the second axis
            "values1": np.random.RandomState(seed).randn(100) if idx == 0 else np.random.RandomState(seed).randn(100, idx),
            # values2 shapes: (100,), (1, 100), (2, 100) -> differ along the first axis
            "values2": np.random.RandomState(seed).randn(100) if idx == 0 else np.random.RandomState(seed).randn(idx, 100),
        }
    )

RandomNumbers.to_arrays('values2').shape  # returns (3,)
RandomNumbers.to_arrays('values1').shape  # fails with the ValueError below
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
File ~/datajoint-python/src/datajoint/expression.py:859, in QueryExpression.to_arrays(self, include_key, order_by, limit, offset, squeeze, *attrs)
    858 try:
--> 859     arr = np.array(values)
    860 except ValueError:
    861     # Variable-size data (e.g., arrays of different shapes)

ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 2 dimensions. The detected shape was (3, 100) + inhomogeneous part.

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
Cell In[17], line 1
----> 1 RandomNumbers.to_arrays('values1')

File ~/datajoint-python/src/datajoint/expression.py:862, in QueryExpression.to_arrays(self, include_key, order_by, limit, offset, squeeze, *attrs)
    859         arr = np.array(values)
    860     except ValueError:
    861         # Variable-size data (e.g., arrays of different shapes)
--> 862         arr = np.array(values, dtype=object)
    863     result_arrays.append(arr)
    865 if include_key:

ValueError: could not broadcast input array from shape (100,1) into shape (100,)
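
The same behavior can likely be reproduced with NumPy alone, without a database connection. The following is a minimal sketch that mirrors the shapes from the example above; `try_object_array` is a hypothetical helper written for illustration, not part of DataJoint. It applies the same `np.array(values, dtype=object)` fallback shown in the traceback to both sets of shapes.

```python
import numpy as np

rng = np.random.RandomState(0)

# Same shapes as values2: the leading dimensions differ (100, 1, 2).
differ_first = [rng.randn(100), rng.randn(1, 100), rng.randn(2, 100)]
# Same shapes as values1: the leading dimension is always 100.
differ_second = [rng.randn(100), rng.randn(100, 1), rng.randn(100, 2)]


def try_object_array(values):
    """Hypothetical helper: return the object array, or the ValueError raised."""
    try:
        return np.array(values, dtype=object)
    except ValueError as err:
        return err


first_result = try_object_array(differ_first)
second_result = try_object_array(differ_second)

print(type(first_result).__name__, getattr(first_result, "shape", None))
print(type(second_result).__name__, second_result)
```

When all elements share the same leading dimension, NumPy appears to commit to a `(3, 100)` result and then fails to fit the 2-D elements into it, which matches the broadcast error above.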

Expected Behavior

If the arrays cannot be broadcast together, they should be stacked into a single object array, as is already done with values2 in the example.
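
One way to get the expected result, sketched here under the assumption that a 1-D object array with one entry per row is the desired output, is to pre-allocate the object array and fill it element by element. `stack_as_objects` is a hypothetical name used only for this illustration; it is not an existing DataJoint function and this is not necessarily how the library would fix it.

```python
import numpy as np


def stack_as_objects(values):
    """Hypothetical helper: collect arbitrary arrays into a 1-D object array.

    Pre-allocating with np.empty keeps NumPy from inspecting the element
    shapes, so it never tries to build a regular multi-dimensional array.
    """
    result = np.empty(len(values), dtype=object)
    for i, value in enumerate(values):
        result[i] = value  # stores the array itself as one element
    return result


# Shapes taken from values1 in the example: (100,), (100, 1), (100, 2).
values1 = [
    np.random.RandomState(42).randn(100),
    np.random.RandomState(7).randn(100, 1),
    np.random.RandomState(123).randn(100, 2),
]

stacked = stack_as_objects(values1)
print(stacked.shape)
print([item.shape for item in stacked])
```

With this approach the result for values1 would have the same outer shape as the one currently returned for values2, and each element keeps its original shape.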

Labels: bug, triage
