Fixes the swap of data and feature dimension to work in the general case. #214

kshiteejm · 2023-04-12T02:05:58Z

The previous implementation was broken because it used transpose, which assumes that `data_list` is a 2D array.

However, in certain cases (when all the feature value arrays have the same length) `data_list` can be a 3D array. The call `data_list = np.array(list(dataset.as_numpy_iterator()), dtype=object)` then merges the inner np arrays and turns `data_list` into one big 3D array.
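To make the failure mode concrete, here is a minimal sketch. The `records` list is a hypothetical stand-in for `list(dataset.as_numpy_iterator())`, and `np.swapaxes` is shown only as one way to swap the first two axes, not necessarily the exact change in this PR. With no arguments, transpose reverses every axis, so on a 3D array it also moves the feature-values axis.

```python
import numpy as np

# Hypothetical stand-in for list(dataset.as_numpy_iterator()):
# 3 traces, 2 features per trace, 4 values per feature (all the same length).
records = [[np.ones(4), np.zeros(4)] for _ in range(3)]
data_list = np.array(records, dtype=object)

# Equal lengths make numpy build one 3D array: (traces, features, values).
shape_in = data_list.shape

# transpose() with no arguments reverses all axes: (values, features, traces).
shape_transposed = data_list.transpose().shape

# Swapping only axes 0 and 1 gives (features, traces, values).
shape_swapped = np.swapaxes(data_list, 0, 1).shape

print(shape_in, shape_transposed, shape_swapped)
```

For a 2D array the two operations give the same result, which is why the problem only shows up when all feature lengths match.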


google-cla · 2023-04-12T02:06:02Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

mtrofin · 2023-04-12T14:18:44Z

Thanks, @kshiteejm!

To capture offline discussion and @kshiteejm's offline example:

This is how numpy behaves:

>>> np.array(list([[np.ones(2)], [np.ones(2)], [np.ones(2)]]), dtype=object)
array([[[1.0, 1.0]],

       [[1.0, 1.0]],

       [[1.0, 1.0]]], dtype=object)

>>> np.array(list([[np.ones(2)], [np.ones(2)], [np.ones(3)]]), dtype=object)
array([[array([1., 1.])],
       [array([1., 1.])],
       [array([1., 1., 1.])]], dtype=object)

In our case, the data is shaped as:

- 1st dimension is traces (i.e. 1 per module)
- 2nd dimension is features
- 3rd dimension is feature tensor values

The general case is that the feature values have different shapes, in which case the data is 2D with object values (the objects being arrays of various sizes). But if all features have exactly the same length, numpy merges them and the result is a single 3D array.

Thanks, @kshiteejm, for this clarification!
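As an illustration of the two cases described above, the sketch below builds both a ragged and a uniform `data_list` and swaps the trace and feature dimensions in each. The helper name `swap_trace_and_feature_axes` and the sample data are hypothetical, and using `np.swapaxes` is an assumption about one workable approach rather than a statement of what the merged commit does.

```python
import numpy as np


def swap_trace_and_feature_axes(data_list):
    """Hypothetical helper: swap axes 0 and 1 and leave any other axis alone."""
    return np.swapaxes(data_list, 0, 1)


# General case: the two features have different lengths (2 and 3), so numpy
# keeps a 2D object array of shape (traces, features) holding inner arrays.
ragged = np.array(
    [[np.ones(2), np.ones(3)] for _ in range(3)], dtype=object
)

# Special case: every feature has length 2, so numpy merges everything into
# one 3D object array of shape (traces, features, values).
uniform = np.array(
    [[np.ones(2), np.ones(2)] for _ in range(3)], dtype=object
)

swapped_ragged = swap_trace_and_feature_axes(ragged)
swapped_uniform = swap_trace_and_feature_axes(uniform)

print(ragged.shape, "->", swapped_ragged.shape)
print(uniform.shape, "->", swapped_uniform.shape)
```

In both cases the first dimension after the swap indexes features and the second indexes traces, and the per-feature values are left in place.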

kshiteejm requested a review from mtrofin April 12, 2023 02:05

mtrofin merged commit 0f24e63 into main Apr 12, 2023

mtrofin deleted the kshiteejm-patch-1 branch April 12, 2023 14:19
