bug: unable to cast from list(t) to list(python) #4426

@NellyWhads

Describe the bug

Hello hello,

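Daft panics when a UDF returns a list column whose elements are tensors or Python objects. The example below fails with either return dtype: the active list(tensor) decorator or the commented-out list(python) one.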

To Reproduce

"""Example of using a list of tensors in a dataframe."""

from typing import List

import daft
import numpy as np

# Create a list of 5 boxes with xyxy coordinates for each row
boxes = [
    [[100, 100, 200, 200], [300, 300, 400, 400], [500, 500, 600, 600], [700, 700, 800, 800], [900, 900, 1000, 1000]]
    for _ in range(5)
]

# Create the dataframe with the boxes column
df = daft.from_pylist([{"boxes": row_boxes} for row_boxes in boxes])
print(df.schema())
print(df.collect())

# @daft.udf(return_dtype=daft.DataType.list(daft.DataType.python()))
@daft.udf(return_dtype=daft.DataType.list(daft.DataType.tensor(daft.DataType.int64(), shape=(4,))))
def convert_boxes_to_numpy(boxes: daft.Series) -> List[List[np.ndarray]]:
    """Convert the boxes column to a list of numpy arrays."""
    return [[np.array(box) for box in boxes_] for boxes_ in boxes.to_pylist()]

df = df.with_column("boxes_numpy", convert_boxes_to_numpy(daft.col("boxes")))
print(df.schema())
print(df.collect())

Expected behavior

This should not raise "List not supported for ..." errors.
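
For reference, the nested structure the UDF is meant to produce can be checked without Daft. The sketch below repeats only the conversion step from the reproduction in plain NumPy; the helper name `convert_rows` is hypothetical, and the explicit `dtype=np.int64` is an assumption added to match the declared tensor dtype (the original relies on NumPy's platform default integer type).

```python
import numpy as np

# Same input as the reproduction: 5 rows, each holding 5 xyxy boxes.
boxes = [
    [[100, 100, 200, 200], [300, 300, 400, 400], [500, 500, 600, 600], [700, 700, 800, 800], [900, 900, 1000, 1000]]
    for _ in range(5)
]


def convert_rows(rows):
    """Hypothetical stand-in for the UDF body: one ndarray per box, grouped per row."""
    # dtype is pinned here (assumption) so every array matches tensor(int64, shape=(4,)).
    return [[np.array(box, dtype=np.int64) for box in row] for row in rows]


converted = convert_rows(boxes)

# Summarize the nesting: rows -> boxes per row -> shape and dtype of each box.
n_rows = len(converted)
n_boxes = len(converted[0])
box_shape = converted[0][0].shape
box_dtype = converted[0][0].dtype
print(n_rows, n_boxes, box_shape, box_dtype)
```

Each row is a Python list of five arrays of shape (4,), which is the layout that the list(tensor(int64, shape=(4,))) return dtype describes, so the failure appears to come from the cast on the Daft side rather than from the data the UDF returns.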

Component(s)

Expressions

Metadata

Labels

bug (Something isn't working), types (Issues related to the type system)
