Bug: TypeError / ValueError when materializing Array(String) feature views with Athena offline store #6325

@alan-gauthier-jt

Description

Materializing feature views that contain Array(String) columns using the Athena offline store fails intermittently with one of two errors:

Error 1: ValueError: The truth value of an empty array is ambiguous

Triggered when an entity row has no values set for an array column (e.g. tags = []).

File ".../feast/type_map.py", line 772, in _python_value_to_proto_value
    elif not pd.isnull(value):
ValueError: The truth value of an empty array is ambiguous. Use `array.size > 0` to check that an array is not empty.
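
One way to avoid this failure mode is to never hand a collection to a scalar null check. The following is a minimal sketch, not Feast code: `is_null_scalar` is a hypothetical helper, and `scalar_isnull` is a pluggable callable that would be `pd.isnull` in practice. The default used here only recognizes `None` and float NaN so that the example runs without third-party packages.

```python
import math


def _default_scalar_isnull(value):
    # Stand-in for pd.isnull on scalars: recognizes None and float NaN only.
    return value is None or (isinstance(value, float) and math.isnan(value))


def is_null_scalar(value, scalar_isnull=_default_scalar_isnull):
    """Hypothetical guard: a collection is never treated as a null scalar."""
    # Sized, non-string values (lists, tuples, numpy arrays) are collections,
    # even when empty, so the scalar null check is skipped for them.
    if hasattr(value, "__len__") and not isinstance(value, (str, bytes)):
        return False
    return bool(scalar_isnull(value))
```

With a guard like this, an empty array such as `tags = []` is routed to list handling instead of reaching `not pd.isnull(value)`.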

Error 2: TypeError: bad argument type for built-in operation

Triggered when any entity row has one or more values in an array column.

File ".../feast/type_map.py", line 905, in <listcomp>
    ProtoValue(**{field_name: proto_type(val=value)})
TypeError: bad argument type for built-in operation

Root Cause

Arrow/Athena deserializes Array(String) feature columns as numpy.ndarray with object dtype rather than plain Python lists. Three code paths in type_map.py do not handle this:

  1. Scalar null-check (_convert_scalar_values_to_proto): The line elif not pd.isnull(value) calls pd.isnull() on a numpy array, which returns an array of bools rather than a single bool. Applying not to that array raises ValueError because its truth value is ambiguous.

  2. Generic list conversion (_convert_list_values_to_proto): The call proto_type(val=value) passes the raw numpy.ndarray directly to the protobuf constructor. Protobuf rejects non-list types and raises TypeError. Additionally, Arrow nullable columns can produce None elements inside the ndarray, which protobuf StringList also rejects.

  3. Type validation (_validate_collection_item_types): None elements inside an ndarray fail the type(item) in valid_types check before they can be sanitized downstream.
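
The three cases above share one possible remedy: normalize the value to a plain Python list of strings before validation and proto construction. The sketch below is an illustration under stated assumptions, not the code in the linked PR. `to_proto_safe_str_list` is a hypothetical helper name; it relies only on the fact that numpy.ndarray exposes `tolist()`, and it follows the expected behavior described in this issue by mapping None elements to an empty string.

```python
def to_proto_safe_str_list(value):
    """Hypothetical helper: return a plain list of str from a list-like value."""
    # numpy.ndarray exposes tolist(), which yields a plain Python list;
    # other iterables (list, tuple) are copied with list().
    items = value.tolist() if hasattr(value, "tolist") else list(value)

    cleaned = []
    for item in items:
        if item is None:
            # Nullable Arrow columns can yield None elements; replace them
            # with an empty string so a string list can hold them.
            cleaned.append("")
        elif isinstance(item, str):
            # str() turns str subclasses (such as numpy.str_) into plain str.
            cleaned.append(str(item))
        else:
            raise TypeError(
                f"expected str or None, got {type(item).__name__}"
            )
    return cleaned
```

Sanitizing None before the type check means the validation step only ever sees str elements, which addresses the ordering problem described in item 3.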

Steps to Reproduce

  1. Define a FeatureView with an Array(String) field:
    Field(name="tags", dtype=Array(String))
  2. Materialize from Athena where some rows have non-empty arrays, some have empty arrays, and some have NULL values in array elements.
  3. Observe ValueError or TypeError in feast/type_map.py.

Expected Behavior

Materialization completes successfully. Array columns from Arrow/Athena are converted to proto-safe Python lists, with None elements replaced by an empty string.

Environment

  • Feast version: (any version with the generic list proto conversion)
  • Offline store: Athena
  • Python: 3.11
  • Feature column type: Array(String) (maps to ValueType.STRING_LIST)

Fix

I opened a PR with a fix for this issue: #6324
