copy.deepcopy() raises TypeError on And, Or, Not expressions #3297

@robreeves

Description

Apache Iceberg version

None

Please describe the bug 🐞

Calling copy.deepcopy() on an And, Or, or Not expression raises TypeError. Pydantic v2's BaseModel.__deepcopy__ allocates the copy with cls.__new__(cls), passing no arguments, but these classes define a __new__ that requires positional arguments.

import copy
from pyiceberg.expressions import And, EqualTo

copy.deepcopy(And(EqualTo("x", 1), EqualTo("y", 2)))
# TypeError: And.__new__() missing 2 required positional arguments: 'left' and 'right'
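
To illustrate the mechanism without depending on PyIceberg or Pydantic, here is a minimal stdlib-only sketch. All class names (Base, Leaf, BrokenAnd, FixedAnd) and the helper deepcopy_fails are hypothetical stand-ins, not the real implementations: Base only mimics a __deepcopy__ that allocates via cls.__new__(cls), and FixedAnd shows one possible way to avoid the error, not necessarily the fix PyIceberg should adopt.

```python
import copy


class Base:
    """Hypothetical stand-in for a base class whose __deepcopy__ allocates with cls.__new__(cls)."""

    def __deepcopy__(self, memo):
        cls = type(self)
        m = cls.__new__(cls)  # raises TypeError if __new__ needs extra positional args
        memo[id(self)] = m
        m.__dict__.update(copy.deepcopy(self.__dict__, memo))
        return m


class Leaf(Base):
    """Does not override __new__, so the base __deepcopy__ works."""

    def __init__(self, term, value):
        self.term = term
        self.value = value


class BrokenAnd(Base):
    """Overrides __new__ with required positional arguments, like the failing classes."""

    def __new__(cls, left, right):
        return super().__new__(cls)

    def __init__(self, left, right):
        self.left = left
        self.right = right


class FixedAnd(BrokenAnd):
    """One possible fix (an assumption): rebuild the copy through the normal constructor."""

    def __deepcopy__(self, memo):
        return type(self)(copy.deepcopy(self.left, memo), copy.deepcopy(self.right, memo))


def deepcopy_fails(obj):
    """Return True if copy.deepcopy(obj) raises TypeError."""
    try:
        copy.deepcopy(obj)
    except TypeError:
        return True
    return False
```

In this sketch, deepcopy_fails(BrokenAnd(Leaf("x", 1), Leaf("y", 2))) is True, while copying a Leaf or a FixedAnd succeeds. Going back through the constructor keeps any logic in __new__ in play, at the cost of re-running it on every copy.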

Use case
We use PyIceberg to load data for model training in Ray. Our PyIceberg filters are carried in a config object and pushed down to the dataset reads. Ray deep-copies that config object, which triggers this error. Other PyIceberg filters are unaffected because they don't override __new__.

    File "/export/apps/python/3.12/lib/python3.12/site-packages/ray/train/v2/api/data_parallel_trainer.py", line 179, in fit
      callbacks=self._create_default_callbacks(),
    File "/export/apps/python/3.12/lib/python3.12/site-packages/ray/train/v2/api/data_parallel_trainer.py", line 206, in _create_default_callbacks
      datasets_callback = DatasetsSetupCallback(
    File "/export/apps/python/3.12/lib/python3.12/site-packages/ray/train/v2/_internal/callbacks/datasets.py", line 54, in __init__
      self._data_config = copy.deepcopy(train_run_context.dataset_config)
    File "/export/apps/python/3.12/lib/python3.12/copy.py", line 162, in deepcopy
      y = _reconstruct(x, memo, *rv)
    File "/export/apps/python/3.12/lib/python3.12/copy.py", line 259, in _reconstruct
      state = deepcopy(state, memo)
    File "/export/apps/python/3.12/lib/python3.12/copy.py", line 136, in deepcopy
      y = copier(x, memo)
    File "/export/apps/python/3.12/lib/python3.12/copy.py", line 221, in _deepcopy_dict
      y[deepcopy(key, memo)] = deepcopy(value, memo)
    File "/export/apps/python/3.12/lib/python3.12/copy.py", line 136, in deepcopy
      y = copier(x, memo)
    File "/export/apps/python/3.12/lib/python3.12/copy.py", line 196, in _deepcopy_list
      append(deepcopy(a, memo))
    File "/export/apps/python/3.12/lib/python3.12/copy.py", line 162, in deepcopy
      y = _reconstruct(x, memo, *rv)
    File "/export/apps/python/3.12/lib/python3.12/copy.py", line 259, in _reconstruct
      state = deepcopy(state, memo)
    File "/export/apps/python/3.12/lib/python3.12/copy.py", line 136, in deepcopy
      y = copier(x, memo)
    File "/export/apps/python/3.12/lib/python3.12/copy.py", line 221, in _deepcopy_dict
      y[deepcopy(key, memo)] = deepcopy(value, memo)
    File "/export/apps/python/3.12/lib/python3.12/copy.py", line 162, in deepcopy
      y = _reconstruct(x, memo, *rv)
    File "/export/apps/python/3.12/lib/python3.12/copy.py", line 253, in _reconstruct
      y = func(*args)
    File "/export/apps/python/3.12/lib/python3.12/copy.py", line 252, in <genexpr>
      args = (deepcopy(arg, memo) for arg in args)
    File "/export/apps/python/3.12/lib/python3.12/copy.py", line 143, in deepcopy
      y = copier(memo)
    File "/export/apps/python/3.12/lib/python3.12/site-packages/pydantic/main.py", line 977, in __deepcopy__
      m = cls.__new__(cls)
  TypeError: And.__new__() missing 2 required positional arguments: 'left' and 'right'

Willingness to contribute

  • I can contribute a fix for this bug independently
  • I would be willing to contribute a fix for this bug with guidance from the Iceberg community
  • I cannot contribute a fix for this bug at this time
