`TypeError` when drawing from a strategy that produces `dataclasses` with a `defaultdict` attribute #3812

Stranger6667 · 2023-12-14T21:15:10Z

Since Hypothesis 6.92.0, some tests in Schematesis started to fail with a traceback like this:

@given(data=st.data())
  >   @settings(deadline=None)
  
  test/test_hypothesis.py:450: 
  _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
  test/test_hypothesis.py:476: in test_date_format
      case = data.draw(strategy)
  .tox/py38/lib/python3.8/site-packages/hypothesis/strategies/_internal/core.py:2106: in draw
      self.conjecture_data._observability_args[desc] = to_jsonable(result)
  .tox/py38/lib/python3.8/site-packages/hypothesis/strategies/_internal/utils.py:180: in to_jsonable
      return to_jsonable(dcs.asdict(obj))
  /opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/dataclasses.py:10[73](https://github.com/schemathesis/schemathesis/actions/runs/7162876716/job/19500510033#step:6:74): in asdict
      return _asdict_inner(obj, dict_factory)
  /opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/dataclasses.py:10[80](https://github.com/schemathesis/schemathesis/actions/runs/7162876716/job/19500510033#step:6:81): in _asdict_inner
      value = _asdict_inner(getattr(obj, f.name), dict_factory)
  /opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/dataclasses.py:1080: in _asdict_inner
      value = _asdict_inner(getattr(obj, f.name), dict_factory)
  /opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/dataclasses.py:1080: in _asdict_inner
      value = _asdict_inner(getattr(obj, f.name), dict_factory)
  /opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/dataclasses.py:1080: in _asdict_inner
      value = _asdict_inner(getattr(obj, f.name), dict_factory)
  _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
  
  obj = defaultdict(<class 'list'>, {}), dict_factory = <class 'dict'>
  
      def _asdict_inner(obj, dict_factory):
          if _is_dataclass_instance(obj):
              result = []
              for f in fields(obj):
                  value = _asdict_inner(getattr(obj, f.name), dict_factory)
                  result.append((f.name, value))
              return dict_factory(result)
          elif isinstance(obj, tuple) and hasattr(obj, '_fields'):
              # obj is a namedtuple.  Recurse into it, but the returned
              # object is another namedtuple of the same type.  This is
              # similar to how other list- or tuple-derived classes are
              # treated (see below), but we just need to create them
              # differently because a namedtuple's __init__ needs to be
              # called differently (see bpo-34363).
      
              # I'm not using namedtuple's _asdict()
              # method, because:
              # - it does not recurse in to the namedtuple fields and
              #   convert them to dicts (using dict_factory).
              # - I don't actually want to return a dict here.  The main
              #   use case here is json.dumps, and it handles converting
              #   namedtuples to lists.  Admittedly we're losing some
              #   information here when we produce a json list instead of a
              #   dict.  Note that if we returned dicts here instead of
              #   namedtuples, we could no longer call asdict() on a data
              #   structure where a namedtuple was used as a dict key.
      
              return type(obj)(*[_asdict_inner(v, dict_factory) for v in obj])
          elif isinstance(obj, (list, tuple)):
              # Assume we can create an object of this type by passing in a
              # generator (which is not true for namedtuples, handled
              # above).
              return type(obj)(_asdict_inner(v, dict_factory) for v in obj)
          elif isinstance(obj, dict):
  >           return type(obj)((_asdict_inner(k, dict_factory),
                                _asdict_inner(v, dict_factory))
                               for k, v in obj.items())
  E           TypeError: first argument must be callable or None
  E           Falsifying example: test_date_format(
  E               data=data(...),
  E           )

I am not sure if it is an issue on the Python or Hypothesis side, but it could be reproduced with the following snippet (running under pytest):

from collections import defaultdict
from dataclasses import dataclass, field
from hypothesis import strategies as st, given


@dataclass
class T:
    key: defaultdict = field(default_factory=lambda: defaultdict(list))
    

@given(st.data())
def test_t(data):
    data.draw(st.builds(T))

Alternatively, I can try to work around this on the Schemathesis side. Let me know what you think! :)

The text was updated successfully, but these errors were encountered:

tybug · 2023-12-14T21:52:04Z

Looks like this is an issue with dataclasses.asdict not supporting defaultdict. stdlib recently added support for it: python/cpython#32056. I'm not sure which versions that patch landed in, but I can reproduce with 3.8.18, 3.10.13, 3.11.5, but not 3.12.0, so it seems safe to assume this was fixed only in 3.12.

We could try/catch dcs.asdict and fall back to the pretty printer here, or try and vendor the fixed asdict method. @Zac-HD what do you think?

Zac-HD · 2023-12-14T23:39:33Z

I'd vote to vendor the fixed asdict() function, and use it on Python < 3.12 - we have many such cases in compat.py, even after dropping py2 erased the most annoying cases. It's only a little more work for us, and a noticeably better experience for users 🙂

We should IMO also guard this line with an if TESTCASE_CALLBACKS: for a (very small) perf / stability win in the common case where nobody is going to look at the observability data. My fault for forgetting to do that originally 😅

Stranger6667 added the bug something is clearly wrong here label Dec 14, 2023

tybug mentioned this issue Dec 15, 2023

Vendor dataclasses.asdict #3813

Merged

Zac-HD closed this as completed in #3813 Dec 16, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`TypeError` when drawing from a strategy that produces `dataclasses` with a `defaultdict` attribute #3812

`TypeError` when drawing from a strategy that produces `dataclasses` with a `defaultdict` attribute #3812

Stranger6667 commented Dec 14, 2023

tybug commented Dec 14, 2023 •

edited

Zac-HD commented Dec 14, 2023 •

edited

TypeError when drawing from a strategy that produces dataclasses with a defaultdict attribute #3812

TypeError when drawing from a strategy that produces dataclasses with a defaultdict attribute #3812

Comments

Stranger6667 commented Dec 14, 2023

tybug commented Dec 14, 2023 • edited

Zac-HD commented Dec 14, 2023 • edited

`TypeError` when drawing from a strategy that produces `dataclasses` with a `defaultdict` attribute #3812

`TypeError` when drawing from a strategy that produces `dataclasses` with a `defaultdict` attribute #3812

tybug commented Dec 14, 2023 •

edited

Zac-HD commented Dec 14, 2023 •

edited