Flaky TestSerializers.test_params test #35699

@potiuk

Recently, TestSerializers.test_params started failing intermittently.

This looks like a problem in either the tests or the serde implementation: in some cases, discovery of the serializable classes does not seem to work properly when serde is imported.

It happens rarely and is not easy to reproduce locally. From a quick look, it might be a side effect of another test: I have a feeling that some other test leaves behind a thread that clears the list of classes registered with serde, and that this cleanup happens somewhat randomly.

cc: @bolkedebruin - maybe you can take a look, or have an idea where it comes from? It might be fastest for you, as you know the discovery mechanism best and wrote most of the tests there. Maybe there are some specially crafted test cases somewhere that do a setup/teardown, or just a cleanup of the serde-registered classes, that could cause such an effect?
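
To make the suspected cross-test interference concrete, here is a minimal sketch of one way it could arise. All names in it (`_config`, `get_patterns`, `is_allowed`, `leaky_test`) are hypothetical stand-ins and not Airflow's actual API, and the mechanism itself is only an assumption, not a confirmed root cause: if the allow list is compiled once and cached, a test that overrides the setting and restores it without clearing the cache leaves a stale allow list behind for whichever test runs next on the same worker.

```python
import functools
import re

# Hypothetical stand-in for a configuration value; not Airflow's real config API.
_config = {"allowed_classes": "airflow.*"}


@functools.lru_cache(maxsize=None)
def get_patterns():
    """Compile the allow list once and cache it for the life of the process."""
    return tuple(re.compile(p) for p in _config["allowed_classes"].split())


def is_allowed(classname):
    """Return True if classname fully matches any cached allow-list pattern."""
    return any(p.fullmatch(classname) for p in get_patterns())


def leaky_test():
    """Hypothetical test: overrides the setting, restores it, forgets the cache."""
    original = _config["allowed_classes"]
    _config["allowed_classes"] = "only.this.Class"
    get_patterns.cache_clear()  # make the override visible to this test
    try:
        assert is_allowed("only.this.Class")
    finally:
        # The setting is restored, but the cached patterns are now stale.
        _config["allowed_classes"] = original


target = "airflow.models.param.ParamsDict"

before = is_allowed(target)     # allow list built from the default setting
leaky_test()                    # some other test runs first on the same worker
after = is_allowed(target)      # stale cache: the class is no longer allowed

get_patterns.cache_clear()      # the cleanup the leaky test should have done
recovered = is_allowed(target)  # allow list rebuilt from the restored setting

print(before, after, recovered)
```

In this sketch the failure depends only on which tests share a worker and in what order they run, which would be consistent with a rare failure under parallel execution (the `[gw3]` worker in the error below) that is hard to reproduce locally.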

Example error: https://github.com/apache/airflow/actions/runs/6898122803/job/18767848684?pr=35693#step:5:754

Error:

_________________________ TestSerializers.test_params __________________________
[gw3] linux -- Python 3.8.18 /usr/local/bin/python

self = <tests.serialization.serializers.test_serializers.TestSerializers object at 0x7fb113165550>

    def test_params(self):
        i = ParamsDict({"x": Param(default="value", description="there is a value", key="test")})
        e = serialize(i)
>       d = deserialize(e)

tests/serialization/serializers/test_serializers.py:173: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

o = {'__classname__': 'airflow.models.param.ParamsDict', '__data__': {'x': 'value'}, '__version__': 1}
full = True, type_hint = None

    def deserialize(o: T | None, full=True, type_hint: Any = None) -> object:
        """
        Deserialize an object of primitive type and uses an allow list to determine if a class can be loaded.
    
        :param o: primitive to deserialize into an arbitrary object.
        :param full: if False it will return a stringified representation
                     of an object and will not load any classes
        :param type_hint: if set it will be used to help determine what
                     object to deserialize in. It does not override if another
                     specification is found
        :return: object
        """
        if o is None:
            return o
    
        if isinstance(o, _primitives):
            return o
    
        # tuples, sets are included here for backwards compatibility
        if isinstance(o, _builtin_collections):
            col = [deserialize(d) for d in o]
            if isinstance(o, tuple):
                return tuple(col)
    
            if isinstance(o, set):
                return set(col)
    
            return col
    
        if not isinstance(o, dict):
            # if o is not a dict, then it's already deserialized
            # in this case we should return it as is
            return o
    
        o = _convert(o)
    
        # plain dict and no type hint
        if CLASSNAME not in o and not type_hint or VERSION not in o:
            return {str(k): deserialize(v, full) for k, v in o.items()}
    
        # custom deserialization starts here
        cls: Any
        version = 0
        value: Any = None
        classname = ""
    
        if type_hint:
            cls = type_hint
            classname = qualname(cls)
            version = 0  # type hinting always sets version to 0
            value = o
    
        if CLASSNAME in o and VERSION in o:
            classname, version, value = decode(o)
    
        if not classname:
            raise TypeError("classname cannot be empty")
    
        # only return string representation
        if not full:
            return _stringify(classname, version, value)
    
        if not _match(classname) and classname not in _extra_allowed:
>           raise ImportError(
                f"{classname} was not found in allow list for deserialization imports. "
                f"To allow it, add it to allowed_deserialization_classes in the configuration"
            )
E           ImportError: airflow.models.param.ParamsDict was not found in allow list for deserialization imports. To allow it, add it to allowed_deserialization_classes in the configuration

airflow/serialization/serde.py:246: ImportError

Committer

  • I acknowledge that I am a maintainer/committer of the Apache Airflow project.

Labels: Quarantine (issues that are occasionally failing and are quarantined), kind:meta (high-level information important to the community)
