downloadermiddlewares.retry.BackwardsCompatibilityMetaclass does not provide backward compatibility for middleware instances #6049

Closed
Prometheus3375 opened this issue Sep 13, 2023 · 1 comment · Fixed by #6050

Prometheus3375 commented Sep 13, 2023

Description

Previously, EXCEPTIONS_TO_RETRY was a class attribute of RetryMiddleware, so it could be accessed in two ways:

  • RetryMiddleware and its subclasses could access it via cls.EXCEPTIONS_TO_RETRY.
  • Instances of RetryMiddleware and of its subclasses could access it via self.EXCEPTIONS_TO_RETRY.

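In 2.10, EXCEPTIONS_TO_RETRY was removed from RetryMiddleware and added as a property on BackwardsCompatibilityMetaclass. This preserves compatibility only for the first point: a metaclass property is visible on the class and its subclasses, but instance attribute lookup does not consult the metaclass.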
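
The limitation can be reproduced without Scrapy. Below is a minimal sketch using stand-in names (CompatMeta, Middleware, SubMiddleware, and DEPRECATED are hypothetical and not part of Scrapy). It shows that a property defined on a metaclass resolves for the class and its subclasses but not for instances:

```python
class CompatMeta(type):
    # Hypothetical stand-in for BackwardsCompatibilityMetaclass.
    @property
    def DEPRECATED(cls):
        return ("stand-in value",)


class Middleware(metaclass=CompatMeta):
    pass


class SubMiddleware(Middleware):
    pass


# Class-level access works, including through subclasses.
class_value = Middleware.DEPRECATED
subclass_value = SubMiddleware.DEPRECATED

# Instance-level access fails: lookup on an instance walks the MRO of
# its class (Middleware, object) and never consults the metaclass.
try:
    Middleware().DEPRECATED
    instance_error = None
except AttributeError as exc:
    instance_error = exc
```

This is the same failure mode as the AttributeError reported under "Actual behavior".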

Steps to Reproduce

from scrapy.downloadermiddlewares.retry import RetryMiddleware


class MyRetryMiddleware(RetryMiddleware):

    def process_exception(self, request, exception, spider):
        if isinstance(exception, self.EXCEPTIONS_TO_RETRY) and not request.meta.get('dont_retry', False):
            # update request
            return self._retry(request, exception, spider)

Expected behavior

A warning about EXCEPTIONS_TO_RETRY deprecation.

Actual behavior

AttributeError: 'MyRetryMiddleware' object has no attribute 'EXCEPTIONS_TO_RETRY'

Versions

Scrapy       : 2.10.1
lxml         : 4.9.3.0
libxml2      : 2.10.3
cssselect    : 1.2.0
parsel       : 1.8.1
w3lib        : 2.1.2
Twisted      : 22.10.0
Python       : 3.11.4 (tags/v3.11.4:d2340ef, Jun  7 2023, 05:45:37) [MSC v.1934 64 bit (AMD64)]
pyOpenSSL    : 23.2.0 (OpenSSL 3.1.2 1 Aug 2023)
cryptography : 41.0.3
Platform     : Windows-10-10.0.19044-SP0

Prometheus3375 (Author) commented

The fix is actually very tricky. Here is a sample snippet:

DEPRECATED_ATTRIBUTE = 'A'


def __getattr__(self, item):
    if item == DEPRECATED_ATTRIBUTE:
        return 'A does not exist'

    raise AttributeError(f'{self.__class__.__name__!r} object has no attribute {item!r}')


class Meta(type):
    __getattr__ = __getattr__


class Data(metaclass=Meta):
    def __init__(self):
        try:
            self.a = self.__getattribute__(DEPRECATED_ATTRIBUTE)
        except AttributeError:
            self.a = 'a here'

    __getattr__ = __getattr__


class DataDefineA(Data):
    A = 1


class DataNoA(Data): pass


print(Data.A)
print(Data().A)
print(Data().a)

print(DataDefineA.A)
print(DataDefineA().A)
print(DataDefineA().a)

print(DataNoA.A)
print(DataNoA().A)
print(DataNoA().a)

Output:

A does not exist
A does not exist
a here
1
1
1
A does not exist
A does not exist
a here

This solution covers both compatibility cases. In addition, if a subclass or its instance defines attribute A, that value is assigned to attribute a.
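
The snippet relies on a detail that is easy to miss: calling self.__getattribute__(name) directly performs only the normal lookup and does not fall back to __getattr__, whereas getattr(self, name) does. That is why __init__ can tell a real override of A apart from the compatibility fallback. A small sketch with hypothetical class names (WithFallback, Overrides) illustrating the difference:

```python
class WithFallback:
    def __getattr__(self, item):
        # Invoked only after the normal attribute lookup has failed.
        return f"fallback for {item}"


class Overrides(WithFallback):
    A = 1


obj = WithFallback()

# getattr() (like obj.A) triggers the __getattr__ fallback.
via_getattr = getattr(obj, "A")

# A direct __getattribute__ call skips the fallback and raises.
try:
    obj.__getattribute__("A")
    direct_raised = False
except AttributeError:
    direct_raised = True

# When a subclass really defines A, the direct call finds it.
direct_override = Overrides().__getattribute__("A")
```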

Rewritten retry.py:

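import warnings

from scrapy.exceptions import NotConfigured, ScrapyDeprecationWarning
from scrapy.settings import Settings
from scrapy.utils.misc import load_object

DEPRECATED_ATTRIBUTE = 'EXCEPTIONS_TO_RETRY'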


def backwards_compatibility_getattr(self, item):
    if item == DEPRECATED_ATTRIBUTE:
        warnings.warn(
            f"Attribute RetryMiddleware.{DEPRECATED_ATTRIBUTE} is deprecated. "
            "Use the RETRY_EXCEPTIONS setting instead.",
            ScrapyDeprecationWarning,
            stacklevel=2,
            )

        return tuple(
            load_object(x) if isinstance(x, str) else x
            for x in Settings().getlist("RETRY_EXCEPTIONS")
            )

    raise AttributeError(f'{self.__class__.__name__!r} object has no attribute {item!r}')


class BackwardsCompatibilityMetaclass(type):
    __getattr__ = backwards_compatibility_getattr


class RetryMiddleware(metaclass=BackwardsCompatibilityMetaclass):
    def __init__(self, settings):
        if not settings.getbool("RETRY_ENABLED"):
            raise NotConfigured
        self.max_retry_times = settings.getint("RETRY_TIMES")
        self.retry_http_codes = set(
            int(x) for x in settings.getlist("RETRY_HTTP_CODES")
        )
        self.priority_adjust = settings.getint("RETRY_PRIORITY_ADJUST")

        try:
            self.exceptions_to_retry = self.__getattribute__(DEPRECATED_ATTRIBUTE)
        except AttributeError:
            # If EXCEPTIONS_TO_RETRY is not "overridden"
            self.exceptions_to_retry = tuple(
                load_object(x) if isinstance(x, str) else x
                for x in settings.getlist("RETRY_EXCEPTIONS")
                )

    __getattr__ = backwards_compatibility_getattr
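
One way to check the intended behavior is to record warnings while touching the deprecated attribute on both the class and an instance. The sketch below is a Scrapy-free stand-in, not the actual fix: compat_getattr, CompatMeta, Middleware, and the placeholder return value are hypothetical, and the built-in DeprecationWarning stands in for ScrapyDeprecationWarning.

```python
import warnings

DEPRECATED_ATTRIBUTE = "EXCEPTIONS_TO_RETRY"


def compat_getattr(self, item):
    # Shared fallback used by both the metaclass and the class.
    if item == DEPRECATED_ATTRIBUTE:
        warnings.warn(
            f"{DEPRECATED_ATTRIBUTE} is deprecated.",
            DeprecationWarning,
            stacklevel=2,
        )
        return (OSError,)  # placeholder for the configured exceptions
    raise AttributeError(item)


class CompatMeta(type):
    __getattr__ = compat_getattr


class Middleware(metaclass=CompatMeta):
    __getattr__ = compat_getattr


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    class_value = Middleware.EXCEPTIONS_TO_RETRY       # class access
    instance_value = Middleware().EXCEPTIONS_TO_RETRY  # instance access

categories = [w.category for w in caught]
```

If both access paths are covered, each one should resolve to a value and record exactly one deprecation warning.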
