CI is broken in master #422

Closed · curita opened this issue Oct 10, 2023 · 1 comment

curita (Member) commented Oct 10, 2023

Issue

The CI checks are failing in master. This is affecting new PRs (#421).

Reproduce

❯ python --version
Python 3.10.9
❯ python -m venv .venv
❯ source .venv/bin/activate
❯ pip install tox
❯ tox -e base

Traceback

❯ tox -e base
base: install_deps> python -I -m pip install Jinja2 pytest pytest-cov pytest-mock scrapy
.pkg: install_requires> python -I -m pip install 'setuptools>=40.8.0' wheel
.pkg: _optional_hooks> python /Users/julia/src/spidermon/.venv/lib/python3.10/site-packages/pyproject_api/_backend.py True setuptools.build_meta __legacy__
.pkg: get_requires_for_build_sdist> python /Users/julia/src/spidermon/.venv/lib/python3.10/site-packages/pyproject_api/_backend.py True setuptools.build_meta __legacy__
.pkg: get_requires_for_build_wheel> python /Users/julia/src/spidermon/.venv/lib/python3.10/site-packages/pyproject_api/_backend.py True setuptools.build_meta __legacy__
.pkg: install_requires_for_build_wheel> python -I -m pip install wheel
.pkg: prepare_metadata_for_build_wheel> python /Users/julia/src/spidermon/.venv/lib/python3.10/site-packages/pyproject_api/_backend.py True setuptools.build_meta __legacy__
.pkg: build_sdist> python /Users/julia/src/spidermon/.venv/lib/python3.10/site-packages/pyproject_api/_backend.py True setuptools.build_meta __legacy__
base: install_package_deps> python -I -m pip install Jinja2 boto boto3 itemadapter 'jsonschema[format]>=3.2.0' premailer python-slugify requests scrapinghub scrapinghub-entrypoint-scrapy scrapy sentry-sdk slack-sdk
base: install_package> python -I -m pip install --force-reinstall --no-deps /Users/julia/src/spidermon/.tox/.tmp/package/1/spidermon-1.20.0.tar.gz
base: commands[0]> pytest -s --ignore=./tests/contrib --ignore=./tests/utils/test_zyte.py tests
================================================================================== test session starts ===================================================================================
platform darwin -- Python 3.10.9, pytest-7.4.2, pluggy-1.3.0
cachedir: .tox/base/.pytest_cache
Spidermon monitor filtering
rootdir: /Users/julia/src/spidermon
plugins: cov-4.1.0, mock-3.11.1
collected 384 items                                                                                                                                                                      

tests/test_actions.py ......
tests/test_add_field_coverage.py ..........
tests/test_data.py .........
tests/test_descriptions.py ...
tests/test_expressions.py ....
tests/test_extension.py FFFFFF
tests/test_item_scraped_signal.py ...............
tests/test_levels.py .
tests/test_loaders.py ...
tests/test_messagetranslator.py ...
tests/test_names.py ....
tests/test_ordering.py ..
tests/test_spidermon_signal_connect.py ......
tests/test_suites.py ........
tests/test_templateloader.py ...
tests/test_validators_jsonschema.py ................................................................................................................................................................................................................................................................
tests/utils/test_field_coverage.py ..
tests/utils/test_settings.py ......

======================================================================================== FAILURES ========================================================================================
__________________________________________________________________________ test_spider_opened_suites_should_run __________________________________________________________________________

get_crawler = <function get_crawler.<locals>._crawler at 0x1277fe320>, suites = ['tests.fixtures.suites.Suite01']

    def test_spider_opened_suites_should_run(get_crawler, suites):
        """The suites defined at spider_opened_suites should be loaded and run"""
        crawler = get_crawler()
        spidermon = Spidermon(crawler, spider_opened_suites=suites)
        spidermon.spider_opened_suites[0].run = mock.MagicMock()
>       spidermon.spider_opened(crawler.spider)

tests/test_extension.py:18: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
spidermon/contrib/scrapy/extensions.py:120: in spider_opened
    self._run_suites(spider, self.spider_opened_suites)
spidermon/contrib/scrapy/extensions.py:203: in _run_suites
    data = self._generate_data_for_spider(spider)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <spidermon.contrib.scrapy.extensions.Spidermon object at 0x127811270>, spider = <Spider 'dummy' at 0x1247c9030>

    def _generate_data_for_spider(self, spider):
        return {
>           "stats": self.crawler.stats.get_stats(spider),
            "stats_history": spider.stats_history
            if hasattr(spider, "stats_history")
            else [],
            "crawler": self.crawler,
            "spider": spider,
            "job": self.client.job if self.client.available else None,
        }
E       AttributeError: 'NoneType' object has no attribute 'get_stats'

spidermon/contrib/scrapy/extensions.py:210: AttributeError
__________________________________________________________________________ test_spider_closed_suites_should_run __________________________________________________________________________

get_crawler = <function get_crawler.<locals>._crawler at 0x1277fe440>, suites = ['tests.fixtures.suites.Suite01']

    def test_spider_closed_suites_should_run(get_crawler, suites):
        """The suites defined at spider_closed_suites should be loaded and run"""
        crawler = get_crawler()
        spidermon = Spidermon(
            crawler, spider_opened_suites=suites, spider_closed_suites=suites
        )
        spidermon.spider_closed_suites[0].run = mock.MagicMock()
>       spidermon.spider_opened(crawler.spider)

tests/test_extension.py:30: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
spidermon/contrib/scrapy/extensions.py:120: in spider_opened
    self._run_suites(spider, self.spider_opened_suites)
spidermon/contrib/scrapy/extensions.py:203: in _run_suites
    data = self._generate_data_for_spider(spider)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <spidermon.contrib.scrapy.extensions.Spidermon object at 0x12787f3d0>, spider = <Spider 'dummy' at 0x127835390>

    def _generate_data_for_spider(self, spider):
        return {
>           "stats": self.crawler.stats.get_stats(spider),
            "stats_history": spider.stats_history
            if hasattr(spider, "stats_history")
            else [],
            "crawler": self.crawler,
            "spider": spider,
            "job": self.client.job if self.client.available else None,
        }
E       AttributeError: 'NoneType' object has no attribute 'get_stats'

spidermon/contrib/scrapy/extensions.py:210: AttributeError
_________________________________________________________________________ test_engine_stopped_suites_should_run __________________________________________________________________________

get_crawler = <function get_crawler.<locals>._crawler at 0x1277fe950>, suites = ['tests.fixtures.suites.Suite01']

    def test_engine_stopped_suites_should_run(get_crawler, suites):
        """The suites defined at engine_stopped_suites should be loaded and run"""
        crawler = get_crawler()
        spidermon = Spidermon(crawler, engine_stopped_suites=suites)
        spidermon.engine_stopped_suites[0].run = mock.MagicMock()
>       spidermon.engine_stopped()

tests/test_extension.py:41: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
spidermon/contrib/scrapy/extensions.py:136: in engine_stopped
    self._run_suites(spider, self.engine_stopped_suites)
spidermon/contrib/scrapy/extensions.py:203: in _run_suites
    data = self._generate_data_for_spider(spider)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <spidermon.contrib.scrapy.extensions.Spidermon object at 0x12779a590>, spider = <Spider 'dummy' at 0x127798a60>

    def _generate_data_for_spider(self, spider):
        return {
>           "stats": self.crawler.stats.get_stats(spider),
            "stats_history": spider.stats_history
            if hasattr(spider, "stats_history")
            else [],
            "crawler": self.crawler,
            "spider": spider,
            "job": self.client.job if self.client.available else None,
        }
E       AttributeError: 'NoneType' object has no attribute 'get_stats'

spidermon/contrib/scrapy/extensions.py:210: AttributeError
____________________________________________________________________ test_spider_opened_suites_should_run_from_signal ____________________________________________________________________

self = <MagicMock id='4957297904'>, args = (<ANY>,), kwargs = {}, msg = "Expected 'mock' to be called once. Called 0 times."

    def assert_called_once_with(self, /, *args, **kwargs):
        """assert that the mock was called exactly once and that that call was
        with the specified arguments."""
        if not self.call_count == 1:
            msg = ("Expected '%s' to be called once. Called %s times.%s"
                   % (self._mock_name or 'mock',
                      self.call_count,
                      self._calls_repr()))
>           raise AssertionError(msg)
E           AssertionError: Expected 'mock' to be called once. Called 0 times.

../../.pyenv/versions/3.10.9/lib/python3.10/unittest/mock.py:940: AssertionError

During handling of the above exception, another exception occurred:

get_crawler = <function get_crawler.<locals>._crawler at 0x1277fe710>, suites = ['tests.fixtures.suites.Suite01']

    def test_spider_opened_suites_should_run_from_signal(get_crawler, suites):
        """The suites defined at SPIDERMON_SPIDER_OPEN_MONITORS setting should be loaded and run"""
        settings = {"SPIDERMON_SPIDER_OPEN_MONITORS": suites}
        crawler = get_crawler(settings)
        spidermon = Spidermon.from_crawler(crawler)
        spidermon.spider_opened_suites[0].run = mock.MagicMock()
        crawler.signals.send_catch_log(signal=signals.spider_opened, spider=crawler.spider)
>       spidermon.spider_opened_suites[0].run.assert_called_once_with(mock.ANY)
E       AssertionError: Expected 'mock' to be called once. Called 0 times.

tests/test_extension.py:53: AssertionError
----------------------------------------------------------------------------------- Captured log call ------------------------------------------------------------------------------------
ERROR    scrapy.utils.signal:signal.py:59 Error caught on signal handler: <bound method Spidermon.spider_opened of <spidermon.contrib.scrapy.extensions.Spidermon object at 0x1277a4520>>
Traceback (most recent call last):
  File "/Users/julia/src/spidermon/.tox/base/lib/python3.10/site-packages/scrapy/utils/signal.py", line 46, in send_catch_log
    response = robustApply(
  File "/Users/julia/src/spidermon/.tox/base/lib/python3.10/site-packages/pydispatch/robustapply.py", line 55, in robustApply
    return receiver(*arguments, **named)
  File "/Users/julia/src/spidermon/spidermon/contrib/scrapy/extensions.py", line 120, in spider_opened
    self._run_suites(spider, self.spider_opened_suites)
  File "/Users/julia/src/spidermon/spidermon/contrib/scrapy/extensions.py", line 203, in _run_suites
    data = self._generate_data_for_spider(spider)
  File "/Users/julia/src/spidermon/spidermon/contrib/scrapy/extensions.py", line 210, in _generate_data_for_spider
    "stats": self.crawler.stats.get_stats(spider),
AttributeError: 'NoneType' object has no attribute 'get_stats'
____________________________________________________________________ test_spider_closed_suites_should_run_from_signal ____________________________________________________________________

self = <MagicMock id='4958067808'>, args = (<ANY>,), kwargs = {}, msg = "Expected 'mock' to be called once. Called 0 times."

    def assert_called_once_with(self, /, *args, **kwargs):
        """assert that the mock was called exactly once and that that call was
        with the specified arguments."""
        if not self.call_count == 1:
            msg = ("Expected '%s' to be called once. Called %s times.%s"
                   % (self._mock_name or 'mock',
                      self.call_count,
                      self._calls_repr()))
>           raise AssertionError(msg)
E           AssertionError: Expected 'mock' to be called once. Called 0 times.

../../.pyenv/versions/3.10.9/lib/python3.10/unittest/mock.py:940: AssertionError

During handling of the above exception, another exception occurred:

get_crawler = <function get_crawler.<locals>._crawler at 0x1277fe8c0>, suites = ['tests.fixtures.suites.Suite01']

    def test_spider_closed_suites_should_run_from_signal(get_crawler, suites):
        """The suites defined at SPIDERMON_SPIDER_CLOSE_MONITORS setting should be loaded and run"""
        settings = {"SPIDERMON_SPIDER_CLOSE_MONITORS": suites}
        crawler = get_crawler(settings)
        spidermon = Spidermon.from_crawler(crawler)
        spidermon.spider_closed_suites[0].run = mock.MagicMock()
        crawler.signals.send_catch_log(signal=signals.spider_closed, spider=crawler.spider)
>       spidermon.spider_closed_suites[0].run.assert_called_once_with(mock.ANY)
E       AssertionError: Expected 'mock' to be called once. Called 0 times.

tests/test_extension.py:63: AssertionError
----------------------------------------------------------------------------------- Captured log call ------------------------------------------------------------------------------------
ERROR    scrapy.utils.signal:signal.py:59 Error caught on signal handler: <bound method Spidermon.spider_closed of <spidermon.contrib.scrapy.extensions.Spidermon object at 0x127860220>>
Traceback (most recent call last):
  File "/Users/julia/src/spidermon/.tox/base/lib/python3.10/site-packages/scrapy/utils/signal.py", line 46, in send_catch_log
    response = robustApply(
  File "/Users/julia/src/spidermon/.tox/base/lib/python3.10/site-packages/pydispatch/robustapply.py", line 55, in robustApply
    return receiver(*arguments, **named)
  File "/Users/julia/src/spidermon/spidermon/contrib/scrapy/extensions.py", line 128, in spider_closed
    self._add_field_coverage_to_stats()
  File "/Users/julia/src/spidermon/spidermon/contrib/scrapy/extensions.py", line 181, in _add_field_coverage_to_stats
    stats = self.crawler.stats.get_stats()
AttributeError: 'NoneType' object has no attribute 'get_stats'
___________________________________________________________________ test_engine_stopped_suites_should_run_from_signal ____________________________________________________________________

self = <MagicMock id='4957251824'>, args = (<ANY>,), kwargs = {}, msg = "Expected 'mock' to be called once. Called 0 times."

    def assert_called_once_with(self, /, *args, **kwargs):
        """assert that the mock was called exactly once and that that call was
        with the specified arguments."""
        if not self.call_count == 1:
            msg = ("Expected '%s' to be called once. Called %s times.%s"
                   % (self._mock_name or 'mock',
                      self.call_count,
                      self._calls_repr()))
>           raise AssertionError(msg)
E           AssertionError: Expected 'mock' to be called once. Called 0 times.

../../.pyenv/versions/3.10.9/lib/python3.10/unittest/mock.py:940: AssertionError

During handling of the above exception, another exception occurred:

get_crawler = <function get_crawler.<locals>._crawler at 0x1277fe950>, suites = ['tests.fixtures.suites.Suite01']

    def test_engine_stopped_suites_should_run_from_signal(get_crawler, suites):
        """The suites defined at SPIDERMON_ENGINE_STOP_MONITORS setting should be loaded and run"""
        settings = {"SPIDERMON_ENGINE_STOP_MONITORS": suites}
        crawler = get_crawler(settings)
        spidermon = Spidermon.from_crawler(crawler)
        spidermon.engine_stopped_suites[0].run = mock.MagicMock()
        crawler.signals.send_catch_log(signal=signals.engine_stopped, spider=crawler.spider)
>       spidermon.engine_stopped_suites[0].run.assert_called_once_with(mock.ANY)
E       AssertionError: Expected 'mock' to be called once. Called 0 times.

tests/test_extension.py:73: AssertionError
----------------------------------------------------------------------------------- Captured log call ------------------------------------------------------------------------------------
ERROR    scrapy.utils.signal:signal.py:59 Error caught on signal handler: <bound method Spidermon.engine_stopped of <spidermon.contrib.scrapy.extensions.Spidermon object at 0x12779afe0>>
Traceback (most recent call last):
  File "/Users/julia/src/spidermon/.tox/base/lib/python3.10/site-packages/scrapy/utils/signal.py", line 46, in send_catch_log
    response = robustApply(
  File "/Users/julia/src/spidermon/.tox/base/lib/python3.10/site-packages/pydispatch/robustapply.py", line 55, in robustApply
    return receiver(*arguments, **named)
  File "/Users/julia/src/spidermon/spidermon/contrib/scrapy/extensions.py", line 136, in engine_stopped
    self._run_suites(spider, self.engine_stopped_suites)
  File "/Users/julia/src/spidermon/spidermon/contrib/scrapy/extensions.py", line 203, in _run_suites
    data = self._generate_data_for_spider(spider)
  File "/Users/julia/src/spidermon/spidermon/contrib/scrapy/extensions.py", line 210, in _generate_data_for_spider
    "stats": self.crawler.stats.get_stats(spider),
AttributeError: 'NoneType' object has no attribute 'get_stats'
==================================================================================== warnings summary ====================================================================================
spidermon/contrib/pytest/plugins/filter_monitors.py:10
  /Users/julia/src/spidermon/spidermon/contrib/pytest/plugins/filter_monitors.py:10: PytestDeprecationWarning: The hookimpl pytest_collection_modifyitems uses old-style configuration options (marks or attributes).
  Please use the pytest.hookimpl(trylast=True) decorator instead
   to configure the hooks.
   See https://docs.pytest.org/en/latest/deprecations.html#configuring-hook-specs-impls-using-markers
    @pytest.mark.trylast

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================================================================================ short test summary info =================================================================================
FAILED tests/test_extension.py::test_spider_opened_suites_should_run - AttributeError: 'NoneType' object has no attribute 'get_stats'
FAILED tests/test_extension.py::test_spider_closed_suites_should_run - AttributeError: 'NoneType' object has no attribute 'get_stats'
FAILED tests/test_extension.py::test_engine_stopped_suites_should_run - AttributeError: 'NoneType' object has no attribute 'get_stats'
FAILED tests/test_extension.py::test_spider_opened_suites_should_run_from_signal - AssertionError: Expected 'mock' to be called once. Called 0 times.
FAILED tests/test_extension.py::test_spider_closed_suites_should_run_from_signal - AssertionError: Expected 'mock' to be called once. Called 0 times.
FAILED tests/test_extension.py::test_engine_stopped_suites_should_run_from_signal - AssertionError: Expected 'mock' to be called once. Called 0 times.
======================================================================== 6 failed, 341 passed, 1 warning in 1.78s ========================================================================
base: exit 1 (3.90 seconds) /Users/julia/src/spidermon> pytest -s --ignore=./tests/contrib --ignore=./tests/utils/test_zyte.py tests pid=81920
.pkg: _exit> python /Users/julia/src/spidermon/.venv/lib/python3.10/site-packages/pyproject_api/_backend.py True setuptools.build_meta __legacy__
  base: FAIL code 1 (55.27=setup[51.37]+cmd[3.90] seconds)
  evaluation failed :( (55.38 seconds)

Initial Diagnosis

It looks like self.crawler.stats is None, which causes problems further down the line. It seems the crawler we create in the get_crawler() fixture (located in conftest.py) doesn't have a stats collector initialized yet.

This is probably something that changed in recent Scrapy versions, and it's only showing up now because Scrapy isn't pinned in our test dependencies.
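
As a possible direction, here is a sketch only (the actual fixture in conftest.py and the eventual fix may look different): the fixture could give the test crawler a stats collector explicitly when Scrapy hasn't created one yet.

```python
# Hypothetical sketch of a patched get_crawler fixture in tests/conftest.py.
# Names and structure here are assumptions, not the real fixture.
import pytest
from scrapy import Spider
from scrapy.statscollectors import MemoryStatsCollector
from scrapy.utils.test import get_crawler as scrapy_get_crawler


@pytest.fixture
def get_crawler():
    def _crawler(settings_dict=None):
        crawler = scrapy_get_crawler(Spider, settings_dict)
        if crawler.stats is None:
            # Recent Scrapy only builds the stats collector when crawl() runs,
            # so give the test crawler one explicitly.
            crawler.stats = MemoryStatsCollector(crawler)
        crawler.spider = Spider.from_crawler(crawler, name="dummy")
        return crawler

    return _crawler
```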

Gallaecio (Member) commented:

scrapy/scrapy#6038

See specifically how Scrapy’s own get_crawler in scrapy.utils.test changed.
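
For context, a minimal way to observe the behaviour difference (assuming a recent Scrapy release where the crawler defers building its components until crawl() runs):

```python
# Illustration only: on recent Scrapy, crawler.stats stays None until the
# crawler actually starts crawling, while older versions created it eagerly.
from scrapy import Spider
from scrapy.utils.test import get_crawler

crawler = get_crawler(Spider)
print(crawler.stats)  # None on recent Scrapy; a StatsCollector instance on older releases
```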

curita added commits that referenced this issue Nov 27, 2023