Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AttributeError: 'NoneType' object has no attribute 'properties' in OversizeMulticastTransferTest #126

Open
jiridanek opened this issue Mar 4, 2022 · 6 comments
Assignees
Labels
test-bug It's a test bug unless proven otherwise

Comments

@jiridanek
Copy link
Contributor

jiridanek commented Mar 4, 2022

https://github.com/jiridanek/skupper-router/runs/5426622599?check_suite_focus=true#step:9:5364

23: DEBUG    root:system_test.py:1390 2022-03-04 18:41:21.956829 OversizeMulticastTransferTest - multicast/e81: Container run exception: ['Traceback (most recent call last):
  File "/home/runner/work/skupper-router/skupper-router/skupper-router/tests/system_tests_policy_oversize_compound.py", line 631, in run
    Container(self).run()
  File "/opt/hostedtoolcache/Python/3.6.15/x64/lib/python3.6/site-packages/proton/_reactor.py", line 194, in run
    while self.process():
  File "/opt/hostedtoolcache/Python/3.6.15/x64/lib/python3.6/site-packages/proton/_reactor.py", line 257, in process
    event.dispatch(handler)
  File "/opt/hostedtoolcache/Python/3.6.15/x64/lib/python3.6/site-packages/proton/_events.py", line 164, in dispatch
    self.dispatch(h, type)
  File "/opt/hostedtoolcache/Python/3.6.15/x64/lib/python3.6/site-packages/proton/_events.py", line 161, in dispatch
    _dispatch(handler, type.method, self)
  File "/opt/hostedtoolcache/Python/3.6.15/x64/lib/python3.6/site-packages/proton/_events.py", line 128, in _dispatch
    m(*args)
  File "/opt/hostedtoolcache/Python/3.6.15/x64/lib/python3.6/site-packages/proton/_handlers.py", line 753, in on_reactor_init
    self.on_start(event)
  File "/home/runner/work/skupper-router/skupper-router/skupper-router/tests/system_tests_policy_oversize_compound.py", line 423, in on_start
    self.test_class.wait_router_network_connected()
  File "/home/runner/work/skupper-router/skupper-router/skupper-router/tests/system_tests_policy_oversize_compound.py", line 740, in wait_router_network_connected
    cls.INT_A.wait_router_connected(\'INT.B\')
  File "/home/runner/work/skupper-router/skupper-router/skupper-router/tests/system_test.py", line 745, in wait_router_connected
    retry(lambda: self.is_router_connected(router_id), **retry_kwargs)
  File "/home/runner/work/skupper-router/skupper-router/skupper-router/tests/system_test.py", line 140, in retry
    ret = function()
  File "/home/runner/work/skupper-router/skupper-router/skupper-router/tests/system_test.py", line 745, in <lambda>
    retry(lambda: self.is_router_connected(router_id), **retry_kwargs)
  File "/home/runner/work/skupper-router/skupper-router/skupper-router/tests/system_test.py", line 727, in is_router_connected
    self.management.read(identity="router.node/%s" % router_id)
  File "/home/runner/work/skupper-router/skupper-router/skupper-router/python/qpid_dispatch/management/client.py", line 331, in read
    return Entity(self, self.call(request).body)
  File "/home/runner/work/skupper-router/skupper-router/skupper-router/python/qpid_dispatch/management/client.py", line 192, in call
    self.check_response(response, expect=expect)
  File "/home/runner/work/skupper-router/skupper-router/skupper-router/python/qpid_dispatch/management/client.py", line 159, in check_response
    code = response.properties.get(\'statusCode\')
AttributeError: 'NoneType' object has no attribute 'properties'
"]
@jiridanek
Copy link
Contributor Author

Issue happens even when running test single-threaded, https://github.com/jiridanek/skupper-router/runs/5428006739?check_suite_focus=true#step:27:5701

@jiridanek
Copy link
Contributor Author

I've tried investigating the test. There seem to be multiple possible issues that can cause the test to fail, and getting over one exposes another.

The logs quoted in this message are from a PR with extra logging added. E.g. to log full stracktrace of the exception, and such.

AttributeError: 'NoneType' object has no attribute 'properties'

https://github.com/jiridanek/skupper-router/runs/5428035645?check_suite_focus=true#step:9:7800

This gets raised from line 739 doing

@classmethod
def wait_router_network_connected(cls):
cls.INT_A.wait_router_connected('INT.B')
cls.INT_B.wait_router_connected('INT.A')
cls.EA1.wait_connectors()
cls.EB1.wait_connectors()

I do not see how it is possible for SyncRequestResponse#call to return None (it should always either return received Message, or raise a TimeoutException, as far as I can tell), but nevertheless, that is what seems to be happening

def call(self, request, expect=error.OK):
"""
Send a management request message, wait for a response.
@return: Response message.
"""
response = self.client.call(request)
self.check_response(response, expect=expect)
return response

Getting a None might be possible only if the management client gets disconnected after it or the other peer (the router) closed connection, while management client waits for its reply, given this code in Proton

        if self.disconnected and not self._is_closed():
            raise ConnectionException(
                "Connection %s disconnected: %s" % (self.url, self.disconnected))

(from https://github.com/apache/qpid-proton/blob/bf923263c6b7ebf32b0d00796822cb7e282f12cd/python/proton/_utils.py#L532)

That cannot be possible, however, because the previous test has failed (https://github.com/jiridanek/skupper-router/runs/5428035645?check_suite_focus=true#step:9:7749) less than a second prior; there was not enough time for a timeout to occur.

  • I think that this failure is somehow caused by the test that failed previously. But there is not cleanup in the tests that might get skipped due to the raised exception, so if anything, it has to be timing...

There is a timeout-expired message in INT.A.log, lines 8718-8724, that looks like idle timeout from management client!
https://gist.github.com/jiridanek/109918a935e0325ec01bd193ed488984#file-int-a-log

2022-03-04 20:57:27.086512 +0000 ROUTER_CORE (trace) Core action 'link_flow' (../src/router_core/router_core_thread.c:235)
2022-03-04 20:57:27.186954 +0000 PROTOCOL (trace) [C6]:FRAME: 0 -> @close(24) [error=@error(29) [condition=:"amqp:resource-limit-exceeded", description="local-idle-timeout expired"]] (../src/server.c:110)
2022-03-04 20:57:27.187026 +0000 SERVER (info) [C6] Connection from 127.0.0.1:32860 (to 0.0.0.0:20514) failed: amqp:resource-limit-exceeded local-idle-timeout expired (../src/server.c:1046)
2022-03-04 20:57:27.187091 +0000 PROTOCOL (trace) [C6]:FRAME:   -> EOS (../src/server.c:110)
2022-03-04 20:57:27.187128 +0000 CONTAINER (debug) Aborting link '76c34267-770e-4fe2-b922-01acbd802824-$management' due to parent connection end (../src/container.c:325)
2022-03-04 20:57:27.187189 +0000 CONTAINER (debug) Aborting link '76c34267-770e-4fe2-b922-01acbd802824-df0fbdae-648a-4b11-a357-f310eb8df2e2' due to parent connection end (../src/container.c:325)
2022-03-04 20:57:27.187290 +0000 POLICY (debug) [C6] Connection '127.0.0.1:32860' closed with resources n_sessions=1, n_senders=1, n_receivers=1, sessions_denied=0, senders_denied=0, receivers_denied=0, max_message_size_denied:0, nConnections= 3. (../src/policy.c:319)
  • Maybe the machine was too loaded to update idle timeouts? Afaik BlockingConnection maintains a background thread to keep track of heartbeats, so this is not supposed to happen.

multicast/e80: test_80 test error: Timeout Expired net_stable:True sent=10 rcvd=[0, 0, 0, 10] rejected=2 aborted=[0, 2, 0, 0] connection_error:0 send_settled:10

https://github.com/jiridanek/skupper-router/runs/5428035645?check_suite_focus=true#step:9:5382

The test code assumes one reject, or up-to-one reject, depending on the specific test

def _check_done(self):
self.logger.log("check_done: " + self._current())
if self.error is not None:
self.logger.log("TEST FAIL")
self._shut_down_test()
else:
if self.blocked_by_interior:
if self.blocked_by_ingress:
# Blocked by interior and edge. Expect edge connection to go down
# and some of our messaages arrive at edge after it has sent
# AMQP close. Those messages are never settled. TODO: Is that OK?
done = self.n_rejected == 1 and \
self.n_connection_error == 1
else:
# Blocked by interior only. Connection to edge stays up
# and all messages must be accounted for.
all_received = True
for idx in [IDX_INTA, IDX_INTB, IDX_EA1, IDX_EB1]:
if self.expect_receives[idx] > 0:
if not self.n_rcvds[idx] == self.expect_receives[idx]:
all_received = False
done = self.n_rejected <= 1 and \
self.n_send_settled == self.count and \
all_received
else:
# Blocked by edge should never deliver to interior
done = self.n_rejected == 1 and \
self.n_connection_error == 1
if done:
self.logger.log("TEST DONE!!!")
# self.log_unhandled = True # verbose debugging
self._shut_down_test()

I tried adjusting this to tolerate two rejects, because that is what I am often seeing when running tests with -j100. Then, I got

23: DEBUG    root:system_test.py:1439 2022-03-05 11:46:29.951349 OversizeMulticastTransferTest - multicast/e80: FAIL: N closed events in log file did not increment by 1. oBefore: 1, oAfter: 1, iBefore:2, iAfter:4

(from https://github.com/jiridanek/skupper-router/runs/5432676676?check_suite_focus=true#step:9:4004)

@jiridanek jiridanek added the test-bug It's a test bug unless proven otherwise label Mar 5, 2022
@jiridanek
Copy link
Contributor Author

23: DEBUG root:system_test.py:1439 2022-03-05 11:46:29.951349 OversizeMulticastTransferTest - multicast/e80: FAIL: N closed events in log file did not increment by 1. oBefore: 1, oAfter: 1, iBefore:2, iAfter:4

[before:]

2022-03-05 11:46:14.293825 +0000 PROTOCOL (trace) [C1]:FRAME: 0 <- @close(24) [error=@error(29) [condition=:"amqp:connection:forced", description="Message size exceeded"]] (../src/server.c:110)
2022-03-05 11:46:18.765349 +0000 PROTOCOL (trace) [C6]:FRAME: 0 <- @close(24) [error=@error(29) [condition=:"amqp:connection:forced", description="Message size exceeded"]] (../src/server.c:110)
2022-03-05 11:46:22.783473 +0000 PROTOCOL (trace) [C9]:FRAME: 0 -> @close(24) [error=@error(29) [condition=:"amqp:connection:forced", description="Message size exceeded"]] (../src/server.c:110)

[after:]

2022-03-05 11:46:26.544117 +0000 PROTOCOL (trace) [C8]:FRAME: 0 <- @close(24) [error=@error(29) [condition=:"amqp:connection:forced", description="Message size exceeded"]] (../src/server.c:110)
2022-03-05 11:46:28.884387 +0000 PROTOCOL (trace) [C12]:FRAME: 0 <- @close(24) [error=@error(29) [condition=:"amqp:connection:forced", description="Message size exceeded"]] (../src/server.c:110)

The test expected only one such line in the "After:", that is, log lines added between start of test and the final evaluation.

@jiridanek
Copy link
Contributor Author

23: DEBUG root:system_test.py:1439 2022-03-05 13:24:38.692392 OversizeMessageTransferTest - e60: test_60 test error: Timeout Expired check_done: sent=10 rcvd=0 rejected=2 aborted=1 connection_error:0 send_settled:10

https://github.com/jiridanek/skupper-router/runs/5433073293?check_suite_focus=true#step:9:7705

Sometimes, the messages just don't go through.

@jiridanek
Copy link
Contributor Author

jiridanek commented Mar 13, 2022

Duplicated by #146, related to #168

@jiridanek
Copy link
Contributor Author

See also #618.

@ganeshmurthy ganeshmurthy modified the milestones: 2.1.0, 2.2.0 Sep 30, 2022
@ganeshmurthy ganeshmurthy modified the milestones: 2.2.0, 2.3.0 Nov 11, 2022
@ganeshmurthy ganeshmurthy modified the milestones: 2.3.0, 2.4.0 Feb 6, 2023
@ganeshmurthy ganeshmurthy removed this from the 2.4.0 milestone Apr 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
test-bug It's a test bug unless proven otherwise
Projects
None yet
Development

No branches or pull requests

3 participants