Skip to content

implemented backoff for ConnectionResetError and added unittest #169

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Jun 20, 2023

Conversation

Rohan397
Copy link
Contributor

Description of change

https://jira.talendforge.org/browse/TDL-23130

QA steps

  • automated tests passing
  • manual qa steps passing (list below)

Risks

Rollback steps

  • revert this branch

@RushiT0122 RushiT0122 self-requested a review June 12, 2023 09:24
Comment on lines 125 to 131
http.client.IncompleteRead,
max_tries=MAX_RETRIES,
factor=2)
@backoff.on_exception(backoff.expo, # ConnectionResetError raised
ConnectionResetError,
max_tries=MAX_RETRIES,
factor=2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
http.client.IncompleteRead,
max_tries=MAX_RETRIES,
factor=2)
@backoff.on_exception(backoff.expo, # ConnectionResetError raised
ConnectionResetError,
max_tries=MAX_RETRIES,
factor=2)
(http.client.IncompleteRead, ConnectionResetError),
max_tries=MAX_RETRIES,
factor=2)

Comment on lines 32 to 49
@mock.patch("time.sleep")
@mock.patch("pyactiveresource.activeresource.ActiveResource.find")
def test_check_access_handle_connection_reset_error(self, mocked_find, mocked_sleep):
'''
Test retry handling of ConnectionResetError
'''
# mock 'find' and raise IncompleteRead error
mocked_find.side_effect = ConnectionResetError
# initialize class
locations = Transactions()
try:
# function call
locations.replication_object.find()
except ConnectionResetError:
pass

self.assertEqual(stream.replication_object.find.call_count, 5)
# verify we backoff 5 times
self.assertEquals(mocked_find.call_count, 5)
Copy link
Contributor

@RushiT0122 RushiT0122 Jun 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Optional] Here if possible, can we use parameterised unittests?

Comment on lines 42 to 46
try:
# function call
locations.replication_object.find()
except ConnectionResetError:
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use assertRaises() instead of try-catch.

Copy link
Contributor

@RushiT0122 RushiT0122 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestions in-line.


self.assertEqual(stream.replication_object.find.call_count, 5)
# verify we backoff 5 times
self.assertEquals(mocked_find.call_count, 5)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of hard coding the call_count to be 5, we can import MAX_RETRIES from from tap_shopify.streams.base and use it here for assertion

Copy link
Contributor

@RushiT0122 RushiT0122 Jun 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, we should not use MAX_RETRIES from tap_shopify.streams.base in assertion because unittests would not catch any deviation in MAX_RETRIESfrom the expectations.

Copy link
Contributor

@RushiT0122 RushiT0122 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please confirm if make file is not required elsewhere before merging the code. Otherwise rest looks good to me, so approved!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Problem 1: Don't use exec

Parameterizing unit tests is fine, in theory. However I think exec is a tool we should only reach for when we have no other options.

This is not one of those times. In Python, functions are objects too. And objects can be passed around as parameters to functions.

Here's some test code that runs and fails, but illustrates my point:

import unittest
from unittest import mock
from urllib.error import URLError
from tap_shopify.streams.transactions import Transactions
from tap_shopify.streams.orders import Orders
from parameterized import parameterized

class TestShopifyConnectionResetErrorHandling(unittest.TestCase):

    @parameterized.expand([
       (URLError('<urlopen error [Errno 104] Connection reset by peer>'), Orders.replication_object.find, 5),
       (ConnectionResetError(''), Transactions.replication_object.find, 5),
    ])
    @mock.patch("time.sleep")
    @mock.patch("pyactiveresource.activeresource.ActiveResource.find")
    def test_check_access(self, error, func, expected_retries, mocked_find, mocked_sleep):
        '''
        Test retry handling of URLError and ConnectionResetError
        '''
        mocked_find.side_effect = error

        with self.assertRaises(type(error)):
            func()

        # verify we backoff expected number of times
        self.assertEqual(mocked_find.call_count, expected_retries)

And the test output

$ nosetests tests/unittests/test_error_handling.py
FINFO Backing off find(...) for 1.8s (ConnectionResetError)
INFO Backing off find(...) for 0.9s (ConnectionResetError)
INFO Backing off find(...) for 2.9s (ConnectionResetError)
INFO Backing off find(...) for 6.6s (ConnectionResetError)
ERROR Giving up find(...) after 5 tries (ConnectionResetError)
.
======================================================================
FAIL: Test retry handling of URLError and ConnectionResetError [with *args=(URLError('<urlopen error [Errno...ify.resources.order.Order'>>, 5)]
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/share/virtualenvs/tap-shopify/lib/python3.9/site-packages/parameterized/parameterized.py", line 620, in standalone_func
    return func(*(a + p.args), **p.kwargs, **kw)
  File "/home/ubuntu/.pyenv/versions/3.9.6/lib/python3.9/unittest/mock.py", line 1337, in patched
    return func(*newargs, **newkeywargs)
  File "/usr/local/share/virtualenvs/tap-shopify/lib/python3.9/site-packages/parameterized/parameterized.py", line 93, in dummy_func
    return orgfunc(*args, **kwargs)
  File "/home/ubuntu/.pyenv/versions/3.9.6/lib/python3.9/unittest/mock.py", line 1337, in patched
    return func(*newargs, **newkeywargs)
  File "/opt/code/tap-shopify/tests/unittests/test_error_handling.py", line 28, in test_check_access
    self.assertEqual(mocked_find.call_count, expected_retries)
AssertionError: 1 != 5

----------------------------------------------------------------------
Ran 2 tests in 0.005s

This fails because in my example I changed Orders to call Orders.replication_object.find instead of Orders.call_api like in your test.

I encourage you to try to parameterize this test yourself and get it passing for Orders.call_api and Transactions.replication_object.find. And DM me after and we can talk about it.

But when you get that passing, I would still ask for changes to the test because

Problem 2: One unit test should test one function

I understand the desire to find a way to dry up the code remove as much boilerplate code as possible because this file is only concerned with shopify_error_handling and whether it works for both Orders and Transactions.

Parameterizing the test like this though sacrifices readability of the test for what? For a 45% reduction in lines of code?

That's not a trade worth making in my opinion. So this one class TestShopifyConnectionResetErrorHandling should just have like a test_orders_retries_on_URLError and a test_transactions_retries_on_ConnectionResetError.

Also, in this particular instance, the abstraction needed to parameterize the test isn't perfect. call_api is a method and needs the orders object to actually be initialized. But find doesn't need transactions because find is on the ShopifySDK.Transaction object, which is already initialized and attached to the tap_shopify.Transaction object (or find is a static method, which highlights my point even more).

So going back to this code

import unittest
from unittest import mock
from urllib.error import URLError
from tap_shopify.streams.transactions import Transactions
from tap_shopify.streams.orders import Orders
from parameterized import parameterized

class TestShopifyConnectionResetErrorHandling(unittest.TestCase):

    @parameterized.expand([
       (URLError('<urlopen error [Errno 104] Connection reset by peer>'), Orders.call_api, 5),
       (ConnectionResetError(''), Transactions.replication_object.find, 5),
    ])
    @mock.patch("time.sleep")
    @mock.patch("pyactiveresource.activeresource.ActiveResource.find")
    def test_check_access(self, error, func, expected_retries, mocked_find, mocked_sleep):
        '''
        Test retry handling of URLError and ConnectionResetError
        '''
        mocked_find.side_effect = error

        with self.assertRaises(type(error)):
            func()

        # verify we backoff expected number of times
        self.assertEqual(mocked_find.call_count, expected_retries)

And running it

$ nosetests tests/unittests/test_error_handling.py
EINFO Backing off find(...) for 0.8s (ConnectionResetError)
INFO Backing off find(...) for 0.1s (ConnectionResetError)
INFO Backing off find(...) for 7.8s (ConnectionResetError)
INFO Backing off find(...) for 7.6s (ConnectionResetError)
ERROR Giving up find(...) after 5 tries (ConnectionResetError)
.
======================================================================
ERROR: Test retry handling of URLError and ConnectionResetError [with *args=(URLError('<urlopen error [Errno....call_api at 0x7f610b4533a0>, 5)]
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/share/virtualenvs/tap-shopify/lib/python3.9/site-packages/parameterized/parameterized.py", line 620, in standalone_func
    return func(*(a + p.args), **p.kwargs, **kw)
  File "/home/ubuntu/.pyenv/versions/3.9.6/lib/python3.9/unittest/mock.py", line 1337, in patched
    return func(*newargs, **newkeywargs)
  File "/usr/local/share/virtualenvs/tap-shopify/lib/python3.9/site-packages/parameterized/parameterized.py", line 93, in dummy_func
    return orgfunc(*args, **kwargs)
  File "/home/ubuntu/.pyenv/versions/3.9.6/lib/python3.9/unittest/mock.py", line 1337, in patched
    return func(*newargs, **newkeywargs)
  File "/opt/code/tap-shopify/tests/unittests/test_error_handling.py", line 23, in test_check_access
    func()
  File "/usr/local/share/virtualenvs/tap-shopify/lib/python3.9/site-packages/backoff/_sync.py", line 94, in retry
    ret = target(*args, **kwargs)
  File "/usr/local/share/virtualenvs/tap-shopify/lib/python3.9/site-packages/backoff/_sync.py", line 94, in retry
    ret = target(*args, **kwargs)
  File "/usr/local/share/virtualenvs/tap-shopify/lib/python3.9/site-packages/backoff/_sync.py", line 94, in retry
    ret = target(*args, **kwargs)
  [Previous line repeated 1 more time]
  File "/opt/code/tap-shopify/tap_shopify/streams/base.py", line 149, in wrapper
    return fnc(*args, **kwargs)
TypeError: call_api() missing 2 required positional arguments: 'self' and 'query_params'

----------------------------------------------------------------------
Ran 2 tests in 0.005s

FAILED (errors=1)

You can see that Transactions passes without your initialize the object code stream = Context.stream_objects[stream_name]().

So forcing two things to be similar when they're not is another code smell to me

Copy link
Contributor

@luandy64 luandy64 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR looks good. Just delete that extra code and I will approve

mocked_find.side_effect = ConnectionResetError

# initialize class
locations = Transactions()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason this is named locations?

@Rohan397 Rohan397 merged commit 539564f into master Jun 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants