Skip to content

retry on zk metadata sync issue (PR 6826)#7280

Merged
MeredithAnya merged 2 commits into
masterfrom
meredith/copy-6826
Jul 17, 2025
Merged

retry on zk metadata sync issue (PR 6826)#7280
MeredithAnya merged 2 commits into
masterfrom
meredith/copy-6826

Conversation

@MeredithAnya

Copy link
Copy Markdown
Member

Re-creation of #6826

@MeredithAnya MeredithAnya requested a review from a team as a code owner July 16, 2025 18:15
@codecov

codecov Bot commented Jul 16, 2025

Copy link
Copy Markdown

❌ 1 Tests Failed:

Tests completed Failed Passed Skipped
732 1 731 7
View the top 1 failed test(s) by shortest run time
tests.migrations.test_operations::test_drop_table
Stack Traces | 0.005s run time
Traceback (most recent call last):
  File ".../local/lib/python3.11........./site-packages/_pytest/runner.py", line 341, in from_call
    result: TResult | None = func()
                             ^^^^^^
  File ".../local/lib/python3.11........./site-packages/_pytest/runner.py", line 242, in <lambda>
    lambda: runtest_hook(item=item, **kwds), when=when, reraise=reraise
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../local/lib/python3.11....../site-packages/pluggy/_hooks.py", line 513, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../local/lib/python3.11....../site-packages/pluggy/_manager.py", line 120, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../local/lib/python3.11.........................../site-packages/pluggy/_callers.py", line 182, in _multicall
    return outcome.get_result()
           ^^^^^^^^^^^^^^^^^^^^
  File ".../local/lib/python3.11.../site-packages/pluggy/_result.py", line 100, in get_result
    raise exc.with_traceback(exc.__traceback__)
  File ".../local/lib/python3.11.........................../site-packages/pluggy/_callers.py", line 167, in _multicall
    teardown.throw(outcome._exception)
  File ".../local/lib/python3.11....../site-packages/_pytest/threadexception.py", line 92, in pytest_runtest_call
    yield from thread_exception_runtest_hook()
  File ".../local/lib/python3.11....../site-packages/_pytest/threadexception.py", line 68, in thread_exception_runtest_hook
    yield
  File ".../local/lib/python3.11.........................../site-packages/pluggy/_callers.py", line 167, in _multicall
    teardown.throw(outcome._exception)
  File ".../local/lib/python3.11....../site-packages/_pytest/unraisableexception.py", line 95, in pytest_runtest_call
    yield from unraisable_exception_runtest_hook()
  File ".../local/lib/python3.11....../site-packages/_pytest/unraisableexception.py", line 70, in unraisable_exception_runtest_hook
    yield
  File ".../local/lib/python3.11.........................../site-packages/pluggy/_callers.py", line 167, in _multicall
    teardown.throw(outcome._exception)
  File ".../local/lib/python3.11....../site-packages/_pytest/logging.py", line 846, in pytest_runtest_call
    yield from self._runtest_for(item, "call")
  File ".../local/lib/python3.11....../site-packages/_pytest/logging.py", line 829, in _runtest_for
    yield
  File ".../local/lib/python3.11.........................../site-packages/pluggy/_callers.py", line 167, in _multicall
    teardown.throw(outcome._exception)
  File ".../local/lib/python3.11.../site-packages/_pytest/capture.py", line 880, in pytest_runtest_call
    return (yield)
            ^^^^^
  File ".../local/lib/python3.11.........................../site-packages/pluggy/_callers.py", line 167, in _multicall
    teardown.throw(outcome._exception)
  File ".../local/lib/python3.11.../site-packages/_pytest/skipping.py", line 257, in pytest_runtest_call
    return (yield)
            ^^^^^
  File ".../local/lib/python3.11.........................../site-packages/pluggy/_callers.py", line 103, in _multicall
    res = hook_impl.function(*args)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../local/lib/python3.11........./site-packages/_pytest/runner.py", line 174, in pytest_runtest_call
    item.runtest()
  File ".../local/lib/python3.11....../site-packages/_pytest/python.py", line 1627, in runtest
    self.ihook.pytest_pyfunc_call(pyfuncitem=self)
  File ".../local/lib/python3.11....../site-packages/pluggy/_hooks.py", line 513, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../local/lib/python3.11....../site-packages/pluggy/_manager.py", line 120, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../local/lib/python3.11.........................../site-packages/pluggy/_callers.py", line 139, in _multicall
    raise exception.with_traceback(exception.__traceback__)
  File ".../local/lib/python3.11.........................../site-packages/pluggy/_callers.py", line 103, in _multicall
    res = hook_impl.function(*args)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../local/lib/python3.11....../site-packages/_pytest/python.py", line 159, in pytest_pyfunc_call
    result = testfunction(**testargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../tests/migrations/test_operations.py", line 117, in test_drop_table
    assert (
AssertionError: assert 'DROP TABLE IF EXISTS test_table SYNC;' == 'DROP TABLE IF EXISTS test_table;'
  
  - DROP TABLE IF EXISTS test_table;
  + DROP TABLE IF EXISTS test_table SYNC;
  ?                                +++++

To view more test analytics, go to the Test Analytics Dashboard
📋 Got 3 mins? Take this short survey to help us improve Test Analytics.

@kylemumma kylemumma left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems fine to me, but it does change the behavior of all migrations

def execute(self) -> None:
for i in range(30, -1, -1): # wait at most ~30 seconds
try:
super().execute() # type: ignore

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do you call super.execute when this class doesnt inheret?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The specific operation classes (e.g. class AddColumn(RetryOnSyncError, SqlOperation)) inherit from both RetryOnSyncError and SqlOperation now. And super uses the mro (method resolution order) to figure out what to method to call

print(AddColumn.mro())

[<class 'snuba.migrations.operations.AddColumn'>, <class 'snuba.migrations.operations.RetryOnSyncError'>, <class 'snuba.migrations.operations.SqlOperation'>, <class 'abc.ABC'>, <class 'object'>]

The order is determined by how the class is defined

class AddColumn(RetryOnSyncError, SqlOperation)

@MeredithAnya

Copy link
Copy Markdown
Member Author

seems fine to me, but it does change the behavior of all migrations

The behavior should be quite similar except in the very specific error code that is added in the RetryOnError class, and this specific error should be retryable. Currently folks have to manual intervene when running migrations, so I think it's a reasonable change.

@pyhp2017

Copy link
Copy Markdown

seems fine to me, but it does change the behavior of all migrations

The behavior should be quite similar except in the very specific error code that is added in the RetryOnError class, and this specific error should be retryable. Currently folks have to manual intervene when running migrations, so I think it's a reasonable change.

It was incredibly helpful for us at Kubit Cloud when managing multiple instances (both primary and replica) in Kubernetes, especially when dealing with large ClickHouse sizes and heavy loads. We often had to manually fix migrations, even on fresh installs!

@MeredithAnya MeredithAnya merged commit d5babc7 into master Jul 17, 2025
35 checks passed
@MeredithAnya MeredithAnya deleted the meredith/copy-6826 branch July 17, 2025 18:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants