Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert "Maintain dc/rack by topology" ++ #12065

Closed
wants to merge 5 commits into from

Conversation

xemul
Copy link
Contributor

@xemul xemul commented Nov 23, 2022

This set reverts the large recent change that moved topology (dc and rack) information onto token_metadata::topology and some more changes made on top of it. The motivation is that change revealed some nasty hidden land-mines not all of which had been fixed after it.

@asias , it turned out that #11936 stood on the way. It turned out to be simpler to reverted first, then reverted this revert fixing few trivial conflicts afterwards. Please, check

@bhalevy , I had to revert #12001 , and getting it back was not trivial, sorry :(

@kbr-, I tried to keep #11998 (all the more so it didn't cause direct conflicts) but not 100% sure it stays correct, please check

https://jenkins.scylladb.com/job/releng/job/Scylla-CI/3138/ is on its way

… check' from Asias He"

This reverts commit a44ca06, reversing
changes made to 2779a17.
…levy"

This reverts commit 2add9ba, reversing
changes made to 7ead1a7.
This reverts commit 6ce659b, reversing
changes made to dd0b571.
…eck' from Asias He

This reverts commit 6878de0 that
reverted this functionality few patches earlies effectively bringing the
a44ca06 back
@kbr-
Copy link
Contributor

kbr- commented Nov 23, 2022

But... why?

@kbr-, I tried to keep #11998 (all the more so it didn't cause direct conflicts) but not 100% sure it stays correct, please check

It seems that #11998 is kept, but I don't know about #11942:

-        && _token_metadata->get()->get_topology().has_endpoint(addr, locator::topology::pending::yes);
+        && _token_metadata->get()->get_topology().has_endpoint(addr);

but I see that after these reverts, there's no such thing as _pending_locations.
Are pending endpoints now kept in _current_locations?

@xemul
Copy link
Contributor Author

xemul commented Nov 23, 2022

Are pending endpoints now kept in _current_locations?

That's the problem -- pending endpoints are not kept on topology

@xemul
Copy link
Contributor Author

xemul commented Nov 23, 2022

@benipeled , need your help here. The CI fails on sanity-tests stage without staring any test

@benipeled
Copy link
Contributor

@benipeled , need your help here. The CI fails on sanity-tests stage without staring any test

Caused by https://github.com/scylladb/scylla-pkg/pull/3159 - I dequeued it - you can retrigger the CI

@xemul
Copy link
Contributor Author

xemul commented Nov 24, 2022

@xemul
Copy link
Contributor Author

xemul commented Nov 24, 2022

☝️ fails:

[2022-11-24T15:15:22.398Z] [2580/2581] topology  debug  [ FAIL ] test_topology.2571 469.52s

@xemul
Copy link
Contributor Author

xemul commented Nov 24, 2022

The test caught bug #11780

@avikivity
Copy link
Member

The test caught bug #11780

Do you mean, we had a unit test that caught the bug but we ignored it due to "flakiness"?

@xemul
Copy link
Contributor Author

xemul commented Nov 24, 2022

Do you mean, we had a unit test that caught the bug but we ignored it due to "flakiness"?

First the test was disabled. Next messaging service was taught (1bd2471) to keep connections to endpoints even if they were joining/leaving and this test was enabled.

After the revert topology doesn't maintain locations for joining/leaving endpoints, respectively the messaging service drops those connections at unwanted moments. It probably can be fixed by reverting the fix that made messaging-service topology-aware (a396c27 and 13ace7a at least)

@kbr-
Copy link
Contributor

kbr- commented Nov 24, 2022

Do you mean, we had a unit test that caught the bug but we ignored it due to "flakiness"?

If I understand correctly, the sequence of events was as follows:

@xemul
Copy link
Contributor Author

xemul commented Nov 24, 2022

but apparently the injections work. (Or we're just (un)lucky :P)

... or both 😛

@bhalevy
Copy link
Member

bhalevy commented Jan 1, 2023

Can this PR be closed?

@bhalevy
Copy link
Member

bhalevy commented Feb 16, 2023

@xemul can this PR draft be closed?

@xemul
Copy link
Contributor Author

xemul commented Feb 17, 2023

@xemul can this PR draft be closed?

Definitely

@xemul xemul closed this Feb 17, 2023
@xemul xemul deleted the br-revert-topo-on-topo branch April 5, 2024 16:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants