Restart blocked mysql routers #565

Merged

Conversation

@thedac thedac commented Apr 22, 2021

LP Bug #1918953 [0] was resolved on the mysql-innodb-cluster side using coordinated delayed action. Since then we have seen a similar issue in CI [1] after the pause and resume test. Mysql-router hangs with:
2021-04-21 20:42:05 metadata_cache WARNING [7f3f968d5700] Instance '192.168.254.18:3306' [72b4ac2c-a2dd-11eb-82a5-fa163e5a4b7e] of replicaset 'default' is unreachable. Increasing metadata cache refresh frequency.

This cannot be fixed from the cluster side. I will be looking into solutions on the mysql-router side. In the meantime, to unblock the mysql-innodb-cluster gate, this change restarts blocked MySQL routers.

[0] https://bugs.launchpad.net/charm-mysql-router/+bug/1918953
[1] https://openstack-ci-reports.ubuntu.com/artifacts/test_charm_pipeline_func_full/openstack/charm-mysql-innodb-cluster/786514/3/8479/consoleText.test_charm_func_full_11494.txt
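As an illustration of the workaround described above, here is a minimal sketch of "restart whatever is blocked". It is not the code merged in this pull request: the function names, the `unit_states` mapping, and the `restart` callback are all hypothetical, and a real test would obtain unit states from Juju and restart the service on the unit.

```python
from typing import Callable, Dict, List

# Assumption: workload states are plain strings such as "active" or "blocked".
BLOCKED = "blocked"


def find_blocked_units(unit_states: Dict[str, str]) -> List[str]:
    """Return the names of units whose workload state is blocked, sorted."""
    return sorted(
        unit for unit, state in unit_states.items() if state == BLOCKED
    )


def restart_blocked_routers(
    unit_states: Dict[str, str],
    restart: Callable[[str], None],
) -> List[str]:
    """Call restart(unit) for each blocked unit and return those units.

    `restart` is a hypothetical callback; in a real test it would restart
    the mysql-router service on the named unit.
    """
    blocked = find_blocked_units(unit_states)
    for unit in blocked:
        restart(unit)
    return blocked


if __name__ == "__main__":
    # Hypothetical unit names, used only for demonstration.
    states = {
        "keystone-mysql-router/0": "blocked",
        "nova-mysql-router/0": "active",
    }
    restarted = restart_blocked_routers(states, lambda unit: None)
    print(restarted)
```

Passing the restart action in as a callback keeps the selection logic testable without a deployed model.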

thedac commented Apr 22, 2021

@ajkavanagh ajkavanagh left a comment

LGTM

@ajkavanagh ajkavanagh merged commit 5d533fb into openstack-charmers:master Apr 23, 2021

lourot commented Apr 26, 2021

@thedac should this be backported to stable/21.04?

thedac added a commit to thedac/zaza-openstack-tests that referenced this pull request Apr 27, 2021
openstack-mirroring pushed a commit to openstack/charm-mysql-innodb-cluster that referenced this pull request May 5, 2021
Previously when an instance was removed the leadership settings and
charms.reactive flags remained for that instance's IP address. If a new
instance was subsequently added and happened to have the same IP address
the charm would never add the new instance to the cluster because it
believed the instance was already configured and clustered based on
leader settings.

Clear leader settings flags for instance cluster configured and
clustered.

Due to a bug in Juju, the previous flags, which used IP addresses
containing '.', could not be unset. Transform dotted flags to use '-'
instead.

func-test-pr: openstack-charmers/zaza-openstack-tests#565

Change-Id: If3ffa9e9191c057ac7e3d96bfcf84d8a3a2ad45a
Closes-Bug: #1922394
Related-Bug: #1889792
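The dotted-to-dashed transformation described in this commit message can be sketched as follows. This is only an illustration under stated assumptions: the function name and the flag prefix are hypothetical and are not taken from the charm's source.

```python
def flag_for_instance(prefix: str, address: str) -> str:
    """Build a per-instance flag name that contains no '.' characters.

    Both `prefix` and the resulting flag layout are hypothetical; the
    point is only that dots in the IP address are replaced with dashes.
    """
    return "{}-{}".format(prefix, address.replace(".", "-"))


if __name__ == "__main__":
    # Hypothetical prefix; the address is the one from the log line in
    # the pull request description.
    print(flag_for_instance("cluster-instance-clustered", "192.168.254.18"))
```

Deriving the flag name in one helper means setting and clearing the flag always agree on the key, which matters when an address is later reused by a new instance.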
openstack-mirroring pushed a commit to openstack/openstack that referenced this pull request May 5, 2021
* Update charm-mysql-innodb-cluster from branch 'master'
  to f22ca3b5b4dde7f92edb3a9b1e17835555590d1a
  - Remove instance flags when instance removed
    
    Previously when an instance was removed the leadership settings and
    charms.reactive flags remained for that instance's IP address. If a new
    instance was subsequently added and happened to have the same IP address
    the charm would never add the new instance to the cluster because it
    believed the instance was already configured and clustered based on
    leader settings.
    
    Clear leader settings flags for instance cluster configured and
    clustered.
    
    Due to a bug in Juju, the previous flags, which used IP addresses
    containing '.', could not be unset. Transform dotted flags to use '-'
    instead.
    
    func-test-pr: openstack-charmers/zaza-openstack-tests#565
    
    Change-Id: If3ffa9e9191c057ac7e3d96bfcf84d8a3a2ad45a
    Closes-Bug: #1922394
    Related-Bug: #1889792
openstack-mirroring pushed a commit to openstack/openstack that referenced this pull request May 5, 2021
* Update charm-mysql-innodb-cluster from branch 'master'
  to 8c9920ec6a8a7b6dbc5679c5bef90a9966defc1d
  - Do not fail on Cloned recoveryMethod
    
    When the recoveryMethod clone actually needs to overwrite the remote
    node the mysql-shell unfortunately returns with returncode 1. Both
    "Clone process has finished" and "Group Replication is running"
    actually indicate successful states.
    
    Handle these two edge cases as successful.
    
    func-test-pr: openstack-charmers/zaza-openstack-tests#565
    
    Closes-Bug: #1912688
    Change-Id: Ia0e99feee76f403ba5ed6e631bd0671c017c9c2c
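The handling described in this commit message can be sketched roughly as below. This is not the charm's implementation: the function name and signature are hypothetical, and only the two marker strings come from the commit message itself.

```python
# The two marker strings quoted in the commit message above.
SUCCESS_MARKERS = (
    "Clone process has finished",
    "Group Replication is running",
)


def add_instance_succeeded(returncode: int, output: str) -> bool:
    """Decide whether a mysql-shell invocation should count as a success.

    A zero return code is a success. A non-zero return code is still
    treated as a success when the output contains one of the markers.
    The name and signature here are hypothetical.
    """
    if returncode == 0:
        return True
    return any(marker in output for marker in SUCCESS_MARKERS)


if __name__ == "__main__":
    print(add_instance_succeeded(1, "... Clone process has finished ..."))
    print(add_instance_succeeded(1, "some unrelated failure"))
```

Matching on explicit markers keeps every other non-zero exit a failure, so genuine errors are not masked.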
openstack-mirroring pushed a commit to openstack/charm-mysql-innodb-cluster that referenced this pull request May 5, 2021
When the recoveryMethod clone actually needs to overwrite the remote
node the mysql-shell unfortunately returns with returncode 1. Both
"Clone process has finished" and "Group Replication is running"
actually indicate successful states.

Handle these two edge cases as successful.

func-test-pr: openstack-charmers/zaza-openstack-tests#565

Closes-Bug: #1912688
Change-Id: Ia0e99feee76f403ba5ed6e631bd0671c017c9c2c
openstack-mirroring pushed a commit to openstack/charm-mysql-innodb-cluster that referenced this pull request May 20, 2021
When the recoveryMethod clone actually needs to overwrite the remote
node the mysql-shell unfortunately returns with returncode 1. Both
"Clone process has finished" and "Group Replication is running"
actually indicate successful states.

Handle these two edge cases as successful.

func-test-pr: openstack-charmers/zaza-openstack-tests#565

Closes-Bug: #1912688
Change-Id: Ia0e99feee76f403ba5ed6e631bd0671c017c9c2c
(cherry picked from commit 8c9920e)