DPE-3547 mitigations for container restart #377

Merged
merged 6 commits into from Feb 15, 2024

paulomach
Contributor

Issue

Workload containers get restarted due to a timeout on the Pebble livenessProbe endpoint.
Discussion at juju lp bug: https://bugs.launchpad.net/bugs/2052517

Solution

Here are some mitigations and optimizations; this is not a final solution.

  • set a floor of 100 for max_connections
  • function retries
  • flush logs in single call
    • test coverage
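
As an illustration of the first two items, here is a minimal sketch, not the code from this PR. The names `floor_max_connections` and `retry` are hypothetical, and the retry helper is a plain standard-library stand-in for whatever retry mechanism the charm actually uses.

```python
import functools
import time

# Floor value taken from the PR description; the constant name is hypothetical.
MAX_CONNECTIONS_FLOOR = 100


def floor_max_connections(computed: int, floor: int = MAX_CONNECTIONS_FLOOR) -> int:
    """Return the computed max_connections, but never less than the floor."""
    return max(computed, floor)


def retry(attempts: int = 3, delay: float = 0.0, exceptions=(Exception,)):
    """Hypothetical decorator: re-run a function until it succeeds or attempts run out."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == attempts:
                        raise  # out of attempts: surface the last error
                    time.sleep(delay)  # wait before the next attempt

        return wrapper

    return decorator
```

With these assumptions, a value computed from available memory that comes out below 100 is raised to 100, and a transient failure in a wrapped call is retried instead of failing the hook on the first error.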

* set floor for max_connections in 100
* function retries
* flush logs in single call
* + test coverage
dpe-3547-mitigations-for-container-kills

# Conflicts:
#	tests/unit/test_mysql_k8s_helpers.py
Contributor

@carlcsaposs-canonical carlcsaposs-canonical left a comment

see vm review comments canonical/mysql-operator#398 (review)

Comment on lines +758 to +762
content = self.container.list_files(MYSQL_DATA_DIR)
content_set = {item.name for item in content}
logger.debug("Resetting MySQL data directory.")
for item in content_set:
    self.container.remove_path(f"{MYSQL_DATA_DIR}/{item}", recursive=True)
Contributor

Contributor Author

I've observed occasional failures to remove the parent directory because some process was accessing it, but I could not determine which process. Removing the directory's contents instead did not trigger the issue.
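
A local-filesystem analog of that approach, as a minimal sketch: the snippet under review uses the container's `list_files` and `remove_path` calls, while this version uses `pathlib` and `shutil` so it can run anywhere. The function name `reset_directory_contents` is hypothetical.

```python
import shutil
from pathlib import Path


def reset_directory_contents(directory: Path) -> None:
    """Delete everything inside `directory` while leaving the directory itself in place."""
    for entry in directory.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # remove a real subdirectory recursively
        else:
            entry.unlink()  # remove a regular file or a symlink
```

Because the parent directory is never removed, a process that still holds it open should not block the reset, which matches the behavior described above.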

lib/charms/mysql/v0/mysql.py (outdated, resolved)
src/charm.py (resolved)
@@ -93,7 +93,7 @@ def test_on_leader_elected_secrets(self):
secret_data = self.harness.model.get_secret(label="mysql-k8s.app").get_content()

# Test passwords in content and length
required_passwords = ["root-password", "server-config-password", "cluster-admin-password"]
Contributor

Nothing about this in the PR description. Q: Why?

Contributor Author

Forgot to pop from local stash; fixed in ea1e96d

@paulomach paulomach merged commit e5b0f94 into main Feb 15, 2024
32 checks passed
@paulomach paulomach deleted the fix/dpe-3547-mitigations-for-container-kills branch February 15, 2024 13:40
3 participants