Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automates BZ OCS podDisruptionBudget prevents successful OCP upgrades #3888

Merged
merged 4 commits into from Sep 2, 2021

Conversation

Shrivaibavi
Copy link
Contributor

Automates BZ https://bugzilla.redhat.com/show_bug.cgi?id=1861104

Signed-off-by: Shrivaibavi Raghaventhiran sraghave@redhat.com

@Shrivaibavi Shrivaibavi requested a review from a team as a code owner February 22, 2021 17:58
@pull-request-size pull-request-size bot added the size/M PR that changes 30-99 lines label Feb 22, 2021
@Shrivaibavi Shrivaibavi added Needs Testing Run tests and provide logs link Customer defects Defects automated aspart of GSS closed loop team/e2e E2E team related issues/PRs upgrade labels Feb 22, 2021
@pull-request-size pull-request-size bot added size/L PR that changes 100-499 lines and removed size/M PR that changes 30-99 lines labels May 5, 2021
ocs_ci/ocs/cluster.py Outdated Show resolved Hide resolved
ocs_ci/ocs/cluster.py Show resolved Hide resolved
tests/ecosystem/upgrade/test_upgrade.py Outdated Show resolved Hide resolved
ocs_ci/ocs/cluster.py Outdated Show resolved Hide resolved
@Shrivaibavi Shrivaibavi changed the title [WIP] Automates BZ OCS podDisruptionBudget prevents successful OCP upgrades Automates BZ OCS podDisruptionBudget prevents successful OCP upgrades May 5, 2021
ocs_ci/ocs/cluster.py Outdated Show resolved Hide resolved
ocs_ci/ocs/constants.py Outdated Show resolved Hide resolved
tests/ecosystem/upgrade/test_upgrade.py Outdated Show resolved Hide resolved
tests/ecosystem/upgrade/test_upgrade.py Outdated Show resolved Hide resolved
ocs_ci/ocs/cluster.py Outdated Show resolved Hide resolved
@Shrivaibavi Shrivaibavi requested review from a team as code owners May 26, 2021 17:17
@Shrivaibavi Shrivaibavi force-pushed the new_pdb_design branch 2 times, most recently from 91fcdad to 449dbac Compare June 3, 2021 14:44
ocs_ci/ocs/cluster.py Outdated Show resolved Hide resolved
ocs_ci/ocs/cluster.py Outdated Show resolved Hide resolved
ocs_ci/ocs/cluster.py Outdated Show resolved Hide resolved
ocs_ci/ocs/cluster.py Show resolved Hide resolved
ocs_ci/ocs/cluster.py Outdated Show resolved Hide resolved
ocs_ci/ocs/cluster.py Outdated Show resolved Hide resolved
ocs_ci/ocs/cluster.py Outdated Show resolved Hide resolved
@Shrivaibavi Shrivaibavi force-pushed the new_pdb_design branch 2 times, most recently from 34ddf4c to 8cdf37a Compare August 10, 2021 15:08
ocs_ci/ocs/cluster.py Outdated Show resolved Hide resolved
ocs_ci/ocs/cluster.py Outdated Show resolved Hide resolved
ocs_ci/ocs/cluster.py Outdated Show resolved Hide resolved

log = logging.getLogger(__name__)


@post_ocs_upgrade
@skipif_external_mode
@pytest.mark.polarion_id("OCS-2449")
class TestToCheckMonPDBPostUpgrade(ManageTest):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now the test file "test_check_mon_pdb_post_upgrade.py" consists of both mon and osd pdb test functions. Please also change the test file name accordingly or make it generic something like test_pdb_check_post_upgrade.py
Same for the class function also.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DOne

2. Rook-ceph-drain-canary pods disappeared after upgrade of OCS version to ocs-operator.v4.6.2-233.ci

"""
assert (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also consider checking values for disruptions_allowed, min_available_osd, max_unavailable_osd, the way how we are checking for test_check_mon_pdb_post_upgrade?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added validations in validate_existence_of_blocking_pdb(), Please check

Copy link

@ocs-ci ocs-ci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR validation on existing cluster

Cluster Name: tdesala-a11
Cluster Configuration:
PR Test Suite: tier4b
PR Test Path: tests/manage/z_cluster/nodes/test_nodes_maintenance.py::TestNodesMaintenance::test_simultaneous_node_drains
Additional Test Params:
OCP VERSION: 4.8
OCS VERSION: 4.8

Job FAILED (installation failed, tests not executed).

Copy link

@ocs-ci ocs-ci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR validation on existing cluster

Cluster Name: tdesala-a11
Cluster Configuration:
PR Test Suite: tier4b
PR Test Path: tests/manage/z_cluster/nodes/test_nodes_maintenance.py::TestNodesMaintenance::test_simultaneous_node_drains
Additional Test Params:
OCP VERSION: 4.8
OCS VERSION: 4.8

Job FAILED (installation failed, tests not executed).

OdedViner
OdedViner previously approved these changes Aug 17, 2021
Copy link
Contributor

@OdedViner OdedViner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link

@ocs-ci ocs-ci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR validation on existing cluster

Cluster Name: tdesala-a11
Cluster Configuration:
PR Test Suite: tier4b
PR Test Path: tests/manage/z_cluster/nodes/test_nodes_maintenance.py::TestNodesMaintenance::test_pdb_check_simultaneous_node_drains
Additional Test Params:
OCP VERSION: 4.8
OCS VERSION: 4.8

Job FAILED (installation failed, tests not executed).

Copy link

@ocs-ci ocs-ci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR validation on existing cluster

Cluster Name: tdesala-a11
Cluster Configuration:
PR Test Suite:
PR Test Path: tests/manage/z_cluster/nodes/test_nodes_maintenance.py::TestNodesMaintenance::test_pdb_check_simultaneous_node_drains
Additional Test Params:
OCP VERSION: 4.8
OCS VERSION: 4.8

Job UNSTABLE (some or all tests failed).

Copy link

@ocs-ci ocs-ci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR validation on existing cluster

Cluster Name: tdesala-a11
Cluster Configuration:
PR Test Suite:
PR Test Path: tests/manage/z_cluster/nodes/test_nodes_maintenance.py::TestNodesMaintenance::test_pdb_check_simultaneous_node_drains
Additional Test Params:
OCP VERSION: 4.8
OCS VERSION: 4.8

Job UNSTABLE (some or all tests failed).

Copy link

@ocs-ci ocs-ci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR validation on existing cluster

Cluster Name: tdesala-a11
Cluster Configuration:
PR Test Suite:
PR Test Path: tests/manage/z_cluster/nodes/test_nodes_maintenance.py::TestNodesMaintenance::test_pdb_check_simultaneous_node_drains
Additional Test Params:
OCP VERSION: 4.8
OCS VERSION: 4.8

Job UNSTABLE (some or all tests failed).

Signed-off-by: Shrivaibavi Raghaventhiran <sraghave@redhat.com>
@Shrivaibavi Shrivaibavi added the Verified Mark when PR was verified and log provided label Aug 19, 2021
Copy link

@ocs-ci ocs-ci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR validation on existing cluster

Cluster Name: tdesala-a11
Cluster Configuration:
PR Test Suite:
PR Test Path: tests/manage/z_cluster/nodes/test_nodes_maintenance.py::TestNodesMaintenance::test_pdb_check_simultaneous_node_drains
Additional Test Params:
OCP VERSION: 4.8
OCS VERSION: 4.8

Job PASSED.

Signed-off-by: Shrivaibavi Raghaventhiran <sraghave@redhat.com>
@Shrivaibavi Shrivaibavi removed the Needs Testing Run tests and provide logs link label Sep 1, 2021
Copy link
Contributor

@OdedViner OdedViner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@dahorak dahorak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one note, but it is not a blocker for this PR.

Comment on lines +1740 to +1741
allowed_disruptions = osd_pdb[osd].get("status").get("disruptionsAllowed")
maximum_unavailable = osd_pdb[osd].get("spec").get("maxUnavailable")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if it might happen, but if status or spec key will not be available, it will raise exception AttributeError: 'NoneType' object has no attribute 'get'.
If it is a valid scenario (that those keys might not be present) and we want to get some default value (None in this case), it is better to provide empty dict ({}) as default value for the get(...) method:

Suggested change
allowed_disruptions = osd_pdb[osd].get("status").get("disruptionsAllowed")
maximum_unavailable = osd_pdb[osd].get("spec").get("maxUnavailable")
allowed_disruptions = osd_pdb[osd].get("status", {}).get("disruptionsAllowed")
maximum_unavailable = osd_pdb[osd].get("spec", {}).get("maxUnavailable")

Or if this keys status and spec should be always present, you might use this way:

Suggested change
allowed_disruptions = osd_pdb[osd].get("status").get("disruptionsAllowed")
maximum_unavailable = osd_pdb[osd].get("spec").get("maxUnavailable")
allowed_disruptions = osd_pdb[osd]["status"].get("disruptionsAllowed")
maximum_unavailable = osd_pdb[osd]["spec"].get("maxUnavailable")

And if it will for some reason happen, that status or spec key are not present, it will raise KeyError: 'status' exception, which might be slightly more clear, than the AttributeError: 'NoneType' object has no attribute 'get' exception mentioned above.

for pdb in pdb_obj_get.get("items"):
if not any(
osd in pdb["metadata"]["name"]
for osd in [constants.MDS_PDB, constants.MON_PDB]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we have osd pdb as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We get only OSD PDBs from the list of PDBs so we exclude mons and mds PDBs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Customer defects Defects automated aspart of GSS closed loop size/L PR that changes 100-499 lines team/e2e E2E team related issues/PRs upgrade Verified Mark when PR was verified and log provided
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants