Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MGMT-14605: Gather more data for hypershift tests #5344

Merged

Conversation

CrystalChun
Copy link
Contributor

https://issues.redhat.com/browse/MGMT-14605
Collects more relevant data and logs for hypershift tests. Includes:

  1. hostedcluster nodes
  2. hostedcluster pods yamls and logs
  3. hypershift pods' operator logs

List all the issues related to this PR

  • New Feature
  • Enhancement
  • Bug fix
  • Tests
  • Documentation
  • CI/CD

What environments does this code impact?

  • Automation (CI, tools, etc)
  • Cloud
  • Operator Managed Deployments
  • None

How was this code tested?

  • assisted-test-infra environment
  • dev-scripts environment
  • Reviewer's test appreciated
  • Waiting for CI to do a full test run
  • Manual (Elaborate on how it was tested)
  • No tests needed

Checklist

  • Title and description added to both, commit and PR.
  • Relevant issues have been associated (see CONTRIBUTING guide)
  • This change does not require a documentation update (docstring, docs, README, etc)
  • Does this change include unit-tests (note that code changes require unit-tests)

Reviewers Checklist

  • Are the title and description (in both PR and commit) meaningful and clear?
  • Is there a bug required (and linked) for this change?
  • Should this PR be backported?

/cc @eranco74

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jul 7, 2023
@openshift-ci openshift-ci bot requested a review from eranco74 July 7, 2023 23:15
@openshift-ci-robot
Copy link

openshift-ci-robot commented Jul 7, 2023

@CrystalChun: This pull request references MGMT-14605 which is a valid jira issue.

In response to this:

https://issues.redhat.com/browse/MGMT-14605
Collects more relevant data and logs for hypershift tests. Includes:

  1. hostedcluster nodes
  2. hostedcluster pods yamls and logs
  3. hypershift pods' operator logs

List all the issues related to this PR

  • New Feature
  • Enhancement
  • Bug fix
  • Tests
  • Documentation
  • CI/CD

What environments does this code impact?

  • Automation (CI, tools, etc)
  • Cloud
  • Operator Managed Deployments
  • None

How was this code tested?

  • assisted-test-infra environment
  • dev-scripts environment
  • Reviewer's test appreciated
  • Waiting for CI to do a full test run
  • Manual (Elaborate on how it was tested)
  • No tests needed

Checklist

  • Title and description added to both, commit and PR.
  • Relevant issues have been associated (see CONTRIBUTING guide)
  • This change does not require a documentation update (docstring, docs, README, etc)
  • Does this change include unit-tests (note that code changes require unit-tests)

Reviewers Checklist

  • Are the title and description (in both PR and commit) meaningful and clear?
  • Is there a bug required (and linked) for this change?
  • Should this PR be backported?

/cc @eranco74

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jul 7, 2023
@codecov
Copy link

codecov bot commented Jul 8, 2023

Codecov Report

❗ No coverage uploaded for pull request base (master@421eb97). Click here to learn what that means.
Report is 227 commits behind head on master.
The diff coverage is n/a.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff            @@
##             master    #5344   +/-   ##
=========================================
  Coverage          ?   71.60%           
=========================================
  Files             ?      238           
  Lines             ?    41682           
  Branches          ?        0           
=========================================
  Hits              ?    29845           
  Misses            ?     9563           
  Partials          ?     2274           

@eranco74
Copy link
Contributor

eranco74 commented Jul 9, 2023

This is very useful 👍
For some reason I can't access the capi tests results of this PR.
The code already collects the hypershift operator logs and all hosted control plane pods (running on the HUB) logs and yamls.
hosted control plane pods logs and yamls here

I think that gather_hypershift_data function isn't collecting any logs, not specifically the code that you added to the function but what's already tehre.
see example here

Also I don't get why it's trying to collect all these or these logs, assisted isn't running in this namespace, and all CRDs are already collected elsewhere

Copy link
Contributor

@eranco74 eranco74 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the hypershift CLI (the same CLI we use to deploy the hosoted cluster here) has a 'dump` option which collects all relevant information, similar to what must-gather is doing.
We used to have back when assisted-test-infra was running the CAPI tests

and this is how it's executed

@@ -157,6 +157,11 @@ function gather_hypershift_data() {
hypershift_dir="${LOGS_DEST}/hypershift"
mkdir -p "${hypershift_dir}"

oc get po -n hypershift -o yaml > ${hypershift_dir}/hypershift_pods.yaml
oc get po -n hypershift -o custom-columns=:.metadata.name --no-headers | while read hypershift_pod_name ; do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already collected here
Consider collecting the yaml there or moving the log collection here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out! Moved yaml collection there

oc get po -n hypershift -o custom-columns=:.metadata.name --no-headers | while read hypershift_pod_name ; do
oc logs -n hypershift $hypershift_pod_name > ${hypershift_dir}/hypershift_pod_log_$hypershift_pod_name.log
done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can delete line 165 to line 181

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I don't get why it's trying to collect all these or these logs, assisted isn't running in this namespace, and all CRDs are already collected elsewhere

Oh so I think @danielerez added this for his AI on zero worker hypershift enhancement (PR link).

I think it could be relevant if we run tests that deploy AI on hypershift clusters, but maybe it's not as relevant for all hypershift tests.

Maybe these could be moved to a different function? What do you both think?

CRS=(agents infraenvs clusterdeployments agentclusterinstalls clusterimagesets)
for cr in "${CRS[@]}"; do
oc --kubeconfig ${SPOKE_KUBECONFIG} get "${cr}" -n "${SPOKE_NAMESPACE}" -o yaml > "${spoke_dir}/oc_get_${cr}.yaml"
done

oc --kubeconfig "${SPOKE_KUBECONFIG}" get nodes -o yaml > "${spoke_dir}/oc_get_nodes.yaml"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great.

https://issues.redhat.com/browse/MGMT-14605
Collects more relevant data and logs for hypershift tests.
Includes:
1. hostedcluster nodes
2. hostedcluster pods yamls and logs
3. hypershift pods' operator logs
@CrystalChun
Copy link
Contributor Author

For some reason I can't access the capi tests results of this PR.

Oh hmm it seems to have failed detecting the spoke kubeconfig so it didn't get the artifacts from the spoke cluster 🤔

@eranco74
Copy link
Contributor

@CrystalChun is this still relevant?

Copy link

openshift-ci bot commented Nov 14, 2023

@CrystalChun: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/edge-e2e-ai-operator-ztp-capi 3c202fb link false /test edge-e2e-ai-operator-ztp-capi
ci/prow/edge-e2e-oci-assisted-4-14 3c202fb link true /test edge-e2e-oci-assisted-4-14

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 3, 2023
Copy link

openshift-ci bot commented Dec 3, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: avishayt, CrystalChun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [CrystalChun,avishayt]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot bot merged commit 4f81467 into openshift:master Dec 3, 2023
16 of 18 checks passed
@CrystalChun CrystalChun deleted the gather-hypershift-logs branch January 29, 2024 23:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants