Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug 2014420: dont crash topology page #10272

Merged

Conversation

yaacov
Copy link
Member

@yaacov yaacov commented Oct 19, 2021

The referenceForModel method crash if called with model undefined, this may happen in the topology page.
This PR let it fail gracefully.

Screenshots:
Before:
screenshot-localhost_9000-2021 10 19-21_06_18

After:
screenshot-localhost_9000-2021 10 19-21_07_41

@openshift-ci openshift-ci bot added the bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. label Oct 19, 2021
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 19, 2021

@yaacov: This pull request references Bugzilla bug 2014420, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.10.0) matches configured target release for branch (4.10.0)
  • bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

Requesting review from QA contact:
/cc @sanketpathak

In response to this:

Bug 2014420: dont crash topology page

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot added the bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. label Oct 19, 2021
@openshift-ci openshift-ci bot added the component/core Related to console core functionality label Oct 19, 2021
@yaacov
Copy link
Member Author

yaacov commented Oct 20, 2021

@rawagner PTAL

@yaacov
Copy link
Member Author

yaacov commented Oct 20, 2021

@spadgett @vojtechszocs please review

@yaacov
Copy link
Member Author

yaacov commented Oct 21, 2021

@spadgett @vojtechszocs ping

@@ -7,7 +7,7 @@ export const referenceForGroupVersionKind = (group: string) => (version: string)
) => [group, version, kind].join('~');

export const referenceForModel = (model: K8sKind): GroupVersionKind =>
referenceForGroupVersionKind(model.apiGroup || 'core')(model.apiVersion)(model.kind);
model && referenceForGroupVersionKind(model.apiGroup || 'core')(model.apiVersion)(model.kind);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is better than the current state but we need to look into two things

  • model is required, so ideally should be handled by the caller
  • why is the model not discovered

we can get this PR in and but still will be nice to dig deeper

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yaacov Is this happening in a lot of places or one place? Any reason not to fix this at the caller?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spadgett hi, didn't find the offending call, I think it's hiding inside the dynamic creation of the kebab mechanism, but didn't dig that deep.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yaacov IMO it would be better to track that down. We don't know what other issues there might be if the model is not found. I'd like to understand what's happening better before putting in this fix.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spadgett , the offending call in was using modelForGroupKind as argument for referenceForModel without checking the return value of modelForGroupKind.

I added a check that an internally defined model actually exist before assuming it exist and using it even if a dynamically defined one is missing.

please re-review.

Copy link
Member

@spadgett spadgett Oct 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rollback this change for now. I think we should discuss whether we want null checking when we move referenceForModel to the dynamic plugin SDK. We should strive to be consistent with how we handle that across APIs in the SDK.

@openshift-ci openshift-ci bot added the component/topology Related to topology label Oct 25, 2021
@invincibleJai
Copy link
Member

invincibleJai commented Oct 25, 2021

could be happening due to usage of referenceFor(modelFor(res)) i.e apiDiscovery is not providing the model

@yaacov
Copy link
Member Author

yaacov commented Oct 25, 2021

could be happening due to usage of referenceFor(modelFor(res)) i.e apiDiscovery is not providing the model

@invincibleJai is was using modelForGroupKind as argument for referenceForModel without checking the return value of modelForGroupKind

: v?.model
? referenceForModel(modelForGroupKind(v.model?.group, v.model?.kind))
: internalModel
? referenceForModel(internalModel)
: v?.opts?.kind;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't quite look right. We shouldn't ever be falling back to just the kind if v.model.group is defined.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spadgett hi,
I think the logic is clearer now, please re-review

@@ -7,7 +7,7 @@ export const referenceForGroupVersionKind = (group: string) => (version: string)
) => [group, version, kind].join('~');

export const referenceForModel = (model: K8sKind): GroupVersionKind =>
referenceForGroupVersionKind(model.apiGroup || 'core')(model.apiVersion)(model.kind);
model && referenceForGroupVersionKind(model.apiGroup || 'core')(model.apiVersion)(model.kind);
Copy link
Member

@spadgett spadgett Oct 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rollback this change for now. I think we should discuss whether we want null checking when we move referenceForModel to the dynamic plugin SDK. We should strive to be consistent with how we handle that across APIs in the SDK.

@yaacov
Copy link
Member Author

yaacov commented Oct 27, 2021

/bugzilla refresh
to reflect change in severity (27Oct triage)

@openshift-ci openshift-ci bot added bugzilla/severity-urgent Referenced Bugzilla bug's severity is urgent for the branch this PR is targeting. and removed bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. labels Oct 27, 2021
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 27, 2021

@yaacov: This pull request references Bugzilla bug 2014420, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.10.0) matches configured target release for branch (4.10.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

Requesting review from QA contact:
/cc @sanketpathak

In response to this:

/bugzilla refresh
to reflect change in severity (27Oct triage)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@yaacov yaacov force-pushed the bz-2014420-dont-crash-topology branch 2 times, most recently from bb1b402 to 7b56eea Compare October 27, 2021 09:21
@yaacov
Copy link
Member Author

yaacov commented Oct 27, 2021

@spadgett please re review

@yaacov
Copy link
Member Author

yaacov commented Oct 27, 2021

/retest

v.model?.version &&
v.model?.kind &&
referenceForExtensionModel(v.model);
const kind = internalKind || extensionKind || v?.opts?.kind;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

based on the previous code the priority for kind will be in below order

Suggested change
const kind = internalKind || extensionKind || v?.opts?.kind;
const kind = extensionKind || internalKind || v?.opts?.kind;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested on internal and extension provided types with both priorities, still working, switching as suggested.

@yaacov yaacov force-pushed the bz-2014420-dont-crash-topology branch from 7b56eea to 67ecdfe Compare October 28, 2021 09:22
@invincibleJai
Copy link
Member

/approve

Thanks @yaacov, have verified the changes on the shared cluster with kubevirt and it works as expected.

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 28, 2021
@yaacov
Copy link
Member Author

yaacov commented Oct 28, 2021

/retest

1 similar comment
@yaacov
Copy link
Member Author

yaacov commented Oct 28, 2021

/retest

@yaacov
Copy link
Member Author

yaacov commented Oct 28, 2021

@spadgett please re review

Copy link
Member

@spadgett spadgett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @yaacov. This is existing code, but I think a comment is warranted here describing the need for this logic.

cc @glekner

: v?.model
? referenceForModel(modelForGroupKind(v.model?.group, v.model?.kind))
: v?.opts?.kind;
const model = v.model && modelForGroupKind(v.model?.group, v.model?.kind); // Return null for CRDs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const model = v.model && modelForGroupKind(v.model?.group, v.model?.kind); // Return null for CRDs
const model = v.model && modelForGroupKind(v.model.group, v.model.kind); // Return null for CRDs

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hee, in ExtensionK8sGroupKindModel only version is optional, removing all ? except for version


const internalKind = model && referenceForModel(model);
const extensionKind =
v?.model &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have optional chaining after v here but not above. If v can be undefined, we should have optional chaining above (line 34).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in WatchK8sResourcesGeneric model is optional, i'll add ? above

Comment on lines 38 to 40
v?.model &&
v.model?.version &&
v.model?.kind &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
v?.model &&
v.model?.version &&
v.model?.kind &&
v?.model?.version &&
v?.model?.kind &&

: v?.opts?.kind;
const model = v.model && modelForGroupKind(v.model?.group, v.model?.kind); // Return null for CRDs

const internalKind = model && referenceForModel(model);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should try to use the term "reference" here rather than "kind" since "kind" means something else (although I see the return object already uses property kind).

@yaacov yaacov force-pushed the bz-2014420-dont-crash-topology branch 3 times, most recently from c384960 to bccc60b Compare October 29, 2021 07:11
@yaacov
Copy link
Member Author

yaacov commented Oct 29, 2021

Thanks, @yaacov. This is existing code, but I think a comment is warranted here describing the need for this logic.

cc @glekner

I moved the logic into a named method to give some semantics, I also added some types to hint what is optional and what is required. @spadgett please re-review

p.s.
tested with new revision, page does not crash for me :-)

@yaacov
Copy link
Member Author

yaacov commented Nov 1, 2021

@spadgett ping

2 similar comments
@yaacov
Copy link
Member Author

yaacov commented Nov 3, 2021

@spadgett ping

@yaacov
Copy link
Member Author

yaacov commented Nov 5, 2021

@spadgett ping

@invincibleJai
Copy link
Member

invincibleJai commented Nov 8, 2021

@yaacov I feel even this issue is connected https://bugzilla.redhat.com/show_bug.cgi?id=2020904 the root cause is the model not being discovered in the case of VMs with ApiDiscovery ... we need to look at why is this not getting discovered I think for VMs

@rohitkrai03 @glekner any inputs?

these are contributed vis dynamic extensions here

https://github.com/openshift/console/blob/release-4.9/frontend/packages/kubevirt-plugin/console-extensions.json#L512-L526

cc @spadgett @christianvogt

@yaacov
Copy link
Member Author

yaacov commented Nov 8, 2021

@invincibleJai hi, dynamic plugins are loaded at runtime and may be broken, as a dev-console developer I don't want to be warry about 3rd party code braking the topology view. IMHO the dev-console should fail gracefully in this cases, currently we let dynamic plugins to break the console for all users (even those who don't use this plugin)

@invincibleJai
Copy link
Member

invincibleJai commented Nov 8, 2021

@invincibleJai hi, dynamic plugins are loaded at runtime and may be broken, as a dev-console developer I don't want to be warry about 3rd party code braking the topology view. IMHO the dev-console should fail gracefully in this cases, currently we let dynamic plugins to break the console for all users (even those who don't use this plugin)

yeah we should have checks and user should not see failures in topology so your changes in this PR is needed but I was trying to highlight the other issue logged here https://bugzilla.redhat.com/show_bug.cgi?id=2020904 we need dynamic discovery for the model here which is not happening looks like and if this would have happened then the user would not have encountered current issue (we should def have checks for such runtime cases, +1 for that).

@yaacov
Copy link
Member Author

yaacov commented Nov 8, 2021

@invincibleJai yes I agree, btw/ I re-installed my cluster and the bug disappeared ... I currently don't have a cluster to reproduce and test the fix.

@yaacov
Copy link
Member Author

yaacov commented Nov 23, 2021

@spadgett please re-review

@yaacov
Copy link
Member Author

yaacov commented Dec 6, 2021

@spadgett please re review

return { namespace, ...opts };
}

if (model?.version) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (model?.version) {
if (model.version) {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

}

// If can't find reference for an extention model, fall back to internal reference
const internalModel = modelForGroupKind(model.group, model.kind); // Return null for CRDs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why aren't we using the API version?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how to get the API version at this point, modelForGroupKind is using the store to get it.
@spadgett what am I missing ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spadgett help, what am I missing ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spadgett these models are fetched form extension contributed via diff packages for topology

https://github.com/openshift/console/blob/master/frontend/packages/console-dynamic-plugin-sdk/src/extensions/topology.ts#L36-L54

where resources have the model with type ExtensionK8sGroupKindModel, where version is optional but kind group is required.

https://github.com/openshift/console/blob/master/frontend/packages/console-dynamic-plugin-sdk/src/api/common-types.ts#L21-L25

that is why modelForGroupKind is used here to get models with group and kind and existing logic as well does that

https://github.com/openshift/console/blob/master/frontend/packages/topology/src/data-transforms/DataModelProvider.tsx#L37

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @invincibleJai. Thanks. I'm not familiar with the topology extension points, but isn't the API version important if the extension needs to work with the data? Why was it left out?

We can make this a follow up discussion to unblock the bug fix.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can make this a follow up discussion to unblock the bug fix.

This bug affect users, unblocking the bug and creating a follow up discussion makes a lot of sense to me too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I don't want to block the bug any longer. Let's go ahead with the fix. Can we open a bug to track the follow on issue?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Sam, have logged one here https://bugzilla.redhat.com/show_bug.cgi?id=2039381

@yaacov based on the original ticket, have created a new one but to reproduce team would need help to get kubevirt CI cluster and even the issue was intermittent #10272 (comment)

so if there are steps to reproduce consistently or any way to reproduce it on another cluster, pls add a comment on the ticket.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, added @gouyang to the new bugzilla.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I don't want to block the bug any longer. Let's go ahead with the fix. Can we open a bug to track the follow on issue?

@spadgett hi, do I need to change something before you can lgtm it ?

@yaacov yaacov force-pushed the bz-2014420-dont-crash-topology branch from bccc60b to f2d48ea Compare December 6, 2021 18:24
Copy link
Member

@spadgett spadgett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jan 13, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jan 13, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: invincibleJai, spadgett, yaacov

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit bf8ae08 into openshift:master Jan 13, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jan 13, 2022

@yaacov: All pull requests linked via external trackers have merged:

Bugzilla bug 2014420 has been moved to the MODIFIED state.

In response to this:

Bug 2014420: dont crash topology page

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-urgent Referenced Bugzilla bug's severity is urgent for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. component/core Related to console core functionality component/topology Related to topology lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants