Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes CEL estimated cost to propagate result sizes correctly #119800

Merged
merged 2 commits into from Aug 16, 2023

Conversation

jpbetz
Copy link
Contributor

@jpbetz jpbetz commented Aug 7, 2023

/kind bug

What this PR does / why we need it:

Fixes CEL estimated cost to propagate the result size of functions that returns strings/bytes/lists/maps. Without this fix, the estimated size is assumed to be max integer and the estimated cost limit is exceeded, preventing some valid CEL expressions from being used in CRDs for validation.

Which issue(s) this PR fixes:

Fixes #119749

Special notes for your reviewer:

See google/cel-go#787 for the fix to cel-go picked up by this PR. The rest of this PR is tests to ensure the fix applies as expected.

Does this PR introduce a user-facing change?

Fixed a bug where CEL expressions in CRD validation rules would incorrectly compute a high estimated cost for functions that return strings, lists or maps.
The incorrect cost was evident when the result of a function was used in subsequent operations.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. kind/bug Categorizes issue or PR as related to a bug. labels Aug 7, 2023
@k8s-ci-robot
Copy link
Contributor

Please note that we're already in Test Freeze for the release-1.28 branch. This means every merged PR will be automatically fast-forwarded via the periodic ci-fast-forward job to the release branch of the upcoming v1.28.0 release.

Fast forwards are scheduled to happen every 6 hours, whereas the most recent run was: Mon Aug 7 16:31:33 UTC 2023.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/apiserver area/cloudprovider area/code-generation area/dependency Issues or PRs related to dependency changes kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Aug 7, 2023
go.mod Outdated Show resolved Hide resolved
@jpbetz jpbetz force-pushed the cost-fix branch 3 times, most recently from c228a6e to f63c61b Compare August 7, 2023 20:52
@jpbetz
Copy link
Contributor Author

jpbetz commented Aug 7, 2023

/priority important-soon

@k8s-ci-robot k8s-ci-robot added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Aug 7, 2023
@@ -1688,12 +1688,12 @@ func TestCostEstimation(t *testing.T) {
"before": strType,
"after": strType,
})
objType = withRule(objType, "self.str.replace(self.before, self.after) == 'does not matter'")
objType = withRule(objType, "self.str.replace(self.before, self.after) == '0123456789'")
Copy link
Member

@liggitt liggitt Aug 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any explanation for why this test changed or why the new value is better / more correct?

asked another way, if we'd kept the value 'does not matter', would the expectedCalcCost / expectedSetCost have changed?

Copy link
Contributor Author

@jpbetz jpbetz Aug 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. This is the only test that exercised the behavior being fixed. But it failed to catch it.

The estimator calculates the cost of the equals check based on the min(length(LHS), length(RHS)) of the equal operation.

Before fixing the bug it was doing min(length(MAX_INT), length("does not matter")) and so the expected cost values were based on the length of "does not matter". When I fixed the bug it changed the expected values because length(<expected replace result size>) became less than "does not matter". This is what the test should have expected all along.

I actually don't need to keep this string change, I modified it when testing my assumptions locally, I'll revert it. The expected values will still change.

@jpbetz
Copy link
Contributor Author

jpbetz commented Aug 7, 2023

/hold
For until code thaw per slack discussion with @dims

EDIT: We'll backport to 1.28.1 via #119807

@jpbetz
Copy link
Contributor Author

jpbetz commented Aug 7, 2023

/retest

@alexzielenski
Copy link
Contributor

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Aug 8, 2023
@sftim
Copy link
Contributor

sftim commented Aug 14, 2023

Changelog suggestion

Fixed a bug where CEL expressions in CRD validation rules would incorrectly compute a high estimated cost for functions that return strings, lists or maps.
The incorrect cost was evident when the result of a function was used in subsequent operations.

Does this definitely only apply to CRD validation and not elsewhere?

@jpbetz
Copy link
Contributor Author

jpbetz commented Aug 16, 2023

Changelog suggestion

Fixed a bug where CEL expressions in CRD validation rules would incorrectly compute a high estimated cost for functions that return strings, lists or maps.
The incorrect cost was evident when the result of a function was used in subsequent operations.

Does this definitely only apply to CRD validation and not elsewhere?

Yes, this applies only to CRD validation. It is the only feature that uses estimated cost.

@jpbetz
Copy link
Contributor Author

jpbetz commented Aug 16, 2023

/hold cancel
This is ready to merge.

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 16, 2023
@liggitt
Copy link
Member

liggitt commented Aug 16, 2023

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 16, 2023
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 31c38f82edcf00bb81d314b41b7c74439caacc5d

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jpbetz, liggitt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 16, 2023
@jpbetz
Copy link
Contributor Author

jpbetz commented Aug 16, 2023

/retest

@k8s-ci-robot k8s-ci-robot merged commit 210a97e into kubernetes:master Aug 16, 2023
13 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.29 milestone Aug 16, 2023
@Rajalakshmi-Girish
Copy link
Contributor

Merge of this is giving below error in ppc architecture:

--- FAIL: TestCostEstimation (1.90s)
    --- FAIL: TestCostEstimation/extended_library_replace (0.03s)
        --- FAIL: TestCostEstimation/extended_library_replace/calc_maxLength (0.02s)
            compilation_test.go:1223: Wrong cost (expected 629152, got 629154)
        --- FAIL: TestCostEstimation/extended_library_replace/set_maxLength (0.01s)
            compilation_test.go:1223: Wrong cost (expected 14, got 16)

@jpbetz Please let us know if anything has to be changed specific to architecture.

@jpbetz
Copy link
Contributor Author

jpbetz commented Aug 17, 2023

Joe Betz Please let us know if anything has to be changed specific to architecture.

There is not. The cost of this expression changed from 16 to 14 in cel-go 1.16.1 (https://github.com/google/cel-go/commits/v0.16.1) by this exact amount. Can you point me to the test run, I'd like to validate that the dependency update was applied.

@Rajalakshmi-Girish
Copy link
Contributor

Joe Betz Please let us know if anything has to be changed specific to architecture.

There is not. The cost of this expression changed from 16 to 14 in cel-go 1.16.1 (https://github.com/google/cel-go/commits/v0.16.1) by this exact amount. Can you point me to the test run, I'd like to validate that the dependency update was applied.

https://prow.ppc64le-cloud.cis.ibm.net/view/gs/ppc64le-kubernetes/logs/periodic-kubernetes-unit-test-ppc64le/1692340684859641856 is the link to CI job that has been failing with above test.

@jpbetz
Copy link
Contributor Author

jpbetz commented Aug 18, 2023

https://prow.ppc64le-cloud.cis.ibm.net/view/gs/ppc64le-kubernetes/logs/periodic-kubernetes-unit-test-ppc64le/1692340684859641856 is the link to CI job that has been failing with above test.

Yes, the test failures exactly matche what would happen if somehow the cel-go bump from 1.16.0 to 1.16.1 was somehow not applied.

I checked the cel-go code for anything architecture specific and found nothing. I think our best lead at the moment is that the failure exactly matches what should happen if the dependency bump is not picked up. I've opened a slack thread: https://kubernetes.slack.com/archives/C09QZ4DQB/p1692373534287669

@jpbetz
Copy link
Contributor Author

jpbetz commented Aug 22, 2023

#120097 fixes the ppc64le issue.

k8s-ci-robot added a commit that referenced this pull request Sep 6, 2023
…00-origin-release-1.28

Automated cherry pick of #119800: Fixes CEL estimated cost to propagate result sizes correctly
k8s-ci-robot added a commit that referenced this pull request Sep 7, 2023
Manual cherry pick of #119800: Fixes CEL estimated cost to propagate result sizes correctly
k8s-ci-robot added a commit that referenced this pull request Sep 7, 2023
Manual cherry pick of #119800: Fixes CEL estimated cost to propagate result sizes correctly
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver area/cloudprovider area/code-generation area/dependency Issues or PRs related to dependency changes cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

CEL Estimated cost incorrect for lowerAscii
6 participants