fix: deep copy status in translator layer to avoid race #8778

Merged
zirain merged 1 commit into envoyproxy:main from rudrakhp:translator_coalesce_race
Apr 19, 2026

@rudrakhp
Member

What type of PR is this?

fix: deep copy status in translator layer to avoid race

What this PR does / why we need it:
DeepCopy() calls were removed in #6940, but they are required to avoid a race between deepEqual() in the watchable coalesce routine and status updates in the Gateway API translator routine. This PR replaces the plain shallow copy with a copy that deep-copies only the status.
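
The race described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `Policy`, `Status`, `Condition`, and `copyWithStatusDeepCopy` are hypothetical stand-ins for the real resource types and helpers, and `reflect.DeepEqual` stands in for the comparison done in the coalesce routine.

```go
package main

import (
	"fmt"
	"reflect"
	"sync"
)

// Condition, Status, and Policy are hypothetical stand-ins for a
// Kubernetes-style resource; they are not the real Envoy Gateway types.
type Condition struct {
	Type   string
	Status string
}

type Status struct {
	Conditions []Condition
}

type Policy struct {
	Name   string
	Spec   map[string]string // only read during translation, so it can be shared
	Status Status            // written during translation
}

// copyWithStatusDeepCopy shallow-copies the policy and deep-copies only Status.
func copyWithStatusDeepCopy(in *Policy) *Policy {
	out := *in // Spec still points at the same map as in.Spec
	out.Status.Conditions = append([]Condition(nil), in.Status.Conditions...)
	return &out
}

func main() {
	shared := &Policy{
		Name:   "p1",
		Spec:   map[string]string{"k": "v"},
		Status: Status{Conditions: []Condition{{Type: "Accepted", Status: "Unknown"}}},
	}
	previous := copyWithStatusDeepCopy(shared)

	var wg sync.WaitGroup
	var equal bool
	wg.Add(2)
	go func() { // plays the role of the coalesce goroutine: reads shared
		defer wg.Done()
		equal = reflect.DeepEqual(shared, previous)
	}()
	go func() { // plays the role of the translator goroutine: writes its own copy
		defer wg.Done()
		local := copyWithStatusDeepCopy(shared)
		local.Status.Conditions[0].Status = "True"
	}()
	wg.Wait()

	fmt.Println(equal, shared.Status.Conditions[0].Status) // true Unknown
}
```

If the writer goroutine mutated `shared.Status` directly instead of `local.Status`, the read in `reflect.DeepEqual` and the write would touch the same memory without synchronization, which is the kind of access `go run -race` reports.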

Also includes a minor spelling fix: SeverOptions -> ServerOptions.

Which issue(s) this PR fixes:

Fixes #8771

Release Notes: Yes

@rudrakhp rudrakhp requested a review from a team as a code owner April 17, 2026 09:17
@netlify

netlify Bot commented Apr 17, 2026

Deploy Preview for cerulean-figolla-1f9435 canceled.

Name Link
🔨 Latest commit e8199ae
🔍 Latest deploy log https://app.netlify.com/projects/cerulean-figolla-1f9435/deploys/69e4dc614602da00085166eb

@rudrakhp rudrakhp force-pushed the translator_coalesce_race branch from cc8f6c8 to 99a5c61 Compare April 17, 2026 09:18

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cc8f6c8c98

Comment thread internal/gatewayapi/envoypatchpolicy.go Outdated

- for _, policy := range envoyPatchPolicies {
+ for i, orig := range envoyPatchPolicies {
+     envoyPatchPolicies[i] = deepCopyEnvoyPatchPolicyStatus(orig)

P1 Badge Avoid mutating shared EnvoyPatchPolicies slice in place

Writing the deep-copied pointer back into envoyPatchPolicies[i] still mutates the snapshot object passed from GatewayAPIResources. If a new provider update arrives while translation is running, watchable’s equality/coalescing path can read that same slice concurrently, so this assignment can still trigger the control-plane race the patch is trying to eliminate. Keep the copy in a local variable and do not modify the input slice.
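
One way to follow this suggestion is sketched below, under the assumption that the translator can work from a fresh output slice instead of the input. `PatchPolicy`, `deepCopyStatus`, and `processPolicies` are hypothetical names for illustration, not the functions in this PR.

```go
package main

import "fmt"

// PatchPolicy is a hypothetical stand-in for a policy resource with a
// mutable status.
type PatchPolicy struct {
	Name   string
	Status []string
}

// deepCopyStatus shallow-copies the policy and gives the copy its own
// Status slice.
func deepCopyStatus(in *PatchPolicy) *PatchPolicy {
	out := *in
	out.Status = append([]string(nil), in.Status...)
	return &out
}

// processPolicies never assigns into the input slice. Each copy lives in a
// local variable and is collected into a new slice owned by the caller.
func processPolicies(in []*PatchPolicy) []*PatchPolicy {
	out := make([]*PatchPolicy, 0, len(in))
	for _, orig := range in {
		local := deepCopyStatus(orig)
		local.Status = append(local.Status, "Programmed")
		out = append(out, local)
	}
	return out
}

func main() {
	input := []*PatchPolicy{{Name: "a"}, {Name: "b"}}
	first := input[0]

	out := processPolicies(input)

	fmt.Println(input[0] == first)    // true: the input slice still holds the original pointer
	fmt.Println(len(first.Status))    // 0: the original status was not touched
	fmt.Println(len(out[0].Status))   // 1: only the copy carries the new status
}
```

The difference from the snippet under review is that no element of the input slice is ever reassigned, so a concurrent reader of that slice sees no writes at all.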

policy := &policy
// Clone the Object map to avoid racing with the watchable coalesce goroutine.
policy.Object = maps.Clone(policy.Object)
policies[policyIndex] = policy

P1 Badge Stop writing cloned extension policies into input slice

policies[policyIndex] = policy mutates the shared resources.ExtensionServerPolicies snapshot before translation completes. During concurrent updates, watchable can deep-compare that same slice in another goroutine, so this write can race even after cloning policy.Object. Build status/IR from a local copied policy and leave the incoming slice unchanged.
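
A small sketch of the "local copied policy" idea, assuming the policy carries a top-level `Object` map as in the snippet above. `ExtPolicy` and `withStatus` are hypothetical names. `maps.Clone` (Go 1.21+) copies only the top level of the map, so the sketch replaces the `status` key rather than mutating a nested value in place.

```go
package main

import (
	"fmt"
	"maps"
)

// ExtPolicy is a hypothetical stand-in for an unstructured-style policy
// whose content lives in a top-level Object map.
type ExtPolicy struct {
	Object map[string]any
}

// withStatus returns a copy of in whose Object map is a shallow clone with
// the "status" key replaced. The input value and its map are left untouched.
func withStatus(in ExtPolicy, status map[string]any) ExtPolicy {
	out := in
	out.Object = maps.Clone(in.Object)
	if out.Object == nil {
		out.Object = map[string]any{}
	}
	out.Object["status"] = status
	return out
}

func main() {
	shared := []ExtPolicy{{Object: map[string]any{"kind": "Foo"}}}

	// The copy stays local; shared[0] is never reassigned.
	local := withStatus(shared[0], map[string]any{"accepted": true})

	_, leaked := shared[0].Object["status"]
	fmt.Println(leaked)                       // false
	fmt.Println(local.Object["status"] != nil) // true
}
```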

@codecov

codecov Bot commented Apr 17, 2026

Codecov Report

❌ Patch coverage is 98.74214% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.39%. Comparing base (d80fb5b) to head (e8199ae).
⚠️ Report is 1 commit behind head on main.

Files with missing lines | Patch % | Lines
internal/gatewayapi/extensionserverpolicy.go | 87.50% | 1 Missing and 1 partial ⚠️

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8778      +/-   ##
==========================================
+ Coverage   74.35%   74.39%   +0.04%     
==========================================
  Files         245      245              
  Lines       38847    38951     +104     
==========================================
+ Hits        28883    28978      +95     
- Misses       7963     7969       +6     
- Partials     2001     2004       +3     

Comment thread internal/gatewayapi/backend.go Outdated
// Status is mutated during translation and shares a pointer with the watchable coalesce goroutine.
func deepCopyBackendStatus(in *egv1a1.Backend) *egv1a1.Backend {
    out := *in
    in.Status.DeepCopyInto(&out.Status)
Member

Why not use backend.DeepCopy() directly? It would be easier to understand.

Member Author

@rudrakhp rudrakhp Apr 17, 2026

A full DeepCopy() is less efficient than this, because only the status is modified and needs a deep copy; the rest of the object can be a shallow copy. We were already doing a complete deep copy before #6940 tried to optimize it.
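
To make the trade-off concrete, here is a rough sketch of what a status-only deep copy shares and what it duplicates. `Backend`, `Spec`, and `Status` are simplified hypothetical types, the hand-written `DeepCopyInto` only imitates what generated deep-copy code typically does, and the `return &out` line is an assumption since the excerpt above is truncated.

```go
package main

import "fmt"

// Spec, Status, and Backend are simplified hypothetical types, not the real
// egv1a1.Backend definition.
type Spec struct {
	Endpoints []string
}

type Status struct {
	Conditions []string
}

type Backend struct {
	Name   string
	Spec   Spec
	Status Status
}

// DeepCopyInto imitates a generated deep-copy method for Status.
func (s *Status) DeepCopyInto(out *Status) {
	*out = *s
	if s.Conditions != nil {
		out.Conditions = make([]string, len(s.Conditions))
		copy(out.Conditions, s.Conditions)
	}
}

// deepCopyBackendStatus copies the struct by value, then replaces only the
// Status with an independent deep copy. Spec keeps sharing its backing data.
func deepCopyBackendStatus(in *Backend) *Backend {
	out := *in
	in.Status.DeepCopyInto(&out.Status)
	return &out
}

func main() {
	in := &Backend{
		Name:   "b1",
		Spec:   Spec{Endpoints: []string{"10.0.0.1", "10.0.0.2"}},
		Status: Status{Conditions: []string{"Accepted"}},
	}
	out := deepCopyBackendStatus(in)

	// Spec was not duplicated: both slices point at the same backing array.
	fmt.Println(&in.Spec.Endpoints[0] == &out.Spec.Endpoints[0]) // true
	// Status was duplicated: writes to out.Status cannot reach in.Status.
	fmt.Println(&in.Status.Conditions[0] == &out.Status.Conditions[0]) // false
}
```

The cost of the copy then scales with the size of the status rather than the whole object, which only stays safe as long as nothing writes to the shared spec.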

@rudrakhp rudrakhp force-pushed the translator_coalesce_race branch 2 times, most recently from c64694a to dc37ca3 Compare April 17, 2026 10:39
@rudrakhp rudrakhp requested review from a team and zirain April 17, 2026 11:14
@rudrakhp rudrakhp force-pushed the translator_coalesce_race branch from b0b74a1 to dbc7ecb Compare April 18, 2026 06:35
@jukie
Contributor

jukie commented Apr 18, 2026

/retest

Comment thread internal/gatewayapi/securitypolicy.go Outdated
  policy, found := handledPolicies[policyName]
  if !found {
-     policy = currPolicy
+     policy = deepCopySecurityPolicyStatus(currPolicy)
Contributor

Can't we just do it once at L142?

Member Author

Makes sense. The copies are now created once in XCopiesWithStatusDeepCopy() and then reused.
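
The "copy once, then reuse" approach could look roughly like this. `Policy` and `copiesWithStatusDeepCopy` are hypothetical illustrations of the pattern; the actual helpers in the PR are only referred to here as XCopiesWithStatusDeepCopy().

```go
package main

import "fmt"

// Policy is a hypothetical stand-in for a policy resource.
type Policy struct {
	Name       string
	Conditions []string
}

// copiesWithStatusDeepCopy builds the status-deep-copied policies once, up
// front, so later passes can share the same copies instead of copying again.
func copiesWithStatusDeepCopy(in []*Policy) []*Policy {
	out := make([]*Policy, len(in))
	for i, p := range in {
		c := *p
		c.Conditions = append([]string(nil), p.Conditions...)
		out[i] = &c
	}
	return out
}

func main() {
	input := []*Policy{{Name: "a"}, {Name: "b"}}
	copies := copiesWithStatusDeepCopy(input)

	// Index the copies by name; every lookup returns the same pointer.
	handled := make(map[string]*Policy, len(copies))
	for _, p := range copies {
		handled[p.Name] = p
	}

	// Two separate passes update the same copy, never the input.
	handled["a"].Conditions = append(handled["a"].Conditions, "Accepted")
	copies[0].Conditions = append(copies[0].Conditions, "Programmed")

	fmt.Println(len(copies[0].Conditions), len(input[0].Conditions)) // 2 0
}
```

Because each pass works on the same set of copies, status written in one pass is visible to the next, and the input slice is read but never written.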

@rudrakhp rudrakhp force-pushed the translator_coalesce_race branch from 52c3850 to 923e3b6 Compare April 19, 2026 07:41
Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com>
@rudrakhp rudrakhp force-pushed the translator_coalesce_race branch from 923e3b6 to e8199ae Compare April 19, 2026 13:45
@rudrakhp rudrakhp requested a review from arkodg April 19, 2026 13:56
Contributor

@arkodg arkodg left a comment

LGTM, thanks.

@zirain zirain merged commit 3f70a89 into envoyproxy:main Apr 19, 2026
80 of 85 checks passed
rudrakhp added a commit that referenced this pull request Apr 24, 2026
* fix json report (#8614)

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com>

* fix: deep copy status in translator layer to avoid race (#8778)

Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com>

* chore: setup translator test (#7627)

* chore: setup gatewayapi translator test

Signed-off-by: kkk777-7 <kota.kimura0725@gmail.com>

* remove: duplicate resources in testfiles

Signed-off-by: kkk777-7 <kota.kimura0725@gmail.com>

* update gatewayapi test output

Signed-off-by: kkk777-7 <kota.kimura0725@gmail.com>

---------

Signed-off-by: kkk777-7 <kota.kimura0725@gmail.com>
Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com>

* perf: introduce translator context (#7535)

* perf: introduce translator context

Signed-off-by: kkk777-7 <kota.kimura0725@gmail.com>

* perf: add policy map in translator context

Signed-off-by: kkk777-7 <kota.kimura0725@gmail.com>

* chore: update tranlate bench

Signed-off-by: kkk777-7 <kota.kimura0725@gmail.com>

* Revert "perf: add policy map in translator context"

This reverts commit 14a3784.

Signed-off-by: kkk777-7 <kota.kimura0725@gmail.com>

* add translator context method, remove unused function

Signed-off-by: kkk777-7 <kota.kimura0725@gmail.com>

* update gatewayapi test output

Signed-off-by: kkk777-7 <kota.kimura0725@gmail.com>

* add resources in translator context

Signed-off-by: kkk777-7 <kota.kimura0725@gmail.com>

* add gatewayapi translate bench

Signed-off-by: kkk777-7 <kota.kimura0725@gmail.com>

* fix: set resources in translator context

Signed-off-by: kkk777-7 <kota.kimura0725@gmail.com>

* fix: go lint

Signed-off-by: kkk777-7 <kota.kimura0725@gmail.com>

* address comments

Signed-off-by: kkk777-7 <kota.kimura0725@gmail.com>

* feat: support egctl translate options

Signed-off-by: kkk777-7 <kota.kimura0725@gmail.com>

* update: bench test

Signed-off-by: kkk777-7 <kota.kimura0725@gmail.com>

* update embedded field

Signed-off-by: kkk777-7 <kota.kimura0725@gmail.com>

* fix gatewayapi testdata output

Signed-off-by: kkk777-7 <kota.kimura0725@gmail.com>

---------

Signed-off-by: kkk777-7 <kota.kimura0725@gmail.com>
Co-authored-by: zirain <zirain2009@gmail.com>
Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com>

* fix gen-check

Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com>

---------

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com>
Signed-off-by: kkk777-7 <kota.kimura0725@gmail.com>
Co-authored-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
Co-authored-by: Kota Kimura <86363983+kkk777-7@users.noreply.github.com>
Co-authored-by: zirain <zirain2009@gmail.com>
skos-ninja pushed a commit to skos-ninja/envoy-gateway that referenced this pull request May 1, 2026
)

Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com>
Signed-off-by: Jake Oliver <jake@truelayer.com>

Development

Successfully merging this pull request may close these issues.

Go panic in control plane

4 participants