Nexus Standalone: wire up #9837

Merged
stephanos merged 21 commits into feature/nexus-standalone from stephanos/ns-standalone-wire-up
Apr 21, 2026

Conversation

Contributor

@stephanos stephanos commented Apr 7, 2026

What changed?

Wire up Nexus Standalone with CHASM.

How did you test it?

  • built
  • run locally and tested manually
  • covered by existing tests
  • added new unit test(s)
  • added new functional test(s)

@stephanos stephanos force-pushed the stephanos/ns-standalone-wire-up branch from 2253f0c to b503714 Compare April 7, 2026 03:46
@stephanos stephanos changed the title Stephanos/ns standalone wire up Nexus Standalone: wire up Apr 7, 2026
@stephanos stephanos force-pushed the feature/nexus-standalone branch 2 times, most recently from 10d3595 to 83bfb53 Compare April 7, 2026 20:38
@stephanos stephanos force-pushed the stephanos/ns-standalone-wire-up branch from b503714 to 56c2d98 Compare April 7, 2026 20:59
@stephanos stephanos force-pushed the feature/nexus-standalone branch from 83bfb53 to 6a0bd6e Compare April 8, 2026 18:31
@stephanos stephanos force-pushed the stephanos/ns-standalone-wire-up branch from 56c2d98 to 517d0bc Compare April 8, 2026 18:58
@stephanos stephanos force-pushed the feature/nexus-standalone branch from 6a0bd6e to 88281af Compare April 8, 2026 19:17
@stephanos stephanos force-pushed the stephanos/ns-standalone-wire-up branch 2 times, most recently from f69fecd to 7094ffa Compare April 8, 2026 21:52
    })

    t.Run("ReturnsLastAttemptFailure", func(t *testing.T) {
        t.Skip("Enable once standalone Nexus task and cancellation executors are wired through public Nexus task APIs")
Contributor Author

@stephanos stephanos Apr 8, 2026

Removed this (was slightly mislabelled)


    for _, tc := range testCases {
        t.Run(tc.name, func(t *testing.T) {
            t.Skip("Enable once standalone Nexus task and cancellation executors are wired through public Nexus task APIs")
Contributor Author

Removed this (was slightly mislabelled)

@stephanos stephanos force-pushed the stephanos/ns-standalone-wire-up branch 18 times, most recently from 470834e to 643f340 Compare April 9, 2026 03:31
@stephanos stephanos force-pushed the feature/nexus-standalone branch from 22ed5f4 to 8b51e02 Compare April 13, 2026 19:48
@stephanos stephanos force-pushed the stephanos/ns-standalone-wire-up branch from b936601 to 103b0e1 Compare April 13, 2026 20:29
@stephanos stephanos force-pushed the feature/nexus-standalone branch from 8b51e02 to d13cafb Compare April 13, 2026 20:51
            },
            expectedStatus:         enumspb.NEXUS_OPERATION_EXECUTION_STATUS_CANCELED,
            expectedFailureMessage: "cancel failure",
        },
Contributor Author

New test, now that cancellation is in.

Comment thread chasm/lib/nexusoperation/operation.go Outdated
Comment on lines +393 to +395
    if o.Status == nexusoperationpb.OPERATION_STATUS_TIMED_OUT && o.LastAttemptFailure != nil {
        return nil, o.LastAttemptFailure
    }
Contributor Author

This is the special case added here

Member

If the operation fails with a timeout error, we still have the last attempt failure in the describe response, but that's different from the cause of the operation failure.

Contributor Author

@stephanos stephanos Apr 17, 2026

I must have misunderstood our standup conversation about this from a while ago; I'll remove it.

Comment thread tests/nexus_standalone_test.go Outdated
            return nil, err
        }
        return commonnexus.NexusFailureToTemporalFailure(nexusFailure)
    }
Contributor Author

NexusTestEnv has a helper for this but the test hasn't been converted to use it yet; follow up ...


    message StartNexusOperationRequest {
        string namespace_id = 1;
        string endpoint_id = 2;
Contributor Author

"breaking change"

Member

😱

stephanos and others added 12 commits April 14, 2026 13:25
Add API boilerplate for standalone Nexus Operations.

- [x] built
- [ ] run locally and tested manually
- [x] covered by existing tests
- [ ] added new unit test(s)
- [ ] added new functional test(s)

Add Nexus Standalone feature flag.

Tests will be added to respective API impl.

Add Nexus Standalone Describe and Start handlers.

- [ ] built
- [ ] run locally and tested manually
- [ ] covered by existing tests
- [x] added new unit test(s)
- [x] added new functional test(s)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add Nexus Standalone List and Count handlers.

- [ ] built
- [ ] run locally and tested manually
- [ ] covered by existing tests
- [ ] added new unit test(s)
- [x] added new functional test(s)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment thread chasm/lib/nexusoperation/config.go
Comment thread chasm/lib/nexusoperation/operation.go Outdated

    switch {
    case !hasOutcome:
        return nil, o.LastAttemptFailure
Member

My only concern with how the code is structured is that it assumes the caller of this function has already checked that the operation is closed. Otherwise, returning the last attempt failure as the outcome would be incorrect.

Contributor Author

@stephanos stephanos Apr 17, 2026

The method had that issue before this refactor, right (it fell back to o.LastAttemptFailure)?

But I agree, that seems wrong - I'll make it return nil instead.

Contributor Author

@stephanos stephanos Apr 17, 2026

This originated in SAA; I see the outcome method there falls back to LastAttempt if there is no success/failure - but only if it's closed. I wonder what the use case is.

Do you think it's safe to assume that a closed operation will always have success/failure outcome set? Or do we need the fallback to LastAttemptFailure like SAA has?

Member

We need to fall back to LastAttemptFailure. My point was that the function should be self-contained and not make assumptions about how it is called.
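
One way to read the self-containment point is sketched below. This is a minimal illustration, not the code in chasm/lib/nexusoperation/operation.go: the `Operation`, `Failure`, and `Status` types and the `Outcome` method are simplified, hypothetical stand-ins. The method checks the closed state itself, so an attempt failure on a still-running operation is never reported as the final outcome, and it still falls back to the last attempt failure once the operation is closed without an explicit outcome.

```go
package main

import "fmt"

// Status is a hypothetical stand-in for the operation status enum.
type Status int

const (
	StatusRunning Status = iota
	StatusSucceeded
	StatusFailed
)

// Failure is a hypothetical stand-in for the failure proto.
type Failure struct{ Message string }

// Operation is a simplified, hypothetical view of the operation state.
type Operation struct {
	Status             Status
	Result             *string  // explicit successful outcome, if recorded
	Failure            *Failure // explicit failed outcome, if recorded
	LastAttemptFailure *Failure // failure of the most recent attempt
}

// Outcome returns the result or failure of a closed operation.
// It checks the closed state itself instead of relying on the caller.
func (o *Operation) Outcome() (*string, *Failure) {
	switch {
	case o.Status == StatusRunning:
		// Still open: the last attempt failure is not an outcome yet.
		return nil, nil
	case o.Result != nil:
		return o.Result, nil
	case o.Failure != nil:
		return nil, o.Failure
	default:
		// Closed without an explicit outcome: fall back to the last attempt.
		return nil, o.LastAttemptFailure
	}
}

func main() {
	attempt := &Failure{Message: "attempt failed"}

	open := &Operation{Status: StatusRunning, LastAttemptFailure: attempt}
	_, f := open.Outcome()
	fmt.Println(f == nil) // true: no outcome while the operation is open

	closed := &Operation{Status: StatusFailed, LastAttemptFailure: attempt}
	_, f = closed.Outcome()
	fmt.Println(f.Message) // attempt failed
}
```

With this shape, a unit test for the fallback only needs a closed operation with no explicit outcome and a non-nil last attempt failure.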

    // Result is set by standalone operations. Workflow-attached operations store
    // the result in the history event and remove the operation after this transition.
    if event.Result != nil {
Member

We can just always pass in the result IMHO. It doesn't matter for workflow operations since the operation would be immediately deleted from the tree and not take up any additional storage.

Member

(which would make this if redundant)

    // Result is set by standalone operations. Workflow-attached operations store
    // the result in the history event and remove the operation after this transition.
    if event.Result != nil {
        o.outcome(ctx).Variant = &nexusoperationpb.OperationOutcome_Successful_{
Member

Maybe call outcome() getOrCreateOutcome() to clarify that this is what happens under the hood?
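
The naming suggestion can be illustrated with a small sketch. The `Operation` and `Outcome` types here are hypothetical and much simpler than the real CHASM component; the point is only that a `getOrCreate` prefix tells the reader the accessor may allocate and store state as a side effect.

```go
package main

import "fmt"

// Outcome is a hypothetical stand-in for the operation outcome record.
type Outcome struct {
	Result string
}

// Operation is a hypothetical component that creates its outcome lazily.
type Operation struct {
	outcome *Outcome
}

// getOrCreateOutcome returns the stored outcome, allocating and storing
// a new one on first use. The name makes that side effect explicit.
func (o *Operation) getOrCreateOutcome() *Outcome {
	if o.outcome == nil {
		o.outcome = &Outcome{}
	}
	return o.outcome
}

func main() {
	op := &Operation{}
	op.getOrCreateOutcome().Result = "done"
	// The second call returns the same stored outcome.
	fmt.Println(op.getOrCreateOutcome().Result) // done
}
```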

    }

    var links []nexus.Link
    if args.nexusLink != (nexus.Link{}) {
Member

I would just make args.nexusLinks plural and not deal with the empty state. Also there should always be a link, so not sure why this line is needed.
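
A sketch of the plural-field suggestion, assuming a hypothetical `Link` type and `startArgs` struct (the real code uses `nexus.Link`, which is not reproduced here). With a slice, the zero value already means "no links", so the comparison against an empty struct is no longer needed.

```go
package main

import "fmt"

// Link is a hypothetical stand-in for a Nexus link.
type Link struct {
	URL  string
	Type string
}

// startArgs is a hypothetical argument bundle with a plural links field.
type startArgs struct {
	nexusLinks []Link
}

// requestLinks returns the links to attach to the start request.
// No empty-state check is needed: copying an empty slice yields no links.
func requestLinks(args startArgs) []Link {
	return append([]Link(nil), args.nexusLinks...)
}

func main() {
	fmt.Println(len(requestLinks(startArgs{}))) // 0

	// Placeholder values for illustration only.
	args := startArgs{nexusLinks: []Link{{URL: "example://operation", Type: "example.Link"}}}
	fmt.Println(len(requestLinks(args))) // 1
}
```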


Member

@bergundy bergundy left a comment

Very close now. The comments are blocking but I trust you to merge only after addressing.

Member

You are missing populating a link for the standalone nexus operation. You need to generate a link in this format in loadStartArgs:

    // A link to a standalone Nexus operation.
    message NexusOperation {
        string namespace = 1;
        string operation_id = 2;
        string run_id = 3;
    }


        })
    }

    func TestDescribeOutcome(t *testing.T) {
Member

You need a test for when we use the last attempt failure instead of the outcome field.

Contributor Author

@stephanos stephanos Apr 21, 2026

@bergundy confused about this one; I had removed (dccc430) the "return last attempt failure as outcome" behavior entirely based on the first review comment.

I'll merge for now to unblock us but happy to address this as a follow-up.

Member

I clarified the comment above.

2 participants