Skip to content

Conversation

@olamide226
Copy link
Contributor

@olamide226 olamide226 commented Sep 15, 2025

Summary

Adds restart functionality for MCPServer pods via annotations without requiring CRD schema changes. This feature enables users to trigger graceful restarts of MCPServer instances through GitOps workflows and operational commands.

Changes Made

  • New Restart Annotations:

    • mcpserver.toolhive.stacklok.dev/restarted-at - RFC3339 timestamp to trigger restart
    • mcpserver.toolhive.stacklok.dev/restart-strategy - Optional strategy selection ("rolling" or "immediate")
    • mcpserver.toolhive.stacklok.dev/last-processed-restart - Internal tracking (set by operator)
  • Two Restart Strategies:

    • Rolling (default): Zero-downtime rolling update via deployment pod template annotation
    • Immediate: Fast restart by directly deleting pods for quick recreation
  • Non-blocking Error Handling: Invalid annotations or operational failures don't prevent other reconciliation tasks (ToolConfig updates, spec changes, etc.)

  • Duplicate Prevention: Tracks last processed restart timestamp to avoid reprocessing same requests

Usage Examples

# Rolling restart (zero downtime, production-safe)
kubectl annotate mcpserver my-server \
  mcpserver.toolhive.stacklok.dev/restarted-at="$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  --overwrite

# Immediate restart (fast, development/testing)
kubectl annotate mcpserver my-server \
  mcpserver.toolhive.stacklok.dev/restarted-at="$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  mcpserver.toolhive.stacklok.dev/restart-strategy="immediate" \
  --overwrite

ref #1880

Copilot AI review requested due to automatic review settings September 15, 2025 03:12
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds restart functionality for MCPServer pods through annotations, enabling users to trigger graceful restarts via GitOps workflows without requiring CRD schema changes.

  • Implements restart annotation support with two strategies: rolling (zero-downtime) and immediate (fast restart)
  • Adds comprehensive test coverage for restart functionality with various scenarios
  • Provides documentation and examples for the new restart annotation feature

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
cmd/thv-operator/controllers/mcpserver_controller.go Core restart functionality implementation with annotation handling and restart strategies
cmd/thv-operator/controllers/mcpserver_restart_test.go Comprehensive test suite covering restart annotation handling scenarios
docs/operator/restart-annotation.md Complete documentation for the restart annotation feature
examples/operator/mcp-servers/mcpserver_with_restart_strategy.yaml Example configuration showing restart annotation usage

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@olamide226 olamide226 force-pushed the feat/restart-mcpserver-annotation-1880 branch 2 times, most recently from 31ece55 to 48f0b58 Compare September 15, 2025 03:22
@olamide226 olamide226 requested a review from Copilot September 15, 2025 03:24
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.


Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@olamide226 olamide226 force-pushed the feat/restart-mcpserver-annotation-1880 branch 3 times, most recently from 82bc708 to 0db71b2 Compare September 15, 2025 03:46
@codecov
Copy link

codecov bot commented Sep 15, 2025

Codecov Report

❌ Patch coverage is 91.07143% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 46.91%. Comparing base (395b6ba) to head (3db6851).
⚠️ Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
...d/thv-operator/controllers/mcpserver_controller.go 91.07% 7 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1882      +/-   ##
==========================================
+ Coverage   46.59%   46.91%   +0.31%     
==========================================
  Files         220      220              
  Lines       27201    27319     +118     
==========================================
+ Hits        12674    12816     +142     
+ Misses      13567    13528      -39     
- Partials      960      975      +15     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@JAORMX
Copy link
Collaborator

JAORMX commented Sep 15, 2025

The functionality LGTM! Seems you gotta fix the linter warnings and then it's good to go. Thanks a lot for taking a look at this @olamide226 !

@olamide226 olamide226 force-pushed the feat/restart-mcpserver-annotation-1880 branch from 0db71b2 to 0e277a0 Compare September 15, 2025 18:37
olamide226 and others added 4 commits September 16, 2025 12:41
…iate strategies

- Implement restart trigger via mcpserver.toolhive.stacklok.dev/restarted-at annotation (RFC3339 timestamp)
- Support optional mcpserver.toolhive.stacklok.dev/restart-strategy annotation (rolling or immediate)
- Rolling strategy updates deployment for zero-downtime restart (default)
- Immediate strategy deletes pods for fast restart
- Track last processed restart in status.lastRestartRequest to prevent duplicate restarts
- Update docs and examples to reflect new annotation format and usage

Signed-off-by: Ola Adebayo <34113844+olamide226@users.noreply.github.com>
…iliation

Add first test suite for MCPServer restart annotation logic in the operator controller.
Covers rolling and immediate restart strategies, including edge cases (invalid timestamp, already processed, missing resources, unknown strategy).

Signed-off-by: Ola Adebayo <34113844+olamide226@users.noreply.github.com>
Add restart functionality via annotations without CRD schema changes:
- Support mcpserver.toolhive.stacklok.dev/restarted-at annotation (RFC3339 timestamp)
- Support mcpserver.toolhive.stacklok.dev/restart-strategy annotation (rolling/immediate)
- Rolling strategy triggers zero-downtime deployment update (default)
- Immediate strategy deletes pods for fast restart
- Track last processed restart in annotations to prevent duplicates
- Update documentation and examples for new annotation usage

Resolves operator restart requirements while maintaining backward compatibility.

Signed-off-by: Ola Adebayo <34113844+olamide226@users.noreply.github.com>
… handling

Signed-off-by: Ola Adebayo <olamideadebayo2001@gmail.com>
@olamide226 olamide226 force-pushed the feat/restart-mcpserver-annotation-1880 branch from b20b6ef to 49d4fd5 Compare September 16, 2025 11:41
@JAORMX JAORMX merged commit 24a2f37 into stacklok:main Sep 17, 2025
14 checks passed
@olamide226 olamide226 deleted the feat/restart-mcpserver-annotation-1880 branch September 17, 2025 12:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants