app-server: Only unload threads which were unused for some time#17398
app-server: Only unload threads which were unused for some time#17398euroelessar merged 3 commits intomainfrom
Conversation
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3b1bcf8548
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 51e1849342
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| self.thread_state_manager | ||
| .remove_connection(connection_id) | ||
| .await; |
There was a problem hiding this comment.
Restore stale-thread teardown on connection close
connection_closed now discards the remove_connection result, so it never reconciles threads whose last subscriber disconnected but whose core thread is already gone. The previous logic called finalize_thread_teardown in that case. Without it, stale app-server state (thread state/watch bookkeeping and pending request mappings) can persist indefinitely.
Useful? React with 👍 / 👎.
| let Some(mut unloading_state) = UnloadingState::new( | ||
| &listener_task_context, | ||
| conversation_id, | ||
| THREAD_UNLOADING_DELAY, | ||
| ) | ||
| .await | ||
| else { | ||
| return; |
There was a problem hiding this comment.
Fail subscription when listener setup cannot start
ensure_listener_task_running_task silently returns when UnloadingState::new is None, but its caller still reports Attached. In a concurrent teardown window this yields a successful subscribe response without any listener task, so the client misses turn/item/status events until it retries.
Useful? React with 👍 / 👎.
| const LOGIN_CHATGPT_TIMEOUT: Duration = Duration::from_secs(10 * 60); | ||
| const LOGIN_ISSUER_OVERRIDE_ENV_VAR: &str = "CODEX_APP_SERVER_LOGIN_ISSUER"; | ||
| const APP_LIST_LOAD_TIMEOUT: Duration = Duration::from_secs(90); | ||
| const THREAD_UNLOADING_DELAY: Duration = Duration::from_secs(30 * 60); |
There was a problem hiding this comment.
Does this mean that the issue would still happen on a task that runs for 30 minutes?
There was a problem hiding this comment.
Never mind just caught up on the Slack thread
Currently app-server may unload actively running threads once the last connection disconnects, which is not expected.
Instead track when was the last active turn & when there were any subscribers the last time, also add 30 minute idleness/no subscribers timer to reduce the churn.