[ML] Discrepancy in resetting of soft_limit status #1131

@droberts195

If an anomaly detection job is approaching its memory limit, we start pruning old models more aggressively and set the memory status to soft_limit. If memory usage then falls far enough that aggressive pruning is no longer required, we do not immediately reset the memory status to ok. However, if the job is closed and reopened, or if it moves to another node because the node it was running on leaves the cluster, the memory status is reset to ok, provided pruning is no longer required.

This discrepancy should be addressed. If aggressive pruning is no longer required then the job's memory status should be reset to ok without requiring a process restart.
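
To make the discrepancy concrete, here is a minimal Python sketch of the two behaviours. It is not the actual implementation: `MemoryStatusTracker`, `soft_limit_bytes`, and both update methods are hypothetical names, the single threshold is an assumed stand-in for whatever decides that aggressive pruning is needed, and the "current" behaviour is simplified to a status that is never lowered while the process runs.

```python
from enum import Enum


class MemoryStatus(Enum):
    OK = "ok"
    SOFT_LIMIT = "soft_limit"


class MemoryStatusTracker:
    """Hypothetical illustration only; all names here are assumptions."""

    def __init__(self, soft_limit_bytes):
        # Assumed threshold at or above which aggressive pruning is required.
        self.soft_limit_bytes = soft_limit_bytes
        self.status = MemoryStatus.OK

    def needs_aggressive_pruning(self, usage_bytes):
        return usage_bytes >= self.soft_limit_bytes

    def update_current(self, usage_bytes):
        # Simplified model of the reported behaviour: the status is raised
        # to soft_limit but not lowered again within the running process.
        if self.needs_aggressive_pruning(usage_bytes):
            self.status = MemoryStatus.SOFT_LIMIT
        return self.status

    def update_proposed(self, usage_bytes):
        # Requested behaviour: the status follows the pruning requirement,
        # so it returns to ok without a process restart.
        if self.needs_aggressive_pruning(usage_bytes):
            self.status = MemoryStatus.SOFT_LIMIT
        else:
            self.status = MemoryStatus.OK
        return self.status


tracker = MemoryStatusTracker(soft_limit_bytes=100)
after_spike = tracker.update_current(120)     # usage crosses the threshold
after_drop = tracker.update_current(50)       # usage falls, status sticks
after_fix = tracker.update_proposed(50)       # status follows pruning need
```

With the proposed update rule, the status after the drop matches what a freshly restarted process would report for the same memory usage, which is the consistency this issue asks for.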
