Description
Issue
A race condition exists during task creation when multiple tasks are started in parallel in a space whose org or space memory quota would be exceeded by the combined tasks. Cloud Controller can then mark tasks as FAILED even though they have already been sent to Diego and are executing successfully.
Context
The issue stems from two separate quota validations:
- An initial validation occurs when the task is first created in a PENDING state. For parallel requests, this check can pass for multiple tasks before the quota is consumed.
- A second validation is triggered when the task state is updated to RUNNING after being submitted to Diego.
When tasks are created in parallel, a task can pass the first check and be sent to Diego, which executes it. However, by the time its state is updated to RUNNING, another parallel task may have already consumed the available quota. The second validation then fails, and Cloud Controller sets the task's state to FAILED, even though Diego executes the task successfully.
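The interleaving described above can be replayed with a small, deterministic simulation. This is only a sketch of the check-then-act pattern, not Cloud Controller code: the `Space`, `Task`, `create`, `dispatch`, and `mark_running` names are hypothetical, and it assumes that only RUNNING tasks count against the quota, which is what the description implies.

```python
from dataclasses import dataclass, field

QUOTA_MB = 1024  # the 1G space quota used in the reproduction steps


@dataclass
class Task:
    name: str
    memory_mb: int
    state: str = "NEW"
    dispatched: bool = False  # True once handed to the simulated Diego


@dataclass
class Space:
    quota_mb: int
    tasks: list = field(default_factory=list)

    def used_mb(self):
        # Assumption: only RUNNING tasks consume quota.
        return sum(t.memory_mb for t in self.tasks if t.state == "RUNNING")

    def fits(self, task):
        return self.used_mb() + task.memory_mb <= self.quota_mb


def create(space, task):
    """First validation: runs when the task is created as PENDING."""
    if not space.fits(task):
        raise ValueError("memory_in_mb exceeds space memory quota")
    task.state = "PENDING"
    space.tasks.append(task)


def dispatch(task):
    """Stand-in for submitting the task to Diego, which starts executing it."""
    task.dispatched = True


def mark_running(space, task):
    """Second validation: runs on the PENDING -> RUNNING update."""
    # A failure here does not recall the work that was already dispatched.
    task.state = "RUNNING" if space.fits(task) else "FAILED"


def simulate():
    space = Space(QUOTA_MB)
    task1, task2 = Task("task1", 600), Task("task2", 600)
    # Both requests pass the first check because nothing is RUNNING yet.
    create(space, task1)
    create(space, task2)
    dispatch(task1)
    dispatch(task2)
    # The state updates arrive one after the other.
    mark_running(space, task1)  # 0 + 600 <= 1024
    mark_running(space, task2)  # 600 + 600 > 1024
    return task1, task2


if __name__ == "__main__":
    for t in simulate():
        print(t.name, t.state, "dispatched" if t.dispatched else "not dispatched")
```

In this model `task2` ends up FAILED while still flagged as dispatched, which mirrors the mismatch between the Cloud Controller state and what Diego actually runs.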
Steps to Reproduce
Create a space with quota and assign it
cf create-space task-race-condition-test -o <some org>
# push a dummy app and stop it
cf create-space-quota repro-quota -m 1G -a 5
cf set-space-quota task-race-condition-test repro-quota
Run Two Tasks in Parallel
COMMAND="for i in {1..12}; do echo \"Task is running at \$(date)\"; sleep 5; done"
APP_NAME="task-app"
cf run-task "$APP_NAME" --command "$COMMAND" -m 600M --name task1 & cf run-task "$APP_NAME" --command "$COMMAND" -m 600M --name task2 &
Expected Result
The second task should not be forwarded to Diego as it will exceed the memory quota.
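One way to picture the expected behaviour is to make the quota check and the reservation a single atomic step that runs before anything is forwarded. The sketch below is an illustration only, using an in-process lock and a hypothetical `QuotaLedger` class; it is not a proposed Cloud Controller patch, and a real fix would presumably need an equivalent guarantee at the database level.

```python
import threading


class QuotaLedger:
    """Hypothetical ledger that checks and reserves memory atomically."""

    def __init__(self, quota_mb):
        self._quota_mb = quota_mb
        self._reserved_mb = 0
        self._lock = threading.Lock()

    def try_reserve(self, memory_mb):
        # Check and reservation happen under one lock, so two parallel
        # requests can never both observe the same free capacity.
        with self._lock:
            if self._reserved_mb + memory_mb > self._quota_mb:
                return False
            self._reserved_mb += memory_mb
            return True


def run_parallel(quota_mb, requests):
    """Submit all requests from separate threads; return name -> admitted."""
    ledger = QuotaLedger(quota_mb)
    admitted = {}

    def submit(name, memory_mb):
        # Only an admitted task would be forwarded to Diego.
        admitted[name] = ledger.try_reserve(memory_mb)

    threads = [
        threading.Thread(target=submit, args=(name, mb))
        for name, mb in requests.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return admitted


if __name__ == "__main__":
    print(run_parallel(1024, {"task1": 600, "task2": 600}))
```

Which of the two tasks is admitted depends on thread scheduling, but exactly one of them is, and the rejected one is never dispatched.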
Current Result
Run Task
cf run-task "$APP_NAME" --command "$COMMAND" -m 600M --name task1 & cf run-task "$APP_NAME" --command "$COMMAND" -m 600M --name task2 &
Creating task for app task-app in org <org> / space task-race-condition-test ...
Creating task for app task-app in org <org> / space task-race-condition-test ...
Task has been submitted successfully for execution.
OK
task name: task1
task id: 3
OK
memory_in_mb exceeds space memory quota
Task Status
cf tasks task-app
Getting tasks for app task-app in org <org> / space task-race-condition-test as ...
id name state start time command
4 task2 FAILED Fri, 24 Oct 2025 14:54:41 UTC for i in {1..12}; do echo "Task is running at $(date)"; sleep 5; done
3 task1 RUNNING Fri, 24 Oct 2025 14:54:41 UTC for i in {1..12}; do echo "Task is running at $(date)"; sleep 5; done
Task Status After Completion
cf tasks task-app
Getting tasks for app task-app in org <org> / space task-race-condition-test as ...
id name state start time command
4 task2 FAILED Fri, 24 Oct 2025 14:54:41 UTC for i in {1..12}; do echo "Task is running at $(date)"; sleep 5; done
3 task1 SUCCEEDED Fri, 24 Oct 2025 14:54:41 UTC for i in {1..12}; do echo "Task is running at $(date)"; sleep 5; done
CF Logs Output
2025-10-24T16:54:45.33+0200 [APP/TASK/task1/0] OUT Invoking pre-start scripts.
2025-10-24T16:54:45.37+0200 [APP/TASK/task1/0] OUT Invoking start command.
2025-10-24T16:54:45.37+0200 [APP/TASK/task1/0] OUT Task is running at Fri Oct 24 02:54:45 PM UTC 2025
2025-10-24T16:54:45.49+0200 [APP/TASK/task2/0] OUT Invoking pre-start scripts.
2025-10-24T16:54:45.52+0200 [APP/TASK/task2/0] OUT Invoking start command.
2025-10-24T16:54:45.53+0200 [APP/TASK/task2/0] OUT Task is running at Fri Oct 24 02:54:45 PM UTC 2025
2025-10-24T16:54:50.38+0200 [APP/TASK/task1/0] OUT Task is running at Fri Oct 24 02:54:50 PM UTC 2025
2025-10-24T16:54:50.53+0200 [APP/TASK/task2/0] OUT Task is running at Fri Oct 24 02:54:50 PM UTC 2025
2025-10-24T16:54:55.38+0200 [APP/TASK/task1/0] OUT Task is running at Fri Oct 24 02:54:55 PM UTC 2025
2025-10-24T16:54:55.53+0200 [APP/TASK/task2/0] OUT Task is running at Fri Oct 24 02:54:55 PM UTC 2025
2025-10-24T16:55:00.38+0200 [APP/TASK/task1/0] OUT Task is running at Fri Oct 24 02:55:00 PM UTC 2025
2025-10-24T16:55:00.54+0200 [APP/TASK/task2/0] OUT Task is running at Fri Oct 24 02:55:00 PM UTC 2025
2025-10-24T16:55:05.39+0200 [APP/TASK/task1/0] OUT Task is running at Fri Oct 24 02:55:05 PM UTC 2025
2025-10-24T16:55:05.54+0200 [APP/TASK/task2/0] OUT Task is running at Fri Oct 24 02:55:05 PM UTC 2025
2025-10-24T16:55:10.39+0200 [APP/TASK/task1/0] OUT Task is running at Fri Oct 24 02:55:10 PM UTC 2025
2025-10-24T16:55:10.55+0200 [APP/TASK/task2/0] OUT Task is running at Fri Oct 24 02:55:10 PM UTC 2025
2025-10-24T16:55:15.39+0200 [APP/TASK/task1/0] OUT Task is running at Fri Oct 24 02:55:15 PM UTC 2025
2025-10-24T16:55:15.55+0200 [APP/TASK/task2/0] OUT Task is running at Fri Oct 24 02:55:15 PM UTC 2025
2025-10-24T16:55:20.40+0200 [APP/TASK/task1/0] OUT Task is running at Fri Oct 24 02:55:20 PM UTC 2025
2025-10-24T16:55:20.55+0200 [APP/TASK/task2/0] OUT Task is running at Fri Oct 24 02:55:20 PM UTC 2025
2025-10-24T16:55:25.40+0200 [APP/TASK/task1/0] OUT Task is running at Fri Oct 24 02:55:25 PM UTC 2025
2025-10-24T16:55:25.56+0200 [APP/TASK/task2/0] OUT Task is running at Fri Oct 24 02:55:25 PM UTC 2025
2025-10-24T16:55:30.41+0200 [APP/TASK/task1/0] OUT Task is running at Fri Oct 24 02:55:30 PM UTC 2025
2025-10-24T16:55:30.56+0200 [APP/TASK/task2/0] OUT Task is running at Fri Oct 24 02:55:30 PM UTC 2025
2025-10-24T16:55:35.41+0200 [APP/TASK/task1/0] OUT Task is running at Fri Oct 24 02:55:35 PM UTC 2025
2025-10-24T16:55:35.56+0200 [APP/TASK/task2/0] OUT Task is running at Fri Oct 24 02:55:35 PM UTC 2025
2025-10-24T16:55:40.41+0200 [APP/TASK/task1/0] OUT Task is running at Fri Oct 24 02:55:40 PM UTC 2025
2025-10-24T16:55:40.57+0200 [APP/TASK/task2/0] OUT Task is running at Fri Oct 24 02:55:40 PM UTC 2025
2025-10-24T16:55:45.42+0200 [APP/TASK/task1/0] OUT Exit status 0
2025-10-24T16:55:45.57+0200 [APP/TASK/task2/0] OUT Exit status 0
Possible Fix
No response