New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fetch active task payloads from memory #15377
Fetch active task payloads from memory #15377
Conversation
} | ||
|
||
@Nullable | ||
Task getActiveTask(String taskId) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of Nullable
, this can return an Optional
.
@@ -104,7 +104,14 @@ public List<TaskStatusPlus> getTaskStatusPlusList( | |||
|
|||
public Optional<Task> getTask(final String taskid) | |||
{ | |||
return storage.getTask(taskid); | |||
// Try to fetch active task from memory | |||
final Task activeTask = taskLockbox.getActiveTask(taskid); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought this should always be called under the giant lock no ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I've ensured that the call happens within a lock in the new TaskQueue method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be nice to have but I am not sure if it is okay to keep the payload of all active tasks in memory. This could significantly increase the memory usage of the Overlord e.g. in case of many concurrent streaming tasks.
@kfaraz thank you for your feedback. Instead of maintaining a separate map which could increase the memory usage of the Overlord, this patch now utilizes the map of active tasks that already exists in the TaskQueue |
@@ -947,6 +947,25 @@ public CoordinatorRunStats getQueueStats() | |||
return stats; | |||
} | |||
|
|||
public Optional<Task> getTask(String id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better for this method to return only active tasks, i.e. it shouldn't have to query the underlying storage on behalf of the caller. As such, this method should be renamed to getActiveTask
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
The TaskQueue maintains a map of active task ids to tasks, which can be utilized to get active task payloads, before falling back to the metadata store.
The TaskQueue maintains a map of active task ids to tasks, which can be utilized to get active task payloads, before falling back to the metadata store.
The TaskQueue maintains a map of active task ids to tasks, which can be utilized to get active task payloads, before falling back to the metadata store.
This PR has: