Use base task dir in kubernetes task runner #13880
…e-task-dir-in-kubernetes-task-scheduler
I don't understand. Is there any material impact this change will have? For the sake of consistency, we prefer that the k8s task runner use the same interface as other runners.
I see the benefits of this change to be twofold.
Reduce confusion: As a Druid contributor, I would be confused why this concept exists within the …
Reduce state that needs to be tracked by the KubernetesTaskRunner
Alternatively, we could add a new concept to the …
Another alternative would be to introduce a new interface and have two different implementations (one that tracks directories and another that returns the base task dir); however, this seemed to be overkill. In the end I decided to leave task dir management policy up to the implementation of the …
gianm
left a comment
Thanks for the explanation. The logic makes sense to me. Please have a look at the comment about using getBaseTaskDirPaths instead.
generateCommand(task),
javaOpts(task),
dirTracker.getTaskDir(task.getId()),
new File(taskConfig.getBaseTaskDirPath()),
getBaseTaskDirPath is for JSON backwards compat; it may be null and isn't used elsewhere. Better to do getBaseTaskDirPaths().get(0). Please also add @Deprecated to TaskConfig.getBaseTaskDirPath to make it clearer to others that this method shouldn't be used.
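The suggestion above can be sketched roughly as follows. This is a minimal illustration, not Druid's actual code: `TaskConfigSketch`, `BaseTaskDirExample`, and `resolveBaseTaskDir` are hypothetical names, and the empty-list check is an assumption added for safety rather than something the review comment asks for.

```java
import java.io.File;
import java.util.List;

// Hypothetical stand-in for Druid's TaskConfig; only the two getters
// mentioned in the review comment are modeled here.
class TaskConfigSketch {
    private final String baseTaskDirPath;        // legacy single path, may be null
    private final List<String> baseTaskDirPaths; // preferred list of base dirs

    TaskConfigSketch(String baseTaskDirPath, List<String> baseTaskDirPaths) {
        this.baseTaskDirPath = baseTaskDirPath;
        this.baseTaskDirPaths = List.copyOf(baseTaskDirPaths);
    }

    /** Kept only for JSON backwards compatibility; may return null. */
    @Deprecated
    String getBaseTaskDirPath() {
        return baseTaskDirPath;
    }

    List<String> getBaseTaskDirPaths() {
        return baseTaskDirPaths;
    }
}

class BaseTaskDirExample {
    // Resolve the base task dir from the first configured path instead of
    // the deprecated, possibly-null single-path getter.
    static File resolveBaseTaskDir(TaskConfigSketch config) {
        List<String> paths = config.getBaseTaskDirPaths();
        if (paths.isEmpty()) {
            // Assumption: an empty list is treated as a configuration error.
            throw new IllegalStateException("no base task dir configured");
        }
        return new File(paths.get(0));
    }

    public static void main(String[] args) {
        TaskConfigSketch config = new TaskConfigSketch(null, List.of("/tmp/druid/task"));
        System.out.println(resolveBaseTaskDir(config));
    }
}
```

Marking the legacy getter `@Deprecated` makes the compiler warn at any new call site, which is the signal to other contributors that the review comment is asking for.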
…e-task-dir-in-kubernetes-task-scheduler
…d baseTaskDirPath
...es-overlord-extensions/src/main/java/org/apache/druid/k8s/overlord/KubernetesTaskRunner.java
Fixed
* Use TaskConfig to get task dir in KubernetesTaskRunner
* Use the first path specified in baseTaskDirPaths instead of deprecated baseTaskDirPath
* Use getBaseTaskDirPaths in generate command
Use base task directory from task config in kubernetes task scheduler
Get the base task directory from TaskConfig instead of TaskStorageDirTracker
The Kubernetes task runner runs each task within its own machine context (container). Given this, there is no need to allocate and deallocate a directory per task because only one task will ever use the directory, and in the case of the TaskStorageDirTracker only one of the specified directories will ever be used. More info here
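To illustrate the reasoning above, here is a small sketch under stated assumptions. `DirTrackerSketch` is a hypothetical round-robin tracker, not Druid's `TaskStorageDirTracker`, whose real allocation policy may differ. The point is only that when every task gets a fresh container (and so a fresh tracker), the tracker always hands out the same directory, so reading the first configured base dir gives the same answer with no state to track.

```java
import java.io.File;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical tracker (NOT Druid's TaskStorageDirTracker): assigns base
// dirs round-robin and remembers which task holds which directory.
class DirTrackerSketch {
    private final List<File> baseDirs;
    private final Map<String, File> assigned = new HashMap<>();
    private int next = 0;

    DirTrackerSketch(List<File> baseDirs) {
        this.baseDirs = List.copyOf(baseDirs);
    }

    synchronized File allocate(String taskId) {
        File existing = assigned.get(taskId);
        if (existing != null) {
            return existing;
        }
        File dir = baseDirs.get(next % baseDirs.size());
        next++;
        assigned.put(taskId, dir);
        return dir;
    }

    synchronized void release(String taskId) {
        assigned.remove(taskId);
    }
}

class OneTaskPerContainerExample {
    // One container per task: each task sees a brand-new tracker.
    static File dirViaFreshTracker(List<File> baseDirs, String taskId) {
        return new DirTrackerSketch(baseDirs).allocate(taskId);
    }

    // The stateless alternative: just take the first configured base dir.
    static File dirViaConfig(List<File> baseDirs) {
        return baseDirs.get(0);
    }

    public static void main(String[] args) {
        List<File> dirs = List.of(new File("/data1"), new File("/data2"));
        for (String taskId : List.of("task-a", "task-b", "task-c")) {
            boolean same = dirViaFreshTracker(dirs, taskId).equals(dirViaConfig(dirs));
            System.out.println(taskId + " same dir as config: " + same);
        }
    }
}
```

In a long-lived process that runs many tasks, the shared tracker would spread tasks across directories; in the one-task-per-container model that spreading never happens, which is the case the description makes for dropping the tracker.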
Release note
Key changed/added classes in this PR
KubernetesTaskRunner
This PR has: