Problem
In `gen_subtask_graph`, Mars always creates new out chunks, even when the out chunk already exists. This costs a lot of time when there are many chunks. In fact, there is no need to create new out chunks except for `FetchChunk`.
I compared two runs: one that creates new out chunks and one that does not. The time cost dropped from 122.92s to 56.63s, a reduction of about 53.93%.
So we should optimize `SubtaskGraph` generation to make it more efficient.
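
To make the idea concrete, here is a minimal sketch of the two strategies. It is not Mars code: `Chunk`, `OutChunkBuilder`, and the `reuse_existing` flag are hypothetical stand-ins, and the real `gen_subtask_graph` logic may differ. It only illustrates reusing an existing out chunk and creating a new one solely for fetch chunks.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Chunk:
    """Hypothetical stand-in for a chunk; not the real Mars class."""
    key: str
    is_fetch: bool = False


class OutChunkBuilder:
    """Hypothetical helper that counts how many out chunks get created."""

    def __init__(self, reuse_existing: bool) -> None:
        self.reuse_existing = reuse_existing
        self.created = 0

    def out_chunk(self, chunk: Chunk) -> Chunk:
        # Assumed optimization: reuse the existing chunk unless it is a fetch chunk.
        if self.reuse_existing and not chunk.is_fetch:
            return chunk
        # Otherwise build a fresh copy (the current, always-create behavior).
        self.created += 1
        return Chunk(chunk.key, chunk.is_fetch)

    def build(self, chunks: List[Chunk]) -> List[Chunk]:
        return [self.out_chunk(c) for c in chunks]


chunks = [Chunk("a"), Chunk("b"), Chunk("c"), Chunk("fetch-a", is_fetch=True)]

baseline = OutChunkBuilder(reuse_existing=False)
optimized = OutChunkBuilder(reuse_existing=True)
baseline_out = baseline.build(chunks)
optimized_out = optimized.build(chunks)

print(baseline.created, optimized.created)  # prints "4 1"
```

In this toy setup the saving comes purely from skipping object construction for non-fetch chunks; how much of the measured 122.92s to 56.63s improvement is due to that in Mars itself depends on the actual chunk creation cost.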