TAJO-1950: Query master uses too much memory during range shuffle #884
jihoonson wants to merge 43 commits into apache:master
|
This patch is not ready for review. I need more tests including CI test. |
…into TAJO-1950
…into TAJO-1950 Conflicts: tajo-core/src/main/java/org/apache/tajo/worker/TaskImpl.java tajo-pullserver/src/main/java/org/apache/tajo/pullserver/TajoPullServerService.java
|
Hi, this patch is ready for review.
|
|
I've additionally added some code to clean up the index cache upon execution block (eb) completion. |
It should be future.channel().close()
|
@jinossy thanks for your review. |
|
I see, please go ahead. |
|
@jinossy thanks! I've updated my patch. |
|
Looks good to me! |
|
I have a concern about the maximum HTTP request length being too long. It may make the granularity of a repeated fetch too coarse. |
|
@hyunsik @jihoonson |
|
+1 LGTM! |
|
@jinossy thanks for your review. I'm testing this patch against a 10TB dataset. I'll share the result when it finishes and then commit this patch. |
|
Well, some issues remain around fetch timeout. When I tested this patch against a 10TB dataset, a lot of fetch timeouts occurred while transferring intermediate data between stages. The main reason appears to be that index lookup takes a long time (over 30 seconds on a cache miss). So, I think the fundamental solution is to improve index search performance, which needs to be handled in another JIRA issue. |
|
I've decreased the default maximum URL length to 1KB. Maybe we can find better-tuned values for the maximum URL length and the fetch timeout. |
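To illustrate why the maximum URL length controls fetch granularity, here is a minimal sketch, not the code in this patch. It assumes a fetch request carries a comma-separated list of task ids in its query string. The class `FetchUrlSplitter`, the method `split`, and the `ids` parameter name are all hypothetical. A smaller limit yields more requests, each covering fewer tasks, so a timed-out request repeats less work.

```java
import java.util.ArrayList;
import java.util.List;

public class FetchUrlSplitter {

    /**
     * Hypothetical helper: packs task ids into request URLs so that each URL
     * stays within maxUrlLength characters. A single id that does not fit on
     * its own still gets its own URL, which may then exceed the limit.
     */
    static List<String> split(String baseUrl, List<String> taskIds, int maxUrlLength) {
        List<String> urls = new ArrayList<>();
        StringBuilder current = new StringBuilder(baseUrl);
        boolean empty = true; // true while the current URL holds no ids yet

        for (String id : taskIds) {
            // Length this id would add, including the comma separator if needed.
            int extra = id.length() + (empty ? 0 : 1);
            if (!empty && current.length() + extra > maxUrlLength) {
                // The current URL is full: emit it and start a new one.
                urls.add(current.toString());
                current = new StringBuilder(baseUrl);
                empty = true;
            }
            if (!empty) {
                current.append(',');
            }
            current.append(id);
            empty = false;
        }
        if (!empty) {
            urls.add(current.toString());
        }
        return urls;
    }

    public static void main(String[] args) {
        // Assumed, illustrative values only.
        String base = "http://host:8080/?ids=";
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            ids.add(i + "_0");
        }
        // Compare how many requests a 1KB limit produces versus a larger one.
        System.out.println("1KB limit:  " + split(base, ids, 1024).size() + " requests");
        System.out.println("16KB limit: " + split(base, ids, 16 * 1024).size() + " requests");
    }
}
```

With a sketch like this, the trade-off is visible: a larger limit means fewer requests but more data to re-fetch per timeout, while a smaller limit means the opposite.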
|
@jihoonson |
|
@jinossy thanks for your comment. I'll commit shortly. |