Commit
IMPALA-4840: Fix REFRESH performance regression.
The fix for IMPALA-4172 introduced a performance regression in the REFRESH command. The regression stems from reloading the block metadata of every valid data file without considering whether the file has changed since the last load. This caused unnecessary metadata loads for unchanged files and thus increased the runtime.

The fix makes the refresh codepath (and other operations that use the same codepath, such as insert) reload the metadata of only the modified files by doing a listStatus() on the partition directory and checking the last modified time of each file. Without this patch, we relied on listFiles(), which fetched the block locations irrespective of whether the file had changed and was significantly slower on unchanged tables. The initial/invalidate metadata load still fetches the block locations in bulk using listFiles().

The side effect of this change is that refresh no longer picks up block location changes after HDFS block rebalancing. For that case we suggest using "invalidate metadata", which loads the metadata from scratch.

Additionally, this commit enables the reuse of metadata during table refresh (which was disabled in IMPALA-4172) to avoid reloading metadata from HMS every time.

Change-Id: I859b9fe93563ba886d0b5db6db42a14c88caada8
Reviewed-on: http://gerrit.cloudera.org:8080/6009
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: Impala Public Jenkins
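The incremental reload described in the commit message can be modelled roughly as follows. This is a minimal in-memory sketch in Python, not Impala's actual Java code: `FileStatus`, `FileDescriptor`, `refresh_partition`, and the `fetch_block_locations` callback are hypothetical names introduced for illustration. The `listing` argument stands in for the result of a listStatus() call on the partition directory. The sketch also assumes that files missing from the listing are dropped from the cache.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class FileStatus:
    """Hypothetical stand-in for one entry returned by a directory listing."""
    name: str
    length: int
    modification_time: int  # e.g. milliseconds since the epoch


@dataclass
class FileDescriptor:
    """Hypothetical cached metadata for one data file."""
    name: str
    length: int
    modification_time: int
    block_locations: List[str]


def refresh_partition(
    cached: Dict[str, FileDescriptor],
    listing: Iterable[FileStatus],
    fetch_block_locations: Callable[[FileStatus], List[str]],
) -> Dict[str, FileDescriptor]:
    """Rebuild the descriptors of one partition.

    Block locations are fetched only for files that are new or whose
    modification time differs from the cached value. Unchanged files
    reuse their cached descriptor as-is.
    """
    refreshed: Dict[str, FileDescriptor] = {}
    for status in listing:
        old = cached.get(status.name)
        if old is not None and old.modification_time == status.modification_time:
            # Unchanged since the last load: skip the expensive lookup.
            refreshed[status.name] = old
        else:
            # New or modified file: pay for a block location fetch.
            refreshed[status.name] = FileDescriptor(
                status.name,
                status.length,
                status.modification_time,
                fetch_block_locations(status),
            )
    # Files absent from the listing are not copied over (assumption).
    return refreshed


if __name__ == "__main__":
    fetched: List[str] = []

    def fake_fetch(status: FileStatus) -> List[str]:
        fetched.append(status.name)
        return ["host-1"]

    cache = {
        "a.parq": FileDescriptor("a.parq", 10, 100, ["host-0"]),
        "b.parq": FileDescriptor("b.parq", 20, 200, ["host-0"]),
    }
    listing = [
        FileStatus("a.parq", 10, 100),  # unchanged
        FileStatus("b.parq", 25, 250),  # modified
        FileStatus("c.parq", 30, 300),  # new
    ]
    refresh_partition(cache, listing, fake_fetch)
    print(fetched)  # ['b.parq', 'c.parq']
```

The sketch also shows the side effect noted above: a file whose blocks were moved by HDFS rebalancing keeps the same modification time, so its cached block locations are reused and stay stale until the metadata is loaded from scratch.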
Bharath Vissapragada authored and Impala Public Jenkins committed Feb 16, 2017
1 parent bd1d445, commit 26eaa26
Showing 2 changed files with 110 additions and 32 deletions.