-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-23679 Use new FileSystem objects during bulk loads #1019
Conversation
Undoes lots of fanciness about FileSystem caching because of an explicit bug that was present (FileSystem only closed on one out of N regionservers, and not every time), but also because the FS caching itself is dodgy and prone to error. Each BulkLoad request will now get its own FileSystem instance that it is responsible for closing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 pending qabot
Looks like this isn't quite sufficient. Another leak (albeit, much slower) coming here. Need to do more to push down that DFS instance we made and use that until we move the files into the final location. Added some debug to FileSystem.java to see the above. Testing is just done via IntegrationTestBulkLoad with high number of loops but small chain length. |
💔 -1 overall
This message was automatically generated. |
Ugh, there's a few of these where, down in HStore, HRegion, and even the WAL code (ugh), which is all invoked via bulk load where we do a
The fix is to just push down the FileSystem or use one that is already created, but this gets tricky in some places. Will need to step back from this all and see if there's a better way to do this. |
Thinking about this before I leave it for the night -- my initial thought was that as long as, in SecureBulkLoadManager, we managed the lifecycle of our FileSystem correctly, we'd be fine. But, the reality is that other parts of HBase are (inadvertently) relying on a FileSystem being cached. The smaller fix may be to re-add the caching, keep the reference counting, but correct the clean-up logic. |
Closing in favor of #1029 |
Undoes lots of fanciness about FileSystem caching because of
an explicit bug that was present (FileSystem only closed on one
out of N regionservers, and not every time), but also because
the FS caching itself is dodgy and prone to error.
Each BulkLoad request will now get its own FileSystem instance
that it is responsible for closing.
The change to HRegion is because ${hbase.rootdir}/data is
chmod 700
which means that a normal user cannot get the size of those files (you'll see lots of AccessDenied exceptions in the RS log). Using HRegionFilesystem instead keeps this from being an issue (reading the filesize as HBase instead of the user).