Add caching file system to hive connector #13904

Merged 1 commit on Jan 8, 2020

@jainxrohit commented Dec 30, 2019

== RELEASE NOTES ==

Hive Changes
* Allow reading data from HDFS while caching the fetched data on local disks. Turn on the feature by specifying the cache directory config `cache.base-directory`.

@highker highker left a comment

some comments

@highker highker left a comment

Could you remove the period in the commit title? https://chris.beams.io/posts/git-commit/ is a good commit message guideline.

@jainxrohit changed the title from "Add caching file system to hive connector." to "Add caching file system to hive connector" on Jan 1, 2020

@jainxrohit commented:

> Could you remove the period in the commit title? https://chris.beams.io/posts/git-commit/ is a good commit message guideline.

Nice article, fixed the commit message.

@shixuan-fan shixuan-fan left a comment

LGTM % nits

@jainxrohit changed the title from "Add caching file system to hive connector" to "[WIP] Add caching file system to hive connector" on Jan 2, 2020

@highker commented Jan 6, 2020

The test failure is due to the permission/auth settings. Try overriding the following method in `CachingFileSystem`:

    @Override
    public void setPermission(Path path, FsPermission permission)
            throws IOException
    {
        dataTier.setPermission(path, permission);
    }

In general, though, I would suggest overriding all of the default methods inherited from `FileSystem`. `FilterFileSystem` is a good example.
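
To illustrate why forwarding every inherited default matters, here is a minimal, self-contained sketch. `MiniFileSystem`, `InMemoryFileSystem`, `IncompleteWrapper`, and `ForwardingWrapper` are hypothetical stand-ins invented for this example; they are not Hadoop or Presto classes, and the real `FileSystem` has many more methods with different signatures. The sketch only assumes that the base class ships default method bodies that a wrapper inherits unless it overrides them.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a FileSystem-like base class: read() is abstract,
// while the permission methods have default bodies that subclasses inherit.
abstract class MiniFileSystem
{
    abstract String read(String path);

    // Default is a silent no-op; a wrapper that does not override it drops the call.
    void setPermission(String path, String permission)
    {
    }

    String getPermission(String path)
    {
        return "default";
    }
}

// Hypothetical in-memory data tier that actually records permissions.
class InMemoryFileSystem
        extends MiniFileSystem
{
    private final Map<String, String> permissions = new HashMap<>();

    @Override
    String read(String path)
    {
        return "data:" + path;
    }

    @Override
    void setPermission(String path, String permission)
    {
        permissions.put(path, permission);
    }

    @Override
    String getPermission(String path)
    {
        return permissions.getOrDefault(path, "default");
    }
}

// Incomplete wrapper: only the abstract method is forwarded, so permission
// calls hit the base-class defaults and never reach the data tier.
class IncompleteWrapper
        extends MiniFileSystem
{
    private final MiniFileSystem dataTier;

    IncompleteWrapper(MiniFileSystem dataTier)
    {
        this.dataTier = dataTier;
    }

    @Override
    String read(String path)
    {
        return dataTier.read(path);
    }
}

// Forwarding wrapper in the spirit of FilterFileSystem: every method,
// including those with default bodies, delegates to the wrapped instance.
class ForwardingWrapper
        extends MiniFileSystem
{
    private final MiniFileSystem dataTier;

    ForwardingWrapper(MiniFileSystem dataTier)
    {
        this.dataTier = dataTier;
    }

    @Override
    String read(String path)
    {
        return dataTier.read(path);
    }

    @Override
    void setPermission(String path, String permission)
    {
        dataTier.setPermission(path, permission);
    }

    @Override
    String getPermission(String path)
    {
        return dataTier.getPermission(path);
    }
}
```

With these classes, calling `setPermission` through `IncompleteWrapper` leaves the data tier's permission untouched, while the same call through `ForwardingWrapper` reaches it. That is the kind of silent mismatch a permission-related test failure can come from.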

@jainxrohit changed the title from "[WIP] Add caching file system to hive connector" to "Add caching file system to hive connector" on Jan 6, 2020

@highker highker left a comment

coding style comments

@highker highker left a comment

some comments

@highker highker left a comment

LGTM. @shixuan-fan, could you give it a final review and merge it?

@shixuan-fan shixuan-fan left a comment

LGTM, ideally we would want to have three commits:

  • Fixing caching file system
  • Raptor side change
  • Hive side change

But since it has already been reviewed, I won't bother breaking it down. I'll merge it once we've completed the internal repo pull request that adapts to this one.
