Implement WebHDFS to access remote HDFS storage #7828
Comments
WebHDFS appears to be a REST API for HDFS. Is that commonly deployed? How much use would it get? HttpFS appears to be compatible with WebHDFS, so one backend would cover both of them. So I think this would need a new backend. Maybe your organization would like to sponsor the development of such a thing?
I don't know what that means - can you explain?
WebHDFS is integrated with the HDFS namenode, so it is available with just a configuration switch. Exposing only WebHDFS is much safer and more controlled. Also, Apache Knox is an HTTP proxy that provides enhanced authentication/authorization on top of Hadoop web services, such as WebHDFS. Using Knox, your HDFS namenode can sit in your internal network, and you just expose the Knox proxy (in your DMZ). About sponsoring such a development, yes, that would be a possibility. Could you give me a rough estimate of how much it would cost? (we can discuss this in private)
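To make the difference concrete, here is a minimal sketch of how WebHDFS request URLs are formed, both directly against the namenode and through a Knox gateway. The host names, the port numbers, and the `default` topology name are assumptions chosen for illustration, and `webhdfs_url` is a hypothetical helper, not part of rclone or any Hadoop client library. Only the `/webhdfs/v1/<path>?op=<OPERATION>` layout comes from the WebHDFS REST API itself.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(base, path, op, **params):
    """Build a WebHDFS REST URL of the form <base>/webhdfs/v1/<path>?op=<OP>&...

    `base` is the scheme, host, and port, plus any gateway prefix.
    This helper is hypothetical and only illustrates the URL layout.
    """
    query = urlencode({"op": op, **params})
    return f"{base.rstrip('/')}/webhdfs/v1/{quote(path.lstrip('/'))}?{query}"


# Direct access: the client must be able to reach the namenode
# (assumed host; 9870 is the usual namenode HTTP port in Hadoop 3).
direct = webhdfs_url("http://namenode.example.internal:9870", "/data/logs", "LISTSTATUS")

# Through Knox: only the gateway is exposed
# (assumed host, port, and a topology named "default").
via_knox = webhdfs_url("https://knox.example.com:8443/gateway/default", "/data/logs", "LISTSTATUS")

print(direct)
print(via_knox)
```

In both cases the client speaks plain HTTP(S) to a single endpoint, which is why one backend could plausibly serve WebHDFS, HttpFS, and Knox-fronted clusters alike.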
That is good to know.
That is good to know as well. I have almost no practical experience with HDFS other than with the rclone backend - I don't have access to a real cluster, only the Docker test image we use!
Probably best to drop an email to sales@rclone.com and we can discuss. Thank you.
If you need access to a real Hadoop cluster, just let me know :)
Sure!
Hi,
I've been playing with the HDFS remote (with some problems, I've opened a bug report).
I'm wondering if there is any plan to add support for WebHDFS or HttpFS to allow access to remote Hadoop clusters (probably via Knox).
The current HDFS remote requires full network visibility of all the machines involved (namenode and datanodes), and this is not always available, for example when transferring from the "outside".