Support webhdfs in storageURI and storage spec #2077
Currently I have kerberos and hdfs python packages in the kserve sdk requirements.txt. This means all docker images need the Do you think it would be better to only install those packages and the I wonder if the storage module should be removed from the kserve sdk as it only seems to be used by the storage-initializer. That way storage-initializer dependencies are separate from the kserve sdk dependencies. Edit: I have now changed it to only install in the |
2612a16 to 04f0701
I updated some of the code relating to the storage spec - namely removing the |
I've updated it now to hopefully handle all cases. Instead of splitting the env var handling and bucket injection between the python code and the controller, I'm now doing it all in the controller like storageUri does. Could you take another look please? |
Thanks @markwinter. One of the reasons I had to split the logic between controller and the python server is that we don't want to expose credentials such as |
c072fd1 to a027ab8
@Tomcli |
thanks @markwinter |
@markwinter Can you rebase from your other PR? |
0e21b00 to 7037234
@yuzisun |
Hi @markwinter, just want to check whether this PR supports general WebHDFS with SSL/TLS. We have the exact requirement on WebHDFS, but we are not using the KerberosClient. We will need a cert and key as well as some headers to download our files through WebHDFS from our remote Hadoop-based storage. Is it possible to make the WebHDFS support more general if it is not supported in this PR? I can help with it if needed. Thanks.
Hey @lizzzcai, this PR is only for Kerberized clusters, but it would be good to support other methods of connecting to HDFS as well. Could you create an issue with the requirements and maybe an example of how you download a file? In this PR I've used hdfscli to download files. We can then take a look at supporting it.
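For illustration, one way the cert, key, and header requirement could be met with hdfscli is to hand the client a preconfigured `requests` session. This is a minimal sketch under stated assumptions, not necessarily what this PR implements: `build_session_settings` and `download_with_tls` are hypothetical helper names, and the paths and header values are placeholders.

```python
def build_session_settings(cert_path, key_path, headers=None):
    """Collect TLS client-cert and header settings (hypothetical helper)."""
    return {
        "cert": (cert_path, key_path),   # client certificate and private key
        "headers": dict(headers or {}),  # extra HTTP headers to send
    }


def download_with_tls(namenode_url, hdfs_path, local_path, settings):
    """Download hdfs_path to local_path over WebHDFS using a custom session."""
    # Imported lazily so the helper above can be used without these packages.
    import requests
    from hdfs import Client  # hdfscli's generic WebHDFS client

    session = requests.Session()
    session.cert = settings["cert"]
    session.headers.update(settings["headers"])

    # hdfscli clients accept a requests session to use for all calls.
    client = Client(namenode_url, session=session)
    return client.download(hdfs_path, local_path, overwrite=True)
```

A caller would build the settings once and pass them in, for example `download_with_tls("https://namenode:port", "/models/a", "/mnt/models", settings)`, where the URL and paths are placeholders.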
Hi @markwinter , thanks, I am using the same |
bbb1709 to 4e3a39c
This PR now handles general WebHDFS support for storageURI and storage spec |
Hi @markwinter , I have tried it out and it is working fine for my use cases. One small feedback as I saw you encode the Another thing is for the protocol, should we name it as These are just some small suggestions but overall the code is working on my side! Thanks a lot! |
@lizzzcai When using storageUri e.g. When using the storage spec (not yet released), the user has to write a json configuration in a secret called
Yes this might make more sense |
Thanks for your clarification. I have tested the BTW, I saw that this PR is under KServe |
Thanks for testing that method as well 👍 I think 0.9 is planned for mid May |
Signed-off-by: Mark Winter <mark.winter@navercorp.com>
Signed-off-by: Mark Winter <mark.winter@navercorp.com>
Hi @markwinter , for the protocol, can we update it to |
@lizzzcai |
Signed-off-by: Mark Winter <mark.winter@navercorp.com>
…rve into storage-init-hdfs-support
For now I have added support for both Reasoning: @lizzzcai would prefer the |
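The commit list for this PR mentions supporting the `webhdfs://` scheme alongside `hdfs://`. Accepting both could look roughly like the sketch below. `parse_hdfs_uri` is a hypothetical name, and the sketch assumes everything after the scheme is treated as the HDFS path because the namenode address comes from separate configuration.

```python
from urllib.parse import urlparse

# Schemes treated as WebHDFS-backed storage URIs in this sketch.
HDFS_SCHEMES = ("hdfs", "webhdfs")


def parse_hdfs_uri(storage_uri: str) -> str:
    """Return the HDFS path for a storage URI, or raise ValueError."""
    parsed = urlparse(storage_uri)
    if parsed.scheme not in HDFS_SCHEMES:
        raise ValueError(f"unsupported scheme: {parsed.scheme!r}")
    # Assumption: the remainder after the scheme is the path, so the
    # "netloc" part is folded back into it rather than read as a host.
    return "/" + (parsed.netloc + parsed.path).lstrip("/")
```

Under this assumption, both `hdfs://user/models/a` and `webhdfs://user/models/a` resolve to the same path, so either spelling can be used.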
Thanks @markwinter ! Awesome work! /lgtm |
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: markwinter, yuzisun
* hdfs storage support
  Signed-off-by: Mark Winter <mark.winter@navercorp.com>
* fix merge
  Signed-off-by: Mark Winter <mark.winter@navercorp.com>
* check status to see if its a file or directory
  Signed-off-by: Mark Winter <mark.winter@navercorp.com>
* remove unused import
  Signed-off-by: Mark Winter <mark.winter@navercorp.com>
* Support webhdfs:// scheme too
  Signed-off-by: Mark Winter <mark.winter@navercorp.com>

Signed-off-by: alexagriffith <agriffith96@gmail.com>
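The "check status to see if its a file or directory" commit can be pictured with a small sketch. It relies on the WebHDFS FileStatus object carrying a `type` of `FILE` or `DIRECTORY`, which hdfscli returns from `client.status(path)`. `entries_to_download` is a hypothetical name, the listing is deliberately non-recursive, and this is not the PR's actual code.

```python
def entries_to_download(client, path):
    """Return the HDFS paths to fetch for `path` (one directory level only).

    `client` is any object exposing hdfscli-style `status(path)` and
    `list(path)` methods.
    """
    status = client.status(path)
    if status["type"] == "FILE":
        # A single file: download exactly this path.
        return [path]
    # A directory: build the full path of each direct child entry.
    base = path.rstrip("/")
    return [f"{base}/{name}" for name in client.list(path)]


class FakeClient:
    """Tiny stand-in for an hdfscli client, used only for the example."""

    def __init__(self, tree):
        self.tree = tree  # maps a directory path to a list of child names

    def status(self, path):
        kind = "DIRECTORY" if path in self.tree else "FILE"
        return {"type": kind}

    def list(self, path):
        return self.tree[path]
```

With `FakeClient({"/models": ["a.bin", "b.bin"]})`, calling `entries_to_download(client, "/models")` yields the two child paths, while a plain file path is returned unchanged.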
What this PR does / why we need it:
Adds support for `hdfs://` in the storageUri, and also in the new storage spec.
It requires a WebHDFS-enabled cluster. Optionally supports Kerberized clusters too.
See the included documentation file for full usage details.
Fixes #2135
Feature/Issue validation/testing:
Test on local cluster using storageUri
Created a secret like this and attached it to a service account:
kubectl create secret generic hdfscreds --from-file=HDFS_KEYTAB=./markwinter.keytab.b64 --from-literal=HDFS_NAMENODE="https://redacted:port" --from-literal=HDFS_PRINCIPAL="markwinter@redacted" -n markwinter
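The `.keytab.b64` file name in the command above suggests the keytab is base64-encoded before it goes into the secret. That is an assumption read from the file name, not something stated in this PR. Under that assumption, producing such a file could look like the sketch below, where `encode_keytab_bytes` and `write_encoded_keytab` are hypothetical helper names.

```python
import base64
from pathlib import Path


def encode_keytab_bytes(raw: bytes) -> str:
    """Base64-encode raw keytab bytes into an ASCII string."""
    return base64.b64encode(raw).decode("ascii")


def write_encoded_keytab(keytab_path: str, output_path: str) -> None:
    """Read a binary keytab and write its base64 form to output_path."""
    raw = Path(keytab_path).read_bytes()
    Path(output_path).write_text(encode_keytab_bytes(raw))
```

For example, `write_encoded_keytab("markwinter.keytab", "markwinter.keytab.b64")` would produce a file matching the name passed to `--from-file` above.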
Create an isvc
Test on local cluster using storage spec
storage-config