Bug Report
Description
When using WebHDFS as the remote backend, the documentation states that you can provide a user key (see https://dvc.org/doc/user-guide/data-management/remote-storage/hdfs#webhdfs-configuration-parameters).
However, if you try to set a user key, DVC reports that this key is not expected.
I am trying to connect to a WebHDFS server behind a proxy that requires basic auth. For this to work, I need at least the user key, plus a password key and a data_proxy key.
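To illustrate what these keys are needed for, here is a minimal sketch of the two mechanisms involved: building an HTTP Basic auth header from a user and password, and rewriting a datanode redirect URL through a proxy mapping. The function names, the host names (datanode1, proxy.example.com) and the credentials are hypothetical and only for illustration; this is not the fsspec or DVC implementation, and it assumes data_proxy is a plain mapping of URL prefixes.

```python
import base64


def basic_auth_header(user: str, password: str) -> dict[str, str]:
    """Build an HTTP Basic auth header (RFC 7617) from user and password."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def apply_data_proxy(location: str, data_proxy: dict[str, str]) -> str:
    """Rewrite a redirect URL so the request goes through the proxy.

    Assumption: data_proxy maps an internal URL prefix to its
    externally reachable replacement.
    """
    for internal, external in data_proxy.items():
        if location.startswith(internal):
            return external + location[len(internal):]
    # No matching prefix: leave the URL untouched.
    return location


# Hypothetical values, for illustration only.
headers = basic_auth_header("aaa", "secret")
redirect = apply_data_proxy(
    "http://datanode1:9864/webhdfs/v1/data/file",
    {"http://datanode1:9864": "https://proxy.example.com/dn1"},
)
print(headers)
print(redirect)
```

Without the user and password there is nothing to put in the Authorization header, and without the proxy mapping the client would try to follow redirects to datanode addresses that are not reachable from outside the proxy.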
I have already created a PR for the filesystem_spec for webhdfs (fsspec/filesystem_spec#1409) that adds support for basic authentication.
The DVC code that adds basic authentication support is already written. As soon as the filesystem_spec PR is merged, I will create a PR with the modifications for DVC.
Reproduce
- dvc init
- dvc remote add foobar webhdfs://server
- dvc remote modify foobar user aaa
Expected
No error
Environment information
Output of dvc doctor:
DVC version: 3.27.0 (pip)
-------------------------
Platform: Python 3.10.13 on Linux-5.15.0-87-generic-x86_64-with-glibc2.35
Subprojects:
dvc_data = 2.18.1
dvc_objects = 1.0.1
dvc_render = 0.6.0
dvc_task = 0.3.0
scmrepo = 1.4.0
Supports:
http (aiohttp = 3.8.6, aiohttp-retry = 2.8.3),
https (aiohttp = 3.8.6, aiohttp-retry = 2.8.3),
s3 (s3fs = 2023.9.2, boto3 = 1.28.17),
webhdfs (fsspec = 2023.9.2)
Config:
Global: /home/guido/.config/dvc
System: /home/guido/.config/kdedefaults/dvc
Cache types: <https://error.dvc.org/no-dvc-cache>
Caches: local
Remotes: webhdfs
Workspace directory: ext4 on /dev/sda2
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/05bc8ac7d6f64ba8a1c0383c98f12453
Additional Information (if any):
As mentioned above, I am waiting for the fsspec PR to be merged. After that, I will create a small PR to enable support for the new fsspec features in DVC.
I already have this working in a locally patched dvc + fsspec.
Related other PRs:
- PR to add support in dvc-webhdfs: "Add support for data proxy" (dvc-webhdfs#16)
- PR to update documentation: "webhdfs: add password and data_proxy_target" (dvc.org#4980)