push: can't use user key in webhdfs remote configuration #10062

@gdiepen

Bug Report

Description

When using WebHDFS as the remote backend, the documentation states that you can provide the user key (see https://dvc.org/doc/user-guide/data-management/remote-storage/hdfs#webhdfs-configuration-parameters).

However, if you try to provide a user key, DVC reports that this key is not expected.

I am trying to connect to a WebHDFS server behind a proxy that requires basic auth. For this to work, I need at least the user key, and also a password key and a data_proxy key.
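For context, here is a minimal sketch of what HTTP basic auth amounts to on the wire, which is why both a user and a password are needed. It uses only the Python standard library; the helper name `basic_auth_header` and the example credentials are hypothetical, and this is not the code from the fsspec PR or from DVC.

```python
import base64


def basic_auth_header(user: str, password: str) -> dict[str, str]:
    """Build an HTTP Basic Authorization header (RFC 7617).

    Hypothetical helper for illustration only; not part of DVC or fsspec.
    """
    # Credentials are joined as "user:password" and base64-encoded.
    raw = f"{user}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Example credentials are placeholders, not values from a real setup.
headers = basic_auth_header("aaa", "secret")
print(headers)
```

A proxy that enforces basic auth rejects any request that lacks this header, so a remote configuration that cannot carry the user (and password) cannot reach the WebHDFS server behind it.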

I have already created a PR for the filesystem_spec for webhdfs (fsspec/filesystem_spec#1409) that adds support for basic authentication.

I already have the DVC changes ready to support basic authentication. Once the filesystem_spec PR is merged, I will create a PR with those modifications for DVC.

Reproduce

  1. dvc init
  2. dvc remote add foobar webhdfs://server
  3. dvc remote modify foobar user aaa

Expected

No error

Environment information

Output of dvc doctor:

DVC version: 3.27.0 (pip)
-------------------------
Platform: Python 3.10.13 on Linux-5.15.0-87-generic-x86_64-with-glibc2.35
Subprojects:
        dvc_data = 2.18.1
        dvc_objects = 1.0.1
        dvc_render = 0.6.0
        dvc_task = 0.3.0
        scmrepo = 1.4.0
Supports:
        http (aiohttp = 3.8.6, aiohttp-retry = 2.8.3),
        https (aiohttp = 3.8.6, aiohttp-retry = 2.8.3),
        s3 (s3fs = 2023.9.2, boto3 = 1.28.17),
        webhdfs (fsspec = 2023.9.2)
Config:
        Global: /home/guido/.config/dvc
        System: /home/guido/.config/kdedefaults/dvc
Cache types: <https://error.dvc.org/no-dvc-cache>
Caches: local
Remotes: webhdfs
Workspace directory: ext4 on /dev/sda2
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/05bc8ac7d6f64ba8a1c0383c98f12453

Additional Information (if any):

As mentioned above, I am waiting for the fsspec PR to be merged. After that, I will create a small PR to enable support for the new fsspec features in DVC.

I already have this working in a locally patched dvc + fsspec.

Related other PRs:
