DVC complains about missing S3 even if I specify a local remote. #9264

Open
Erotemic opened this issue Mar 28, 2023 · 2 comments

@Erotemic
Contributor

Bug Report

I'm filing this report fairly quickly. I will submit an MWE if needed.

I have a DVC repo on my local system that contains an S3 remote. I executed the following steps:

  1. Create a fresh Docker container and mount my local drive with the DVC repo on it.
  2. Clone the DVC repo into the container.
  3. Run `dvc remote add host <path-to-mounted-repo>/.dvc/cache` so the new repo in the container can use the mounted repo's cache as a remote named `host`.
  4. Run `dvc pull -r host <path-to-file>`.

At this point DVC complains that S3 dependencies are not installed, which is true.

```
ERROR: unexpected error - s3 is supported, but requires 'dvc-s3' to be installed: No module named 'dvc_s3'
```

However, it should not need them. I'm only doing a local pull. If I remove the S3 remote from my .dvc/config then things work correctly.

I suppose DVC checks for S3 support if it sees that it has any S3 path in its remote, even if that remote isn't requested. That is a bug: it should only care about the requested remote.
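
To check which configured remotes could be involved, one option is to list every remote in the config file with the scheme of its URL. The sketch below assumes the usual `.dvc/config` layout, with sections named `['remote "<name>"']` and a `url = ...` entry. The sample config and the `list_remote_schemes` helper are hypothetical and only mirror the setup described in this report.

```shell
#!/bin/sh
# Hypothetical sample config mirroring this report: one S3 remote, one local remote.
config=$(mktemp)
cat > "$config" <<'EOF'
[core]
    autostage = true
['remote "aws"']
    url = s3://foo-bar-bucket/subbucket
['remote "host"']
    url = /host-repo/.dvc/cache
EOF

# Print "<remote name> <scheme>" for each remote; URLs without a scheme count as local.
list_remote_schemes() {
    awk '
        /^\[/ {
            # A new section starts; remember its name only if it is a remote section.
            name = ""
            if (match($0, /remote "[^"]+"/)) {
                name = substr($0, RSTART + 8, RLENGTH - 9)
            }
            next
        }
        name != "" && $1 == "url" && $2 == "=" {
            scheme = "local"
            if (match($3, /^[a-z0-9+]+:\/\//)) {
                scheme = substr($3, 1, RLENGTH - 3)
            }
            print name, scheme
        }
    ' "$1"
}

list_remote_schemes "$config"
rm -f "$config"
```

In a real repo the function would be pointed at `.dvc/config` (and `.dvc/config.local`, if present) instead of the temporary file.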

The `dvc doctor` report below was generated after I installed `dvc[s3]` to work around the issue.

```
Platform: Python 3.11.2 on Linux-5.19.0-35-generic-x86_64-with-glibc2.31
Subprojects:
	dvc_data = 0.44.1
	dvc_objects = 0.21.1
	dvc_render = 0.3.1
	dvc_task = 0.2.0
	scmrepo = 0.1.17
Supports:
	http (aiohttp = 3.8.4, aiohttp-retry = 2.8.3),
	https (aiohttp = 3.8.4, aiohttp-retry = 2.8.3),
	s3 (s3fs = 2023.3.0, boto3 = 1.24.59)
```
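
The error names the missing `dvc_s3` module, so a quick way to confirm whether a given environment has it is to ask Python directly. This is a minimal sketch: `has_python_module` is a hypothetical helper, and it assumes `python3` is the same interpreter that DVC is installed into.

```shell
#!/bin/sh
# Return 0 if the named top-level Python module can be found, 1 otherwise.
has_python_module() {
    python3 -c 'import importlib.util, sys
sys.exit(0 if importlib.util.find_spec(sys.argv[1]) else 1)' "$1"
}

if has_python_module dvc_s3; then
    echo "dvc_s3 is installed"
else
    echo "dvc_s3 is missing"
fi
```

Running this in the fresh container before and after installing `dvc[s3]` would show whether the plugin is what changed between the failing and working runs.
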
@daavoo
Contributor

daavoo commented Mar 29, 2023

Hi @Erotemic, could you share the verbose output of `dvc pull -r host <path-to-file> -vv`?

This would help to find where this is happening:

> it sees that it has any S3 path in its remote, even if that remote isn't requested.
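
One way to collect that output is to save both stdout and stderr to a file and then look for the lines that mention S3. The sketch below is only a suggestion: `capture_verbose` is a hypothetical helper, not part of DVC, and the log file name is arbitrary.

```shell
#!/bin/sh
# Run a command, save its stdout and stderr to a log file, print the
# numbered log lines that mention "s3", and return the command's exit status.
capture_verbose() {
    log=$1
    shift
    status=0
    "$@" > "$log" 2>&1 || status=$?
    grep -n -i 's3' "$log" || true
    return "$status"
}

# Usage (assumes dvc is installed and a remote named "host" is configured):
#   capture_verbose dvc-pull.log dvc pull -r host <path-to-file> -vv
```

The full log stays in the file, so it can be attached to the issue as is.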

@daavoo added the "awaiting response" label on Mar 29, 2023
@Erotemic
Contributor Author

I'll do that. This might be a user error. Because I'm not at my main PC, I wrote an MWE, but it does not reproduce the error:

```bash
#!/bin/bash
# https://github.com/iterative/dvc/issues/9264#issuecomment-1488308658

TMP_DIR=$HOME/temp/docker-remote-mwe

# Fresh start
rm -rf "$TMP_DIR"

mkdir -p "$TMP_DIR"
mkdir -p "$TMP_DIR/my-repo"

cd "$TMP_DIR/my-repo"
git init

dvc init
dvc config core.autostage true

echo "my file" > my_file.txt
dvc add my_file.txt
dvc remote add aws s3://foo-bar-bucket/subbucket

git add .gitignore
git commit -am "a commit"

# Opens an interactive shell in the container; run the remaining commands there.
docker run --volume "$TMP_DIR"/my-repo:/host-repo -it python bash

### Inside docker

git clone /host-repo/.git /my-repo

pip install dvc

cd /my-repo
dvc remote add host /host-repo/.dvc/cache

dvc pull -r host my_file.txt
```

So either some other config on my system is interacting with this, or I made a user error. I will follow up later.
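
To narrow down the "interacting config" possibility, it may help to check every config file DVC could be reading for an `s3://` URL. The sketch below assumes the Linux default locations as I understand them (project `.dvc/config` and `.dvc/config.local`, a per-user file under `~/.config/dvc`, and a system-wide file under `/etc/xdg/dvc`); treat those paths as assumptions and adjust them for your platform. `find_s3_in_configs` is a hypothetical helper.

```shell
#!/bin/sh
# Print each candidate DVC config file that exists and mentions an s3:// URL.
# The global and system paths are assumed Linux defaults, not verified here.
find_s3_in_configs() {
    repo=${1:-.}
    for f in \
        "$repo/.dvc/config.local" \
        "$repo/.dvc/config" \
        "${XDG_CONFIG_HOME:-$HOME/.config}/dvc/config" \
        /etc/xdg/dvc/config
    do
        if [ -f "$f" ] && grep -q 's3://' "$f"; then
            echo "$f"
        fi
    done
}

# Check the repo in the current directory.
find_s3_in_configs .
```

Any file printed outside the project's own `.dvc/config` would be a candidate for the difference between the MWE and the original setup.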

@efiop removed the "awaiting response" label on Apr 2, 2023