
dvc pull with wrong S3 remote failed but user wasn't informed #2963

@markf94


System

My dvc version is 0.77.3. I installed and upgraded dvc through pip3, and I'm using Fedora 31.

Problem

I followed the 'Getting Started' guide and ran the following commands:

$ dvc init
$ git add .dvc/* && git commit -m 'Initialized dvc'
$ dvc remote add -d s3remote https://s3.amazonaws.com/<bucket-name>
$ dvc add <some-file>
$ git add .gitignore <some-file>.dvc && git commit -m 'Added <some-file>'
$ dvc push -r s3remote

The last command finished with 'Everything is up to date.', which is odd because the S3 bucket was still empty when I checked it. I then deleted <some-file> and ran dvc pull -r s3remote, which restored <some-file> without any errors. This was surprising: it was not clear where dvc was storing my file backups, or why my S3 bucket was still empty without dvc reporting any error.

Only when I removed the .dvc/cache folder and ran dvc pull -r s3remote again did it complain with the following:

ERROR: failed to download 'https://s3.amazonaws.com/<bucket-name>/31/69f7ce4ebb503afca037d35b7eb3a9' to '.dvc/cache/31/69f7ce4ebb503afca037d35b7eb3a9' - '301 Moved Permanently'

ERROR: failed to download 'https://s3.amazonaws.com/<bucket-name>/a3/04afb96060aad90176268345e10355' to '.dvc/cache/a3/04afb96060aad90176268345e10355' - '301 Moved Permanently'

ERROR: failed to pull data from the cloud - 2 files failed to download


Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
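Judging only from the paths in the error output above, each cache entry appears to be stored under a directory named after the first two characters of its hash, with the remaining characters as the file name. The sketch below reproduces that mapping; cache_path is a hypothetical helper written for illustration, not a dvc command.

```shell
#!/bin/sh
# Hypothetical helper (not part of dvc): map a hash to the cache path
# layout seen in the error messages (<first 2 chars>/<remaining chars>).
cache_path() {
    hash=$1
    rest=${hash#??}          # everything after the first two characters
    prefix=${hash%"$rest"}   # the first two characters
    printf '.dvc/cache/%s/%s\n' "$prefix" "$rest"
}

cache_path 3169f7ce4ebb503afca037d35b7eb3a9
# prints .dvc/cache/31/69f7ce4ebb503afca037d35b7eb3a9
```

This would explain why the earlier dvc pull appeared to succeed: as long as the file was still present under .dvc/cache, nothing had to be fetched from the remote.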

The fix

On my end, it was an easy fix. After consulting the documentation in more detail, I realized that I had set up my S3 remote incorrectly, using the https://s3.amazonaws.com/<bucket-name> URL for the bucket when I should have used the s3://<bucket-name> URL. Hence, running

$ dvc remote modify s3remote url s3://<bucket-name>

did the job.
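For anyone who made the same mistake, the URL rewrite can be sketched as a small shell function. to_s3_url is a hypothetical helper and my-bucket is a placeholder bucket name; neither comes from dvc. It only handles the path-style https://s3.amazonaws.com/ prefix used in this report and prints any other URL unchanged.

```shell
#!/bin/sh
# Hypothetical helper (not part of dvc): rewrite a path-style S3 HTTPS URL
# into the s3:// form. Any other URL is printed unchanged.
to_s3_url() {
    case "$1" in
        https://s3.amazonaws.com/*)
            # Drop the HTTPS endpoint; keep the bucket and any key prefix.
            printf 's3://%s\n' "${1#https://s3.amazonaws.com/}"
            ;;
        *)
            printf '%s\n' "$1"
            ;;
    esac
}

to_s3_url "https://s3.amazonaws.com/my-bucket"   # prints s3://my-bucket
```

The result could then be passed to dvc remote modify s3remote url as shown above.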

It is still odd that I was never warned about the remote being set up incorrectly, and that dvc pull -r s3remote worked without any problems even though there were no files in the S3 bucket. It seems that it restored them from the local cache as a fallback, but it should have given me a heads-up!
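To illustrate the kind of heads-up I have in mind, here is a minimal sketch of a pre-flight check that flags HTTP(S) URLs pointing at an amazonaws.com host. check_remote_url and my-bucket are made up for this example; this is not existing dvc behavior, and the pattern is a rough heuristic rather than a complete validation of S3 endpoints.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not a dvc feature): warn when a remote URL
# looks like an S3 HTTP endpoint rather than an s3:// URL.
check_remote_url() {
    case "$1" in
        s3://*)
            return 0
            ;;
        http://*amazonaws.com/*|https://*amazonaws.com/*)
            echo "WARNING: '$1' looks like an S3 HTTP endpoint; did you mean an s3:// URL?" >&2
            return 1
            ;;
        *)
            # Anything else is left alone; this sketch only targets the S3 mix-up.
            return 0
            ;;
    esac
}

check_remote_url "https://s3.amazonaws.com/my-bucket" || echo "fix the remote URL before pushing"
```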

Labels: bug, p2-medium, research