Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pull file with symbolic link in path doesn't work #10055

Open
clementperon opened this issue Oct 27, 2023 · 7 comments
Open

Pull file with symbolic link in path doesn't work #10055

clementperon opened this issue Oct 27, 2023 · 7 comments
Labels

Comments

@clementperon
Copy link

Bug Report

When I run:
dvc pull -f ./data_soft/file.dvc

Where data_soft is a symbolink link to "./data/software"

the ouput of dvc is wrong and think that the file is already pulled

Collecting                                                                                                                                                                                                                          |0.00 [00:00,    ?entry/s]
Fetching
Building workspace index                                                                                                                                                                                                            |2.00 [00:00,  639entry/s]
Comparing indexes                                                                                                                                                                                                                   |3.00 [00:00,  921entry/s]
Applying changes                                                                                                                                                                                                                    |0.00 [00:00,     ?file/s]
Everything is up to date.

When I run:
dvc pull -f ./data/software/file.dvc

the output of dvc is correct and pull the file

Collecting                                                                                                                                                                                                                          |0.00 [00:00,    ?entry/s]
Fetching                                                                                                                                                                                                                                                     ^Fetching       |Fetching from gs                                                                                                                                                                                                    0/1 [00:00<?,     ?file/s]
$ dvc doctor
DVC version: 3.27.0 (pip)
-------------------------
Platform: Python 3.10.12 on Linux-6.4.6-76060406-generic-x86_64-with-glibc2.35
Subprojects:
	dvc_data = 2.18.2
	dvc_objects = 1.0.1
	dvc_render = 0.6.0
	dvc_task = 0.3.0
	scmrepo = 1.4.0
Supports:
	gs (gcsfs = 2023.9.0),
	http (aiohttp = 3.8.5, aiohttp-retry = 2.8.3),
	https (aiohttp = 3.8.5, aiohttp-retry = 2.8.3)
Config:
	Global: /home/clement/.config/dvc
	System: /etc/xdg/xdg-pop/dvc
Cache types: hardlink, symlink
Cache directory: ext4 on /dev/nvme0n1p7
Caches: local
Remotes: gs
Workspace directory: ext4 on /dev/nvme0n1p7
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/10c2f099136ce26cd5b605984c629317
@clementperon
Copy link
Author

This has been introduced in 3.10.0 by @efiop

commit 225814f
Author: Ruslan Kuprieiev kupruser@gmail.com
Date: Sat Jul 29 01:51:28 2023 +0300

dvc: don't use realpath where not needed

@efiop efiop added the research label Oct 27, 2023
@skshetry
Copy link
Member

This is something that was supported accidentally. I don't think we should support symlink to a dvcfile at all. There are potential security concerns regarding that.

@clementperon
Copy link
Author

This is something that was supported accidentally. I don't think we should support symlink to a dvcfile at all. There are potential security concerns regarding that.

Why accidentaly? There was explicit realpath everywhere in the code...

Security? It's proper to understand when there is a symbolic link and explicitly support it or not with a correct support than doing undefined behavior like what's is happening now, no?

@skshetry
Copy link
Member

Why accidentaly? There was explicit realpath everywhere in the code...

The realpath was everywhere because dvc_repo.root_dir was also a realpath forcing us to make everything else a realpath, mostly due to macOS having /var symlink to /private/var AFAIR.

@efiop
Copy link
Contributor

efiop commented Nov 1, 2023

I can confirm that it was accidentally supported in the past, but we don't even know how broken it really was.

@clementperon sure, we need to at least gracefully handle that, but I don't think we will support collecting .dvc files inside symlinked directories any time soon. Could you elaborate on your use case please, why do you need a symlink with .dvc files at all? Just trying to see if maybe we could propose a better solution.

@efiop efiop added the awaiting response we are waiting for your reply, please respond! :) label Nov 1, 2023
@clementperon
Copy link
Author

@efiop I have reorganized folder hierarchy where I store DVC files.

Instead of rewriting path everywhere I have created symlinks for a smooth transition.

@clementperon
Copy link
Author

Just sayin but all tools support properly symbolic link, not sure why DVC shouldn't.

@skshetry skshetry removed the awaiting response we are waiting for your reply, please respond! :) label Mar 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants