data status returns files as "Not in remote" even though they are marked as push: false in pipeline #10317

Open
mermerico opened this issue Feb 23, 2024 · 0 comments
Labels
A: status (Related to the dvc diff/list/status)
p2-medium (Medium priority, should be done, but less important)

Bug Report

Description

dvc data status --not-in-remote reports files that have not been pushed to remote storage, which makes it a useful CI/CD check before accepting a PR. However, if a DVC pipeline has a stage whose outputs are marked push: false, those outputs are also listed as "Not in remote", even though dvc push skips them by design. This makes it harder to detect, especially in an automated manner, files that should have been pushed to the remote but were not.

Reproduce

  1. Create a DVC pipeline (dvc.yaml) containing a stage with an output marked push: false
  2. dvc repro
  3. dvc push
  4. dvc data status --not-in-remote

Expected

Outputs marked push: false should not be reported as "Not in remote", OR an option should be provided to suppress those files.
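
Until such an option exists, a CI job could filter the report itself. The following is a minimal Python sketch of that idea, not DVC's own behavior. It assumes that dvc data status --not-in-remote --json prints an object with a "not_in_remote" list of paths, that dvc.yaml has already been parsed into a dict (for example with a YAML library), and that output paths are relative to the repository root (no wdir, foreach, or matrix stages). The function names are hypothetical.

```python
import json
import subprocess


def no_push_outputs(pipeline: dict) -> set[str]:
    """Collect output paths flagged `push: false` in a parsed dvc.yaml."""
    paths = set()
    for stage in pipeline.get("stages", {}).values():
        for out in stage.get("outs", []):
            # An output is either a plain path string or a {path: {flags}} mapping.
            if isinstance(out, dict):
                for path, flags in out.items():
                    if (flags or {}).get("push") is False:
                        paths.add(path.rstrip("/"))
    return paths


def missing_from_remote(status: dict, skip: set[str]) -> list[str]:
    """Return "not in remote" paths, ignoring outputs that are never pushed."""
    kept = []
    for path in status.get("not_in_remote", []):
        clean = path.rstrip("/")
        # Skip the output itself and anything inside a skipped directory.
        if any(clean == s or clean.startswith(s + "/") for s in skip):
            continue
        kept.append(path)
    return kept


def dvc_status() -> dict:
    """Run the DVC CLI and parse its JSON report (assumed output shape)."""
    result = subprocess.run(
        ["dvc", "data", "status", "--not-in-remote", "--json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

In CI, the job would fail only when missing_from_remote(dvc_status(), no_push_outputs(parsed_dvc_yaml)) returns a non-empty list, where parsed_dvc_yaml stands for the loaded contents of dvc.yaml.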

Environment information

N/A

Output of dvc doctor:

DVC version: 3.43.1 (pip)
-------------------------
Platform: Python 3.11.6 on Linux-5.15.0-92-generic-x86_64-with-glibc2.35
Subprojects:
        dvc_data = 3.9.0
        dvc_objects = 3.0.6
        dvc_render = 1.0.1
        dvc_task = 0.3.0
        scmrepo = 2.1.1
Supports:
        http (aiohttp = 3.8.5, aiohttp-retry = 2.8.3),
        https (aiohttp = 3.8.5, aiohttp-retry = 2.8.3),
        s3 (s3fs = 2023.9.0, boto3 = 1.28.17)
Config:
        Global: /home/mermerico/.config/dvc
        System: /etc/xdg/dvc
Cache types: symlink
Cache directory: ext4 on /dev/md0p1
Caches: local
Remotes: s3
Workspace directory: ext4 on /dev/md0p1
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/97a24c320f9c7207aea41ca9a4dc4061
@dberenbaum dberenbaum added p2-medium Medium priority, should be done, but less important A: status Related to the dvc diff/list/status labels Feb 23, 2024