DVC_STAGE: Added environment variable to track DVC stage name #10357

Merged
merged 1 commit into from
Mar 20, 2024

Conversation

rishabhsharma22
Contributor

@rishabhsharma22 rishabhsharma22 commented Mar 18, 2024

Fixes #10355

Thank you for the contribution - we'll try to review it as soon as possible. 🙏

codecov bot commented Mar 18, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.68%. Comparing base (017a510) to head (ca4506d).
Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #10357      +/-   ##
==========================================
+ Coverage   90.67%   90.68%   +0.01%     
==========================================
  Files         500      500              
  Lines       38696    38698       +2     
  Branches     5600     5600              
==========================================
+ Hits        35088    35095       +7     
+ Misses       2961     2958       -3     
+ Partials      647      645       -2     


@shcheklein changed the title from "DVC_STAGE_NAME: Added environment variable to track DVC stage name" to "DVC_STAGE: Added environment variable to track DVC stage name" Mar 18, 2024
@shcheklein
Member

Tests fail due to iterative/dvc-s3#80. I'm fine with this change. We will need to update at least the https://dvc.org/doc/user-guide/env#environment-variables doc after we merge this.

@dberenbaum
Collaborator

@rishabhsharma22 Could you please create an accompanying docs PR as suggested in the PR template and pointed out by @shcheklein?

@shcheklein added the enhancement (Enhances DVC), p2-medium (Medium priority, should be done, but less important), and awaiting response (we are waiting for your reply, please respond! :)) labels Mar 19, 2024
@rishabhsharma22
Contributor Author

@dberenbaum I have added the accompanying PR in docs.

@dberenbaum
Collaborator

@rishabhsharma22 Can you point me to it? I don't see it in https://github.com/iterative/dvc.org/pulls.

@rishabhsharma22
Contributor Author

@dberenbaum It seems something went wrong and the PR didn't go through. I created a new PR, which you should be able to see this time. If not, here is the link: iterative/dvc.org#5191

@shcheklein shcheklein merged commit 8d8939f into iterative:main Mar 20, 2024
20 checks passed
@skshetry
Member

skshetry commented Mar 21, 2024

Hey, can you provide more examples on #10355, please?
The Stack Overflow post only mentions that you need access to the stage name, but it's not clear to me which dvc.api you are using or why you need the stage name at all.

If it's to read hyperparameters, I am curious why you don't read the params file directly instead of going this convoluted way. Not to mention, your script will be unusable outside dvc, unlike the parametrization example that @shcheklein offered.
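
For illustration, reading the params file directly could look roughly like the sketch below. It is only a sketch under assumptions not stated in this thread: the file name `params.json`, the JSON format, and the `train` section are hypothetical, and the helper `load_params` is not part of DVC.

```python
import json
from pathlib import Path


def load_params(path="params.json", section=None):
    """Load parameters from a JSON file, optionally returning one section.

    The file name and the per-section layout are hypothetical; this is
    plain file reading and does not depend on DVC being installed.
    """
    params = json.loads(Path(path).read_text())
    if section is None:
        return params
    # Raises KeyError if the requested section is missing.
    return params[section]


if __name__ == "__main__":
    # Write a small example file so the script runs on its own.
    Path("params.json").write_text(json.dumps({"train": {"lr": 0.01}}))
    print(load_params("params.json", section="train"))
```

Because nothing here depends on DVC, the same script works both inside and outside a pipeline run.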

@rishabhsharma22
Contributor Author

I attempted to find the stage name while running a command with DVC. However, I couldn't utilize any DVC API for this purpose at that point. Consequently, I didn't use any DVC API at all. I think it would be beneficial to add a feature to access the DVC stage name through its API. Alternatively, incorporating it as a system variable could serve the same purpose.

Now, to address why one might need the stage name: when a lock file generates a stage name for execution, it stores it. The challenge arises when ensuring that this stage name remains synchronized with the lock file if someone passes the stage name as a parameter. Thus, introducing a variable to capture the name of the current running stage would ensure consistency between the stage name generated in the lock file and the stage name needed for reference during execution or reproducibility.

The use cases may vary. In my case, I needed to track specific parameters, and the only identifier available was the stage name, which wasn't accessible through DVC at the time. Therefore, having the DVC stage name as an accessible identifier for the ongoing run would greatly aid in tracking features, parameters, and other relevant aspects. While it's possible to create parameters during execution to track the stage name, incorporating it as a feature ensures consistency and simplifies the process for end users.

I hope this helps answer some of your questions.
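
As a rough illustration of the idea, a stage script might read the variable as sketched below. This assumes DVC exports the stage name as `DVC_STAGE` while running a stage's command, as the PR title suggests; the helper names, the fallback default, and the per-stage parameter layout are hypothetical.

```python
import os


def current_stage_name(default=None):
    """Return the running stage name, or `default` when not run by DVC."""
    # Assumption: DVC sets DVC_STAGE for commands it runs as part of a stage.
    return os.environ.get("DVC_STAGE", default)


def params_for_stage(all_params, stage_name):
    """Select the parameter section keyed by stage name (hypothetical layout)."""
    if stage_name is None:
        raise RuntimeError("DVC_STAGE is not set; pass a stage name explicitly")
    return all_params[stage_name]


if __name__ == "__main__":
    # Hypothetical parameters, keyed by stage name.
    params = {"train": {"lr": 0.01}, "evaluate": {"threshold": 0.5}}
    # Falls back to "train" when the script is run outside DVC.
    stage = current_stage_name(default="train")
    print(stage, params_for_stage(params, stage))
```

The explicit default keeps the script usable outside DVC, where the variable would not be set.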

@skshetry
Member

skshetry commented Mar 21, 2024

Thanks for the response. It makes sense for dvc to provide something like that, but I still don't understand your use case. What do you mean by "synchronizing with the lock file"?

Can you show me an example of how you plan to use it?

BradyJ27 pushed a commit to BradyJ27/dvc that referenced this pull request Apr 22, 2024
…ive#10357)

DVC_STAGE_NAME: Added environment variable to track DVC stage name

Fixes iterative#10355

Co-authored-by: Rishabh Sharma <rishabhsharma22@github.com>
Labels
awaiting response (we are waiting for your reply, please respond! :)); enhancement (Enhances DVC); p2-medium (Medium priority, should be done, but less important)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unable to extract dvc Stage Name
4 participants