[FR] Flag commit hash as "dirty" in UI if git repo has local changes #1912

krasch · 2019-10-08T21:28:06Z

Describe the proposal

Add a flag in UI (and database, if necessary) that identifies a commit hash as "dirty" if the git repository has local changes.

Motivation

Version: 1.3.0 (installed via pip)

Steps:

have a clean git repository, no un-commited local changes
mlflow run .
change a file, do not commit
mlflow run .

In this situation, the first run is reproducible, the second one is not. In the UI both runs are displayed with the same (i.e. the most recent) commit hash. This gives the impression that both experiments where run with exactly the same code, which is not true. As far as I could find, there seems to be no way to identify the "dirty" second run. Please let me know if I missed something.

Proposed Changes

Other tools such as omniboard for sacred or neptune show a flag that identifies the "dirty" run as such. This does not make the run reproducible, but it does at least alert the user that this is the case.

ssYkse · 2019-10-10T08:48:00Z

I agree with the issue, but not necessarily with the proposed solution. The reason for this is, that the second run is still not reproducible - we know its 'dirty', but we still don't know what changed.
I suggest that running in a 'dirty' environment, mlflow shall refuse to run:

mlflow run .
>>> dirty git, aborting.

If you insist on running this, maybe run

mlflow run . --ignore-git-warning

or something similar.

krasch · 2019-10-10T14:34:31Z

I agree, it does not solve the actual reproducibility issues. I like your proposal. I see two possible ways that one could go with the flag

--ignore-git-warning (as you proposed)
--enforce-git-clean (for example sacred does this)

I do prefer (1), however I do see a risk that this might annoy some users, because of perceived overhead.

ryan-duve · 2021-08-16T20:02:19Z

@smurching Would the project maintainers still be receptive to an attempt to solve the problem as stated in the original post? Having an indicator that an MLflow run was started with "working tree clean" would be helpful when attempting to reproduce results.

Edit: Adding @harupy in case there's another way to answer this question. I am willing to take a crack at this issue if it's something that would be considered to merge in.

chenmoneygithub · 2023-10-05T21:07:30Z

I don't know if we need this, as a comparison, we frequently start a python console in a local branch with changed file for quick testing, and there is no prompt that "this console is based on uncommitted changed", but everything works fine.

krasch added the enhancement New feature or request label Oct 8, 2019

smurching added the area/tracking Tracking service, tracking client APIs, autologging label Jul 1, 2020

smurching added the priority/backlog We believe it is useful, but don’t see it being prioritized in the next few months. label Jul 13, 2020

chenmoneygithub closed this as completed Oct 5, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[FR] Flag commit hash as "dirty" in UI if git repo has local changes #1912

[FR] Flag commit hash as "dirty" in UI if git repo has local changes #1912

krasch commented Oct 8, 2019

ssYkse commented Oct 10, 2019

krasch commented Oct 10, 2019

ryan-duve commented Aug 16, 2021 •

edited

chenmoneygithub commented Oct 5, 2023

[FR] Flag commit hash as "dirty" in UI if git repo has local changes #1912

[FR] Flag commit hash as "dirty" in UI if git repo has local changes #1912

Comments

krasch commented Oct 8, 2019

Describe the proposal

Motivation

Proposed Changes

ssYkse commented Oct 10, 2019

krasch commented Oct 10, 2019

ryan-duve commented Aug 16, 2021 • edited

chenmoneygithub commented Oct 5, 2023

ryan-duve commented Aug 16, 2021 •

edited