Let's say I have two scripts: code/gen_features_v1.py and code/gen_features_v2.py. They generate the feature files data/v1/features.pkl and data/v2/features.pkl, respectively.
I run the scripts like so:
dvc run -o data/v1/features.pkl python code/gen_features_v1.py
dvc run -o data/v2/features.pkl python code/gen_features_v2.py
But now features.pkl.dvc only tracks the features.pkl from the second run, even though I might need both. This is what the stage file contains:
cmd: python code/gen_features_v2.py
md5: some_md5
outs:
- cache: true
  md5: some_md5
  path: data/v2/features.pkl
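To illustrate what I think is happening, here is a minimal sketch. It assumes the default stage file name is derived from the basename of the first output and written to the current directory. That is my reading of the behavior above, not DVC's actual code, and default_stage_file is a hypothetical helper, not a DVC function.

```python
import os


def default_stage_file(first_output):
    # Hypothetical helper, not part of DVC: assumes the stage file is named
    # after the basename of the first output, with ".dvc" appended.
    return os.path.basename(first_output) + ".dvc"


# The two outputs live in different directories but share a basename.
v1 = default_stage_file("data/v1/features.pkl")
v2 = default_stage_file("data/v2/features.pkl")

print(v1)        # features.pkl.dvc
print(v2)        # features.pkl.dvc
print(v1 == v2)  # True
```

If that assumption holds, both runs map to the same stage file name, which would explain why the second run replaces the first.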
Quite often I have files with the same name. So how do I handle this? Or should dvc run be extended?