Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decouple config filename from results filename and remove need for config file from tools which consume results. #48

Closed
vext01 opened this issue Oct 8, 2015 · 12 comments · Fixed by #86

Comments

@vext01
Copy link
Member

vext01 commented Oct 8, 2015

Currently the result filename is derived from the config filename. E.g.:

warmup.krun -> warmup_results.json

Tools that consume results use a function in krun to discover the json filename.

I think it would be better if:

  • The results file was named similarly to the log files. e.g. warmup_20151008.bz2
  • Tools consuming results should not need the config file at all.

This is really just an annoyance and isn't all that high priority.

@snim2
Copy link
Collaborator

snim2 commented Oct 27, 2015

I'm not sure that is so bad to name the file the way we have. The real issue is that consuming the results is tricky. I have a script somewhere that converts the results data to .csv format. I would suggest a good way to resolve this issue might be to add a --csv flag to krun.py and allow the end user to dump the results to CSV and maybe the audit to a separate CSV file (the config can be ignored because the user already has it).

The Json file is really only useful to krun, because it contains the extra information needed for resume-mode and so on.

@vext01
Copy link
Member Author

vext01 commented Oct 27, 2015

I'm not sure I understand the CSV stuff.

What I mean by bullet point 2 of this bug is that tools should not need to have both a config file and a results file present in order to work.

Look at mk_graphs2x2.py from the warmup experiment (ah, this bug partially applies to that repo too). It's basic usage is:

mk_graphs2x2.py <mode> <config file1> <machine name1> <config file2>  <machine name2> 

whereas it would be nicer if it were:

mk_graphs2x2.py <mode> <results file1> <machine name1> <results file2>  <machine name2> 

Seems silly at first, but the only remaining reason the config file is passed used is to derive a results file name (as described in original report). And it imports krun just to do this (uses a function in krun.util). Once upon a time it made more sense to specify the config file, as the tools used to read warm_upon_iter for each vm so as to know how many iterations to chop for warmup...

If we were to make this change, we would then be free to name the results files as we wish, with a datestamp if we wanted.

I dunno, maybe it doesn't matter.

@vext01
Copy link
Member Author

vext01 commented Oct 27, 2015

FWIW, I'm not much a fan of CSV as a format. I chose JSON as it's hierarchical and is easily parsed in just about any language.

@snim2
Copy link
Collaborator

snim2 commented Oct 27, 2015

I'm a bit baffled by this. I'll have another read through the code when I've finished the current batch of commits.

@vext01
Copy link
Member Author

vext01 commented Oct 27, 2015

Let me try again from a more practical standpoint.

Currently you run krun on a config file, say warmup.krun. Krun emits warmup_results.json.bz2.

I find this annoying as it's too easy to overwrite results, as all results files (for that config) are named the same. I've overwritten results a couple of times accidentally.

We could add a datestamp to the results file name (like with the log files), but we would have to rework existing tools that take as an argument a config file, and expect to be able to derive the name of the results file from this. Basically anything that depends upon krun.util.output_name()

Hope that helps.

@ltratt
Copy link
Member

ltratt commented Oct 27, 2015

I'm neutral on this, but an alternative is for krun to say "I won't overwrite a results file unless you specify --force" or similar.

@snim2
Copy link
Collaborator

snim2 commented Oct 27, 2015

It sort of helps. A simple change I can add in to my current commits would be to pass the result filename in as an argument to krun:

$ python2 krun.py --outfile=flibble.bz2

If we make the change above the mk_graphs script will be slightly more broken, but since it already has a bug we can deal with that all in one go.

I still don't quite understand why the current tools need the config file, because the config is stored in the results file. So surely they can just read the config from the dumped results?

@vext01
Copy link
Member Author

vext01 commented Oct 27, 2015

The tools no longer need anything from the config (in either the config file, or the config stashed in the results file). So they should not depend upon the config file at all.

The config file in the results file is only there for future reference, should we wish to know what parameters were used.

@vext01
Copy link
Member Author

vext01 commented Oct 27, 2015

On the flipside, we may find a need for a tool to require something from the config file later...

@snim2
Copy link
Collaborator

snim2 commented Oct 27, 2015

True. Do you want me to add an --outfile option?

@vext01
Copy link
Member Author

vext01 commented Oct 27, 2015

I'm no longer sure what is best, but some part of me thinks the result files should be date-stamped so as to match the log filename.

@snim2
Copy link
Collaborator

snim2 commented Oct 27, 2015

In which case I won't act on this now; we can come back to it another time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants