Add benchmark utility functions for metric logging #3619
Conversation
Reading through this, I'm not clear on where this will be called from-- in the graph, or in the outer loop? The decision there will probably have implications as to how different input types and IO processes have to be handled and tested; maybe the best thing to do is to mock an incorporation of this into MNIST, and that will clarify what we need the logger to be able to handle.
official/benchmark/logger.py
Outdated
import json
import os

from tensorflow.python.platform import gfile
You should be able to use the public API for this, which is preferable-- tf.gfile.GFile, etc. See tensorflow/python/platform/gfile.py for allowed symbols.
Ah, thanks for pointing this out. I will change to use the public API.
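For reference, a minimal sketch of the switch to the public API, assuming the directory handling stays the same as in the snippet above (the paths here are made up for illustration):

```python
import os

import tensorflow as tf

logging_dir = "/tmp/benchmark_logs"  # hypothetical path for illustration

# tf.gfile exposes the same functionality as tensorflow.python.platform.gfile,
# but through the supported public API.
if not tf.gfile.IsDirectory(logging_dir):
  tf.gfile.MakeDirs(logging_dir)

with tf.gfile.GFile(os.path.join(logging_dir, "metric.log"), "a") as f:
  f.write("{}\n")
```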
@@ -0,0 +1,64 @@
# Copyright 2018 The TensorFlow Authors. All Rights Reserved.
Why have this here instead of in utils/logging? I'm not opposed, but we should have reasons, given the similarity of purpose.
Also, nit: since we have utils, and will probably eventually have models, and will certainly have multiple benchmarks, how do you feel about naming this package benchmarks instead of the singular benchmark?
My original intention was to put all benchmark-related code into official/benchmark, but not the benchmarks themselves. I will have other code, like the upload to bigstore and other libs, in a future change. I think they are more logically grouped together.
I could move this code to official/utils/benchmark if you prefer.
official/benchmark/logger.py
Outdated
if not gfile.IsDirectory(self._logging_dir):
  gfile.MakeDirs(self._logging_dir)

def logMetric(self, name, value, unit=None, global_step=None, extras=None):
Shouldn't this be log_metric?
Oops, yes.
official/benchmark/logger.py
Outdated
Args:
  name: string, the name of the metric to log.
  value: number, the value of the metric.
We don't actually do type checking here-- should we? Or maybe relax the stated requirement, and just say this should be json-dumpable (which, IIRC, has all sorts of limitations, but float/str/int should be okay-- though that raises the question of how tensors would get handled here?)
For the moment, I think I will assume the inputs are simple values; even in the case of a tensor value at runtime, this logger should be passed into some run session hook, which can extract the loggable value there.
I will put a type check here for now.
On a similar note, we should decide how to handle failures. I'm inclined to not have a logging hook bring down the whole run, but just silently bailing doesn't seem good either. Perhaps salvage what info can be dumped and write that along with an indication that it is not a clean record?
Good point; we could probably log a local warning for this case.
Done. Added a warning instead of throwing a ValueError here.
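A minimal sketch of that check-and-warn behavior, assuming the method is the renamed log_metric on the logger class and the warning goes through tf.logging (not necessarily the exact code in the PR):

```python
import numbers

import tensorflow as tf


def log_metric(self, name, value, unit=None, global_step=None, extras=None):
  """Logs a metric record, skipping non-numeric values with a warning."""
  if not isinstance(value, numbers.Number):
    # Warn and bail out instead of raising, so a bad metric value
    # cannot bring down the whole training run.
    tf.logging.warning(
        "Metric value to log should be a number. Got %s", type(value))
    return
  # ... build the JSON record and append it to the metric log as above ...
```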
name: string, the name of the metric to log.
value: number, the value of the metric.
unit: string, the unit of the metric, e.g. "images per second".
global_step: int, the global_step when the metric is logged.
If you just run tf.train.get_global_step, you get a Tensor. Not sure what will happen if we try to log that directly? Do we have to convert tensor output to raw? We probably want to have a stance and add tests accordingly. I guess it depends on where we are calling this from-- the main loop, or from within the Estimator and related tf code.
From https://github.com/tensorflow/tensorflow/blob/r1.6/tensorflow/python/training/training_util.py#L45, tf.train.global_step() will return an int. In the case that it is not in a TF runtime, tf.train.global_step() will throw an error.
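For context, a small sketch of getting the int out of the global step tensor before handing it to the logger (variable names are illustrative):

```python
import tensorflow as tf

global_step_tensor = tf.train.get_or_create_global_step()

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  # tf.train.global_step evaluates the tensor and returns a plain Python int,
  # which is what the benchmark logger expects.
  step = tf.train.global_step(sess, global_step_tensor)
  print(step)  # 0
```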
official/benchmark/logger.py
Outdated
"timestamp": datetime.datetime.now().strftime( | ||
_DATE_TIME_FORMAT_PATTERN), | ||
"extras": extras} | ||
json.dump(metric, f) |
Does json.dump play nicely with the tf file writer? If this is actually happening deferred in a graph, what happens?
Chatted offline; I think we will stick with simple number values for the moment. When users try to use this in the context of a TF run session, they should be able to get the simple value from the run session.
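To make that concrete, here is a rough sketch of how the logger could be fed from a session hook so that log_metric only ever sees plain Python numbers (the hook class and tensor names are assumptions for illustration, not part of this PR):

```python
import tensorflow as tf


class MetricLoggingHook(tf.train.SessionRunHook):
  """Evaluates tensors each step and hands their plain values to the logger."""

  def __init__(self, benchmark_logger, accuracy_tensor, global_step_tensor):
    self._logger = benchmark_logger
    self._accuracy = accuracy_tensor
    self._global_step = global_step_tensor

  def before_run(self, run_context):
    # Ask the session to evaluate the tensors for us on each run call.
    return tf.train.SessionRunArgs([self._accuracy, self._global_step])

  def after_run(self, run_context, run_values):
    accuracy, global_step = run_values.results
    # Both results are now simple numbers, so the logger never sees a Tensor.
    self._logger.log_metric(
        "accuracy", float(accuracy), global_step=int(global_step))
```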
official/benchmark/logger_test.py
Outdated
from official.benchmark import logger
import tensorflow as tf
from tensorflow.python.platform import gfile
Ditto on the note above about the public API. Also, I know pylint complains, but we have the main TF import first, with the local package imports after. (Note that this is also important for the import-to-google rules to work as they are now.)
Done.
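In other words, the test's imports would end up ordered roughly like this (a sketch of the convention described above, not the exact final diff):

```python
# Main TensorFlow import first, then the local package imports.
import tensorflow as tf

from official.benchmark import logger
```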
official/benchmark/logger_test.py
Outdated
def tearDown(self):
  gfile.DeleteRecursively(self.get_temp_dir())

def testCreateLoggingDir(self):
I'm less particular about test names, but I think the rest of our tests are underscored rather than camelCase-- probably best to be consistent.
Done.
official/benchmark/logger_test.py
Outdated
def testLogMetric(self):
  log_dir = tempfile.mkdtemp(dir=self.get_temp_dir())
  log = logger.BenchmarkLogger(log_dir)
  log.logMetric("accuracy", 0.999, global_step=1e4, extras={"name": "value"})
Stated above, but just to record in the proper place: we should take a stance on Tensor input, and make sure we test accordingly.
addressed in the previous comment.
""" | ||
with gfile.GFile( | ||
os.path.join(self._logging_dir, _METRIC_LOG_FILE_NAME), "a") as f: | ||
metric = { |
Can we assign a unique ID to each class instance and include it? That way if a bunch of records get jammed together in the same file (i.e. the default path) it is easy to separate runs later.
I think logging_dir is the flag that the caller should tweak. Similar to the log_dir flag for the summary writer, the caller should make sure to specify a different log dir for each run.
Chatted offline; we actually have a UUID field in the design for record keeping. Will add the ID field in a future change.
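For the record, a rough sketch of what that per-run ID could look like when it lands (the run_id field name and placement are guesses, since that change is not part of this PR):

```python
import json
import uuid


class BenchmarkLogger(object):
  """Illustrative skeleton: tag every metric record with a per-run ID."""

  def __init__(self, logging_dir):
    self._logging_dir = logging_dir
    # One UUID per logger instance, so records from different runs that end
    # up in the same file can still be separated later.
    self._run_id = str(uuid.uuid4())

  def _build_record(self, name, value):
    return json.dumps({"run_id": self._run_id, "name": name, "value": value})
```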
So there's good news and bad news.

👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there.

😕 The bad news is that it appears that one or more commits were authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that here in the pull request.

Note to project maintainer: This is a terminal state, meaning the cla/yes commit status will not change from this state.
…ram." This reverts commit 6d829ca.
Poor Git. Try:
1. Update logger to convert the value type to float to please JSON.
2. Update logger test to call parent test tearDown.
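A small sketch of what that float conversion likely amounts to (illustrative only; the helper name is made up). json.dump cannot serialize numpy scalar types such as float32, so casting to a built-in float first avoids a TypeError:

```python
import json
import numbers


def _to_json_value(value):
  """Casts numeric values (e.g. a numpy float32) to a plain built-in float."""
  if isinstance(value, numbers.Number):
    return float(value)
  return value


print(json.dumps({"value": _to_json_value(0.999)}))  # {"value": 0.999}
```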
See qlzh727@6ad9e0b as a demo for the metric logger.
Btw, the metric log from qlzh727@6ad9e0b looks like the below: {"name": "train_accuracy", "timestamp": "2018-03-19T11:36:57.969879Z", "value": 0.9988235235214233, "extras": null, "unit": null, "global_step": 25604}
That is great-- thanks. I do feel like the example makes the case stronger that the logger should live in utils, since one might want to use it to write arbitrary JSON logs separate from benchmarking. But very nice regardless.
Done. Moved to official/utils/logging.
Thanks-- LGTM, though we should still try to clean up the git record so that the CLA is clear.
Currently it only logs the metric, which we could use in any of the hooks.