Add benchmark utility functions for metric logging #3619
Conversation
Reading through this, I'm not clear on where this will be called from-- in the graph, or in the outer loop? The decision there will probably have implications as to how different input types and IO processes have to be handled and tested; maybe the best thing to do is to mock an incorporation of this into MNIST, and that will clarify what we need the logger to be able to handle.
official/benchmark/logger.py
Outdated
import json
import os

from tensorflow.python.platform import gfile
You should be able to use the public API for this, which is preferable-- tf.gfile.GFile, etc. See tensorflow/python/platform/gfile.py for allowed symbols.
Ah, thanks for pointing this out. I will change to use the public API.
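For reference, a minimal sketch of the switch to the public API, assuming the directory handling stays the same as in the snippet above (the paths here are made up for illustration):

```python
import os

import tensorflow as tf

logging_dir = "/tmp/benchmark_logs"  # hypothetical path for illustration

# tf.gfile exposes the same functionality as tensorflow.python.platform.gfile,
# but through the supported public API.
if not tf.gfile.IsDirectory(logging_dir):
  tf.gfile.MakeDirs(logging_dir)

with tf.gfile.GFile(os.path.join(logging_dir, "metric.log"), "a") as f:
  f.write("{}\n")
```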
@@ -0,0 +1,64 @@
# Copyright 2018 The TensorFlow Authors. All Rights Reserved.
Why have this here instead of in utils/logging? I'm not opposed, but we should have reasons, given the similarity of purpose.
Also, nit: since we have utils, and will probably eventually have models, and will certainly have multiple benchmarks, how do you feel about naming this package benchmarks instead of the singular benchmark?
My original intention was to put all benchmark-related code into official/benchmark, but not the benchmarks themselves. I will have other code, like the upload to bigstore and other libs, in a future change. I think they are more logically grouped together.
I could move this code to official/utils/benchmark if you prefer.
official/benchmark/logger.py
Outdated
if not gfile.IsDirectory(self._logging_dir):
  gfile.MakeDirs(self._logging_dir)

def logMetric(self, name, value, unit=None, global_step=None, extras=None):
Shouldn't this be log_metric?
Oops, yes.
official/benchmark/logger.py
Outdated
Args:
  name: string, the name of the metric to log.
  value: number, the value of the metric.
We don't actually do type checking here-- should we? Or maybe relax the stated requirement, and just say this should be json-dumpable (which, IIRC, has all sorts of limitations, but float/str/int should be okay-- though that raises the question of how tensors would get handled here?)
For the moment, I think I will assume the inputs are simple values; even in the case of a tensor value at runtime, this logger should be passed into some run session hook, which can extract the loggable value there.
I will put a type check here for now.
On a similar note, we should decide how to handle failures. I'm inclined to not have a logging hook bring down the whole run, but just silently bailing doesn't seem good either. Perhaps salvage what info can be dumped and write that along with an indication that it is not a clean record?
Good point; we could probably log a local warning for this case.
Done. Added a warning instead of throwing a ValueError here.
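A minimal sketch of that check-and-warn behavior, assuming the method is the renamed log_metric on the logger class and the warning goes through tf.logging (not necessarily the exact code in the PR):

```python
import numbers

import tensorflow as tf


def log_metric(self, name, value, unit=None, global_step=None, extras=None):
  """Logs a metric record, skipping non-numeric values with a warning."""
  if not isinstance(value, numbers.Number):
    # Warn and bail out instead of raising, so a bad metric value
    # cannot bring down the whole training run.
    tf.logging.warning(
        "Metric value to log should be a number. Got %s", type(value))
    return
  # ... build the JSON record and append it to the metric log as above ...
```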
name: string, the name of the metric to log.
value: number, the value of the metric.
unit: string, the unit of the metric, e.g. "images per second".
global_step: int, the global_step when the metric is logged.
If you just run tf.train.get_global_step, you get a Tensor. Not sure what will happen if we try to log that directly? Do we have to convert tensor output to raw? We probably want to have a stance and add tests accordingly. I guess it depends on where we are calling this from-- the main loop, or from within the Estimator and related tf code.
From https://github.com/tensorflow/tensorflow/blob/r1.6/tensorflow/python/training/training_util.py#L45, tf.train.global_step() will return an int. In the case that it is not in a TF runtime, tf.train.global_step() will throw an error.
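For context, a small sketch of getting the int out of the global step tensor before handing it to the logger (variable names are illustrative):

```python
import tensorflow as tf

global_step_tensor = tf.train.get_or_create_global_step()

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  # tf.train.global_step evaluates the tensor and returns a plain Python int,
  # which is what the benchmark logger expects.
  step = tf.train.global_step(sess, global_step_tensor)
  print(step)  # 0
```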
official/benchmark/logger.py
Outdated
"timestamp": datetime.datetime.now().strftime( | ||
_DATE_TIME_FORMAT_PATTERN), | ||
"extras": extras} | ||
json.dump(metric, f) |
Does json.dump play nicely with the tf file writer? If this is actually happening deferred in a graph, what happens?
Chatted offline; I think we will stick with simple number values for the moment. When users try to use this in the context of a TF run session, they should be able to get the simple value from the run session.
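To make that concrete, here is a rough sketch of how the logger could be fed from a session hook so that log_metric only ever sees plain Python numbers (the hook class and tensor names are assumptions for illustration, not part of this PR):

```python
import tensorflow as tf


class MetricLoggingHook(tf.train.SessionRunHook):
  """Evaluates tensors each step and hands their plain values to the logger."""

  def __init__(self, benchmark_logger, accuracy_tensor, global_step_tensor):
    self._logger = benchmark_logger
    self._accuracy = accuracy_tensor
    self._global_step = global_step_tensor

  def before_run(self, run_context):
    # Ask the session to evaluate the tensors for us on each run call.
    return tf.train.SessionRunArgs([self._accuracy, self._global_step])

  def after_run(self, run_context, run_values):
    accuracy, global_step = run_values.results
    # Both results are now simple numbers, so the logger never sees a Tensor.
    self._logger.log_metric(
        "accuracy", float(accuracy), global_step=int(global_step))
```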
official/benchmark/logger_test.py
Outdated
from official.benchmark import logger
import tensorflow as tf
from tensorflow.python.platform import gfile
Ditto on the note above about the public API. Also, I know pylint complains, but we have the main TF import first, with the local package imports after. (Note that this is also important for the import-to-google rules to work as they are now.)
Done.
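In other words, the test's imports would end up ordered roughly like this (a sketch of the convention described above, not the exact final diff):

```python
# Main TensorFlow import first, then the local package imports.
import tensorflow as tf

from official.benchmark import logger
```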
official/benchmark/logger_test.py
Outdated
def tearDown(self):
  gfile.DeleteRecursively(self.get_temp_dir())

def testCreateLoggingDir(self):
I'm less particular about test names, but I think the rest of our tests are underscored rather than camelCase-- probably best to be consistent.
Done.
official/benchmark/logger_test.py
Outdated
def testLogMetric(self):
  log_dir = tempfile.mkdtemp(dir=self.get_temp_dir())
  log = logger.BenchmarkLogger(log_dir)
  log.logMetric("accuracy", 0.999, global_step=1e4, extras={"name": "value"})
Stated above, but just to record in the proper place: we should take a stance on Tensor input, and make sure we test accordingly.
addressed in the previous comment.
""" | ||
with gfile.GFile( | ||
os.path.join(self._logging_dir, _METRIC_LOG_FILE_NAME), "a") as f: | ||
metric = { |
Can we assign a unique ID to each class instance and include it? That way if a bunch of records get jammed together in the same file (i.e. the default path) it is easy to separate runs later.
I think logging_dir is the flag that the caller should tweak. Similar to the log_dir flag for the summary writer, the caller should make sure to specify a different log dir for each run.
Chatted offline; we actually have a UUID field in the design for record keeping. Will add the ID field in a future change.
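For the record, a rough sketch of what that per-run ID could look like when it lands (the run_id field name and placement are guesses, since that change is not part of this PR):

```python
import json
import uuid


class BenchmarkLogger(object):
  """Illustrative skeleton: tag every metric record with a per-run ID."""

  def __init__(self, logging_dir):
    self._logging_dir = logging_dir
    # One UUID per logger instance, so records from different runs that end
    # up in the same file can still be separated later.
    self._run_id = str(uuid.uuid4())

  def _build_record(self, name, value):
    return json.dumps({"run_id": self._run_id, "name": name, "value": value})
```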
So there's good news and bad news.

👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there.

😕 The bad news is that it appears that one or more commits were authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that here in the pull request.

Note to project maintainer: This is a terminal state, meaning the cla/yes commit status will not change from this state.
…ram." This reverts commit 6d829ca.
Poor Git. Try:
1. Update logger to convert the value type to float to please JSON.
2. Update logger test to call parent test tearDown.
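A small sketch of what that float conversion likely amounts to (illustrative only; the helper name is made up). json.dump cannot serialize numpy scalar types such as float32, so casting to a built-in float first avoids a TypeError:

```python
import json
import numbers


def _to_json_value(value):
  """Casts numeric values (e.g. a numpy float32) to a plain built-in float."""
  if isinstance(value, numbers.Number):
    return float(value)
  return value


print(json.dumps({"value": _to_json_value(0.999)}))  # {"value": 0.999}
```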
See qlzh727@6ad9e0b as a demo for the metric logger.
Btw, the metric log from qlzh727@6ad9e0b looks like the below: {"name": "train_accuracy", "timestamp": "2018-03-19T11:36:57.969879Z", "value": 0.9988235235214233, "extras": null, "unit": null, "global_step": 25604}
That is great-- thanks. I do feel like the example makes the case stronger that the logger should live in utils, since one might want to use it to write arbitrary JSON logs separate from benchmarking. But very nice regardless.
Done. Moved to official/utils/logging.
Thanks-- LGTM, though we should still try to clean up the git record so that the CLA is clear.
Currently it only logs the metric, which we could use in any of the hooks.