
[tf.data] Support eager mode benchmarking #46320

Merged
merged 17 commits into tensorflow:master from kvignesh1420:eager-benchmark on Jan 26, 2021

Conversation

kvignesh1420 (Member) commented Jan 10, 2021

This PR adds support for eager-execution benchmarks, selected based on the benchmarking context. This is achieved by checking the eager_execution() value of the benchmarking context and iterating over the dataset eagerly in the DatasetBenchmarkBase class.
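
A minimal sketch of this branching, assuming hypothetical helper names (`_run_eager_benchmark` and `_run_graph_benchmark` are illustrative, not the exact code in this PR):

```python
import time

import tensorflow as tf


def run_benchmark(dataset, num_elements, iters=5):
  # Branch on the execution context and time one pass over the dataset.
  if tf.executing_eagerly():
    return _run_eager_benchmark(dataset, num_elements, iters)
  return _run_graph_benchmark(dataset, num_elements, iters)


def _run_eager_benchmark(dataset, num_elements, iters):
  # In eager mode, a tf.data.Dataset is a plain Python iterable.
  deltas = []
  for _ in range(iters):
    start = time.perf_counter()
    for _ in dataset:
      pass
    deltas.append(time.perf_counter() - start)
  return min(deltas) / num_elements  # wall time per element


def _run_graph_benchmark(dataset, num_elements, iters):
  # In graph mode, drain the dataset through an iterator in a Session.
  iterator = tf.compat.v1.data.make_initializable_iterator(dataset)
  next_element = iterator.get_next()
  deltas = []
  with tf.compat.v1.Session() as sess:
    for _ in range(iters):
      sess.run(iterator.initializer)  # rewind for each timed trial
      start = time.perf_counter()
      while True:
        try:
          sess.run(next_element)
        except tf.errors.OutOfRangeError:
          break
      deltas.append(time.perf_counter() - start)
  return min(deltas) / num_elements
```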

Additionally, the ListFilesBenchmark in python/data/benchmarks/list_files_benchmark.py has been refactored to use the DatasetBenchmarkBase class.

A sample comparison for RangeBenchmark is as follows:

GRAPH MODE:

entry {
  name: "RangeBenchmark.modeling_on.graph"
  iters: 5
  wall_time: 1.7887322902679443e-07
  extras {
    key: "num_elements"
    value {
      double_value: 10000000.0
    }
  }
}

entry {
  name: "RangeBenchmark.modeling_off.graph"
  iters: 5
  wall_time: 1.541191530227661e-07
  extras {
    key: "num_elements"
    value {
      double_value: 50000000.0
    }
  }
}

EAGER MODE:

entry {
  name: "RangeBenchmark.modeling_on.eager"
  iters: 5
  wall_time: 2.0390429496765137e-07
  extras {
    key: "num_elements"
    value {
      double_value: 10000000.0
    }
  }
}

entry {
  name: "RangeBenchmark.modeling_off.eager"
  iters: 5
  wall_time: 1.8014323711395263e-07
  extras {
    key: "num_elements"
    value {
      double_value: 50000000.0
    }
  }
}

For ListFilesBenchmark:

GRAPH MODE:
entry {
  name: "ListFilesBenchmark.nested_directory(1024*16).graph"
  iters: 3
  wall_time: 8.092029020190239e-07
  extras {
    key: "num_elements"
    value {
      double_value: 2048.0
    }
  }
}

EAGER MODE:
entry {
  name: "ListFilesBenchmark.nested_directory(1024*16).eager"
  iters: 3
  wall_time: 7.745111361145973e-07
  extras {
    key: "num_elements"
    value {
      double_value: 2048.0
    }
  }
}

NOTE: Description has been updated as per review comments.

@google-ml-butler google-ml-butler bot added the size:M CL Change Size: Medium label Jan 10, 2021
@google-cla google-cla bot added the cla: yes label Jan 10, 2021
@kvignesh1420 kvignesh1420 changed the title from "[tf.data] Add eager benchmark for choosing fastest benchmark" to "[tf.data] Add eager benchmark for choosing fastest dataset" Jan 10, 2021
@gbaned gbaned self-assigned this Jan 11, 2021
@gbaned gbaned added the comp:data tf.data related issues label Jan 11, 2021
@gbaned gbaned added this to Assigned Reviewer in PR Queue via automation Jan 11, 2021
@gbaned gbaned requested a review from aaudiber January 11, 2021 04:35
kvignesh1420 (Member, Author) commented Jan 12, 2021

cc: @aaudiber for review.

@gbaned gbaned added the awaiting review Pull request awaiting review label Jan 20, 2021
aaudiber (Contributor) commented

I don't think eager benchmarks are worth maintaining, since the backend dataset implementation works the same in both eager and graph mode. WDYT @jsimsa?

jsimsa (Contributor) commented Jan 20, 2021

I think there is value in being able to run benchmarks in eager mode, but this support should certainly not result in duplicating benchmark code. Instead, the benchmark base should be modified to make it possible to run a benchmark both in graph mode and eagerly. Which mode to use could be based on a parameter passed to the run_benchmark method in python/data/benchmarks/benchmark_base.py, defaulting to the mode of the context in which the benchmark is executed.
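
A hypothetical sketch of that signature (the `eager` parameter name and its default are illustrative, not an actual TensorFlow API, and the helpers reuse the hypothetical names from the sketch in the PR description):

```python
import tensorflow as tf


class DatasetBenchmarkBase(tf.test.Benchmark):

  def run_benchmark(self, dataset, num_elements, iters=5, eager=None):
    if eager is None:
      # Default to the mode of the context in which the benchmark runs.
      eager = tf.executing_eagerly()
    if eager:
      return self._run_eager_benchmark(dataset, num_elements, iters)
    return self._run_graph_benchmark(dataset, num_elements, iters)
```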

kvignesh1420 (Member, Author) commented Jan 20, 2021

I see that not all of the benchmarks located in python/data/benchmarks/ and python/data/experimental/benchmarks/ subclass DatasetBenchmarkBase. Thus, making changes to the DatasetBenchmarkBase class will cover some of them but not all. This is because those benchmarks need custom _benchmark logic instead of the default logic available in DatasetBenchmarkBase.

As per @jsimsa's suggestion, I can modify the run_benchmark method of the DatasetBenchmarkBase class to take a parameter named eager, which will be either True or False. It will default to False so that it doesn't disturb the current benchmark scenarios. I can then add conditional logic to handle the eager=True case within that method. The benchmarks which inherit from DatasetBenchmarkBase can then call the run_and_report_benchmark method with the option of benchmarking eagerly.

I can do the same for the benchmarks which do not inherit from DatasetBenchmarkBase by modifying their _benchmark methods to take the eager parameter and branching on it there as well.

What do you think, @jsimsa @aaudiber?

UPDATE: The benchmarks do not take the eager parameter but execute in either eager or graph mode based on context.
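
A sketch of how the context-based behavior described in this update could be exercised from a TensorFlow source checkout (`benchmark_range` is a hypothetical method name, and the import path assumes the benchmark module is importable):

```python
import tensorflow as tf

from tensorflow.python.data.benchmarks import range_benchmark

bench = range_benchmark.RangeBenchmark()

# TF2 default: an eager context, so the dataset is iterated eagerly.
bench.benchmark_range()  # hypothetical method name

# Entering a graph makes the same benchmark run in graph mode instead.
with tf.Graph().as_default():
  bench.benchmark_range()
```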

jsimsa (Contributor) commented Jan 20, 2021

Thanks @kvignesh1420. Your plan for modifying DatasetBenchmarkBase sounds good to me. It would also be good to understand why some benchmarks need custom _benchmark logic and whether it would be possible to refactor things so that all benchmarks inherit from DatasetBenchmarkBase.

kvignesh1420 (Member, Author) commented

@jsimsa thanks for the go-ahead. I will update the PR with the required changes and try to refactor things.

@kvignesh1420 kvignesh1420 changed the title from "[tf.data] Add eager benchmark for choosing fastest dataset" to "[tf.data] Support eager mode benchmarking for tf.data.Datasets" Jan 21, 2021
@kvignesh1420 kvignesh1420 changed the title from "[tf.data] Support eager mode benchmarking for tf.data.Datasets" to "[tf.data] Support eager mode benchmarking" Jan 21, 2021
kvignesh1420 (Member, Author) commented Jan 21, 2021

I have modified the DatasetBenchmarkBase class to support eager iteration over the datasets depending on the context. Additionally, the ListFilesBenchmark in python/data/benchmarks/list_files_benchmark.py has been refactored to use the DatasetBenchmarkBase class. The refactor changes the benchmark's current output format to be in sync with the style of the other benchmarks.
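
For context, a rough sketch of the kind of dataset the refactored ListFilesBenchmark hands to the base class; the tree built here (16 directories x 128 files = 2048 files, matching num_elements in the sample output above) is a small stand-in for the benchmark's larger nested_directory(1024*16) configuration:

```python
import os
import tempfile

import tensorflow as tf

# Build a nested directory tree of 16 subdirectories with 128 files each.
tmp_dir = tempfile.mkdtemp()
for i in range(16):
  sub_dir = os.path.join(tmp_dir, "dir_%02d" % i)
  os.makedirs(sub_dir)
  for j in range(128):
    open(os.path.join(sub_dir, "file_%03d.txt" % j), "w").close()

# After the refactor, a dataset like this is timed through
# DatasetBenchmarkBase.run_and_report_benchmark rather than a custom loop.
dataset = tf.data.Dataset.list_files(os.path.join(tmp_dir, "*", "*"))
```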

cc: @jsimsa @aaudiber let me know how it looks. I can take up the benchmarks in data/experimental/benchmarks after this.

PR Queue automation moved this from Assigned Reviewer to Reviewer Requested Changes Jan 22, 2021
jsimsa (Contributor) left a comment

Thank you. As per one of my comments, we want to test that the new implementation works as expected before merging the PR (and I provided you with instructions on how to do so).

(Review comment on tensorflow/python/data/benchmarks/benchmark_base.py: outdated, resolved)
kvignesh1420 (Member, Author) commented Jan 22, 2021

@jsimsa I have modified the run_and_report_benchmark method in DatasetBenchmarkBase to append the mode in which the benchmark is running to the benchmark name. Also, a sample benchmark result has been added to the description. If you want additional results from the eager mode benchmarks and their graph counterparts, I can attach files with the log outputs. Let me know.
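
A minimal sketch of that naming scheme, using the standard tf.test.Benchmark.report_benchmark API (this method body and signature are illustrative, not the exact PR code):

```python
import tensorflow as tf


class DatasetBenchmarkBase(tf.test.Benchmark):

  def run_and_report_benchmark(self, name, wall_time, iters, num_elements):
    # Append the execution mode so eager and graph runs are distinguishable,
    # e.g. "RangeBenchmark.modeling_on.graph" in the sample output above.
    mode = "eager" if tf.executing_eagerly() else "graph"
    self.report_benchmark(
        name="{}.{}".format(name, mode),
        wall_time=wall_time,
        iters=iters,
        extras={"num_elements": num_elements})
```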

jsimsa (Contributor) left a comment

This looks great. Thank you for the changes. Left one minor comment. Other than that it looks good.

(Review comment on tensorflow/python/data/benchmarks/benchmark_base.py: outdated, resolved)
@google-ml-butler google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Jan 22, 2021
PR Queue automation moved this from Reviewer Requested Changes to Approved by Reviewer Jan 22, 2021
@kokoro-team kokoro-team removed kokoro:force-run Tests on submitted change labels Jan 22, 2021
@rthadur rthadur added ready to pull PR ready for merge process and removed ready to pull PR ready for merge process awaiting review Pull request awaiting review labels Jan 22, 2021
@gbaned gbaned added ready to pull PR ready for merge process and removed ready to pull PR ready for merge process labels Jan 26, 2021
@copybara-service copybara-service bot merged commit e3e5a24 into tensorflow:master Jan 26, 2021
PR Queue automation moved this from Approved by Reviewer to Merged Jan 26, 2021
@kvignesh1420 kvignesh1420 deleted the eager-benchmark branch January 26, 2021 17:40