
[tf.data] Support eager mode benchmarking #46320

Merged
merged 17 commits into tensorflow:master from kvignesh1420:eager-benchmark on Jan 26, 2021

Conversation

kvignesh1420 (Member) commented Jan 10, 2021

This PR adds support for eager-execution benchmarks, selected based on the benchmarking context. This is achieved by checking the eager_execution() value of the benchmarking context and iterating over the dataset eagerly in the DatasetBenchmarkBase class.
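
A minimal sketch of this branching, assuming hypothetical helper names (`_run_eager_benchmark` and `_run_graph_benchmark` are illustrative, not the exact code in this PR):

```python
import time

import tensorflow as tf


def run_benchmark(dataset, num_elements, iters=5):
  # Branch on the execution context and time one pass over the dataset.
  if tf.executing_eagerly():
    return _run_eager_benchmark(dataset, num_elements, iters)
  return _run_graph_benchmark(dataset, num_elements, iters)


def _run_eager_benchmark(dataset, num_elements, iters):
  # In eager mode, a tf.data.Dataset is a plain Python iterable.
  deltas = []
  for _ in range(iters):
    start = time.perf_counter()
    for _ in dataset:
      pass
    deltas.append(time.perf_counter() - start)
  return min(deltas) / num_elements  # wall time per element


def _run_graph_benchmark(dataset, num_elements, iters):
  # In graph mode, drain the dataset through an iterator in a Session.
  iterator = tf.compat.v1.data.make_initializable_iterator(dataset)
  next_element = iterator.get_next()
  deltas = []
  with tf.compat.v1.Session() as sess:
    for _ in range(iters):
      sess.run(iterator.initializer)  # rewind for each timed trial
      start = time.perf_counter()
      while True:
        try:
          sess.run(next_element)
        except tf.errors.OutOfRangeError:
          break
      deltas.append(time.perf_counter() - start)
  return min(deltas) / num_elements
```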

Additionally, the ListFilesBenchmark in python/data/benchmarks/list_files_benchmark.py has been refactored to use the DatasetBenchmarkBase class.

A sample comparison for RangeBenchmark is as follows:

GRAPH MODE:

entry {
  name: "RangeBenchmark.modeling_on.graph"
  iters: 5
  wall_time: 1.7887322902679443e-07
  extras {
    key: "num_elements"
    value {
      double_value: 10000000.0
    }
  }
}

entry {
  name: "RangeBenchmark.modeling_off.graph"
  iters: 5
  wall_time: 1.541191530227661e-07
  extras {
    key: "num_elements"
    value {
      double_value: 50000000.0
    }
  }
}

EAGER MODE:

entry {
  name: "RangeBenchmark.modeling_on.eager"
  iters: 5
  wall_time: 2.0390429496765137e-07
  extras {
    key: "num_elements"
    value {
      double_value: 10000000.0
    }
  }
}

entry {
  name: "RangeBenchmark.modeling_off.eager"
  iters: 5
  wall_time: 1.8014323711395263e-07
  extras {
    key: "num_elements"
    value {
      double_value: 50000000.0
    }
  }
}

For ListFilesBenchmark:

GRAPH MODE:
entry {
  name: "ListFilesBenchmark.nested_directory(1024*16).graph"
  iters: 3
  wall_time: 8.092029020190239e-07
  extras {
    key: "num_elements"
    value {
      double_value: 2048.0
    }
  }
}

EAGER MODE:
entry {
  name: "ListFilesBenchmark.nested_directory(1024*16).eager"
  iters: 3
  wall_time: 7.745111361145973e-07
  extras {
    key: "num_elements"
    value {
      double_value: 2048.0
    }
  }
}

NOTE: Description has been updated as per review comments.

@google-ml-butler google-ml-butler bot added the size:M CL Change Size: Medium label Jan 10, 2021
@google-cla google-cla bot added the cla: yes label Jan 10, 2021
@kvignesh1420 kvignesh1420 changed the title from "[tf.data] Add eager benchmark for choosing fastest benchmark" to "[tf.data] Add eager benchmark for choosing fastest dataset" Jan 10, 2021
@gbaned gbaned self-assigned this Jan 11, 2021
@gbaned gbaned added the comp:data tf.data related issues label Jan 11, 2021
@gbaned gbaned added this to Assigned Reviewer in PR Queue via automation Jan 11, 2021
@gbaned gbaned requested a review from aaudiber January 11, 2021 04:35
kvignesh1420 (Member, Author) commented Jan 12, 2021

cc: @aaudiber for review.

@gbaned gbaned added the awaiting review Pull request awaiting review label Jan 20, 2021
aaudiber (Contributor) commented

I don't think eager benchmarks are worth maintaining, since the backend dataset implementation works the same in both eager and graph mode. WDYT @jsimsa?

jsimsa (Contributor) commented Jan 20, 2021

I think there is value in being able to run benchmarks in eager mode, but this support should certainly not result in duplicating benchmark code. Instead, the benchmark base should be modified to make it possible to run a benchmark both in graph mode and eagerly. Which mode to use could be based on a parameter passed to the run_benchmark method in python/data/benchmarks/benchmark_base.py, defaulting to the mode of the context in which the benchmark is executed.
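
A hypothetical sketch of that signature (the `eager` parameter name and its default are illustrative, not an actual TensorFlow API, and the helpers reuse the hypothetical names from the sketch in the PR description):

```python
import tensorflow as tf


class DatasetBenchmarkBase(tf.test.Benchmark):

  def run_benchmark(self, dataset, num_elements, iters=5, eager=None):
    if eager is None:
      # Default to the mode of the context in which the benchmark runs.
      eager = tf.executing_eagerly()
    if eager:
      return self._run_eager_benchmark(dataset, num_elements, iters)
    return self._run_graph_benchmark(dataset, num_elements, iters)
```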

kvignesh1420 (Member, Author) commented Jan 20, 2021

I see that not all of the benchmarks located in python/data/benchmarks/ and python/data/experimental/benchmarks/ subclass DatasetBenchmarkBase. Thus, making changes to the DatasetBenchmarkBase class will cover some of them but not all. This is because those benchmarks need custom _benchmark logic instead of the default logic available in DatasetBenchmarkBase.

As per @jsimsa's suggestion, I can modify the run_benchmark method of the DatasetBenchmarkBase class to take a parameter named eager, which will be either True or False. It will default to False so that it doesn't disturb the current benchmark scenarios. I can then add conditional logic to handle the eager=True case within that method. The benchmarks which inherit from DatasetBenchmarkBase can then call the run_and_report_benchmark method with the option of benchmarking eagerly.

I can do the same for the benchmarks which do not inherit from DatasetBenchmarkBase by modifying their _benchmark methods to take the eager parameter and branching on it there as well.

What do you think, @jsimsa @aaudiber?

UPDATE: The benchmarks do not take the eager parameter but execute in either eager or graph mode based on context.
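
A sketch of how the context-based behavior described in this update could be exercised from a TensorFlow source checkout (`benchmark_range` is a hypothetical method name, and the import path assumes the benchmark module is importable):

```python
import tensorflow as tf

from tensorflow.python.data.benchmarks import range_benchmark

bench = range_benchmark.RangeBenchmark()

# TF2 default: an eager context, so the dataset is iterated eagerly.
bench.benchmark_range()  # hypothetical method name

# Entering a graph makes the same benchmark run in graph mode instead.
with tf.Graph().as_default():
  bench.benchmark_range()
```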

jsimsa (Contributor) commented Jan 20, 2021

Thanks @kvignesh1420. Your plan for modifying DatasetBenchmarkBase sounds good to me. It would also be good to understand why some benchmarks need custom _benchmark logic and whether it would be possible to refactor things so that all benchmarks inherit from DatasetBenchmarkBase.

kvignesh1420 (Member, Author) commented

@jsimsa thanks for the go-ahead. I will update the PR with the required changes and try to refactor things.

@kvignesh1420 kvignesh1420 changed the title from "[tf.data] Add eager benchmark for choosing fastest dataset" to "[tf.data] Support eager mode benchmarking for tf.data.Datasets" Jan 21, 2021
@kvignesh1420 kvignesh1420 changed the title from "[tf.data] Support eager mode benchmarking for tf.data.Datasets" to "[tf.data] Support eager mode benchmarking" Jan 21, 2021
kvignesh1420 (Member, Author) commented Jan 21, 2021

I have modified the DatasetBenchmarkBase class to support eager iteration over the datasets depending on the context. Additionally, the ListFilesBenchmark in python/data/benchmarks/list_files_benchmark.py has been refactored to use the DatasetBenchmarkBase class. The refactor changes the benchmark's current output format to be in sync with the style of the other benchmarks.
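
For context, a rough sketch of the kind of dataset the refactored ListFilesBenchmark hands to the base class; the tree built here (16 directories x 128 files = 2048 files, matching num_elements in the sample output above) is a small stand-in for the benchmark's larger nested_directory(1024*16) configuration:

```python
import os
import tempfile

import tensorflow as tf

# Build a nested directory tree of 16 subdirectories with 128 files each.
tmp_dir = tempfile.mkdtemp()
for i in range(16):
  sub_dir = os.path.join(tmp_dir, "dir_%02d" % i)
  os.makedirs(sub_dir)
  for j in range(128):
    open(os.path.join(sub_dir, "file_%03d.txt" % j), "w").close()

# After the refactor, a dataset like this is timed through
# DatasetBenchmarkBase.run_and_report_benchmark rather than a custom loop.
dataset = tf.data.Dataset.list_files(os.path.join(tmp_dir, "*", "*"))
```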

cc: @jsimsa @aaudiber let me know how it looks. I can take up the benchmarks in data/experimental/benchmarks after this.

PR Queue automation moved this from Assigned Reviewer to Reviewer Requested Changes Jan 22, 2021
jsimsa (Contributor) left a comment

Thank you. As per one of my comments, we want to test that the new implementation works as expected before merging the PR (and I provided you with instructions on how to do so).

(Review comment on tensorflow/python/data/benchmarks/benchmark_base.py: outdated, resolved)
kvignesh1420 (Member, Author) commented Jan 22, 2021

@jsimsa I have modified the run_and_report_benchmark method in DatasetBenchmarkBase to append the mode in which the benchmark is running to the benchmark name. Also, a sample benchmark result has been added to the description. If you want additional results from the eager mode benchmarks and their graph counterparts, I can attach files with the log outputs. Let me know.
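
A minimal sketch of that naming scheme, using the standard tf.test.Benchmark.report_benchmark API (this method body and signature are illustrative, not the exact PR code):

```python
import tensorflow as tf


class DatasetBenchmarkBase(tf.test.Benchmark):

  def run_and_report_benchmark(self, name, wall_time, iters, num_elements):
    # Append the execution mode so eager and graph runs are distinguishable,
    # e.g. "RangeBenchmark.modeling_on.graph" in the sample output above.
    mode = "eager" if tf.executing_eagerly() else "graph"
    self.report_benchmark(
        name="{}.{}".format(name, mode),
        wall_time=wall_time,
        iters=iters,
        extras={"num_elements": num_elements})
```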

jsimsa (Contributor) left a comment

This looks great. Thank you for the changes. Left one minor comment. Other than that it looks good.

(Review comment on tensorflow/python/data/benchmarks/benchmark_base.py: outdated, resolved)
@google-ml-butler google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Jan 22, 2021
PR Queue automation moved this from Reviewer Requested Changes to Approved by Reviewer Jan 22, 2021
@kokoro-team kokoro-team removed kokoro:force-run Tests on submitted change labels Jan 22, 2021
@rthadur rthadur added ready to pull PR ready for merge process and removed ready to pull PR ready for merge process awaiting review Pull request awaiting review labels Jan 22, 2021
@gbaned gbaned added ready to pull PR ready for merge process and removed ready to pull PR ready for merge process labels Jan 26, 2021
@copybara-service copybara-service bot merged commit e3e5a24 into tensorflow:master Jan 26, 2021
PR Queue automation moved this from Approved by Reviewer to Merged Jan 26, 2021
@kvignesh1420 kvignesh1420 deleted the eager-benchmark branch January 26, 2021 17:40