
Add new op BytesInUse, similar to MaxBytesInUse #13107

Merged: 4 commits into tensorflow:master, Sep 29, 2017

Conversation

@yaroslavvb (Contributor)

Adding BytesInUse

This is more useful than MaxBytesInUse for getting peak memory for a given session.run call, because the latter gives the maximum memory usage over the lifetime of the allocator, which can span multiple session.run calls and multiple Session objects.
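The distinction can be sketched with a toy allocator model (hypothetical Python, not TensorFlow code): the current-bytes counter falls when memory is freed, while the max-bytes counter is a high-water mark that persists for the allocator's lifetime.

```python
class ToyAllocator:
    """Minimal model of an allocator's memory-stat counters (illustration only)."""

    def __init__(self):
        self.bytes_in_use = 0      # current live bytes: what BytesInUse would report
        self.max_bytes_in_use = 0  # high-water mark: what MaxBytesInUse would report

    def allocate(self, nbytes):
        self.bytes_in_use += nbytes
        self.max_bytes_in_use = max(self.max_bytes_in_use, self.bytes_in_use)

    def deallocate(self, nbytes):
        self.bytes_in_use -= nbytes


alloc = ToyAllocator()
alloc.allocate(300)    # a first, larger session.run
alloc.deallocate(300)  # its temporaries are freed
alloc.allocate(100)    # a second, smaller session.run on the same allocator

print(alloc.bytes_in_use)      # 100 -- reflects only the current run
print(alloc.max_bytes_in_use)  # 300 -- sticky for the allocator's lifetime
```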

@mention-bot

@yaroslavvb, thanks for your PR! By analyzing the history of the files in this pull request, we identified @wujingyue, @lukeiwanski and @tensorflower-gardener to be potential reviewers.

@tensorflow-jenkins (Collaborator)

Can one of the admins verify this patch?

@yaroslavvb (Contributor, Author)

Note that bytes_in_use is also what's used by timeline for memory tracking -- 7b5f590

}
};

// register this op on GPU only, see comment for MaxBytesInUse for reason

Nit: "// Register"


#ifdef TENSORFLOW_USE_SYCL
REGISTER_KERNEL_BUILDER(
Name("MaxBytesInUse").Device(DEVICE_SYCL).HostMemory("out"),

Do you mean "BytesInUse" instead of "MaxBytesInUse"?

@@ -16,6 +16,31 @@ limitations under the License.

namespace tensorflow {

// Op that measures current memory in bytes.
class BytesInUseOp : public MemoryStatsOp {

For consistency, I suggest moving this op after MemoryStatsOp.

self.assertGreaterEqual(max_bytes_in_use, matrix_size_in_bytes * 3)
self.assertLess(max_bytes_in_use, matrix_size_in_bytes * 4)
self.assertGreaterEqual(bytes_in_use, matrix_size_in_bytes * 3)
@wujingyue (Contributor), Sep 18, 2017

The tests you added don't distinguish BytesInUse from MaxBytesInUse. One idea is to launch a second session with a smaller graph. MaxBytesInUse should return the larger size, and BytesInUse should return the smaller size.
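The suggested two-session check can be illustrated with a small simulation (hypothetical code; `stats` and `run_graph` stand in for a real device allocator and session.run calls):

```python
# Hypothetical simulation of the two-session test idea; `stats` stands in for a
# device allocator's counters and `run_graph` for a session.run on that device.
stats = {"in_use": 0, "max_in_use": 0}

def run_graph(peak_bytes, residual_bytes):
    """Pretend to run a graph that touches peak_bytes at its peak and keeps
    residual_bytes allocated afterwards (e.g. fetched outputs)."""
    stats["in_use"] += peak_bytes
    stats["max_in_use"] = max(stats["max_in_use"], stats["in_use"])
    stats["in_use"] -= peak_bytes - residual_bytes

run_graph(peak_bytes=4000, residual_bytes=0)     # first session: larger graph
run_graph(peak_bytes=1000, residual_bytes=1000)  # second session: smaller graph

assert stats["in_use"] == 1000      # BytesInUse: only the current, smaller footprint
assert stats["max_in_use"] == 4000  # MaxBytesInUse: still the historical peak
```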

@yaroslavvb yaroslavvb added the stat:awaiting response Status - Awaiting response from author label Sep 18, 2017
@yaroslavvb (Contributor, Author)

OK, I guess for adding a new test I'll actually need to set up a TensorFlow development env :) I'll ping this thread after updating.

@yaroslavvb (Contributor, Author)

Added a test to make sure intermediate memory usage gets measured correctly, PTAL.

@yaroslavvb yaroslavvb removed the stat:awaiting response Status - Awaiting response from author label Sep 23, 2017
@sb2nov sb2nov added the awaiting review Pull request awaiting review label Sep 26, 2017
@sb2nov (Contributor) commented Sep 26, 2017

Jenkins, test this please.

# intermediate result allocates 1 matrix, max usage is at least 2
self.assertGreaterEqual(bytes_in_use, matrix_size_in_bytes * 1)
self.assertLess(bytes_in_use, matrix_size_in_bytes * 2)
self.assertGreaterEqual(max_bytes_in_use, matrix_size_in_bytes * 2)

Can we make this more strict? max_bytes_in_use should be >= matrix_size_in_bytes * 3 because, as you said, it's sticky to the allocator.

@yaroslavvb (Contributor, Author), Sep 27, 2017

I left it that way because I wasn't sure about the guarantees on the lifetime of the allocator. If the allocator gets reset between run calls, this test would be too strict. I'll assume the allocator's lifetime is guaranteed to span two session.run calls and update this in a bit.
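The concern can be made concrete with a toy counter model (hypothetical code, not TensorFlow): if the allocator were reset between the two run calls, its high-water mark would be wiped and the stricter bound on max_bytes_in_use would no longer hold.

```python
class ToyAllocator:
    """Toy counter model of an allocator; not TensorFlow code."""
    def __init__(self):
        self.bytes_in_use = 0
        self.max_bytes_in_use = 0
    def allocate(self, n):
        self.bytes_in_use += n
        self.max_bytes_in_use = max(self.max_bytes_in_use, self.bytes_in_use)
    def free(self, n):
        self.bytes_in_use -= n
    def reset(self):
        self.__init__()

MATRIX = 1 << 20  # stand-in for matrix_size_in_bytes

alloc = ToyAllocator()
alloc.allocate(3 * MATRIX)  # first run peaks at three matrices' worth of memory
alloc.free(3 * MATRIX)
assert alloc.max_bytes_in_use >= 3 * MATRIX  # holds while the allocator persists

alloc.reset()               # a reset between run calls wipes the high-water mark
alloc.allocate(2 * MATRIX)  # second run with a smaller peak
assert alloc.max_bytes_in_use < 3 * MATRIX   # the stricter test bound now fails
```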

@wujingyue (Contributor) left a comment:

LGTM with minor comments.

@sb2nov (Contributor) commented Sep 28, 2017

Jenkins, test this please.

@sb2nov sb2nov self-assigned this Sep 28, 2017
@yaroslavvb (Contributor, Author)

macOS failures are unrelated to this CL (GCS cloud tests).

@sb2nov (Contributor) commented Sep 28, 2017

That is a known flaky test, so re-running the tests.

Jenkins, test this please.

@sb2nov sb2nov merged commit 0b13150 into tensorflow:master Sep 29, 2017
@yaroslavvb yaroslavvb deleted the bytes_in_use branch September 29, 2017 23:30
copybara-service bot pushed a commit that referenced this pull request May 29, 2024
Imported from GitHub PR openxla/xla#13107

Copybara import of the project:

--
6555b6d216bb18b1f98eb36c5dac3f58f4d09c05 by mmakevic <Milica.Makevic@amd.com>:

Fix reduce_row_vectorized.hlo.test

Merging this change closes #13107

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#13107 from ROCm:ci_reduce_row_vectorized 6555b6d216bb18b1f98eb36c5dac3f58f4d09c05
PiperOrigin-RevId: 638442671
copybara-service bot pushed a commit that referenced this pull request May 30, 2024
Imported from GitHub PR openxla/xla#13107

Copybara import of the project:

--
6555b6d216bb18b1f98eb36c5dac3f58f4d09c05 by mmakevic <Milica.Makevic@amd.com>:

Fix reduce_row_vectorized.hlo.test

Merging this change closes #13107

PiperOrigin-RevId: 638453162