Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow larger block size in HIP #3165

Merged
merged 10 commits into from
Jul 11, 2020
Merged

Conversation

masterleinad
Copy link
Contributor

This is a rebase of #3102. In that pull request, we saw that we had failing tests in Debug mode.
It turns out that it is sufficient to just force NDEBUG (which is what this pull request does). This is always what we were observing in ArborX.
Obviously, this is not a long-term solution but I would be happy going with this workaround and move forward.

@masterleinad
Copy link
Contributor Author

Retest this please.

@crtrott crtrott added the Blocks Promotion Overview issue for release-blocking bugs label Jul 8, 2020
// FIXME_HIP Not defining NDEBUG makes some tests fail.
#ifndef NDEBUG
#define NDEBUG
#endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does that propagate to user code? I am not sure everybody would be happy with this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment, this is needed for this pull request for all the tests to pass. IMHO it doesn't help if we go only half of the way and don't enforce it in user code just to break there code if they don't do it themselves.
Of course, it's a judgment call if this pull request rectifies messing with NDEBUG in user code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My problem is that we silently change the way assert works in user code. From a user perspective, it's going to be a headache to understand what's going on.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does propagate to user code. whats the actual problem solved by this? Is it maybe largely a register limit issue?

@@ -207,7 +207,7 @@ struct HIPParallelLaunch<DriverType, Kokkos::LaunchBounds<0, 0>,
HIP_SAFE_CALL(hipMemcpyAsync(d_driver, &driver, sizeof(DriverType),
hipMemcpyHostToDevice,
hip_instance->m_stream));
hip_parallel_launch_local_memory<DriverType>
hip_parallel_launch_local_memory<DriverType, 1024, 1>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uhm that is weird guys. You are saying for this now that the launch HAS to work with block size of 1024. And thus you HAVE to be below a certain register limit.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the very least, please don't put large unnamed constants in your code without at least a comment explaining them.

&hip_parallel_launch_local_memory<DriverType>));
hipFuncGetAttributes(
&attr, reinterpret_cast<void *>(
&hip_parallel_launch_local_memory<DriverType, 1024, 1>));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above.

Copy link

@dhollman dhollman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These pull requests are really hard to review without some explanation of what's copied over from CUDA and what's new code. It's probably good to comment those things anyway, and it will make it much more reasonable to review these sorts of things without asking for changes that really should be made in both places (and it will make it easier to fix those things in both places when we do fix them at some later point)

Comment on lines +92 to +93
#if (HIP_VERSION_MAJOR > 3 || HIP_VERSION_MINOR > 5 || \
HIP_VERSION_PATCH >= 20226)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ummm.... this will be true for HIP 2.7 or whatever (for instance; I know that probably doesn't exist) or 3.4 with version patch 31415. I doubt that's the intention.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just temporary. We only support the latest version of the compiler so HIP 2.7 or 3.4 won't work anyway. As soon as HIP 3.6 is released, we will require it and this can be removed.

@@ -207,7 +207,7 @@ struct HIPParallelLaunch<DriverType, Kokkos::LaunchBounds<0, 0>,
HIP_SAFE_CALL(hipMemcpyAsync(d_driver, &driver, sizeof(DriverType),
hipMemcpyHostToDevice,
hip_instance->m_stream));
hip_parallel_launch_local_memory<DriverType>
hip_parallel_launch_local_memory<DriverType, 1024, 1>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the very least, please don't put large unnamed constants in your code without at least a comment explaining them.

Copy link
Member

@crtrott crtrott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am ok with it now. I assume the entire backend is under the comment that we have workarounds for issues in the tool chain.

@crtrott crtrott merged commit bb2bcde into kokkos:develop Jul 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Blocks Promotion Overview issue for release-blocking bugs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants