Fix release assert #6696

Merged 11 commits on Nov 10, 2020


jrhemstad
Contributor

The release_assert macro is used in device code to signal that an error occurred and kill the kernel.

There were two issues with release_assert:

  1. It was incorrectly guarded with a macro that doesn't exist, __CUDACC_ARCH__ (the macro defined during device compilation is __CUDA_ARCH__).
    • This means release_assert has just been a no-op this whole time.
  2. It incorrectly used the __assertfail built-in instead of the __assert_fail built-in.
    • This was never caught because the broken ifdef above meant the call was never compiled.
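To illustrate the first issue, here is a minimal host-only sketch of how a misspelled guard macro silently turns an assert-style macro into a no-op. It is not the cudf implementation: `DEMO_DEVICE_ARCH`, `DEMO_DEVICE_ACRH`, `broken_assert`, `fixed_assert`, and `record_failure` are hypothetical names invented for this example, and the failure handler only counts failures instead of killing anything.

```cpp
#include <cstdio>

// Hypothetical stand-in for a macro the toolchain would define for us.
#define DEMO_DEVICE_ARCH 700

static int g_failures = 0;

// Hypothetical failure handler: counts and reports failed checks.
static void record_failure(const char* expr) {
  ++g_failures;
  std::fprintf(stderr, "assertion failed: %s\n", expr);
}

// Misspelled guard: DEMO_DEVICE_ACRH is never defined, so the preprocessor
// silently selects the no-op branch and the compiler never sees the check.
#ifdef DEMO_DEVICE_ACRH
#define broken_assert(e) ((e) ? static_cast<void>(0) : record_failure(#e))
#else
#define broken_assert(e) (static_cast<void>(0))
#endif

// Correct guard: the checking branch is compiled.
#ifdef DEMO_DEVICE_ARCH
#define fixed_assert(e) ((e) ? static_cast<void>(0) : record_failure(#e))
#else
#define fixed_assert(e) (static_cast<void>(0))
#endif

// Returns how many failures a false condition produces with each macro.
int failures_from_broken() {
  g_failures = 0;
  broken_assert(1 == 2);  // expands to nothing useful
  return g_failures;
}

int failures_from_fixed() {
  g_failures = 0;
  fixed_assert(1 == 2);  // calls record_failure
  return g_failures;
}
```

Because an undefined macro in an `#ifdef` is not an error, nothing warns about the typo. The same mechanism hides mistakes inside the disabled branch, since code that is preprocessed away is never checked by the compiler.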

This should have been caught by tests, but it seems that the test that used to verify this behavior was dropped, likely sometime during the libcudf++ transition.

// Unsupported gdf_dtypes in device code should set the appropriate error code
// and invalidate the device context
TEST_F(DispatcherDeathTest, DeviceDispatchFunctor) {
  testing::FLAGS_gtest_death_test_style = "threadsafe";
  thrust::device_vector<bool> result(1);
  auto call_kernel = [&result](gdf_dtype t) {
    dispatch_test_kernel<<<1, 1>>>(t, result.data().get());
    auto error_code = cudaDeviceSynchronize();
    // Kernel should fail with `cudaErrorAssert` on an unsupported gdf_dtype.
    // This error invalidates the current device context, so we need to kill
    // the current process. Running with EXPECT_DEATH spawns a new process for
    // each attempted kernel launch.
    EXPECT_EQ(cudaErrorAssert, error_code);
    exit(-1);
  };
  for (auto const& t : unsupported_dtypes) {
    EXPECT_DEATH(call_kernel(t), "");
  }
}

This PR corrects release_assert and adds tests verifying that it works as expected.
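The comments in the dropped test explain why each check runs in its own process: a fatal assert takes the whole context down with it. A rough host-only analogy of that pattern, assuming a POSIX system, is sketched here. It is not the test added by this PR and involves no CUDA; `dies_with_abort` and `checked_dispatch` are hypothetical helpers, and `std::abort` stands in for the device-side assert.

```cpp
#include <csignal>
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs `fn` in a forked child process and reports
// whether that child was terminated by SIGABRT. The parent survives either way.
template <typename F>
bool dies_with_abort(F fn) {
  pid_t pid = fork();
  if (pid < 0) return false;  // fork failed; treat as "did not die"
  if (pid == 0) {
    fn();      // child: run the code under test
    _exit(0);  // reached only if fn returned normally
  }
  int status = 0;
  if (waitpid(pid, &status, 0) != pid) return false;
  return WIFSIGNALED(status) && WTERMSIG(status) == SIGABRT;
}

// Hypothetical dispatcher: aborts on type ids outside an assumed range 0..3.
inline void checked_dispatch(int type_id) {
  if (type_id < 0 || type_id > 3) std::abort();
}
```

For example, `dies_with_abort([] { checked_dispatch(42); })` should report that the child aborted, while a supported id such as `1` should not. GoogleTest's EXPECT_DEATH applies the same isolation idea, with the "threadsafe" style re-executing the test binary for each death test.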

@GPUtester
Collaborator

Please update the changelog in order to start CI tests.

View the gpuCI docs here.

Contributor

@karthikeyann karthikeyann left a comment


All this time I thought it was supposed to be a no-op!
Great catch.

@codecov

codecov bot commented Nov 6, 2020

Codecov Report

Merging #6696 (4db8ed9) into branch-0.17 (8dd9323) will increase coverage by 0.14%.
The diff coverage is n/a.


@@               Coverage Diff               @@
##           branch-0.17    #6696      +/-   ##
===============================================
+ Coverage        82.18%   82.32%   +0.14%     
===============================================
  Files               94       94              
  Lines            15468    15585     +117     
===============================================
+ Hits             12712    12830     +118     
+ Misses            2756     2755       -1     
| Impacted Files | Coverage Δ |
|---|---|
| python/cudf/cudf/_fuzz_testing/fuzzer.py | 0.00% <0.00%> (ø) |
| python/cudf/cudf/utils/hash_vocab_utils.py | 100.00% <0.00%> (ø) |
| ...ython/dask_cudf/dask_cudf/io/tests/test_parquet.py | 100.00% <0.00%> (ø) |
| python/dask_cudf/dask_cudf/io/parquet.py | 91.96% <0.00%> (+0.60%) ⬆️ |
| python/cudf/cudf/core/abc.py | 91.48% <0.00%> (+4.25%) ⬆️ |
| python/cudf/cudf/utils/gpu_utils.py | 58.53% <0.00%> (+4.87%) ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@trivialfis
Member

Strangely, I get an undefined identifier error with assert_fail.

@jrhemstad
Contributor Author

rerun tests

5 participants