Get people to stop improperly using AT_ASSERT #20287

ezyang · 2019-05-08T21:05:52Z

Somehow, we have ended up in a situation where a lot of people in the PyTorch codebase are using AT_ASSERT when they should be using AT_CHECK (just do a grep for AT_ASSERT in aten/src/ATen/native, I audited the first five files and every occurrence of AT_ASSERT was wrong). Just to review the intended semantics:

AT_ASSERT means "internal error, someone on the PyTorch dev team fucked up." If an AT_ASSERT triggers, that's a bug, and you should report a bug to PyTorch. In fact, the error message on AT_ASSERT says exactly this.
AT_CHECK means "user error, report a message to the user." Users might trigger AT_CHECK and that's fine, it just means they misused the API. Give them a nice, human-friendly message, and let them try something else in their Jupyter session, whatever,

Use of AT_ASSERT when you mean AT_CHECK is bad for a few reasons:

At some point in the future, we may legitimately want to add a compilation mode that disables asserts (since, in principle, they should never be triggered). We cannot do this if half of the asserts are legitimate user input validation checks; disabling them will lead to hard to understand crashes on this "optimized" build.
When an improperly tagged AT_ASSERT fails, we get spurious bug reports where users report a bug to PyTorch, because the error message told them to. Examples: C++ API, IValue toTensor() didn't work #13503, Tensor type mismatch, caller expects elements to be int, while tensor contains float #13052, isTensorList() ASSERT FAILED #17911
I hate it!

Let's do some renaming to make the distinction clearer.

We propose the final state be just two macros (I'm lying; there's also an IndexError macro; I intend to preserve it but most people won't use it):

AT_INTERNAL_ASSERT(cond, ...); // panic if cond is not true; with optional message after
AT_CHECK(cond, ...); // error if cond is not true; with optional message after

Translation guide:

AT_ASSERT(cond) ==> AT_INTERNAL_ASSERT(cond)
AT_ASSERTM(cond, msg) ==> AT_INTERNAL_ASSERT(cond, msg)
AT_CHECK(cond, msg) ==> AT_CHECK(cond, msg)
AT_ERROR(msg) ==> AT_CHECK(false, msg)

I'm OK with bikeshedding names.

For BC-reasons, we will need to retain the old macros with deprecation warnings. We'll use a similar strategy to AT_DISPATCH_ here. The transition won't be done all in one go; at the same time we'll audit AT_ASSERT sites to see if they really are internal errors or not.

The text was updated successfully, but these errors were encountered:

ezyang · 2019-05-08T21:07:18Z

cc @gchanan

ezyang · 2019-05-08T21:11:56Z

Proposal was modified to use AT_ instead of C10_ on the grounds that it is easier to type.

gchanan · 2019-05-08T21:11:57Z

lgtm.

smessmer · 2019-05-08T21:12:12Z

I fully agree we should fix this and choose names that better describe what these do.

Note that it's not always obvious if AT_ASSERT or AT_CHECK is the right choice.
For example, in my current work on the operator library, say a kernel uses the registration API wrongly. The registration API asserts against this and should (probably?) use AT_CHECK because the bad kernel could be a custom kernel supplied by the user. However, for internal kernels that are part of PyTorch core, you could also argue it should be AT_ASSERT.

ezyang · 2019-05-08T21:15:35Z

For example, in my current work on the operator library, say a kernel uses the registration API wrongly. The registration API asserts against this and should (probably?) use AT_CHECK because the bad kernel could be a custom kernel supplied by the user. However, for internal kernels that are part of PyTorch core, you could also argue it should be AT_ASSERT.

Matters will always be murky for the C++ API, because we publish the very same API we use internally and so you can't distinguish between user and internal code. However, one way to solve this problem is to distinguish between the internal API, and the public API (as @mike7ant has asked us to do). Then you can put proper checks in the public API.

Krovatkin · 2019-05-08T21:15:46Z

it would be nice if these macros all returned an ostream object for composing more comprehensive error messages.

ASSERT_AT(vec.size()) << "Vector size should never be equal to zero  context ( node =  " << *node <<  " )  some more useful information " ;

EDIT: grammar

ezyang · 2019-05-08T21:18:18Z

@Krovatkin The ASSERT macro, for internal implementation reasons, currently requires you use to a separate ASSERTM for messages.

AT_ASSERTM(vec.size(), "Vector size should never be equal to zero  context ( node =  ", *node,  " )  some more useful information ");

I plan to eliminate this inconsistency. (I'll have to construct two strings on error case because of variadic macro comma pasting shenanigans, but that's a small price to pay).

suo · 2019-05-08T21:25:39Z

For informational purposes: the way people are trained to do in fbcode is glog-style: CHECK() will crash the process (not catchable), DCHECK() is the same but gets compiled out in opt. If you want to throw a user-facing error instead of crashing everything, you have to throw explicitly.

I don't think we should have a crashing CHECK(), since there's basically no circumstance where it's okay for library code to crash a production server, and it would be too easy for an errant CHECK() to slip through.

But having a macro vs. throw rule might be easier than remembering which of the macros you're supposed to use (or having an explicit-but-long name like AT_INTERNAL_ASSERT).

ezyang · 2019-05-08T21:44:21Z

@suo Are you OK with the AT_CHECK name in that case? It sounds like it might be toxic for fbcode people, and we should change it to something that leads to a less visceral reaction (like AT_ENFORCE). (I personally prefer AT_CHECK but I'm willing to compromise on this.)

zdevito · 2019-05-08T23:30:54Z

This seems like a great improvement. It would be nice if the error messages knew what do with source range information, so, for instance, AT_ASSERT(cond, node->range()) would give useful error reporting information.

eellison · 2019-05-08T23:39:02Z

There's JIT_SCRIPT_ASSERT in error_report.h, but no one ever uses that

zdevito · 2019-05-08T23:40:35Z

I am just going to clean up SourceRange so it has an operator<<, that way it will work trivially with AT_CHECK and AT_ASSERT

zdevito · 2019-05-09T16:46:44Z

This patch makes SourceRange printable, so AT_ASSERT(cond, "error message\n", node->sourceRange()) will print context information.
#20300

Summary: Pull Request resolved: #20321 First part of #20287 - Rename `AT_ASSERT` to `TORCH_INTERNAL_ASSERT` - Make `TORCH_INTERNAL_ASSERT` work with variadic inputs - Deprecated `AT_ASSERT` and `AT_ASSERTM` - Rename `AT_CHECK` to `TORCH_CHECK` - Make `TORCH_CHECK` give a better error message when no arguments are provided - Deprecate `AT_ERROR` in favor of `TORCH_CHECK(false, ...)` - Deprecate `AT_INDEX_ERROR` in favor of `TORCH_CHECK_INDEX(false, ...)` - Rename `AT_WARN` to `TORCH_WARN` No use sites are changed; I'll work on that in follow up patches (or disable the deprecation, if necessary.) Differential Revision: D15278439 fbshipit-source-id: 7e0ed489d4e89e5f56b8ad7eafa72cb9a06065ee

ezyang · 2019-09-30T14:08:41Z

We fixed this!

dashesy · 2019-10-22T22:44:02Z

Just a note. All the docs for cpp extensions still use AT_ASSERTM that is why many users (like me) hooked in using them.

Also, is AT_CHECK also deprecated?

warning: ‘void c10::detail::deprecated_AT_CHECK()’ is deprecated [-Wdeprecated-declarations] ::c10::detail::deprecated_AT_CHECK();

So all custom cpp extensions will have are deprecated now.

ezyang · 2019-10-23T15:30:02Z

@dashesy Thank you for the note. See pytorch/tutorials#711

onbigion13 · 2023-02-28T14:21:46Z

nice advice, and thanks for you explanation of the utilities of these two states. I'm exactly struggling understanding the difference.

ezyang mentioned this issue May 9, 2019

Rename AT_ASSERT to TORCH_INTERNAL_ASSERT; other macro updates #20321

Closed

ailzhang added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label May 9, 2019

ezyang closed this as completed Sep 30, 2019

ezyang mentioned this issue Sep 30, 2019

[ivalue] Add tuple constructor + to<std::tuple<Args...>> #26668

Closed

andreaskoepf mentioned this issue Oct 6, 2019

RuntimeError: tensor.ndimension() == static_cast<int64_t>(expected_size.size()) INTERNAL ASSERT FAILED #26400

Open

ezyang mentioned this issue Oct 8, 2019

Comprehensive-ish instrumentation for CUDA memory allocator #27361

Closed

ezyang mentioned this issue Dec 16, 2019

Add TORCH_DCHECK macro that checks only in debug builds #31240

Closed

ezyang mentioned this issue May 11, 2020

Sparse softmax support (CPU) #36305

Closed

10 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Get people to stop improperly using AT_ASSERT #20287

Get people to stop improperly using AT_ASSERT #20287

ezyang commented May 8, 2019 •

edited

ezyang commented May 8, 2019

ezyang commented May 8, 2019

gchanan commented May 8, 2019

smessmer commented May 8, 2019 •

edited

ezyang commented May 8, 2019 •

edited

Krovatkin commented May 8, 2019 •

edited

ezyang commented May 8, 2019

suo commented May 8, 2019

ezyang commented May 8, 2019

zdevito commented May 8, 2019

eellison commented May 8, 2019

zdevito commented May 8, 2019

zdevito commented May 9, 2019

ezyang commented Sep 30, 2019

dashesy commented Oct 22, 2019 •

edited

ezyang commented Oct 23, 2019

onbigion13 commented Feb 28, 2023

Get people to stop improperly using AT_ASSERT #20287

Get people to stop improperly using AT_ASSERT #20287

Comments

ezyang commented May 8, 2019 • edited

ezyang commented May 8, 2019

ezyang commented May 8, 2019

gchanan commented May 8, 2019

smessmer commented May 8, 2019 • edited

ezyang commented May 8, 2019 • edited

Krovatkin commented May 8, 2019 • edited

ezyang commented May 8, 2019

suo commented May 8, 2019

ezyang commented May 8, 2019

zdevito commented May 8, 2019

eellison commented May 8, 2019

zdevito commented May 8, 2019

zdevito commented May 9, 2019

ezyang commented Sep 30, 2019

dashesy commented Oct 22, 2019 • edited

ezyang commented Oct 23, 2019

onbigion13 commented Feb 28, 2023

ezyang commented May 8, 2019 •

edited

smessmer commented May 8, 2019 •

edited

ezyang commented May 8, 2019 •

edited

Krovatkin commented May 8, 2019 •

edited

dashesy commented Oct 22, 2019 •

edited