cmd: make string/error code more robust against errno leaking#9844
Conversation
The i386 sid sbuild fails because apparently some sbuild code is calling functions that are not implemented so errno is set during the tests when it is not expected. This leads to test failures because the die() code will append errno status if errno is set. This commit fixes this and makes the nightly test also run on i386.
3c4bb70 to
a0bda4c
Compare
|
The build log is here: https://buildd.debian.org/status/fetch.php?pkg=snapd&arch=i386&ver=2.48.2-2&stamp=1610734191&raw=0 There were errors in unit tests: I've looked at the errors are both use g_test_subprocess(), however there are other tests using the same, which were successful. Also, the error in the tested code happens right away, before any heavy code runs. I'm counting only va_start() and sc_panicv(), vfprintf() on the call path. However, the errno is copied before vfprintf(), so it's either va_start() (which afaik does not set errno), or something that happens before the test, maybe some glib test integration left it. Now, thinking about a proper fix, there's some other code that could die in sanity checks, that has not called any 3rd party library code, and for which errno could have been mangled before. There's also some other code which actually sets errno to indicate errors. Looks like we should review all those calls and fix errno handling there. |
| // Set errno in case we die. | ||
| errno = 0; | ||
| if (err == NULL) { | ||
| die("cannot obtain error domain from NULL error"); |
There was a problem hiding this comment.
maybe it should be errno = EINVAL in this branch and likely in other places doing sanity checks? Though this would break the test, as they expect this particular error string, without the following : Invalid argument🤔
There was a problem hiding this comment.
Thanks, yeah, this PR tries to fix the issue in a minimal way but we can of course do a followup that does more changes.
|
Fwiw, I used this as a distro patch to unblock the debian build and with that we have a clean 2.48.3 build in unstable now: https://tracker.debian.org/pkg/snapd |
bboozzoo
left a comment
There was a problem hiding this comment.
LGTM. Let's land it and start looking into cleanup the errno mess in our internal libraries.
| const char *msgfmt, va_list ap) | ||
| { | ||
| // Set errno in case we die. | ||
| errno = 0; |
There was a problem hiding this comment.
Shouldn't we save & restore errno before returning, or am I paranoid?
There was a problem hiding this comment.
It depends on whether the API promises to not modify errno. In this case I would say no, there's a separate call sc_error_init_from_errno which takes a copy of errno if you want to set in the error. There's a separate SC_ERRNO_DOMAIN domain for errors carrying errno, such that when initializing the error one is expected to pass the errno (copy thereof) in the arguments.
There was a problem hiding this comment.
I think we can. Note that we plan to change the API of die() to something like die() and die_with_errno() - once that is done this errno = 0; will be removed again.
There was a problem hiding this comment.
My main concern is obviously cases where sc_error_initv(..) is used to create and initialize sc_error, but after calling it we want to do something extra based on last errno value; and yes @bboozzoo is right in that this is about promise of the API regarding errno modifications. I think it would make sense to make such promise and keep errno untouched on return.
Since this is going to be changed in followup and errno=0 will be removed I think it's fine as is for this PR.
stolowski
left a comment
There was a problem hiding this comment.
+1 (assuming a followup PR introduces the changes about die_with_errno and avoids resetting errno).
| const char *msgfmt, va_list ap) | ||
| { | ||
| // Set errno in case we die. | ||
| errno = 0; |
There was a problem hiding this comment.
My main concern is obviously cases where sc_error_initv(..) is used to create and initialize sc_error, but after calling it we want to do something extra based on last errno value; and yes @bboozzoo is right in that this is about promise of the API regarding errno modifications. I think it would make sense to make such promise and keep errno untouched on return.
Since this is going to be changed in followup and errno=0 will be removed I think it's fine as is for this PR.
The i386 sid sbuild fails because apparently some sbuild code
is calling functions that are not implemented so errno is set
during the tests when it is not expected. This leads to test
failures because the die() code will append errno status if
errno is set. This commit fixes this and makes the nightly
test also run on i386.