Improve error reporting and logging #2705
Signed-off-by: Yashodhan Joshi <yjdoc2@gmail.com>
Codecov Report
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2705      +/-   ##
==========================================
- Coverage   65.40%   65.21%   -0.19%
==========================================
  Files         133      133
  Lines       16942    16981      +39
==========================================
- Hits        11081    11075       -6
- Misses       5861     5906      +45
4f1dff6 to 62431aa
- I'm not sure whether converting `is_true_root`, `is_in_new_userns` and `rootless_required` this way is a good idea. The only error case here is failing to read `/proc/self/uid_map`, and I'm not sure how likely that is; it bloats the signatures and forces the use of an IO error everywhere they are called. I'd be happy to convert them back to the original if someone else feels the same way.
- While doing this I found that youki currently does not work with `podman exec` at all. I have found the root cause and have a fix in another branch, but some unit tests are failing. If that works, the number of failing podman tests goes down from 161 to 99 🎉 This PR will help me diagnose further errors, so I'm opening this first and waiting for it to be merged before that one.
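To make the trade-off in the first point concrete, here is a minimal sketch of what propagating the read error does to a signature. It is not youki's code: `looks_like_new_userns` is a hypothetical helper, and the rule that the initial user namespace shows the identity mapping `0 0 4294967295` is an assumption based on user_namespaces(7).

```rust
use std::{fs, io};

/// Hypothetical helper: decide from the contents of a uid_map file whether
/// the process is in a non-initial user namespace.
/// Assumption: the initial namespace shows exactly `0 0 4294967295`.
fn looks_like_new_userns(uid_map: &str) -> bool {
    let fields: Vec<&str> = uid_map.split_whitespace().collect();
    fields != ["0", "0", "4294967295"]
}

/// The read can fail, so the result becomes `io::Result<bool>` instead of `bool`.
fn is_in_new_userns() -> io::Result<bool> {
    let uid_map = fs::read_to_string("/proc/self/uid_map")?;
    Ok(looks_like_new_userns(&uid_map))
}

/// Every caller now has to return an error type as well (or handle it locally).
fn describe_namespace() -> io::Result<&'static str> {
    Ok(if is_in_new_userns()? {
        "new user namespace"
    } else {
        "initial user namespace"
    })
}
```

The `?` in `describe_namespace` is the "bloat" being discussed: one fallible read changes the return type of the whole call chain.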
if let Some(weight) = wd.weight() {
    common::write_cgroup_file(
        root_path.join(CGROUP_BFQ_IO_WEIGHT),
        format!("{}:{} {}", wd.major(), wd.minor(), weight),
    )?;
}
I changed this based on https://github.com/opencontainers/runc/blob/main/libcontainer/cgroups/fs/blkio.go#L29-L33 from runc; if the weight is `None`, the existing code would just panic. However, I'm not sure whether this is the correct approach, or whether a `None` weight should default to 0 or be reported as an error.
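The two non-panicking options mentioned here can be compared side by side. This is only a sketch: `WeightDevice` is a hypothetical stand-in with assumed field types, not the real spec type, and nothing is written to a cgroup file.

```rust
/// Hypothetical stand-in for a per-device weight entry (field types assumed).
struct WeightDevice {
    major: i64,
    minor: i64,
    weight: Option<u16>,
}

/// Option A: skip devices without a weight, as the changed code does.
fn bfq_weight_line(wd: &WeightDevice) -> Option<String> {
    wd.weight
        .map(|weight| format!("{}:{} {}", wd.major, wd.minor, weight))
}

/// Option B: treat a missing weight as an error the caller must handle.
fn bfq_weight_line_strict(wd: &WeightDevice) -> Result<String, String> {
    bfq_weight_line(wd)
        .ok_or_else(|| format!("missing weight for device {}:{}", wd.major, wd.minor))
}

/// Collect the lines that would be written, silently dropping `None` weights.
fn weight_lines(devices: &[WeightDevice]) -> Vec<String> {
    devices.iter().filter_map(bfq_weight_line).collect()
}
```

Either variant removes the panic; they differ only in whether a missing weight is visible to the caller.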
Is this too much? For now we don't support checkpoint at all, so I feel this might be overkill. We can also potentially do these changes when we fix the support for checkpointing.
@@ -78,6 +78,8 @@ pub enum UserNamespaceError {
    UnknownUnprivilegedUsernsClone(u8),
    #[error(transparent)]
    IDMapping(#[from] MappingError),
    #[error(transparent)]
    OtherIO(#[from] std::io::Error),
Should `OtherIO` be some other, higher-level error? It is duplicated in another error enum below.
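For readers unfamiliar with the attributes in the diff, the following std-only sketch shows roughly what a transparent `#[from]` variant amounts to: a `From<std::io::Error>` impl so that `?` converts automatically, with `Display` and `source` forwarded to the inner error. `UserNsError` and `count_map_lines` are hypothetical names, and the hand-written impls are an approximation, not the code the derive macro generates.

```rust
use std::{error::Error, fmt, io};

/// Hypothetical enum with a single IO variant, mirroring `OtherIO` above.
#[derive(Debug)]
enum UserNsError {
    OtherIO(io::Error),
}

impl fmt::Display for UserNsError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // "transparent": print exactly what the inner error prints.
            UserNsError::OtherIO(err) => fmt::Display::fmt(err, f),
        }
    }
}

impl Error for UserNsError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            // "transparent": forward the inner error's source.
            UserNsError::OtherIO(err) => err.source(),
        }
    }
}

/// This conversion is what lets `?` turn an `io::Error` into `UserNsError`.
impl From<io::Error> for UserNsError {
    fn from(err: io::Error) -> Self {
        UserNsError::OtherIO(err)
    }
}

/// Hypothetical caller: the reader is injected so the sketch needs no real file.
fn count_map_lines(read: impl FnOnce() -> io::Result<String>) -> Result<usize, UserNsError> {
    let contents = read()?; // io::Error -> UserNsError via `From`
    Ok(contents.lines().count())
}
```

Each enum that wants this conversion needs its own variant and `From` impl, which is where the duplication raised in the comment comes from.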
if let ContainerType::TenantContainer { exec_notify_fd } = args.container_type {
    let buf = format!("{e}");
    if let Err(err) = write(exec_notify_fd, buf.as_bytes()) {
        tracing::error!(?err, "failed to write to exec notify fd");
        return -1;
    }
    if let Err(err) = close(exec_notify_fd) {
        tracing::error!(?err, "failed to close exec notify fd");
        return -1;
    }
}
I removed the `return -1` from the individual arms, as we should still try to close the socket even if we fail to write to it, and we return -1 at the end of the block anyway.
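The control flow described in this comment can be sketched without any real file descriptors. In the sketch below the write and close steps are passed in as closures so the example stays self-contained; `report_exec_failure` is a hypothetical name and `eprintln!` stands in for the `tracing::error!` calls.

```rust
use std::fmt::Debug;

/// Report an exec failure: try to write the message, then always try to
/// close, and return -1 once at the end because this is the failure path.
fn report_exec_failure<W, C, E>(message: &str, write: W, close: C) -> i32
where
    W: FnOnce(&[u8]) -> Result<usize, E>,
    C: FnOnce() -> Result<(), E>,
    E: Debug,
{
    if let Err(err) = write(message.as_bytes()) {
        // No early return: the close below must still run.
        eprintln!("failed to write to exec notify fd: {err:?}");
    }
    if let Err(err) = close() {
        eprintln!("failed to close exec notify fd: {err:?}");
    }
    -1
}
```

With an early return in the write arm, a failed write would leave the descriptor open; here the close is attempted on every path.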
Overall, it looks good to me, but I have one question. How about using |
I checked that out, and while it would be good, right now there are a lot of places where we will need to manually add |
Awesome improvement!
I'll wait maybe a day or two more and see if anyone has any comments; if not, I will merge.
Going ahead and merging this, as no further comments were added. Thanks :)
ref: #2695

This converts a lot of `.unwrap` calls to propagated errors, and also improves the general error reporting seen by users of youki.

I suggest reviewing the two commits separately, as they focus on two different aspects:

- 7f8422f deals with the issue that when the init process fails to exec, it does not report back to the main process, so the error gets reported as `channel broken`, which is not at all useful. This commit now sends the error back via the channel. It also changes the `InvalidArg` error given by validation of the executable to `ArgValdationError`. I feel the latter is better suited, and it does not collide with another error of the same name that we have. This might be a breaking change for runwasi if they are matching on the enum values.
- 62431aa converts various `.unwrap`s into error propagation so that users of the libraries can decide how they want to deal with them. I have left some unwraps where we first check that the call cannot fail and then call unwrap, as in those places converting to a more idiomatic form is either not possible or leads to overly complex code. I have also left the unwraps in all the tests, as it does not make sense to convert them. Apart from the changed error propagation, the behavior should not change.

I also have some other comments, which I'm leaving as review comments below.
Thanks :)
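As an illustration of the first commit's idea (send the exec error back instead of letting the receiver see only a closed channel), here is a minimal sketch. It uses threads and `std::sync::mpsc` as a stand-in for the processes and channel youki uses; `InitMessage` and `run_init` are hypothetical names, and the `Ready` message on success is an assumption of the sketch.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical messages sent from the init side to the main side.
enum InitMessage {
    Ready,
    ExecFailed(String),
}

/// Run `exec` on another thread and wait for its report on the channel.
fn run_init(exec: impl FnOnce() -> Result<(), String> + Send + 'static) -> Result<(), String> {
    let (tx, rx) = mpsc::channel();
    let handle = thread::spawn(move || {
        // Report the outcome explicitly instead of just dropping the sender.
        let message = match exec() {
            Ok(()) => InitMessage::Ready,
            Err(reason) => InitMessage::ExecFailed(reason),
        };
        let _ = tx.send(message);
    });

    let outcome = match rx.recv() {
        Ok(InitMessage::Ready) => Ok(()),
        Ok(InitMessage::ExecFailed(reason)) => Err(format!("exec failed: {reason}")),
        // Only reached if the sender vanished without reporting anything.
        Err(_) => Err("channel broken".to_string()),
    };
    let _ = handle.join();
    outcome
}
```

The `channel broken` arm still exists, but it is now the fallback for a sender that disappeared without a message rather than the only thing a failed exec can produce.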