Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vhd_tool_wrapper.ml errors should include stderr message in the detail #3794

Open
nraynaud opened this issue Jan 7, 2019 · 9 comments
Open

Comments

@nraynaud
Copy link
Contributor

nraynaud commented Jan 7, 2019

When a VDI_IO_ERROR happens because of vhd-tool, the xapi only reports the stacktrace of the xapi process down to vhd_tool_wrapper.ml, which is never helpful. The actual error is written by vhd-tool on its stderr.

I think adding the stderr of the vhd-tool subprocess to the field Task.error_info of the Task object related to VDI operations in the case of error would be more helpful.

I suspect that the edit could be done here: https://github.com/xapi-project/xen-api/blob/master/ocaml/xapi/vhd_tool_wrapper.ml#L55

Currently user support is complicated because in the case of VDI_IO_ERROR we have to go explore the xensource.log file, there are 2 stacktraces per error in the file (one for xapi and one for vhd-tool) and customers routinely report only the unhelpful one (it's the lowest one in the file).

@lindig
Copy link
Contributor

lindig commented Jan 8, 2019

Thanks for the report. At first glance, it seems we are not capturing stderr. How common are these errors? I would like to think that VDI IO errors should almost never happen and there is no good reaction to them from the user and hence it is difficult to provide a good error message that a user could act on.

@nraynaud
Copy link
Contributor Author

nraynaud commented Jan 8, 2019

I have seen various things, from no disk space left on device to impossible to find the VDI, corrupted hard drive, lost the connection to NFS, import didn't go well, etc.

My gut feeling is a bit opposite of yours: it's often an error where the users could have solved the problem themselves if they had access to precise underlying unix error message.

@olivierlambert
Copy link

olivierlambert commented Jan 8, 2019

Those errors are VERY common, as soon you use the VHD export wrapper, anything wrong with it will lead to this error 😢

@lindig
Copy link
Contributor

lindig commented Jan 9, 2019

I haven't looked at it but I believe vhd-tool should signal the kind of error using its return code such that xapi can create a more specific error without having to analyse stderr. Looking at vhd-tool I see that it is not really careful about capturing the reason for a failure and doesn't use the return code effectively.

@nraynaud
Copy link
Contributor Author

nraynaud commented Jan 9, 2019

I don't think vhd-tool packages its error un a usable way: https://github.com/xapi-project/vhd-tool/blob/master/cli/main.ml#L354

But I really suggest returning the unix error and the file it happened to (it's not always the one we think because of the VHD chain) in vhd_tool in the xapi error. More precisely the result of uerror() in the FFI (eg. https://github.com/xapi-project/vhd-tool/blob/master/src/sendfile64_stubs.c#L90 ) has really been a great help for us.

Thanks for looking into it.

@lindig
Copy link
Contributor

lindig commented Jan 9, 2019

I have created internal issue CA-307178. Fixing this should not be too difficult if the stderr output is not too large.

@olivierlambert
Copy link

Thanks a lot @lindig ! Feel free to link the commit to this issue when you have something, we are interested to see the "how", so we could maybe give you PRs next time we need something like this 👍

@lindig
Copy link
Contributor

lindig commented Jan 10, 2019

I was thinking about something like this - maybe you can test it (this is just a sketch):

diff --git a/ocaml/xapi/vhd_tool_wrapper.ml b/ocaml/xapi/vhd_tool_wrapper.ml
index 957bcce6d..5c62a3512 100644
--- a/ocaml/xapi/vhd_tool_wrapper.ml
+++ b/ocaml/xapi/vhd_tool_wrapper.ml
@@ -28,6 +28,7 @@ let update_task_progress __context x = TaskHelper.set_progress ~__context (float
 let run_vhd_tool progress_cb args s s' path =
   let vhd_tool = !Xapi_globs.vhd_tool in
   info "Executing %s %s" vhd_tool (String.concat " " args);
+  let exeption VHD_Tool in
   let open Forkhelpers in
   let pipe_read, pipe_write = Unix.pipe () in
   let to_close = ref [ pipe_read; pipe_write ] in
@@ -52,10 +53,13 @@ let run_vhd_tool progress_cb args s s' path =
                   let (_, status) = waitpid pid in
                   if status <> Unix.WEXITED 0 then begin
                     error "vhd-tool failed, returning VDI_IO_ERROR";
-                    raise (Api_errors.Server_error (Api_errors.vdi_io_error, ["Device I/O errors"]))
+                    raise VHD_tool
                   end
                ) with
        | Success(out, _) -> debug "%s" out
+       | Failure(out, VHD_Tool) ->
+         error "vhd-tool output: %s" out;
+         raise (Api_errors.Server_error (Api_errors.vdi_io_error, [out]))
        | Failure(out, e) -> error "vhd-tool output: %s" out; raise e
     ) (fun () -> close pipe_read; close pipe_write)

@olivierlambert
Copy link

We'll test that, thanks a lot for the draft!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants