New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RDY] Refactor event loop layer #2980

Merged
merged 6 commits into from Jul 17, 2015

Conversation

Projects
None yet
7 participants
@tarruda
Member

tarruda commented Jul 7, 2015

Like #2650, this extracts some changes from my private #2371/fix branch. This PR separates the event loop layer from the rest of the code and builds a small abstraction around libuv/watcher API, which is done to keep some sanity for when the event loop runs in a separate thread. The goal is to contain multithreaded code under src/nvim/event and leave the rest as untouched as possible, with minimum amount of data/modules shared between threads.

@marvim marvim added the WIP label Jul 7, 2015

Show outdated Hide outdated src/nvim/event/rstream.c
// be processed between reads.
uv_idle_init(UV(loop), &rstream->uv.idle);
rstream->uv.idle.data = NULL;
rstream->uv.idle.data = rstream;

This comment has been minimized.

@oni-link

oni-link Jul 7, 2015

Contributor

Previous line could be removed.

@oni-link

oni-link Jul 7, 2015

Contributor

Previous line could be removed.

This comment has been minimized.

@tarruda

tarruda Jul 7, 2015

Member

👍

@tarruda

@tarruda tarruda changed the title from [WIP] Separate event loop layer to [RDY] Separate event loop layer Jul 8, 2015

@marvim marvim added RDY and removed WIP labels Jul 8, 2015

@@ -262,7 +263,7 @@ static void read_cb(RStream *rstream, RBuffer *buf, void *data, bool eof)
break;
}
}
} while (rbuffer_size(input->read_buffer));
} while (rbuffer_size(input->read_stream.buffer));
// Make sure the next input escape sequence fits into the ring buffer
// without wrap around, otherwise it could be misinterpreted.

This comment has been minimized.

@oni-link

oni-link Jul 8, 2015

Contributor

rbuffer_reset(input->read_buffer); needs to be replaced with rbuffer_reset(input->read_stream.buffer); in the following line.

@oni-link

oni-link Jul 8, 2015

Contributor

rbuffer_reset(input->read_buffer); needs to be replaced with rbuffer_reset(input->read_stream.buffer); in the following line.

This comment has been minimized.

@tarruda

tarruda Jul 8, 2015

Member

👍

@tarruda

@tarruda tarruda changed the title from [RDY] Separate event loop layer to [WIP] Separate event loop layer Jul 8, 2015

@marvim marvim added WIP and removed RDY labels Jul 8, 2015

@justinmk justinmk added this to the 0.1-first-public-release milestone Jul 11, 2015

Show outdated Hide outdated src/nvim/event/loop.c
void loop_run(Loop *loop)
{
uv_run(&loop->uv, UV_RUN_DEFAULT);

This comment has been minimized.

@oni-link

oni-link Jul 11, 2015

Contributor

Is the return value intentionally dropped?

@oni-link

oni-link Jul 11, 2015

Contributor

Is the return value intentionally dropped?

This comment has been minimized.

@tarruda

tarruda Jul 11, 2015

Member

Not intentionally, for now this is just a stub module that I plan to finish later

@tarruda

tarruda Jul 11, 2015

Member

Not intentionally, for now this is just a stub module that I plan to finish later

Show outdated Hide outdated src/nvim/os/job.c
uv_close((uv_handle_t *)job->proc_std##stream, close_cb); \
} \
if (!job->stream.closed) { \
stream_close(&job->stream, on_##stream_close); \

This comment has been minimized.

@oni-link

oni-link Jul 12, 2015

Contributor

on_##stream_close -> on_stream_close

@oni-link

oni-link Jul 12, 2015

Contributor

on_##stream_close -> on_stream_close

This comment has been minimized.

@tarruda

tarruda Jul 12, 2015

Member

👍

@tarruda
Show outdated Hide outdated src/nvim/os/job.c
if (opts.stderr_cb) {
uv_close((uv_handle_t *)job->proc_stderr, close_cb);
if (job->opts.stderr_cb) {
uv_close((uv_handle_t *)job->proc_stderr, NULL);
}
process_close(job);
event_poll(0);

This comment has been minimized.

@oni-link

oni-link Jul 12, 2015

Contributor

event_poll(0) should be called before process_close(job), to prevent freeing the job structure too soon.
Without this change the following patch (simulates an error while spawning a pty) and usage of :term should produce an crash:

diff --git a/src/nvim/os/pty_process.c b/src/nvim/os/pty_process.c
index cb59355..ebffac1 100644
--- a/src/nvim/os/pty_process.c
+++ b/src/nvim/os/pty_process.c
@@ -72,6 +72,7 @@ bool pty_process_spawn(Job *job) FUNC_ATTR_NONNULL_ALL
     init_child(job);
     abort();
   }
+  goto error;

   // make sure the master file descriptor is non blocking
   int master_status_flags = fcntl(master, F_GETFL);
@oni-link

oni-link Jul 12, 2015

Contributor

event_poll(0) should be called before process_close(job), to prevent freeing the job structure too soon.
Without this change the following patch (simulates an error while spawning a pty) and usage of :term should produce an crash:

diff --git a/src/nvim/os/pty_process.c b/src/nvim/os/pty_process.c
index cb59355..ebffac1 100644
--- a/src/nvim/os/pty_process.c
+++ b/src/nvim/os/pty_process.c
@@ -72,6 +72,7 @@ bool pty_process_spawn(Job *job) FUNC_ATTR_NONNULL_ALL
     init_child(job);
     abort();
   }
+  goto error;

   // make sure the master file descriptor is non blocking
   int master_status_flags = fcntl(master, F_GETFL);

This comment has been minimized.

@tarruda

tarruda Jul 12, 2015

Member

I'm currently refactoring the job module in a similar way that the server/rstream/wstream modules were refactored in this PR. I will take this into consideration for the refactored version, thanks

@tarruda

tarruda Jul 12, 2015

Member

I'm currently refactoring the job module in a similar way that the server/rstream/wstream modules were refactored in this PR. I will take this into consideration for the refactored version, thanks

This comment has been minimized.

@tarruda

tarruda Jul 13, 2015

Member

This should no longer be a problem, the process memory is now managed by the caller

@tarruda

tarruda Jul 13, 2015

Member

This should no longer be a problem, the process memory is now managed by the caller

return true;
}
static void chld_handler(uv_signal_t *handle, int signum)

This comment has been minimized.

@oni-link

oni-link Jul 13, 2015

Contributor

Are UvProcesses on a different loop than PtyProcesses? Otherwise an UvProcess could be handled here and in its exit callback (also two calls to internal_exit_cb()).

@oni-link

oni-link Jul 13, 2015

Contributor

Are UvProcesses on a different loop than PtyProcesses? Otherwise an UvProcess could be handled here and in its exit callback (also two calls to internal_exit_cb()).

This comment has been minimized.

@tarruda

tarruda Jul 13, 2015

Member

They are handled in the same loop, but it's only possible to have waitpid return a pid > 0 once(at least in my tests). So if libuv SIGCHLD handler receives it first, this one will short circuit because the process will have already exited. The same will happen for libuv if we catch it first: https://github.com/libuv/libuv/blob/v1.x/src/unix/process.c#L78-L80

@tarruda

tarruda Jul 13, 2015

Member

They are handled in the same loop, but it's only possible to have waitpid return a pid > 0 once(at least in my tests). So if libuv SIGCHLD handler receives it first, this one will short circuit because the process will have already exited. The same will happen for libuv if we catch it first: https://github.com/libuv/libuv/blob/v1.x/src/unix/process.c#L78-L80

@@ -156,31 +98,36 @@ bool pty_process_spawn(Job *job) FUNC_ATTR_NONNULL_ALL
return false;

This comment has been minimized.

@oni-link

oni-link Jul 13, 2015

Contributor

After kill(pid, SIGKILL) we should call waitpid(pid,NULL,0).

@oni-link

oni-link Jul 13, 2015

Contributor

After kill(pid, SIGKILL) we should call waitpid(pid,NULL,0).

This comment has been minimized.

@tarruda

tarruda Jul 13, 2015

Member

👍

@tarruda
Show outdated Hide outdated src/nvim/eval.c
if (argvars[1].v_type == VAR_STRING) {
char *stream = (char *)argvars[1].vval.v_string;
if (!strcmp(stream, "stdin")) {
job_close_in(job);
((TerminalJobData *)job_data(job))->stdin_closed = true;
stream_close(proc->in, NULL);

This comment has been minimized.

@oni-link

oni-link Jul 13, 2015

Contributor

jobclose() could be called for the same stream multiple times. stream_close() would assert(), but process_close_{in,out,err,streams} should be safe to use here and in the following cases.

@oni-link

oni-link Jul 13, 2015

Contributor

jobclose() could be called for the same stream multiple times. stream_close() would assert(), but process_close_{in,out,err,streams} should be safe to use here and in the following cases.

This comment has been minimized.

@tarruda

tarruda Jul 13, 2015

Member

👍

@tarruda
Show outdated Hide outdated src/nvim/eval.c
return;
}
TerminalJobData *data = opts.data;
TerminalOptions topts = TERMINAL_OPTIONS_INIT;

This comment has been minimized.

@oni-link

oni-link Jul 13, 2015

Contributor

Initialization should not be necessary. All fields are set in the following lines.

@oni-link

oni-link Jul 13, 2015

Contributor

Initialization should not be necessary. All fields are set in the following lines.

This comment has been minimized.

@tarruda

tarruda Jul 13, 2015

Member

👍

@tarruda
return false;
} else if (pid == 0) {
init_child(job);
init_child(ptyproc);
abort();
}

This comment has been minimized.

@oni-link

oni-link Jul 14, 2015

Contributor

Adding here a goto error; leads to a seg fault when using :term, because later on the uninitialized callback pointer internal_close_cb is used in pty_process_close().

@oni-link

oni-link Jul 14, 2015

Contributor

Adding here a goto error; leads to a seg fault when using :term, because later on the uninitialized callback pointer internal_close_cb is used in pty_process_close().

This comment has been minimized.

@tarruda

tarruda Jul 16, 2015

Member

👍

@tarruda
Show outdated Hide outdated src/nvim/eval.c
EMSG(_(e_invjob));
return;
}
if (((TerminalJobData *)job_data(job))->stdin_closed) {
if (data->stdin_closed) {

This comment has been minimized.

@oni-link

oni-link Jul 16, 2015

Contributor

Do we still need stdin_closed in TerminalJobData?
We could now do something like this:

  Process *proc = (Process *)&data->proc;
  if (!proc->in || proc->in->closed) {
@oni-link

oni-link Jul 16, 2015

Contributor

Do we still need stdin_closed in TerminalJobData?
We could now do something like this:

  Process *proc = (Process *)&data->proc;
  if (!proc->in || proc->in->closed) {

This comment has been minimized.

@tarruda

tarruda Jul 16, 2015

Member

👍

@tarruda
@tarruda

This comment has been minimized.

Show comment
Hide comment
@tarruda

tarruda Jul 16, 2015

Member

Been trying to find out why the OSX/travis tests are hanging on this PR, and so far had little success. I cannot reproduce it running on a local OSX vm, and I don't think it is related to OSX.

Here's what I know so far:

  • It happens with the job_wait/timeout test
  • It is caused by the sleep 10000 shell command which somehow causing the travis terminal/shell hang after the tests exit
  • Using a lower sleep value, 10 for example, allows the build to continue after the sleep exits
  • I can confirm that, in both linux and OSX, this PR and the master branch, the sleep process sticks around after the tests(until killed or the terminal is closed). In all cases nvim kills all children on exit(in this case, sleep is a grandchild)

Considering these facts, it seems the reasonable thing to do is to use a smaller sleep value, which would not only fix the "zombie" sleep processes sticking around after tests, but also make the OSX/travis build pass on this PR.

What's messing with my head is trying to figure out why this doesn't happen on master, I simply cannot see how the process cleanup refactor would cause this issue on OSX/travis, especially because I already verified that in all branches/platforms the result is the same: The sleep process sticks after the tests.

I'd love to hear ideas about what is causing this, but in any case I'm not going to let this stop me from proceeding, I will just lower the sleep value and finish the PR.

Member

tarruda commented Jul 16, 2015

Been trying to find out why the OSX/travis tests are hanging on this PR, and so far had little success. I cannot reproduce it running on a local OSX vm, and I don't think it is related to OSX.

Here's what I know so far:

  • It happens with the job_wait/timeout test
  • It is caused by the sleep 10000 shell command which somehow causing the travis terminal/shell hang after the tests exit
  • Using a lower sleep value, 10 for example, allows the build to continue after the sleep exits
  • I can confirm that, in both linux and OSX, this PR and the master branch, the sleep process sticks around after the tests(until killed or the terminal is closed). In all cases nvim kills all children on exit(in this case, sleep is a grandchild)

Considering these facts, it seems the reasonable thing to do is to use a smaller sleep value, which would not only fix the "zombie" sleep processes sticking around after tests, but also make the OSX/travis build pass on this PR.

What's messing with my head is trying to figure out why this doesn't happen on master, I simply cannot see how the process cleanup refactor would cause this issue on OSX/travis, especially because I already verified that in all branches/platforms the result is the same: The sleep process sticks after the tests.

I'd love to hear ideas about what is causing this, but in any case I'm not going to let this stop me from proceeding, I will just lower the sleep value and finish the PR.

@justinmk

This comment has been minimized.

Show comment
Hide comment
@justinmk

justinmk Jul 16, 2015

Member

I didn't see the OSX logs before you updated this PR, but we have been seeing random hangs on OSX on master. I just assumed it was a travis performance issue as it was in May.

Member

justinmk commented Jul 16, 2015

I didn't see the OSX logs before you updated this PR, but we have been seeing random hangs on OSX on master. I just assumed it was a travis performance issue as it was in May.

job_decref(job);
pty_process_close_master(ptyproc);
Process *proc = (Process *)ptyproc;
if (proc->internal_close_cb) {

This comment has been minimized.

@oni-link

oni-link Jul 17, 2015

Contributor

proc->internal_close_cb is not initialized at this point if the spawn fails.

@oni-link

oni-link Jul 17, 2015

Contributor

proc->internal_close_cb is not initialized at this point if the spawn fails.

This comment has been minimized.

@tarruda

tarruda Jul 17, 2015

Member

👍

@tarruda
@tarruda

This comment has been minimized.

Show comment
Hide comment
@tarruda

tarruda Jul 17, 2015

Member

I didn't see the OSX logs before you updated this PR, but we have been seeing random hangs on OSX on master. I just assumed it was a travis performance issue as it was in May.

The hangs in this branch are consistent, I just don't understand why they aren't present in master since the cause is seems to be the sleep program spawned in job_spec.lua

Member

tarruda commented Jul 17, 2015

I didn't see the OSX logs before you updated this PR, but we have been seeing random hangs on OSX on master. I just assumed it was a travis performance issue as it was in May.

The hangs in this branch are consistent, I just don't understand why they aren't present in master since the cause is seems to be the sleep program spawned in job_spec.lua

tarruda added some commits Jul 8, 2015

test: lower sleep value in job test
Since sleep is a grandchild of nvim, it is not killed after the test ends.
Using a low sleep value allows it to exit automatically after a small interval.

@tarruda tarruda changed the title from [WIP] Separate event loop layer to [RDY] Refactor event loop layer Jul 17, 2015

@marvim marvim added RDY and removed WIP labels Jul 17, 2015

tarruda added some commits Jul 17, 2015

event loop: New abstraction layer with refactored time/signal API
- Add event loop abstraction module under src/nvim/event. The
  src/nvim/event/loop module replaces src/nvim/os/event
- Remove direct dependency on libuv signal/timer API and use the new abstraction
  instead.
- Replace all references to uv_default_loop() by &loop.uv, a new global variable
  that wraps libuv main event loop but allows the event loop functions to be
  reused in other contexts.
rstream/wstream: Unify structures and simplify API
- Simplify RStream/WStream API and make it more consistent with libuv.
- Move into the event loop layer(event subdirectory)
- Remove uv_helpers module.
- Simplify job/process internal modules/API.
- Unify RStream and WStream into a single structure. This is necessary because
  libuv streams can be readable and writable at the same time(and because the
  uv_helpers.c hack to associate multiple streams with libuv handle was removed)
- Make struct definition public, allowing more flexible/simple memory
  management by users of the module.
- Adapt channel/job modules to cope with the changes.
server: Extract most logic into the new socket abstraction
- Move event loop code into event/socket
- Reimplement server.c on top of the new SocketWatcher class
- Adapt msgpack_rpc/channel.c
job: Replace by a better process abstraction layer
- New libuv/pty process abstraction with simplified API and no globals.
- Remove nvim/os/job*. Jobs are now a concept that apply only to programs
  spawned by vimscript job* functions.
- Refactor shell.c/channel.c to use the new module, which brings a number of
  advantages:
  - Simplified API, less code
  - No slots in the user job table are used
  - Not possible to acidentally receive data from vimscript
- Implement job table in eval.c, which is now a hash table with unilimited job
  slots and unique job ids.
@tarruda

This comment has been minimized.

Show comment
Hide comment
@tarruda

tarruda Jul 17, 2015

Member

Merging now

The quickbuild seems to have failed because it couldn't publish the test results, so I'm going to disregard: http://neovim-qb.szakmeister.net/build/2089/step_status

Mingw link seems to be broken, so I will ignore it too.

Member

tarruda commented Jul 17, 2015

Merging now

The quickbuild seems to have failed because it couldn't publish the test results, so I'm going to disregard: http://neovim-qb.szakmeister.net/build/2089/step_status

Mingw link seems to be broken, so I will ignore it too.

@tarruda tarruda merged commit aa9cb48 into neovim:master Jul 17, 2015

2 of 4 checks passed

continuous-integration/travis-ci/pr The Travis CI build could not complete due to an error
Details
default Build pr-2980 finished with status FAILED
Details
continuous-integration/appveyor AppVeyor build succeeded
Details
coverage/coveralls Coverage decreased (-0.007%) to 66.901%
Details

tarruda added a commit that referenced this pull request Jul 17, 2015

Merge PR #2980 'Refactor event loop layer'
Helped-by: oni-link <knil.ino@gmail.com>
Reviewed-by: oni-link <knil.ino@gmail.com>
Reviewed-by: Scott Prager <splinterofchaos@gmail.com>

@jszakmeister jszakmeister removed the RDY label Jul 17, 2015

@tarruda

This comment has been minimized.

Show comment
Hide comment
@tarruda

tarruda Jul 17, 2015

Member

Done, thanks for the help.

Tomorrow I start integrating the threading code. With this clearer event loop separation, the changes will be more contained.

Member

tarruda commented Jul 17, 2015

Done, thanks for the help.

Tomorrow I start integrating the threading code. With this clearer event loop separation, the changes will be more contained.

// Read error or EOF, either way stop the stream and invoke the callback
// with eof == true
uv_read_stop(uvstream);
stream->read_cb(stream, stream->buffer, stream->data, true);

This comment has been minimized.

@oni-link

oni-link Jul 17, 2015

Contributor

stream->read_cb needs a NULL check before making the call. The default value for the variable is NULL.

@oni-link

oni-link Jul 17, 2015

Contributor

stream->read_cb needs a NULL check before making the call. The default value for the variable is NULL.

This comment has been minimized.

@tarruda

tarruda Jul 18, 2015

Member

Is there any reason for allowing NULL read_cb? How about putting an assertion or FUNC_NONNULL_ATTR_ALL in rstream_start?

@tarruda

tarruda Jul 18, 2015

Member

Is there any reason for allowing NULL read_cb? How about putting an assertion or FUNC_NONNULL_ATTR_ALL in rstream_start?

This comment has been minimized.

@oni-link

oni-link Jul 18, 2015

Contributor

In do_os_system() the output rstreams could be started with callback NULL.

@oni-link

oni-link Jul 18, 2015

Contributor

In do_os_system() the output rstreams could be started with callback NULL.

This comment has been minimized.

@tarruda

tarruda Jul 18, 2015

Member

👍

@tarruda

@oni-link oni-link referenced this pull request Jul 18, 2015

Closed

core dump in centos #3040

goto error;
}
ptyproc->tty_fd = master;
job->pid = pid;
proc->pid = pid;
return true;
error:

This comment has been minimized.

@oni-link

oni-link Jul 18, 2015

Contributor

When this label is reached, the child ignores the signal SIGTERM and we have to wait for 5 seconds until we can send signal SIGKILL.
If that happens, nvim hangs with

Vim: Caught deadly signal 'SIGTERM'

Vim: Finished.

Can this error handling be simplified to

close(master);
kill(pid,SIGKILL);
waitpid(pid,NULL,0);
return false;

?

@oni-link

oni-link Jul 18, 2015

Contributor

When this label is reached, the child ignores the signal SIGTERM and we have to wait for 5 seconds until we can send signal SIGKILL.
If that happens, nvim hangs with

Vim: Caught deadly signal 'SIGTERM'

Vim: Finished.

Can this error handling be simplified to

close(master);
kill(pid,SIGKILL);
waitpid(pid,NULL,0);
return false;

?

This comment has been minimized.

@tarruda

tarruda Jul 18, 2015

Member

I suppose it can. This cleanup code was created in @a2b45357 so I opted to not mess with it right now.

@elmart, is there any specific reason you added this timeout?

@tarruda

tarruda Jul 18, 2015

Member

I suppose it can. This cleanup code was created in @a2b45357 so I opted to not mess with it right now.

@elmart, is there any specific reason you added this timeout?

This comment has been minimized.

@elmart

elmart Jul 18, 2015

Member

Well, intention was just giving some time for spawned child to quit orderly, responding to SIGTERM, and only use SIGKILL if no response after some time (KILL_RETRIES * KILL_TIMEOUT).

@elmart

elmart Jul 18, 2015

Member

Well, intention was just giving some time for spawned child to quit orderly, responding to SIGTERM, and only use SIGKILL if no response after some time (KILL_RETRIES * KILL_TIMEOUT).

data->proc.pty.width = curwin->w_width;
data->proc.pty.height = curwin->w_height;
data->proc.pty.term_name = xstrdup("xterm-256color");
if (!common_job_start(data, rettv)) {
return;

This comment has been minimized.

@oni-link

oni-link Jul 18, 2015

Contributor

TerminalJobData and term_name are not freed:
Cleanup could be:

    free(data->proc.pty.term_name);
    free_term_job_data(data);
@oni-link

oni-link Jul 18, 2015

Contributor

TerminalJobData and term_name are not freed:
Cleanup could be:

    free(data->proc.pty.term_name);
    free_term_job_data(data);

This comment has been minimized.

@tarruda

tarruda Jul 18, 2015

Member

👍

@tarruda

This comment has been minimized.

@tarruda

tarruda Jul 18, 2015

Member

Added this and some other fixes to the beginning of #3029

@tarruda

tarruda Jul 18, 2015

Member

Added this and some other fixes to the beginning of #3029

@jszakmeister

This comment has been minimized.

Show comment
Hide comment
@jszakmeister

jszakmeister Jul 18, 2015

Member

Quickbuild checks the outcome of the tests on the publish step, so it's not a good idea to ignore it. One thing I really dislike about busted is that it doesn't have a nice setup for the junit style testing, and it mixes in other stuff into the XML output--which corrupts the document. In this case, there was an error, and there was this emitted into the beginning of the document:

Warning: Screen changes have been received after the expected state was seen.
This is probably due to an indeterminism in the test. Try adding
`wait()` (or even a separate `screen:expect(...)`) at a point of possible
indeterminism, typically in between a `feed()` or `execute()` which is non-
synchronous, and a synchronous api call.

Note that sometimes a `wait` can trigger redraws and consequently generate more
indeterminism. If adding `wait` calls seems to increase the frequency of these
messages, try removing every `wait` call in the test.

If everything else fails, use Screen:redraw_debug to help investigate what is
  causing the problem.

stack traceback:
    ./test/functional/ui/screen.lua:270: in function 'wait'
    ./test/functional/ui/screen.lua:219: in function 'expect'
    test/functional/ui/screen_basic_spec.lua:492: in function <test/functional/ui/screen_basic_spec.lua:490>

The test case that failed was this one:

        <testcase name='Screen resize has minimum width/height values' classname='test/functional/ui/screen_basic_spec.lua:490'>
            <error>./test/functional/ui/screen.lua:276: Row 1 didn&apos;t match.
Expected: &quot;-- INS^ERT --&quot;
Actual:   &quot;resize^      &quot;
stack traceback:
    ./test/functional/ui/screen.lua:276: in function &apos;wait&apos;
    ./test/functional/ui/screen.lua:219: in function &apos;expect&apos;
    test/functional/ui/screen_basic_spec.lua:492: in function &lt;test/functional/ui/screen_basic_spec.lua:490&gt;
</error>
        </testcase>
Member

jszakmeister commented Jul 18, 2015

Quickbuild checks the outcome of the tests on the publish step, so it's not a good idea to ignore it. One thing I really dislike about busted is that it doesn't have a nice setup for the junit style testing, and it mixes in other stuff into the XML output--which corrupts the document. In this case, there was an error, and there was this emitted into the beginning of the document:

Warning: Screen changes have been received after the expected state was seen.
This is probably due to an indeterminism in the test. Try adding
`wait()` (or even a separate `screen:expect(...)`) at a point of possible
indeterminism, typically in between a `feed()` or `execute()` which is non-
synchronous, and a synchronous api call.

Note that sometimes a `wait` can trigger redraws and consequently generate more
indeterminism. If adding `wait` calls seems to increase the frequency of these
messages, try removing every `wait` call in the test.

If everything else fails, use Screen:redraw_debug to help investigate what is
  causing the problem.

stack traceback:
    ./test/functional/ui/screen.lua:270: in function 'wait'
    ./test/functional/ui/screen.lua:219: in function 'expect'
    test/functional/ui/screen_basic_spec.lua:492: in function <test/functional/ui/screen_basic_spec.lua:490>

The test case that failed was this one:

        <testcase name='Screen resize has minimum width/height values' classname='test/functional/ui/screen_basic_spec.lua:490'>
            <error>./test/functional/ui/screen.lua:276: Row 1 didn&apos;t match.
Expected: &quot;-- INS^ERT --&quot;
Actual:   &quot;resize^      &quot;
stack traceback:
    ./test/functional/ui/screen.lua:276: in function &apos;wait&apos;
    ./test/functional/ui/screen.lua:219: in function &apos;expect&apos;
    test/functional/ui/screen_basic_spec.lua:492: in function &lt;test/functional/ui/screen_basic_spec.lua:490&gt;
</error>
        </testcase>
@tarruda

This comment has been minimized.

Show comment
Hide comment
@tarruda

tarruda Jul 18, 2015

Member

@jszakmeister I admit that sometimes I have trouble parsing quickbuild output. This particular error was already present though, it's a indeterminism I failed to fix in this PR.

Member

tarruda commented Jul 18, 2015

@jszakmeister I admit that sometimes I have trouble parsing quickbuild output. This particular error was already present though, it's a indeterminism I failed to fix in this PR.

if (!is_user_job(job)) {
TerminalJobData *data = pmap_get(uint64_t)(jobs, argvars[0].vval.v_number);
if (!data || data->stopped) {

This comment has been minimized.

@oni-link

oni-link Jul 18, 2015

Contributor

Most of the other checks for a valid "job" do not check for data->stopped. Should probably also be added there.

@oni-link

oni-link Jul 18, 2015

Contributor

Most of the other checks for a valid "job" do not check for data->stopped. Should probably also be added there.

This comment has been minimized.

@tarruda
@jszakmeister

This comment has been minimized.

Show comment
Hide comment
@jszakmeister

jszakmeister Jul 18, 2015

Member

@jszakmeister I admit that sometimes I have trouble parsing quickbuild output.

Me too. I wish it QuickBuild was better in that regard.

Member

jszakmeister commented Jul 18, 2015

@jszakmeister I admit that sometimes I have trouble parsing quickbuild output.

Me too. I wish it QuickBuild was better in that regard.

@@ -133,6 +137,41 @@ static const char *err_extra_cmd =
N_("Too many \"+command\", \"-c command\" or \"--cmd command\" arguments");
void event_init(void)
{
loop_init(&loop, NULL);

This comment has been minimized.

@oni-link

oni-link Jul 21, 2015

Contributor

This call to loop_init() is too late. command_line_scan() in main() could already need the loop initialized.
See #3045 (comment).

@oni-link

oni-link Jul 21, 2015

Contributor

This call to loop_init() is too late. command_line_scan() in main() could already need the loop initialized.
See #3045 (comment).

This comment has been minimized.

@tarruda

tarruda Jul 21, 2015

Member

👍

@tarruda

This comment has been minimized.

@mhinz

mhinz Jul 21, 2015

Member

👍

@mhinz

mhinz Jul 21, 2015

Member

👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment