This repository was archived by the owner on Feb 8, 2021. It is now read-only.

@Crazykev
Contributor

No description provided.

@Crazykev
Contributor Author

Could anyone help take a look at this test log? Something doesn't make sense.
I added a test case here to test this API. The order of the exec operations is:

execId, err := s.client.ContainerExecCreate(cName, []string{"sh", "-c", "top"}, false)
err = s.client.ContainerExecStart(cName, execId, nil, nil, nil, false)
err = s.client.ContainerExecSignal(cName, execId, sigKill)

However, the test log in Jenkins shows the order Create Exec -> Kill Exec -> Start Exec:

12:08:16 I0113 12:08:16.636046    7947 exec.go:12] create exec containerID:"test-exec-signal" command:"sh" command:"-c" command:"top" 
12:08:16 I0113 12:08:16.636073    7947 exec.go:34] Create Exec for container test-exec-signal
12:08:16 I0113 12:08:16.636633    7947 vm_console.go:46] SB[vm-JrkUcrJWgM] [CNL] hyper_modify_event modify event fd 3, 0x61b568, event 1
12:08:16 I0113 12:08:16.637094    7947 vm_console.go:46] SB[vm-JrkUcrJWgM] [CNL] hyper_handle_event event EPOLLOUT, he 0x1959eb8, fd 7, 0x61b500
12:08:16 I0113 12:08:16.637109    7947 vm_console.go:46] SB[vm-JrkUcrJWgM] [CNL] write_to_stdin, seq 1
12:08:16 I0113 12:08:16.637396    7947 vm_console.go:46] SB[vm-JrkUcrJWgM] [CNL] hyper_modify_event modify event fd 7, 0x1959eb8, event 0
12:08:16 I0113 12:08:16.638507    7947 vm_console.go:46] SB[vm-JrkUcrJWgM] [CNL] pid 329 exit normally, status 0
12:08:16 I0113 12:08:16.638846    7947 vm_console.go:46] SB[vm-JrkUcrJWgM] [CNL] pid 330 exit normally, status 0
12:08:16 I0113 12:08:16.639016    7947 vm_console.go:46] SB[vm-JrkUcrJWgM] [CNL] pid 331 exit normally, status 0
12:08:16 I0113 12:08:16.641498    7947 vm_console.go:46] SB[vm-JrkUcrJWgM] [CNL] hyper_install_process_stdio
12:08:16 I0113 12:08:16.660299    7947 exec.go:108] ExecSignal with request containerID:"test-exec-signal" execID:"exec-vcOJsBBZBr" signal:9 
12:08:16 I0113 12:08:16.660362    7947 exec.go:58] Kill Exec for container efdf3e785127da47bc287a0c761c24456e5062222c23b43f8c1ec317918d4b1b
12:08:16 I0113 12:08:16.660388    7947 hypervisor.go:29] vm vm-JrkUcrJWgM: main event loop got message 16(GENERIC_OPERATION)
12:08:16 I0113 12:08:16.660396    7947 vm_states.go:228] handle GenericOperation(SignalProcess) on state(RUNNING)
12:08:16 I0113 12:08:16.660405    7947 init_comm.go:189] got cmd:24
12:08:16 I0113 12:08:16.660432    7947 init_comm.go:281] send command 24 to init, payload: '{"container":"efdf3e785127da47bc287a0c761c24456e5062222c23b43f8c1ec317918d4b1b","process":"exec-vcOJsBBZBr","signal":9}'.
12:08:16 I0113 12:08:16.660433    7947 exec.go:46] Start Exec for container test-exec-signal
12:08:16 I0113 12:08:16.660457    7947 exec.go:117] Pod[busybox] Con[efdf3e785127] Exec[exec-vcOJsBBZBr] the sync chan is empty
12:08:16 I0113 12:08:16.660475    7947 hypervisor.go:29] vm vm-JrkUcrJWgM: main event loop got message 16(GENERIC_OPERATION)
12:08:16 I0113 12:08:16.660481    7947 vm_states.go:228] handle GenericOperation(AddProcess) on state(RUNNING)
12:08:16 I0113 12:08:16.660521    7947 init_comm.go:294] write 127 to hyperstart.

The kernel panic later in the log could be fixed by hyperhq/hyperstart#253, but that is not the root cause.

BTW: the "passed" label on Hykins doesn't seem right either. @Jimmy-Xu

@Jimmy-Xu
Contributor

retest this please @hykins

@Jimmy-Xu
Contributor

@Crazykev The errors in hykins and travis are the same now.
FAIL: hyper_test.go:533: TestSuite.TestSendExecSignal

@Crazykev
Contributor Author

@Jimmy-Xu Yep, thanks. For now, I still can't understand the log behavior described in #507 (comment). @gnawux, could you help review this sometime?

@gao-feng
Contributor

I0113 12:07:16.935281   23599 vm_console.go:46] SB[vm-FaqvuiPhrh] [CNL] hyper_modify_event modify event fd 4, 0x61a588, event 1
E0113 12:07:21.756979   23599 exec.go:170] Pod[busybox] Con[dc8df5016929] Exec[exec-jRZoYAsQAf] wait exec exit code timeout
E0113 12:07:21.757011   23599 exec.go:97] Wait error: wait exec exit code timeout
hyper_test.go:564:
    c.Assert(err, IsNil)
... value grpc.rpcError = grpc.rpcError{code:0x2, desc:"wait exec exit code timeout"} ("rpc error: code = 2 desc = \"wait exec exit code timeout\"")

@Crazykev
Contributor Author

@gao-feng Yes, that is the final failure here, but it is not the only error message I got when running this test over and over. While debugging, I couldn't make sense of the log in CI (or in my local env).
The log in travis looks more "normal" than the one in jenkins, but handle GenericOperation(StartStdin) on state(RUNNING) still appears after ExecSignal with request....
The failure in hykins happens because, when trying to kill the process in hyperstart, it could not find a process with that exec-id, so I was wondering whether there is a race condition.

@gao-feng
Contributor

It seems the ExecStart API doesn't wait for the result of AddProcess from the server; it only returns the stream. The ExecStart request may not have been delivered to, or handled by, the server when the next ExecSignal request arrives.

@gao-feng
Contributor

hyperhq/hyperstart#257 fixes the missing process-finished event.

@Crazykev
Contributor Author

Crazykev commented Feb 9, 2017

The test case had some issues, too. It should work once hyperhq/hyperstart#257 is merged.

daemon/exec.go Outdated
}

func (daemon *Daemon) KillExec(containerId string, execId string, signal int64) error {
p, id, ok := daemon.PodList.GetByContainerIdOrName(containerId)
Contributor

daemon/exec.go:51: id declared and not used
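
The usual way to address this compile error in Go is to discard the unused return value with the blank identifier. The sketch below assumes a three-value lookup shaped like the call in the diff; `lookup` and `killExec` are hypothetical stand-ins, not the real `GetByContainerIdOrName` or `KillExec`.

```go
package main

import "fmt"

// lookup is a hypothetical stand-in for a three-value lookup like the one in the diff.
func lookup(name string) (pod string, id string, ok bool) {
	if name == "test-exec-signal" {
		return "busybox", "efdf3e785127", true
	}
	return "", "", false
}

// killExec only needs the pod and the ok flag, so the id is discarded.
func killExec(container string) error {
	// Writing `p, id, ok := lookup(container)` without using id would not compile.
	p, _, ok := lookup(container)
	if !ok {
		return fmt.Errorf("cannot find container %s", container)
	}
	fmt.Printf("signalling exec in pod %s\n", p)
	return nil
}

func main() {
	fmt.Println(killExec("test-exec-signal"))
	fmt.Println(killExec("missing"))
}
```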

Signed-off-by: HaoZhang <crazykev@zju.edu.cn>
Signed-off-by: HaoZhang <crazykev@zju.edu.cn>
@Crazykev
Contributor Author

@gao-feng Fixed. CI doesn't complain now.

@Crazykev Crazykev changed the title from "[Not-test-yet]add ExecSignal grpc api" to "add ExecSignal grpc api" Feb 15, 2017
@gao-feng
Contributor

LGTM

@gao-feng gao-feng merged commit 0c77074 into hyperhq:master Feb 15, 2017
@Crazykev Crazykev deleted the exec-signal branch February 15, 2017 04:33
3 participants