Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

After suspending the job, executing bg causes Arthas to exit #694

Closed
hengyunabc opened this issue May 13, 2019 · 5 comments
Closed

After suspending the job, executing bg causes Arthas to exit #694

hengyunabc opened this issue May 13, 2019 · 5 comments
Milestone

Comments

@hengyunabc
Copy link
Collaborator

用 arthas-demo 来测试:

先执行 watch demo.MathGame primeFactors target &

再执行 jobs ,得到job id。 再执行 ctrl + z 挂起job,再执行 bg jobId ,Arthas就会马上退出。

01 2019-05-13 19:42:20.386 ERROR [nioEventLoopGroup-2-2:i.t.c.io_error] Reported io error => closing
java.lang.IllegalStateException: null
        at com.taobao.arthas.core.shell.term.impl.TermImpl.readline(TermImpl.java:81) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at com.taobao.arthas.core.shell.impl.ShellImpl.readline(ShellImpl.java:145) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at com.taobao.arthas.core.shell.handlers.shell.ShellLineHandler.handleBackground(ShellLineHandler.java:125) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at com.taobao.arthas.core.shell.handlers.shell.ShellLineHandler.handle(ShellLineHandler.java:54) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at com.taobao.arthas.core.shell.handlers.shell.ShellLineHandler.handle(ShellLineHandler.java:17) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at com.taobao.arthas.core.shell.handlers.term.RequestHandler.accept(RequestHandler.java:22) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at com.taobao.arthas.core.shell.handlers.term.RequestHandler.accept(RequestHandler.java:10) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at io.termd.core.readline.Readline$Interaction.end(Readline.java:233) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at io.termd.core.readline.Readline$Interaction.access$000(Readline.java:189) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at io.termd.core.readline.Readline$2.apply(Readline.java:479) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at io.termd.core.readline.Readline$Interaction.handle(Readline.java:264) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at io.termd.core.readline.Readline$Interaction.access$200(Readline.java:189) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at io.termd.core.readline.Readline.deliver(Readline.java:123) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at io.termd.core.readline.Readline.access$500(Readline.java:41) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at io.termd.core.readline.Readline$Interaction$2.accept(Readline.java:427) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at io.termd.core.readline.Readline$Interaction$2.accept(Readline.java:421) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at io.termd.core.tty.TtyEventDecoder.accept(TtyEventDecoder.java:93) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at io.termd.core.tty.TtyEventDecoder.accept(TtyEventDecoder.java:26) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at io.termd.core.tty.ReadBuffer.accept(ReadBuffer.java:44) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at io.termd.core.tty.ReadBuffer.accept(ReadBuffer.java:28) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at io.termd.core.io.BinaryDecoder.write(BinaryDecoder.java:117) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at io.termd.core.io.BinaryDecoder.write(BinaryDecoder.java:65) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at io.termd.core.telnet.TelnetTtyConnection.onData(TelnetTtyConnection.java:139) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at io.termd.core.telnet.TelnetConnection.flushData(TelnetConnection.java:267) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at io.termd.core.telnet.TelnetConnection.flushDataIfNecessary(TelnetConnection.java:257) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at io.termd.core.telnet.TelnetConnection.receive(TelnetConnection.java:108) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at io.termd.core.telnet.netty.TelnetChannelHandler.channelRead(TelnetChannelHandler.java:46) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [arthas-core.jar:3.1.1-SNAPSHOT]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [arthas-core.jar:3.1.1-SNAPSHOT]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) [arthas-core.jar:3.1.1-SNAPSHOT]
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1408) [arthas-core.jar:3.1.1-SNAPSHOT]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [arthas-core.jar:3.1.1-SNAPSHOT]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [arthas-core.jar:3.1.1-SNAPSHOT]
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:930) [arthas-core.jar:3.1.1-SNAPSHOT]
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [arthas-core.jar:3.1.1-SNAPSHOT]
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:682) [arthas-core.jar:3.1.1-SNAPSHOT]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:617) [arthas-core.jar:3.1.1-SNAPSHOT]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:534) [arthas-core.jar:3.1.1-SNAPSHOT]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496) [arthas-core.jar:3.1.1-SNAPSHOT]
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:906) [arthas-core.jar:3.1.1-SNAPSHOT]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [arthas-core.jar:3.1.1-SNAPSHOT]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [arthas-core.jar:3.1.1-SNAPSHOT]

@hengyunabc hengyunabc added this to the 3.1.1 milestone May 13, 2019
@hengyunabc
Copy link
Collaborator Author

在 bg 执行时,有可能出现这种情况:

java.lang.IllegalStateException: Cannot write to standard output when stopped
        at com.taobao.arthas.core.shell.system.impl.ProcessImpl$CommandProcessImpl.write(ProcessImpl.java:453) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at com.taobao.arthas.core.command.monitor200.WatchAdviceListener.watching(WatchAdviceListener.java:93) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at com.taobao.arthas.core.command.monitor200.WatchAdviceListener.finishing(WatchAdviceListener.java:67) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at com.taobao.arthas.core.command.monitor200.WatchAdviceListener.afterReturning(WatchAdviceListener.java:51) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at com.taobao.arthas.core.advisor.ReflectAdviceListenerAdapter.afterReturning(ReflectAdviceListenerAdapter.java:133) ~[arthas-core.jar:3.1.1-SNAPSHOT]
        at com.taobao.arthas.core.advisor.AdviceWeaver.afterReturning(AdviceWeaver.java:317) [arthas-core.jar:3.1.1-SNAPSHOT]
        at com.taobao.arthas.core.advisor.AdviceWeaver.methodOnEnd(AdviceWeaver.java:158) [arthas-core.jar:3.1.1-SNAPSHOT]
        at com.taobao.arthas.core.advisor.AdviceWeaver.methodOnReturnEnd(AdviceWeaver.java:111) [arthas-core.jar:3.1.1-SNAPSHOT]
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source) ~[na:na]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_181]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_181]
        at demo.MathGame.primeFactors(MathGame.java:61) [arthas-demo.jar:3.1.0]
        at demo.MathGame.run(MathGame.java:24) [arthas-demo.jar:3.1.0]
        at demo.MathGame.main(MathGame.java:16) [arthas-demo.jar:3.1.0]

@hengyunabc
Copy link
Collaborator Author

实际上是由 #410 引起的,因为配置了一个 QExitHandler,导致 com.taobao.arthas.core.shell.term.impl.TermImpl.readline(String, Handler<String>, Handler<Completion>) 里判断时,抛出异常:

    public void readline(String prompt, Handler<String> lineHandler, Handler<Completion> completionHandler) {
        if (conn.getStdinHandler() != echoHandler) {
            throw new IllegalStateException();
        }
        if (inReadline) {
            throw new IllegalStateException();
        }
        inReadline = true;
        readline.readline(conn, prompt, new RequestHandler(this, lineHandler), new CompletionHandler(completionHandler, session));
    }

因为job有多种状态,所以当job在切换状态时,可能需要保证在readline时,stdinHandler是 echoHandler

public enum ExecStatus {

    /**
     * The job is ready, it can be running or terminated.
     */
    READY,

    /**
     * The job is running, it can be stopped or terminated.
     */
    RUNNING,

    /**
     * The job is stopped, it can be running or terminated.
     */
    STOPPED,

    /**
     * The job is terminated.
     */
    TERMINATED
}

另外,可能目前的 com.taobao.arthas.core.shell.handlers.shell.ShellLineHandler实现逻辑不太对。

    private void handleBackground(List<CliToken> tokens) {
        String arg = TokenUtils.findSecondTokenText(tokens);
        Job job;
        if (arg == null) {
            job = shell.getForegroundJob();
        } else {
            job = shell.jobController().getJob(getJobId(arg));
        }
        if (job == null) {
            term.write(arg + " : no such job\n");
            shell.readline();
        } else {
            if (job.status() == ExecStatus.STOPPED) {
                job.resume(false);
                term.echo(shell.statusLine(job, ExecStatus.RUNNING));
                shell.readline();
            } else {
                term.write("job " + job.id() + " is already running\n");
                shell.readline();
            }
        }
    }

resume之后,调用了 shell.readline();,但job resume之后,有可能是 bg 运行状态,这时不应该读数据,并且这里读到的数据类似是 bg 1,这个数据传到 Command里了,感觉这个逻辑也不对。

@hengyunabc
Copy link
Collaborator Author

看了下 vertx-shell 里的实现,bg后的确调用了 readline

https://github.com/vert-x3/vertx-shell/blob/3.7.0/src/main/java/io/vertx/ext/shell/impl/ShellImpl.java#L225

hengyunabc added a commit that referenced this issue May 14, 2019
@hengyunabc
Copy link
Collaborator Author

找到最根本原因是 resume里没有正确把 foreground 参数传递过去。传过去之后可以正常工作。

@hengyunabc
Copy link
Collaborator Author

hengyunabc commented May 14, 2019

修复的测试包: arthas-3.1.1-SNAPSHOT-bin.zip

XhinLiang pushed a commit to XhinLiang/arthas-mvel that referenced this issue Aug 5, 2019
lzc-alioo pushed a commit to lzc-alioo/arthas that referenced this issue May 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant