Problems attaching from a k8s ephemeral container #1874

Open
hengyunabc opened this issue Jul 28, 2021 · 11 comments

@hengyunabc
Collaborator

Test version information

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.3", GitCommit:"ca643a4d1f7bfe34773c74f79527be4afd95bf39", GitTreeState:"clean", BuildDate:"2021-07-15T21:04:39Z", GoVersion:"go1.16.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.2", GitCommit:"092fbfbf53427de67cac1e9fa54aaa09a28371d7", GitTreeState:"clean", BuildDate:"2021-06-16T12:53:14Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"linux/amd64"}

Start k8s:

minikube start --feature-gates=EphemeralContainers=true

Start a simple Java pod:

kubectl run arthas-demo --image=hengyunabc/atest:0.0.3

Dockerfile:

FROM openjdk:8-jdk
RUN wget https://arthas.aliyun.com/math-game.jar

ENTRYPOINT ["/bin/sh", "-c", "java -jar math-game.jar"]

Check pods status:

$ kubectl get pods
NAME          READY   STATUS    RESTARTS   AGE
arthas-demo   1/1     Running   0          5m58s

Debug with an ephemeral container:

kubectl debug -it arthas-demo --image=openjdk:8-jdk --target=arthas-demo

ps shows the Java process, but jps does not, and jstack -l fails:

root@arthas-demo:/# ps aux
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  0.0  0.0   2392   744 ?        Ss   10:00   0:00 /bin/sh -c java -jar math-game.jar
root           8  0.1  0.7 4054728 62724 ?       Sl   10:00   0:10 java -jar math-game.jar
root          18  0.0  0.0   5756  3600 pts/0    Ss+  10:00   0:00 bash
root         115  1.0  0.0   5756  3552 pts/0    Ss   11:50   0:00 bash
root         121  0.0  0.0   9396  3012 pts/0    R+   11:50   0:00 ps aux
root@arthas-demo:/# jps
122 Jps
root@arthas-demo:/# jstack -l 8
8: Unable to open socket file: target process not responding or HotSpot VM not loaded
The -F option can be used when the target process is not responding

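Attempt

Tried copying the application container's /tmp/hsperfdata_root/ directory into the ephemeral container. jps then lists the process, but jstack still fails: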

root@arthas-demo:/# cp -r /proc/8/root/tmp/hsperfdata_root/ /tmp
root@arthas-demo:/# jps
148 Jps
8 jar
root@arthas-demo:/# jstack -l 8
8: Unable to open socket file: target process not responding or HotSpot VM not loaded
The -F option can be used when the target process is not responding
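
A note on why copying hsperfdata helps jps but not jstack: jps discovers JVMs by scanning hsperfdata_<user>/<pid> files under its own /tmp, while jstack uses the HotSpot attach mechanism, which talks to the target over a .java_pid<pid> UNIX socket in a temp directory. With --target the ephemeral container shares the process namespace but not the filesystem, so neither file appears in its own /tmp. As far as I know, JDK 8 tools look for the socket in their own /tmp, whereas newer JDKs resolve it through /proc/<pid>/root/tmp, which would be consistent with the JDK 17 result further down. The sketch below only reports what is visible through /proc/<pid>/root/tmp. The function name attach_artifacts and the PROC_DIR override are hypothetical, and the socket normally exists only after an attach has been attempted at least once.

```shell
# Hypothetical helper: report which HotSpot attach artifacts of a target JVM
# are visible from the current (debug) container.
# PROC_DIR is an assumed override so the function can be tried outside a pod.
attach_artifacts() {
    pid="$1"
    user="${2:-root}"
    target_tmp="${PROC_DIR:-/proc}/$pid/root/tmp"

    # jps lists a JVM only if hsperfdata_<user>/<pid> exists in the /tmp it scans.
    if [ -e "$target_tmp/hsperfdata_$user/$pid" ]; then
        echo "perfdata: present"
    else
        echo "perfdata: missing"
    fi

    # The attach socket is created lazily, on the first attach request.
    if [ -e "$target_tmp/.java_pid$pid" ]; then
        echo "socket: present"
    else
        echo "socket: missing"
    fi
}
```

Inside the ephemeral container this would be called as attach_artifacts 8 for the JDK 8 pod above.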

@hengyunabc
Collaborator Author

Testing with JDK 17

Start the application container:

kubectl run arthas-demo --image=hengyunabc/atest:0.0.4

Dockerfile:

FROM openjdk:17-jdk
RUN curl https://arthas.aliyun.com/math-game.jar -o math-game.jar
ENTRYPOINT ["/bin/sh", "-c", "java -jar math-game.jar"]

Debug with an ephemeral container

Dockerfile for the ephemeral container:

FROM openjdk:17-jdk
CMD ["sh"]

Run:

kubectl debug -it arthas-demo --image=hengyunabc/atest:0.0.4-debug --target=arthas-demo

Running jps directly in the container shows nothing. Only after copying process 1's /proc/1/root/tmp/hsperfdata_root/ into the ephemeral container's /tmp directory do jps and jstack succeed.

sh-4.4# /usr/java/openjdk-17/bin/jps
82 Jps
sh-4.4# cp -r /proc/1/root/tmp/hsperfdata_root/ /tmp
sh-4.4# /usr/java/openjdk-17/bin/jps
97 Jps
1 math-game.jar
sh-4.4# /usr/java/openjdk-17/bin/jstack -l 1
2021-07-28 16:42:04
Full thread dump OpenJDK 64-Bit Server VM (17-ea+32-2679 mixed mode, sharing):

Next, test arthas in the ephemeral container:

curl -O https://arthas.aliyun.com/arthas-boot.jar
java -jar arthas-boot.jar

The attach fails with: Agent JAR not found or no Agent-Class attribute

sh-4.4# java -jar arthas-boot.jar
[INFO] arthas-boot version: 3.5.3
[INFO] Found existing java process, please choose one and input the serial number of the process, eg : 1. Then hit ENTER.
* [1]: 1 math-game.jar

[INFO] Start download arthas from remote server: https://arthas.aliyun.com/download/3.5.3?mirror=center
[INFO] File size: 12.72 MB, downloaded size: 1.41 MB, downloading ...
[INFO] File size: 12.72 MB, downloaded size: 4.69 MB, downloading ...
[INFO] File size: 12.72 MB, downloaded size: 9.04 MB, downloading ...
[INFO] Download arthas success.
[INFO] arthas home: /root/.arthas/lib/3.5.3/arthas
[INFO] Try to attach process 1
[ERROR] Start arthas failed, exception stack trace:
com.sun.tools.attach.AgentLoadException: Agent JAR not found or no Agent-Class attribute
	at jdk.attach/sun.tools.attach.HotSpotVirtualMachine.loadAgent(HotSpotVirtualMachine.java:160)
	at com.taobao.arthas.core.Arthas.attachAgent(Arthas.java:120)
	at com.taobao.arthas.core.Arthas.<init>(Arthas.java:26)
	at com.taobao.arthas.core.Arthas.main(Arthas.java:139)
[INFO] Attach process 1 success.
[INFO] arthas-client connect 127.0.0.1 3658
Connect to telnet server error: 127.0.0.1 3658
java.net.ConnectException: Connection refused
	at java.base/sun.nio.ch.Net.pollConnect(Native Method)
	at java.base/sun.nio.ch.Net.pollConnectNow(Net.java:672)
	at java.base/sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:542)
	at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:597)
	at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:327)
	at java.base/java.net.Socket.connect(Socket.java:633)
	at org.apache.commons.net.SocketClient.connect(SocketClient.java:188)
	at org.apache.commons.net.SocketClient.connect(SocketClient.java:209)
	at com.taobao.arthas.client.TelnetConsole.process(TelnetConsole.java:306)
	at com.taobao.arthas.client.TelnetConsole.main(TelnetConsole.java:166)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at com.taobao.arthas.boot.Bootstrap.main(Bootstrap.java:615)
Usage: arthas-client [--help] [-c <value>] [-f <value>] [-w <value>] [-t
       <value>] [-h <value>] [target-ip] [port]

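The cause is that arthas-boot downloads the arthas files into the ephemeral container's ~/.arthas directory, which is not visible from the application container.

To fix this, the arthas files must be placed in a directory the application container can read,
for example under /proc/1/root.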

sh-4.4# cp -r /root/.arthas/ /proc/1/root/root/

After that, the attach succeeds.

sh-4.4# java -jar arthas-boot.jar
[INFO] arthas-boot version: 3.5.3
[INFO] Found existing java process, please choose one and input the serial number of the process, eg : 1. Then hit ENTER.
* [1]: 1 math-game.jar

[INFO] arthas home: /root/.arthas/lib/3.5.3/arthas
[INFO] Try to attach process 1
[INFO] Attach process 1 success.
[INFO] arthas-client connect 127.0.0.1 3658
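
The copy step above generalizes to any target PID: whatever path arthas-boot reports as arthas home has to exist at the same path inside the target's root filesystem, because the target JVM is the one that opens the agent jar. A minimal sketch, assuming the ephemeral container runs as a user allowed to write through /proc/<pid>/root. The names stage_arthas_home and PROC_DIR are hypothetical and not part of arthas.

```shell
# Hypothetical helper: mirror the arthas home directory into the target
# process's root filesystem at the same absolute path.
# PROC_DIR is an assumed override so the function can be tried outside a pod.
stage_arthas_home() {
    pid="$1"
    src="${2:-/root/.arthas}"

    if [ ! -d "$src" ]; then
        echo "no arthas home at $src" >&2
        return 1
    fi

    # /proc/<pid>/root is the target container's "/", so the parent of $src
    # inside the target is /proc/<pid>/root + dirname($src).
    dest_parent="${PROC_DIR:-/proc}/$pid/root$(dirname "$src")"
    mkdir -p "$dest_parent" && cp -r "$src" "$dest_parent/"
}
```

For the pod above, stage_arthas_home 1 would perform the same copy as cp -r /root/.arthas/ /proc/1/root/root/.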

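To summarize, there are essentially two problems to solve:

  • the JVM's own attach mechanism
  • the application container must be able to read the files that arthas downloads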

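The simplest approach is for the ephemeral container to share the /tmp directory with the application container. However, kubectl does not seem to offer a way to specify this, so it may have to be done through the API.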

@yjustdo
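yjustdo commented Aug 10, 2021

Hello. Following your approach, I tried this in k8s. The idea is to use the sidecar pattern: the main container and the sidecar container share the /tmp directory, the main container runs the Java program, and the sidecar runs arthas to monitor the application in the main container.
The YAML file is below; it pulls the image from your earlier comment: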

apiVersion: v1
kind: Pod
metadata:
  name: arthas4
spec:
  shareProcessNamespace: true
  containers:
  - name: glassfish2
    image: hengyunabc/atest:0.0.4
    volumeMounts:
    - name: html
      mountPath: /tmp/
  - name: glassfish
    image: hengyunabc/atest:0.0.4
    volumeMounts:
    - name: html
      mountPath: /tmp/
  volumes:
  - name: html
    emptyDir: {}

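After the pod started successfully, I entered the sidecar container's /tmp directory, downloaded arthas-boot.jar and ran it, then ran cp -r /root/.arthas/ /tmp/. This gives the following output, which shows the Java process in the main container (PID 6):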

sh-4.4# java -jar arthas-boot.jar
[INFO] arthas-boot version: 3.5.3
[INFO] Found existing java process, please choose one and input the serial number of the process, eg : 1. Then hit ENTER.
* [1]: 6 math-game.jar
  [2]: 24 math-game.jar

But after entering 1, the same error as yours above still appears:

sh-4.4# java -jar arthas-boot.jar
[INFO] arthas-boot version: 3.5.3
[INFO] Found existing java process, please choose one and input the serial number of the process, eg : 1. Then hit ENTER.
* [1]: 6 math-game.jar
  [2]: 24 math-game.jar
1
[INFO] arthas home: /root/.arthas/lib/3.5.3/arthas
[INFO] Try to attach process 6
[ERROR] Start arthas failed, exception stack trace: 
com.sun.tools.attach.AgentLoadException: Agent JAR not found or no Agent-Class attribute
        at jdk.attach/sun.tools.attach.HotSpotVirtualMachine.loadAgent(HotSpotVirtualMachine.java:160)
        at com.taobao.arthas.core.Arthas.attachAgent(Arthas.java:120)
        at com.taobao.arthas.core.Arthas.<init>(Arthas.java:26)
        at com.taobao.arthas.core.Arthas.main(Arthas.java:139)
[INFO] Attach process 6 success.
[INFO] arthas-client connect 127.0.0.1 3658
Connect to telnet server error: 127.0.0.1 3658
java.net.ConnectException: Connection refused
        at java.base/sun.nio.ch.Net.pollConnect(Native Method)
        at java.base/sun.nio.ch.Net.pollConnectNow(Net.java:672)
        at java.base/sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:542)
        at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:597)
        at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:327)
        at java.base/java.net.Socket.connect(Socket.java:633)
        at org.apache.commons.net.SocketClient.connect(SocketClient.java:188)
        at org.apache.commons.net.SocketClient.connect(SocketClient.java:209)
        at com.taobao.arthas.client.TelnetConsole.process(TelnetConsole.java:306)
        at com.taobao.arthas.client.TelnetConsole.main(TelnetConsole.java:166)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:568)
        at com.taobao.arthas.boot.Bootstrap.main(Bootstrap.java:615)
Usage: arthas-client [--help] [-c <value>] [-f <value>] [-w <value>] [-t
       <value>] [-h <value>] [target-ip] [port]

Arthas Telnet Client

EXAMPLES:
  java -jar arthas-client.jar 127.0.0.1 3658
  java -jar arthas-client.jar -c 'dashboard -n 1'
  java -jar arthas-client.jar -f batch.as 127.0.0.1

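However, I have already shared the /tmp directory between the two containers and moved the /root/.arthas/ files from the sidecar container (where arthas is installed) into /tmp.
Could you advise what else might be wrong? Thanks.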

@hengyunabc
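Collaborator Author

hengyunabc commented Aug 10, 2021

@yjustdo This is related to how arthas locates its jar directory. You need to cd into the arthas directory and start it from there. Judging from the error, the application process still cannot load the files that live in the sidecar container. Try adding export ARTHAS_LIB_DIR=/tmp before starting, which points the arthas download lib directory at /tmp.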
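
A minimal sketch of the two suggestions above combined, assuming arthas has already been unpacked into a directory on the shared volume. The default /tmp/arthas location and the function name start_arthas_from are assumptions made for illustration. ARTHAS_LIB_DIR is the variable named in the comment above.

```shell
# Hypothetical wrapper: start arthas-boot from inside its own directory,
# with the download lib directory pointed at the shared volume.
start_arthas_from() {
    arthas_dir="${1:-/tmp/arthas}"   # assumed unpack location on the shared volume
    lib_dir="${2:-/tmp}"             # value suggested in the comment above

    if [ ! -f "$arthas_dir/arthas-boot.jar" ]; then
        echo "arthas-boot.jar not found in $arthas_dir" >&2
        return 1
    fi

    # Run in a subshell so the caller's working directory and environment
    # are left untouched.
    (
        export ARTHAS_LIB_DIR="$lib_dir"
        cd "$arthas_dir" && java -jar arthas-boot.jar
    )
}
```

Because both containers mount the same volume at /tmp, any jar path arthas hands to the target JVM then resolves identically on both sides.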

@yjustdo
yjustdo commented Aug 11, 2021

Thanks. After cd arthas and then running java -jar arthas-boot.jar, I can attach to the process:

sh-4.4# cd arthas
sh-4.4# java -jar arthas-boot.jar
[INFO] arthas-boot version: 3.5.3
[INFO] Found existing java process, please choose one and input the serial number of the process, eg : 1. Then hit ENTER.
* [1]: 6 math-game.jar
  [2]: 24 math-game.jar
1
[INFO] arthas home: /tmp/arthas
[INFO] Try to attach process 6
[INFO] Attach process 6 success.
[INFO] arthas-client connect 127.0.0.1 3658
  ,---.  ,------. ,--------.,--.  ,--.  ,---.   ,---.                           
 /  O  \ |  .--. ''--.  .--'|  '--'  | /  O  \ '   .-'                          
|  .-.  ||  '--'.'   |  |   |  .--.  ||  .-.  |`.  `-.                          
|  | |  ||  |\  \    |  |   |  |  |  ||  | |  |.-'    |                         
`--' `--'`--' '--'   `--'   `--'  `--'`--' `--'`-----'                          
                                                                                

wiki       https://arthas.aliyun.com/doc                                        
tutorials  https://arthas.aliyun.com/doc/arthas-tutorials.html                  
version    3.5.3                                                                
main_class                                                                      
pid        6                                                                    
time       2021-08-11 01:16:15                                                  

[arthas@6]$ 

@hengyunabc
Collaborator Author

The latest version of jattach can load the agent, but its binary differs per platform, which adds complexity and file size.
https://github.com/apangin/jattach/releases/tag/v2.0

Also, even with jattach the files still have to be copied under /proc/$pid/root/.
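
A minimal sketch of that staging step, assuming the jattach command form from its README, jattach <pid> load instrument false <jar>. The helper copies an agent jar into the target's /tmp through /proc and prints the command instead of running it, because the jar path must be one the target JVM itself can resolve. The names stage_agent_jar and PROC_DIR are hypothetical, and which arthas jar to pass is left to the caller.

```shell
# Hypothetical helper: place an agent jar inside the target's filesystem and
# print the jattach command that refers to it by the target-visible path.
# PROC_DIR is an assumed override so the function can be tried outside a pod.
stage_agent_jar() {
    pid="$1"
    jar="$2"
    dest_dir="${PROC_DIR:-/proc}/$pid/root/tmp"

    cp "$jar" "$dest_dir/" || return 1

    # Inside the target container the copied file is simply /tmp/<name>.
    echo "jattach $pid load instrument false /tmp/$(basename "$jar")"
}
```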

@east4ming

@hengyunabc Could you explain in detail how to load arthas with jattach? Thanks

@hengyunabc
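Collaborator Author

@hengyunabc Could you explain in detail how to load arthas with jattach? Thanks

  1. First write the simplest possible Java agent yourself, then test loading it with jattach
  2. The simplest way is to load arthas-core.jar directly with jattach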

@east4ming
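
@hengyunabc Could you explain in detail how to load arthas with jattach? Thanks

  1. First write the simplest possible Java agent yourself, then test loading it with jattach
  2. The simplest way is to load arthas-core.jar directly with jattach

My container is based on an alpine JRE (no JDK).
After installing jattach, I loaded arthas with the following commands: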

bash-4.4# ps -ef
PID   USER     TIME  COMMAND
    1 root      0:00 bash /usr/local/tomcat/bin/catalina.sh run | tee -a logs/catalina.out
    9 root     13:07 /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -Djava.util.logging.config.file=/usr/local/tomcat/conf/logging.properties -Djava.util.logging.mana

bash-4.4# pwd
/proc/9/root

bash-4.4# jattach 9 load instrument false "/proc/9/root/arthas-core.jar"
Connected to remote JVM
Response code = 0
100

How can I then use arthas features such as Dashboard?

@dafu-wu

dafu-wu commented May 5, 2022

Is there currently a way to attach with JDK 8? @hengyunabc

@hengyunabc
Collaborator Author

Is there currently a way to attach with JDK 8? @hengyunabc

Try sharing the /tmp/ directory, download and extract arthas into /tmp/, and cd into that directory before running it.

@dafu-wu
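
dafu-wu commented May 5, 2022

Is there currently a way to attach with JDK 8? @hengyunabc

Try sharing the /tmp/ directory, download and extract arthas into /tmp/, and cd into that directory before running it.

Thanks, this is resolved by following the suggested approach together with #362.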
