Conversation

@slovdahl
Contributor

@slovdahl slovdahl commented Jan 30, 2024


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed (2 reviews required, with at least 1 Reviewer, 1 Author)

Issue

  • JDK-8226919: attach in linux hangs due to permission denied accessing /proc/pid/root (Bug - P3)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/17628/head:pull/17628
$ git checkout pull/17628

Update a local copy of the PR:
$ git checkout pull/17628
$ git pull https://git.openjdk.org/jdk.git pull/17628/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 17628

View PR using the GUI difftool:
$ git pr show -t 17628

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/17628.diff

Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Jan 30, 2024

👋 Welcome back slovdahl! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@slovdahl
Contributor Author

/issue add JDK-8226919

@openjdk

openjdk bot commented Jan 30, 2024

@slovdahl
Adding additional issue to issue list: 8226919: attach in linux hangs due to permission denied accessing /proc/pid/root.

@openjdk openjdk bot added the rfr Pull request is ready for review label Jan 30, 2024
@openjdk

openjdk bot commented Jan 30, 2024

@slovdahl The following label will be automatically applied to this pull request:

  • serviceability

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the serviceability serviceability-dev@openjdk.org label Jan 30, 2024
@mlbridge

mlbridge bot commented Jan 30, 2024

Webrevs

@slovdahl
Contributor Author

slovdahl commented Jan 30, 2024

I have poked around in the JDK sources but not found any tests related to this. Is there some prior art to look at?

Anyway, this is how I reproduced it locally, and verified that the fix works.

Basic environment information:

slovdahl@ubuntu2204:~/reproducer$ systemd --version
systemd 249 (249.11-0ubuntu3.12)
+PAM +AUDIT +SELINUX +APPARMOR +IMA +SMACK +SECCOMP +GCRYPT +GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY -P11KIT -QRENCODE +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified
slovdahl@ubuntu2204:~/reproducer$ sudo apt-get install openjdk-17-jdk-headless
slovdahl@ubuntu2204:~/reproducer$ /usr/lib/jvm/java-17-openjdk-amd64/bin/java -version
openjdk version "17.0.9" 2023-10-17
OpenJDK Runtime Environment (build 17.0.9+9-Ubuntu-122.04)
OpenJDK 64-Bit Server VM (build 17.0.9+9-Ubuntu-122.04, mixed mode, sharing)
slovdahl@ubuntu2204:~/reproducer$ /home/slovdahl/dev/external/jdk/build/linux-x86_64-server-release/images/jdk/bin/java -version
openjdk version "23-internal" 2024-09-17
OpenJDK Runtime Environment (build 23-internal-adhoc.slovdahl.jdk)
OpenJDK 64-Bit Server VM (build 23-internal-adhoc.slovdahl.jdk, mixed mode, sharing)

Reproducer systemd unit that can bind to port 81 as non-root user thanks to AmbientCapabilities=CAP_NET_BIND_SERVICE (https://man7.org/linux/man-pages/man7/capabilities.7.html):

slovdahl@ubuntu2204:~/reproducer$ cat Reproducer.java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class Reproducer {
  public static void main(String[] args) throws InterruptedException, IOException {
    System.out.println("Hello, World!");
    try (var server = new ServerSocket()) {
      server.bind(new InetSocketAddress("localhost", 81));
      System.out.println("Bound to port 81");
      while (true) {
        Thread.sleep(1_000L);
      }
    }
  }
}

slovdahl@ubuntu2204:~/reproducer$ cat reproducer.service
[Service]
Type=simple
ExecStart=/usr/lib/jvm/java-17-openjdk-amd64/bin/java /home/slovdahl/reproducer/Reproducer.java

User=slovdahl
Group=slovdahl
ReadWritePaths=/tmp
AmbientCapabilities=CAP_NET_BIND_SERVICE
slovdahl@ubuntu2204:~/reproducer$ sudo cp -a reproducer.service /etc/systemd/system/
slovdahl@ubuntu2204:~/reproducer$ sudo systemctl daemon-reload
slovdahl@ubuntu2204:~/reproducer$ sudo systemctl start reproducer.service
slovdahl@ubuntu2204:~/reproducer$ sudo systemctl status reproducer.service
● reproducer.service
     Loaded: loaded (/etc/systemd/system/reproducer.service; static)
     Active: active (running) since Tue 2024-01-30 11:45:15 EET; 1s ago
   Main PID: 2543233 (java)
      Tasks: 26 (limit: 76971)
     Memory: 105.5M
        CPU: 751ms
     CGroup: /system.slice/reproducer.service
             └─2543233 /usr/lib/jvm/java-17-openjdk-amd64/bin/java /home/slovdahl/reproducer/Reproducer.java

jan 30 11:45:15 ubuntu2204 systemd[1]: Started reproducer.service.
jan 30 11:45:15 ubuntu2204 java[2543233]: Hello, World!
jan 30 11:45:15 ubuntu2204 java[2543233]: Bound to port 81

slovdahl@ubuntu2204:~/reproducer$ ls -lh /proc/$(pgrep -f Reproducer.java)/root
ls: cannot read symbolic link '/proc/2543233/root': Permission denied
lrwxrwxrwx 1 slovdahl slovdahl 0 jan 30 11:45 /proc/2543233/root
slovdahl@ubuntu2204:~/reproducer$ sudo ls -lh /proc/$(pgrep -f Reproducer.java)/root
lrwxrwxrwx 1 slovdahl slovdahl 0 jan 30 11:45 /proc/2543233/root -> /

Fails with vanilla OpenJDK 17:

slovdahl@ubuntu2204:~/reproducer$ /usr/lib/jvm/java-17-openjdk-amd64/bin/jcmd $(pgrep -f Reproducer.java) VM.version
2543233:
com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file /proc/2543233/root/tmp/.java_pid2543233: target process 2543233 doesn't respond within 10500ms or HotSpot VM not loaded
	at jdk.attach/sun.tools.attach.VirtualMachineImpl.<init>(VirtualMachineImpl.java:104)
	at jdk.attach/sun.tools.attach.AttachProviderImpl.attachVirtualMachine(AttachProviderImpl.java:58)
	at jdk.attach/com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:207)
	at jdk.jcmd/sun.tools.jcmd.JCmd.executeCommandForPid(JCmd.java:113)
	at jdk.jcmd/sun.tools.jcmd.JCmd.main(JCmd.java:97)

Works when trying to attach as root with vanilla OpenJDK 17:

slovdahl@ubuntu2204:~/reproducer$ sudo /usr/lib/jvm/java-17-openjdk-amd64/bin/jcmd $(pgrep -f Reproducer.java) VM.version
2543233:
OpenJDK 64-Bit Server VM version 17.0.9+9-Ubuntu-122.04
JDK 17.0.9

A JDK built with the fix in this PR:

slovdahl@ubuntu2204:~/reproducer$ sudo systemctl stop reproducer.service
slovdahl@ubuntu2204:~/reproducer$ cat reproducer-custom-jvm.service
[Service]
Type=simple
ExecStart=/home/slovdahl/dev/external/jdk/build/linux-x86_64-server-release/images/jdk/bin/java /home/slovdahl/reproducer/Reproducer.java

User=slovdahl
Group=slovdahl
ReadWritePaths=/tmp
AmbientCapabilities=CAP_NET_BIND_SERVICE
slovdahl@ubuntu2204:~/reproducer$ sudo cp -a reproducer-custom-jvm.service /etc/systemd/system/
slovdahl@ubuntu2204:~/reproducer$ sudo systemctl daemon-reload
slovdahl@ubuntu2204:~/reproducer$ sudo systemctl start reproducer-custom-jvm.service
slovdahl@ubuntu2204:~/reproducer$ sudo systemctl status reproducer-custom-jvm.service
● reproducer-custom-jvm.service
     Loaded: loaded (/etc/systemd/system/reproducer-custom-jvm.service; static)
     Active: active (running) since Tue 2024-01-30 11:49:13 EET; 1s ago
   Main PID: 2546431 (java)
      Tasks: 26 (limit: 76971)
     Memory: 68.4M
        CPU: 809ms
     CGroup: /system.slice/reproducer-custom-jvm.service
             └─2546431 /home/slovdahl/dev/external/jdk/build/linux-x86_64-server-release/images/jdk/bin/java /home/slovdahl/reproducer/Reproducer.java

jan 30 11:49:13 ubuntu2204 systemd[1]: Started reproducer-custom-jvm.service.
jan 30 11:49:14 ubuntu2204 java[2546431]: Hello, World!
jan 30 11:49:14 ubuntu2204 java[2546431]: Bound to port 81

slovdahl@ubuntu2204:~/reproducer$ ls -lh /proc/$(pgrep -f Reproducer.java)/root
ls: cannot read symbolic link '/proc/2546431/root': Permission denied
lrwxrwxrwx 1 slovdahl slovdahl 0 jan 30 11:49 /proc/2546431/root

Attaching works as non-root with the fixed JDK:

slovdahl@ubuntu2204:~/reproducer$ /home/slovdahl/dev/external/jdk/build/linux-x86_64-server-release/images/jdk/bin/jcmd $(pgrep -f Reproducer.java) VM.version
2546431:
OpenJDK 64-Bit Server VM version 23-internal-adhoc.slovdahl.jdk
JDK 23.0.0

Attaching to a JVM inside a Docker container works as before with vanilla OpenJDK 17 and my locally built one (always requires root):

slovdahl@ubuntu2204:~/reproducer$ sudo systemctl stop reproducer-custom-jvm.service
slovdahl@ubuntu2204:~/reproducer$ docker run eclipse-temurin:17 java -version
openjdk version "17.0.10" 2024-01-16
OpenJDK Runtime Environment Temurin-17.0.10+7 (build 17.0.10+7)
OpenJDK 64-Bit Server VM Temurin-17.0.10+7 (build 17.0.10+7, mixed mode, sharing)

slovdahl@ubuntu2204:~/reproducer$ docker run --rm -v .:/app -w /app eclipse-temurin:17 java Reproducer.java
Hello, World!
Bound to port 81

slovdahl@ubuntu2204:~/reproducer$ /usr/lib/jvm/java-17-openjdk-amd64/bin/jcmd $(pgrep --newest -f Reproducer.java) VM.version
2553682:
java.io.IOException: Permission denied
	at java.base/java.io.UnixFileSystem.createFileExclusively(Native Method)
	at java.base/java.io.File.createNewFile(File.java:1043)
	at jdk.attach/sun.tools.attach.VirtualMachineImpl.createAttachFile(VirtualMachineImpl.java:308)
	at jdk.attach/sun.tools.attach.VirtualMachineImpl.<init>(VirtualMachineImpl.java:80)
	at jdk.attach/sun.tools.attach.AttachProviderImpl.attachVirtualMachine(AttachProviderImpl.java:58)
	at jdk.attach/com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:207)
	at jdk.jcmd/sun.tools.jcmd.JCmd.executeCommandForPid(JCmd.java:113)
	at jdk.jcmd/sun.tools.jcmd.JCmd.main(JCmd.java:97)
slovdahl@ubuntu2204:~/reproducer$ sudo /usr/lib/jvm/java-17-openjdk-amd64/bin/jcmd $(pgrep --newest -f Reproducer.java) VM.version
2553682:
OpenJDK 64-Bit Server VM version 17.0.10+7
JDK 17.0.10

slovdahl@ubuntu2204:~/reproducer$ /home/slovdahl/dev/external/jdk/build/linux-x86_64-server-release/images/jdk/bin/jcmd $(pgrep --newest -f Reproducer.java) VM.version
2553682:
java.io.IOException: Unable to access root directory /proc/2553682/root of target process 2553682
	at jdk.attach/sun.tools.attach.VirtualMachineImpl.findTargetProcessTmpDirectory(VirtualMachineImpl.java:247)
	at jdk.attach/sun.tools.attach.VirtualMachineImpl.findSocketFile(VirtualMachineImpl.java:214)
	at jdk.attach/sun.tools.attach.VirtualMachineImpl.<init>(VirtualMachineImpl.java:71)
	at jdk.attach/sun.tools.attach.AttachProviderImpl.attachVirtualMachine(AttachProviderImpl.java:58)
	at jdk.attach/com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:207)
	at jdk.jcmd/sun.tools.jcmd.JCmd.executeCommandForPid(JCmd.java:113)
	at jdk.jcmd/sun.tools.jcmd.JCmd.main(JCmd.java:97)
slovdahl@ubuntu2204:~/reproducer$ sudo /home/slovdahl/dev/external/jdk/build/linux-x86_64-server-release/images/jdk/bin/jcmd $(pgrep --newest -f Reproducer.java) VM.version
2553682:
OpenJDK 64-Bit Server VM version 17.0.10+7
JDK 17.0.10

@slovdahl slovdahl changed the title 8307977: Fix dynamic attach to processes with elevated capabilities on Linux 8307977: jcmd and jstack broken for target processes running with elevated capabilities Jan 30, 2024
@tstuefe
Member

tstuefe commented Jan 30, 2024

ping @jerboaa

// A process may not exist in the same mount namespace as the caller.
// Instead, attach relative to the target root filesystem as exposed by
// procfs regardless of namespaces.
String root = "/proc/" + pid + "/root/" + tmpdir;

@perlun perlun Jan 30, 2024

Helping myself and other future readers understand this: the problem with the previous implementation is that the code assumed that the tmpdir could be accessed this way (/proc/<pid>/root/<tmpdir>). In other words:

  • The code for creating the socket would correctly check if pid != ns_pid and then act accordingly (/proc/<pid>/root/<tmpdir> or just plain <tmpdir>)
  • The code for reading the socket would not have the check above. It would always resort to using /proc/<pid>/root/<tmpdir>.
  • For certain scenarios (CAP_NET_BIND_SERVICE-processes, as described in 8226919: attach in linux hangs due to permission denied accessing /proc/pid/root #17628 (comment)), we would get a Permission denied when trying to access the temporary directory like this.

What this PR does is to ensure that the same pid != ns_pid check is used both when creating and reading the socket, and fall back to <tmpdir> when no namespacing is being used. This seems to work better for these processes with elevated permissions.

/cc @slovdahl, feel free to add/correct the above if needed.
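The summary above can be sketched roughly as follows. This is only an illustration of the `pid != ns_pid` decision, not the actual `VirtualMachineImpl` code: the class and method names are hypothetical, the `.java_pid<ns_pid>` socket file name is taken from the error output earlier in this thread, and `tmpdir` is assumed to be an absolute path such as `/tmp`.

```java
// Hypothetical sketch; names are illustrative and not taken from the JDK sources.
final class AttachPathSketch {

    // pid:   the target's PID as seen by the attaching process
    // nsPid: the target's PID inside its own (innermost) PID namespace
    static String targetTmpDir(int pid, int nsPid, String tmpdir) {
        if (pid != nsPid) {
            // The target appears to live in another PID namespace (e.g. a container),
            // so go through its root filesystem as exposed by procfs.
            return "/proc/" + pid + "/root" + tmpdir;
        }
        // No namespacing detected: use the plain tmpdir, which does not
        // require permission to traverse /proc/<pid>/root.
        return tmpdir;
    }

    // Full path of the socket file the attaching side would look for.
    static String socketPath(int pid, int nsPid, String tmpdir) {
        return targetTmpDir(pid, nsPid, tmpdir) + "/.java_pid" + nsPid;
    }
}
```

With the reproducer above (pid and ns_pid both 2543233), this would look in /tmp instead of /proc/2543233/root/tmp.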

Contributor

should it not be comparing pid namespace ids and not pids?

Contributor Author

Do you mean that it should compare the input PID against the outermost (leftmost) PID in the NSpid list from /proc/<pid>/status and not innermost (rightmost) as is done right now? What would be the benefit of that? Or did you mean something else?
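For readers unfamiliar with the NSpid list mentioned here, the following is a minimal sketch of picking the innermost (rightmost) entry from the lines of /proc/<pid>/status. The class and method names are hypothetical, and this is not claimed to be how the JDK parses the file.

```java
import java.util.List;

// Hypothetical sketch; not the JDK's parsing code.
final class NsPidSketch {

    // Returns the rightmost PID on the "NSpid:" line, i.e. the PID in the
    // innermost PID namespace, or fallbackPid if the line is missing or empty.
    static int innermostNsPid(List<String> statusLines, int fallbackPid) {
        for (String line : statusLines) {
            if (line.startsWith("NSpid:")) {
                String ids = line.substring("NSpid:".length()).trim();
                if (ids.isEmpty()) {
                    return fallbackPid;
                }
                String[] parts = ids.split("\\s+");
                return Integer.parseInt(parts[parts.length - 1]);
            }
        }
        return fallbackPid;
    }
}
```

The lines could come from something like `Files.readAllLines(Path.of("/proc/" + pid + "/status"))`. For a process in a nested namespace the line holds several values (outermost first), while for a process in the caller's namespace it holds a single value equal to the PID.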

I'm working on a fix for https://bugs.openjdk.org/browse/JDK-8327114 right now, and it occurred to me that there is a tiny risk of pid != ns_pid not evaluating to true even though the processes are in different PID namespaces (because two different PID namespaces can have the same PIDs). I think it could be mitigated by always trying /proc/<pid>/root/tmp first, and if it cannot be read, fall back to /tmp.
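The mitigation floated in this comment (try the procfs path first, fall back to /tmp if it cannot be read) might look roughly like the sketch below. It is only an illustration of the idea, not what this PR implements, and the names are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of the "try procfs first, then fall back" idea.
final class TmpDirFallbackSketch {

    // Returns the preferred directory if it is readable, otherwise the fallback.
    // Files.isReadable is false both when the path is missing and when access is denied.
    static Path firstReadable(Path preferred, Path fallback) {
        return Files.isReadable(preferred) ? preferred : fallback;
    }

    static Path targetTmpDir(long pid) {
        return firstReadable(Path.of("/proc/" + pid + "/root/tmp"), Path.of("/tmp"));
    }
}
```

Because the decision no longer depends on comparing PIDs, it would not be affected by two PID namespaces happening to assign the same PID.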

Contributor

cf. /proc/<pid>/ns/pid

Every Linux namespace has a unique id. If two or more processes occupy the same PID namespace (or any other kind of namespace, for that matter), their /proc/<pid>/ns/pid namespace ids will be the same.

Contributor

Files.readSymbolicLink(Path.of("/proc/self/ns/pid"))
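Building on that one-liner, a comparison of PID namespace ids could be sketched as follows. The names are hypothetical and this is not part of the PR. One caveat worth keeping in mind: reading another process's ns/pid link is itself subject to permission checks, so it may fail for the same kind of target process this PR is about.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch; not code from this PR.
final class PidNamespaceSketch {

    // Reads the namespace identifier (the symlink target, e.g. "pid:[4026531836]").
    static String namespaceId(Path nsLink) throws IOException {
        return Files.readSymbolicLink(nsLink).toString();
    }

    // True if both links point at the same namespace identifier.
    static boolean sameNamespace(Path a, Path b) throws IOException {
        return namespaceId(a).equals(namespaceId(b));
    }

    // Compares the caller's PID namespace with that of the target process.
    static boolean inCallersPidNamespace(long targetPid) throws IOException {
        return sameNamespace(Path.of("/proc/self/ns/pid"),
                             Path.of("/proc/" + targetPid + "/ns/pid"));
    }
}
```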

Contributor

h'mmm ignore my ramblings for now, I need to spend some more time looking into this before wading into the fray with random opinions etc!

@jerboaa
Contributor

jerboaa commented Jan 30, 2024

I have poked around in the JDK sources but not found any tests related to this. Is there some prior art to look at?

Please run container tests, which do some jcmd testing across containers (host system runs jcmd and containers have the JVMs): test/hotspot/jtreg/containers See testing.md as to how to run them. I'll give this PR a spin as well.

@jerboaa
Contributor

jerboaa commented Jan 30, 2024

test/hotspot/jtreg/serviceability tests would also be worth running.

@mlbridge

mlbridge bot commented Jan 30, 2024

Mailing list message from Bernd Eckenfels on serviceability-dev:

Is it actually safe to allow a low-privileged user context to attach to and control a higher-privileged one? It can at least overwrite files, but probably also inject code? On the native level a ptrace(2) would probably not be allowed.

Gruß
Bernd
https://bernd.eckenfels.net

@slovdahl
Contributor Author

Hi @jerboaa, thanks a lot for the hints! The container tests were new to me at least.

Just out of curiosity, are the container tests run as part of the tier1 tests (make test-tier1)? I'm not sure if I'm looking at the right place, but I get the feeling this is defined in test/hotspot/jtreg/TEST.groups, and the answer is maybe "no" in that case.

Please run container tests, which do some jcmd testing across containers (host system runs jcmd and containers have the JVMs): test/hotspot/jtreg/containers See testing.md as to how to run them. I'll give this PR a spin as well.

If I understand it correctly, this is the way to run them. Please correct me if I'm wrong.

$ make test TEST="jtreg:test/hotspot/jtreg/containers"

...

==============================
Test summary
==============================
   TEST                                              TOTAL  PASS  FAIL ERROR   
   jtreg:test/hotspot/jtreg/containers                  14    14     0     0   
==============================
TEST SUCCESS

test/hotspot/jtreg/serviceability tests would also be worth running.

$ make test TEST="jtreg:test/hotspot/jtreg/serviceability"

...

==============================
Test summary
==============================
   TEST                                              TOTAL  PASS  FAIL ERROR   
   jtreg:test/hotspot/jtreg/serviceability             368   368     0     0   
==============================
TEST SUCCESS

@slovdahl
Contributor Author

Is it actually safe to allow a low-privileged user context to attach to and control a higher-privileged one? It can at least overwrite files, but probably also inject code? On the native level a ptrace(2) would probably not be allowed.

It's a good question. For context, this has worked fine in JDK 8, and AFAIK it was never intentionally broken for security reasons.

In some cases the opposite can also be true: requiring root access to attach to a process is not acceptable, or not even possible.

@jerboaa
Contributor

jerboaa commented Jan 31, 2024

Hi @jerboaa, thanks a lot for the hints! The container tests were new to me at least.

Just out of curiosity, are the container tests run as part of the tier1 tests (make test-tier1)? I'm not sure if I'm looking at the right place, but I get the feeling this is defined in test/hotspot/jtreg/TEST.groups, and the answer is maybe "no" in that case.

Please run container tests, which do some jcmd testing across containers (host system runs jcmd and containers have the JVMs): test/hotspot/jtreg/containers See testing.md as to how to run them. I'll give this PR a spin as well.

If I understand it correctly, this is the way to run them. Please correct me if I'm wrong.

$ make test TEST="jtreg:test/hotspot/jtreg/containers"

...

==============================
Test summary
==============================
   TEST                                              TOTAL  PASS  FAIL ERROR   
   jtreg:test/hotspot/jtreg/containers                  14    14     0     0   
==============================
TEST SUCCESS

Thanks! Please make sure that the tests actually ran. If, for example, docker is not installed, they get skipped.

@slovdahl
Contributor Author

Thanks! Please make sure that the tests actually ran. If, for example, docker is not installed, they get skipped.

Ah, good point. Running the tests did take some amount of time, so it felt like they did something. And by spamming docker ps while the tests are running I could see e.g. this:

$ docker ps
CONTAINER ID   IMAGE                                                          COMMAND                  CREATED         STATUS                  PORTS                                                NAMES
179ff2470b18   jdk-internal:test-containers-docker-TestJFREvents-jfr-events   "/jdk/bin/java -cp /…"   1 second ago    Up Less than a second                                                        stoic_clarke

@jerboaa
Contributor

jerboaa commented Feb 5, 2024

Mailing list message from Bernd Eckenfels on serviceability-dev:

Is it actually safe to allow a low-privileged user context to attach to and control a higher-privileged one? It can at least overwrite files, but probably also inject code? On the native level a ptrace(2) would probably not be allowed.

Note that for the dynamic attach mechanism the ownership of the files the JVM creates on both sides needs to match. In this case it's user A with potentially elevated privileges (e.g. to bind to a port), and the attach happens from user A as well (without the same elevated privileges). So this doesn't make the security worse. It remains questionable whether it's safe to be allowed to attach in that case, but it's been like that in older releases (JDK 8).

@jerboaa
Contributor

jerboaa commented Feb 5, 2024

/reviewers 2

@openjdk

openjdk bot commented Feb 5, 2024

@jerboaa
The total number of required reviews for this PR (including the jcheck configuration and the last /reviewers command) is now set to 2 (with at least 1 Reviewer, 1 Author).

Contributor

@jerboaa jerboaa left a comment

This looks good to me, but I would like somebody from the serviceability group to look at this as well. @plummercj perhaps?

@kevinjwalls
Contributor

Hi,
Yes makes sense, this seems like an oversight that we were not consistent with the path.

Does CAP_NET_BIND_SERVICE cause any issues for createAttachFile(int pid, int ns_pid) where it creates the .attach file in the current directory - it starts by trying "/proc/" + pid + "/cwd/" + ".attach_pid" + ns_pid, regardless of ns_pid.

I'm curious if that always fails in the situation that causes the issue in this bug.
If so, it looks like it would catch an IOException and then use findTargetProcessTmpDirectory, but I wonder if we should predict that and go straight there.

Thanks!

@slovdahl
Contributor Author

slovdahl commented Feb 8, 2024

Does CAP_NET_BIND_SERVICE cause any issues for createAttachFile(int pid, int ns_pid) where it creates the .attach file in the current directory - it starts by trying "/proc/" + pid + "/cwd/" + ".attach_pid" + ns_pid, regardless of ns_pid.

I'm curious if that always fails in the situation that causes the issue in this bug.
If so, it looks like it would catch an IOException and then use findTargetProcessTmpDirectory, but I wonder if we should predict that and go straight there.

Hi @kevinjwalls, and thank you for taking a look!

To make sure we're on the same page, is what you are asking if something like this would make sense (on top of the current state of the PR)?

diff --git src/jdk.attach/linux/classes/sun/tools/attach/VirtualMachineImpl.java src/jdk.attach/linux/classes/sun/tools/attach/VirtualMachineImpl.java
index 81d4fd259ed..c06c972b39a 100644
--- src/jdk.attach/linux/classes/sun/tools/attach/VirtualMachineImpl.java
+++ src/jdk.attach/linux/classes/sun/tools/attach/VirtualMachineImpl.java
@@ -221,16 +221,19 @@ private File findSocketFile(int pid, int ns_pid) throws IOException {
     // checks for the file.
     private File createAttachFile(int pid, int ns_pid) throws IOException {
         String fn = ".attach_pid" + ns_pid;
-        String path = "/proc/" + pid + "/cwd/" + fn;
-        File f = new File(path);
-        try {
-            // Do not canonicalize the file path, or we will fail to attach to a VM in a container.
-            f.createNewFile();
-        } catch (IOException x) {
+
+        File f;
+        if (pid != ns_pid) {
+            String path = "/proc/" + pid + "/cwd/" + fn;
+            f = new File(path);
+        } else {
             String root = findTargetProcessTmpDirectory(pid, ns_pid);
             f = new File(root, fn);
-            f.createNewFile();
         }
+
+        // Do not canonicalize the file path, or we will fail to attach to a VM in a container.
+        f.createNewFile();
+
         return f;
     }
 

That is, if we know that pid and ns_pid are equal, do not even try to create the file in /proc/<pid>/cwd.

That's a good question. I tried to minimize the changes because I'm so unfamiliar with JDK internals and also don't have a good understanding of all the different use-cases that need to work.

I tried out the diff above locally using the reproducer steps from #17628 (comment). It seems to work equally fine in the case of a systemd unit using AmbientCapabilities=CAP_NET_BIND_SERVICE, and also in the case of attaching against a JVM running inside a Docker container.

The test/hotspot/jtreg/containers and test/hotspot/jtreg/serviceability tests all pass too.

That said, I'm still more confident in the current state of the PR, as it more closely follows what has existed before. But if you believe that this is a better way of handling it, I'm fine with that too.

@kevinjwalls
Contributor

kevinjwalls commented Feb 8, 2024

Thanks, yes that's what I was thinking about.
I tested and think it's a good update to this change.

I tested setting
sudo setcap 'cap_net_bind_service=+ep' build/linux-x64/images/jdk/bin/java
..and then with jcmd I do see the EACCES on e.g. "/proc/27979/root/tmp/.java_pid27979"

I see the failure to attach, and I see it fixed by this change.
I also see the EACCES on the .attach_pid file in cwd, e.g. in strace:
26682 open("/proc/26593/cwd/.attach_pid26593", O_RDWR|O_CREAT|O_EXCL, 0666 <unfinished ...>
...
26682 <... open resumed>) = -1 EACCES (Permission denied)

We catch this and retry in /tmp. But exactly as in your response, we can predict that and, if the target is not in a namespace, go straight to /tmp. I tested what you have there and it works well. I also tested that a new jcmd attaching to an older JDK still works.

One other thing - JDK-8226919 looks like the original bug for this, logged a few years back, so if this fixes both, the record should show that it fixes that one, and JDK-8307977 should be closed as a duplicate. I/somebody can take care of that JBS admin. But if this PR could be associated with only JDK-8226919 that would be simple.

Thanks! 8-)

@dcubed-ojdk
Member

Will this result in files being left in /tmp that are not cleaned up during test runs?

@kevinjwalls
Contributor

Will this result in files being left in /tmp that are not cleaned up during test runs?

It shouldn't... We do cleanup, VirtualMachineImpl creates the attach file and deletes it in a finally block.
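The create-then-delete-in-finally pattern described here can be sketched like this. It is a simplified illustration with hypothetical names, not the actual `VirtualMachineImpl` code.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical sketch of "create the attach file, always delete it afterwards".
final class AttachFileCleanupSketch {

    // Work to perform while the attach file exists (e.g. waiting for the socket).
    interface Action {
        void run() throws IOException;
    }

    static void withAttachFile(File attachFile, Action whileFileExists) throws IOException {
        attachFile.createNewFile();
        try {
            whileFileExists.run();
        } finally {
            // Runs whether the action succeeds or throws, so the file is not left behind.
            attachFile.delete();
        }
    }
}
```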

@dcubed-ojdk
Member

Cool. Thanks for the confirmation.

@kevinjwalls
Contributor

Hi, looking at it again:
Getting a target's current directory, you have to use /proc/PID/cwd, or you can fall back to using tmpdir.
The attach listener uses cwd and then tmpdir, so they will meet either way.

When getting the target's current directory, /proc/PID/cwd will fail for the elevated processes, but will work for others (where permissions allow).
So although I suggested the additional change, it can't be that much more efficient as it just makes the native attach listener try the second location...

So that was all very interesting, but let me just approve what we have here already. No need to proceed with the additional code you had in the comment response above. 8-)
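The behaviour that stays in place (try the target's cwd first, fall back to its tmpdir on failure) can be sketched as below. It mirrors the shape of the lines removed in the diff earlier in this thread, but takes the two directories as parameters for illustration; the class name and parameter names are hypothetical.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical sketch of the "cwd first, then tmpdir" attach file creation.
final class AttachFileLocationSketch {

    // cwdDir would be something like /proc/<pid>/cwd, tmpDir the target's tmpdir.
    static File createAttachFile(File cwdDir, File tmpDir, int nsPid) throws IOException {
        String fn = ".attach_pid" + nsPid;
        File f = new File(cwdDir, fn);
        try {
            f.createNewFile();
        } catch (IOException x) {
            // E.g. EACCES on /proc/<pid>/cwd for a target with elevated capabilities.
            f = new File(tmpDir, fn);
            f.createNewFile();
        }
        return f;
    }
}
```

Since the attach listener in the target also checks both locations, the two sides meet regardless of which directory ends up holding the file.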

@openjdk

openjdk bot commented Feb 9, 2024

@slovdahl This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8226919: attach in linux hangs due to permission denied accessing /proc/pid/root

Reviewed-by: sgehwolf, kevinw

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 139 new commits pushed to the master branch:

  • b42b888: 8325038: runtime/cds/appcds/ProhibitedPackage.java can fail with UseLargePages
  • 6944537: 8325203: System.exit(0) kills the launched 3rd party application
  • 4368437: 8325264: two compiler/intrinsics/float16 tests fail after JDK-8324724
  • 4a3a38d: 8325517: Shenandoah: Reduce unnecessary includes from shenandoahControlThread.cpp
  • 40708ba: 8325563: Remove unused Space::is_in
  • 29d89d4: 8325551: Remove unused obj_is_alive and block_start in Space
  • 8ef918d: 8324646: Avoid Class.forName in SecureRandom constructor
  • 69b2674: 8324648: Avoid NoSuchMethodError when instantiating NativePRNG
  • 52d4976: 8325437: Safepoint polling in monitor deflation can cause massive logs
  • 8b70b8d: 8325440: Confusing error reported for octal literals with wrong digits
  • ... and 129 more: https://git.openjdk.org/jdk/compare/a1d65eb6d87ff9019a9a92a775213be2a8b60fd1...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@jerboaa, @kevinjwalls) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Feb 9, 2024
@slovdahl
Contributor Author

slovdahl commented Feb 9, 2024

Alright, sounds good to me. :) Thanks again for taking a look!

One other thing - JDK-8226919 looks like the original bug for this, logged a few years back, so if this fixes both, the record should show that it fixes that one, and JDK-8307977 should be closed as a duplicate. I/somebody can take care of that JBS admin. But if this PR could be associated with only JDK-8226919 that would be simple.

I'll still fix this. So, I should change the PR title to match JDK-8226919, and issue an /issue remove command for JDK-8307977, is that correct?

Once that is done, I would kindly ask for someone sponsoring this change as well.

@kevinjwalls
Contributor

I'll still fix this. So, I should change the PR title to match JDK-8226919, and issue an /issue remove command for JDK-8307977, is that correct?

Yes exactly, thanks.

@slovdahl slovdahl changed the title 8307977: jcmd and jstack broken for target processes running with elevated capabilities 8226919: attach in linux hangs due to permission denied accessing /proc/pid/root Feb 9, 2024
@slovdahl
Contributor Author

slovdahl commented Feb 9, 2024

/issue remove JDK-8307977

@openjdk

openjdk bot commented Feb 9, 2024

@slovdahl This PR does not contain any additional solved issues that can be removed. To remove the primary solved issue, simply edit the title of this PR.

@slovdahl
Contributor Author

slovdahl commented Feb 9, 2024

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Feb 9, 2024
@openjdk

openjdk bot commented Feb 9, 2024

@slovdahl
Your change (at version f1848b9) is now ready to be sponsored by a Committer.

@kevinjwalls
Contributor

/sponsor

@openjdk

openjdk bot commented Feb 9, 2024

Going to push as commit ac4607e.
Since your change was applied there have been 139 commits pushed to the master branch:

  • b42b888: 8325038: runtime/cds/appcds/ProhibitedPackage.java can fail with UseLargePages
  • 6944537: 8325203: System.exit(0) kills the launched 3rd party application
  • 4368437: 8325264: two compiler/intrinsics/float16 tests fail after JDK-8324724
  • 4a3a38d: 8325517: Shenandoah: Reduce unnecessary includes from shenandoahControlThread.cpp
  • 40708ba: 8325563: Remove unused Space::is_in
  • 29d89d4: 8325551: Remove unused obj_is_alive and block_start in Space
  • 8ef918d: 8324646: Avoid Class.forName in SecureRandom constructor
  • 69b2674: 8324648: Avoid NoSuchMethodError when instantiating NativePRNG
  • 52d4976: 8325437: Safepoint polling in monitor deflation can cause massive logs
  • 8b70b8d: 8325440: Confusing error reported for octal literals with wrong digits
  • ... and 129 more: https://git.openjdk.org/jdk/compare/a1d65eb6d87ff9019a9a92a775213be2a8b60fd1...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Feb 9, 2024
@openjdk openjdk bot closed this Feb 9, 2024
@openjdk openjdk bot removed the ready (Pull request is ready to be integrated), rfr (Pull request is ready for review), and sponsor (Pull request is ready to be sponsored) labels Feb 9, 2024
@openjdk

openjdk bot commented Feb 9, 2024

@kevinjwalls @slovdahl Pushed as commit ac4607e.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@slovdahl slovdahl deleted the 8307977-fix-dynamic-attach-for-target-process-with-elevated-capability branch February 9, 2024 18:39
@slovdahl
Contributor Author

slovdahl commented Feb 9, 2024

Thank you @kevinjwalls and @jerboaa for reviewing and guiding me through this process. This was a great experience as a first-time JDK contributor!

One more question: can I do anything to help get this backported to e.g. 21 and 17?

@jerboaa
Contributor

jerboaa commented Feb 12, 2024

One more question, can I do anything to help getting this backported to e.g. 21 and 17?

First, I suggest waiting a few weeks to see whether any follow-up bugs show up in mainline testing. Then start backporting it to 22u, then 21u, then 17u (in that order). A few references:

https://openjdk.org/guide/#backporting
https://wiki.openjdk.org/display/JDKUpdates/JDK+21u

@jdoylei

jdoylei commented Feb 28, 2024

@slovdahl - Apologies for adding a comment to a closed Pull Request, but I happened on https://bugs.openjdk.org/browse/JDK-8307977 via the earlier https://bugs.openjdk.org/browse/JDK-8179498 after researching "AttachNotSupportedException: Unable to open socket file" and troubleshooting our own OpenJDK 17 jcmd setup on top of containers and Kubernetes. Reading the code changes and discussion here, I'm concerned that this change, which I understand is not yet in OpenJDK 17, might cause a regression with our setup.

We're running jcmd (OpenJDK build 17.0.10+7-LTS) and the target JVM in two separate containers in a Kubernetes pod. The target JVM container is already running, and then we use kubectl debug --target=... to start a Kubernetes debug container with jcmd that targets the first container. Given the --target option, they share the same Linux process namespace (both think the target JVM is PID 1). But since they are separate containers, they see different root filesystems (jcmd container sees the target JVM tmpdir under /proc/1/root/tmp but has its own distinct /tmp directory).

I believe the attach file and socket file paths then work like this in OpenJDK 17:

  • jcmd creates the .attach_pid1 attach file without issues using /proc/1/cwd
  • Target JVM finds the .attach_pid1 attach file in its cwd.
  • Target JVM creates the .java_pid1 socket file in its tmpdir /tmp
  • jcmd finds the .java_pid1 socket file in /proc/1/root/tmp
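
The four steps above can be sketched as plain path construction. This is only an illustration of the paths named in the list, not the JDK's attach code; the class `AttachPaths` and its method names are hypothetical.

```java
// Illustrative only: builds the paths described in the list above.
// AttachPaths, attachFile and socketFile are hypothetical names, not JDK APIs.
final class AttachPaths {

    // Where the attaching tool (e.g. jcmd) creates the attach file,
    // reaching the target's working directory through /proc/<pid>/cwd.
    static String attachFile(long pid, long nsPid) {
        return "/proc/" + pid + "/cwd/.attach_pid" + nsPid;
    }

    // Where the attaching tool looks for the socket file the target JVM
    // created in its own /tmp, reached through /proc/<pid>/root.
    static String socketFile(long pid, long nsPid) {
        return "/proc/" + pid + "/root/tmp/.java_pid" + nsPid;
    }

    public static void main(String[] args) {
        // Shared PID namespace: both containers see the target JVM as PID 1.
        System.out.println(attachFile(1, 1)); // /proc/1/cwd/.attach_pid1
        System.out.println(socketFile(1, 1)); // /proc/1/root/tmp/.java_pid1
    }
}
```

With `pid == ns_pid == 1`, the socket file is still only reachable through `/proc/1/root/tmp`, because the debug container's own `/tmp` is a different directory.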

I think this scenario with a Kubernetes debug container may be a little different from other Docker container scenarios because these are two different containers with different root filesystems but the same Linux process namespace. So jcmd using /proc/<pid>/root is necessary to find the socket file, even though jcmd and the target JVM both agree the PID is the same (1). A similar scenario with just Docker Engine is described in the docker container run documentation, under "Example, join another container's PID namespace".

If I understand the code change for this PR, I think it will change the behavior in this scenario, because findSocketFile will have pid == ns_pid, and now will use /tmp instead of /proc/<pid>/root/tmp, based on findTargetProcessTmpDirectory.

We are lucky currently that the only place the current OpenJDK 17 code checks pid == ns_pid is the createAttachFile catch block that runs if /proc/<pid>/cwd/.attach_pid<ns_pid> can't be created, since as long as /proc/<pid>/cwd works, we are fine. But the pid != ns_pid check there makes an assumption that if the process namespace is the same, the root filesystem is the same and /tmp can be used, so this catch block wouldn't work if we were to hit it in our scenario. I think propagating this catch block logic into findSocketFile will break our scenario - it will force using /tmp/.java_pid<ns_pid> and that won't work.

Could the findSocketFile logic be made more robust to the different namespace/filesystem scenarios? E.g. attempt /proc/<pid>/root first? Or perhaps there is a way (not pid != ns_pid) to more accurately determine whether / and /proc/<pid>/root are the same filesystem and /tmp is OK?
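
A minimal sketch of the "attempt /proc/<pid>/root first" idea, assuming a simple ordered fallback is acceptable. It is not the actual findSocketFile implementation; `SocketFileLookup`, `candidates` and `firstExisting` are hypothetical names used only for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch, not JDK code: try the target's root filesystem first,
// then fall back to the local /tmp.
final class SocketFileLookup {

    // Candidate socket file locations in preference order.
    static List<Path> candidates(long pid, long nsPid) {
        String name = ".java_pid" + nsPid;
        return List.of(
                Path.of("/proc/" + pid + "/root/tmp", name), // target's view of /tmp
                Path.of("/tmp", name));                      // attaching tool's own /tmp
    }

    // Returns the first candidate that exists. Files.exists returns false when
    // the path cannot be checked (for example, permission denied on
    // /proc/<pid>/root), so such a candidate is skipped rather than failing.
    static Optional<Path> firstExisting(List<Path> paths) {
        for (Path p : paths) {
            if (Files.exists(p)) {
                return Optional.of(p);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        long pid = 1, nsPid = 1; // example values matching the scenario above
        System.out.println(firstExisting(candidates(pid, nsPid)));
    }
}
```

Under this ordering the shared-PID-namespace case resolves through `/proc/<pid>/root/tmp`, while a target whose `/proc/<pid>/root` is inaccessible falls through to `/tmp`. Whether that fallback is correct for every namespace combination is exactly the open question raised here.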

Thanks for your time!

@kevinjwalls
Contributor

If I understand the code change for this PR, I think it will change the behavior in this scenario, because findSocketFile will have pid == ns_pid, and now will use /tmp instead of /proc/<pid>/root/tmp, based on findTargetProcessTmpDirectory.

We are lucky currently that the only place the current OpenJDK 17 code checks pid == ns_pid is the createAttachFile catch block that runs if /proc/<pid>/cwd/.attach_pid<ns_pid> can't be created, since as long as /proc/<pid>/cwd works, we are fine. But the pid != ns_pid check there makes an assumption that if the process namespace is the same, the root filesystem is the same and /tmp can be used, so this catch block wouldn't work if we were to hit it in our scenario. I think propagating this catch block logic into findSocketFile will break our scenario - it will force using /tmp/.java_pid<ns_pid> and that won't work.

Could the findSocketFile logic be made more robust to the different namespace/filesystem scenarios? E.g. attempt /proc/<pid>/root first? Or perhaps there is a way (not pid != ns_pid) to more accurately determine whether / and /proc/<pid>/root are the same filesystem and /tmp is OK?

That is certainly worth capturing in a new JBS bug for investigating a further change. If you can't log one, I'll use the information here to do that, thanks!

@kevinjwalls
Contributor

Logged https://bugs.openjdk.org/browse/JDK-8327114 for investigation. Thanks @jdoylei !

@jdoylei

jdoylei commented Mar 1, 2024

@kevinjwalls - Perfect, thank you for opening the JBS bug!

@slovdahl
Contributor Author

Thanks for the detailed write-up, @jdoylei! I'm sorry to have introduced a regression here. Good that the backporting was held off a bit at least :) Let's continue the discussion at https://mail.openjdk.org/pipermail/serviceability-dev/2024-April/055317.html.
