Leaking file descriptor / socket within DomainSocket tooling #32423

@knalli

This is the result of a long search for a very strange leak. I hope I have pieced everything together and found the right place to report it. If not, please ask for more details.

So, let's start with my finding: this probably affects the "buildpack" component (I am unsure what you call it), not the actual "main" framework/boot library.

As far as I can tell, the call chain

# org.springframework.boot.buildpack.platform.socket
DomainSocket.get(String path)
-> LinuxDomainSocket(String path)
-> DomainSocket(String path)
-> #open(String path)

results in a resource leak when the path does not exist. Native code is involved, because the exception is thrown in the connect method:

Caused by: com.sun.jna.LastErrorException: [2] No such file or directory
 at com.github.dockerjava.transport.LinuxDomainSocket.connect(Native Method)
 at com.github.dockerjava.transport.LinuxDomainSocket.connect(LinuxDomainSocket.java:49)
 at com.github.dockerjava.transport.DomainSocket.open(DomainSocket.java:69)
 at com.github.dockerjava.transport.DomainSocket.<init>(DomainSocket.java:59)

Yes, the package differs, but it is only shaded (as far as I can tell, the code is identical).

I have a demo provided at this repository.

// socket exists
BEGIN = Current open files: 70
Creating socket...
INIT = Current open files: 73
Closing socket..
EXIT = Current open files: 72

// socket does not exist
BEGIN = Current open files: 70
Creating socket...
Error: com.sun.jna.LastErrorException: [2] No such file or directory
INIT = Current open files: 73
No open socket
EXIT = Current open files: 73

Please have a look at the open files counter (which is based on the same source as FileDescriptorMetrics).
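For reference, a counter like the one in the output above can be read from the JVM itself. This is only a minimal sketch, assuming the count comes from `com.sun.management.UnixOperatingSystemMXBean` (which is, to my understanding, what FileDescriptorMetrics reads as well); the class name `OpenFileCounter` and the method `openFiles()` are made up for illustration and are not taken from the demo.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class OpenFileCounter {

    // Returns the number of file descriptors currently open in this JVM process,
    // or -1 when the platform bean does not expose the value (e.g. on Windows).
    static long openFiles() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount();
        }
        return -1;
    }

    public static void main(String[] args) {
        // Same label format as the demo output; the number depends on the running process.
        System.out.println("BEGIN = Current open files: " + openFiles());
    }
}
```

Calling `openFiles()` before and after the `DomainSocket.get(...)` attempt should show the same kind of difference as the BEGIN/INIT/EXIT lines.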


I found this class while searching for a leak when using the docker-java library, which includes these classes (Phil Webb's credits were left in, which is how I found the origin here). Because you (the project) are the actual authors, I guess this issue is better addressed here. If the finding is correct and/or we find a solution, I will follow up with them about the required changes.


The symptom is that the (Linux) system reports multiple entries for the Java process with lsof -p <pid> | grep STREAM:

java    1 user  240u     unix 0xffff9b1e9739f080       0t0 4063746316 type=STREAM

A couple of them are fine, but the list keeps growing in only one of our applications. I took some time to isolate the actual culprit. In the end, reduced to this context, the application calls DomainSocket.get("/var/lib/docker.sock") once per minute, but this socket does not exist on this system. (For the sake of simplicity I have left out the dynamic aspect here.)

On every call, the list grows by more entries (which means more open files). The "open files" counter in the Java runtime grows as well. The (Linux) system reports these files as sockets without any meta information except the mapped pid.


I am not sure whether the file descriptor should stay open (I would say no, because an exception is thrown) or whether the call must be guarded with a "file exists" check. Either way, I guess this should be handled at least within the DomainSocket class?
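To illustrate the first option (releasing the descriptor when connect fails), here is a rough sketch. It assumes that open first obtains a descriptor and then connects it, which I have not verified against the real native code. The `NativeCalls` interface, the `FakeNative` class and the path names are hypothetical stand-ins so that the example runs without JNA; they are not the DomainSocket API.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

public class CloseOnFailedConnect {

    // Hypothetical stand-in for the native socket/connect/close calls.
    interface NativeCalls {
        int socket() throws IOException;
        void connect(int fd, String path) throws IOException;
        void close(int fd);
    }

    // Obtains a descriptor and connects it; if connect fails, the descriptor
    // is released before the exception is rethrown, so nothing leaks.
    static int open(NativeCalls nativeCalls, String path) throws IOException {
        int fd = nativeCalls.socket();
        try {
            nativeCalls.connect(fd, path);
            return fd;
        } catch (IOException | RuntimeException ex) {
            nativeCalls.close(fd);
            throw ex;
        }
    }

    // In-memory fake that tracks which descriptors are still open.
    static class FakeNative implements NativeCalls {
        final Set<Integer> openFds = new HashSet<>();
        private int next = 3;

        public int socket() {
            int fd = next++;
            openFds.add(fd);
            return fd;
        }

        public void connect(int fd, String path) throws IOException {
            // Only this made-up path counts as an existing socket.
            if (!"/exists.sock".equals(path)) {
                throw new IOException("[2] No such file or directory");
            }
        }

        public void close(int fd) {
            openFds.remove(fd);
        }
    }

    public static void main(String[] args) {
        FakeNative fake = new FakeNative();
        try {
            open(fake, "/missing.sock");
        } catch (IOException ex) {
            System.out.println("Error: " + ex.getMessage());
        }
        System.out.println("Descriptors still open: " + fake.openFds.size());
    }
}
```

A "file exists" check before the call would avoid the common case as well, but the socket could still disappear between the check and the connect, so closing on failure seems like the more robust of the two options.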
