Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multi-byte filenames in zip files can cause an endless loop in ZipString.hash #38751

Closed
jhackel-hypo opened this issue Dec 12, 2023 · 9 comments
Assignees
Labels
type: regression A regression from a previous release
Milestone

Comments

@jhackel-hypo
Copy link

jhackel-hypo commented Dec 12, 2023

Our Spring Boot 3.2.0 app hangs for random builds during a very early start-up phase. We see this thread-dump:

"main" #1 prio=5 os_prio=31 cpu=17588,58ms elapsed=17,96s tid=0x0000000145009a00 nid=0x2203 runnable  [0x000000016d49a000]
   java.lang.Thread.State: RUNNABLE
	at org.springframework.boot.loader.zip.ZipString.readInBuffer(ZipString.java:281)
	at org.springframework.boot.loader.zip.ZipString.hash(ZipString.java:112)
	at org.springframework.boot.loader.zip.ZipContent$Loader.add(ZipContent.java:445)
	at org.springframework.boot.loader.zip.ZipContent$Loader.loadContent(ZipContent.java:581)
	at org.springframework.boot.loader.zip.ZipContent$Loader.openAndLoad(ZipContent.java:543)
	at org.springframework.boot.loader.zip.ZipContent$Loader.loadNestedZip(ZipContent.java:537)
	at org.springframework.boot.loader.zip.ZipContent$Loader.load(ZipContent.java:522)
	at org.springframework.boot.loader.zip.ZipContent.open(ZipContent.java:372)
	at org.springframework.boot.loader.zip.ZipContent.open(ZipContent.java:361)
	at org.springframework.boot.loader.jar.NestedJarFileResources.<init>(NestedJarFileResources.java:57)
	at org.springframework.boot.loader.jar.NestedJarFile.<init>(NestedJarFile.java:141)
	at org.springframework.boot.loader.jar.NestedJarFile.<init>(NestedJarFile.java:120)
	at org.springframework.boot.loader.net.protocol.jar.UrlNestedJarFile.<init>(UrlNestedJarFile.java:42)
	at org.springframework.boot.loader.net.protocol.jar.UrlJarFileFactory.createJarFileForNested(UrlJarFileFactory.java:86)
	at org.springframework.boot.loader.net.protocol.jar.UrlJarFileFactory.createJarFile(UrlJarFileFactory.java:55)
	at org.springframework.boot.loader.net.protocol.jar.UrlJarFiles.getOrCreate(UrlJarFiles.java:72)
	at org.springframework.boot.loader.net.protocol.jar.JarUrlConnection.connect(JarUrlConnection.java:289)
	at org.springframework.boot.loader.net.protocol.jar.JarUrlConnection.getJarFile(JarUrlConnection.java:99)
	at org.springframework.boot.loader.net.protocol.jar.JarUrlClassLoader.getJarFile(JarUrlClassLoader.java:185)
	at org.springframework.boot.loader.net.protocol.jar.JarUrlClassLoader.definePackage(JarUrlClassLoader.java:143)
	at org.springframework.boot.loader.net.protocol.jar.JarUrlClassLoader.definePackageIfNecessary(JarUrlClassLoader.java:126)
	at org.springframework.boot.loader.net.protocol.jar.JarUrlClassLoader.loadClass(JarUrlClassLoader.java:99)
	at org.springframework.boot.loader.launch.LaunchedClassLoader.loadClass(LaunchedClassLoader.java:91)
	at java.lang.ClassLoader.loadClass(java.base@17.0.5/ClassLoader.java:520)
	at java.lang.Class.getDeclaredMethods0(java.base@17.0.5/Native Method)
	at java.lang.Class.privateGetDeclaredMethods(java.base@17.0.5/Class.java:3402)
	at java.lang.Class.getDeclaredMethod(java.base@17.0.5/Class.java:2673)
	at org.springframework.boot.loader.launch.Launcher.launch(Launcher.java:89)
	at org.springframework.boot.loader.launch.Launcher.launch(Launcher.java:53)
	at org.springframework.boot.loader.launch.JarLauncher.main(JarLauncher.java:58)

I am not sure what this code exactly does, but it seems to calculate a hash code of a a file name from a JAR file embedded in the super-jar of our app. The code seems to operate on DataBlock instances of the central ZIP file directory.

The error occurs because the block it is processing happens to only contain the first half of the file name. Unfortunately, the last byte in the data block is the first byte of multi-byte character. It seems, this code tries to read the whole multi-byte character:

if (!hasEnoughBytes(byteIndex, codePointSize, count)) {
pos--;
len++;
break;
}

But the data block seems to be unable to provide the missing byte, because it will not read any further data from the underlying file channel. It returns only the first byte of the multi-byte character again and again. And the hash function hangs.

I am not deep enough into it, do dig what is exactly going on. Furthermore, I am unable to provide the broken JAR as it comes from a closed source project.

@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged label Dec 12, 2023
@wilkinsona
Copy link
Member

Thanks for the report.

I am unable to provide the broken JAR as it comes from a closed source project

Any jar that reproduces the problem would be useful so that we can reproduce the problem, or even just a concrete example of the file names that cause the problem.

@wilkinsona wilkinsona added the status: waiting-for-feedback We need additional information before we can continue label Dec 12, 2023
@jhackel-hypo
Copy link
Author

The ZIP record for the file as produced by zipdump is this:

000c8687: PK.0304: 0014 0800 0008 5788834c 04267862 00000fad 00002e4f 0054 0000 | 000c86a5 000c86f9 000c86f9 000c96a6 - de/genopace/bsh/pe/darlehen/validierung/FehlendeOderUngültigeBaufiSmartKosten.class

The path has the length 0x54 == 84. The data block happens to contain only 56 bytes of that path, which is the first half of the character ü.

@spring-projects-issues spring-projects-issues added status: feedback-provided Feedback has been provided and removed status: waiting-for-feedback We need additional information before we can continue labels Dec 12, 2023
@philwebb
Copy link
Member

I wonder if we've already fixed this with #38572. @jhackel-hypo Any chance you could try 3.2.1-SNAPSHOT before we try anything else?

@philwebb philwebb added status: waiting-for-feedback We need additional information before we can continue and removed status: feedback-provided Feedback has been provided labels Dec 12, 2023
@jhackel-hypo
Copy link
Author

jhackel-hypo commented Dec 13, 2023

I have no way to deliberately build a broken JAR. It just happens from time to time.

On the main branch, this test will also hang ZipStringTest:

@Test
void hashDataBlockWithInCompleteMultiByteCharacterAtTheEnd() throws IOException {
	ByteArrayDataBlock dataBlock = new ByteArrayDataBlock(
			(byte) 100, (byte) 101, (byte) 47, (byte) 103, (byte) 101, (byte) 110,
			(byte) 111, (byte) 112, (byte) 97, (byte) 99, (byte) 101,  (byte) 47,
			(byte) 98, (byte) 115, (byte) 104, (byte) 47, (byte) 112, (byte) 101,
			(byte) 47, (byte) 100, (byte) 97, (byte) 114, (byte) 108, (byte) 101,
			(byte) 104, (byte) 101, (byte) 110, (byte) 47, (byte) 118, (byte) 97,
			(byte) 108, (byte) 105, (byte) 100, (byte) 105, (byte) 101, (byte) 114,
			(byte) 117, (byte) 110, (byte) 103, (byte) 47, (byte) 70, (byte) 101,
			(byte) 104, (byte) 108,	(byte) 101, (byte) 110, (byte) 100, (byte) 101,
			(byte) 79, (byte) 100, (byte) 101, (byte) 114, (byte) 85, (byte) 110,
			(byte) 103,	(byte) -61
	);

	ZipString.hash(null, dataBlock, 0, 87, true);
}

This pretty much the same which ZipContent.Loader.add feeds to the function in my error case, except that there is more data at the beginning of the data block.

I do not really grasp how ZipContent.Loader.add ever deals with file names reaching beyond the end of the data block.

@spring-projects-issues spring-projects-issues added status: feedback-provided Feedback has been provided and removed status: waiting-for-feedback We need additional information before we can continue labels Dec 13, 2023
@jhackel-hypo
Copy link
Author

It seems to me that ZipContent.Loader.add must never feed a split string to ZipString.hash. In fact, that should be a precondition when calling ZipString.hash with a data block.

@jhackel-hypo
Copy link
Author

jhackel-hypo commented Dec 13, 2023

I think, I do understand the problem now. I am also pretty sure that 8c7e877 is not fixing the problem.

The problem is, that ZipString#hash is assuming, it can read the whole multibyte character by stepping a byte backwards (line 118) and reading from the FileChannelDataBlock again (line 112):

while (len > 0) {
int count = readInBuffer(dataBlock, pos, buffer, len);
len -= count;
pos += count;
for (int byteIndex = 0; byteIndex < count;) {
int codePointSize = getCodePointSize(bytes, byteIndex);
if (!hasEnoughBytes(byteIndex, codePointSize, count)) {
pos--;
len++;
break;
}

But this is not the case. The data block has half of the multibyte character at the last position of its FileChannelDataBlock.ManagedFileChannel#buffer. Stepping one position backward will return this first half of the multibyte character but not the second byte because it is not in the buffer. So, the FileChannelDataBlock#read method will return 1, because a single byte could be read. It will only read more bytes into its buffer, if the caller is asking for a position beyond what is already buffered (line 181):

int read(ByteBuffer dst, long position) throws IOException {
synchronized (this.lock) {
if (position < this.bufferPosition || position >= this.bufferPosition + this.bufferSize) {
fillBuffer(position);
}

But because ZipString.hash is always stepping back one position, it will never go beyond the limits of the buffer. It will receive the same byte again and again.

@jhackel-hypo
Copy link
Author

jhackel-hypo commented Dec 13, 2023

The implementation of ZipString#hash is making the wrong assumption that it can always read two (actually up to four) consecutive bytes from the data block. This is not the case. The clients of DataBlock#read must be able to process the data byte by byte.

@jhackel-hypo
Copy link
Author

I guess a possible solution could be to use DataBlock#readyFully here:

private static int readInBuffer(DataBlock dataBlock, long pos, ByteBuffer buffer, int maxLen) throws IOException {
buffer.clear();
if (buffer.remaining() > maxLen) {
buffer.limit(maxLen);
}
int count = dataBlock.read(buffer, pos);
if (count <= 0) {
throw new EOFException();
}
return count;
}

The returned count would be buffer.position() then.

The method readInBuffer is also called by compare. But that method might be affected by the same problem.

@philwebb philwebb self-assigned this Dec 13, 2023
@philwebb philwebb added type: regression A regression from a previous release and removed status: waiting-for-triage An issue we've not yet triaged status: feedback-provided Feedback has been provided labels Dec 13, 2023
@philwebb philwebb added this to the 3.2.1 milestone Dec 13, 2023
@philwebb philwebb changed the title Endless loop in ZipString.hash for multi-byte filenames Multi-byte filenames in zip files can cause an endless loop in ZipString.hash Dec 13, 2023
@philwebb
Copy link
Member

Thanks very much for the detailed analysis @jhackel-hypo!

I think I've pushed something that will fix this. I'm not keen to move DataBlock#readyFully because those methods are called a lot and I want to keep things as fast and lean as possible.

ndwnu pushed a commit to ndwnu/nls-routing-map-matcher that referenced this issue Apr 10, 2024
This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [org.apache.maven.plugins:maven-surefire-plugin](https://maven.apache.org/surefire/) | build | patch | `3.2.2` -> `3.2.5` |
| [org.apache.maven.plugins:maven-failsafe-plugin](https://maven.apache.org/surefire/) | build | patch | `3.2.2` -> `3.2.5` |
| [org.springframework.boot:spring-boot-starter-parent](https://spring.io/projects/spring-boot) ([source](https://github.com/spring-projects/spring-boot)) | parent | patch | `3.2.0` -> `3.2.1` |

---

### Release Notes

<details>
<summary>spring-projects/spring-boot (org.springframework.boot:spring-boot-starter-parent)</summary>

### [`v3.2.1`](https://github.com/spring-projects/spring-boot/releases/tag/v3.2.1)

[Compare Source](spring-projects/spring-boot@v3.2.0...v3.2.1)

#### 🐞 Bug Fixes

-   HibernateJpaAutoConfiguration should be applied before DataSourceTransactionManagerAutoConfiguration [#&#8203;38880](spring-projects/spring-boot#38880)
-   META-INF entries are duplicated under BOOT-INF/classes causing "Conflicting persistence unit definitions" error [#&#8203;38862](spring-projects/spring-boot#38862)
-   logging.include-application-name has no effect when using log4j2 [#&#8203;38847](spring-projects/spring-boot#38847)
-   Pulsar authentication param properties cause IllegalStateException with Pulsar Client 3.1.0  [#&#8203;38839](spring-projects/spring-boot#38839)
-   Child context created with SpringApplicationBuilder runs parents runners [#&#8203;38837](spring-projects/spring-boot#38837)
-   getSigners() info is lost for signed jars when using the new loader implementation with requiresUnpack [#&#8203;38833](spring-projects/spring-boot#38833)
-   TestContainers parallel initialization doesn't work properly  [#&#8203;38831](spring-projects/spring-boot#38831)
-   Zip file closed exceptions can be thrown due to StaticResourceJars closing jars from cached connections [#&#8203;38770](spring-projects/spring-boot#38770)
-   Multi-byte filenames in zip files can cause an endless loop in ZipString.hash [#&#8203;38751](spring-projects/spring-boot#38751)
-   Gradle task "bootJar" fails with "Failed to get permissions" when using Gradle 8.6-milestone-1 [#&#8203;38741](spring-projects/spring-boot#38741)
-   Custom binding converters are ignored when working with collection types [#&#8203;38734](spring-projects/spring-boot#38734)
-   WebFlux and resource server auto-configuration may fail due to null authentication manager [#&#8203;38713](spring-projects/spring-boot#38713)
-   It is unclear that Docker Compose services have not been started as one or more is already run...
ndwlocatieservices added a commit to ndwnu/nls-routing-map-matcher that referenced this issue Apr 16, 2024
This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [org.apache.maven.plugins:maven-surefire-plugin](https://maven.apache.org/surefire/) | build | patch | `3.2.2` -> `3.2.5` |
| [org.apache.maven.plugins:maven-failsafe-plugin](https://maven.apache.org/surefire/) | build | patch | `3.2.2` -> `3.2.5` |
| [org.springframework.boot:spring-boot-starter-parent](https://spring.io/projects/spring-boot) ([source](https://github.com/spring-projects/spring-boot)) | parent | patch | `3.2.0` -> `3.2.1` |

---

### Release Notes

<details>
<summary>spring-projects/spring-boot (org.springframework.boot:spring-boot-starter-parent)</summary>

### [`v3.2.1`](https://github.com/spring-projects/spring-boot/releases/tag/v3.2.1)

[Compare Source](spring-projects/spring-boot@v3.2.0...v3.2.1)

#### 🐞 Bug Fixes

-   HibernateJpaAutoConfiguration should be applied before DataSourceTransactionManagerAutoConfiguration [#&#8203;38880](spring-projects/spring-boot#38880)
-   META-INF entries are duplicated under BOOT-INF/classes causing "Conflicting persistence unit definitions" error [#&#8203;38862](spring-projects/spring-boot#38862)
-   logging.include-application-name has no effect when using log4j2 [#&#8203;38847](spring-projects/spring-boot#38847)
-   Pulsar authentication param properties cause IllegalStateException with Pulsar Client 3.1.0  [#&#8203;38839](spring-projects/spring-boot#38839)
-   Child context created with SpringApplicationBuilder runs parents runners [#&#8203;38837](spring-projects/spring-boot#38837)
-   getSigners() info is lost for signed jars when using the new loader implementation with requiresUnpack [#&#8203;38833](spring-projects/spring-boot#38833)
-   TestContainers parallel initialization doesn't work properly  [#&#8203;38831](spring-projects/spring-boot#38831)
-   Zip file closed exceptions can be thrown due to StaticResourceJars closing jars from cached connections [#&#8203;38770](spring-projects/spring-boot#38770)
-   Multi-byte filenames in zip files can cause an endless loop in ZipString.hash [#&#8203;38751](spring-projects/spring-boot#38751)
-   Gradle task "bootJar" fails with "Failed to get permissions" when using Gradle 8.6-milestone-1 [#&#8203;38741](spring-projects/spring-boot#38741)
-   Custom binding converters are ignored when working with collection types [#&#8203;38734](spring-projects/spring-boot#38734)
-   WebFlux and resource server auto-configuration may fail due to null authentication manager [#&#8203;38713](spring-projects/spring-boot#38713)
-   It is unclear that Docker Compose services have not been started as one or more is already run...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: regression A regression from a previous release
Projects
None yet
Development

No branches or pull requests

4 participants