fix(spring): handle Unicode classpath resource paths #24220

Merged
mshabarov merged 2 commits into vaadin:main from martinfrancois:fix/11871-unicode-resource-paths
Apr 30, 2026

@martinfrancois martinfrancois commented Apr 30, 2026

Summary

Fix Spring classpath resource matching when an application is located under a directory whose path contains non-ASCII characters, including decomposed Unicode characters, that appear percent-encoded in the classpath URL.

CustomResourceLoader previously compared resource paths from URL#getPath(), which can leave parts of the classpath root percent-encoded. When Spring later returned class resources using a decoded path, the resource no longer matched its parent root and startup could fail with Parent resource ... not found in the resources!.

This can happen even when both paths refer to the same filesystem location. The issue is that the compared strings are not in the same representation:

  • URL#getPath() can return a path where non-ASCII characters are still percent-encoded, for example %C3%A7 or %CC%A7.
  • Spring resource resolution can later return the corresponding class resource using a decoded filesystem path.
  • A raw string comparison then fails, even though both paths point to the same location.
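
The mismatch described above can be reproduced with plain JDK classes. The following is a minimal sketch, not the Flow code: the class name `EncodedPathMismatch`, its helper methods, and the `/tmp/...` paths are made up for illustration.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;

public class EncodedPathMismatch {

    // Path as URL#getPath() reports it: percent-escapes are left as-is.
    static String rawPath(String spec) {
        try {
            return URI.create(spec).toURL().getPath();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Path after converting the URL to a URI: percent-escapes are decoded.
    static String decodedPath(String spec) {
        try {
            return URI.create(spec).toURL().toURI().getPath();
        } catch (MalformedURLException | URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical classpath root with an encoded "ç" (U+00E7).
        String root = "file:/tmp/Fran%C3%A7ois/target/classes/";
        // Hypothetical child resource given as a decoded filesystem path.
        String child = "/tmp/Fran\u00E7ois/target/classes/com/example/App.class";

        System.out.println(child.startsWith(rawPath(root)));     // false
        System.out.println(child.startsWith(decodedPath(root))); // true
    }
}
```

Both strings name the same directory, yet only the decoded form works as a prefix of the child path.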

This is especially easy to reproduce with decomposed Unicode characters. A decomposed character is represented as a base character plus one or more combining marks instead of a single precomposed code point. For example, ç can be represented either as the single code point U+00E7, or as c plus the combining cedilla U+0327.
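
To make the two representations of ç concrete, here is a small illustrative sketch. The class `DecomposedCedilla` is hypothetical, and `URLEncoder` (which implements form encoding) is used only as a convenient way to show the UTF-8 percent-escapes; it is not what the classpath URL machinery uses internally.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

public class DecomposedCedilla {

    static final String COMPOSED = "\u00E7";    // single precomposed code point
    static final String DECOMPOSED = "c\u0327"; // 'c' followed by COMBINING CEDILLA

    // UTF-8 percent-escapes for the given text (form encoding, for illustration).
    static String percentEncode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    // Canonical composition (NFC) of the given text.
    static String toNfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        System.out.println(COMPOSED.equals(DECOMPOSED));        // false
        System.out.println(percentEncode(COMPOSED));            // %C3%A7
        System.out.println(percentEncode(DECOMPOSED));          // c%CC%A7
        System.out.println(toNfc(DECOMPOSED).equals(COMPOSED)); // true
    }
}
```

The two forms render identically but differ both as Java strings and as percent-encoded text, which is why the `%C3%A7` and `%CC%A7` escapes both show up in affected paths.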

Visually the path can look correct, but the encoded URL path and the decoded filesystem path are different strings. The culprit was therefore not Unicode normalization itself, but comparing URL-encoded paths with decoded paths during parent/child classpath resource matching.

This change normalizes resource paths through URL#toURI().getPath() for comparisons, while keeping the original URL path for dev-mode cache keys. It also keeps the existing native-image file:///resources! handling and falls back to the original URL path if URI conversion is not possible.
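
A rough sketch of the "decode via URI, fall back to the raw URL path" idea is shown below. It is an assumption-laden illustration rather than the actual `CustomResourceLoader` code: the class `ComparablePaths` and the methods `comparablePath` and `parse` are hypothetical, and the native-image `file:///resources!` handling is deliberately left out.

```java
import java.net.MalformedURLException;
import java.net.URISyntaxException;
import java.net.URL;

public class ComparablePaths {

    // Hypothetical helper: decoded path for comparisons, raw URL path as fallback.
    static String comparablePath(URL url) {
        try {
            String decoded = url.toURI().getPath();
            // URI#getPath() is null for opaque URIs (for example "jar:file:...").
            return decoded != null ? decoded : url.getPath();
        } catch (URISyntaxException e) {
            // The URL is not a valid URI (for example it contains a raw space).
            return url.getPath();
        }
    }

    // Test-only convenience; new URL(String) is deprecated on recent JDKs.
    @SuppressWarnings("deprecation")
    static URL parse(String spec) {
        try {
            return new URL(spec);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Encoded root: the comparable path is decoded.
        System.out.println(comparablePath(parse("file:/tmp/Fran%C3%A7ois/classes/")));
        // Raw space: URI conversion fails, so the original URL path is returned.
        System.out.println(comparablePath(parse("file:/tmp/my dir/classes/")));
    }
}
```

In this sketch the caller would keep `url.getPath()` separately wherever the encoded form is still required, such as the dev-mode cache keys mentioned above.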

The key piece needed to make the fix work is using the decoded URI path for comparable resource paths:

resource.getURL().toURI().getPath()

instead of relying on the raw URL path for matching:

resource.getURL().getPath()

This follows the JDK recommendation for URL escaping handling. URL does not itself encode or decode URL components, and the recommended way to manage URL encoding and decoding is to use URI and convert between URL and URI.

Recommended reference:
https://docs.oracle.com/javase/8/docs/api/java/net/URL.html

Closes #11871

Implementation note

The regression test needs to exercise CustomResourceLoader directly. I found existing Flow tests using both patterns: some use reflection to reach private implementation details, and others keep implementation types package-private so same-package tests can instantiate them directly.

I chose to make CustomResourceLoader package-private instead of using reflection, because it keeps the test simpler while still avoiding public API exposure. If the maintainers prefer preserving the private nested class, I can switch the test back to reflection and make CustomResourceLoader private again.

The important behavioral change is limited to the comparable path used for parent/child resource matching. The original URL path is still preserved where the previous encoded form is required, such as dev-mode cache lookup keys.

Testing

  • Reproduced the issue with a minimal Spring Boot/Vaadin application in a project path containing François and a decomposed Unicode segment. Before the fix, the app failed during test startup with Parent resource ... not found in the resources!.
  • Verified the same minimal application starts/tests successfully after installing the fixed local Flow artifacts.
  • Added regression coverage in VaadinServletContextInitializerTest for a classpath root containing Unicode and decomposed Unicode characters.
  • Verified the regression test is meaningful: with only the production fix temporarily reverted, the new test fails with Parent resource ... not found in the resources!; with the fix restored, the same test passes.

Local checks run:

mvn -q -P!install-git-hooks -pl vaadin-spring spotless:check
mvn -q -P!install-git-hooks -pl vaadin-spring -Dtest=VaadinServletContextInitializerTest test
mvn -q -P!install-git-hooks -pl vaadin-spring -Dtest=VaadinServletContextInitializerTest#customResourceLoader_classpathRootContainsUnicodeCombiningCharacter_resourcesAreMatched test
mvn -q -P!install-git-hooks -pl vaadin-spring -am -DskipTests -Dexec.skip=true install

AI Disclosure

Code drafted with OpenClaw/Codex for contributor review. Tests were added and run locally by the assistant. I reviewed the code and this description before opening the PR.

cla-assistant Bot commented Apr 30, 2026

CLA assistant check
All committers have signed the CLA.

@martinfrancois martinfrancois force-pushed the fix/11871-unicode-resource-paths branch from e33142b to 4285f93 Compare April 30, 2026 06:49
@mshabarov mshabarov added the Contribution PRs coming from the community or external to the team label Apr 30, 2026
@github-actions

Test Results

 1 394 files  ±0   1 394 suites  ±0   1h 16m 40s ⏱️ + 1m 24s
10 068 tests +1   9 998 ✅ +1  70 💤 ±0  0 ❌ ±0 
10 543 runs  +1  10 464 ✅ +1  79 💤 ±0  0 ❌ ±0 

Results for commit 468b05e. ± Comparison against base commit 3fcd491.

@mshabarov mshabarov added this pull request to the merge queue Apr 30, 2026
Merged via the queue into vaadin:main with commit f4c473a Apr 30, 2026
50 of 51 checks passed
@vaadin-bot

Hi @martinfrancois and @mshabarov, when I cherry-picked this commit to 25.0, I encountered the following issue. Can you take a look and pick it manually?
Error Message:
Error: Command failed: git cherry-pick f4c473a
error: could not apply f4c473a... fix(spring): handle Unicode classpath resource paths (#24220)
hint: After resolving the conflicts, mark them with
hint: "git add/rm ", then run
hint: "git cherry-pick --continue".
hint: You can instead skip this commit with "git cherry-pick --skip".
hint: To abort and get back to the state before "git cherry-pick",
hint: run "git cherry-pick --abort".

vaadin-bot added a commit that referenced this pull request Apr 30, 2026
….1) (#24231)

This PR cherry-picks changes from the original PR #24220 to branch 25.1.

Co-authored-by: François Martin <f.martin@fastmail.com>
Co-authored-by: Mikhail Shabarov <61410877+mshabarov@users.noreply.github.com>
@martinfrancois

@mshabarov thanks for merging it! 🎉 Regarding the bot, is there anything I need to do?

@mshabarov

@martinfrancois thanks for the contribution! No, nothing to do on your side. I'll do the backport cherry-pick to the Vaadin 25.0 branch myself.

@martinfrancois

@mshabarov thanks! 😊

platosha pushed a commit that referenced this pull request May 6, 2026
#24250)

Cherry pick of #24220 to 25.0

Co-authored-by: François Martin <f.martin@fastmail.com>

Labels

cherry-picked-25.1
Contribution (PRs coming from the community or external to the team)
need to pick manually 25.0
target/25.0 (Cherry-pick to 25.0 branch)
target/25.1
+0.0.1

Development

Successfully merging this pull request may close these issues.

Cannot build project when special character presents in the path of the project
