iOS UTF-8 codec: replace-char semantics, NEON ASCII fast-path, benchmark by shai-almog · Pull Request #4989 · codenameone/CodenameOne

shai-almog · 2026-05-19T18:46:43Z

Summary

- Replace RuntimeException("Decoding Error") with JDK-compatible U+FFFD replacement in the ParparVM UTF-8 decoder; remove the silent Latin-1 fallback and other silent-corruption paths.
- Add a portable UTF-16 → UTF-8 encoder so the POSIX/clean-target build no longer ships a 1-byte-per-char stub, and so the Apple build can skip NSString for the common UTF-8 case.
- Add an __ARM_NEON-gated ASCII prefix scan + u8→u16 widen for large inputs (~53× faster than the scalar DFA on ASCII payloads, verified in a non-allocating microbench).
- Fix ISO-8859-2 being silently aliased to NSISOLatin1, add UTF8/ASCII/LATIN1/LATIN2 aliases, and honour String.offset when reading the encoding name.
- New Utf8PerformanceIntegrationTest mirroring the Base64 perf pattern. It runs ASCII + mixed-byte payloads through JavaSE and ParparVM, asserts identical signatures, and folds a malformed-input probe into the signature so REPLACE parity is verified end-to-end.
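
To make the encoder item concrete, here is a minimal Java sketch of UTF-16 → UTF-8 encoding with surrogate-pair joining. It is not the PR's C implementation: the class name `Utf16ToUtf8Sketch` is hypothetical, and mapping unpaired surrogates to `?` is an assumption borrowed from what `String.getBytes` does on JavaSE, since the PR text does not say how the native encoder treats them.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration class; not the PR's native encoder.
final class Utf16ToUtf8Sketch {

    /** Encodes UTF-16 code units as UTF-8, joining surrogate pairs. */
    static byte[] encode(CharSequence s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int n = s.length();
        for (int i = 0; i < n; i++) {
            char c = s.charAt(i);
            int cp;
            if (Character.isHighSurrogate(c) && i + 1 < n
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                // Join the pair into one supplementary code point.
                cp = Character.toCodePoint(c, s.charAt(++i));
            } else if (Character.isSurrogate(c)) {
                // Assumption: an unpaired surrogate becomes '?', as String.getBytes does.
                cp = '?';
            } else {
                cp = c;
            }
            if (cp < 0x80) {                        // 1 byte
                out.write(cp);
            } else if (cp < 0x800) {                // 2 bytes
                out.write(0xC0 | (cp >> 6));
                out.write(0x80 | (cp & 0x3F));
            } else if (cp < 0x10000) {              // 3 bytes
                out.write(0xE0 | (cp >> 12));
                out.write(0x80 | ((cp >> 6) & 0x3F));
                out.write(0x80 | (cp & 0x3F));
            } else {                                // 4 bytes
                out.write(0xF0 | (cp >> 18));
                out.write(0x80 | ((cp >> 12) & 0x3F));
                out.write(0x80 | ((cp >> 6) & 0x3F));
                out.write(0x80 | (cp & 0x3F));
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // 1-, 2-, 3- and 4-byte sequences; the last two chars are the pair for U+1F600.
        String mixed = "a\u00E9\u20AC\uD83D\uDE00";
        byte[] expected = mixed.getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(encode(mixed), expected)); // prints true
    }
}
```

A 1-byte-per-char stub would emit 5 bytes for this string instead of 10, which is why such a build cannot round-trip non-ASCII text.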

Why

- Apps reading slightly-broken network/file data crashed on iOS where they did not on the JavaSE simulator. JDK's new String(bytes, "UTF-8") uses CodingErrorAction.REPLACE; this PR brings ParparVM in line.
- ISO-8859-2 returning Latin-1 bytes was a silent data-corruption bug.
- The C-level encoder stub for non-Apple builds meant any port that could not rely on Apple's NSString could not round-trip Unicode.
- UTF-8 decoding shows up in JSON / HTML parsers. A NEON ASCII fast path is what simdjson and CoreFoundation use, and we have the infrastructure in IOSSimd.m to use the same pattern here.
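
The JavaSE behaviour that the first point refers to can be reproduced with a few lines. This is a small demonstration using only standard JDK classes; the class name `JdkReplaceDemo` is made up for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class showing lenient vs strict UTF-8 decoding on JavaSE.
final class JdkReplaceDemo {

    /** Lenient decode: the String constructor substitutes U+FFFD for malformed input. */
    static String lenient(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    /** Strict decode: a fresh CharsetDecoder reports malformed input by default. */
    static boolean strictAccepts(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder().decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] malformed = { 'a', (byte) 0xFF, 'b' }; // 0xFF never appears in valid UTF-8
        String s = lenient(malformed);
        System.out.println(s.length());               // prints 3
        System.out.println(s.charAt(1) == '\uFFFD');  // prints true
        System.out.println(strictAccepts(malformed)); // prints false
    }
}
```

The lenient path is the one most application code hits, so a decoder that throws on the same bytes produces an iOS-only crash.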

Test plan

- mvn test -Dtest=Utf8PerformanceIntegrationTest (passes: JavaSE + ParparVM produce identical RESULT signatures, including the malformed-input probe).
- mvn test -Dtest=Base64PerformanceIntegrationTest (regression check; still passes after the codec rewrite).
- Standalone microbench of the NEON helpers vs scalar DFA: 23 GB/s vs 443 MB/s on a pure-ASCII payload (Apple Silicon).
- CI: full ParparVM integration suite on a clean checkout.
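
For readers unfamiliar with the fast path being benchmarked, the following is a scalar Java stand-in for the idea, not the NEON code itself. It ORs a 16-byte block together, tests the high bit once per block, and then widens the ASCII prefix byte by byte. The PR text names vmaxvq_u8 for the scan and vmovl_u8 for the widen; the block size, class name, and method names here are assumptions for illustration.

```java
// Hypothetical scalar sketch of an ASCII prefix scan followed by a u8 -> u16 widen.
final class AsciiPrefixSketch {

    // Assumed block size: 16 bytes, the width of one 128-bit NEON register.
    static final int BLOCK = 16;

    /** Returns the number of leading bytes that are all below 0x80. */
    static int asciiPrefixLength(byte[] in) {
        int i = 0;
        // Whole blocks: OR the bytes together; any byte >= 0x80 sets bit 7 of acc.
        while (i + BLOCK <= in.length) {
            int acc = 0;
            for (int j = 0; j < BLOCK; j++) {
                acc |= in[i + j];
            }
            if ((acc & 0x80) != 0) {
                break; // a non-ASCII byte is somewhere in this block
            }
            i += BLOCK;
        }
        // Tail, or the block that failed: locate the exact boundary byte by byte.
        while (i < in.length && in[i] >= 0) {
            i++;
        }
        return i;
    }

    /** Widens the first n bytes (all ASCII) into chars; returns n. */
    static int widenAscii(byte[] in, int n, char[] out) {
        for (int k = 0; k < n; k++) {
            out[k] = (char) in[k]; // values are 0..127, so the cast is lossless
        }
        return n;
    }

    public static void main(String[] args) {
        byte[] in = new byte[41];
        java.util.Arrays.fill(in, (byte) 'x');
        in[40] = (byte) 0xC3; // first byte of a 2-byte sequence
        System.out.println(asciiPrefixLength(in)); // prints 40
    }
}
```

A caller would widen the prefix directly and hand only the remaining bytes to the general decoder, which is where a pure-ASCII payload saves the most work.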

Notes

- The NEON win is hidden in the integration-test timings because each call allocates a fresh char[] and ParparVM's GC bookkeeping dominates. The integration test still serves correctness + regression; the perf gain is in the helpers themselves, for callers in tight parser loops.
- One observable change for callers: code that relied on the old Decoding Error throw will instead receive a string containing U+FFFD. This matches JDK behaviour but is a behavioural change worth flagging in release notes.
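
Callers that depended on the old exception could restore a fail-fast check on top of the new behaviour. The sketch below is one possible approach, not something the PR ships: `ReplacementProbe` and its methods are hypothetical names. It cannot distinguish a decoder substitution from input that legitimately contains an encoded U+FFFD.

```java
import java.io.UnsupportedEncodingException;

// Hypothetical helper for code that wants the old fail-fast behaviour back.
final class ReplacementProbe {

    static final char REPLACEMENT = '\uFFFD';

    /** True if the decoded text contains at least one replacement character. */
    static boolean hasReplacement(String decoded) {
        return decoded.indexOf(REPLACEMENT) >= 0;
    }

    /** Decodes as UTF-8 and throws if anything was replaced (or was already U+FFFD). */
    static String decodeStrict(byte[] bytes) {
        String s;
        try {
            s = new String(bytes, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 not supported", e);
        }
        if (hasReplacement(s)) {
            throw new RuntimeException("Decoding Error");
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(decodeStrict(new byte[] { 'o', 'k' })); // prints ok
        try {
            decodeStrict(new byte[] { 'a', (byte) 0xFF });
        } catch (RuntimeException e) {
            System.out.println(e.getMessage()); // prints Decoding Error
        }
    }
}
```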

🤖 Generated with Claude Code

Rewrite the UTF-8 decode/encode helpers used by the ParparVM String layer. The previous decoder threw RuntimeException("Decoding Error") on malformed input, the encoder fell through to a 1-byte-per-char stub on non-Apple builds, and ISO-8859-2 was silently aliased to NSISOLatin1.

* Decoder: Hoehrmann DFA with JDK-compatible REPLACE semantics -- emits one U+FFFD per maximal-subpart violation instead of throwing. Truncated trailing sequences also emit U+FFFD. Removes the silent Latin-1 fallback that hid encoding errors when NSString rejected input.
* Encoder: portable UTF-16 -> UTF-8 with surrogate-pair joining. The Apple path now uses it for UTF-8 directly so NSString is no longer involved in the common case; the POSIX/test fallback gains a real implementation in place of the old "TODO" stub.
* NEON: __ARM_NEON-gated ASCII prefix scan (vmaxvq_u8) and u8->u16 widen (vmovl_u8) for inputs >= 64 bytes. A standalone microbench shows ~53x speedup over scalar DFA on ASCII-heavy payloads. The integration-level benchmark cannot see this win because allocating a fresh char[] per call dominates on ParparVM, but the helpers pull their weight on the parser-style hot paths the SIMD work was added for.
* ISO-8859-2 now maps to NSISOLatin2StringEncoding for both decode and encode; "UTF8", "ASCII", "LATIN1", "LATIN2" join the accepted aliases. String.offset is now honoured when reading the encoding name (it was ignored before, a latent bug for substring-derived encoding strings).

Utf8PerformanceIntegrationTest mirrors the Base64 perf pattern: it builds an ASCII payload + a mixed payload with 2/3/4-byte sequences (incl. surrogate pair U+1F600), runs encode/decode loops on both JavaSE and ParparVM, and asserts identical RESULT signatures. A malformed-input probe is folded into the signature so REPLACE parity between JDK and the iOS decoder is verified end-to-end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
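
The "one U+FFFD per maximal subpart" rule in the decoder item can be sketched in Java as follows. This is not the Hoehrmann DFA the commit describes: it is a plain range-check decoder built from the well-formed byte ranges in the Unicode Standard, with a hypothetical class name, written only to show where replacement characters land. It is not claimed to match the JDK on every malformed input.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical range-check decoder illustrating maximal-subpart replacement.
final class MaximalSubpartSketch {

    static String decode(byte[] in) {
        StringBuilder out = new StringBuilder(in.length);
        int i = 0;
        while (i < in.length) {
            int b0 = in[i++] & 0xFF;
            if (b0 < 0x80) {
                out.append((char) b0);
                continue;
            }
            int need;             // continuation bytes still expected
            int lo = 0x80;        // allowed range for the NEXT byte
            int hi = 0xBF;
            int cp;
            if (b0 >= 0xC2 && b0 <= 0xDF) {
                need = 1; cp = b0 & 0x1F;
            } else if (b0 >= 0xE0 && b0 <= 0xEF) {
                need = 2; cp = b0 & 0x0F;
                if (b0 == 0xE0) lo = 0xA0;   // reject overlong 3-byte forms
                if (b0 == 0xED) hi = 0x9F;   // reject encoded surrogates
            } else if (b0 >= 0xF0 && b0 <= 0xF4) {
                need = 3; cp = b0 & 0x07;
                if (b0 == 0xF0) lo = 0x90;   // reject overlong 4-byte forms
                if (b0 == 0xF4) hi = 0x8F;   // reject values above U+10FFFF
            } else {
                out.append('\uFFFD');        // stray continuation or invalid lead
                continue;
            }
            boolean ok = true;
            while (need > 0) {
                if (i >= in.length) { ok = false; break; }      // truncated at end
                int b = in[i] & 0xFF;
                if (b < lo || b > hi) { ok = false; break; }    // leave b unconsumed
                cp = (cp << 6) | (b & 0x3F);
                i++;
                need--;
                lo = 0x80; hi = 0xBF;
            }
            if (ok) out.appendCodePoint(cp); else out.append('\uFFFD');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        byte[] bad = { 'a', (byte) 0xFF, 'b' };
        // Compare with the JDK on one simple malformed input.
        System.out.println(decode(bad).equals(new String(bad, StandardCharsets.UTF_8))); // prints true
    }
}
```

The offending byte is left unconsumed when a sequence breaks, so it is re-examined as the start of the next sequence. That is how a truncated euro sign followed by `A` yields one U+FFFD and then `A` rather than swallowing the letter.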

github-actions · 2026-05-19T19:02:12Z

✅ Continuous Quality Report

Test & Coverage

✅ Tests: 2552 total, 0 failed, 0 skipped
📊 Line coverage: 53.99% [HTML preview] [Download]
- Lowest covered classes
  - com.codename1.ui.Display$EdtException – 0.00%
  - com.codename1.ui.plaf.CSSBorder$LinearGradient – 0.00%
  - com.codename1.util.EasyThread$InQueueRunnable – 0.00%
  - com.codename1.components.FloatingActionButton$ReleaseActionListener – 0.00%
  - com.codename1.io.Oauth2$RefreshTokenActionListener – 0.00%
  - com.codename1.util.EasyThread$RunAndWaitRunnable – 0.00%
  - com.codename1.components.ToastBar$FlushAnimationCallback – 0.00%
  - com.codename1.ui.ElevationComparator – 0.00%
  - com.codename1.ui.spinner.TimeSpinner3D$MinuteRowFormatter – 0.00%
  - com.codename1.io.ConnectionRequest$ImageFileSystemSuccessCallback – 0.00%

Static Analysis

SpotBugs [Report archive]
- ✅ ByteCodeTranslator: 0 findings (no issues)
- ✅ android: 0 findings (no issues)
- ✅ codenameone-maven-plugin: 0 findings (no issues)
- ✅ core-unittests: 0 findings (no issues)
- ✅ ios: 0 findings (no issues)
✅ PMD: 0 findings (no issues) [Report archive]
✅ Checkstyle: 0 findings (no issues) [Report archive]

Generated automatically by the PR CI workflow.

shai-almog · 2026-05-19T19:17:46Z

Compared 20 screenshots: 20 matched.
✅ JavaScript-port screenshot tests passed.

github-actions · 2026-05-19T19:52:43Z

✅ ByteCodeTranslator Quality Report

Test & Coverage

✅ Tests: 647 total, 0 failed, 2 skipped

Benchmark Results

Execution Time: 10549 ms
Hotspots (Top 20 sampled methods):
- 22.80% java.lang.String.indexOf (415 samples)
- 18.74% com.codename1.tools.translator.Parser.isMethodUsed (341 samples)
- 16.65% java.util.ArrayList.indexOf (303 samples)
- 6.21% com.codename1.tools.translator.BytecodeMethod.addToConstantPool (113 samples)
- 5.11% java.lang.Object.hashCode (93 samples)
- 2.64% java.lang.System.identityHashCode (48 samples)
- 2.14% com.codename1.tools.translator.ByteCodeClass.updateAllDependencies (39 samples)
- 2.03% com.codename1.tools.translator.ByteCodeClass.calcUsedByNative (37 samples)
- 1.81% com.codename1.tools.translator.BytecodeMethod.appendMethodSignatureSuffixFromDesc (33 samples)
- 1.65% com.codename1.tools.translator.ByteCodeClass.markDependent (30 samples)
- 1.59% com.codename1.tools.translator.Parser.generateClassAndMethodIndexHeader (29 samples)
- 1.32% com.codename1.tools.translator.BytecodeMethod.optimize (24 samples)
- 0.93% sun.nio.fs.UnixNativeDispatcher.open0 (17 samples)
- 0.71% sun.nio.ch.FileDispatcherImpl.write0 (13 samples)
- 0.66% com.codename1.tools.translator.Parser.cullMethods (12 samples)
- 0.66% com.codename1.tools.translator.BytecodeMethod.appendCMethodPrefix (12 samples)
- 0.66% java.lang.StringBuilder.append (12 samples)
- 0.60% com.codename1.tools.translator.ByteCodeClass.isDefaultInterfaceMethod (11 samples)
- 0.55% java.util.TreeMap.getEntry (10 samples)
- 0.49% com.codename1.tools.translator.BytecodeMethod.isMethodUsedByNative (9 samples)
⚠️ Coverage report not generated.

Static Analysis

✅ SpotBugs: no findings (report was not generated by the build).
⚠️ PMD report not generated.
⚠️ Checkstyle report not generated.

Generated automatically by the PR CI workflow.

shai-almog · 2026-05-19T20:29:18Z

Compared 110 screenshots: 110 matched.
✅ Native iOS screenshot tests passed.

Benchmark Results

VM Translation Time: 0 seconds
Compilation Time: 231 seconds

Build and Run Timing

Metric	Duration
Simulator Boot	67000 ms
Simulator Boot (Run)	1000 ms
App Install	13000 ms
App Launch	8000 ms
Test Execution	361000 ms

Detailed Performance Metrics

Metric	Duration
Base64 payload size	8192 bytes
Base64 benchmark iterations	6000
Base64 native encode	890.000 ms
Base64 CN1 encode	2001.000 ms
Base64 encode ratio (CN1/native)	2.248x (124.8% slower)
Base64 native decode	473.000 ms
Base64 CN1 decode	1291.000 ms
Base64 decode ratio (CN1/native)	2.729x (172.9% slower)
Base64 SIMD encode	461.000 ms
Base64 encode ratio (SIMD/native)	0.518x (48.2% faster)
Base64 encode ratio (SIMD/CN1)	0.230x (77.0% faster)
Base64 SIMD decode	776.000 ms
Base64 decode ratio (SIMD/native)	1.641x (64.1% slower)
Base64 decode ratio (SIMD/CN1)	0.601x (39.9% faster)
Image encode benchmark iterations	100
Image createMask (SIMD off)	103.000 ms
Image createMask (SIMD on)	16.000 ms
Image createMask ratio (SIMD on/off)	0.155x (84.5% faster)
Image applyMask (SIMD off)	174.000 ms
Image applyMask (SIMD on)	107.000 ms
Image applyMask ratio (SIMD on/off)	0.615x (38.5% faster)
Image modifyAlpha (SIMD off)	216.000 ms
Image modifyAlpha (SIMD on)	126.000 ms
Image modifyAlpha ratio (SIMD on/off)	0.583x (41.7% faster)
Image modifyAlpha removeColor (SIMD off)	228.000 ms
Image modifyAlpha removeColor (SIMD on)	118.000 ms
Image modifyAlpha removeColor ratio (SIMD on/off)	0.518x (48.2% faster)
Image PNG encode (SIMD off)	1502.000 ms
Image PNG encode (SIMD on)	1344.000 ms
Image PNG encode ratio (SIMD on/off)	0.895x (10.5% faster)
Image JPEG encode	621.000 ms

shai-almog · 2026-05-19T20:30:12Z

Compared 110 screenshots: 110 matched.
✅ Native iOS Metal screenshot tests passed.

Benchmark Results

VM Translation Time: 0 seconds
Compilation Time: 321 seconds

Build and Run Timing

Metric	Duration
Simulator Boot	92000 ms
Simulator Boot (Run)	1000 ms
App Install	19000 ms
App Launch	7000 ms
Test Execution	284000 ms

Detailed Performance Metrics

Metric	Duration
Base64 payload size	8192 bytes
Base64 benchmark iterations	6000
Base64 native encode	1016.000 ms
Base64 CN1 encode	1854.000 ms
Base64 encode ratio (CN1/native)	1.825x (82.5% slower)
Base64 native decode	543.000 ms
Base64 CN1 decode	1292.000 ms
Base64 decode ratio (CN1/native)	2.379x (137.9% slower)
Base64 SIMD encode	481.000 ms
Base64 encode ratio (SIMD/native)	0.473x (52.7% faster)
Base64 encode ratio (SIMD/CN1)	0.259x (74.1% faster)
Base64 SIMD decode	385.000 ms
Base64 decode ratio (SIMD/native)	0.709x (29.1% faster)
Base64 decode ratio (SIMD/CN1)	0.298x (70.2% faster)
Image encode benchmark iterations	100
Image createMask (SIMD off)	74.000 ms
Image createMask (SIMD on)	14.000 ms
Image createMask ratio (SIMD on/off)	0.189x (81.1% faster)
Image applyMask (SIMD off)	143.000 ms
Image applyMask (SIMD on)	85.000 ms
Image applyMask ratio (SIMD on/off)	0.594x (40.6% faster)
Image modifyAlpha (SIMD off)	177.000 ms
Image modifyAlpha (SIMD on)	181.000 ms
Image modifyAlpha ratio (SIMD on/off)	1.023x (2.3% slower)
Image modifyAlpha removeColor (SIMD off)	251.000 ms
Image modifyAlpha removeColor (SIMD on)	106.000 ms
Image modifyAlpha removeColor ratio (SIMD on/off)	0.422x (57.8% faster)
Image PNG encode (SIMD off)	1187.000 ms
Image PNG encode (SIMD on)	1514.000 ms
Image PNG encode ratio (SIMD on/off)	1.275x (27.5% slower)
Image JPEG encode	665.000 ms

shai-almog linked an issue May 19, 2026 that may be closed by this pull request

Crash in resource manager #3919

Closed

shai-almog merged commit d60e995 into master May 20, 2026
20 of 21 checks passed

shai-almog merged 1 commit into master from ios-utf8-replace-and-neon


