perf: QR sampling and finder detection optimizations #18
PrzemyslawKlys merged 48 commits into master
Claude finished @PrzemyslawKlys's task

PR Review: QR Sampling and Finder Detection Optimizations

I've completed a comprehensive review of this performance optimization PR. This is an impressive optimization effort with significant technical depth, but there are several important issues that should be addressed.

Review Tasks

Code Quality Assessment

✅ Strengths

1. Performance Optimizations are Well-Implemented
2. Code Structure & Patterns
Pull request overview
This pull request implements significant performance optimizations for QR code sampling, finder pattern detection, and various rendering operations. The changes focus on reducing allocations, improving hot-path performance, and optimizing cross-check loops in the decoder.
Changes:
- Optimized rendering hot paths by eliminating intermediate buffer allocations across XPM, XBM, PPM, PGM, PBM, and PAM writers
- Introduced `LumaTables` for lookup-table-based luma calculations
- Optimized QR sampling with row-incremental transform calculations and mode-based early exits
- Improved finder pattern detection and cross-check performance with direct array access
- Reduced allocations in QR encoder by reusing `BitMatrix` instances
- Added `ByteArrayComparer` for efficient deduplication
- Optimized tile scanning to use direct slicing instead of buffer copying
Reviewed changes
Copilot reviewed 63 out of 63 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| CodeGlyphX/Rendering/LumaTables.cs | New lookup table class for optimized luma calculations |
| CodeGlyphX/Rendering/Xpm/XpmWriter.cs | Added scanline buffer support to eliminate intermediate allocations |
| CodeGlyphX/Rendering/Png/PngWriter.cs | Optimized PNG encoding with pre-calculated chunk types and improved Adler/CRC |
| CodeGlyphX/Rendering/Png/PngRenderHelpers.cs | New helper class for shared rendering optimizations |
| CodeGlyphX/Qr/QrPixelDecoder.Part5.cs | Optimized sampling with row-incremental transforms and mode switching |
| CodeGlyphX/Qr/QrFinderPatternDetector.cs | Improved finder detection with direct array access |
| CodeGlyphX/Qr/QrGrayImage.cs | Added luma table implementation (duplicates LumaTables) |
| CodeGlyphX/BitMatrix.cs | Added Invert(), CopyFrom(), and Words accessor |
| CodeGlyphX/Qr/QrEncoder.cs | Reduced allocations by reusing BitMatrix instances |
| CodeGlyphX/Internal/ByteArrayComparer.cs | New comparer for byte array deduplication |
| Build/Generate-BenchmarkReport.ps1 | Updated benchmark report generation with encoding issues |
Build/Generate-BenchmarkReport.ps1
Outdated
$normalized = $normalized -replace "µs", "μs"
$normalized = $normalized -replace "�s", "μs"
$normalized = $normalized -replace "µs", "μs"
$normalized = $normalized -replace "Âμs", "μs"
The character replacements in the PowerShell script appear to have encoding issues. The replacements are converting to multi-byte UTF-8 sequences (µ, μ, etc.) instead of proper single-character replacements. This suggests the file may have been saved with incorrect encoding. The original intent was likely to normalize various microsecond symbol representations (µs, μs) to a consistent format. Verify the file encoding is UTF-8 with BOM or UTF-8, and ensure the replacement strings are correct single-character representations of the microsecond symbol.
Suggested change:
$normalized = $normalized -replace "µs", "µs"
$normalized = $normalized -replace "�s", "µs"
$normalized = $normalized -replace "µs", "µs"
$normalized = $normalized -replace "Âμs", "µs"
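To make the encoding concern concrete, here is a minimal C# sketch of how such debris typically arises: text is written as UTF-8 and then read back with a single-byte encoding. The class and method names are hypothetical, and ISO-8859-1 stands in for whatever legacy code page was actually involved, which the PR does not state.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the PR: reproduces the usual mojibake path.
public static class MojibakeSketch
{
    // Encodes the text as UTF-8, then (wrongly) decodes the bytes as ISO-8859-1.
    public static string Misdecode(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return Encoding.GetEncoding("iso-8859-1").GetString(utf8Bytes);
    }
}
```

Under these assumptions, `Misdecode("\u00B5s")` (MICRO SIGN, UTF-8 bytes C2 B5) yields `"\u00C2\u00B5s"`, and `Misdecode("\u03BCs")` (GREEK SMALL LETTER MU, UTF-8 bytes CE BC) yields `"\u00CE\u00BCs"`. Writing the replacement patterns with `\u` escapes or `[char]` codes rather than literal characters would make the script independent of the file's own encoding.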
    "schemaVersion": 1,
    "entries": [
      {
        "os": "windows",
        "runMode": "full",
        "generatedUtc": "2026-01-22T15:35:36.4149651Z",
        "publish": true,
        "framework": "net8.0",
        "configuration": "Release",
        "artifacts": "C:\\Support\\GitHub\\CodeMatrix\\Build\\BenchmarkResults\\windows-20260122-155044",
        "meta": {
          "commit": null,
          "branch": null,
          "dotnetSdk": "10.0.102",
          "runtime": null,
          "osDescription": "Microsoft Windows 10.0.26200",
          "osArchitecture": null,
          "processArchitecture": null,
          "machineName": "EVOMONSTER",
          "processorCount": 32
        }
      },
      {
        "os": "linux",
        "runMode": "quick",
        "generatedUtc": "2026-01-22T23:04:32.829577+01:00",
        "publish": false,
        "framework": "net8.0",
        "configuration": "Release",
        "artifacts": "/mnt/c/Support/GitHub/CodeMatrix/Build/BenchmarkResults/linux-20260122-230352",
        "meta": {
          "commit": null,
          "branch": null,
          "dotnetSdk": null,
          "runtime": null,
          "osDescription": "Linux-6.6.87.2-microsoft-standard-WSL2-x86_64-with-glibc2.39",
          "osArchitecture": "x86_64",
          "processArchitecture": "x86_64",
          "machineName": "EVOMONSTER",
          "processorCount": 32
        }
      },
      {
        "os": "windows",
        "runMode": "quick",
        "generatedUtc": "2026-01-23T16:27:37.1081965Z",
        "publish": false,
        "framework": "net8.0",
        "configuration": "Release",
        "artifacts": "Build\\BenchmarkResults\\windows-20260123-172458",
        "meta": {
          "commit": null,
          "branch": null,
          "dotnetSdk": "10.0.102",
          "runtime": ".NET Framework 4.8.9221.0",
          "osDescription": "Microsoft Windows 10.0.26200 ",
          "osArchitecture": "X64",
          "processArchitecture": "X64",
          "machineName": "EVOMONSTER",
          "processorCount": 32
        },
        "runModeSource": "explicit"
      }
    ]
The JSON formatting has changed from compact to a verbose format with excessive indentation and spaces. This significantly increases file size without providing any functional benefit. The schemaVersion and entries keys now have 4 spaces after the colon, and nested objects have inconsistent indentation patterns. Consider using a standard JSON formatter with consistent, compact formatting (e.g., 2-space indentation without trailing spaces) to reduce file size and improve readability.
CodeGlyphX/Qr/QrGrayImage.cs
Outdated
private static readonly int[] LumaR = BuildLumaTable(299);
private static readonly int[] LumaG = BuildLumaTable(587);
private static readonly int[] LumaB = BuildLumaTable(114);
Duplicate luma table implementation detected. The same luma lookup tables (LumaR, LumaG, LumaB with factors 299, 587, 114) are now defined in both CodeGlyphX/Rendering/LumaTables.cs and CodeGlyphX/Qr/QrGrayImage.cs. This creates unnecessary code duplication and wastes memory by storing identical tables twice. Consider consolidating these into a single shared implementation, perhaps by having QrGrayImage use the LumaTables class from the CodeGlyphX.Rendering namespace.
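A minimal sketch of what a single shared table could look like follows. The class name `SharedLuma`, the `Build` helper, and the rounding and scaling by 1000 are assumptions for illustration; the PR's actual `BuildLumaTable` and `LumaTables` may scale or round differently.

```csharp
using System;

// Hypothetical shared table class; not the PR's LumaTables implementation.
public static class SharedLuma
{
    // Each table stores channelValue * weight for all 256 channel values.
    private static readonly int[] R = Build(299);
    private static readonly int[] G = Build(587);
    private static readonly int[] B = Build(114);

    private static int[] Build(int weight)
    {
        var table = new int[256];
        for (var i = 0; i < table.Length; i++)
        {
            table[i] = i * weight;
        }
        return table;
    }

    // Integer approximation of 0.299 R + 0.587 G + 0.114 B, rounded to nearest.
    public static byte Luma(byte r, byte g, byte b)
    {
        return (byte)((R[r] + G[g] + B[b] + 500) / 1000);
    }
}
```

With one such class, both the rendering code and `QrGrayImage` could call the same lookup, so the three 256-entry tables would exist only once per process.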
Array.Clear(integral, 0, stride);

try {
    for (var y = 1; y <= h; y++) {
        var rowSum = 0;
        var row = (y - 1) * w;
        var baseIdx = y * stride;
        var prevIdx = (y - 1) * stride;
        integral[baseIdx] = 0;
The optimization to Array.Clear(integral, 0, stride) instead of clearing the entire array is good, but the subsequent integral[baseIdx] = 0; assignment inside the loop (line 127) is redundant since the first element (index 0) was already cleared. This creates unnecessary writes in every iteration. Consider removing the explicit zero assignment or documenting why it's needed if there's a specific reason.
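For context, here is a self-contained sketch of a summed-area (integral) table built with the same indexing as the snippet. It assumes a buffer rented from `ArrayPool<int>`, which the visible diff does not confirm; the class and method names are hypothetical.

```csharp
using System;
using System.Buffers;

// Hypothetical sketch, not the PR's code.
public static class IntegralImageSketch
{
    // Builds a (w + 1) x (h + 1) summed-area table in a rented buffer.
    // The caller is expected to return the buffer to ArrayPool<int>.Shared.
    public static int[] Build(byte[] gray, int w, int h)
    {
        var stride = w + 1;
        var integral = ArrayPool<int>.Shared.Rent(stride * (h + 1));
        // Rented arrays may hold stale data, so row 0 is cleared explicitly.
        Array.Clear(integral, 0, stride);
        for (var y = 1; y <= h; y++)
        {
            var rowSum = 0;
            var row = (y - 1) * w;
            var baseIdx = y * stride;
            var prevIdx = (y - 1) * stride;
            integral[baseIdx] = 0; // column 0 of row y, outside the cleared range
            for (var x = 1; x <= w; x++)
            {
                rowSum += gray[row + x - 1];
                integral[baseIdx + x] = integral[prevIdx + x] + rowSum;
            }
        }
        return integral;
    }

    // Sum of the pixels in the half-open rectangle [x0, x1) x [y0, y1).
    public static int Sum(int[] integral, int stride, int x0, int y0, int x1, int y1)
    {
        return integral[y1 * stride + x1] - integral[y0 * stride + x1]
             - integral[y1 * stride + x0] + integral[y0 * stride + x0];
    }
}
```

In this sketch `baseIdx` equals `y * stride`, so the write targets column 0 of row `y`, which lies outside the first `stride` elements that `Array.Clear` zeroes. With a freshly allocated array the write would be unnecessary because new arrays are zero-initialized; with a reused or rented buffer it would not be. Which case applies to the PR depends on how `integral` is obtained.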
unchecked {
    uint hash = 2166136261;
    for (var i = 0; i < obj.Length; i++) {
        hash ^= obj[i];
        hash *= 16777619;
    }
    return (int)hash;
}
The GetHashCode implementation for byte arrays uses FNV-1a hash, which is a simple non-cryptographic hash. While this is appropriate for HashSet usage, be aware that this hash function can have poor distribution for certain data patterns. For QR code byte data which may have structured patterns, consider profiling hash collisions if performance issues arise with large numbers of decoded codes. The implementation is correct but could be a performance bottleneck with many similar codes.
Suggested change:
var hash = new HashCode();
foreach (var b in obj) {
    hash.Add(b);
}
return hash.ToHashCode();
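For reference, a complete comparer built around the FNV-1a variant might look like the sketch below. The class name and the `Equals` implementation are assumptions; only the hashing loop mirrors the snippet under review.

```csharp
using System;
using System.Collections.Generic;

// Illustrative comparer; the PR's ByteArrayComparer may differ.
public sealed class ByteArrayComparerSketch : IEqualityComparer<byte[]>
{
    public bool Equals(byte[] x, byte[] y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null || x.Length != y.Length) return false;
        for (var i = 0; i < x.Length; i++)
        {
            if (x[i] != y[i]) return false;
        }
        return true;
    }

    public int GetHashCode(byte[] obj)
    {
        unchecked
        {
            uint hash = 2166136261; // FNV-1a 32-bit offset basis
            for (var i = 0; i < obj.Length; i++)
            {
                hash ^= obj[i];     // FNV-1a: xor the byte first,
                hash *= 16777619;   // then multiply by the FNV prime
            }
            return (int)hash;
        }
    }
}
```

Passing an instance to `new HashSet<byte[]>(new ByteArrayComparerSketch())` makes two arrays with identical contents count as one entry. One trade-off to weigh against the suggested change: FNV-1a produces the same hash in every process, whereas `System.HashCode` uses a per-process random seed, and on `netstandard2.0` or `net472` targets `System.HashCode` is, as far as I know, only available through an additional package.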
- Create benchmark.html template with honest, transparent presentation
- Add benchmark.js to load and render JSON benchmark data
- Add CSS styles for benchmark tables, cards, and mode selector
- Update navigation to include Benchmarks link
- Update Generate-StaticPages.ps1 to generate /benchmarks/ page

The page:
- Explains Quick vs Full benchmark modes clearly
- Shows limitations and methodology transparently
- Handles missing data gracefully (e.g., Full mode not yet run)
- Loads data from existing benchmark JSON files at runtime

* Introduced a new section for visual comparison of CodeGlyphX against the fastest competitor.
* Added loading text and chart container in `benchmark.html`.
* Implemented chart rendering logic in `benchmark.js` to display execution times.
* Styled new chart elements in `app.css` for better visual representation.

* Updated benchmark timestamps and artifact paths.
* Revised performance metrics for various encoding scenarios.
* Improved clarity in benchmark comparisons for better analysis.

* No code changes were made in this commit.
* This serves as a placeholder for future updates.

…culation
* Introduced caching for computed divisors to improve performance.
* Refactored `ComputeRemainder` to use a more efficient approach with spans.
* Enhanced readability and maintainability of the code.

feat(QrPngRenderer): 🎨 Improve PNG rendering efficiency
* Replaced for-loops with while-loops for better performance in `QrPngRenderer`.
* Implemented run-length encoding for dark modules to reduce processing time.
* Enhanced pixel filling logic to minimize redundant operations.

* Refactored multiple code segments to enhance readability.
* Improved organization of files and directories.
* Removed redundant code snippets to streamline functionality.

* Added a script to manage theme settings based on local storage.
* Introduced CSS variables for dark and light themes.
* Updated navigation links for brevity and added a new "Benchmarks" section.
* Improved loading of CSS for better performance.

feat(qr): 🚀 Optimize QR processing with aggressive inlining
* Added `[MethodImpl(MethodImplOptions.AggressiveInlining)]` to critical methods in `QrGrayImage`, `QrMath`, and `QrPerspectiveTransform` for performance improvements.
* Refactored pixel decoding logic in `QrPixelDecoder` and `QrPixelSampling` to enhance efficiency and readability.
* Updated checks for finite values in pixel decoding to improve robustness.

docs: 📝 Update README with target framework feature matrix
* Added a feature matrix for `netstandard2.0`, `net472`, `net8.0`, and `net10.0`.
* Clarified feature availability across different frameworks.

* Refactored the pixel decoding process to enhance clarity and efficiency.
* Simplified the handling of threshold maps and inversion logic.
* Optimized stack management during area calculations to prevent overflow.
* Improved boundary checks for pixel sampling to ensure robustness.
* Enhanced readability by reducing nested loops and redundant calculations.

…nd sampling logic
* Consolidated budget expiration checks across multiple methods for improved readability and performance.
* Enhanced sampling logic in `QrPixelSampling` to reduce redundant calculations and improve clarity.
* Introduced new variables to streamline calculations and minimize repeated expressions.
* Improved handling of threshold checks to ensure consistent behavior across different sampling scenarios.

* Refactored checks for finite values to consolidate conditions.
* Enhanced handling of small denominators to prevent division errors.
* Optimized budget checks by reducing redundant calls within loops.
* Introduced early exit for budget expiration to improve performance.
* Utilized stackalloc for pixel coordinate storage to reduce heap allocations.

* Changed the failure reason from `DecodeFailureReason.Error` to `DecodeFailureReason.PlatformNotSupported` in `QrImageDecoder` methods.
* Updated README to clarify that `QrImageDecoder.TryDecodeImage(...)` and `QrDecoder.TryDecode(...)` are only available in .NET 8 and later.
* Added information about runtime checks via `CodeGlyphXFeatures` for better diagnostics.

* Set up the project structure.
* Prepare for future code changes.

… clarity
* Updated feature availability descriptions in documentation.
* Added detailed notes on benchmark scenarios and results.
* Introduced new options for QR rendering, including module shape and logo customization.
* Enhanced README with target framework guidance and benchmark focus.
* Improved XML documentation for QR builder methods.

…mance
* Introduced new benchmarks for resampled QR images and images without quiet zones.
* Added methods to build resampled and no-quiet QR samples.
* Enhanced the QR decoding process by utilizing `ArrayPool` for memory management.
* Updated documentation to clarify benchmark run differences between quick and full runs.
* Improved error correction logic in the Reed-Solomon decoder for better performance.

* Introduced `Download-ImageSamples.ps1` for fetching image samples.
* Created `manifest.json` to define sample entries with metadata.
* Added `README.md` for instructions on using the download script.
* Implemented `ImageDecodeSamplesTests` to validate image decoding functionality.

…allation
* Introduced a new job step to download image samples on Linux runners.
* Updated Playwright installation to include the `serve` package.
* Enhanced `.gitignore` to exclude various image formats used in tests.
* Improved image decoding tests for better error handling and validation.
## Summary
- optimize QR sampling hot paths (bilinear/nearest, early exits, row-incremental transforms)
- speed up finder-pattern scanning and cross-check loops
- refresh benchmark reporting/data outputs (including Δ and latest compare run)

## Testing
- Build\Run-Benchmarks-Compare.ps1
- Build\Generate-BenchmarkReport.ps1 -ArtifactsPath Build\BenchmarkResults\windows-20260123-172458 -RunMode quick -Framework net8.0 -Configuration Release -FailOnMissingCompare
- dotnet test CodeGlyphX.Tests\CodeGlyphX.Tests.csproj -c Release
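
The "row-incremental transforms" item can be illustrated with a generic sketch. It assumes a perspective mapping of the form X = (a·x + b·y + c) / w, Y = (d·x + e·y + f) / w with w = g·x + h·y + 1, and a coefficient array ordered `{a, b, c, d, e, f, g, h}`; this layout and all names are hypothetical and not taken from the PR's `QrPerspectiveTransform`.

```csharp
using System;

// Hypothetical sketch of row-incremental perspective mapping; not the PR's code.
public static class RowIncrementalSketch
{
    // Maps the points (0, y), (1, y), ..., (count - 1, y) into image space.
    // m = { a, b, c, d, e, f, g, h } for
    //   X = (a*x + b*y + c) / w,  Y = (d*x + e*y + f) / w,  w = g*x + h*y + 1.
    public static void MapRow(double[] m, double y, int count, double[] outX, double[] outY)
    {
        // Numerators and denominator at x = 0 are computed once per row.
        double nx = m[1] * y + m[2];
        double ny = m[4] * y + m[5];
        double w = m[7] * y + 1.0;
        for (var x = 0; x < count; x++)
        {
            outX[x] = nx / w;
            outY[x] = ny / w;
            // Stepping x by one adds the x-coefficients; no per-pixel multiplies.
            nx += m[0];
            ny += m[3];
            w += m[6];
        }
    }
}
```

The per-pixel work drops to three additions and two divisions, at the cost of a small accumulated floating-point error along each row.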