ADFA-3813 | Fix OCR image metadata parsing and refactor CV domain #1300
📝 Walkthrough
Release Notes - ADFA-3813: OCR Image Metadata Parsing Fix & CV Domain Refactoring
Features & Fixes
Architecture Refactoring
Dependency Injection Updates
| Layer / File(s) | Summary |
|---|---|
| Repository contract & implementation cv-image-to-xml/src/main/java/.../VisionRepository.kt, .../VisionRepositoryImpl.kt | Introduces VisionRepository (init/detect/recognize/isInitialized/release) and VisionRepositoryImpl which delegates to YoloModelSource and OcrSource with appropriate dispatcher usage and lifecycle delegation. |
| Detection scaling utility .../domain/DetectionScaler.kt | New singleton that converts YOLO-normalized DetectionResult rects into pixel Rect for a target resolution, handling zero source dims, clamping, and minimum-size enforcement. |
| Text association utilities .../domain/TextAssociator.kt | New singleton providing assignTextToParents(...) and assignNearbyTextToWidgets(...) to map OCR TextBlocks to visual widgets using overlap, vertical alignment, and proximity scoring with widget-specific cleaning. |
| Layout tree construction .../domain/LayoutTreeBuilder.kt | New builder that groups ScaledBoxes into rows, classifies control rows (radio/checkbox), accumulates vertical groups, and emits LayoutItem nodes (radio/checkbox groups, horizontal rows, simple views). |
| Converter & XML generator updates .../domain/YoloToXmlConverter.kt, .../domain/xml/AndroidXmlGenerator.kt | YoloToXmlConverter now uses DetectionScaler and TextAssociator (constructor drop of LayoutGeometryProcessor); AndroidXmlGenerator uses LayoutTreeBuilder.buildLayoutTree instead of geometry processor. |
| Domain use cases .../domain/usecase/PrepareImageUC.kt, RunVisionUC.kt, GenerateXmlUC.kt, ImportPlaceholderImageUC.kt, RemovePlaceholderImageUC.kt | Adds: PrepareImageUC (decode, EXIF rotate, smart boundary detection → left/right pct), RunVisionUC (detect → resolve → region OCR → merge → filter → parse annotations; reports progress), GenerateXmlUC (wraps converter.generateXmlLayout), Import/Remove placeholder image UCs (delegating to DrawableImportHelper). Cancellation semantics preserved. |
| ViewModel & DI wiring .../ui/viewmodel/ComputerVisionViewModel.kt, .../di/ComputerVisionModule.kt | ViewModel constructor changed to accept VisionRepository + use cases; image loading, detection, XML generation, placeholder import/removal now delegate to use cases; resource cleanup delegates to repository.release(). DI module updated to register VisionRepositoryImpl, OcrSource, RegionOcrProcessor, GenericBoxResolver, and singletons for the new use cases; ViewModel factory wired accordingly. |
| Minor value-cleaning tweak .../domain/parser/ValueCleanersImpl.kt | DrawableCleaner.clean adds post-processing replacing im_age→image and standalone im→image prior to final drawable name normalization. |
| Removed/Deleted artifacts (evidence) .../data/repository/ComputerVisionRepository.kt, .../data/repository/ComputerVisionRepositoryImpl.kt, .../domain/LayoutGeometryProcessor.kt | The old monolithic repository and the LayoutGeometryProcessor file were removed; their responsibilities were split into the layers above. |
sequenceDiagram
participant VM as ComputerVisionViewModel
participant Prepare as PrepareImageUC
participant RunUC as RunVisionUC
participant Repo as VisionRepository
participant Yolo as YoloModelSource
participant OCR as OcrSource
participant Gen as GenerateXmlUC
participant Conv as YoloToXmlConverter
VM->>Prepare: invoke(uri)
Prepare-->>VM: Result<PreparedImage>
VM->>RunUC: invoke(bitmap,leftPct,rightPct,onProgress)
RunUC->>Repo: detectWidgets(bitmap)
Repo->>Yolo: runInference(bitmap)
Yolo-->>Repo: detections
Repo-->>RunUC: Result<List<DetectionResult>>
RunUC->>Repo: recognizeText(bitmap)
Repo->>OCR: recognizeText(bitmap)
OCR-->>Repo: textBlocks
Repo-->>RunUC: Result<List<TextBlock>>
RunUC->>RunUC: resolve/regionOcr/merge/filter/parse
RunUC-->>VM: Result<VisionResult>
VM->>Gen: invoke(detections, annotations, ...)
Gen->>Conv: generateXmlLayout(...)
Conv-->>Gen: Pair<layoutXml, stringsXml>
Gen-->>VM: Result<Pair<String,String>>
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~50 minutes
Possibly related PRs
- appdevforall/CodeOnTheGo#678: Refactors the same cv-image-to-xml feature; shares removal of old repository/geometry processor and introduces related replacements.
- appdevforall/CodeOnTheGo#887: Related left/right guide UI changes that feed guide percentages into the vision pipeline used by `RunVisionUC`.
- appdevforall/CodeOnTheGo#902: Related ROI filtering logic for margin false-positive suppression; similar filtering was moved into `RunVisionUC` in this PR.
Suggested reviewers
- avestaadfa
- Daniel-ADFA
- hal-eisen-adfa
🐰 I hopped through boxes, text, and scale,
I stitched the pieces where logic would fail,
Use cases now guide, the ViewModel's light,
Scaled boxes and text find their layout right,
A carrot of code—refactor with zeal! 🥕
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly identifies the main objectives: fixing OCR metadata parsing and refactoring the CV domain architecture, which align with the substantial changes in the changeset. |
| Description check | ✅ Passed | The description is directly related to the changeset, explaining the OCR fix rationale and detailed refactoring of the CV architecture with specific component names and motivations. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/DetectionScaler.kt`:
- Around line 26-27: Integer division in DetectionScaler.kt causes normW and
normH to lose precision; cast operands to floating-point before dividing (e.g.,
convert (rect.right - rect.left) and sourceWidth/sourceHeight to Float/Double)
so normalization uses floating-point math, and ensure normW/normH types match
(Float/Double) where they are used; update the normalization lines referencing
rect, normW, normH, sourceWidth and sourceHeight accordingly.
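The integer-division pitfall described in this comment can be illustrated with a minimal sketch. `PixelRect` and `normalizedSize` are hypothetical stand-ins introduced here for illustration, not the actual `DetectionScaler` code; the zero-dimension guard is an assumption based on the walkthrough's mention of "handling zero source dims".

```kotlin
// Hypothetical stand-in for an integer pixel rectangle; not the project's Rect type.
data class PixelRect(val left: Int, val top: Int, val right: Int, val bottom: Int)

// Returns (normW, normH) as fractions of the source size.
// Converting to Float before dividing avoids Int / Int truncation:
// (300 - 100) / 640 evaluates to 0 with integer operands.
fun normalizedSize(rect: PixelRect, sourceWidth: Int, sourceHeight: Int): Pair<Float, Float> {
    // Assumed guard: return zeros rather than dividing by a non-positive dimension.
    if (sourceWidth <= 0 || sourceHeight <= 0) return 0f to 0f
    val normW = (rect.right - rect.left).toFloat() / sourceWidth.toFloat()
    val normH = (rect.bottom - rect.top).toFloat() / sourceHeight.toFloat()
    return normW to normH
}
```

Keeping `normW` and `normH` as `Float` end to end means later multiplication by the target resolution does not need further casts.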
In
`@cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/parser/ValueCleanersImpl.kt`:
- Around line 167-168: The current cleanup replaces "im_age" but leaves the OCR
variant "im" producing "@drawable/im"; in ValueCleanersImpl (the cleaned ->
finalCleaned flow) update the replacement logic to also normalize standalone
"im" to "image" (use a word-boundary or equivalent check so you don't
accidentally change substrings) before returning "@drawable/$finalCleaned";
ensure you apply this to the same cleaned/finalCleaned variable used in the
return path so empty-value fallback still returns rawValue.
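A minimal sketch of the normalization this comment asks for, assuming a word-boundary regex is acceptable. `cleanDrawableName` and `STANDALONE_IM` are hypothetical names; the real `DrawableCleaner.clean` performs additional name normalization that is not reproduced here.

```kotlin
// Matches "im" only when it is not part of a longer identifier.
// Letters, digits, and "_" count as word characters, so "im_age" and "imposter" do not match.
val STANDALONE_IM = Regex("\\bim\\b")

// Hypothetical helper illustrating the cleaned -> finalCleaned flow.
fun cleanDrawableName(rawValue: String): String {
    val cleaned = rawValue.trim().lowercase()
    val finalCleaned = cleaned
        .replace("im_age", "image")       // OCR split "image" into "im_age"
        .replace(STANDALONE_IM, "image")  // OCR truncated "image" to "im"
    // Empty-value fallback: keep the raw value when nothing usable is left.
    return if (finalCleaned.isEmpty()) rawValue else "@drawable/$finalCleaned"
}
```

Running the `im_age` replacement first matters: once it has produced `image`, the standalone pattern no longer matches, so the value is not rewritten twice.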
In
`@cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/usecase/PrepareImageUC.kt`:
- Around line 45-51: The code in PrepareImageUC.kt that computes orientation
currently catches all Exceptions; narrow this to specific expected exceptions
(e.g., catch (ioe: IOException)) around the
contentResolver.openInputStream(uri)?.use { ... } / ExifInterface(...) call so
only IO problems are swallowed and other unexpected errors still surface;
optionally add an additional catch for SecurityException if permission issues
are possible. Ensure the fallback to ExifInterface.ORIENTATION_NORMAL remains in
the catch block.
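The narrowing suggested here might look like the sketch below. To keep it runnable without the Android SDK, the stream and `ExifInterface` access are replaced by a hypothetical `readOrientation` lambda, and `ORIENTATION_NORMAL` is redeclared locally with the EXIF value 1.

```kotlin
import java.io.IOException

// Same numeric value as ExifInterface.ORIENTATION_NORMAL.
const val ORIENTATION_NORMAL = 1

// `readOrientation` is a hypothetical stand-in for
// contentResolver.openInputStream(uri)?.use { ExifInterface(it)... }.
fun orientationOrNormal(readOrientation: () -> Int): Int =
    try {
        readOrientation()
    } catch (e: IOException) {
        ORIENTATION_NORMAL // unreadable stream or metadata: fall back
    } catch (e: SecurityException) {
        ORIENTATION_NORMAL // missing permission for the URI: fall back
    }
```

Any other exception type, such as an `IllegalStateException` from a programming error, propagates to the caller instead of being silently converted into "normal orientation".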
In
`@cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/ui/viewmodel/ComputerVisionViewModel.kt`:
- Line 70: When handling ComputerVisionEvent.UpdateGuides in the ViewModel,
sanitize the incoming event.leftPct and event.rightPct before storing: clamp
both values to the [0f, 1f] range, then order them so leftGuidePct <=
rightGuidePct, and finally call _uiState.update with the normalized values
(leftGuidePct and rightGuidePct) to prevent crossed or out-of-range guides from
affecting downstream filtering and annotation parsing.
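The clamp-then-order step can be sketched as a small pure function. `normalizeGuides` is a hypothetical helper name; how the result is written into `_uiState` depends on the ViewModel's state class and is not shown.

```kotlin
// Hypothetical helper: clamps both guides to [0, 1], then orders them so left <= right.
// NaN inputs are not handled in this sketch.
fun normalizeGuides(leftPct: Float, rightPct: Float): Pair<Float, Float> {
    val left = leftPct.coerceIn(0f, 1f)
    val right = rightPct.coerceIn(0f, 1f)
    return minOf(left, right) to maxOf(left, right)
}
```

For example, crossed guides `(0.8f, 0.2f)` come back as `(0.2f, 0.8f)`, and out-of-range guides `(-0.5f, 1.5f)` come back as `(0f, 1f)`.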
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ce97d103-0463-426a-8c1d-8d278e025120
📒 Files selected for processing (18)
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/data/repository/ComputerVisionRepository.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/data/repository/ComputerVisionRepositoryImpl.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/data/repository/VisionRepository.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/data/repository/VisionRepositoryImpl.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/di/ComputerVisionModule.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/DetectionScaler.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/LayoutGeometryProcessor.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/LayoutTreeBuilder.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/TextAssociator.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/YoloToXmlConverter.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/parser/ValueCleanersImpl.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/usecase/GenerateXmlUC.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/usecase/ImportPlaceholderImageUC.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/usecase/PrepareImageUC.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/usecase/RemovePlaceholderImageUC.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/usecase/RunVisionUC.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/xml/AndroidXmlGenerator.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/ui/viewmodel/ComputerVisionViewModel.kt
💤 Files with no reviewable changes (3)
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/data/repository/ComputerVisionRepository.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/data/repository/ComputerVisionRepositoryImpl.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/LayoutGeometryProcessor.kt
…omain logic into use cases.
Encapsulate guide limits, narrow EXIF exceptions, and fix 'im' drawable regex.
9815c28 to f80a06d
Actionable comments posted: 1
🧹 Nitpick comments (1)
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/usecase/PrepareImageUC.kt (1)
22-27: ⚡ Quick win
Move blocking I/O operations off `Dispatchers.Default` to `Dispatchers.IO`.
`ContentResolver.openFileDescriptor()`, `BitmapFactory.decodeFileDescriptor()`, and `ContentResolver.openInputStream()` are blocking operations that should not occupy the limited Default dispatcher thread pool. This can starve CPU-bound coroutines under concurrent load. Use `Dispatchers.IO` for decode and EXIF operations, reserving `Dispatchers.Default` only for the CPU-bound boundary detection.
Proposed refactor
```diff
-    suspend operator fun invoke(uri: Uri): Result<PreparedImage> = withContext(Dispatchers.Default) {
-        runCatching {
-            val bitmap = uriToBitmap(uri) ?: throw IllegalStateException("Failed to decode image from URI")
-            val rotatedBitmap = handleImageRotation(uri, bitmap)
-            val (leftBoundPx, rightBoundPx) = SmartBoundaryDetector.detectSmartBoundaries(rotatedBitmap)
+    suspend operator fun invoke(uri: Uri): Result<PreparedImage> = runCatching {
+        val rotatedBitmap = withContext(Dispatchers.IO) {
+            val bitmap = uriToBitmap(uri) ?: throw IllegalStateException("Failed to decode image from URI")
+            handleImageRotation(uri, bitmap)
+        }
+        val (leftBoundPx, rightBoundPx) = withContext(Dispatchers.Default) {
+            SmartBoundaryDetector.detectSmartBoundaries(rotatedBitmap)
+        }
         val widthFloat = rotatedBitmap.width.toFloat()
         PreparedImage(
             bitmap = rotatedBitmap,
             leftPct = leftBoundPx / widthFloat,
             rightPct = rightBoundPx / widthFloat
         )
-        }.onFailure {
+    }.onFailure {
         if (it is CancellationException) throw it
-        }
-    }
+    }
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/usecase/PrepareImageUC.kt` around lines 22 - 27, The coroutine currently runs all work on Dispatchers.Default in PrepareImageUC.operator fun invoke; move blocking I/O and decode work (uriToBitmap, ContentResolver.openFileDescriptor/openInputStream, BitmapFactory.decodeFileDescriptor, and EXIF reading used by handleImageRotation) to Dispatchers.IO, then switch back to Dispatchers.Default only for CPU-bound work like SmartBoundaryDetector.detectSmartBoundaries; update invoke to perform I/O steps (calling uriToBitmap and handleImageRotation’s file/stream operations) inside withContext(Dispatchers.IO) and perform SmartBoundaryDetector.detectSmartBoundaries(rotatedBitmap) inside withContext(Dispatchers.Default).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/usecase/PrepareImageUC.kt`:
- Around line 55-59: The when block handling EXIF orientation in
PrepareImageUC.kt currently handles only rotation cases; add explicit cases for
mirrored/transposed orientations ExifInterface.ORIENTATION_FLIP_HORIZONTAL (2),
ORIENTATION_FLIP_VERTICAL (4), ORIENTATION_TRANSPOSE (5), and
ORIENTATION_TRANSVERSE (7) so they perform the correct matrix operations instead
of falling through: use postScale(-1f, 1f) for horizontal flip (2),
postScale(1f, -1f) for vertical flip (4), and combine rotation and flip
(postRotate followed by postScale; android.graphics.Matrix has no postTranspose
method) for transposed (5) and transverse (7) cases, consistent with how
postRotate/postScale are used now; ensure the function still returns the
transformed bitmap when handling these new branches and reference the existing
orientation variable and the postRotate and postScale methods in your changes.
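The full set of orientation cases can be summarized as a lookup from the EXIF tag value to a rotate-then-scale pair. This is a sketch under assumptions: `ExifTransform` and `transformFor` are hypothetical names, and the Android `Matrix` is deliberately left out so the mapping stays runnable on its own. The numeric tags 1 to 8 are the standard EXIF orientation values that the `ExifInterface.ORIENTATION_*` constants mirror.

```kotlin
// Clockwise rotation in degrees, applied first, followed by an optional mirror.
// Intended to map onto postRotate(...) followed by postScale(scaleX, scaleY).
data class ExifTransform(val rotateDegrees: Int, val scaleX: Int = 1, val scaleY: Int = 1)

fun transformFor(orientation: Int): ExifTransform = when (orientation) {
    2 -> ExifTransform(0, scaleX = -1)    // FLIP_HORIZONTAL
    3 -> ExifTransform(180)               // ROTATE_180
    4 -> ExifTransform(0, scaleY = -1)    // FLIP_VERTICAL
    5 -> ExifTransform(90, scaleX = -1)   // TRANSPOSE: rotate 90, then mirror horizontally
    6 -> ExifTransform(90)                // ROTATE_90
    7 -> ExifTransform(270, scaleX = -1)  // TRANSVERSE: rotate 270, then mirror horizontally
    8 -> ExifTransform(270)               // ROTATE_270
    else -> ExifTransform(0)              // NORMAL (1) or undefined: leave unchanged
}
```

In the use case itself, each entry would presumably be applied as `matrix.postRotate(rotateDegrees.toFloat())` followed by `matrix.postScale(scaleX.toFloat(), scaleY.toFloat())` before creating the transformed bitmap; that wiring is an assumption and has not been checked against `PrepareImageUC.kt`.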
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 10080178-0271-4bd8-8fe2-9cd05e96f8de
🚧 Files skipped from review as they are similar to previous changes (14)
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/usecase/RemovePlaceholderImageUC.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/data/repository/VisionRepository.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/usecase/ImportPlaceholderImageUC.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/DetectionScaler.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/usecase/GenerateXmlUC.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/xml/AndroidXmlGenerator.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/parser/ValueCleanersImpl.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/YoloToXmlConverter.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/data/repository/VisionRepositoryImpl.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/usecase/RunVisionUC.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/TextAssociator.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/di/ComputerVisionModule.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/LayoutTreeBuilder.kt
- cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/ui/viewmodel/ComputerVisionViewModel.kt
Description
Fixes an OCR issue where arbitrary metadata values for image files were parsed incorrectly (e.g., reading "image" as "im" or "im_age"), causing broken `android:src` references in the generated XML. Alongside this fix, the Computer Vision architecture was heavily refactored: complex domain logic was extracted from `ComputerVisionViewModel` and the monolithic repository into focused, single-responsibility UseCases (`RunVisionUC`, `GenerateXmlUC`, `PrepareImageUC`, etc.) and helper objects (`DetectionScaler`, `LayoutTreeBuilder`, `TextAssociator`).
Details
- Updated `DrawableCleaner` (`ValueCleanersImpl.kt`) to correct common OCR misinterpretations of the word "image".
- Replaced `ComputerVisionRepository` with `VisionRepository` to purely handle ML operations.
- Extracted domain logic into use cases (`GenerateXmlUC`, `ImportPlaceholderImageUC`, `PrepareImageUC`, `RemovePlaceholderImageUC`, `RunVisionUC`).
- Extracted helper objects (`LayoutTreeBuilder`, `TextAssociator`, and `DetectionScaler`).
Screen.Recording.2026-05-13.at.11.22.18.AM.mov
Ticket
ADFA-3813
Observation
The domain refactoring drastically reduces the bloat in the `ComputerVisionViewModel`, effectively decoupling the UI state management from ML detection and XML generation logic.