ADFA-3709 | Refactor OCR fuzzy parsing and modularize YOLO to XML conversion#1185

Merged
jatezzz merged 5 commits into stage from
refactor/ADFA-3709-ocr-fuzzy-parsing-experimental
Apr 17, 2026

Conversation

Collaborator

@jatezzz jatezzz commented Apr 15, 2026

Description

This PR improves the accuracy and maintainability of the computer vision to XML pipeline.

  • Updated FuzzyAttributeParser to include comprehensive regex-based sanitization for common OCR typos (e.g., misreading '0' as 'o', '5' as 's') and common UI label typos (e.g., "usemame" to "Username").
  • Introduced strict fuzzy matching for categorical XML values such as inputType, gravity, and textStyle to ensure valid Android XML output.
  • Refactored YoloToXmlConverter to reduce cyclomatic complexity. The large mapping and appending logic was broken down into smaller, specialized functions (assignTextToParents, matchAnnotationsToElements, appendTextViewAttributes, etc.).
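
The strict fuzzy matching for categorical values described above could look roughly like the sketch below. It is not the PR's code: `closestCategory`, the hand-rolled `levenshtein` helper, and the `maxDistance` threshold of 2 are all assumptions made for illustration (the PR itself appears to rely on a `FuzzySearch.ratio` call, whose exact usage is not shown here). The idea is to snap an OCR reading to the nearest allowed XML value, and to reject it when nothing is close enough.

```kotlin
// Hypothetical sketch, not the PR's implementation.
// Plain Levenshtein edit distance, computed with two rolling rows.
fun levenshtein(a: String, b: String): Int {
    var prev = IntArray(b.length + 1) { it }
    for (i in 1..a.length) {
        val curr = IntArray(b.length + 1)
        curr[0] = i
        for (j in 1..b.length) {
            val cost = if (a[i - 1] == b[j - 1]) 0 else 1
            curr[j] = minOf(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost)
        }
        prev = curr
    }
    return prev[b.length]
}

// Returns the allowed value closest to the OCR text, or null when no
// candidate is within maxDistance edits (an assumed threshold).
fun closestCategory(raw: String, allowed: List<String>, maxDistance: Int = 2): String? {
    val needle = raw.trim().lowercase()
    return allowed
        .map { it to levenshtein(needle, it.lowercase()) }
        .filter { it.second <= maxDistance }
        .minByOrNull { it.second }
        ?.first
}
```

For example, calling `closestCategory("bo1d", listOf("normal", "bold", "italic"))` would resolve to `bold`, while an unrelated token falls through to `null` so the caller can drop the attribute instead of emitting invalid XML.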

Details

The changes are logic-related. The parsing improvements and the corrected UI typo mappings were verified through application logs. XML generation now delegates cleanly to component-specific builder methods.

UI_sketch_18 photo_5116610229700332736_y photo_5116610229700332737_y

Ticket

ADFA-3709

@jatezzz jatezzz requested review from a team, Daniel-ADFA and avestaadfa April 15, 2026 17:59
@coderabbitai
Contributor

coderabbitai Bot commented Apr 15, 2026

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: af25b656-6d01-4041-9cdf-5650646eb7ad

📥 Commits

Reviewing files that changed from the base of the PR and between 5727d0d and eb6b697.

📒 Files selected for processing (3)
  • cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/FuzzyAttributeParser.kt
  • cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/MarginAnnotationParser.kt
  • cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/YoloToXmlConverter.kt
📝 Walkthrough

Walkthrough

FuzzyAttributeParser adds regex-based OCR normalization, fuzzy categorical matching, and refined id/number cleaning. YoloToXmlConverter refactors layout generation: scales widgets, assigns OCR text boxes to parents, derives canvas tags, matches annotations to elements, and extracts attribute-emission into new helpers.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Fuzzy attribute parsing: cv-image-to-xml/.../FuzzyAttributeParser.kt | Added precompiled regexes for id/number cleanup and OCR character normalization, implemented matchCategoricalValue(...), updated cleanValue(...) and cleanId(...) (including btmbtn fix), and simplified delimiter handling and unsafe !! usages. |
| YOLO → XML conversion pipeline: cv-image-to-xml/.../YoloToXmlConverter.kt | Refactored generateXmlLayout flow: scale widgets (exclude widget_tag), assign OCR text boxes to parent boxes, derive canvas tags, match annotations to elements via new matchAnnotationsToElements, and moved view attribute logic into appendTextViewAttributes, appendEditTextAttributes, appendImageViewAttributes. Defaults for missing width/height now use scaled box dims. |
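
To make the "assign OCR text boxes to parent boxes" step concrete, here is a minimal sketch under stated assumptions. The `Box` type, the function name `assignTextToParentsSketch`, and the containment rule (a text box belongs to the smallest widget that contains its center) are all hypothetical; the summary above does not say which rule the real `assignTextToParents` uses.

```kotlin
// Hypothetical types and logic, for illustration only.
data class Box(
    val label: String,
    val left: Float,
    val top: Float,
    val right: Float,
    val bottom: Float
) {
    val centerX get() = (left + right) / 2f
    val centerY get() = (top + bottom) / 2f
    val area get() = (right - left) * (bottom - top)
    fun contains(x: Float, y: Float) = x in left..right && y in top..bottom
}

// Maps each widget to the OCR text boxes whose center falls inside it.
// When several widgets contain the center, the smallest one wins.
// Text boxes with no containing widget are left out of the result.
fun assignTextToParentsSketch(widgets: List<Box>, texts: List<Box>): Map<Box, List<Box>> =
    texts
        .mapNotNull { text ->
            widgets
                .filter { it.contains(text.centerX, text.centerY) }
                .minByOrNull { it.area }
                ?.let { parent -> parent to text }
        }
        .groupBy({ it.first }, { it.second })
```

Picking the smallest containing widget is one plausible way to handle nested boxes; an overlap-ratio rule would be an equally reasonable alternative.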

Sequence Diagram

sequenceDiagram
    participant Converter as YoloToXmlConverter
    participant Scaler as Scaler
    participant TextAssigner as TextAssigner
    participant TagDeriver as TagDeriver
    participant Matcher as AnnotationMatcher
    participant Builder as XmlBuilder
    participant Parser as FuzzyAttributeParser

    Converter->>Scaler: Scale YOLO widget boxes (exclude widget_tag)
    Scaler-->>Converter: scaledBoxes

    Converter->>TextAssigner: assignTextToParents(scaledBoxes, ocrBoxes)
    TextAssigner-->>Converter: parentTextMappings

    Converter->>TagDeriver: derive canvas tags (widget_tag + OCR tag text)
    TagDeriver-->>Converter: canvasTags

    Converter->>Matcher: matchAnnotationsToElements(canvasTags, scaledBoxes, parentTextMappings)
    Matcher-->>Converter: finalAnnotations

    Converter->>Builder: build XML from sortedBoxes + finalAnnotations
    Builder->>Parser: sanitize/normalize attribute values
    Parser-->>Builder: cleanedAttributes
    Builder-->>Converter: xmlLayout

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • Daniel-ADFA
  • avestaadfa
  • hal-eisen-adfa

Poem

🐇 In pixels' hush I hop and parse,

O to 0, and l to 1 I sparse;
Text finds parents, tags align,
Boxes scale and attributes shine—
A tiny rabbit cheers the XML march.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main changes: refactoring OCR fuzzy parsing in FuzzyAttributeParser and modularizing YOLO to XML conversion in YoloToXmlConverter. |
| Description check | ✅ Passed | The description is directly related to the changeset, covering the three main improvements: regex-based sanitization, fuzzy matching for categorical values, and refactoring to reduce complexity. |

Contributor

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/FuzzyAttributeParser.kt (1)

490-492: Consider using kotlin.random.Random for more idiomatic Kotlin.

Math.random() works but kotlin.random.Random.nextInt(1000) is more idiomatic and explicit about the integer range.

♻️ Suggested improvement
+import kotlin.random.Random
+
 // In cleanId function:
-            return "view_${(Math.random() * 1000).toInt()}"
+            return "view_${Random.nextInt(1000)}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/FuzzyAttributeParser.kt`
around lines 490 - 492, Replace the use of Math.random() in the
FuzzyAttributeParser return expression with the idiomatic Kotlin random integer
API: use kotlin.random.Random.nextInt(1000) to generate the 0–999 int and
construct the ID string (i.e., change the expression inside the if in
FuzzyAttributeParser where it returns "view_${(Math.random() * 1000).toInt()}"
to use Random.nextInt(1000)); add an import for kotlin.random.Random if
required.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d745e406-afeb-4484-9b73-4ee1ef882fd6

📥 Commits

Reviewing files that changed from the base of the PR and between 017ffd5 and bc85cf0.

📒 Files selected for processing (2)
  • cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/FuzzyAttributeParser.kt
  • cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/YoloToXmlConverter.kt

@jatezzz jatezzz force-pushed the refactor/ADFA-3709-ocr-fuzzy-parsing-experimental branch from bc85cf0 to 5727d0d Compare April 15, 2026 19:31
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/FuzzyAttributeParser.kt`:
- Line 152: The regex ocrLetterBToSixRegex incorrectly maps OCR 'B' to the digit
6; update the mapping and its name to reflect that 'B' is visually confused with
'8' (e.g., rename to ocrLetterBToEightRegex and keep the pattern Regex("[bB]")),
and then update all usages in extractOcrNumber to replace the old mapping
logic/name so the parser substitutes 'B' -> '8' instead of '6'.
- Around line 475-477: The fallback ID generation in FuzzyAttributeParser (the
block checking FuzzySearch.ratio for "match_parent"/"wrap_content") is
non-deterministic and low-cardinality; replace the Math.random() usage with a
thread-safe incremental counter (e.g., an AtomicInteger field in the class) and
produce IDs like "view_${counter.incrementAndGet()}" to ensure deterministic,
unique IDs and avoid test flakiness and collisions.
- Around line 502-511: The extractOcrNumber function currently normalizes the
entire input string (using ocrLetterOToZeroRegex, ocrLetterIToOneRegex,
ocrLetterZToTwoRegex, ocrLetterSToFiveRegex, ocrLetterBToSixRegex) which
corrupts unit suffixes like "sp"/"dp"; change the logic to detect and strip any
trailing unit suffix before normalization (e.g., check for common units such as
"sp", "dp", "px"), apply the regex replacements only to the stripped numeric
part, run numberExtractionRegex on that normalized numeric substring, and then
reattach/return the numeric value (or use it downstream) without having altered
the original unit suffix; update extractOcrNumber to operate on numericPart and
leave unitPart untouched.
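
The three findings above can be sketched together as follows. This is an illustration only, assuming hypothetical names (`extractOcrDimension`, `Dimension`, `ViewIdGenerator`, `ocrDigitLookalikes`) and a fixed unit list of `dp`, `sp`, and `px`; it is not the code that was merged. It strips the unit suffix before normalizing lookalike letters, maps 'B' to '8', and replaces the random fallback ID with an `AtomicInteger` counter.

```kotlin
import java.util.concurrent.atomic.AtomicInteger

// Assumed lookalike table; note that 'B' maps to '8', not '6'.
val ocrDigitLookalikes = mapOf(
    'o' to '0', 'O' to '0',
    'i' to '1', 'I' to '1',
    'z' to '2', 'Z' to '2',
    's' to '5', 'S' to '5',
    'b' to '8', 'B' to '8'
)

// Assumed set of unit suffixes, matched case-insensitively at the end.
val unitSuffixRegex = Regex("(?i)(dp|sp|px)$")
val digitsRegex = Regex("\\d+")

data class Dimension(val value: Int, val unit: String?)

// Splits off the unit first so that "sp" is never rewritten to "5p",
// then normalizes lookalike letters in the numeric part only.
fun extractOcrDimension(raw: String): Dimension? {
    val trimmed = raw.trim()
    val unit = unitSuffixRegex.find(trimmed)?.value?.lowercase()
    val numericPart = if (unit != null) trimmed.dropLast(unit.length) else trimmed
    val normalized = numericPart.map { ocrDigitLookalikes[it] ?: it }.joinToString("")
    val value = digitsRegex.find(normalized)?.value?.toIntOrNull() ?: return null
    return Dimension(value, unit)
}

// Thread-safe, sequential fallback IDs instead of Math.random().
class ViewIdGenerator {
    private val counter = AtomicInteger(0)
    fun next(): String = "view_${counter.incrementAndGet()}"
}
```

With this ordering, an input such as `16sp` keeps its unit intact; normalizing the whole string first would turn the suffix into `5p` and let the stray `5` leak into the number.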

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1f8c18e6-85e4-44d2-a113-07314d69b577

📥 Commits

Reviewing files that changed from the base of the PR and between bc85cf0 and 5727d0d.

📒 Files selected for processing (2)
  • cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/FuzzyAttributeParser.kt
  • cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/YoloToXmlConverter.kt
🚧 Files skipped from review as they are similar to previous changes (1)
  • cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/YoloToXmlConverter.kt

@jatezzz jatezzz force-pushed the refactor/ADFA-3709-ocr-fuzzy-parsing-experimental branch from f04ef0f to 5ba557d Compare April 16, 2026 21:10
@jatezzz
Collaborator Author

jatezzz commented Apr 16, 2026

@coderabbitai

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 16, 2026

The PR description images show:

  • UI_sketch_18: A hand-drawn layout sketch with two EditText fields ("Your name", "User name"), a "Finish" Button (B-1, B-2), and a Switch (SW-1), each annotated with layout attributes like match_parent, 150dp, inputType: text, id: btm_finish, etc.
  • photo_5116610229700332736_y: The "Generate XML" screen on an Android tablet showing the sketch with green bounding boxes overlaid on detected UI elements (EditTexts, Button, Switch), with OCR-recognized labels and margin guides visible.
  • photo_5116610229700332737_y: The generated activity_main.xml in AndroidIDE, showing a LinearLayout containing two EditText elements (hints: "Your name", "User name"), a Button (text: "Finish"), and a Switch (checked: true) — all with pixel-scaled dp dimensions and tools:ignore="HardcodedText".

@jatezzz, it looks like you tagged me without a specific request! Here's what I can help with for PR #1185:

  • 🔍 Trigger a full review of the changes in FuzzyAttributeParser.kt and YoloToXmlConverter.kt
  • 💬 Answer questions about the PR changes
  • 🐛 Investigate specific concerns about the OCR fuzzy parsing or YOLO-to-XML refactoring

Would you like me to proceed with a full review, or is there something specific you'd like me to look at?


🧠 Learnings used
Learnt from: hal-eisen-adfa
Repo: appdevforall/CodeOnTheGo PR: 1176
File: app/src/androidTest/kotlin/com/itsaky/androidide/helper/DevicePermissionGrantUiHelper.kt:47-76
Timestamp: 2026-04-15T01:58:44.608Z
Learning: In `appdevforall/CodeOnTheGo`, `clickFirstAccessibilityNodeByText` in `app/src/androidTest/kotlin/com/itsaky/androidide/helper/DevicePermissionGrantUiHelper.kt` pre-filters via `findAccessibilityNodeInfosByText` (case-insensitive text containment) before applying the caller's `matchBy` predicate. This is intentional and correct for all current callers because the targeted buttons (NEXT, accept, grant) have visible rendered text that is matched by the pre-filter. Do not flag the pre-filter as missing contentDescription-only nodes — that scenario does not apply to these UI elements.
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@jatezzz jatezzz force-pushed the refactor/ADFA-3709-ocr-fuzzy-parsing-experimental branch from 6ad9ed5 to f4995f8 Compare April 17, 2026 14:03
@jatezzz jatezzz force-pushed the refactor/ADFA-3709-ocr-fuzzy-parsing-experimental branch from f869f7c to 8e2974f Compare April 17, 2026 14:20
jatezzz added 5 commits April 17, 2026 11:11
Add regex-based typo sanitization and extract XML view generation into dedicated functions.
Rely entirely on the existing escapeXmlAttr method for XML safety and drop hardcoded OCR typo fixes.
@jatezzz jatezzz force-pushed the refactor/ADFA-3709-ocr-fuzzy-parsing-experimental branch from 8e2974f to eb6b697 Compare April 17, 2026 16:11
@jatezzz jatezzz merged commit 724374d into stage Apr 17, 2026
2 checks passed
@jatezzz jatezzz deleted the refactor/ADFA-3709-ocr-fuzzy-parsing-experimental branch April 17, 2026 16:23