Skip to content

refactor: use Point struct across the native vision stack#1130

Open
chmjkb wants to merge 3 commits intomainfrom
@chmjkb/native-point-refactor
Open

refactor: use Point struct across the native vision stack#1130
chmjkb wants to merge 3 commits intomainfrom
@chmjkb/native-point-refactor

Conversation

@chmjkb
Copy link
Copy Markdown
Collaborator

@chmjkb chmjkb commented May 8, 2026

Description

Refactors the native code so it uses unified Point instead of random structures and pair arrays that were scattered all over the codebase.

Introduces a breaking change?

  • Yes
  • No

Type of change

  • Bug fix (change which fixes an issue)
  • New feature (change which adds functionality)
  • Documentation update (improves or adds clarity to existing documentation)
  • Other (chores, tests, code style improvements etc.)

Tested on

  • iOS
  • Android

Testing instructions

Screenshots

Related issues

#760
#1110

Checklist

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have updated the documentation accordingly
  • My changes generate no new warnings

Additional notes

@chmjkb chmjkb marked this pull request as ready for review May 8, 2026 06:17
@msluszniak
Copy link
Copy Markdown
Member

msluszniak commented May 8, 2026

First question, does this handle the scope of this pr: #1126, if so I will close it and you can mark another issue to close ;)

EDIT: Ok, I see it's not handling this one, so maybe you can incorporate these changes into your PR?

@msluszniak msluszniak self-requested a review May 8, 2026 07:53
@chmjkb chmjkb linked an issue May 8, 2026 that may be closed by this pull request
@chmjkb
Copy link
Copy Markdown
Collaborator Author

chmjkb commented May 8, 2026

ok, so I'll cherry-pick commits from #1126 PR and then we'll close it? does it sound fine? @msluszniak

@msluszniak
Copy link
Copy Markdown
Member

msluszniak commented May 8, 2026

ok, so I'll cherry-pick commits from #1126 PR and then we'll close it? does it sound fine? @msluszniak

Yes, probably this would need some tweaking since I hadn't used Point struct.

Resolves #760. The OCR and VerticalOCR pipelines previously exposed all four
rotated-rectangle corners in OCRDetection.bbox. Two points (top-left and
bottom-right of the axis-aligned bounding box) are sufficient for downstream
rendering and are simpler to consume.

Changes:
- Types.h: shrink OCRDetection.bbox from std::array<Point,4> to std::array<Point,2>
- RecognitionHandler.cpp: compute AABB (min/max x,y) over the four detector
  corners instead of forwarding them verbatim
- VerticalOCR.cpp: same AABB reduction in _processSingleTextBox
- OCR.cpp / VerticalOCR.cpp generateFromFrame: re-normalize the two bbox
  corners after inverseRotatePoints to guarantee bbox[0] <= bbox[1]
- JsiConversions.h: serialize 2 points instead of 4 to JavaScript
- OCRTest.cpp / VerticalOCRTest.cpp: assert size==2 and that bbox[1] >= bbox[0]
- ocr.ts: narrow TypeScript type from Point[] to [Point,Point] and update docs
Comment on lines -64 to 74
interface Point {
x: number;
y: number;
interface Bbox {
x1: number;
y1: number;
x2: number;
y2: number;
}

interface OCRDetection {
bbox: Point[];
bbox: Bbox;
text: string;
score: number;
}
```
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is breaking change, right? please change PR title and description

@msluszniak
Copy link
Copy Markdown
Member

Also could you write some basic testing instructions, namely which models are affected, what I specifically need to check?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Refactor native code to use Point instead of x, y values Refactor code of OCR and VerticalOCR to utilise only two points in bounding boxes

3 participants