
qrdet sometimes gives wrong quad_xy coords? #7

Open
vladimir-dudnik-1 opened this issue Feb 21, 2024 · 9 comments
@vladimir-dudnik-1 commented Feb 21, 2024

qrdet v2.4 installed from PyPI seems to give wrong quad_xy coords; see the resulting image below.

[image: out.jpg]

This result was obtained from this test image:

[image: test.jpg]

using the code below:
```python
#!/usr/bin/python3
from qrdet import QRDetector
import numpy as np
import cv2

detector = QRDetector(model_size='s')
image = cv2.imread(filename='test.jpg')
detections = detector.detect(image=image, is_bgr=True)

# Draw the detections
for detection in detections:
    x1, y1, x2, y2 = np.array(detection['bbox_xyxy'], np.int32)
    cv2.rectangle(image, (x1, y1), (x2, y2), color=(0, 255, 0), thickness=1)

    (qx1, qy1), (qx2, qy2), (qx3, qy3), (qx4, qy4) = np.array(detection['quad_xy'], np.int32)
    cv2.circle(image, (qx1, qy1), 4, color=(0, 0, 255), thickness=-1)
    cv2.circle(image, (qx2, qy2), 4, color=(0, 0, 255), thickness=-1)
    cv2.circle(image, (qx3, qy3), 4, color=(0, 0, 255), thickness=-1)
    cv2.circle(image, (qx4, qy4), 4, color=(0, 0, 255), thickness=-1)

    confidence = detection['confidence']
    cv2.putText(image, f'{confidence:.2f}', (x1, y1 - 10), fontFace=cv2.FONT_HERSHEY_SIMPLEX,
                fontScale=1, color=(0, 255, 0), thickness=2)

# Save the result
cv2.imwrite(filename='out.jpg', img=image)
```

@Eric-Canas (Owner)

Hi,

You are right. qrdet runs a segmentation model under the hood, so getting the four corners from the segmentation mask is a bit tricky and needs to be improved.

The method I'm using right now is defined in quadrilateral-fitter (a minimal usage sketch below).

That was my first approach, but it's definitely not perfect, and I should think about it again and find a better one.

Please, any ideas are welcome!

Thanks for sharing your test case; I will use it to validate the next quadrilateral-fitter approach.
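
For reference, a minimal sketch of that fitting step, following the quadrilateral-fitter README (the synthetic noisy_polygon is just a stand-in for a real segmentation-mask contour):

```python
import numpy as np
from quadrilateral_fitter import QuadrilateralFitter

# Synthetic stand-in for a noisy segmentation-mask contour: a square
# sampled with 25 jittered points per side.
rng = np.random.default_rng(0)
corners = np.array([[0., 0.], [100., 0.], [100., 100.], [0., 100.]])
noisy_polygon = np.concatenate(
    [np.linspace(corners[i], corners[(i + 1) % 4], 25, endpoint=False)
     for i in range(4)]
) + rng.normal(scale=2.0, size=(100, 2))

# Fit the best quadrilateral to the noisy contour.
fitter = QuadrilateralFitter(polygon=noisy_polygon)
fitted_quad = fitter.fit()  # four (x, y) corners
print(fitted_quad)
```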

@Trichy-man

Hi!

I was reading this paper, Identification of QR Code Perspective Distortion Based on Edge Directions and Edge Projections Analysis (https://www.mdpi.com/2313-433X/6/7/67); maybe you can find it useful. Right now I am trying to implement it and understand how it could fit into the repo.

Thanks for all the work done so far.

@Eric-Canas (Owner)

123948234 million thanks, @Trichy-man!

I'll read it. That's something I tried and failed at.

The yolov8_results_to_dict function is where I'm actually fitting the quadrilateral to each of the detected polygon masks and then building the dictionary that contains each polygon's features.

These dicts are the ones users get when calling detect().
My first plan was to use the accurate_polygon_xy mask to crop the subimage containing the QR, find the finder patterns in that subimage, and then include the coords of those three patterns in that dictionary. Something like... edge_coords_xy: {bl: (x, y), tl: (x, y), tr: (x, y)}

But I never found a reliable way of finding them.

One of the main reasons finding them would be so relevant is that a very useful value could be inferred from them: the QR rotation_degrees. That value would be extremely useful in a lot of applications, for example camera alignment, perspective correction, finding the rotation of objects...

Those are QR use cases that don't even need a readable QR.

I tried it and couldn't find a way. My second thought was to train a second CV model, based on some adaptation of human-pose detection models. But that approach is also quite time-consuming to implement.

If you are able to make edge detection work with that paper's approach, I'll be eternally grateful to you!

@Trichy-man

Hi @Eric-Canas

Probably we could take the subimage containing the QR and use Canny edge detection to 1) find the 1:1:3:1:1 ratio of the finder patterns, 2) compute the centroids, 3) find the 4th vertex, and 4) apply transforms (rotation, perspective, and cylindrical). A rough sketch of step 1 is below.
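
A hedged sketch of what step 1 could look like on a binarized crop (the Otsu threshold, the tolerance, and the row-wise run-length scan are all illustration-level assumptions, not the paper's exact method):

```python
import cv2
import numpy as np

def find_finder_pattern_rows(gray: np.ndarray, tol: float = 0.5) -> list:
    """Scan each row of a binarized QR crop for dark/light run sequences
    matching the 1:1:3:1:1 finder-pattern ratio. Returns (y, x_center) hits."""
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    hits = []
    for y, row in enumerate(binary):
        # Run-length encode the row: boundaries where the pixel value changes.
        changes = np.flatnonzero(np.diff(row)) + 1
        bounds = np.concatenate(([0], changes, [row.size]))
        runs = np.diff(bounds).astype(float)
        starts = bounds[:-1]
        # Test every window of five consecutive runs that starts on a dark run.
        for i in range(len(runs) - 4):
            if row[starts[i]] != 0:
                continue
            window = runs[i:i + 5]
            unit = window.sum() / 7.0  # 1+1+3+1+1 = 7 modules
            if np.all(np.abs(window - np.array([1., 1., 3., 1., 1.]) * unit)
                      <= tol * unit):
                hits.append((y, starts[i] + window.sum() / 2.0))
    return hits
```

Running the same scan over columns and clustering the intersections would give the centroids for step 2.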

@Eric-Canas (Owner)

It looks like a good approach.

Some notes about that:

I think the 4th point should be modified if placed here. Applying the transform is a step that should help with reading the QR, but qrdet's purpose is constrained to detection. QR detection+reading is done in QReader. This way, people using QRs only as key marks don't need the overhead and dependencies implied by the decoding part. QReader actually does that homography when trying to decode, but it is based on the quad_xy coords, which are not stable enough and could be improved this way.

However, something about that 4th point that I think could be super useful would be to directly calculate the transformation matrix and return it in that results_dict. That way, if the user needs to apply the transform to the QR, or even to the full image, we are already returning the matrix (a sketch below).
And it could be used directly in QReader to improve the stability of the decoding.
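
A minimal sketch of what returning that matrix could look like, assuming quad_xy arrives ordered tl, tr, br, bl (the helper name and the fixed output size are hypothetical):

```python
import cv2
import numpy as np

def quad_to_homography(quad_xy, side: int = 256) -> np.ndarray:
    """Hypothetical helper: 3x3 matrix that maps the detected quad onto an
    axis-aligned side x side square (corner order tl, tr, br, bl assumed)."""
    src = np.asarray(quad_xy, dtype=np.float32)
    dst = np.float32([[0, 0], [side, 0], [side, side], [0, side]])
    return cv2.getPerspectiveTransform(src, dst)

# Usage: rectified = cv2.warpPerspective(image, quad_to_homography(quad), (256, 256))
```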

In the future, I think the edge-detection Machine Learning model could work as a fallback for difficult use cases, where a tricky position, occlusions over the QR, crumpled papers... would make it hard to find the finder patterns with the Canny-based method. Having to run the model only in those tricky wild cases where the Canny approach didn't get a result should heavily reduce the overhead, speed up the detection, and improve precision.

@Trichy-man

A fast QR code detector based on a similar idea to the paper's is already present in the OpenCV library: modules -> objdetect -> src -> qrcode.cpp.
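
For reference, that detector is exposed in Python as cv2.QRCodeDetector; a minimal detection-only call looks like this:

```python
import cv2

# cv2.QRCodeDetector wraps modules/objdetect/src/qrcode.cpp.
image = cv2.imread('test.jpg')
detector = cv2.QRCodeDetector()
found, points = detector.detect(image)  # points: (1, 4, 2) corner array
if found:
    print(points.reshape(-1, 2))  # the four detected corners
```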

@Eric-Canas (Owner)

I used it heavily as a help while tagging the dataset. My experience was that it worked quite well for "test cases" where the QR is perfectly and clearly visible, and when it worked, the segmentation was almost pixel-perfect. But it didn't give very good results in the wild, with QRs that are not as clean, big, or flat (which is actually the most common usage of qrdet).

In fact, that's the one I used in the benchmark for QReader.

Do you think it would work well for detecting the corners once the subimage is already cropped? Maybe we can use a subpart of it? Maybe we could even adapt the paper's algorithm to improve the detection rate of the finder patterns by taking advantage of the segmentation mask we have. For example... enhancing the sensitivity of the algorithm, or being a bit more tolerant with that 1:1:3:1:1 ratio, and exploiting the fact that there are likely no other elements in the image besides the QR, since it is already segmented. That way we can discard False Positives by keeping only the three finder patterns that are closest to the edges (sketched after the next paragraph).

Or something along those lines. I think there is a clear advantage in the fact that the original algorithm is intended for detecting QRs in scenarios with other distracting elements that also produce edges when applying Canny, while our case is an oversimplified task: we already have a cropped image containing a QR with no background, and we only want to find where those finder-pattern corners are, in order to calculate rotations and transformation matrices and to give 3 (or 4, if we include the 4th vertex) anchor points that simplify the noisy segmentation mask.
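
A small sketch of that False Positive filter, assuming we already have candidate finder-pattern centers inside the cropped QR (the function and its inputs are illustrative):

```python
import numpy as np

def keep_three_nearest_to_corners(centers, crop_shape):
    """Illustrative filter: keep the three candidate finder-pattern centers
    closest to any corner of the crop, exploiting the fact that the crop
    contains only the already-segmented QR."""
    h, w = crop_shape[:2]
    corners = np.array([[0, 0], [w, 0], [0, h], [w, h]], dtype=float)
    centers = np.asarray(centers, dtype=float)
    # Distance from each candidate to its nearest crop corner.
    dists = np.linalg.norm(centers[:, None] - corners[None], axis=2).min(axis=1)
    return centers[np.argsort(dists)[:3]]
```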

Just to be on the same page: when thinking about the intended use case for qrdet, I think of something like a slightly crumpled paper containing a QR, lying in the middle of the grass. More or less, that kind of "in the wild".

@Trichy-man

There is quite a bit to discuss. Assuming that the quad fitted on the QR is correct, my first idea was to compute the angle of rotation (by fitting a straight line to a side of the quad, again assuming it is fitted correctly) and "guess" the correct rotation by rotating at most 4 times (a small sketch below). Otherwise we can find the FPs (finder patterns) in the cropped image to recover the rotation information.
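
A small sketch of that first idea (the modulo-90 result reflects the "rotate at most 4 times" guess; the corner order in quad_xy is an assumption):

```python
import numpy as np

def quad_rotation_degrees(quad_xy) -> float:
    """Illustrative: estimate the in-plane rotation from one side of the
    fitted quad. Only defined modulo 90 degrees until the finder patterns
    disambiguate which corner is which."""
    (x1, y1), (x2, y2) = quad_xy[0], quad_xy[1]  # assumed top side
    return np.degrees(np.arctan2(y2 - y1, x2 - x1)) % 90.0
```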

As for the perspective and cylindrical transformations, we do not need the FPs, but only the fitted quad (in the cylindrical case we would need a curved quad, so my idea was to fill the whole QR to obtain a single black square and get the edges from that). The only implementation I was able to find for locating FPs was this one, https://github.com/omargamal253/Automatic-Segmentation-and-Alignment-of-QR-Codes/blob/main/IP%20Project/Find_And_Detect_Corners.m, based on black-and-white connected components.

Also, I was reading about Oriented Bounding Boxes (OBB) (https://docs.ultralytics.com/tasks/obb/).

@Eric-Canas (Owner) commented May 25, 2024

Hi!

About substituting the segmentator with Oriented Bounding Boxes: I don't think they would fit, as they can't represent perspective. And along the way we would lose the precise segmentation mask, which could be useful in some cases.

About collecting the finder patterns: if we assume the quad is correctly calculated, we could do that rotation until the straight line fits. In fact, something similar is done in QReader to correct the perspective when decoding the QR. The assumption that the quad coords are correct is not always true, but it is a first approach. They are currently calculated by assigning each point of the mask to one of the four sides of the quadrilateral and fitting a line to each. Sometimes it works, sometimes it doesn't.

Any effort spent now on calculating those transformation matrices from the quad_coords, even though those coords are unreliable right now, would automatically pay off once their calculation is improved by detecting the finder patterns. And at the end of the day, if finding them turns out to be difficult or erratic, I could finally tag them as a dataset and look for a way of adapting a pose-detection model to find those three keypoints. A simple and small yolo-pose-n would likely work for that purpose, as the problem is very constrained (a sketch below).
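
If it came to that, a hypothetical fine-tune with ultralytics could be as small as this ('qr_keypoints.yaml' is an assumed dataset config with kpt_shape: [3, 2], i.e. three x/y keypoints):

```python
from ultralytics import YOLO

# Hypothetical fine-tune of a small pose model to locate the three
# finder-pattern keypoints in cropped QR images.
model = YOLO('yolov8n-pose.pt')
model.train(data='qr_keypoints.yaml', epochs=100, imgsz=640)
```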

So if you find the task of detecting the finder patterns difficult, we can focus on the cylindrical transformation, and I'll just look for a way of training that model.

The part of finding the perspective + rotation transformation matrix is actually done here. The rotation is not completely real, as we don't yet have the finder-pattern information that tags precisely which of the quad_coords are tr, br, bl & tl. But it would be just a minor change once they are found.
