Enhancing Model Robustness for Argentine License Plate OCR in Varied Lighting Conditions #9

Closed
yihong1120 opened this issue Dec 18, 2023 · 10 comments


@yihong1120

Dear Maintainers,

I hope this message finds you well. I have been exploring the remarkable work done on the Argentine License Plate OCR repository and am thoroughly impressed with the system's performance, particularly given the constraints of embedded system deployment.

However, I have observed that the model's performance can be significantly impacted by varying lighting conditions, which is a common scenario in real-world applications. In low-light or overexposed environments, the accuracy of character recognition appears to be compromised.

Given the importance of reliable license plate recognition across all times of day and under diverse lighting, I believe enhancing the model's robustness in this aspect could greatly improve its utility. Here are a few suggestions that might help address this issue:

  1. Dynamic Range Adjustment: Implementing an algorithm to normalize the lighting of the input images could help the model perform consistently. This could involve techniques like histogram equalization or adaptive gamma correction (a short sketch follows this list).

  2. Lighting Augmentation in Training: To better prepare the model for different lighting scenarios, we could introduce a wider range of lighting conditions in the training data augmentation pipeline. This might include simulating underexposure, overexposure, and shadow effects.

  3. Dedicated Low-light Model: Training a specialized model on a dataset predominantly composed of low-light images might yield a more robust performance in such conditions. This model could either be used in tandem with the primary model or be triggered based on the detected lighting conditions.

  4. Inference-time Preprocessing: Incorporating a preprocessing step during inference to adjust the lighting of the input images could be another approach. This step would aim to bring the image closer to the model's "comfort zone."
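
To make suggestions (1) and (4) concrete, here is a minimal sketch of the kind of lighting normalization I have in mind, using plain OpenCV; the function name and parameter choices are illustrative only, not anything that exists in the repo:

```python
import cv2
import numpy as np


def normalize_lighting(bgr_image: np.ndarray, target_mean: float = 0.5) -> np.ndarray:
    """CLAHE followed by adaptive gamma correction on a plate crop (illustrative sketch)."""
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)

    # Contrast Limited Adaptive Histogram Equalization evens out local contrast.
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    equalized = clahe.apply(gray)

    # Adaptive gamma: choose gamma so the mean intensity moves toward mid-gray,
    # brightening underexposed crops and darkening overexposed ones.
    mean = equalized.mean() / 255.0
    gamma = np.log(target_mean) / np.log(min(max(mean, 1e-3), 0.999))
    lut = np.array([(i / 255.0) ** gamma * 255 for i in range(256)], dtype=np.uint8)
    return cv2.LUT(equalized, lut)
```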

I would be keen to contribute to this enhancement, whether through dataset curation, model training, or developing preprocessing algorithms. I believe that with a collaborative effort, we can achieve a more resilient OCR system that performs reliably in all lighting conditions.

Thank you for considering my suggestions. I look forward to your thoughts and any guidance on how I might assist in this endeavour.

Best regards,
yihong1120

@ankandrew
Owner

Hi,

Thanks for writing this, I think what you are proposing here is really cool. We could start with proposed idea (2) and fall back to the other ideas if it doesn't satisfy us. I also had some ideas in mind for this repo:

  1. Look into using a more modern architecture (if it's valuable)
  2. Make this a more language-universal OCR by training it with generated data, i.e. this.
  3. Upgrade to Keras 3.0

Regarding the lighting conditions, we probably need a larger dataset to validate our results. Maybe we can gather a dataset beyond the Argentine plates, or I can look into publishing one (it will take me time).

Also, which images/dataset of low-light plates did you try the OCR on? If you can post some examples, that would be great!

@biodatasciencearg

biodatasciencearg commented Feb 20, 2024

Hi, everyone! I'm impressed with the performance of this model. I want to contribute a small dataset (~2K images) that I extracted from MercadoLibre. Do you have a repo with the data?

@ankandrew
Owner

Hi @biodatasciencearg! That contribution would be welcome :). I can publish a new release with that dataset. Perhaps you can upload it right here in the comments as a .zip file.

@biodatasciencearg

Hello, I'm sending you a Drive link because the file is too big to upload here! Let me know when you've downloaded it so I can delete it!
https://drive.google.com/file/d/1zKXjo6i2m0xdLCti793VI2Dy0ewz_kkb/view?usp=drive_link

@ankandrew
Owner

Hi!

re @yihong1120: I've reworked the repo with some changes aligned with what you mentioned. Related to your second point, I started using Albumentations, so many more augmentations can now be used out of the box. Some, but not all, are shown in their demo https://demo.albumentations.ai/.
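
For reference, here is a sketch of the kind of lighting-focused pipeline this enables; the specific transforms and ranges below are illustrative picks, not the exact defaults used in the repo:

```python
import albumentations as A

# Illustrative lighting-focused pipeline; transforms and ranges are not the repo defaults.
lighting_augmentation = A.Compose(
    [
        A.RandomBrightnessContrast(brightness_limit=0.4, contrast_limit=0.4, p=0.5),
        A.RandomGamma(gamma_limit=(50, 150), p=0.5),  # simulates under-/overexposure
        A.RandomShadow(p=0.3),                        # casts partial shadows over the plate
        A.CLAHE(p=0.2),                               # occasional local contrast equalization
    ]
)

# Usage: augmented = lighting_augmentation(image=plate_bgr)["image"]
```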

re @biodatasciencearg: Great, thanks for the contribution! I've also released the dataset I used to train the original model, see this. I downloaded your dataset, aligned it with the new format and uploaded it to releases https://github.com/ankandrew/fast-plate-ocr/releases/tag/arg-plates. However, I noticed performance on it is not great, so we should definitely retrain with both datasets combined. Feel free to re-train and I will upload it to the hub; otherwise I'll see if I have time to do it.

@ankandrew
Owner

Since I didn't include any "zoom out" augmentation and my training dataset plates were all cropped roughly the same way, that seems to be the reason for the poor performance on @biodatasciencearg's dataset. I compared the same image, just cropped more tightly (so it aligns better with the training dataset).

Image example: 826988-MLA54910508732_042023.jpg

Original image
[screenshot]

Confidence: [0.11418319 0.2238257  0.06423701 0.13651179 0.12144833 0.13031216 0.6558619 ]

Cropped image
[screenshot]

Confidence: [0.78425103 0.74215186 0.672121   0.774952   0.7979243  0.78554845 0.8182486 ]

I guess some more augmentation can be introduced to make this more robust for different crops.
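
For example, a "zoom out" style transform along these lines (the transform choice and ranges are just an illustration) would shrink the plate inside the frame so loosely cropped inputs look more like the training crops:

```python
import albumentations as A

# Illustrative "zoom out" augmentation: scaling below 1.0 shrinks the plate inside
# the frame, mimicking plates cropped with extra background around them.
zoom_out_augmentation = A.Compose(
    [
        A.Affine(scale=(0.6, 1.0), p=0.5),
    ]
)
```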

@ankandrew reopened this Apr 15, 2024
@biodatasciencearg

biodatasciencearg commented Apr 22, 2024

Hi everyone! @ankandrew In fact, when you have a distorted license plate, recognition is difficult. I even tried this service https://portal.vision.cognitive.azure.com/demo/extract-text-from-images and it fails on very tilted license plates.

But to be honest, you just need to get the ROIs with this implementation: https://github.com/claudiojung/iwpod-net. After that, this model works perfectly and with low latency!

@ankandrew
Owner

Hi everyone! @ankandrew In fact, when you have a distorted license plate, recognition is difficult. I even tried this service https://portal.vision.cognitive.azure.com/demo/extract-text-from-images and it fails on very tilted license plates.

But to be honest, you just need to get the ROIs with this implementation: https://github.com/claudiojung/iwpod-net. After that, this model works perfectly and with low latency!

I agree; to get the best performance, plates should be cropped properly (as was my original design choice). Closing this.
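
For anyone following along, a minimal sketch of the crop/rectify step discussed above, assuming a detector such as iwpod-net has already produced the plate's four corner points; the corner ordering and target size here are assumptions, and feeding the warped crop to the OCR model is left out:

```python
import cv2
import numpy as np


def rectify_plate(frame_bgr: np.ndarray, corners: np.ndarray,
                  out_w: int = 140, out_h: int = 70) -> np.ndarray:
    """Warp a tilted plate into an axis-aligned crop ready for the OCR model.

    `corners` is assumed to hold the four plate corners in the order
    top-left, top-right, bottom-right, bottom-left.
    """
    src = corners.astype(np.float32)
    dst = np.float32([[0, 0], [out_w, 0], [out_w, out_h], [0, out_h]])
    matrix = cv2.getPerspectiveTransform(src, dst)
    return cv2.warpPerspective(frame_bgr, matrix, (out_w, out_h))
```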

@biodatasciencearg

I am considering the possibility of doing the data augmentation externally, which could potentially include GANs for generating license plates with lighting changes.

Therefore, I would like to ask:

Is it possible to turn off the data augmentation?
Does the input resolution in the current solution have to be 140x70? In the provided dataset I see that this is not the case, and the images are even in RGB. Is the conversion to B&W happening in the augmentation step?
Does the resolution of motorcycle images also get adjusted to the same resolution?
Thank you very much!
elias

@ankandrew
Owner

ankandrew commented Sep 8, 2024

Hi!

Is it possible to turn off the data augmentation?

The current way to do it is to pass an empty augmentation pipeline, so it doesn't use the default one. See this.
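
In other words, something like the sketch below; how the pipeline is actually passed to the training script depends on the repo's train configuration:

```python
import albumentations as A

# An "empty" pipeline: images pass through unchanged, overriding the default augmentations.
no_augmentation = A.Compose([])
```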

Does the input resolution in the current solution have to be 140x70?

Nope, the size can be anything. These numbers are derived from my original dataset stats.

Is the conversion to B&W happening in the augmentation step?

In the preprocessing phase, but I should change this and make it configurable. I will modify this in the code later.

Does the resolution of motorcycle images also get adjusted to the same resolution?

Yep, applies to all images.
