zockerBoy - AD Detection

Usage

cd to the working directory "zockerBoy"
Install the requirements using the following command:
```
pip install -r requirements.txt
```

Run the program with the following command:

python main.py --i <location of image to be used> --d <device to be used, 0 for cpu, 1 for cuda device>

Example images are provided inside the "image" folder.

Example Usage:

python main.py --i image\\test_ad.jpg --d 1
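
For readers curious how the two flags might be handled, here is a minimal sketch using Python's standard argparse module. The function names build_parser and device_from_flag, the default of 0 for --d, and the mapping of 1 to the string "cuda" are assumptions made for illustration; they are not taken from main.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the two flags shown in the usage line (hypothetical layout)."""
    parser = argparse.ArgumentParser(description="Ad detection on a single image")
    parser.add_argument("--i", required=True, help="location of the image to be used")
    parser.add_argument("--d", type=int, choices=[0, 1], default=0,
                        help="device to be used: 0 for cpu, 1 for cuda")
    return parser


def device_from_flag(flag: int) -> str:
    """Map the numeric --d flag to a device name (assumed mapping)."""
    return "cuda" if flag == 1 else "cpu"


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--i", "image/test_ad.jpg", "--d", "1"])
print(args.i, device_from_flag(args.d))
```

Restricting --d to the choices 0 and 1 means a typo such as --d 2 fails at parse time with a clear message rather than later, when a model is loaded.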

Improvements:

Logo Detection has not been implemented:
- Training was taking too long, and no suitable open-source dataset was found that gave enough accuracy to be worth adding.
- Good datasets were hundreds of GB in size.
Adding an LLM to the output:
- Adding an LLM would:
  - Make the output easier to understand.
  - Help convert hex codes into color names.
  - Help people who are colorblind by applying the necessary conversions.
  - Make it a more interactive experience.
Adding a GUI:
- Gradio could be added for a better GUI.
Adding better text post-processing techniques:
- Regular expressions (Python's re module) could be used to keep text that makes sense and filter out nonsense and single-letter text.
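
The text post-processing idea above could look roughly like the sketch below. The rule used here (keep a token only if it contains a run of at least two letters or digits) is an assumed heuristic chosen for illustration, and clean_ocr_text is a hypothetical name; the project does not currently implement this.

```python
import re

# Assumed heuristic: a token is worth keeping if it contains at least
# two consecutive letters or digits. Single letters and stray symbols fail.
_TOKEN_OK = re.compile(r"[A-Za-z0-9]{2,}")


def clean_ocr_text(raw: str) -> str:
    """Drop single-letter tokens and symbol-only noise from OCR output."""
    kept = [token for token in raw.split() if _TOKEN_OK.search(token)]
    return " ".join(kept)


print(clean_ocr_text("SALE a | 50% ~~ off x today!"))  # prints "SALE 50% off today!"
```

This also drops legitimate one-letter words such as "a" or "I", which matches the stated goal of filtering single-letter text but may be too aggressive for full sentences.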

Path Taken

Started off with Color Palette detection:
1. Used Color Thief
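
A minimal sketch of palette extraction with Color Thief, assuming the colorthief Python package is installed. The helper names rgb_to_hex and extract_palette and the default of five colors are illustrative choices, not the project's actual code.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]


def rgb_to_hex(rgb: RGB) -> str:
    """Format an (r, g, b) tuple as a lowercase hex code such as '#ff0080'."""
    r, g, b = rgb
    return f"#{r:02x}{g:02x}{b:02x}"


def extract_palette(image_path: str, color_count: int = 5) -> List[str]:
    """Return the dominant colors of an image as hex codes."""
    # Imported lazily so rgb_to_hex can be used without the third-party package.
    from colorthief import ColorThief

    palette = ColorThief(image_path).get_palette(color_count=color_count)
    return [rgb_to_hex(color) for color in palette]


print(rgb_to_hex((255, 0, 128)))  # prints "#ff0080"
```

Calling extract_palette on one of the example images would return a list of hex strings; the exact colors depend on the image.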
Then moved on to text overlay detection:
1. Tried using pytesseract natively
  1. Had issues implementing it due to bad OCR results
2. Used cv2 to convert the image to black and white (better results than RGB)
3. Used cv2 to draw boxes around the text and compile them
4. Used pytesseract to detect text from those boxes
  1. Had to experiment with the config to see which method gives the most matches based on e-media
  2. Added thresholding to make sure only the high-confidence matches get through
  3. Had to experiment with the threshold to see what works best
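
The steps above can be sketched as follows. This is a simplified version under stated assumptions: it uses Otsu thresholding for the black-and-white conversion, relies on Tesseract's own word boxes rather than separately drawn cv2 boxes, and picks "--psm 11" and a confidence cutoff of 60 as placeholder values. The function names are hypothetical; the real config and threshold were found by experimentation.

```python
from typing import Dict, List


def filter_by_confidence(data: Dict[str, list], min_conf: float = 60.0) -> List[str]:
    """Keep words whose OCR confidence is at least min_conf.

    `data` is expected to have the shape returned by
    pytesseract.image_to_data(..., output_type=Output.DICT):
    parallel lists under the keys "text" and "conf".
    """
    words = []
    for text, conf in zip(data["text"], data["conf"]):
        # conf may be an int, float, or str depending on the version;
        # -1 marks boxes that contain no recognised text.
        if text.strip() and float(conf) >= min_conf:
            words.append(text.strip())
    return words


def ocr_image(image_path: str, min_conf: float = 60.0) -> List[str]:
    """Binarise an image, run Tesseract on it, and keep confident words."""
    # Third-party imports are kept local so the filter above is usable alone.
    import cv2
    import pytesseract
    from pytesseract import Output

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Otsu's method picks the black/white cutoff automatically.
    _, bw = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    data = pytesseract.image_to_data(bw, config="--psm 11", output_type=Output.DICT)
    return filter_by_confidence(data, min_conf)
```

Keeping the confidence filter separate from the OCR call makes it easy to try different thresholds on the same Tesseract output without re-running recognition.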
Object Detection:
1. Tried using YOLOX
  1. Dependencies weren't resolving and wheels weren't building
    1. Faced severe Python interpreter problems, had to re-do the path
  2. Tried moondream
    1. The specified models had tensor values that varied too much; couldn't find a suitable sigmoid loss model to fit it into
  3. Tried MMDetection
    1. It's also viable for commercial usage
    2. Had to build gcc/MinGW
    3. Had to build MMDetection and its sub-packages from source to work with CUDA 12.1 and my version of cuDNN
      1. Built core labelling and identification
      2. Built tracking
      3. Tested tracking on images and videos with boxes drawn
        
        Used an instance of YOLOX for the same (which is also commercially viable)
  4. Tried YOLOv8
    1. Faster inference
    2. Better trained model
    3. Made an argparser to catch output
      1. Checked the documentation; turns out they already have a feature for that
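
For the YOLOv8 step, a minimal sketch using the ultralytics Python package might look like this. The weights file "yolov8n.pt", the confidence cutoff, and the helper names detect_objects and count_labels are assumptions for illustration; the project's own model and settings may differ.

```python
from collections import Counter
from typing import Dict, List


def detect_objects(image_path: str, weights: str = "yolov8n.pt",
                   use_cuda: bool = False, min_conf: float = 0.25) -> List[Dict]:
    """Run a YOLOv8 model on one image and return label, confidence, and box."""
    # Third-party import kept local so count_labels works without ultralytics.
    from ultralytics import YOLO

    model = YOLO(weights)
    results = model.predict(source=image_path,
                            device=0 if use_cuda else "cpu",
                            conf=min_conf)
    detections = []
    for result in results:
        for box in result.boxes:
            class_id = int(box.cls[0])
            detections.append({
                "label": result.names[class_id],
                "confidence": float(box.conf[0]),
                "xyxy": [float(v) for v in box.xyxy[0].tolist()],
            })
    return detections


def count_labels(detections: List[Dict]) -> Dict[str, int]:
    """Summarise detections as a label -> count mapping."""
    return dict(Counter(d["label"] for d in detections))
```

The result objects already carry class names, confidences, and box coordinates, so no custom output parsing is needed.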
Logo Detection:
1. Tried using YOLOv8:
  1. Converting the dataset into YOLOv8 format was lengthy
  2. The local machine (3060) cannot train efficiently enough
  3. Bad documentation
2. Trying to use YOLOv3/4/5
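
Part of the dataset conversion mentioned in the YOLOv8 attempt is rewriting every bounding box as a YOLO label line: the class index followed by the box center and size, each normalised by the image dimensions. A minimal sketch, assuming the source annotations give pixel corner coordinates (xmin, ymin, xmax, ymax); the function name to_yolo_line is hypothetical.

```python
def to_yolo_line(class_id: int, xmin: float, ymin: float, xmax: float, ymax: float,
                 img_w: int, img_h: int) -> str:
    """Convert a pixel-space corner box into one YOLO-format label line."""
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A 200x100 px box centered in a 400x200 px image.
print(to_yolo_line(0, 100, 50, 300, 150, 400, 200))
# prints "0 0.500000 0.500000 0.500000 0.500000"
```

Each image gets one text file with one such line per object, which is why converting a large logo dataset takes a while.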
