DLP System using computer vision and Tesseract LSTM

angelorodem/SuspiciousEyes

Suspicious Eyes Image DLP Example

This project is a very simple demonstration of an image DLP (data loss prevention) system. It works on text content in images and on image identifiers.

Text in images

The first feature uses Tesseract OCR to locate words in the image. A simple algorithm then assembles the found words into sentences, and Perl-compatible regular expressions (PCRE) are applied to the assembled text to search for matches against the DLP rules.
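The assemble-then-match step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Word` record, the `assemble_text` grouping heuristic, the `line_tolerance` parameter, and the `secret-id` rule are all hypothetical, and Python's `re` module stands in for a PCRE engine. The word boxes are assumed to come from an OCR pass that reports each word with its position.

```python
import re
from dataclasses import dataclass


@dataclass
class Word:
    """One OCR result: the recognized text and its bounding-box position (hypothetical shape)."""
    text: str
    left: int
    top: int
    height: int


def assemble_text(words, line_tolerance=0.5):
    """Group word boxes into lines by vertical position, then read each line left to right."""
    lines = []  # each entry is the list of words that share one text line
    for w in sorted(words, key=lambda w: w.top):
        # A word joins the current line if its top edge is close to the line's first word.
        if lines and abs(w.top - lines[-1][0].top) <= line_tolerance * lines[-1][0].height:
            lines[-1].append(w)
        else:
            lines.append([w])
    return " ".join(w.text for line in lines for w in sorted(line, key=lambda w: w.left))


# Hypothetical DLP rule set: rule name -> compiled pattern.
RULES = {
    "secret-id": re.compile(r"\b[a-z]{4}-[0-9a-z]{5}-[a-z]{2}\d{2}\b"),
}


def find_violations(text):
    """Return (rule name, matched text) for every rule match in the assembled text."""
    return [(name, m.group(0)) for name, rx in RULES.items() for m in rx.finditer(text)]


if __name__ == "__main__":
    words = [
        Word("abcd-1a2b3-uu77", 0, 30, 12),
        Word("id", 60, 11, 12),
        Word("secret", 0, 10, 12),
        Word("is", 80, 9, 12),
    ]
    text = assemble_text(words)
    print(text)
    print(find_violations(text))
```

Grouping by vertical position before sorting by horizontal position keeps words from different lines from being interleaved, which matters because a rule can only match when the words of a sentence end up adjacent in the assembled text.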

Example 1

Text recognition

Assembled text: " AM Android is the world's most popular mobile platform. With Android you can use all the Google apps you know and love, plus there are more than 600,000 apps and games available on Google Play to keep you entertained, alongside millions of songs and books, and thousands of movies. Android devices are already smart, and will only get smarter, with new features you won't find on any other platform, letting you focus on what's important and putting you in control of your mobile experience."

DLP Rule Detected rule

Example 2

Text image

Assembled text: " A . Teste <Hey, im leaking this secret info about Joshep jhonson, his secret id is abcd-1a2b3-uu77"

Rule Detection
Note: recognition is limited for dark text on a dark background.

Images in images

This feature searches for one image inside other images. It can be used to find documents tagged with special markers, or simply to find some common item.
The results below show detection of distorted, skewed, and rotated markers, all done with an algorithm called SURF (Speeded-Up Robust Features).
A B C D E F G H
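Feature-based detectors such as SURF describe each keypoint with a numeric descriptor vector, and a marker is then located by matching its descriptors against those of the scene image. The sketch below illustrates only that matching step, using a nearest-neighbour ratio test, which is one common choice and is an assumption here rather than something the project states it uses. The function names, the `ratio` and `min_matches` values, and the tiny two-dimensional descriptors are hypothetical; real SURF descriptors have 64 or 128 dimensions and would come from a computer vision library.

```python
import math


def match_descriptors(marker_desc, scene_desc, ratio=0.75):
    """Return (marker_index, scene_index) pairs whose best match passes the ratio test."""
    matches = []
    for i, d in enumerate(marker_desc):
        # Rank all scene descriptors by Euclidean distance to this marker descriptor.
        ranked = sorted((math.dist(d, s), j) for j, s in enumerate(scene_desc))
        if len(ranked) < 2:
            continue  # the ratio test needs a second-best candidate
        (best, j), (second, _) = ranked[0], ranked[1]
        # Accept only matches that are clearly better than the runner-up.
        if best < ratio * second:
            matches.append((i, j))
    return matches


def marker_found(marker_desc, scene_desc, min_matches=2):
    """Declare the marker present when enough descriptors match unambiguously."""
    return len(match_descriptors(marker_desc, scene_desc)) >= min_matches


if __name__ == "__main__":
    marker = [(0.0, 0.0), (10.0, 10.0)]
    scene_with_marker = [(0.1, 0.0), (10.0, 10.1), (5.0, 5.0)]
    scene_without_marker = [(5.0, 5.0), (5.0, 5.1)]
    print(marker_found(marker, scene_with_marker))
    print(marker_found(marker, scene_without_marker))
```

Because the descriptors are designed to be stable under rotation and scaling, matching them rather than raw pixels is what lets this kind of pipeline tolerate rotated or resized markers.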
