Clairvoyance

A web app that extracts and validates data from images. Text is read from each image with Tesseract OCR.
The app extracts the following types of data from the recognized text using regular expressions, then validates them where applicable:

->ID numbers
->credit card numbers
->license plate IDs
->dates
->emails
->domains
->URLs
->hashes
->combolists
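
The extract-then-validate flow described above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the two patterns, the `EMAIL` and `CREDIT_CARD` type labels, and the helper names `luhn_valid` and `extract_findings` are all assumptions made for the example. Credit card candidates are checked with the Luhn checksum, which is one common way to validate card numbers.

```python
import re

# Hypothetical patterns for illustration; the app's real regexes may differ.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def extract_findings(text: str) -> list[dict]:
    """Extract emails and Luhn-valid card numbers from OCR text."""
    findings = []
    for match in EMAIL_RE.finditer(text):
        # Emails are reported as matched; no further validation here.
        findings.append({"value": match.group(), "type": "EMAIL"})
    for match in CARD_RE.finditer(text):
        # Card candidates are kept only if the checksum passes.
        if luhn_valid(match.group()):
            findings.append({"value": match.group(), "type": "CREDIT_CARD"})
    return findings
```

Splitting matching from validation keeps each regex loose enough to tolerate OCR spacing, while the checksum filters out digit runs that only look like card numbers.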

A successful JSONResponse has the form:

{
  "content": "This is a lot of 12 point text to test the\nocr code and see if it works on all types\nof file format.\n\nThe quick brown dog jumped over the\nlazy fox. The quick brown dog jumped\nover the lazy fox. The quick brown dog\njumped over the lazy fox. The quick\nbrown dog jumped over the lazy fox.",
  "status": "successful",
  "findings": [
    {
      "value": "12 point ",
      "type": "DATE"
    }
  ]
}
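
A client might consume a response of this shape as sketched below. The field names come from the example above; the helper name `group_findings` is hypothetical, and the handling of a non-successful status is an assumption, since the failure response format is not documented here.

```python
import json
from collections import defaultdict


def group_findings(response_body: str) -> dict[str, list[str]]:
    """Group the values in `findings` by their `type` field."""
    payload = json.loads(response_body)
    # Assumption: any status other than "successful" signals a failure.
    if payload.get("status") != "successful":
        raise ValueError(f"unexpected status: {payload.get('status')!r}")
    grouped = defaultdict(list)
    for finding in payload.get("findings", []):
        grouped[finding["type"]].append(finding["value"])
    return dict(grouped)


# Shortened version of the sample response shown above.
sample = (
    '{"content": "This is a lot of 12 point text", '
    '"status": "successful", '
    '"findings": [{"value": "12 point ", "type": "DATE"}]}'
)
print(group_findings(sample))  # {'DATE': ['12 point ']}
```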

TO RUN:
You need Docker installed on your machine.
To start the containers, run:
docker compose up

By default the app listens on localhost:8000, and the upload endpoint is http://localhost:8000/api/v1/upload/.
You can upload any image format that Tesseract OCR supports. Requests to the endpoint must use the multipart/form-data content type.

Postman is recommended for uploading images, but you can also upload a file with curl:
curl -X POST "http://localhost:8000/api/v1/upload/" -H "accept: application/json" -H "Content-Type: multipart/form-data" -F "file=@test.png;type=image/png"
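
The same upload can also be scripted. The sketch below uses only the Python standard library and mirrors the curl command above: the endpoint URL and the `file` form field come from that command, while the helper names `build_multipart` and `upload_image` are hypothetical. It assumes the containers are running on the default port.

```python
import mimetypes
import os
import urllib.request
import uuid

UPLOAD_URL = "http://localhost:8000/api/v1/upload/"


def build_multipart(field: str, filename: str, data: bytes) -> tuple[bytes, str]:
    """Encode one file as multipart/form-data; return (body, content type)."""
    boundary = uuid.uuid4().hex
    # Guess the MIME type from the file extension, with a generic fallback.
    file_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {file_type}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def upload_image(path: str, url: str = UPLOAD_URL) -> bytes:
    """POST the image at `path` to the upload endpoint; return the raw body."""
    with open(path, "rb") as handle:
        body, content_type = build_multipart(
            "file", os.path.basename(path), handle.read()
        )
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": content_type, "Accept": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

With the containers up, `upload_image("test.png").decode()` should return the JSON response as text.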
