- U^2-Net is used for background removal
- Textcleaner is used for image cleaning and line deskew (max 5 degrees)
- Tesseract is used for text angle rotation
- Deskew is used for line deskew (between 5 and 45 degrees)
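The angle ranges above can be read as a simple routing rule. The sketch below is only an illustration: `choose_rotation_tool` is a hypothetical name, the handling of exactly 5 degrees is an assumption, and the repository may run the tools in sequence rather than picking one.

```python
def choose_rotation_tool(skew_degrees: float) -> str:
    """Return the tool that would cover a given skew angle (hypothetical helper)."""
    angle = abs(skew_degrees)
    if angle <= 5:
        # Small skew: Textcleaner's built-in line deskew (max 5 degrees).
        return "textcleaner"
    if angle <= 45:
        # Medium skew: Deskew (between 5 and 45 degrees).
        return "deskew"
    # Assumption: anything larger is left to the Tesseract text angle rotation step.
    return "tesseract"
```

For example, `choose_rotation_tool(-20)` selects `"deskew"` because only the magnitude of the angle is considered.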
Tested on one document photographed with a smartphone camera at different angles
- Clone the repo
- Download the model: see
app/saved_models/README.md
- Build the Docker image:
docker build -t <REPOSITORY-NAME>/<IMAGE>:<TAG> .
- Test locally: run the Docker image and check that the API is working by opening http://localhost:10000
- CPU:
docker run -it -v $PWD:/LOCAL/ -p 10000:80 <REPOSITORY-NAME>/<IMAGE>:<TAG>
- GPU:
docker run -it --gpus all -v $PWD:/LOCAL/ -p 10000:80 <REPOSITORY-NAME>/<IMAGE>:<TAG>
- Push the Docker image to Docker Hub (optional):
- See https://docs.docker.com/docker-hub/repos/ for account setup
- In Docker Hub, create a repository matching the name in your image ID:
<REPOSITORY-NAME>
- Run:
docker push <REPOSITORY-NAME>/<IMAGE>:<TAG>
- Deploy to Cloud Run (optional):
- Create a Google Cloud account
- Push the Docker image to Google Container Registry
- Create a new project called
[PROJECT-ID]
- Open Cloud Shell in your Google Cloud account and run:
docker pull <REPOSITORY-NAME>/<IMAGE>:<TAG>
docker tag [IMAGE] gcr.io/[PROJECT-ID]/[IMAGE]
docker push gcr.io/[PROJECT-ID]/[IMAGE]
more detail in this link
- Create a Cloud Run service and select the container image that was pushed
- Screenshot of the config (for demo purposes, it will be cost-free)
- Click Deploy, then test the API URL that is displayed
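The "check that the API is working" steps, both for the local container and for the deployed Cloud Run URL, can be scripted. This is a minimal sketch using only the Python standard library; `api_responds` is a hypothetical helper, and it only checks that something answers HTTP at the URL, not that the image processing endpoint behaves correctly.

```python
import urllib.error
import urllib.request


def api_responds(url: str, timeout: float = 5.0) -> bool:
    """Return True if an HTTP server answers at `url` (hypothetical helper)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied with an error status, so it is still reachable.
        return True
    except OSError:
        # Connection refused, DNS failure, timeout, and similar errors.
        return False
```

Locally this would be called as `api_responds("http://localhost:10000")`; after deployment, pass the URL shown by Cloud Run instead.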
- Speed: processing one image takes 7 to 10 seconds on serverless Cloud Run. With a GPU, 2 to 3 seconds can be saved (U^2-Net is 3 times faster).
- Textcleaner is slow but cleans images well, and it needs some manual fine-tuning. A faster alternative (e.g. OpenCV) can be used.
- Pictures taken from angled positions are not supported. Perspective transformation can be used, but it may degrade text quality.
- U^2-Net limitations: the document should be centered on a background of contrasting color (a white background will not work).
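To make the perspective transformation mentioned above concrete, here is a minimal pure-Python sketch of the underlying math: it solves for the 3x3 homography that maps four detected document corners onto an upright rectangle. It is not part of this repo; the function names and the corner coordinates in the usage note are hypothetical. In practice OpenCV's `cv2.getPerspectiveTransform` and `cv2.warpPerspective` do the same job on whole images.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def solve_homography(src, dst):
    """Return the 3x3 matrix mapping four src points to four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]  # bottom-right entry fixed to 1


def warp_point(h, x, y):
    """Apply the homography to one point, including the perspective divide."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    u = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    v = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return u, v
```

With hypothetical corners `[(120, 80), (900, 140), (860, 1300), (60, 1200)]` mapped to an 800 x 1100 rectangle, `warp_point` sends each source corner to its rectangle corner. Resampling the pixels through such a mapping is what can soften the text, which is the quality loss noted above.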