EltonLab/ocr_desktop_ubuntu

demo video

demo.mp4

I made a simple GUI for PaddleOCR. Most of the code comes from their repo. This project has only been tested on Ubuntu 22.04.

project setup

To run this project you'll need to install the PaddlePaddle deep learning framework (paddlepaddle, or paddlepaddle-gpu if you have a GPU):

pip install -r requirements.txt 
# PaddlePaddle is the deep learning framework this repo uses
# if you have a GPU
python -m pip install paddlepaddle-gpu -i https://pypi.tuna.tsinghua.edu.cn/simple
# if you don't have a GPU
python -m pip install paddlepaddle -i https://pypi.tuna.tsinghua.edu.cn/simple
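
The choice between the two packages above can be sketched in Python. This is a minimal illustration, not part of this repo: the helper names are hypothetical, and treating "nvidia-smi is on PATH" as "a GPU is available" is only a rough assumption.

```python
import importlib.util
import shutil
from typing import List

# Mirror used by the install commands in this README.
TUNA_INDEX = "https://pypi.tuna.tsinghua.edu.cn/simple"


def paddle_installed() -> bool:
    """Return True if a top-level 'paddle' module can be found (it is not imported)."""
    return importlib.util.find_spec("paddle") is not None


def guess_has_gpu() -> bool:
    """Heuristic (assumption): NVIDIA driver installs usually put nvidia-smi on PATH."""
    return shutil.which("nvidia-smi") is not None


def suggest_install_command(has_gpu: bool) -> List[str]:
    """Build the pip command from this README for the GPU or CPU package."""
    package = "paddlepaddle-gpu" if has_gpu else "paddlepaddle"
    return ["python", "-m", "pip", "install", package, "-i", TUNA_INDEX]
```

For example, `" ".join(suggest_install_command(guess_has_gpu()))` gives the command to paste into a terminal when `paddle_installed()` returns False.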

troubleshooting

Here are some problems I encountered when installing paddlepaddle. Solutions to each of them are in the linked answers.
ImportError: libssl.so.1.1: cannot open shared object file: No such file or directory
ImportError: libcudart.so.10.2: cannot open shared object file: No such file or directory
FatalError: Segmentation fault
AttributeError: 'FreeTypeFont' object has no attribute 'getsize'
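
For the 'getsize' error specifically: Pillow 10 removed `FreeTypeFont.getsize`, and `getbbox` is the documented replacement. The sketch below is a general Pillow workaround, not necessarily the fix described in the linked answer. `text_size` is a hypothetical helper name, and `font` is assumed to be a Pillow font object.

```python
def text_size(font, text):
    """Return (width, height) of `text` for a Pillow font on both old and new Pillow.

    Pillow >= 10 no longer has font.getsize(), so prefer getbbox() when it exists.
    """
    if hasattr(font, "getbbox"):
        left, top, right, bottom = font.getbbox(text)
        return right - left, bottom - top
    # Older Pillow releases without getbbox(): fall back to the original call.
    return font.getsize(text)
```

Calls of the form `font.getsize(text)` in the failing code can then be replaced with `text_size(font, text)`. Another option is to install a Pillow release older than 10, which still provides `getsize`.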

run the model

gui

To run OCR with the GUI, type the following in your command line:

python ocr_shot3.py

ocr_shot3.py then calls predict_system3.py. The default arguments for the model are in ./arg.py.

cmd

If you want to debug from the command line, just type:

pip install fire    # useful command-line tool
python predict_system3.py --mode en
python predict_system3.py --mode jp
python predict_system3.py --mode ch
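
If you would rather launch these runs from a Python script, the commands above can be wrapped as follows. This is a small sketch: `build_command` and `run_ocr` are hypothetical helpers, and the only modes assumed to exist are the three shown above.

```python
import subprocess
import sys
from typing import List

# Modes taken from the commands in this README.
SUPPORTED_MODES = ("en", "jp", "ch")


def build_command(mode: str, script: str = "predict_system3.py") -> List[str]:
    """Build the command line for one OCR run, rejecting unknown modes early."""
    if mode not in SUPPORTED_MODES:
        raise ValueError(f"unsupported mode {mode!r}, expected one of {SUPPORTED_MODES}")
    return [sys.executable, script, "--mode", mode]


def run_ocr(mode: str) -> int:
    """Run predict_system3.py from the current directory and return its exit code."""
    return subprocess.run(build_command(mode), check=False).returncode
```

Call `run_ocr("en")` from the repository root so that predict_system3.py is found.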

run this app using keyboard shortcut

Currently only tested on Linux (Ubuntu 22.04).
For other distros, check out flameshot's setup guide.

set up on ubuntu

1. copy the executable file to /usr/local/bin

$ pwd
/path/to/this/OCR_ScreenShot
$ sudo ln -s /path/to/this/OCR_ScreenShot_folder /usr/local
$ chmod +x ./ocr_shot
$ sudo cp ./ocr_shot /usr/local/bin
$ ocr_shot
# now you can run ocr_shot from the command line anywhere
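
To confirm the copy worked before binding a shortcut, you can check that the launcher is on PATH and executable. A minimal sketch; `find_launcher` is a hypothetical helper and uses only the standard library.

```python
import shutil
from typing import Optional


def find_launcher(name: str = "ocr_shot", path: Optional[str] = None) -> Optional[str]:
    """Return the full path of an executable `name` found on PATH, or None.

    `path` can override the search path (os.pathsep-separated), mainly for testing.
    """
    return shutil.which(name, path=path)
```

If `find_launcher()` returns None, check that /usr/local/bin is on your PATH and that the chmod +x step was applied.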

2. Go to Settings > Keyboard > View and Customize Shortcuts, scroll down, and click Custom Shortcuts.

3. Click "Add Shortcut" and set it to any keyboard shortcut you like.

download the weights from the official website

Weights are already in the download folder, but sometimes you will want to download models for other languages from the official website.
All models are downloaded from this page (en) or this page (zh).
Basically, you just need to download the detection model and the recognition model for the language you want to use, then put them into the ./model directory.

detection model

One thing to note is that you don't always need to download every detection model. For example, the Chinese detection model can also detect English sentences (performance might drop a little bit).

Make sure to download the inference model, not the train model. The Chinese detection model version should be v3.
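
One way to tell an inference model from a train model after extracting it is to look at the files inside. The sketch below assumes the usual layout of PaddleOCR v3/v4 inference archives (an `inference.pdmodel` and an `inference.pdiparams` file) and assumes each model sits in its own subdirectory of ./model. Both are assumptions, and the helper names are hypothetical.

```python
from pathlib import Path
from typing import List

# Files assumed to be present in an extracted PaddleOCR inference model.
REQUIRED_FILES = ("inference.pdmodel", "inference.pdiparams")


def looks_like_inference_model(model_dir) -> bool:
    """Return True if `model_dir` contains the files expected of an inference model."""
    d = Path(model_dir)
    return d.is_dir() and all((d / name).is_file() for name in REQUIRED_FILES)


def find_inference_models(root="./model") -> List[Path]:
    """List subdirectories of `root` that look like inference models."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(p for p in root.iterdir() if looks_like_inference_model(p))
```

A directory that holds only training checkpoints is not listed by `find_inference_models()`, which is a hint that the train model was downloaded by mistake.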

The English detection model version should be v3.

The multilingual detection model version should be v3.

recognition model

You can download either v4 or v3. Make sure to download the inference model, not the train model.

The Japanese recognition model should be v3.

About

Desktop OCR tool for Ubuntu, powered by PaddleOCR
