Local HTTP microservices for OCR and open-vocabulary UI detection.
computer-lab exposes EasyOCR and Grounding DINO through small Python servers, so automation systems can call vision models over HTTP instead of embedding those dependencies directly.
| Service | Default port | Endpoint | Returns |
|---|---|---|---|
| OCR | 9003 | `POST /ocr` | Text regions with `text`, `box`, `center`, `confidence` |
| Detection | 9004 | `POST /detect` | Grounded boxes with `label`, `box`, `center`, `confidence` |
Useful for:
- screenshot-to-text pipelines
- UI and desktop automation
- agent systems that need OCR or visual grounding behind a stable local API
Python 3.10+ is required. A GPU is recommended for usable detection latency; OCR can run on CPU.
```bash
git clone https://github.com/belarusian/computer-lab.git
cd computer-lab
python3 -m venv .venv
source .venv/bin/activate
pip install -U pip wheel
pip install -e ".[gpu]"
```

For a CPU-oriented install:

```bash
pip install -e ".[cpu]"
```

Run the services in separate terminals:

```bash
python servers/ocr_server.py 9003
python servers/detect_server.py 9004
```

Health checks:

```bash
curl -s http://127.0.0.1:9003/health
curl -s http://127.0.0.1:9004/health
```

The repo includes manual integration scripts rather than a full automated test suite.
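The same health check can be done from Python before issuing requests. A minimal sketch using only the standard library; `is_healthy` is a hypothetical helper, not part of the repo:

```python
import urllib.request


def is_healthy(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if the service at base_url answers GET /health with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # connection refused, timeout, DNS failure, ...
        return False
```

Calling this against each service (for example `is_healthy("http://127.0.0.1:9003")`) before sending images avoids confusing errors while the models are still loading.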
```bash
export OCR_URL=http://127.0.0.1:9003/ocr
python test_ocr_locate.py

export DETECT_URL=http://127.0.0.1:9004/detect
python test_detect_locate.py "close button"
```

Both scripts capture the current screen with pyautogui, so they require a local desktop session.
- `POST /ocr`: request body is raw PNG bytes. Response: JSON list of `{text, box, center, confidence}`.
- `POST /detect`: request body is JSON `{ "image": "<base64 PNG>", "query": "..." }`. If `query` is omitted, the server runs a generic open detection pass.
- `POST /detect_raw`: request body is raw PNG bytes. Optional header: `X-Query`.
- `GET /health`: returns a simple JSON health response from each server.
- `EASYOCR_GPU=0` forces EasyOCR to stay on CPU.
- The detection server uses CUDA automatically when available and falls back to CPU otherwise.
MIT. See LICENSE.