superhellth/PDFLAT
PDFLAT

PDF Layout Annotation Tool

A simple, self-hosted, web-based app that allows you to annotate the layouts of PDF files to create custom datasets.

Architecture

PDFLAT is powered by a SvelteKit frontend, a FastAPI backend, and a PostgreSQL database. The application is fully dockerized for easy setup, consistency, and portability across environments.

Setup

  1. clone the repository
  2. make sure your ports 1337, 5173, and 5432 are unoccupied (or modify the configuration if needed)
  3. run ./start.sh (this may require admin rights, in which case use sudo ./start.sh; it takes quite a while, which is normal)
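
Before running the start script, it can help to confirm that the three ports from step 2 are actually free. The following is a minimal sketch and not part of PDFLAT: the helper names `port_is_free` and `occupied_ports` are made up for illustration, and it assumes the containers bind to all interfaces (`0.0.0.0`), which the README does not state.

```python
import socket

# Ports named in the setup steps above.
PDFLAT_PORTS = (1337, 5173, 5432)


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Binding fails when another process already holds the port
            # (or when the port is privileged and we lack permission).
            return False
    return True


def occupied_ports(ports=PDFLAT_PORTS, host: str = "0.0.0.0") -> list:
    """Return the subset of ports that cannot currently be bound."""
    return [port for port in ports if not port_is_free(port, host)]


if __name__ == "__main__":
    busy = occupied_ports()
    if busy:
        print("Ports already in use:", ", ".join(str(p) for p in busy))
    else:
        print("All PDFLAT ports are free.")
```

If a port is reported as busy, either stop the process using it or change the port mapping in the configuration, as step 2 suggests.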

Usage

  1. open the web app on port 5173 and create your datasets
  2. upload PDF files for your datasets
  3. annotate pages
  4. use the API via port 1337 to export datasets for subsequent tasks
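
The README does not list the export endpoints. FastAPI applications serve a machine-readable schema at `/openapi.json` by default, so one way to discover the available routes is to read that schema. The sketch below assumes the backend is reachable at `http://localhost:1337` and keeps FastAPI's default schema path; neither is confirmed here, and the helper names `fetch_openapi` and `list_routes` are illustrative only.

```python
import json
from urllib.request import urlopen

# Assumption: the backend runs on this machine and keeps FastAPI's default
# schema path (/openapi.json). Adjust both if your deployment differs.
API_BASE = "http://localhost:1337"

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def fetch_openapi(base_url: str = API_BASE, timeout: float = 5.0) -> dict:
    """Download and parse the OpenAPI schema served by the backend."""
    with urlopen(f"{base_url}/openapi.json", timeout=timeout) as resp:
        return json.load(resp)


def list_routes(spec: dict) -> list:
    """Return (METHOD, path) pairs from an OpenAPI schema, sorted by path."""
    routes = []
    for path, operations in spec.get("paths", {}).items():
        for method in operations:
            # Path items may also carry non-method keys such as "parameters".
            if method.lower() in HTTP_METHODS:
                routes.append((method.upper(), path))
    return sorted(routes, key=lambda route: (route[1], route[0]))


# Example usage once the containers are running:
#   for method, path in list_routes(fetch_openapi()):
#       print(method, path)
```

FastAPI also serves interactive documentation at `/docs` by default, which may be a more convenient way to explore the API in a browser if the backend has not disabled it.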

If you use PDFLAT in creating any (published) work, please cite this repository, and feel free to drop me a message so I can see what you are working on :)

Project Status

Please note that PDFLAT is in early beta and still lacks proper documentation and a number of useful features. Pull requests with improvements are welcome.
