Extract formatted text from documents, transform it, then put back in place

TinoDidriksen/Transfuse

Transfuse

Requirements

  • CMake
  • SQLite 3
  • libxml2
  • xxhash
  • libzip
  • pkg-config (for non-vcpkg platforms)
  • Debian/Ubuntu: sudo apt-get install build-essential cmake pkg-config libsqlite3-dev libxml2-dev libxxhash-dev libzip-dev
  • macOS MacPorts: sudo port install cmake pkgconfig sqlite3 libxml2 xxhash libzip

Usage

Given an HTML document, run tf-extract document.html, or pipe it in with cat document.html | tf-extract, to extract its text blocks with the inline tags transformed.
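To call tf-extract from a script rather than a shell, something like the following Python sketch may help. It relies only on the two invocations described above (a file argument, or HTML on standard input). The helper names extract_command and run_extract are hypothetical and not part of Transfuse, and the sketch makes no assumption about what the extracted output looks like.

```python
import shutil
import subprocess
from typing import List, Optional


def extract_command(path: Optional[str] = None) -> List[str]:
    """Build the argument list for tf-extract (hypothetical helper).

    With a path, this mirrors `tf-extract document.html`.
    Without one, tf-extract is expected to read HTML from stdin,
    mirroring `cat document.html | tf-extract`.
    """
    command = ["tf-extract"]
    if path is not None:
        command.append(path)
    return command


def run_extract(html: str) -> Optional[str]:
    """Pipe an HTML string to tf-extract and return whatever it prints.

    Returns None when tf-extract is not on PATH, so callers can
    degrade gracefully instead of crashing.
    """
    if shutil.which("tf-extract") is None:
        return None
    result = subprocess.run(
        extract_command(),
        input=html,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


if __name__ == "__main__":
    output = run_extract("<p>Hello <b>world</b></p>")
    if output is None:
        print("tf-extract not found on PATH")
    else:
        print(output)
```

Sending the document over standard input avoids writing a temporary file when the HTML is already in memory.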
