PDF ToC to GFM tasklist

Extract the table of contents (ToC) from a PDF document into GitHub Flavored Markdown (GFM) task list markup

https://gitlab.com/brlin/pdf-toc-to-gfm-tasklist

Usage demonstration screenshot

Prerequisites

Before running this software, the following prerequisites must be satisfied:

  • Python
    The runtime environment; currently tested only with version 3.11. The python3 executable must be available in your command search PATH.
  • A recent version of PyPDF2
    The Python library used to read the input PDF file; currently tested only with version 2.12.
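
The prerequisites above can be checked with a short script before running the utility. This is only a sketch and not part of the project: the function name check_prerequisites is made up for illustration, and it reports what it finds rather than enforcing the tested versions (3.11 and 2.12).

```python
import importlib.util
import shutil
import sys


def check_prerequisites() -> dict:
    """Report whether the documented prerequisites appear to be present.

    Hypothetical helper for illustration; it does not verify that the
    installed versions match the ones this project was tested with.
    """
    return {
        # Version of the interpreter running this script, e.g. "3.11".
        "python_version": "%d.%d" % sys.version_info[:2],
        # True if a `python3` executable is found in the command search PATH.
        "python3_on_path": shutil.which("python3") is not None,
        # True if the PyPDF2 package can be imported by this interpreter.
        "pypdf2_installed": importlib.util.find_spec("PyPDF2") is not None,
    }


if __name__ == "__main__":
    for name, value in check_prerequisites().items():
        print(f"{name}: {value}")
```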

Usage

The following usage instructions assume you are using a Unix-like operating system; they may not fully apply to other operating systems.

  1. Download the release package from the project's Releases page

  2. Extract the release package

  3. Open the pdf-toc-to-gfm-tasklist.py Python program in a text editor, change the open() call so that it loads the PDF file you want to extract the ToC from, and save the file

  4. Launch a text terminal application

  5. Run the following command (with all placeholder text replaced) to switch the working directory to the extracted release package directory:

    cd /path/to/pdf-toc-to-gfm-tasklist-X.Y.Z

  6. Run the following command to execute the ToC extraction utility. The converted Markdown task list markup is printed to standard output:

    ./pdf-toc-to-gfm-tasklist.py

    You may also use the I/O redirection functionality of your shell (e.g. Bash) to write the output to a file:

    ./pdf-toc-to-gfm-tasklist.py > toc.md

    Note that the I/O redirection syntax may differ between shells; refer to your shell's user manual for more information.
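
For orientation, the conversion the utility performs can be pictured roughly as follows. This is a minimal sketch and not the project's actual source: the function names outline_to_tasklist and pdf_to_tasklist are hypothetical, and it assumes a PyPDF2 release in which PdfReader exposes the document outline as the outline property, with nested lists holding the children of the preceding entry.

```python
from typing import Any, List


def outline_to_tasklist(outline: List[Any], depth: int = 0) -> List[str]:
    """Convert a nested outline into GFM task list lines.

    A nested list is treated as the children of the entry before it.
    Entries may be plain strings or objects with a `title` attribute
    (as PyPDF2 Destination objects have).
    """
    lines: List[str] = []
    for item in outline:
        if isinstance(item, list):
            # Children: recurse one indentation level deeper.
            lines.extend(outline_to_tasklist(item, depth + 1))
        else:
            title = item if isinstance(item, str) else item.title
            # Two spaces per level nest the item under its parent in GFM.
            lines.append(f"{'  ' * depth}- [ ] {title}")
    return lines


def pdf_to_tasklist(pdf_path: str) -> str:
    """Read the outline of the PDF at `pdf_path` and return task list markup."""
    # Imported lazily so the pure conversion above works without PyPDF2.
    from PyPDF2 import PdfReader

    reader = PdfReader(pdf_path)
    return "\n".join(outline_to_tasklist(reader.outline))
```

With PyPDF2 installed, printing the result of pdf_to_tasklist("input.pdf") (where input.pdf is a placeholder file name) would write the task list to standard output, which can then be redirected to a file as described in step 6.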

Have fun!

Licensing

Unless otherwise noted (in an individual file's header or the REUSE DEP5 file), this product is licensed under version 3 of the GNU Affero General Public License or, at your option, any later version.

This work complies with the REUSE Specification; refer to the "REUSE - Make licensing easy for everyone" website for information about the licensing of this product.
