Extract the table of contents (ToC) from a PDF document into GitHub-flavored Markdown (GFM) task list markup
https://gitlab.com/brlin/pdf-toc-to-gfm-tasklist
Before running this software, the following prerequisites must be satisfied:
- Python
  The runtime environment, currently only tested on 3.11. The python3 compatibility executable must be available in your command search PATH.
- A recent version of PyPDF2
  The Python library for operating on the input PDF file, currently only tested on version 2.12.
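The prerequisites above can be checked before running the utility. The following is a minimal sketch and not part of this project: the function name check_prerequisites is hypothetical, and the 3.11 threshold only reflects the version this software was tested on, so older versions are untested rather than known to fail.

```python
import importlib.util
import shutil
import sys


def check_prerequisites(min_python=(3, 11)):
    """Return a list of problem descriptions; an empty list means every check passed."""
    problems = []

    # Assumption: 3.11 is the only tested version, so anything older gets flagged.
    if sys.version_info < min_python:
        problems.append(
            f"Python {min_python[0]}.{min_python[1]} or newer is recommended, "
            f"found {sys.version_info.major}.{sys.version_info.minor}"
        )

    # The python3 executable must be reachable through the command search PATH.
    if shutil.which("python3") is None:
        problems.append("python3 executable not found in PATH")

    # PyPDF2 must be importable by the interpreter running this check.
    if importlib.util.find_spec("PyPDF2") is None:
        problems.append("PyPDF2 is not installed")

    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("warning:", problem)
```

The check prints nothing when all prerequisites are met. It does not verify the PyPDF2 version.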
The following usage instructions assume you are using a Unix-like operating system; they may not fully apply to other operating systems.
- Download the release package from the project's Releases page
- Extract the release package
- Edit the pdf-toc-to-gfm-tasklist.py Python program using a text editor: change the open() call so that it loads the input PDF file to extract the ToC from, then save the file
- Launch a text terminal application
- Run the following command (with all the placeholder text replaced) to switch the working directory to the extracted release package directory:
    cd /path/to/pdf-toc-to-gfm-tasklist-X.Y.Z
- Run the following command to execute the ToC extraction utility. The converted ToC Markdown task list markup will be printed to the standard output device:
    ./pdf-toc-to-gfm-tasklist.py
  You may also use the I/O redirection functionality of your shell (e.g. Bash) to write the output to a file:
    ./pdf-toc-to-gfm-tasklist.py > toc.md
  Note that the I/O redirection syntax may differ between shells; refer to their user manuals for more information.
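To illustrate what the conversion involves, here is a minimal sketch of turning a PDF outline into GFM task list markup. It is not the actual content of pdf-toc-to-gfm-tasklist.py: the function names outline_to_tasklist and pdf_to_tasklist are hypothetical, and the sketch assumes the PdfReader.outline property of recent PyPDF2 2.x releases, which reports child entries as nested lists.

```python
def outline_to_tasklist(outline, depth=0):
    """Convert a nested outline into a list of GFM task list lines."""
    lines = []
    for item in outline:
        if isinstance(item, list):
            # A nested list holds the children of the preceding entry.
            lines.extend(outline_to_tasklist(item, depth + 1))
        else:
            # Assumption: PyPDF2 outline entries expose a .title attribute.
            # Plain strings are also accepted so the function can be tried without a PDF.
            title = item if isinstance(item, str) else (item.title or "")
            lines.append("  " * depth + "- [ ] " + title)
    return lines


def pdf_to_tasklist(pdf_path):
    """Read the outline of the PDF at pdf_path and return it as task list text."""
    # Imported here so outline_to_tasklist() stays usable without PyPDF2 installed.
    from PyPDF2 import PdfReader

    reader = PdfReader(pdf_path)
    return "\n".join(outline_to_tasklist(reader.outline))


if __name__ == "__main__":
    # Sample outline standing in for a real PDF.
    sample = ["Chapter 1", ["Section 1.1"], "Chapter 2"]
    print("\n".join(outline_to_tasklist(sample)))
    # - [ ] Chapter 1
    #   - [ ] Section 1.1
    # - [ ] Chapter 2
```

With PyPDF2 installed, calling pdf_to_tasklist("input.pdf") (a placeholder file name) would return the same kind of markup for a real document, provided the PDF carries an outline.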
Have fun!
This section documents the third-party resources that this project has referenced:
- PDF to GFM Conversion | ChatGPT
  The discussion thread used to produce the utility prototype
Unless otherwise noted (individual file's header/REUSE DEP5), this product is licensed under version 3 of the GNU Affero General Public License, or any more recent version of your preference.
This work complies with the REUSE Specification; refer to the REUSE - Make licensing easy for everyone website for information regarding the licensing of this product.