Pure Go Full Text Search of PDF Files
This library implements full text search for PDFs.
- The public APIs are in index_search.go.
There are some command-line programs that demonstrate the library's functionality.
- examples/pdf_search_demo.go demonstrates the main APIs.
- examples/index.go builds an index over a set of PDFs.
- examples/search.go searches the index built by examples/index.go.
Binary versions (executables) of these three programs are available in releases. There are 64-bit binaries for Windows, Mac and Linux. The binaries do not require a UniDoc license.
git clone https://github.com/PaperCutSoftware/pdfsearch
cd pdfsearch/examples
go build pdf_search_demo.go
go build index.go
go build search.go
./pdf_search_demo -f <PDF path> <search term>
./pdf_search_demo -f PDF32000_2008.pdf cubic Bézier curve
The example searches PDF32000_2008.pdf for cubic Bézier curve.
pdf_search_demo.go shows how to use the APIs in index_search.go to
- create indexes over PDFs,
- search those indexes using full-text search, and
- mark up PDFs with the locations of the search matches on pages.
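To make the index-then-search flow above concrete, here is a minimal, self-contained sketch of a toy inverted index over page text. It is not the API in index_search.go and does not reflect how this library is implemented. The names Posting, Index, AddPage and Search are hypothetical, chosen only for illustration, and the page text is hard-coded rather than extracted from a PDF.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"unicode"
)

// Posting records one page on which a term occurs.
// Hypothetical type, not part of the pdfsearch library.
type Posting struct {
	Doc  string // document name
	Page int    // 1-based page number
}

// Index maps a lower-cased term to the pages that contain it.
type Index map[string][]Posting

// tokenize lower-cases text and splits it on runes that are
// neither letters nor digits.
func tokenize(text string) []string {
	return strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
}

// AddPage indexes the text of one page. Each term is recorded at most
// once per page. Call it once per (doc, page) pair.
func (idx Index) AddPage(doc string, page int, text string) {
	seen := map[string]bool{}
	for _, term := range tokenize(text) {
		if seen[term] {
			continue
		}
		seen[term] = true
		idx[term] = append(idx[term], Posting{Doc: doc, Page: page})
	}
}

// Search returns the pages that contain every term in query,
// sorted by document name and then page number.
func (idx Index) Search(query string) []Posting {
	unique := map[string]bool{}
	for _, term := range tokenize(query) {
		unique[term] = true
	}
	if len(unique) == 0 {
		return nil
	}
	// Count how many distinct query terms each page matches.
	counts := map[Posting]int{}
	for term := range unique {
		for _, p := range idx[term] {
			counts[p]++
		}
	}
	var hits []Posting
	for p, n := range counts {
		if n == len(unique) {
			hits = append(hits, p)
		}
	}
	sort.Slice(hits, func(i, j int) bool {
		if hits[i].Doc != hits[j].Doc {
			return hits[i].Doc < hits[j].Doc
		}
		return hits[i].Page < hits[j].Page
	})
	return hits
}

func main() {
	idx := Index{}
	idx.AddPage("spec.pdf", 1, "Paths are built from lines and cubic Bézier curves.")
	idx.AddPage("spec.pdf", 2, "A curve segment is defined by four control points.")
	for _, hit := range idx.Search("cubic Bézier") {
		fmt.Printf("%s page %d\n", hit.Doc, hit.Page)
	}
}
```

The sketch matches whole words only, with no stemming or ranking, and it reports pages rather than positions on a page. A real full-text engine, including whatever this library uses internally, would normally do considerably more.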
./index <file pattern>
The example creates an on-disk index over the PDFs in ~/climate/ and its subdirectories.
./search <search term>
./search integrated assessment model
The example searches the on-disk index created by examples/index.go for integrated assessment model.