Add utility to get PDF info for proper titles on PDF entries

The content of PDF documents is not indexed for suggestions, even though in some ZIMs the PDFs are the "core" of the ZIM.

Having a utility in scraperlib to extract PDF info and get the document title would probably help.
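
As a rough illustration of the idea, here is a minimal standard-library sketch. `guess_pdf_title` is a hypothetical name, not an existing scraperlib function. It only finds uncompressed, literal-string `/Title` entries, it may pick up an outline (bookmark) title rather than the document one, and it approximates PDFDocEncoding with Latin-1. A real implementation would more likely rely on a dedicated PDF library.

```python
import re
from typing import Optional

# Matches a literal-string /Title entry such as "/Title (My Document)".
# Backslash-escaped characters are accepted inside the parentheses;
# unescaped nested parentheses and hex strings (<...>) are NOT handled.
_TITLE_RE = re.compile(rb"/Title\s*\(((?:\\.|[^\\()])*)\)", re.DOTALL)


def guess_pdf_title(content: bytes) -> Optional[str]:
    """Best-effort lookup of a title in raw PDF bytes (hypothetical helper).

    Returns None when no usable /Title entry is found, so the caller can
    fall back to something else (for instance the entry path).
    """
    match = _TITLE_RE.search(content)
    if match is None:
        return None
    raw = match.group(1)
    # Undo the escapes for "(", ")" and "\" used in PDF literal strings.
    raw = re.sub(rb"\\([()\\])", rb"\1", raw)
    if raw.startswith(b"\xfe\xff"):
        # A UTF-16 big-endian byte order mark signals a Unicode text string.
        title = raw[2:].decode("utf-16-be", errors="replace")
    else:
        # Assumption: Latin-1 as a stand-in for PDFDocEncoding.
        title = raw.decode("latin-1")
    title = title.strip()
    return title or None
```

A scraper could call this on the PDF payload before adding the entry and keep its current behaviour whenever `None` is returned.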

See https://github.com/openzim/warc2zim/issues/290 for one use-case.