Releases: httpreserve/tikalinkextract

Tikalinkextract 0.0.3

16 Mar 17:52

Tikalinkextract 0.0.3 adds the ability for users to use custom protocols.

This commit gives users the ability to specify custom protocols in an extensions file so that less well-known URI types can be identified by the link scanner. Examples might include those used by content management systems as internal mechanisms for accessing information.

An extensions file might look as follows:

{
	"Extensions": [
		"pw://",
		"info:ark/",
		"info:pronom/",
		"info:hdl/"
	]
}

Using the extensions-test folder distributed with the tool, this can be tested as follows:

./tikalinkextract --file extensions-test/ -extensions "extensions.json" 2> /dev/null

The output will look like:

extensions-protocols.txt, pw://somedata.dat
extensions-protocols.txt, info:ark/somedata.dat
extensions-protocols.txt, info:pronom/somedata.dat
extensions-protocols.txt, info:hdl/somedata.dat
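
To illustrate the idea of prefix-based matching, here is a minimal Python sketch. It is not the tool's actual implementation. It assumes the extensions file has the shape shown above, and that a "link" is any whitespace-delimited token starting with one of the listed prefixes. The function names `load_extensions` and `find_custom_links` and the sample text are hypothetical.

```python
import json


def load_extensions(json_text):
    """Return the custom protocol prefixes from an extensions document."""
    return json.loads(json_text)["Extensions"]


def find_custom_links(text, prefixes):
    """Return whitespace-delimited tokens that start with any of the prefixes."""
    # Assumption: tokens are split on whitespace only; the real tool's
    # tokenisation may well differ.
    return [tok for tok in text.split() if tok.startswith(tuple(prefixes))]


# Same structure as the example extensions file above.
EXTENSIONS_JSON = """
{
    "Extensions": ["pw://", "info:ark/", "info:pronom/", "info:hdl/"]
}
"""

# Hypothetical sample text standing in for content extracted from a file.
SAMPLE_TEXT = "See pw://somedata.dat and info:ark/somedata.dat but not mailto:someone"

if __name__ == "__main__":
    for link in find_custom_links(SAMPLE_TEXT, load_extensions(EXTENSIONS_JSON)):
        print(link)
```

Matching on literal prefixes keeps the extensions file simple to write: a user lists the start of each URI type and does not need to supply a full pattern.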

For more information about tikalinkextract please take a look at my Open Preservation Foundation blog about it.

tle-0.0.2

21 Oct 11:56

Released to support my latest OPF blog. All releases contain Apache Tika 1.16 for maximum usability: https://tika.apache.org/download.html

Releases available for Windows and Linux.

Blog

Hyperlinks in your files? How to get them out using tikalinkextract