WORK IN PROGRESS WARNING
fondz is a command line tool for auto-generating an "archival description" for a set of born digital content found in a bag or series of bags. The name fondz was borrowed from a humorous take on the archival principle of provenance or respect des fonds. fondz works best if you point it at a collection of content that has some thematic unity, such as a collection associated with an individual, family or organization.
The description that fondz generates is expressed as HTML, suitable for viewing in both online and offline mode without the need for running software other than your Web browser. The HTML also contains structured metadata, so that it can be easily processed by other applications. fondz is largely an integration layer for other tools such as:
- libreoffice - for converting various document formats to HTML
- mallet - for creating topic models of textual content
- fido - for identifying file formats using the PRONOM registry
- exiftool - for extracting image metadata
As better open source tools for converting documents and summarizing their contents become available, they can be swapped in for the ones being used now.
Thanks go to Twitter and Facebook for their HTML based archive packages, and to Jekyll and other static site generators, which were an inspiration for fondz.
Here's an example of generating a finding aid for two bags: the first contains a Word document, abc.doc, and the second contains a WordPerfect document, def.wpd.
    % fondz create "Collection Name" example /path/to/a/bag /path/to/another/bag
    % tree example/
    example/
    ├── css
    │   └── style.css
    ├── derivatives
    │   ├── 1
    │   │   └── abc.doc.html
    │   └── 2
    │       └── def.wpd.html
    ├── index.html
    ├── js
    │   ├── topics.json
    │   └── formats.json
    └── originals
        ├── 1 -> /path/to/a/bag
        └── 2 -> /path/to/another/bag
Once fondz has run you should be able to open index.html in your browser and browse the content.
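If you want a quick sanity check before opening the browser, something like the following sketch could confirm that the output directory has the pieces shown in the example tree. The function name `check_layout` is hypothetical and not part of fondz, and the list of expected paths is taken only from the example above, so it may not match what every version of fondz writes.

```shell
#!/bin/sh
# Hypothetical helper (not part of fondz): check that an output directory
# contains the files and directories shown in the example tree.
# Prints one line per missing item and returns non-zero if anything is absent.
check_layout() {
    dir=$1
    rc=0
    # Files expected by the example tree (an assumption, not a fondz guarantee).
    for f in index.html css/style.css js/topics.json js/formats.json; do
        [ -f "$dir/$f" ] || { echo "missing file: $f"; rc=1; }
    done
    # Directories holding converted documents and links to the original bags.
    for d in derivatives originals; do
        [ -d "$dir/$d" ] || { echo "missing directory: $d"; rc=1; }
    done
    return "$rc"
}
```

After sourcing this, `check_layout example` should print nothing and return zero if the run produced the layout above.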
On Ubuntu you should be able to install fondz by running:
If you are on OSX or Windows you should be able to read the install script and figure out what you need to install and put in your PATH.
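Whichever platform you are on, a small check like this sketch can tell you which of the external tools are not yet reachable from your PATH. The function name `check_tools` is made up for illustration, and the executable names in the usage comment are assumptions based on the tool list above; only `libreoffice` is named explicitly as a command in this README.

```shell
#!/bin/sh
# Hypothetical helper (not part of fondz): report which commands are not on PATH.
# Prints "missing: NAME" for each absent command and returns non-zero
# if at least one was not found.
check_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            rc=1
        fi
    done
    return "$rc"
}

# Example call; the executable names are assumptions, adjust them to your install:
# check_tools libreoffice mallet fido exiftool
```

Anything reported as missing either needs to be installed or have its directory added to your PATH.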
If you've got ideas for things fondz could do, please submit them as an enhancement issue on GitHub.
These are ideas that seem theoretically doable. If you have similar ideas to add, please send me a pull request with your changes to the README.md, or open an issue here on GitHub.
If you are on OSX and want to convince fondz to use your LibreOffice install, put a script like this, named libreoffice, in your PATH:
    #!/bin/sh
    # Run soffice from its own directory, passing all arguments through unchanged.
    cd /Applications/LibreOffice.app/Contents/MacOS/ || exit 1
    ./soffice "$@"