
Cookbook: Using the Filesystem fetcher

Mark Jordan edited this page Jun 29, 2017 · 3 revisions

In some cases, you may not have any metadata to use as the basis for ingest packages. This most commonly arises when you want to create minimal Islandora objects now and add metadata to them later.

MIK provides a "Filesystem" fetcher that reads image, PDF, or other files from a directory and generates Islandora ingest packages from them in the absence of a CSV metadata input file. A sample input directory looks like this:

my_input_directory/
├── sample1.tif
├── sample2.tif
├── sample3.tif
├── sample4.tif
├── myfile_01.tif
├── myfile01.tif
├── otherfile100.tif
└── otherfile103.tif

Currently, this fetcher creates packages only for single-file content models such as basic image, large image, PDF, movie, and audio. If you want to mix files of different Islandora content models in the same directory (e.g., large image and audio), see the "Separating out packages by file extension" section below.
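Before running MIK, it can help to check whether an input directory mixes file types. The following is a minimal Python sketch, not part of MIK; the function name count_extensions is hypothetical and the script only reads the directory listing:

```python
from collections import Counter
from pathlib import Path


def count_extensions(input_dir):
    """Count the regular files in input_dir by lowercased extension (no dot)."""
    counts = Counter()
    for path in Path(input_dir).iterdir():
        if path.is_file():
            # ".TIF" and ".tif" are counted together as "tif".
            counts[path.suffix.lower().lstrip(".")] += 1
    return counts
```

Calling count_extensions("my_input_directory") on the sample directory above would report a single extension, tif. If several extensions appear and they belong to different content models, the hook described in "Separating out packages by file extension" is the relevant option.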

To use this fetcher, specify "Filesystem" in the [FETCHER] class setting in your .ini file, and use "CsvSingleFile" as the value of your [FILE_GETTER] class and [WRITER] class settings:

[FETCHER]
class = Filesystem
temp_directory = "/tmp/cartoons_temp"

Notice that this fetcher does not use an input_file or record_key setting. The [METADATA_PARSER] and [FILE_GETTER] sections of your .ini file have a couple of additional requirements:

[FETCHER]
class = Filesystem

[METADATA_PARSER]
; This metadata parser class is required with the Filesystem fetcher.
class = templated\Templated
; This is the template provided for use with this fetcher, although you can use your own if you wish.
template = extras/templates/filesystemfetcher.twig

[FILE_GETTER]
class = CsvSingleFile
; This value must be 'title' when using the Filesystem fetcher.
file_name_field = title

[WRITER]
class = CsvSingleFile
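The required values above can be checked before a run. This is a rough Python sketch rather than anything MIK provides: the function name check_filesystem_config is hypothetical, and Python's configparser does not accept every construct that PHP .ini files allow, so it may fail to parse some real MIK configuration files.

```python
import configparser

# Settings this page says the Filesystem fetcher requires.
REQUIRED = {
    ("FETCHER", "class"): "Filesystem",
    ("METADATA_PARSER", "class"): r"templated\Templated",
    ("FILE_GETTER", "class"): "CsvSingleFile",
    ("FILE_GETTER", "file_name_field"): "title",
    ("WRITER", "class"): "CsvSingleFile",
}


def check_filesystem_config(ini_text):
    """Return a list of problems found in the .ini text (empty if none)."""
    # interpolation=None keeps values literal; strict=False tolerates
    # repeated keys; optionxform=str preserves the case of key names.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str
    parser.read_string(ini_text)
    problems = []
    for (section, key), expected in REQUIRED.items():
        actual = parser.get(section, key, fallback=None)
        if actual is not None:
            actual = actual.strip("\"'")  # values may be quoted in the .ini
        if actual != expected:
            problems.append(
                f"[{section}] {key} should be {expected!r}, found {actual!r}"
            )
    return problems
```

An empty list means all five settings match; each mismatch or missing setting produces one message.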

As indicated in the sample .ini entries, you must use the Templated metadata parser with this fetcher. The MODS that the provided template produces is minimal. It simply inserts the filename into the MODS title and identifier elements:

<?xml version="1.0"?>
<mods xmlns="http://www.loc.gov/mods/v3" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
  <titleInfo>
     <title>sample2.tif</title>
  </titleInfo>
  <identifier type="local" displayLabel="Local identifier">sample2.tif</identifier>
</mods>

You can use a custom template if you wish, but the only metadata fields available to your template are title and ID (the filename without its extension).
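To make the two fields concrete, here is a small Python sketch of the values a file would yield. It assumes title holds the full filename (as in the sample MODS above) and ID holds the filename without its extension; the function name template_fields is hypothetical and this is not MIK's own code.

```python
from pathlib import Path


def template_fields(file_path):
    """Return the two fields this page says a template can use."""
    path = Path(file_path)
    return {
        "title": path.name,  # filename with its extension
        "ID": path.stem,     # filename without its final extension
    }
```

Note that Path.stem removes only the last extension, so a name with two dots keeps everything before the final one.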

Separating out packages by file extension

If you want MIK to generate packages for files that are not of the same Islandora content model, for example PDFs and basic images, you can use the move_packages_by_extension post-write hook script to tell MIK to separate out the packages by file extension. You will need to configure the script with your extension => directory mappings:

// Directories must already exist.
$destinations = array(
    'pdf' => '/tmp/filesystemfetcher/pdf',
    'tif' => '/tmp/filesystemfetcher/largeimage',
    'jp2' => '/tmp/filesystemfetcher/largeimage',
    'jpg' => '/tmp/filesystemfetcher/basicimage',
);
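Because the destination directories must already exist, one option is to create them before running MIK. The Python sketch below is an illustration only: DESTINATIONS is a hand-copied mirror of the $destinations array above, and ensure_directories is a hypothetical helper, not part of MIK.

```python
import os

# Hand-copied mirror of the $destinations array in the hook script.
DESTINATIONS = {
    "pdf": "/tmp/filesystemfetcher/pdf",
    "tif": "/tmp/filesystemfetcher/largeimage",
    "jp2": "/tmp/filesystemfetcher/largeimage",
    "jpg": "/tmp/filesystemfetcher/basicimage",
}


def ensure_directories(destinations):
    """Create each missing destination directory; return the ones created."""
    created = []
    # Several extensions can share a directory, so deduplicate first.
    for directory in sorted(set(destinations.values())):
        if not os.path.isdir(directory):
            os.makedirs(directory, exist_ok=True)
            created.append(directory)
    return created
```

Calling ensure_directories(DESTINATIONS) once before the MIK run creates any missing directories; a second call returns an empty list.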

Then, register the script in your .ini file like this:

[WRITER]
postwritehooks[] = "/usr/bin/php extras/scripts/postwritehooks/move_packages_by_extension.php"

When MIK finishes running, the packages will be in the directories you have configured for each file extension.

