Cookbook: Using the Excel fetcher

Mark Jordan edited this page Jun 26, 2017 · 7 revisions

If using CSV files as input is inconvenient, you can replace the CSV fetcher in all CSV toolchains with the Excel fetcher. For this to work, your Excel file must follow these guidelines:

  • it must be in Excel 2007/2010/2013 format (which normally uses the file extension .xlsx)
  • the Excel file may contain multiple worksheets, but the metadata must be in the first worksheet
  • the first row must contain column labels/headings
    • all column headings must be unique, and the heading row cannot contain any empty headings
  • each row in the Excel file must correspond to one Islandora object (image, PDF, compound, newspaper issue, book, etc.)
  • one of the fields must contain a unique identifier for each row in the file (the name of this field is configured in the [FETCHER] section's record_key setting, as described below), and
  • one of the fields must contain the name of the file that is to be used in each of the created objects (the name of the field is configured in the [FILE_GETTER] file_name_field setting).
    • as with the Csv fetcher, the field defined in the [FILE_GETTER] file_name_field setting names a directory, not a file, when you are creating book or newspaper issue ingest packages
    • as with the Csv fetcher, the field defined in the [FILE_GETTER] compound_directory_field setting names a directory, not a file, when you are creating compound ingest packages
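
The heading and identifier guidelines above can be checked before you run a toolchain. The following Python sketch is not part of MIK; it assumes you have already loaded the first worksheet into a list of rows (the heading row first) with a spreadsheet library of your choice. The function name check_worksheet and the column name "File" in the usage example are hypothetical.

```python
from collections import Counter


def check_worksheet(rows, record_key, file_name_field):
    """Return a list of problems found in a worksheet.

    `rows` is a list of lists: the first inner list is the heading row,
    the rest are data rows. Loading the .xlsx file into this shape is
    left to whatever spreadsheet library you prefer (an assumption of
    this sketch, not something MIK provides).
    """
    if not rows:
        return ["worksheet is empty"]

    problems = []
    headings = ["" if h is None else str(h).strip() for h in rows[0]]

    # The heading row cannot contain empty headings.
    if "" in headings:
        problems.append("heading row contains an empty heading")

    # All column headings must be unique.
    duplicates = sorted(h for h, n in Counter(headings).items() if n > 1 and h)
    if duplicates:
        problems.append("duplicate headings: " + ", ".join(duplicates))

    # The fields named by record_key and file_name_field must exist.
    for field in (record_key, file_name_field):
        if field not in headings:
            problems.append("missing column: " + field)

    # Every row needs a unique, non-empty identifier.
    if record_key in headings:
        key_index = headings.index(record_key)
        keys = []
        for row in rows[1:]:
            value = row[key_index] if key_index < len(row) else None
            keys.append("" if value is None else str(value).strip())
        if "" in keys:
            problems.append("some rows have no value in " + record_key)
        repeated = sorted(k for k, n in Counter(keys).items() if n > 1 and k)
        if repeated:
            problems.append("repeated identifiers: " + ", ".join(repeated))

    return problems


# Usage: a worksheet with a repeated identifier.
rows = [
    ["CartoonID", "Title", "File"],
    ["c1", "First cartoon", "c1.jpg"],
    ["c1", "Second cartoon", "c2.jpg"],
]
print(check_worksheet(rows, "CartoonID", "File"))
```

An empty list means none of the checked guidelines were violated. The sketch does not check the file format or that the named files and directories exist.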

To use this fetcher in the CSV toolchains, use "Excel" as the value of the [FETCHER] class entry in your .ini file, like this:

[FETCHER]
class = Excel

The Excel fetcher requires input_file, temp_directory, and record_key values, just like the Csv fetcher:

[FETCHER]
class = Excel
input_file = "/home/mark/Downloads/cartoons.xlsx"
temp_directory = "/tmp/cartoons_temp"
record_key = "CartoonID"

However, the Excel fetcher does not need the CSV-specific values field_delimiter, field_enclosure, and escape_character.
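
As a quick sanity check, the [FETCHER] requirements described above can be tested with a short script. This is a minimal sketch, not part of MIK: it uses Python's configparser, which may not read every .ini file exactly the way MIK does, and the function name check_fetcher_section is hypothetical.

```python
import configparser

# Settings the Excel fetcher requires, and CSV-only settings it does not need.
REQUIRED = ("input_file", "temp_directory", "record_key")
CSV_ONLY = ("field_delimiter", "field_enclosure", "escape_character")


def check_fetcher_section(ini_text):
    """Return (missing, unneeded) setting names for an Excel [FETCHER] section."""
    # strict=False tolerates repeated keys; interpolation=None leaves "%" alone.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(ini_text)
    fetcher = parser["FETCHER"]
    missing = [name for name in REQUIRED if name not in fetcher]
    unneeded = [name for name in CSV_ONLY if name in fetcher]
    return missing, unneeded


# Usage: record_key is absent and a CSV-only setting was left behind.
sample = """
[FETCHER]
class = Excel
input_file = "/home/mark/Downloads/cartoons.xlsx"
temp_directory = "/tmp/cartoons_temp"
field_delimiter = ","
"""
missing, unneeded = check_fetcher_section(sample)
print("missing:", missing)
print("unneeded:", unneeded)
```

The script only reports the leftover CSV settings. Whether they cause any harm when left in place is not covered here.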

Note that the Excel fetcher only replaces its Csv equivalent. The other components in the Csv toolchains (the file getter, metadata parser, and writer) all remain the same. For example, if you want to use the Excel fetcher with the CsvSingleFile toolchain, the class entries in your .ini file would look like this:

[FETCHER]
class = Excel

[METADATA_PARSER]
class = mods\CsvToMods

[FILE_GETTER]
class = CsvSingleFile

[WRITER]
class = CsvSingleFile

Cookbook table of contents
