
Files_Directory_Importer

Julie Allinson edited this page Mar 19, 2018 · 4 revisions

Files Directory Importer

This is a simple importer for directories of files.

Add to a Hyrax or Hyku repository with:

rails g hyku_leaf:importers -f

Run:

bin/import_files_to_existing_objects <server> <path_to_csv_file> <path_to_directory> <depth>

For example:

bin/import_files_to_existing_objects localhost path/to/my/file.csv path/to/my/files 0

What it does

  • Given a CSV file with two columns, the script imports files from a directory (or its sub-directories) and attaches them to existing Works.

Instructions

CSV File

  • The CSV file must contain two columns and no header row.
  • Column one must contain the id of an existing Work to which you want to add FileSets/Files.
  • Column two must contain the name of a file or directory within the files directory provided.
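The CSV rules above can be checked before running an import. The following is a minimal sketch using Ruby's standard `csv` library; the helper name `read_import_rows` is hypothetical and is not part of the importer itself.

```ruby
require 'csv'

# Hypothetical helper (not the importer's own code): parse the two-column,
# headerless CSV into [work_id, file_or_directory] pairs, rejecting any row
# that does not have exactly two non-empty columns.
def read_import_rows(csv_text)
  CSV.parse(csv_text, headers: false).each_with_index.map do |row, index|
    unless row.length == 2 && row.none? { |cell| cell.nil? || cell.strip.empty? }
      raise ArgumentError, "line #{index + 1}: expected exactly two non-empty columns"
    end
    row.map(&:strip)
  end
end

rows = read_import_rows("12345,file_to_ingest.pdf\n67890,directory_for_67890\n")
rows.each { |work_id, entry| puts "#{work_id} <- #{entry}" }
```

A check like this catches a stray header row or a missing column before any files are attached to the wrong Work.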

Files Directory and Depth Parameter

The depth parameter tells the importer where the files to be ingested are:

  • A depth of 0 means that the files are contained directly within the files directory. The filename must match the value in the second column of the CSV file.
  • A depth of 1 means that the files are contained within a folder within the files directory. The folder name must match the value in the second column of the CSV file.
  • A depth of 2 or more means that there are sub-directories beneath the folder named in the CSV file.

For example:

Command:

bin/import_files_to_existing_objects localhost csv_file files_directory 0

CSV:

12345,file_to_ingest.pdf

File to ingest:

files_directory\file_to_ingest.pdf

Command:

bin/import_files_to_existing_objects localhost csv_file files_directory 1

CSV:

12345,directory_for_12345

Directory to ingest - all files in this directory are ingested to the object:

files_directory\directory_for_12345

Command:

bin/import_files_to_existing_objects localhost csv_file files_directory 2

CSV:

12345,directory_for_12345\

Directory to ingest - all files in these directories are ingested to the object:

files_directory\directory_for_12345\01\
files_directory\directory_for_12345\02\
files_directory\directory_for_12345\03\
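The depth rules illustrated above can be sketched in Ruby as follows. This is an illustration only, not the importer's actual code: the helper name `files_for_entry` is hypothetical, and it assumes that files sit exactly `depth` levels below the files directory, which may be stricter than what the importer really does.

```ruby
# Hypothetical helper: list the files that one CSV row would pick up
# for a given depth, under the assumption stated above.
def files_for_entry(files_directory, entry, depth)
  base = File.join(files_directory, entry)
  return [base] if depth.zero? # depth 0: the entry is the file itself

  # depth 1: files directly inside the entry folder.
  # depth 2+: files (depth - 1) sub-directory levels further down.
  pattern = File.join(base, *Array.new(depth - 1, '*'), '*')
  Dir.glob(pattern).select { |path| File.file?(path) }.sort
end

puts files_for_entry('files_directory', 'file_to_ingest.pdf', 0)
# prints "files_directory/file_to_ingest.pdf"
```

With a depth of 2, the same helper would look for files matching `files_directory/directory_for_12345/*/*`, which covers the `01`, `02` and `03` sub-directories in the last example.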

Files Order

When ingesting from a directory rather than individual files (i.e. the depth is greater than 0), the files are sorted alphabetically for ingest. Therefore, if order matters, make sure the directory contents are in the correct order when sorted alphabetically.
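A common pitfall with alphabetical ordering is unpadded numbers in filenames. The sketch below assumes a plain string sort, which is how Ruby's `Array#sort` orders strings; the helper `zero_pad_numbers` is hypothetical and shows one way to rename files so that alphabetical and numeric order agree.

```ruby
names = ['page_2.tif', 'page_10.tif', 'page_1.tif']

# A plain string sort places "page_10" before "page_2".
sorted = names.sort
# => ["page_1.tif", "page_10.tif", "page_2.tif"]

# Hypothetical helper: left-pad every run of digits in the filename stem
# with zeros, leaving the extension untouched.
def zero_pad_numbers(name, width = 4)
  ext = File.extname(name)
  stem = File.basename(name, ext)
  stem.gsub(/\d+/) { |digits| digits.rjust(width, '0') } + ext
end

padded = names.map { |name| zero_pad_numbers(name) }.sort
# => ["page_0001.tif", "page_0002.tif", "page_0010.tif"]
```

Renaming files along these lines before ingest keeps page-like sequences in their intended order.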

The Code

The importer code can be found here:

lib/importer/directory_files_importer.rb
lib/importer/files_parser.rb 

Extending the importer

The importer could be extended:

  • to create new objects with minimal metadata
  • to add metadata to the files / filesets themselves (e.g. a fileset title, or visibility setting)