BagIt Importer

Alisha Evans edited this page Jul 18, 2022 · 9 revisions

Bulkrax can import valid BagIt bags, either individually or as multiple bags in a single folder. The bag, or folder of bags, may be supplied in a zip file.

The Bag(s)

Bulkrax assumes that each bag will contain one or more works, with a single metadata file and one or more data files.

A Single Bag

This single bag, containing two images and one metadata file, would be imported as a single Work with two files attached. The metadata file (my_metadata.csv) can be at the top level, as it is here, or in the data folder.

my_bag
  data
    my_image.tif
    my_other_image.jpg
  my_metadata.csv
  (bagit files)
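
The two permitted locations for the metadata file can be checked with a few lines of Ruby. This is a minimal sketch, not Bulkrax's own lookup: `locate_metadata` is a hypothetical helper, and checking the top level before the data folder is an assumption.

```ruby
# Hypothetical helper (not part of Bulkrax): find a bag's metadata file.
# Assumption: the top level of the bag is checked before the data folder.
def locate_metadata(bag_path, filename)
  candidates = [
    File.join(bag_path, filename),
    File.join(bag_path, 'data', filename)
  ]
  # Returns the first path that exists, or nil when neither does.
  candidates.find { |path| File.file?(path) }
end
```

For the bag above, `locate_metadata('my_bag', 'my_metadata.csv')` would return the top-level path.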

Multiple Bags

Each bag in this folder would be imported as a separate work, giving 3 works in total.

folder
  my_bag
    data
      my_image.tif
    more_metadata.csv
    (bagit files)
  my_second_bag
    (structured as per my_bag)
  my_third_bag
    (ditto)
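
To see how a folder of bags breaks down, the sketch below lists every immediate subfolder that looks like a bag. It assumes a bag can be recognised by its bagit.txt declaration, which the BagIt specification requires. `find_bags` is a hypothetical helper, not Bulkrax's implementation.

```ruby
# Hypothetical helper: return the immediate subfolders of `folder` that
# contain a bagit.txt bag declaration, sorted by path.
def find_bags(folder)
  Dir.children(folder)
     .map { |name| File.join(folder, name) }
     .select { |path| File.file?(File.join(path, 'bagit.txt')) }
     .sort
end
```

Each path returned would correspond to one work under the layout shown above.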

Multiple Works (WIP)

This bag would be unpacked to create three works, one per metadata file.

my_bag
  data
    work1
      my_image1.tif
      my_metadata.csv
    work2
      my_image2.tif
      my_metadata.csv
    work3
      my_image3.tif
      my_metadata.csv
  (bagit files)
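
The one-work-per-metadata-file idea could be sketched as follows. Since this layout is marked WIP, treat it as an illustration rather than a description of Bulkrax's behaviour. `works_in_bag` is a hypothetical helper that assumes every work folder sits directly under data.

```ruby
# Hypothetical helper: map each work folder under data/ to its data files.
# A folder counts as a work only if it holds the named metadata file.
def works_in_bag(bag_path, metadata_filename)
  data_dir = File.join(bag_path, 'data')
  Dir.children(data_dir).sort.each_with_object({}) do |name, works|
    work_dir = File.join(data_dir, name)
    next unless File.file?(File.join(work_dir, metadata_filename))

    # Everything in the folder other than the metadata file is a data file.
    works[name] = Dir.children(work_dir).sort - [metadata_filename]
  end
end
```

For the bag above, `works_in_bag('my_bag', 'my_metadata.csv')` would yield three entries, one for each of work1, work2 and work3.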

Metadata

  • If a CSV is supplied, File Sets and Collections can also be imported from the same CSV, e.g. importing a work, file set and collection together:

    model       title          parents        source_identifier
    Work        My work        my_collection  my_work
    Collection  My collection                 my_collection
    FileSet     My file set    my_work        my_file

  • If there are multiple bags, or multiple works, each metadata file MUST have the same filename and MUST be co-located with the data files (as per the example above).

  • Metadata can be supplied as RDF or CSV.
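
A metadata CSV like the work, collection and file set example in this list can be generated with Ruby's csv library. This is only a sketch: the column names and values come from that example, the empty parents cell for the collection is an assumption, and `build_metadata_csv` is a hypothetical helper.

```ruby
require 'csv'

# Hypothetical helper: build the example metadata rows as a CSV string.
def build_metadata_csv
  headers = %w[model title parents source_identifier]
  rows = [
    ['Work',       'My work',       'my_collection', 'my_work'],
    # Assumption: the collection has no parent, so its parents cell is empty.
    ['Collection', 'My collection', nil,             'my_collection'],
    ['FileSet',    'My file set',   'my_work',       'my_file']
  ]
  CSV.generate do |csv|
    csv << headers
    rows.each { |row| csv << row }
  end
end

puts build_metadata_csv
```

The work's parents value matches the collection's source_identifier, and the file set's parents value matches the work's, which is how the three rows are tied together.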

Creating Bags for Import

There are various tools for creating BagIt bags. For example, using the Ruby 'bagit' gem in an irb console:

gem install bagit
irb
> require 'bagit'
> # make a new bag from existing files
> bag = BagIt::Bag.new path_to_files
  # e.g.: bag = BagIt::Bag.new '/Users/computer_name/Work/my_bag'
> bag.manifest!(algo: 'sha256') # ref: https://www.geeksforgeeks.org/difference-between-sha1-and-sha256/
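
For a sense of what a sha256 manifest records, the sketch below computes the same kind of entries using only Ruby's standard library. It is not the bagit gem's code: `sha256_manifest_lines` is a hypothetical helper, and it assumes the payload lives under data and that each manifest line is a checksum followed by the file's path relative to the bag.

```ruby
require 'digest'
require 'pathname'

# Hypothetical helper: one "<sha256> <relative path>" line per payload file.
def sha256_manifest_lines(bag_path)
  bag = Pathname.new(bag_path)
  Dir.glob(File.join(bag_path, 'data', '**', '*'))
     .select { |path| File.file?(path) }
     .sort
     .map do |path|
       relative = Pathname.new(path).relative_path_from(bag).to_s
       "#{Digest::SHA256.file(path).hexdigest} #{relative}"
     end
end
```

Comparing these lines with the manifest file written by the gem is one way to spot a payload file that changed after the bag was created.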