vigetlabs / simple_importer

Simple API for importing from csv.

This URL has Read+Write access

gotascii (author)
Thu Oct 22 18:09:19 -0700 2009
commit  cce4d7eb9cb9b738b880f42e117e4c747a5826cd
tree    8a4fdfc708bdf2349fe00b5aea1d01bccd2fe087
parent  181508367f5aa74d7882a1ce5e68d0c78ac3c8bc
name age message
file .gitignore Loading commit data...
file LICENSE Tue Oct 20 05:50:32 -0700 2009 switching to jeweler [gotascii]
file README.rdoc
file Rakefile
file VERSION
directory data/
directory importers/
directory lib/
file simple_importer.gemspec
directory test/
README.rdoc

simple_importer

Simple API for importing csv files.

Defining importers

The basic importer has a name, file, and a foreach block.

  importer :items do
    file 'items.csv'

    foreach do |row|
      Item.create(:name => row[:name])
    end
  end

simple_importer uses fastercsv, which replaces the csv standard lib in ruby1.9. Importers have configuration options that match those of fastercsv. The foreach block is called for each row in the csv file. The argument passed into the foreach block is a fastercsv row created with the specified configuration.

  importer :items do
    file 'items.csv'

    # all fastercsv options are supported
    headers false
    converters :numeric
    # etc..

    foreach do |row|
      Item.create(:name => row[:name])
    end
  end

Importers have a default configuration that matches the following importer.

  importer :items do
    file 'items.csv'

    # default configuration
    col_sep ","
    row_sep :auto
    quote_char '"'
    field_size_limit nil
    converters :all
    unconverted_fields nil
    headers true
    return_headers false
    header_converters :symbol
    skip_blanks false
    force_quotes false
    # end default configuration

    foreach do |row|
      Item.create(:name => row[:name])
    end
  end

Before callback

Define before_filter-style callbacks in the form of the before block. This block is run once before the importer runs.

  importer :items do
    file 'items.csv'

    before do
      Item.delete_all
    end

    foreach do |row|
      Item.create(:name => row[:name])
    end
  end

File processing

Passing an array of file paths to the file configuration field will process all the rows in each file.

  importer :items do
    file Dir.glob("data/*.csv")

    foreach do |row|
      Item.create(:name => row[:name])
    end
  end

Use the foreach_file block to add per-file behavior.

  importer :items do
    file Dir.glob("data/*.csv")

    foreach_file do |file|
      List.create(:name => File.basename(file, ".csv"))
    end

    foreach do |row|
      Item.create(:name => row[:name])
    end
  end

Nest the foreach block to give your row processing block access to foreach_file block variables.

  importer :items do
    file Dir.glob("data/*.csv")

    foreach_file do |file|
      list = List.create(:name => File.basename(file, ".csv"))

      foreach do |row|
        Item.create(:name => row[:name], :list => list)
      end
    end
  end

Loading and running importers

simple_importer provides a method that will find and load importer definitions in the importers, lib/importers, and app/importers directories.

  # importer definition is in lib/importers/item.rb
  SimpleImporter.find_importers

SimpleImporter now has a collection of importers and each importer can be run now that it has been loaded.

  # the following will fire off the actual importers
  SimpleImporter.importers.each do |importer|
    importer.run
  end

Rake tasks and autoloaded importers

simple_importer provides rake tasks to run importers individually or all at once.

  # add this to your rakefile
  # lib/tasks/simple_importer.rake is a good place in your rails app
  require 'simple_importer/tasks'

rake -T simple_importer will show a rake task for each importer as well as an import task which runs all of the importers. Descriptions defined in the importer will be used for the rake task description.

  importer :items do
    file 'items.csv'
    desc 'Import all of the cool items'

    ...
  end

Requirements

  • fastercsv (if you are not using ruby 1.9)

Note on Patches/Pull Requests

  • Fork the project.
  • Make your feature addition or bug fix.
  • Add tests for it. This is important so I don’t break it in a future version unintentionally.
  • Commit, do not mess with rakefile, version, or history. (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)
  • Send me a pull request. Bonus points for topic branches.

Copyright

Copyright © 2009 Justin Marney. See LICENSE for details.