Overview

Importu is a framework and DSL for simplifying the process of importing structured data into your application. It is also a tool for separating import-related business logic from the rest of your code.

Currently supported source formats are CSV/TSV, XML, and JSON. Extending Importu to handle additional formats is straightforward; see the lib/importu/importer directory for implementations of the supported importers.

The current version of Importu depends on both ActiveRecord and ActiveSupport, which will become optional in a future release.

Example

Please read the tutorial in the import-examples repository for a more complete overview of available features.

Assuming you have the following data in the file data.csv:

"isbn10","title","author","release_date","pages"
"0596516177","The Ruby Programming Language","David Flanagan and Yukihiro Matsumoto","Feb 1, 2008","448"
"1449355978","Computer Science Programming Basics in Ruby","Ophir Frieder, Gideon Frieder and David Grossman","May 1, 2013","188"
"0596523696","Ruby Cookbook"," Lucas Carlson and Leonard Richardson","Jul 26, 2006","910"

You can create a minimal importer to read the CSV data:

class BookImporter < Importu::Importer::Csv
  # fields we expect to find in the CSV file; field order is not important
  fields :title, :author, :isbn10, :pages, :release_date
end

And then load that data in your application:

require 'importu'

filename = File.expand_path('../data.csv', __FILE__)
importer = BookImporter.new(filename)

# importer.records returns an Enumerable
importer.records.count # => 3
importer.records.select {|r| r[:author] =~ /Matsumoto/ }.count # => 1
importer.records.each do |record|
  # ...
end

importer.records.map(&:to_hash)
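
To preview the shape of this data without the gem installed, the same queries can be approximated with Ruby's standard CSV library. This is only a rough stand-in under stated assumptions, not how Importu is implemented: the `data` string and `rows` variable are illustrative, and the symbol keys simply mirror the `r[:author]` access used above.

```ruby
require 'csv'

# Inline copy of data.csv so the sketch runs on its own.
data = <<~ROWS
  "isbn10","title","author","release_date","pages"
  "0596516177","The Ruby Programming Language","David Flanagan and Yukihiro Matsumoto","Feb 1, 2008","448"
  "1449355978","Computer Science Programming Basics in Ruby","Ophir Frieder, Gideon Frieder and David Grossman","May 1, 2013","188"
  "0596523696","Ruby Cookbook"," Lucas Carlson and Leonard Richardson","Jul 26, 2006","910"
ROWS

# Parse with a header row and convert each row to a symbol-keyed Hash.
# Values stay as raw strings; no conversion rules are applied here.
rows = CSV.parse(data, headers: true).map do |row|
  row.to_h.transform_keys(&:to_sym)
end

rows.count                                           # => 3
rows.select { |r| r[:author] =~ /Matsumoto/ }.count  # => 1
```

Note that every value, including `pages` and `release_date`, is still a plain string at this point.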

A more complete example of the book importer above might look like the following:

require 'importu'

class BookImporter < Importu::Importer::Csv
  # if you want to define multiple fields with similar rules, use 'fields'
  # NOTE: ':required => true' is redundant in this example; any defined
  # fields must have a corresponding column in the source data by default
  fields :title, :isbn10, :authors, :required => true

  # to mark a field as optional in the source data
  field :pages, :required => false

  # you can reference the same field multiple times and apply rules
  # incrementally; this provides a lot of flexibility in describing your
  # importer rules, such as grouping all the required fields together and
  # explicitly stating that "these are required"; the importer becomes the
  # reference document:
  #
  # fields :title, :isbn10, :authors, :release_date, :required => true
  # fields :pages, :required => false
  #
  # ...or keep all the rules for that field with that field, whatever makes
  # sense for your particular use case.

  # if your field is not named the same as the source data, you can use
  # :label => '...' to reference the correct field, where the label is what
  # the field is labelled in the source data
  field :authors, :label => 'author'

  # you can convert fields using one of the built-in converters
  field :pages, &convert_to(:integer)
  field :release_date, &convert_to(:date) # date format is guessed

  # some converters allow you to pass additional arguments; in the case of
  # the date converter, you can pass an explicit format and it will raise an
  # error if a date is encountered that doesn't match
  field :release_date, &convert_to(:date, :format => '%b %d, %Y')

  # passing a block to a field definition allows you to add your own logic
  # for converting data or checking for unexpected values
  field :authors do
    value = clean(:authors) # apply :clean converter which strips whitespace
    authors = value ? value.split(/(?:, )|(?: and )|(?: & )/i) : []

    if authors.none?
      # ArgumentError will be converted to an Importu::FieldParseError, which
      # will include the name of the field affected
      raise ArgumentError, "at least one author is required"
    end

    authors
  end

  # abstract fields that are not part of the original data set can be created
  field :by_matz, :abstract => true do
    # field conversion rules can reference other fields; the field value is
    # what would be returned after the referenced field's rules have been applied
    field_value(:authors).include?('Yukihiro Matsumoto')
  end
end
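
The conversions described in the comments above can be tried in isolation with plain Ruby. The sketch below is an approximation that does not call Importu: `split_authors` is a hypothetical helper, `strip` stands in for the `:clean` converter (which may do more than strip whitespace), and `Integer` and `Date.strptime` stand in for the `:integer` and `:date` converters.

```ruby
require 'date'

# Hypothetical helper mirroring the :authors block: strip whitespace, then
# split on ", ", " and " or " & " (case-insensitive).
def split_authors(raw)
  value = raw.to_s.strip
  authors = value.empty? ? [] : value.split(/(?:, )|(?: and )|(?: & )/i)
  raise ArgumentError, "at least one author is required" if authors.empty?
  authors
end

authors = split_authors(" Lucas Carlson and Leonard Richardson")
# => ["Lucas Carlson", "Leonard Richardson"]

# Plain-Ruby stand-ins for the integer and date conversions.
pages        = Integer("910", 10)
release_date = Date.strptime("Jul 26, 2006", "%b %d, %Y")

# The abstract :by_matz field is derived from the converted authors value.
by_matz = authors.include?("Yukihiro Matsumoto") # => false
```

Passing an explicit format to `Date.strptime` raises an error for a date that does not match, which is the same behavior the comments describe for the `:format` option.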

A more condensed version of the above, with all the rules grouped into individual field definitions:

class BookImporter < Importu::Importer::Csv
  fields :title, :isbn10

  field :authors, :label => 'author' do
    authors = clean(:authors).to_s.split(/(?:, )|(?: and )|(?: & )/i)
    raise ArgumentError, "at least one author is required" if authors.none?
    
    authors
  end

  field :pages, :required => false, &convert_to(:integer)
  field :release_date, &convert_to(:date, :format => '%b %d, %Y')

  field :by_matz, :abstract => true do
    field_value(:authors).include?('Yukihiro Matsumoto')
  end
end

Rails / ActiveRecord

If you define a model in the importer definition and the importer fields have the same names as the attributes in your model, Importu can iterate through the source data and create or update records for you:

class BookImporter < Importu::Importer::Csv
  model 'Book'

  # ...
end

filename = File.expand_path('../data.csv', __FILE__)
importer = BookImporter.new(filename)

importer.import!

importer.total # => 3
importer.invalid # => 0
importer.created # => 3
importer.updated # => 0
importer.unchanged # => 0

importer.import!

importer.total # => 3
importer.created # => 0
importer.unchanged # => 3
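
The counters above follow from create-or-update semantics: a second run over identical data creates nothing and leaves every record unchanged. The sketch below illustrates that idea with an in-memory Hash instead of ActiveRecord. It is not Importu's implementation; the `store`, `counts`, and `import` names are hypothetical, and matching existing records by `isbn10` is an assumption, since the text above does not say how existing records are found.

```ruby
# In-memory stand-in for a Book table, keyed by ISBN (an assumption).
store  = {}
counts = Hash.new(0)

# Hypothetical import routine: create missing records, update changed
# ones, and count records that are already up to date.
import = lambda do |records|
  counts.clear
  records.each do |record|
    counts[:total] += 1
    key = record[:isbn10]
    if !store.key?(key)
      store[key] = record.dup
      counts[:created] += 1
    elsif store[key] == record
      counts[:unchanged] += 1
    else
      store[key] = record.dup
      counts[:updated] += 1
    end
  end
  counts.dup
end

records = [
  { isbn10: "0596516177", title: "The Ruby Programming Language" },
  { isbn10: "0596523696", title: "Ruby Cookbook" }
]

first  = import.call(records) # first[:created]    # => 2
second = import.call(records) # second[:unchanged] # => 2
```

Because the second run compares each incoming record with what is already stored, repeating an import is safe: only new or changed records cause writes.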

hungrything/importu
