initial commit

commit 20e3f723291348ea907728156d4070d08b46357c 0 parents
@ryanb authored
20 LICENSE
@@ -0,0 +1,20 @@
+Copyright (c) 2010 Ryan Bates
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
74 README.rdoc
@@ -0,0 +1,74 @@
+= Importex
+
+This Ruby gem helps import an Excel document into a database or some other format. Just create a class defining the columns and pass in a path to an "xls" file. It will automatically format the columns into specified Ruby objects and raise errors on bad data.
+
+This is extracted from an internal set of administration scripts used for importing products into an e-commerce application. Rather than going through a web interface or directly into an SQL database, it is easiest to fill out an Excel spreadsheet with a row for each product, and filter that through a Ruby script.
+
+Note: This library has some hacks and is not intended to be a full-featured, production-quality library. I designed it to fit my needs for importing records through internal administration scripts.
+
+
+== Installation
+
+It is not yet available as a gem, but will be soon.
+
+  gem install importex
+
+In the meantime you'll have to include the lib files directly.
+
+Note: This relies on the parseexcel gem so you will need to have that installed as well.
+
+
+== Usage
+
+First create a class which inherits from Importex::Base and define the columns there.
+
+  require 'importex'
+  class Product < Importex::Base
+    column "Name", :required => true
+    column "Price", :format => /^\d+\.\d\d$/, :required => true
+    column "Amount in Stock", :type => Integer
+    column "Release Date", :type => Date
+    column "Discontinued", :type => Boolean
+  end
+
+Pass the path to an "xls" file to the "import" class method to load the data. It expects the first row to hold the column names and every row after that to be a record.
+
+  Product.import("path/to/products.xls")
+
+Use the "all" method to fetch the product instances for all of the records. You can access the columns like a hash.
+
+  products = Product.all
+  products.first["Discontinued"] # => false
+
+It is up to you to import this data into the database or other location. You can do this through something like Active Record, DataMapper, or Sequel.
+
+
+== Handling Bad Data
+
+If the Excel document is formatted improperly it will raise some form of Importex::ImportError exception. I recommend rescuing from this and handling it in a clean way for the user so they do not get a full stack trace.
+
+  begin
+    Product.import(...)
+  rescue Importex::ImportError => e
+    puts e.message
+  end
+
+
+== Custom Types
+
+It is possible to have smart columns which reference other Ruby objects. Importex calls a class method named "importex_value" on the column's type, passes it the Excel cell content, and expects a Ruby object in return. Let's say you have a Category model in Active Record and you have the name of the category in the Products Excel sheet.
+
+  class Category < ActiveRecord::Base
+    def self.importex_value(str)
+      find_by_name!(str)
+    rescue ActiveRecord::RecordNotFound
+      raise Importex::InvalidCell, "No category with that name."
+    end
+  end
+
+  class Product < Importex::Base
+    column "Category", :type => Category
+  end
+
+Then product["Category"] will return an instance of the found Category.
+
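The Custom Types section of the README says any class with an "importex_value" class method can serve as a column type. The sketch below shows that contract without Active Record. `StockStatus` and its lookup table are hypothetical names invented for illustration, and the `Importex` error classes are redefined here only as stand-ins so the snippet runs without the gem installed.

```ruby
# Stand-ins for the error classes declared in lib/importex.rb,
# so this sketch runs on its own without the gem.
module Importex
  class ImportError < StandardError; end
  class InvalidCell < ImportError; end
end

# Hypothetical custom type (not part of Importex): turns a cell such as
# "In Stock" into a symbol, or raises InvalidCell for unknown text.
class StockStatus
  KNOWN = {
    "in stock"  => :in_stock,
    "backorder" => :backorder,
    "sold out"  => :sold_out
  }.freeze

  def self.importex_value(str)
    key = str.to_s.strip.downcase
    return nil if key.empty? # treat an empty cell as "no value"
    KNOWN.fetch(key) { raise Importex::InvalidCell, "Unknown stock status." }
  end
end

# With the gem loaded, the type would be attached to a column like this:
#   column "Availability", :type => StockStatus
```

Raising `Importex::InvalidCell` rather than a generic error matters because `Column#cell_value` rescues exactly that class and adds the column name and row number to the message.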
12 Rakefile
@@ -0,0 +1,12 @@
+require 'rubygems'
+require 'rake'
+require 'echoe'
+require 'spec/rake/spectask'
+
+spec_files = Rake::FileList["spec/**/*_spec.rb"]
+
+desc "Run specs"
+Spec::Rake::SpecTask.new do |t|
+  t.spec_files = spec_files
+  t.spec_opts = ["-c"]
+end
12 lib/importex.rb
@@ -0,0 +1,12 @@
+require 'rubygems'
+require 'parseexcel'
+
+require File.expand_path(File.dirname(__FILE__) + '/importex/base')
+require File.expand_path(File.dirname(__FILE__) + '/importex/column')
+require File.expand_path(File.dirname(__FILE__) + '/importex/ruby_additions')
+
+module Importex
+  class ImportError < StandardError; end
+  class InvalidCell < ImportError; end
+  class MissingColumn < ImportError; end
+end
53 lib/importex/base.rb
@@ -0,0 +1,53 @@
+module Importex
+  class Base
+    attr_reader :attributes
+
+    def self.column(*args)
+      @columns ||= []
+      @columns << Column.new(*args)
+    end
+
+    def self.import(path, worksheet_index = 0)
+      @records ||= []
+      workbook = Spreadsheet::ParseExcel.parse(path)
+      worksheet = workbook.worksheet(worksheet_index)
+      columns = worksheet.row(0).map do |cell|
+        @columns.detect { |column| column.name == cell.to_s('latin1') }
+      end
+      (@columns.select(&:required?) - columns).each do |column|
+        raise MissingColumn, "Column #{column.name} is required but it doesn't exist."
+      end
+      (1...worksheet.num_rows).each do |row_number|
+        row = worksheet.row(row_number)
+        unless row.at(0).nil?
+          attributes = {}
+          columns.each_with_index do |column, index|
+            if column
+              if row.at(index).nil?
+                value = ""
+              elsif row.at(index).type == :date
+                value = row.at(index).date.strftime("%Y-%m-%d %H:%M:%S")
+              else
+                value = row.at(index).to_s('latin1')
+              end
+              attributes[column.name] = column.cell_value(value, row_number)
+            end
+          end
+          @records << new(attributes)
+        end
+      end
+    end
+
+    def self.all
+      @records
+    end
+
+    def initialize(attributes = {})
+      @attributes = attributes
+    end
+
+    def [](name)
+      @attributes[name]
+    end
+  end
+end
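The header handling in `Base.import` can be hard to follow inside the larger method. Here is a minimal sketch of just that step, using plain strings in place of parseexcel cells. `ColumnSketch` and `map_headers` are hypothetical names used for illustration; the `Importex` error classes are stand-ins so the snippet runs without the gem.

```ruby
# Stand-ins for the error classes declared in lib/importex.rb.
module Importex
  class ImportError < StandardError; end
  class MissingColumn < ImportError; end
end

# Hypothetical stand-in for Importex::Column: only a name and a required flag.
ColumnSketch = Struct.new(:name, :required)

# Maps each header cell to the declared column with the same name (nil when
# the sheet has a column nobody declared), then raises if a required column
# never showed up in the header row.
def map_headers(header_cells, declared)
  mapped = header_cells.map do |cell|
    declared.detect { |column| column.name == cell }
  end
  (declared.select(&:required) - mapped).each do |column|
    raise Importex::MissingColumn,
          "Column #{column.name} is required but it doesn't exist."
  end
  mapped
end

name = ColumnSketch.new("Name", true)
age  = ColumnSketch.new("Age", false)
mapped = map_headers(["Age", "Notes", "Name"], [name, age])
puts mapped.map { |column| column ? column.name : "(ignored)" }.inspect
```

Keeping `nil` placeholders in the mapped array preserves the sheet's column positions, which is why the row loop in `Base.import` can index cells with `each_with_index` and skip undeclared columns.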
37 lib/importex/column.rb
@@ -0,0 +1,37 @@
+module Importex
+  class Column
+    attr_reader :name
+
+    def initialize(name, options = {})
+      @name = name
+      @type = options[:type]
+      @format = [options[:format]].compact.flatten
+      @required = options[:required]
+    end
+
+    def cell_value(str, row_number)
+      validate_cell(str)
+      @type ? @type.importex_value(str) : str
+    rescue InvalidCell => e
+      raise InvalidCell, "#{str} (column #{name}, row #{row_number+1}) does not match required format: #{e.message}"
+    end
+
+    def validate_cell(str)
+      if @format && !@format.empty? && !@format.any? { |format| match_format?(str, format) }
+        raise InvalidCell, @format.reject { |r| r.kind_of? Proc }.inspect
+      end
+    end
+
+    def match_format?(str, format)
+      case format
+      when String then str == format
+      when Regexp then str =~ format
+      when Proc then format.call(str)
+      end
+    end
+
+    def required?
+      @required
+    end
+  end
+end
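The `:format` option accepts a single String, Regexp, or Proc, or an array mixing them, and a cell passes if any one entry matches. The standalone sketch below mirrors that rule for experimentation outside the gem. `format_ok?` is a hypothetical helper name, not a method Importex defines.

```ruby
# Hypothetical helper mirroring Column#validate_cell and #match_format?:
# returns true when the cell text satisfies at least one format entry,
# or when no format was given at all.
def format_ok?(str, format_option)
  formats = [format_option].compact.flatten # same normalisation as Column#initialize
  return true if formats.empty?

  formats.any? do |format|
    case format
    when String then str == format            # exact match
    when Regexp then !(str =~ format).nil?    # pattern match
    when Proc   then !!format.call(str)       # arbitrary predicate
    else false
    end
  end
end

puts format_ok?("27.0", ["", /^[.\d]+$/])             # blank or numeric text
puts format_ok?("42.0", lambda { |age| age.to_i < 30 })
```

Numeric cells arrive as text such as "27.0" (see the expected error messages in the specs), so a lambda format usually needs to convert the string before comparing.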
51 lib/importex/ruby_additions.rb
@@ -0,0 +1,51 @@
+class Integer
+  def self.importex_value(str)
+    unless str.blank?
+      if str =~ /^[.\d]+$/
+        str.to_i
+      else
+        raise Importex::InvalidCell, "Not a number."
+      end
+    end
+  end
+end
+
+class Float
+  def self.importex_value(str)
+    unless str.blank?
+      if str =~ /^[.\d]+$/
+        str.to_f
+      else
+        raise Importex::InvalidCell, "Not a number."
+      end
+    end
+  end
+end
+
+class Boolean
+  def self.importex_value(str)
+    !["", "f", "F", "n", "N", "0"].include?(str)
+  end
+end
+
+class Date
+  def self.importex_value(str)
+    !["", "f", "F", "n", "N", "0"].include?(str)
+  end
+end
+
+class Time
+  def self.importex_value(str)
+    Time.parse(str) unless str.blank?
+  rescue ArgumentError
+    raise Importex::InvalidCell, "Not a time."
+  end
+end
+
+class Date
+  def self.importex_value(str)
+    Date.parse(str) unless str.blank?
+  rescue ArgumentError
+    raise Importex::InvalidCell, "Not a date."
+  end
+end
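Two details of the numeric conversions above are worth noting. `blank?` comes from ActiveSupport, which spec_helper.rb requires but lib/importex.rb does not, and the pattern `/^[.\d]+$/` also accepts text such as "1.2.3". The sketch below is one possible stricter variant that needs no ActiveSupport. `parse_integer_cell` is a hypothetical name, and the `Importex` error classes are stand-ins so the snippet runs without the gem.

```ruby
# Stand-ins for the error classes declared in lib/importex.rb.
module Importex
  class ImportError < StandardError; end
  class InvalidCell < ImportError; end
end

# Hypothetical alternative to Integer.importex_value: returns nil for an
# empty cell, an Integer for text like "27" or "27.0", and raises
# InvalidCell for anything else (including "1.2.3").
def parse_integer_cell(str)
  text = str.to_s.strip
  return nil if text.empty? # plain-Ruby replacement for blank?
  unless text =~ /\A\d+(\.\d+)?\z/
    raise Importex::InvalidCell, "Not a number."
  end
  text.to_i # truncates any fractional part, as String#to_i does
end

puts parse_integer_cell("27.0").inspect
puts parse_integer_cell("").inspect
```

This variant rejects forms like ".5" and "27." that the original pattern accepts, so it is a behavior change rather than a drop-in replacement.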
BIN  spec/fixtures/simple.xls
Binary file not shown
62 spec/importex/base_spec.rb
@@ -0,0 +1,62 @@
+require File.expand_path(File.dirname(__FILE__) + '/../spec_helper')
+
+describe Importex::Base do
+  before(:each) do
+    @simple_class = Class.new(Importex::Base)
+    @xls_file = File.dirname(__FILE__) + '/../fixtures/simple.xls'
+  end
+
+  it "should import simple excel doc" do
+    @simple_class.column "Name"
+    @simple_class.column "Age", :type => Integer
+    @simple_class.import(@xls_file)
+    @simple_class.all.map(&:attributes).should == [{"Name" => "Foo", "Age" => 27}, {"Name" => "Bar", "Age" => 42}]
+  end
+
+  it "should import only the column given and ignore others" do
+    @simple_class.column "Age", :type => Integer
+    @simple_class.column "Nothing"
+    @simple_class.import(@xls_file)
+    @simple_class.all.map(&:attributes).should == [{"Age" => 27}, {"Age" => 42}]
+  end
+
+  it "should add restrictions through an array of strings or regular expressions" do
+    @simple_class.column "Age", :format => ["foo", /bar/]
+    lambda {
+      @simple_class.import(@xls_file)
+    }.should raise_error(Importex::InvalidCell, '27.0 (column Age, row 2) does not match required format: ["foo", /bar/]')
+  end
+
+  it "should support a lambda as a requirement" do
+    @simple_class.column "Age", :format => lambda { |age| age.to_i < 30 }
+    lambda {
+      @simple_class.import(@xls_file)
+    }.should raise_error(Importex::InvalidCell, '42.0 (column Age, row 3) does not match required format: []')
+  end
+
+  it "should have some default requirements" do
+    @simple_class.column "Name", :type => Integer
+    lambda {
+      @simple_class.import(@xls_file)
+    }.should raise_error(Importex::InvalidCell, 'Foo (column Name, row 2) does not match required format: Not a number.')
+  end
+
+  it "should have a [] method which returns attributes" do
+    simple = @simple_class.new("Foo" => "Bar")
+    simple["Foo"].should == "Bar"
+  end
+
+  it "should import if it matches one of the requirements given in array" do
+    @simple_class.column "Age", :type => Integer, :format => ["", /^[.\d]+$/]
+    @simple_class.import(@xls_file)
+    @simple_class.all.map(&:attributes).should == [{"Age" => 27}, {"Age" => 42}]
+  end
+
+  it "should raise an exception if required column is missing" do
+    @simple_class.column "Age", :required => true
+    @simple_class.column "Foo", :required => true
+    lambda {
+      @simple_class.import(@xls_file)
+    }.should raise_error(Importex::MissingColumn, "Column Foo is required but it doesn't exist.")
+  end
+end
4 spec/importex/column_spec.rb
@@ -0,0 +1,4 @@
+require File.expand_path(File.dirname(__FILE__) + '/../spec_helper')
+
+describe Importex::Column do
+end
9 spec/spec_helper.rb
@@ -0,0 +1,9 @@
+require 'rubygems'
+require 'spec'
+require 'active_support'
+require 'fileutils'
+require File.dirname(__FILE__) + '/../lib/importex'
+
+Spec::Runner.configure do |config|
+  config.mock_with :rr
+end