significantly faster update tasks #33

Open
schwern opened this issue Aug 20, 2021 · 1 comment

schwern commented Aug 20, 2021

Hi, I used your gem as a template for making my own Zipcode and State models. I stripped down the models for my own needs.

I've significantly sped up the update tasks by 1) caching the States in memory, 2) parsing the CSV directly from the URI, and 3) doing a bulk insert. Now it will update in a few seconds.

The bulk insert is done by activerecord-import. upsert_all will also work. I prefer activerecord-import because it will run validations and does not require Rails 6.
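For comparison, a minimal sketch of what the `upsert_all` route might look like. It assumes Rails 6 or later and a unique index on `abbr`. Note that `upsert_all` skips validations and callbacks, which is the trade-off mentioned above. `upsert_states` and `FakeState` are hypothetical names used only so the snippet runs without a database; in an app the model passed in would be `State`.

```ruby
# Sketch only: `model` is expected to be an ActiveRecord class on Rails 6+.
# unique_by: :abbr assumes a unique index exists on the abbr column.
def upsert_states(model, rows)
  model.upsert_all(rows, unique_by: :abbr)
end

# Hypothetical stand-in for an ActiveRecord model. It only records the
# arguments it receives, so the call shape can be checked without Rails.
class FakeState
  class << self
    attr_reader :calls

    def upsert_all(rows, unique_by:)
      (@calls ||= []) << { rows: rows, unique_by: unique_by }
      rows.length
    end
  end
end

# Illustrative rows in the same shape the rake task builds from the CSV.
STATE_ROWS = [
  { abbr: 'OR', name: 'Oregon' },
  { abbr: 'TX', name: 'Texas' }
].freeze

upsert_states(FakeState, STATE_ROWS)
```

In a real task the call would be `upsert_states(State, states)`, replacing the `State.import(...)` call.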

Here is the version for my stripped-down models.

require 'csv'
require 'open-uri' # adds URI.open for HTTP(S) URLs

namespace :zipcodes do
  desc "Update states table"
  task update_states: :environment do
    puts ">>> Begin update of states table..."
    csv = CSV.new(
      URI.open("https://github.com/midwire/free_zipcode_data/raw/master/all_us_states.csv"),
      headers: true
    )

    states = csv.map do |row|
      {
        abbr: row['abbr'],
        name: row['name']
      }
    end

    State.import(
      states,
      on_duplicate_key_update: {
        conflict_target: [:abbr],
        columns: [:abbr, :name]
      }
    )

    puts ">>> End update of states table..."
  end

  desc "Update zipcodes table"
  task update_zipcodes: :update_states do
    puts ">>> Begin update of zipcodes table..."
    csv = CSV.new(
      URI.open("https://github.com/midwire/free_zipcode_data/raw/master/all_us_zipcodes.csv"),
      headers: true
    )

    zipcodes = csv.map do |row|
      {
        code: row['code'],
        city: row['city'].titleize,
        state_id: row['state']
      }
    end

    Zipcode.import(
      zipcodes,
      on_duplicate_key_update: {
        conflict_target: [:code],
        columns: [:code, :city, :state_id]
      }
    )

    puts ">>> End update of zipcodes table..."
  end

  desc "Populate or update the zipcodes related tables"
  task update: :update_zipcodes

  desc "Clear the zipcode tables"
  task clear: :environment do
    puts ">>> Deleting zip codes"
    Zipcode.delete_all
    puts ">>> Deleting states"
    State.delete_all
  end

  desc "Clear and then repopulate the zipcodes and related tables"
  task reset: [:clear, :update]
end
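
The stripped-down tasks above do not show the in-memory State cache from point 1. A minimal sketch of that idea follows, assuming the zipcode rows need a numeric `state_id` looked up by abbreviation. The inline CSV data, `STATE_IDS`, and `zipcode_attributes` are illustrative assumptions, not part of the gem. In a Rails app the cache would more likely be built once with something like `State.pluck(:abbr, :id).to_h`, and `titleize` would replace the plain-Ruby capitalization used here.

```ruby
require 'csv'

# Hypothetical sample data standing in for the downloaded CSV files.
STATES_CSV = <<~DATA
  abbr,name
  TX,Texas
  OR,Oregon
DATA

ZIPCODES_CSV = <<~DATA
  code,city,state
  97201,PORTLAND,OR
  75001,ADDISON,TX
DATA

# Build the abbr => id lookup once. The ids here are just row positions;
# with real models they would come from the states table.
state_rows = CSV.parse(STATES_CSV, headers: true).each_with_index.map do |row, i|
  [row['abbr'], i + 1]
end
STATE_IDS = state_rows.to_h.freeze

# Map each CSV row to an attribute hash, resolving the state from the
# in-memory hash instead of issuing one query per zipcode.
def zipcode_attributes(csv_text, state_ids)
  CSV.parse(csv_text, headers: true).map do |row|
    {
      code: row['code'],
      # Plain-Ruby substitute for ActiveSupport's titleize.
      city: row['city'].split.map(&:capitalize).join(' '),
      # fetch raises KeyError on an unknown abbreviation.
      state_id: state_ids.fetch(row['state'])
    }
  end
end

ZIPCODE_ROWS = zipcode_attributes(ZIPCODES_CSV, STATE_IDS)
```

The resulting array of hashes has the same shape that `Zipcode.import` receives in the task above.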

midwire (Owner) commented Aug 23, 2021

@schwern thanks for this. I will try to get it incorporated in the coming days. Swamped lately, but I appreciate it. 👍🏼
