Merged
1 change: 1 addition & 0 deletions .gitignore
@@ -5,3 +5,4 @@
/stubs
/vendor/bundle/
/pkg
+ .claude/
41 changes: 25 additions & 16 deletions .rubocop.yml
@@ -1,23 +1,24 @@
AllCops:
- TargetRubyVersion: 2.7
+ TargetRubyVersion: 3.4

# Include gemspec and Rakefile
Include:
- - '**/*.gemspec'
- - '**/*.podspec'
- - '**/*.jbuilder'
- - '**/*.rake'
- - '**/Gemfile'
- - '**/Rakefile'
- - '**/Capfile'
- - '**/Guardfile'
- - '**/Podfile'
- - '**/Thorfile'
- - '**/Vagrantfile'
+ - "**/*.rb"
+ - "**/*.gemspec"
+ - "**/*.podspec"
+ - "**/*.jbuilder"
+ - "**/*.rake"
+ - "**/Gemfile"
+ - "**/Rakefile"
+ - "**/Capfile"
+ - "**/Guardfile"
+ - "**/Podfile"
+ - "**/Thorfile"
+ - "**/Vagrantfile"
Exclude:
- - 'vendor/**/*'
- - 'stubs/**/*'
- - 'spec/support/shared_contexts/*'
+ - "vendor/**/*"
+ - "stubs/**/*"
+ - "spec/support/shared_contexts/*"

NewCops: enable

@@ -51,6 +52,10 @@ Style/DoubleNegation:
Style/PerlBackrefs:
Enabled: false

+ Style/OpenStructUse:
+ Exclude:
+ - "spec/**/*"
+
########################################
# Lint Cops

@@ -66,6 +71,10 @@ Security/Eval:
########################################
# Metrics Cops

+ Metrics/BlockLength:
+ Exclude:
+ - "spec/**/*"
+
Metrics/MethodLength:
CountComments: false # count full line comments?
Max: 30
@@ -77,7 +86,7 @@ Metrics/AbcSize:
Enabled: false

########################################
- # Metrics Cops
+ # Naming Cops

Naming/FileName:
Enabled: false
2 changes: 1 addition & 1 deletion .ruby-version
@@ -1 +1 @@
- 3.0.2
+ 3.4.8
89 changes: 89 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,89 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

A Ruby gem that downloads postal/zipcode data from GeoNames.org, processes it via an ETL pipeline, and outputs an SQLite3 database and optional CSV files. Supports single-country or all-countries processing.

## Commands

```bash
# Install dependencies (vendored to vendor/bundle, binstubs in stubs/)
bundle install

# Run all tests
bundle exec rspec

# Run a single test file
bundle exec rspec spec/path/to/file_spec.rb

# Run a specific test by line number
bundle exec rspec spec/path/to/file_spec.rb:42

# Lint
bundle exec rubocop

# Lint with auto-correct
bundle exec rubocop -a

# Version bumping (do on develop branch, not master)
bundle exec rake version:bump_patch
bundle exec rake version:bump_minor
bundle exec rake version:bump_major

# Build and install gem
bundle exec rake build
bundle exec rake install

# Release gem
bundle exec rake release
```

## Architecture

The gem follows an ETL (Extract, Transform, Load) pattern using the Kiba gem:

1. **Extract**: `DataSource` downloads zip files from GeoNames.org, extracts them, and prepares CSV files with headers
2. **Source**: `CsvSource` (Kiba source) feeds rows from the prepared CSV into the pipeline
3. **Load**: Four Kiba destination table classes write rows into an in-memory SQLite database

### Key Flow

`bin/free_zipcode_data` → `Runner#start` → `DataSource#download` → `DataSource#datafile` (extract zip + add CSV headers) → `SqliteRam` (in-memory DB) → `ETL::FreeZipcodeDataJob` (Kiba pipeline) → `SqliteRam#save_to_disk`

### Core Classes

- **`FreeZipcodeData::Runner`** - CLI entry point; parses args via Optimist, orchestrates the full pipeline
- **`FreeZipcodeData::DataSource`** - Downloads and extracts GeoNames zip files, prepares CSV with headers
- **`SqliteRam`** - Wraps SQLite3; works entirely in-memory then saves to disk via `SQLite3::Backup`
- **`FreeZipcodeData::DbTable`** - Base class for all table classes; provides progress bar, SQL helpers, and country lookup from `country_lookup_table.yml`
- **`FreeZipcodeData::CountryTable`/`StateTable`/`CountyTable`/`ZipcodeTable`** - Kiba destinations; each has `build` (creates schema + indexes) and `write` (inserts rows, swallows duplicate constraint violations)
- **`ETL::FreeZipcodeDataJob`** - Configures the Kiba pipeline with one source and four destinations
- **`CsvSource`** - Kiba-compatible CSV reader

### Singletons

`Options` and `Logger` are singletons (via Ruby's `Singleton` module). `Runner` has an `.instance` convenience class method (returns `new` each time, not cached).
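
The difference between a true singleton and a `.instance` method that merely wraps `new` can be illustrated with a minimal sketch. The class names `ExampleOptions` and `ExampleRunner` are hypothetical stand-ins, not the gem's actual classes, and the bodies are assumptions based only on the description above:

```ruby
require 'singleton'

# Hypothetical stand-in for a true singleton such as Options or Logger.
class ExampleOptions
  include Singleton
end

# Hypothetical stand-in for Runner: `.instance` is only a convenience
# wrapper around `new`, so every call builds a fresh object.
class ExampleRunner
  def self.instance
    new
  end
end

# A Singleton always hands back the same object.
puts ExampleOptions.instance.equal?(ExampleOptions.instance) # => true

# The Runner-style method does not cache anything.
puts ExampleRunner.instance.equal?(ExampleRunner.instance)   # => false
```

In practice this means state stored on a `Runner` obtained via `.instance` is not shared between calls, whereas state on `Options` or `Logger` is.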

## Configuration

- `.ruby-version`: 3.4.8
- Bundle path: `vendor/bundle` (binstubs in `stubs/`)
- Environment: `APP_ENV` controls environment (`test`, `development`)
- Config file: `~/.free_zipcode_data.yml` (overridable via `FZD_CONFIG_FILE` env var; uses `spec/fixtures/` version in test)

## Rubocop

Key style settings (`.rubocop.yml`):
- Target Ruby 3.4
- Max line length: 110
- Max method length: 30 lines
- `Style/ClassVars`, `Style/Documentation`, `Metrics/AbcSize`, `Lint/SuppressedException` disabled
- `vendor/` and `stubs/` excluded

## Git Workflow

- `master` is the release branch
- `develop` is the development branch
- Version bumps should happen on `develop`, then merge to `master` before `rake release`
10 changes: 10 additions & 0 deletions Gemfile
@@ -4,3 +4,13 @@ source 'https://rubygems.org'
git_source(:github) { |repo| "https://github.com/#{repo}.git" }

gemspec
+
+ group :development do
+ gem 'bundler'
+ gem 'pry-nav', '~> 0.2'
+ gem 'rake', '~> 13.0'
+ gem 'rspec', '~> 3.7'
+ gem 'rubocop'
+ gem 'ruby-prof', '~> 0.17'
+ gem 'simplecov', '~> 0.16'
+ end
84 changes: 49 additions & 35 deletions Gemfile.lock
@@ -3,7 +3,9 @@ PATH
specs:
free_zipcode_data (1.0.6)
colored (~> 1.2)
+ csv
kiba (~> 4.0)
+ logger
optimist (~> 3.0)
ruby-progressbar (~> 1.9)
rubyzip (>= 1.2.2)
@@ -12,63 +14,75 @@ PATH
GEM
remote: https://rubygems.org/
specs:
- ast (2.4.2)
+ ast (2.4.3)
coderay (1.1.3)
colored (1.2)
- diff-lcs (1.4.4)
- docile (1.4.0)
+ csv (3.3.5)
+ diff-lcs (1.6.2)
+ docile (1.4.1)
+ json (2.18.1)
kiba (4.0.0)
+ language_server-protocol (3.17.0.5)
+ lint_roller (1.1.0)
+ logger (1.7.0)
method_source (0.9.2)
mini_portile2 (2.8.9)
optimist (3.2.1)
- parallel (1.21.0)
- parser (3.0.2.0)
+ parallel (1.27.0)
+ parser (3.3.10.1)
ast (~> 2.4.1)
+ racc
+ prism (1.9.0)
pry (0.12.2)
coderay (~> 1.1.0)
method_source (~> 0.9.0)
pry-nav (0.3.0)
pry (>= 0.9.10, < 0.13.0)
- rainbow (3.0.0)
- rake (13.0.6)
- regexp_parser (2.1.1)
- rexml (3.4.2)
- rspec (3.10.0)
- rspec-core (~> 3.10.0)
- rspec-expectations (~> 3.10.0)
- rspec-mocks (~> 3.10.0)
- rspec-core (3.10.1)
- rspec-support (~> 3.10.0)
- rspec-expectations (3.10.1)
+ racc (1.8.1)
+ rainbow (3.1.1)
+ rake (13.3.1)
+ regexp_parser (2.11.3)
+ rspec (3.13.2)
+ rspec-core (~> 3.13.0)
+ rspec-expectations (~> 3.13.0)
+ rspec-mocks (~> 3.13.0)
+ rspec-core (3.13.6)
+ rspec-support (~> 3.13.0)
+ rspec-expectations (3.13.5)
diff-lcs (>= 1.2.0, < 2.0)
- rspec-support (~> 3.10.0)
- rspec-mocks (3.10.2)
+ rspec-support (~> 3.13.0)
+ rspec-mocks (3.13.7)
diff-lcs (>= 1.2.0, < 2.0)
- rspec-support (~> 3.10.0)
- rspec-support (3.10.3)
- rubocop (1.22.3)
+ rspec-support (~> 3.13.0)
+ rspec-support (3.13.7)
+ rubocop (1.84.2)
+ json (~> 2.3)
+ language_server-protocol (~> 3.17.0.2)
+ lint_roller (~> 1.1.0)
parallel (~> 1.10)
- parser (>= 3.0.0.0)
+ parser (>= 3.3.0.2)
rainbow (>= 2.2.2, < 4.0)
- regexp_parser (>= 1.8, < 3.0)
- rexml
- rubocop-ast (>= 1.12.0, < 2.0)
+ regexp_parser (>= 2.9.3, < 3.0)
+ rubocop-ast (>= 1.49.0, < 2.0)
ruby-progressbar (~> 1.7)
- unicode-display_width (>= 1.4.0, < 3.0)
- rubocop-ast (1.12.0)
- parser (>= 3.0.1.1)
+ unicode-display_width (>= 2.4.0, < 4.0)
+ rubocop-ast (1.49.0)
+ parser (>= 3.3.7.2)
+ prism (~> 1.7)
ruby-prof (0.18.0)
- ruby-progressbar (1.11.0)
- rubyzip (3.1.1)
- simplecov (0.21.2)
+ ruby-progressbar (1.13.0)
+ rubyzip (3.2.2)
+ simplecov (0.22.0)
docile (~> 1.1)
simplecov-html (~> 0.11)
simplecov_json_formatter (~> 0.1)
- simplecov-html (0.12.3)
- simplecov_json_formatter (0.1.3)
+ simplecov-html (0.13.2)
+ simplecov_json_formatter (0.1.4)
sqlite3 (1.7.3)
mini_portile2 (~> 2.8.0)
- unicode-display_width (2.1.0)
+ unicode-display_width (3.2.0)
+ unicode-emoji (~> 4.1)
+ unicode-emoji (4.2.0)

PLATFORMS
ruby
@@ -84,4 +98,4 @@ DEPENDENCIES
simplecov (~> 0.16)

BUNDLED WITH
- 2.2.22
+ 2.6.9
2 changes: 1 addition & 1 deletion Rakefile
@@ -4,7 +4,7 @@ require 'rubygems'
require 'bundler/setup'

require 'rake'
- Dir['lib/tasks/**/*.rake'].sort.each { |ext| load ext }
+ Dir['lib/tasks/**/*.rake'].each { |ext| load ext }

# Install rubygem tasks
Bundler::GemHelper.install_tasks
22 changes: 8 additions & 14 deletions free_zipcode_data.gemspec
@@ -23,18 +23,12 @@ Gem::Specification.new do |spec|
spec.executables = spec.files.grep(%r{^bin/}) { |f| File.basename(f) }
spec.require_paths = ['lib']

- spec.add_development_dependency 'bundler'
- spec.add_development_dependency 'pry-nav', '~> 0.2'
- spec.add_development_dependency 'rake', '~> 13.0'
- spec.add_development_dependency 'rspec', '~> 3.7'
- spec.add_development_dependency 'rubocop'
- spec.add_development_dependency 'ruby-prof', '~> 0.17'
- spec.add_development_dependency 'simplecov', '~> 0.16'
-
- spec.add_runtime_dependency 'colored', '~> 1.2'
- spec.add_runtime_dependency 'kiba', '~> 4.0'
- spec.add_runtime_dependency 'optimist', '~> 3.0'
- spec.add_runtime_dependency 'ruby-progressbar', '~> 1.9'
- spec.add_runtime_dependency 'rubyzip', '>= 1.2.2'
- spec.add_runtime_dependency 'sqlite3', '~> 1.3'
+ spec.add_dependency 'colored', '~> 1.2'
+ spec.add_dependency 'csv'
+ spec.add_dependency 'kiba', '~> 4.0'
+ spec.add_dependency 'logger'
+ spec.add_dependency 'optimist', '~> 3.0'
+ spec.add_dependency 'ruby-progressbar', '~> 1.9'
+ spec.add_dependency 'rubyzip', '>= 1.2.2'
+ spec.add_dependency 'sqlite3', '~> 1.3'
end
1 change: 1 addition & 0 deletions lib/etl/common.rb
@@ -16,6 +16,7 @@ def show_me
def limit(count)
count = Integer(count || -1)
return if count == -1
+
transform do |row|
@counter ||= 0
@counter += 1
8 changes: 4 additions & 4 deletions lib/etl/csv_source.rb
@@ -14,10 +14,10 @@ def initialize(filename:, headers: true, delimeter: "\t", quote_char: '"')

def each
CSV.open(filename,
- col_sep: delimeter,
- headers: headers,
- header_converters: :symbol,
- quote_char: quote_char) do |csv|
+ col_sep: delimeter,
+ headers: headers,
+ header_converters: :symbol,
+ quote_char: quote_char) do |csv|
csv.each do |row|
yield(row.to_hash)
end
6 changes: 3 additions & 3 deletions lib/free_zipcode_data.rb
@@ -14,16 +14,16 @@ def self.current_environment
ENV.fetch('APP_ENV', 'development')
end

- #:nocov:
+ # :nocov:
def self.config_file(filename = '.free_zipcode_data.yml')
return root.join('spec', 'fixtures', filename) if current_environment == 'test'

- home = ENV.fetch('HOME')
+ home = Dir.home
file = ENV.fetch('FZD_CONFIG_FILE', File.join(home, '.free_zipcode_data.yml'))
FileUtils.touch(file)
file
end
- #:nocov:
+ # :nocov:

def self.os
if RUBY_PLATFORM.match?(/cygwin|mswin|mingw|bccwin|wince|emx/)