unlocodes is a Ruby gem that exposes the UN/LOCODE dataset (United Nations Code for Trade and Transport Locations) as a queryable in-memory registry.
The dataset is sourced from the UNECE/UNCEFACT LOCODE vocabulary, version-tagged 2025-1: https://opensource.unicc.org/un/unece/uncefact/vocab-locode/-/tags/2025-1
The full 2025-1 dataset (115,928 LOCODEs across 249 countries, ~39 MB) is bundled inside the gem as lib/unlocodes/data/locode.jsonld and loads lazily into a typed registry on first use.
Ruby 3.1 or newer is required.
Add to your Gemfile:
gem 'unlocodes'
Or install directly:
gem install unlocodes

require 'unlocodes'
# Lookup by 5-char LOCODE (case-insensitive)
Unlocodes.find('CNSHA') # => #<Unlocodes::Entry code="CNSHA" name="Shanghai Hongqiao International Apt">
Unlocodes['NLRTM'].name # => "Rotterdam" ([] is an alias for find)
# Filters — single value or array (any-of)
Unlocodes.where(country: 'CN').count # => 1670
Unlocodes.where(country: %w[CN HK]).count # => entries in China OR Hong Kong
Unlocodes.where(function: 'B').count # => 17912 (sea ports)
Unlocodes.where(function: 'A').count # => 9009 (airports)
Unlocodes.where(function: %w[B A]).count # => 25136 (port OR airport)
Unlocodes.where(subdivision: 'CNSH').count # => entries in the Shanghai subdivision
# Filters combine
Unlocodes.where(country: 'CN', function: 'A').count # => CN airports
# Name search — Regexp (substring) or String (case-insensitive equality)
Unlocodes.where(name: /shanghai/i).map(&:code)
Unlocodes.where(name: 'rotterdam').map(&:code)
# Iteration
Unlocodes.each { |e| puts e.code if e.port? }
# Country listing
Unlocodes.countries # => ["AD", "AE", ..., "ZW"] (249 codes)
Unlocodes.counts_by_country.first(5) # => [["US", 20852], ["FR", 14325], ...]
# Indexed lookups (faster than `where` for a single value)
Unlocodes.registry.by_country('CN').size # => 1670
Unlocodes.registry.by_function('B').size # => 17912

Each Unlocodes::Entry exposes the fields the JSON-LD vocabulary populates:
| Attribute | Description |
|---|---|
| code | 5-char LOCODE (ISO country + 3-char location) |
| country | ISO 3166-1 alpha-2 country code |
| subdivision | ISO 3166-2 country subdivision code |
| name | Display name |
| functions | Array of single-letter function codes (see below) |
| coordinates | WGS-84 decimal degrees |

Convenience predicates and value-type accessors:
entry = Unlocodes.find('NLRTM')
entry.port? # => true (function code B)
entry.airport? # => true (function code A)
entry.rail_terminal? # => true (function code R)
entry.road_terminal? # => true (function code T)
entry.functions.map(&:code) # => ["B", "R", "T", "A", "P"]
entry.functions.first.description # => "Port (sea)"
entry.coordinates # => #<Unlocodes::Coordinates ...>

The vocabulary publishes functions as unlcdf:1..unlcdf:9. The gem maps them to the UN/LOCODE manual’s letters:
| unlcdf: | Letter | Meaning |
|---|---|---|
| 1 | B | Port (sea) |
| 2 | R | Rail terminal |
| 3 | T | Road terminal |
| 4 | A | Airport |
| 5 | P | Postal exchange office |
| 6 | I | Inland water transport |
| 7 | F | Ferry port |
| 8 | V | Pipeline |
| 9 | O | Other / border crossing |
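
As a rough illustration of the mapping in the table, the translation can be written as a frozen hash lookup. This is only a sketch: the names FUNCTION_LETTERS and letter_for are hypothetical and are not taken from the gem's source, and the "unlcdf:N" string form is assumed from the notation used above.

```ruby
# Hypothetical sketch: translate vocabulary identifiers such as "unlcdf:4"
# into the single-letter codes listed in the table.
FUNCTION_LETTERS = {
  1 => 'B', 2 => 'R', 3 => 'T', 4 => 'A', 5 => 'P',
  6 => 'I', 7 => 'F', 8 => 'V', 9 => 'O'
}.freeze

# Returns the letter for an identifier like "unlcdf:4", or nil when the
# identifier does not match the unlcdf:1..unlcdf:9 pattern.
def letter_for(identifier)
  match = /\Aunlcdf:([1-9])\z/.match(identifier.to_s)
  match && FUNCTION_LETTERS[match[1].to_i]
end

puts letter_for('unlcdf:1') # prints B
puts letter_for('unlcdf:4') # prints A
p letter_for('unlcdf:0')    # prints nil (outside the 1..9 range)
```
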
Use the letters in where(function: …). Passing an array matches any-of (OR), not all-of (AND):
Unlocodes.where(function: 'B').count # sea ports
Unlocodes.where(function: %w[B A]).count # port OR airport (union)
# For AND (entries that are BOTH port AND airport), filter in Ruby:
Unlocodes.entries.count { |e| e.port? && e.airport? } # => 1785

shanghai = Unlocodes.find('CNPDG').coordinates # Shanghai Pudong
rotterdam = Unlocodes.find('NLRTM').coordinates
shanghai.to_s # => "31.2333 121.5000"
shanghai.distance_to(rotterdam) # => 8922.7 (km, great-circle via haversine)

The JSON-LD vocabulary publishes only code, country, subdivision, name, functions, and coordinates. The UN/LOCODE manual additionally defines a status code (AA, RL, XX, …), IATA code, change date, and remarks. These live in the per-country CSV files upstream, not in the JSON-LD vocabulary, and are intentionally not modelled here.
Calling where(status: …) or where(iata: …) raises ArgumentError rather than silently returning empty results.
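
The filter behaviour described in this README (array values match any-of, separate keys combine, unsupported keys raise) can be sketched in plain Ruby. This is an illustration under stated assumptions, not the gem's implementation: SketchEntry, ALLOWED_FILTERS, and sketch_where are hypothetical names, the entries are made-up placeholders using user-assigned country codes, and only two filter keys are modelled.

```ruby
# Hypothetical stand-in for an entry; the real Unlocodes::Entry is richer.
SketchEntry = Struct.new(:code, :country, :functions, keyword_init: true)

# Only these keys are modelled in this sketch.
ALLOWED_FILTERS = %i[country function].freeze

# Array values mean any-of (OR); separate keys combine with AND.
# Unknown keys raise instead of silently matching nothing.
def sketch_where(entries, **filters)
  unknown = filters.keys - ALLOWED_FILTERS
  raise ArgumentError, "unsupported filter(s): #{unknown.join(', ')}" unless unknown.empty?

  entries.select do |entry|
    filters.all? do |key, wanted|
      wanted = Array(wanted)
      case key
      when :country  then wanted.include?(entry.country)
      when :function then (wanted & entry.functions).any?
      end
    end
  end
end

# Placeholder data, not real LOCODEs.
entries = [
  SketchEntry.new(code: 'XAONE', country: 'XA', functions: %w[B A]),
  SketchEntry.new(code: 'XATWO', country: 'XA', functions: %w[B]),
  SketchEntry.new(code: 'XBONE', country: 'XB', functions: %w[A])
]

p sketch_where(entries, function: %w[B A]).map(&:code)            # ["XAONE", "XATWO", "XBONE"]
p sketch_where(entries, country: 'XA', function: 'A').map(&:code) # ["XAONE"]

begin
  sketch_where(entries, status: 'AA')
rescue ArgumentError => e
  puts e.message # unsupported filter(s): status
end
```

Raising on unknown keys means a typo or an unmodelled field fails loudly instead of being mistaken for "no matches".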
UN/LOCODE is published twice-yearly (typically YYYY-1 around Q1, YYYY-2 around Q3). The gem ships two workflows to keep the bundled data fresh:
.github/workflows/check-upstream.yml runs every Monday, asks the upstream GitLab project for its latest tag, and compares against the bundled SOURCE_TAG. When a new edition is detected it opens a GitHub issue labelled data-update describing the diff and the next steps. The workflow also runs on manual dispatch.
.github/workflows/update-data.yml is what a maintainer runs after check-upstream flags a new edition. It takes the new tag as input, fetches the data, commits to a branch, and opens a PR. Merging that PR does not, by itself, publish a new gem — it just lands the new data on main. To ship a new gem version, trigger the release workflow (see below).
End-to-end refresh flow:
- check-upstream opens an issue: "new tag `2025-2` available, bundled is `2025-1`"
- Maintainer runs the update-data workflow with tag=2025-2
- Workflow opens PR chore/data-2025-2 with the refreshed locode.jsonld and SOURCE_TAG
- Maintainer reviews and merges the PR
- Maintainer triggers the release workflow with next_version=patch to publish a new gem version
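
The README does not show how check-upstream compares tags. One plausible approach, assuming tags always follow the YYYY-N pattern described above, is to parse each tag into a pair of integers and compare the pairs. The names edition_key and newer_edition? are hypothetical.

```ruby
# Hypothetical sketch of the edition check; tags are assumed to look like "YYYY-N".
EDITION_PATTERN = /\A(\d{4})-(\d+)\z/

# Turns "2025-2" into [2025, 2] so editions compare in release order.
def edition_key(tag)
  match = EDITION_PATTERN.match(tag)
  raise ArgumentError, "unexpected tag format: #{tag.inspect}" unless match

  [match[1].to_i, match[2].to_i]
end

# True when the upstream tag is a later edition than the bundled one.
def newer_edition?(upstream_tag, bundled_tag)
  (edition_key(upstream_tag) <=> edition_key(bundled_tag)) == 1
end

puts newer_edition?('2025-2', '2025-1') # true
puts newer_edition?('2025-1', '2025-1') # false
puts newer_edition?('2026-1', '2025-2') # true
```

Comparing parsed integers rather than raw strings keeps the ordering correct even if an edition number ever reaches two digits.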
Manual local equivalent of the workflow:
bundle exec rake unlocodes:fetch # default tag: 2025-1
UNLOCODE_TAG=2025-2 bundle exec rake unlocodes:fetch

If the upstream tag layout changes, point UNLOCODE_PATH at the full JSON-LD URL:
UNLOCODE_PATH=https://example.org/path/unlocode.jsonld bundle exec rake unlocodes:fetch

bundle install # install dev deps
bundle exec rake # spec + rubocop
bundle exec rspec spec/path/to_spec.rb:42 # one example
bundle exec rake unlocodes:fetch # refresh bundled data

The 39 MB dataset at lib/unlocodes/data/locode.jsonld is committed so the gem works offline. The first call to Unlocodes.registry in a process pays the parse cost (~3-5 s); subsequent calls reuse the cached registry.
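
The load-once behaviour can be illustrated with a small memoization sketch. It is not the gem's code: SketchRegistry and its methods are hypothetical, a tiny inline JSON string stands in for the bundled file, and the mutex is an assumption about how one might keep the first load safe across threads.

```ruby
require 'json'

# Hypothetical sketch of lazy loading: parse once on first access, then
# reuse the parsed result for the rest of the process.
module SketchRegistry
  @mutex = Mutex.new
  @data = nil
  @loads = 0

  class << self
    # Number of times the expensive parse has actually run.
    attr_reader :loads

    def data
      return @data if @data

      @mutex.synchronize do
        @data ||= begin
          @loads += 1
          JSON.parse(source_json) # stands in for parsing the bundled file
        end
      end
    end

    private

    # Placeholder payload; the real dataset is far larger.
    def source_json
      '{"entries": [{"code": "XAONE"}, {"code": "XATWO"}]}'
    end
  end
end

first = SketchRegistry.data
second = SketchRegistry.data
puts first.equal?(second) # true: the same object is returned, no second parse
puts SketchRegistry.loads # 1
```

In a long-running process, touching the registry once during boot moves the parse cost out of the first request.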
BSD-2-Clause. See LICENSE.