chore(WEB-0): Improve compile time#85

Open
dgigafox wants to merge 5 commits into main from fix/improve-compile-time

@dgigafox
Contributor

  1. Move the ports dataset to a pre-parsed ETF file loaded at application start
  2. Add a CI check that the ETF is up to date with the ports CSV source
  3. Keep the other datasets loaded at compile time

Loading the ~116k-row UN/LOCODE port list at compile time forced
dependents to spend seconds expanding struct literals into AST and
serializing them into BEAM files on every fresh deps compile. Move the
port list to a pre-parsed, compressed ETF shipped in priv/data/, loaded
once into :persistent_term at application start (~79 ms). Cold compile
of this library drops from ~7 s to ~0.7 s.
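
The load-at-start approach described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `PortsSketch.*` module names and the `:persistent_term` key are hypothetical, and only the `priv/data/ports.etf` path and the `:ports` application name come from this PR.

```elixir
defmodule PortsSketch.Loader do
  @moduledoc false

  # Hypothetical key; the PR does not show the key it actually uses.
  @key {__MODULE__, :ports}

  @doc "Decode the ETF file at `path` and cache the result in :persistent_term."
  def load!(path) do
    ports =
      path
      |> File.read!()
      # :safe makes decoding fail instead of creating atoms that do not exist yet.
      # Compressed ETF is decompressed transparently by binary_to_term.
      |> :erlang.binary_to_term([:safe])

    :persistent_term.put(@key, ports)
    :ok
  end

  @doc "Return the cached port list without copying it onto the caller's heap."
  def all, do: :persistent_term.get(@key)
end

defmodule PortsSketch.Application do
  @moduledoc false
  use Application

  @impl true
  def start(_type, _args) do
    # Resolve the file inside the application's priv directory at runtime.
    PortsSketch.Loader.load!(Application.app_dir(:ports, "priv/data/ports.etf"))
    Supervisor.start_link([], strategy: :one_for_one)
  end
end
```

Because the decode happens once at boot, dependents no longer pay for expanding the rows into AST during compilation; the trade-off is that the data is only available after the application has started.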

The four small datasets (countries, functions, statuses, subdivisions)
remain compile-time module attributes — they're tiny and don't impact
compile time.

Add a mix ports.gen_etf task to regenerate priv/data/ports.etf from the
source CSVs whenever they're updated.

CI now fails if priv/data/ports.etf drifts from the source CSVs,
catching cases where a contributor updates a CSV without
regenerating the ETF. The check compares decoded terms rather than
raw bytes, so it does not depend on serialization being byte-for-byte
deterministic.
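
One way to picture the term-level comparison is the sketch below. It is an assumption-laden illustration: `PortsSketch.EtfCheck`, its return values, and the `expected` argument (standing in for the freshly parsed CSV rows) are hypothetical and not taken from the actual Mix task.

```elixir
defmodule PortsSketch.EtfCheck do
  @moduledoc false

  @doc """
  Returns :ok when the ETF at `etf_path` decodes to `expected`.

  `expected` stands in for the rows freshly parsed from the source CSVs.
  """
  def check(etf_path, expected) do
    case File.read(etf_path) do
      {:ok, binary} ->
        # Compare decoded terms, so differences in compression or encoding
        # of the same data do not count as drift.
        if :erlang.binary_to_term(binary, [:safe]) == expected do
          :ok
        else
          {:error, :stale}
        end

      {:error, :enoent} ->
        # A missing file is reported separately so the caller can print
        # the same "regenerate the ETF" hint as for a stale one.
        {:error, :missing}

      {:error, reason} ->
        {:error, reason}
    end
  end
end
```

Two binaries that encode the same term, for example one written with `:compressed` and one without, both pass this check even though their bytes differ.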

@gemini-code-assist (Bot) left a comment

Code Review

This pull request transitions the loading of port data from compile-time to runtime using :persistent_term. It introduces a new application module to handle data loading on startup, a Mix task to generate the binary ETF file from source CSVs, and updates the Ports.all/0 function to retrieve data from persistent storage. Feedback suggests improving error handling in the Mix task when the ETF file is missing and providing a more descriptive error message if Ports.all/0 is called before the application has initialized.

Comment thread lib/mix/tasks/ports.gen_etf.ex
Comment thread lib/ports.ex Outdated

The ETF is a build artifact we produce, but decoding it with a bare
:erlang.binary_to_term can still exhaust the atom table or produce
function/reference terms if the file is ever tampered with.
non_executable_binary_to_term/2 layers a recursive rejection of
function and reference terms on top of the :safe option, giving real
hardening rather than merely silencing Sobelow's Misc.BinToTerm check.
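
The layering described in this commit message might look something like the sketch below. It is not the PR's implementation: `PortsSketch.SafeTerm.decode!/1` is a hypothetical name, and the conversation does not show whether non_executable_binary_to_term/2 is a local helper or a library function (Plug.Crypto ships a function with that name and arity).

```elixir
defmodule PortsSketch.SafeTerm do
  @moduledoc false

  @doc """
  Decode `binary` with the :safe option, then walk the result and raise
  if it contains a function or a reference anywhere inside it.
  """
  def decode!(binary) when is_binary(binary) do
    term = :erlang.binary_to_term(binary, [:safe])
    :ok = walk(term)
    term
  end

  # Reject the term kinds named in the commit message.
  defp walk(term) when is_function(term) or is_reference(term) do
    raise ArgumentError, "refusing to decode a function or reference term"
  end

  # Matching on [head | tail] also covers improper lists.
  defp walk([head | tail]) do
    walk(head)
    walk(tail)
  end

  defp walk(tuple) when is_tuple(tuple), do: tuple |> Tuple.to_list() |> walk()

  # Maps (including structs) are checked through their key/value pairs.
  defp walk(map) when is_map(map), do: map |> Map.to_list() |> walk()

  # Atoms, numbers, binaries, the empty list and other leaf terms pass.
  defp walk(_other), do: :ok
end
```

The :safe option only guards atom creation and similar resource exhaustion during decoding; the recursive walk is what rejects well-formed but unwanted term kinds.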

Addresses PR feedback on #85:

- mix ports.gen_etf --check now reports a missing ETF with the same
  actionable "run mix ports.gen_etf" message instead of a generic
  File.Error.
- Ports.all/0 raises a named error when the :ports application has
  not been started, instead of a bare ArgumentError from
  :persistent_term.get/1.
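
The second bullet above could be approached along these lines. The sketch is illustrative only: the exception name, its message, the `:persistent_term` key, and the `PortsSketch.Store` module are hypothetical, since the PR conversation does not show the actual names.

```elixir
defmodule PortsSketch.NotStartedError do
  @moduledoc false
  # Hypothetical exception; the PR does not name the error it raises.
  defexception message:
                 "port data is not loaded; start the :ports application before calling Ports.all/0"
end

defmodule PortsSketch.Store do
  @moduledoc false

  # Hypothetical key, chosen only for this sketch.
  @key {__MODULE__, :ports}

  def put(ports), do: :persistent_term.put(@key, ports)

  def all do
    # :persistent_term.get/2 returns the default instead of raising
    # ArgumentError when the key is missing, so the failure can be
    # turned into a descriptive, named exception.
    case :persistent_term.get(@key, :not_loaded) do
      :not_loaded -> raise PortsSketch.NotStartedError
      ports -> ports
    end
  end
end
```

A caller that hits this before the application has booted sees a message that says what to do, rather than a bare ArgumentError pointing at :persistent_term.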
@dgigafox dgigafox requested a review from samhamilton April 21, 2026 05:36
@samhamilton samhamilton requested review from dgross881 and removed request for samhamilton April 21, 2026 08:10