Skip to content

ETL: sync star systems and orbits into station_* tables #191

@GitAddRemote

Description

@GitAddRemote

User Story

As a backend engineer, I need station_star_systems and station_orbits populated from UEX API data so that all planetary bodies, stations, and jump points have valid parent FK references in the location hierarchy.

Definition of Done

  • StarSystemsSyncStep ETL step created and registered at tier-2 position
  • OrbitsSyncStep ETL step created and registered at tier-3 position
  • station_star_systems populated with idempotent upsert by uex_id
  • station_orbits populated with idempotent upsert by uex_id, with FK star_system_id resolved from station_star_systems
  • station_orbit_distances populated from the orbit distances endpoint
  • Orphaned orbit records (FK to unknown star system) logged to station_etl_warnings and skipped
  • Unit tests cover: success path, orphan handling, upsert idempotency
  • pnpm test passes

Acceptance Criteria

  • After ETL: SELECT count(*) FROM station_star_systems matches UEX star systems count
  • After ETL: all station_orbits rows have a non-null star_system_id FK
  • An orbit with an unknown id_star_system emits a warning row and is not inserted
  • Re-running ETL does not duplicate rows

Technical Elaboration

UEX Endpoints

  • GET /star_systems{ id, name, code, is_available, is_available_live, is_default }
  • GET /orbits{ id, name, code, id_star_system, is_available, is_available_live, ... }
  • GET /orbit_distances{ id_orbit_origin, id_orbit_destination, distance_km, travel_time_s, ... }

Target Schema

station_star_systems: uex_id INT UNIQUE, name VARCHAR(255), code VARCHAR(50), is_available BOOLEAN, is_available_live BOOLEAN, is_default BOOLEAN

station_orbits: uex_id INT UNIQUE, name VARCHAR(255), code VARCHAR(50), star_system_id UUID FK → station_star_systems, is_available BOOLEAN, is_available_live BOOLEAN

station_orbit_distances: origin_orbit_id UUID FK → station_orbits, destination_orbit_id UUID FK → station_orbits, distance_km DECIMAL(15,2), travel_time_s INTEGER, PRIMARY KEY (origin_orbit_id, destination_orbit_id)

FK Resolution Pattern

Before inserting orbits, build an in-memory map: Map<uexId: number, uuid: string> from the just-inserted station_star_systems. Resolve id_star_system against this map. If not found, emit warning and skip.

For station_orbit_distances, both FK sides must resolve. Skip and warn if either is missing.

Step Order Dependency

OrbitsSyncStep must run after StarSystemsSyncStep completes. If StarSystemsSyncStep failed, OrbitsSyncStep should detect an empty station_star_systems table and emit a top-level warning rather than silently inserting orphan rows.

Design Elaboration

station_orbits is the central node of the entire location hierarchy — every physical body (planet, moon, city, station, outpost) references an orbit. Getting this table fully populated and FK-correct before running any downstream steps is critical. The in-memory UUID map pattern avoids N+1 queries during bulk upsert and is fast enough for the expected record counts (~200 orbits across all star systems).

Orbit distances are supplementary data used for route planning features (future). They are populated in this step because they are logically part of the orbit data, but a failure here should produce severity='warn' warnings only — not block the rest of the ETL.


Depends on: #188, #189

Metadata

Metadata

Assignees

No one assigned

    Labels

    backendBackend services and logicdatabaseSchema, migrations, indexingenhancementNew feature or requesttech-storyTechnical implementation storyuex-syncUEXcorp API sync and integration

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions