feat: ETL sync for categories and category attributes (ISSUE-196) #225
Merged
- Fetches GET /categories from UEX and upserts into station_category
- Synthetic section rows created per unique (type, section) pair using ON CONFLICT (type, name) WHERE is_section = TRUE; RETURNING id used to resolve parent_id FK for leaf rows in the same execute() call
- Section rows always upserted before leaf rows to satisfy self-referencing FK
- Leaf rows conflict on uex_id WHERE uex_id IS NOT NULL; parent_id, type, section, is_game_related, is_mining kept current on re-runs
- Category attributes upserted into station_category_attribute per uex_id
- Warns on missing name (category or attribute), unknown type (stored as null)
- Registered in CatalogEtlModule and ETL_STEPS pipeline at tier-8 (after jump-points-sync, before items/vehicles/commodities)
- 22 unit tests: section creation, parent FK resolution, attribute upsert, idempotency, warnings, empty list handling
Pull request overview
This PR adds a new Catalog ETL step to ingest UEX /categories into the station_category and station_category_attribute tables, building a two-level hierarchy (synthetic section rows → leaf categories) and persisting per-category attributes.
Changes:
- Added `CategoriesSyncStep` to fetch UEX categories, upsert section + leaf category rows, and upsert category attributes with warning emission for invalid data.
- Registered `CategoriesSyncStep` in `CatalogEtlModule` and appended it to the `CatalogEtlService` ETL step sequence.
- Added unit tests covering section grouping, parent FK resolution, leaf/attribute upserts, warnings, and basic idempotency expectations.
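The two-pass hierarchy described above (one synthetic section row per unique `(type, section)` pair, then leaf rows pointing at the returned ids) could be sketched roughly as follows. This is only an illustration, not the code from the PR: the `UexCategory` shape and the helper names `sectionKey`, `collectSections`, and `resolveParentId` are hypothetical.

```typescript
// Hypothetical shape of a UEX category record (an assumption for illustration).
interface UexCategory {
  id: number;
  type: string | null;
  section: string | null;
  name: string;
}

interface SectionRow {
  type: string | null;
  name: string;
}

// Composite key for a (type, section) pair. JSON encoding avoids collisions
// that a plain separator could cause when values contain that separator.
function sectionKey(type: string | null, section: string): string {
  return JSON.stringify([type, section]);
}

// Pass 1: collect one synthetic section row per unique (type, section) pair.
function collectSections(cats: UexCategory[]): SectionRow[] {
  const seen = new Map<string, SectionRow>();
  for (const cat of cats) {
    if (!cat.section) continue; // no section means the leaf has no parent
    const key = sectionKey(cat.type, cat.section);
    if (!seen.has(key)) seen.set(key, { type: cat.type, name: cat.section });
  }
  return Array.from(seen.values());
}

// Pass 2: resolve a leaf's parent_id from the ids the section upsert returned.
// `sectionIds` stands in for the RETURNING id results, keyed by sectionKey().
function resolveParentId(
  cat: UexCategory,
  sectionIds: Map<string, number>,
): number | null {
  if (!cat.section) return null;
  const id = sectionIds.get(sectionKey(cat.type, cat.section));
  return id === undefined ? null : id;
}
```

Keying the lookup on the same pair used for deduplication means two categories that share a section name but differ in type resolve to different parent rows.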
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| backend/src/modules/catalog-etl/steps/categories-sync.step.ts | New ETL step implementing section/leaf category upserts and attribute upserts from UEX /categories. |
| backend/src/modules/catalog-etl/steps/categories-sync.step.spec.ts | Unit tests for the new categories ETL step behavior. |
| backend/src/modules/catalog-etl/catalog-etl.service.ts | Wires the new step into the ETL execution order. |
| backend/src/modules/catalog-etl/catalog-etl.service.spec.ts | Updates DI test setup to include the new step provider. |
| backend/src/modules/catalog-etl/catalog-etl.module.ts | Registers the new step in the Nest module providers. |
…erts
Two correctness bugs in the categories-sync step:
- NULL type in ON CONFLICT: Postgres unique indexes treat NULLs as distinct, so ON CONFLICT (type, name) WHERE is_section = TRUE would insert a new section row on every run when type is NULL. Changed the index and ON CONFLICT clause to use COALESCE(type, '') so NULL types are treated as a single deterministic sentinel value.
- Empty-string section: cat.section truthiness check correctly skips '' for section-row collection, but record.section ?? null passed '' into the leaf INSERT. Normalize once via section?.trim() || null and use that value for section-row collection, parent lookup, and leaf insert.
Updated spec: renamed conflict-target test to assert the COALESCE form; added tests for empty-string and whitespace-only section normalization.
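The section normalization described in this commit can be illustrated with a small helper. The expression `section?.trim() || null` is taken from the commit message; the function name `normalizeSection` is a hypothetical wrapper used here only for illustration.

```typescript
// Normalize a raw section value once, so that section-row collection,
// parent lookup, and the leaf insert all see the same value.
// Empty and whitespace-only strings collapse to null; `??` alone would not
// do this, because '' is not null or undefined.
function normalizeSection(section: string | null | undefined): string | null {
  return section?.trim() || null;
}

// The difference the commit describes: `?? null` lets '' through.
const viaNullish: string | null = ('' as string | null) ?? null; // ''
const viaNormalize: string | null = normalizeSection('');        // null
```

Using the normalized value everywhere avoids a leaf row that stores `''` as its section while having no matching section row.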
…gories section index
- Wrap COALESCE expression in extra parens in ON CONFLICT clause:
(COALESCE(type, ''), name) → ((COALESCE(type, '')), name) so Postgres
matches it against the expression index rather than treating it as a
column list (which would fail with no matching unique constraint)
- Update spec assertion to match corrected ON CONFLICT form
- Add migration 1779700000000 to drop/recreate uq_categories_section_type
on already-applied databases: drops the old plain-column index
("type", "name") and creates the expression index (COALESCE("type", ''), "name")
so existing environments stay in sync without re-running the baseline
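As a rough sketch of what this commit describes, the statements below drop the old index, recreate it as an expression index, and show the matching conflict target. The index name and the two COALESCE forms come from the commit message; the table name `station_category`, the partial-index predicate on `is_section`, and the names `sectionIndexMigrationSql` and `SECTION_CONFLICT_TARGET` are assumptions, and the real migration may be structured differently.

```typescript
const INDEX_NAME = 'uq_categories_section_type';
// Assumption: the index lives on station_category and is partial on is_section.
const TABLE_NAME = 'station_category';

// Statements a migration could run, in order, to replace the plain-column
// index ("type", "name") with the COALESCE expression index.
function sectionIndexMigrationSql(): string[] {
  return [
    `DROP INDEX IF EXISTS "${INDEX_NAME}"`,
    `CREATE UNIQUE INDEX "${INDEX_NAME}" ON "${TABLE_NAME}" ` +
      `(COALESCE("type", ''), "name") WHERE "is_section" = TRUE`,
  ];
}

// Conflict target for the section upsert, with the expression wrapped in its
// own parentheses as the commit describes.
const SECTION_CONFLICT_TARGET =
  `ON CONFLICT ((COALESCE(type, '')), name) WHERE is_section = TRUE`;
```

Because `COALESCE(type, '')` maps every NULL type to the same value, two section rows with NULL type and the same name now collide in the unique index instead of being treated as distinct.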
Closes #196
Summary
- `CategoriesSyncStep`: fetches `GET /categories` from UEX and populates `station_category` and `station_category_attribute`
- Section rows are upserted first (one synthetic row per unique `(type, section)` pair) using `ON CONFLICT (type, name) WHERE is_section = TRUE … RETURNING id`; leaf category rows are upserted in a second pass with `parent_id` set to the BIGSERIAL id returned by the section upsert. The self-referencing FK constraint is satisfied because sections always exist before leaves
- Leaf rows: `ON CONFLICT (uex_id) WHERE uex_id IS NOT NULL DO UPDATE SET` keeps name, section, type, flags, and timestamps current on re-runs
- Each `attributes[]` entry upserted into `station_category_attribute` keyed on `uex_id`; `is_lower_better` cast to boolean or null; missing-name attributes emit `severity=warn` and are skipped
- Type validated against the `('item', 'service', 'contract')` CHECK constraint values; unknown types stored as null with a warning
- Registered in `CatalogEtlModule` providers and appended to `ETL_STEPS` at tier-8 (after `jump-points-sync`, before items/vehicles/commodities)
Test plan
- `pnpm test --filter backend` passes (149 tests, 14 suites green)
- Section rows created per unique `(type, section)` pair, not per category
- `parent_id` on leaf row equals id returned by section upsert
- Categories with `section: null` get `parent_id = null`
- Attributes upserted with `category_uex_id` and `is_lower_better` boolean
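The type and attribute normalization rules listed in the summary could look roughly like the sketch below. The allowed type values come from the summary; the helper names `normalizeType` and `toNullableBoolean`, the warning callback, the choice not to warn on a missing type, and the assumption that UEX encodes booleans as `true`, `1`, or `'1'` are all hypothetical.

```typescript
const KNOWN_TYPES = ['item', 'service', 'contract'] as const;
type CategoryType = (typeof KNOWN_TYPES)[number];

// Returns the type if it is one of the CHECK constraint values, else null.
// Unknown non-null values trigger a warning; a missing type is passed through
// as null without one (an assumption, the PR text does not say).
function normalizeType(
  raw: unknown,
  warn: (message: string) => void,
): CategoryType | null {
  if (typeof raw === 'string' && (KNOWN_TYPES as readonly string[]).indexOf(raw) !== -1) {
    return raw as CategoryType;
  }
  if (raw !== null && raw !== undefined) {
    warn(`unknown category type: ${String(raw)}`);
  }
  return null;
}

// Casts a raw flag such as is_lower_better to boolean or null.
// Assumption: truthy encodings are true, 1, or '1'; anything else is false.
function toNullableBoolean(raw: unknown): boolean | null {
  if (raw === null || raw === undefined) return null;
  return raw === true || raw === 1 || raw === '1';
}
```

Storing unknown types as null keeps the insert from violating the CHECK constraint while the warning still surfaces the unexpected upstream value.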