feat(database): tag taxonomy update, storage padding, and alias canonicalization #713

Merged
wizzomafizzo merged 1 commit into main from feat/tag-taxonomy-storage-padding on Apr 19, 2026
@wizzomafizzo wizzomafizzo commented Apr 19, 2026

Summary

  • Storage-only numeric padding: purely-numeric tag values are zero-padded to width 4 in SQLite (disc:1 → stored as disc:0001) so ORDER BY sorts correctly. PadTagValue applied at every DB write site; UnpadTagValue strips at every read site. Public API, NFC tokens, and ZapScript remain in natural form — no external contract change.

  • Net-new upstream tags from PigSaint/GameDataBase: keyboard, touchscreen, positional:4 (input); barcode namespace with barcodeboy/barcodereader (addon); vibration:rumble, accelerometer (embedded); vicdual, g80, h1, model1–model3/variants, naomi, and manufacturers nichibutsu, taiyo, tecfri:ambush, tourvision (arcadeboard); gameboy:infrared, gameboy:gba (compatibility); archimedes, atari:falcon, sega:32x, nintendo:disksystem, nintendo:gameandwatch, wonderswan (port); vr, keyword:ubikey (search); comicclassics (reboxed); pcemini, ninjajajamaru, zeldacollection, 3dfukkoku:01/:02 (rerelease); rev:f, set:f1/f2, alt:4/5/6 (range fills); ca (lang); ddrgb, fullchanger (addon controller); mobileadaptergb (link); glasses:mvd, led:powerantenna/bugsensor, pocketsakura, spectrumcommunicator (addon misc); seganet (reboxed).

  • Deprecated alias canonicalization: addon:barcodeboy → addon:barcode:barcodeboy; addon:controller:jcart → embedded:slot:jcart; addon:controller:rumble → embedded:vibration:rumble. Old NFC tokens and ZapScript files using former names resolve transparently at query time via CanonicalizeTagAlias in resolveFilter. MediaDB is ephemeral, so no migration is needed; a reindex produces canonical rows.
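
To illustrate the storage-only padding from the first bullet, here is a minimal sketch of the idea, not the code from this PR. The helper names `padTag` and `unpadTag` are hypothetical stand-ins for PadTagValue and UnpadTagValue, whose real signatures are not shown in this conversation. One assumption to note: in this sketch a value whose natural form already carries leading zeros (such as `01`) would come back as `1`; how the real functions treat that case is not stated here.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// storageWidth is the zero-padded width named in the PR description.
const storageWidth = 4

// isNumeric reports whether s is non-empty and made up only of ASCII digits.
func isNumeric(s string) bool {
	if s == "" {
		return false
	}
	for _, r := range s {
		if r < '0' || r > '9' {
			return false
		}
	}
	return true
}

// padTag zero-pads every purely numeric ":"-separated segment.
// Hypothetical helper; the real PadTagValue may differ.
func padTag(tag string) string {
	parts := strings.Split(tag, ":")
	for i, p := range parts {
		if isNumeric(p) && len(p) < storageWidth {
			parts[i] = strings.Repeat("0", storageWidth-len(p)) + p
		}
	}
	return strings.Join(parts, ":")
}

// unpadTag strips leading zeros from numeric segments, keeping one digit.
// Hypothetical helper; the real UnpadTagValue may differ.
func unpadTag(tag string) string {
	parts := strings.Split(tag, ":")
	for i, p := range parts {
		if isNumeric(p) {
			trimmed := strings.TrimLeft(p, "0")
			if trimmed == "" {
				trimmed = "0"
			}
			parts[i] = trimmed
		}
	}
	return strings.Join(parts, ":")
}

func main() {
	natural := []string{"disc:10", "disc:2", "disc:1"}
	padded := make([]string, len(natural))
	for i, t := range natural {
		padded[i] = padTag(t)
	}

	// sort.Strings compares byte-wise, which is also how a plain text
	// ORDER BY behaves, so "disc:10" lands before "disc:2" without padding.
	sort.Strings(natural)
	sort.Strings(padded)
	fmt.Println("natural order:", natural)
	fmt.Println("padded order: ", padded)

	// Read sites strip the padding again so callers see the natural form.
	for _, t := range padded {
		fmt.Println(t, "reads back as", unpadTag(t))
	}
}
```

Non-numeric values such as `rev:f` pass through both helpers unchanged, which is why the public API, NFC tokens, and ZapScript never see the padded form.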

Summary by CodeRabbit

Release Notes

  • New Features

    • Added support for numerous new tags including arcade boards (Sega boards, Nichibutsu, etc.), embedded hardware, input devices (keyboard, touchscreen), and additional port/rerelease variants.
    • Implemented automatic canonicalization of deprecated tag aliases to their current forms.
  • Improvements

    • Enhanced tag value normalization for consistent storage and retrieval across the database.
    • Improved tag caching mechanisms to ensure consistent tag handling throughout the application.

…icalization

Three related changes to the tag system:

1. Storage-only numeric padding: purely-numeric tag values are zero-padded to
   width 4 in SQLite (e.g. disc:1 → disc:0001) so ORDER BY sorts correctly.
   PadTagValue is applied at every DB write site; UnpadTagValue strips at every
   read site. Public API, NFC tokens, and ZapScript remain in natural form.

2. Net-new upstream tags from PigSaint/GameDataBase: keyboard, touchscreen,
   positional:4 (input); barcode namespace (addon); vibration:rumble and
   accelerometer (embedded); vicdual/g80/h1/model1-3/naomi and new
   manufacturers nichibutsu/taiyo/tecfri:ambush/tourvision (arcadeboard);
   gameboy:infrared and gameboy:gba (compatibility); archimedes/atari:falcon/
   sega:32x/nintendo:disksystem/nintendo:gameandwatch/wonderswan (port);
   vr and keyword:ubikey (search); comicclassics (reboxed); pcemini/
   ninjajajamaru/zeldacollection and 3dfukkoku:01/02 (rerelease); rev:f,
   set:f1/f2, alt:4/5/6 (range fills); ca (lang); ddrgb/fullchanger (addon
   controller); mobileadaptergb (link); glasses:mvd, led:powerantenna/bugsensor,
   pocketsakura, spectrumcommunicator (addon misc); seganet (reboxed).

3. Deprecated alias canonicalization: addon:barcodeboy rewrites to
   addon:barcode:barcodeboy; addon:controller:jcart to embedded:slot:jcart;
   addon:controller:rumble to embedded:vibration:rumble. Old NFC tokens and
   ZapScript files using the former names resolve transparently at query time
   via CanonicalizeTagAlias in resolveFilter.
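
The alias rewrite in item 3 can be pictured as a plain lookup table. The sketch below is an assumption about the shape of such a table, not the contents of aliases.go; only the three alias pairs are taken from the commit message, and the names `deprecatedAliases`, `canonicalizeAlias`, and `aliasTargetsAreTerminal` are hypothetical.

```go
package main

import "fmt"

// deprecatedAliases maps former tag names to their canonical forms.
// The three entries are the ones listed in the commit message.
var deprecatedAliases = map[string]string{
	"addon:barcodeboy":        "addon:barcode:barcodeboy",
	"addon:controller:jcart":  "embedded:slot:jcart",
	"addon:controller:rumble": "embedded:vibration:rumble",
}

// canonicalizeAlias returns the canonical form of a deprecated tag,
// or the tag unchanged when it is not a known alias.
func canonicalizeAlias(tag string) string {
	if canonical, ok := deprecatedAliases[tag]; ok {
		return canonical
	}
	return tag
}

// aliasTargetsAreTerminal reports whether no rewrite target is itself a
// deprecated key, which rules out rewrite chains and cycles.
func aliasTargetsAreTerminal() bool {
	for _, target := range deprecatedAliases {
		if _, again := deprecatedAliases[target]; again {
			return false
		}
	}
	return true
}

func main() {
	for _, tag := range []string{"addon:barcodeboy", "addon:controller:jcart", "disc:1"} {
		fmt.Printf("%s -> %s\n", tag, canonicalizeAlias(tag))
	}
	fmt.Println("targets terminal:", aliasTargetsAreTerminal())
}
```

Because the lookup happens at query time, an old token that still says `addon:controller:rumble` matches rows indexed under `embedded:vibration:rumble` without any stored data being rewritten.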

coderabbitai bot commented Apr 19, 2026

Walkthrough

The PR introduces comprehensive tag value normalization by implementing padding/unpadding functions for numeric tag segments, adding tag alias canonicalization for deprecated formats, and systematically applying these transformations across database queries, caching, and indexing operations. Canonical tag definitions are expanded with new hardware, arcade board, and input device tags.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Tag Storage & Normalization: pkg/database/tags/storage.go, pkg/database/tags/storage_test.go | Added PadTagValue and UnpadTagValue functions to transparently handle left-padding of numeric tag segments to 4 digits (e.g., 1 → 0001, prg:1 → prg:0001) and the inverse operation. Comprehensive test coverage verifies padding behavior, round-trip idempotency, and handling of hierarchical tags. |
| Tag Alias Canonicalization: pkg/database/tags/aliases.go, pkg/database/tags/aliases_test.go | Introduced CanonicalizeTagAlias function with a mapping table converting deprecated tag formats (e.g., addon:barcodeboy → addon:barcode:barcodeboy) to canonical forms. Tests validate alias rewrites, prevent cycles, and ensure all targets exist in canonical definitions. |
| SQL Query Normalization: pkg/database/mediadb/sql_cache.go, pkg/database/mediadb/sql_search.go, pkg/database/mediadb/sql_tags.go | Applied UnpadTagValue when reading tag values from database across three SQL query files. Tags retrieved from MediaTags, SystemTags, and tag lookup functions are now normalized before being stored in database.TagInfo structures. |
| Tag Filter Resolution: pkg/database/mediadb/tagfilter.go, pkg/database/mediadb/tagfilter_test.go | Added resolveFilter helper to canonicalize deprecated aliases and apply PadTagValue for storage-ready values. Updated BuildTagFilterSQL to resolve AND/NOT/OR filter clauses and duplicate padded arguments for file-level and title-level subqueries. New test validates alias canonicalization in filter SQL generation. |
| Cache Construction: pkg/database/mediadb/tag_cache.go | Normalized cached tag values by applying UnpadTagValue when populating database.TagInfo during cache construction. Refactored single-system fast path to use intermediate tagList variable. |
| Indexing Pipeline: pkg/database/mediascanner/indexing_pipeline.go | Applied PadTagValue normalization when dynamically creating revision tags (rev: prefixed) in AddMediaPath and when seeding canonical tags in SeedCanonicalTags. |
| Tag Definitions & Mappings: pkg/database/tags/tag_mappings.go, pkg/database/tags/tag_values.go, pkg/database/tags/tags.go | Expanded canonical tag definitions with new input types (keyboard variants, touchscreen variants, positional input), reorganized arcade board tags (added Sega boards and additional vendors), added embedded hardware tags (vibration, accelerometer), and introduced new compatibility, port, search, and reboxed tag categories. Remapped addon controller and barcode-related constants to hierarchical namespaces. |
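
The "Tag Filter Resolution" row describes resolveFilter as canonicalizing first and padding second. The following is a rough sketch of that ordering under several assumptions: the `tagFilter` struct, the split of a tag at its first colon into type and value, and all function names are hypothetical, and only one alias pair from the PR is included for brevity.

```go
package main

import (
	"fmt"
	"strings"
)

// tagFilter is a hypothetical filter shape: a tag type plus a value.
type tagFilter struct {
	Type  string
	Value string
}

// aliases holds one deprecated-to-canonical pair taken from the PR text.
var aliases = map[string]string{
	"addon:controller:rumble": "embedded:vibration:rumble",
}

// padNumeric zero-pads a purely numeric segment to width 4.
// strings.Trim leaves an empty string only when every byte is a digit.
func padNumeric(seg string) string {
	if seg == "" || len(seg) >= 4 || strings.Trim(seg, "0123456789") != "" {
		return seg
	}
	return strings.Repeat("0", 4-len(seg)) + seg
}

// resolveFilterSketch rewrites a deprecated alias, then converts the value
// to its storage form. It is an illustration, not the PR's resolveFilter.
func resolveFilterSketch(f tagFilter) tagFilter {
	full := f.Type + ":" + f.Value
	if canonical, ok := aliases[full]; ok {
		full = canonical
	}
	// Assumption: the type is everything before the first colon.
	typ, value, _ := strings.Cut(full, ":")
	segs := strings.Split(value, ":")
	for i, s := range segs {
		segs[i] = padNumeric(s)
	}
	return tagFilter{Type: typ, Value: strings.Join(segs, ":")}
}

func main() {
	fmt.Printf("%+v\n", resolveFilterSketch(tagFilter{Type: "addon", Value: "controller:rumble"}))
	fmt.Printf("%+v\n", resolveFilterSketch(tagFilter{Type: "disc", Value: "1"}))
}
```

Note that an alias can change the tag type as well as the value (`addon` becomes `embedded` above), so in this sketch the rewrite is applied to the joined tag before it is split again.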

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~35 minutes

Poem

🐰 Tags once wild with numbers bare,
Now padded neat with zeros fair!
Aliases bend to canonical light,
Four digits wide, the queries take flight,
From storage deep to cache so bright! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 65.38% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and concisely summarizes the three main changes: tag taxonomy updates, storage padding implementation, and alias canonicalization. |



sentry bot commented Apr 19, 2026

Codecov Report

❌ Patch coverage is 90.24390% with 8 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| pkg/database/mediadb/sql_tags.go | 71.42% | 4 Missing ⚠️ |
| pkg/database/mediadb/tagfilter.go | 88.23% | 1 Missing and 1 partial ⚠️ |
| pkg/database/tags/storage.go | 94.11% | 1 Missing and 1 partial ⚠️ |


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
pkg/database/mediadb/tagfilter_test.go (1)

292-311: Assert the exact alias-argument count.

require.Greater(t, len(args), 7) will still pass if BuildTagFilterSQL starts emitting extra duplicated args. This test is checking a fixed three-filter layout, so require.Len(t, args, 12) would catch that regression.

🧪 Tighten the assertion
-	require.Greater(t, len(args), 7)
+	require.Len(t, args, 12)
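
To see why the lower-bound assertion is weaker than the exact one, consider the sketch below. It is an assumption-laden model, not BuildTagFilterSQL itself: it supposes each filter contributes a type and a value once per subquery (file-level and title-level), which would give 3 × 2 × 2 = 12 arguments for the three-filter test. The `buildArgs` function and its `copies` parameter are hypothetical, and the filter values are picked from tags named in this PR.

```go
package main

import "fmt"

// tagFilter is a hypothetical filter shape: a tag type plus a value.
type tagFilter struct {
	Type  string
	Value string
}

// buildArgs models the SQL argument list: each filter's type and value
// are emitted once per subquery copy. Hypothetical stand-in only.
func buildArgs(filters []tagFilter, copies int) []any {
	args := make([]any, 0, len(filters)*2*copies)
	for _, f := range filters {
		for i := 0; i < copies; i++ {
			args = append(args, f.Type, f.Value)
		}
	}
	return args
}

func main() {
	filters := []tagFilter{
		{Type: "addon", Value: "barcode:barcodeboy"},
		{Type: "embedded", Value: "slot:jcart"},
		{Type: "embedded", Value: "vibration:rumble"},
	}

	expected := buildArgs(filters, 2)  // file-level + title-level
	regressed := buildArgs(filters, 3) // an accidental extra duplication

	for name, args := range map[string][]any{"expected": expected, "regressed": regressed} {
		fmt.Printf("%s: len=%d, len > 7: %v, len == 12: %v\n",
			name, len(args), len(args) > 7, len(args) == 12)
	}
}
```

Both argument lists satisfy the `> 7` check, but only the intended one satisfies the exact-length check, which is the regression the review comment wants the test to catch.
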
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/database/mediadb/tagfilter_test.go` around lines 292 - 311, The test
currently uses require.Greater(t, len(args), 7) which allows extra duplicated
args to slip by; update the assertion to require the exact expected count by
replacing that check with require.Len(t, args, 12) (this targets the args
returned from BuildTagFilterSQL), so the test will fail if BuildTagFilterSQL
emits more or fewer than the expected 12 alias/argument entries.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9140a644-24a8-4301-ab6a-7a4cf8c1e425

📥 Commits

Reviewing files that changed from the base of the PR and between 080e53f and 09bd959.

📒 Files selected for processing (14)
  • pkg/database/mediadb/sql_cache.go
  • pkg/database/mediadb/sql_search.go
  • pkg/database/mediadb/sql_tags.go
  • pkg/database/mediadb/tag_cache.go
  • pkg/database/mediadb/tagfilter.go
  • pkg/database/mediadb/tagfilter_test.go
  • pkg/database/mediascanner/indexing_pipeline.go
  • pkg/database/tags/aliases.go
  • pkg/database/tags/aliases_test.go
  • pkg/database/tags/storage.go
  • pkg/database/tags/storage_test.go
  • pkg/database/tags/tag_mappings.go
  • pkg/database/tags/tag_values.go
  • pkg/database/tags/tags.go

Comment thread pkg/database/tags/tag_values.go
@wizzomafizzo wizzomafizzo merged commit 7ac5dd4 into main Apr 19, 2026
12 checks passed
@wizzomafizzo wizzomafizzo deleted the feat/tag-taxonomy-storage-padding branch April 19, 2026 06:13