fix(discogs): accept Cat# identifiers (closes #101)#102
Conversation
…101) Hans reported v0.5.8 still rejecting 'CDP 7912682' (Bonnie Raitt "Nick of Time", Capitol 1989) — a valid Discogs Catalog Number. The old validateBarcode only accepted 12/13-digit numeric barcodes, so any release older than barcode adoption or any promo/limited pressing without a printed EAN couldn't be scraped. Discogs identifies releases via three keys: EAN/UPC (barcode on sleeve), Cat# a.k.a. catalog number (printed on disc/spine/label — alphanumeric), or free-text. MusicBrainz exposes the same split via its Lucene `barcode:` vs `catno:` filters. Changes: - validateBarcode now accepts Cat# patterns via new isCatalogNumber() - identifierKind() classifies inputs as 'barcode' / 'catno' / 'unknown' - fetchFromDiscogs routes catno inputs to ?catno= (not ?barcode=) - searchMusicBrainz routes catno inputs to catno: (not barcode:) - BMG-style pure-digit Cat# with internal separators ("74321 66847 2") are accepted via hasDigit && (hasLetter || hasSeparator) heuristic; bare 12/13-digit strings still classify as barcode (checked first). Test coverage: tests/discogs-catno.unit.php (22 assertions, pure PHP).
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughClassifica identificatori in Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Plugin as DiscogsPlugin
participant DiscogsAPI as Discogs (API)
participant MusicBrainz as MusicBrainz (API)
participant DB as Persistence
Client->>Plugin: invia identificatore
Plugin->>Plugin: identifierKind(input) -> "barcode"/"catno"/"unknown"
alt kind == barcode
Plugin->>DiscogsAPI: GET /database/search?barcode=<digits>
else kind == catno
Plugin->>DiscogsAPI: GET /database/search?catno=<trimmed value>
else
Plugin->>DiscogsAPI: GET /database/search? (senza barcode/catno)
end
DiscogsAPI-->>Plugin: risultati
alt nessun risultato rilevante
Plugin->>Plugin: query = buildMusicBrainzQuery(input)
alt query != ""
Plugin->>MusicBrainz: GET /ws/2/...?query=<fragment>
MusicBrainz-->>Plugin: risultati fallback
end
end
alt risultato contiene barcode
Plugin->>DB: persisti EAN/barcode
else risultato solo catno
Plugin->>DB: non persistere come EAN (salva solo catno)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minuti 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@storage/plugins/discogs/DiscogsPlugin.php`:
- Around line 344-347: The code is propagating catalog numbers into the
barcode/ean field because when identifierKind($isbn) returns 'catno' the same
$isbn is still passed as a fallback barcode; update the logic in the Discogs
plugin so the fallbackBarcode (or fallback barcode parameter) is only supplied
when the identifier kind is an actual barcode (i.e., kind === 'barcode' or
$param === 'barcode'), not when kind === 'catno'; locate uses of identifierKind,
$param, $isbn and any $fallbackBarcode assignments in the methods around the
current diff and the other occurrences referenced (the blocks using $param = ...
and code that passes $fallbackBarcode) and change them to pass null/omit the
fallback when kind is 'catno'.
- Around line 1112-1115: The MusicBrainz query must quote multi-word catalog
numbers to avoid Lucene splitting: when identifierKind($barcode) returns 'catno'
set the query value to a quoted string (e.g. "\"$barcode\"") before URL-encoding
so $url contains catno:"CDP 7912682" rather than catno:CDP 7912682; update the
$url construction to apply the quotes only for $field === 'catno' (use urlencode
or rawurlencode on the quoted value) while leaving barcode searches unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 256b2cff-6072-4421-ba3a-a3ecb80a4d1f
📒 Files selected for processing (3)
.gitignorestorage/plugins/discogs/DiscogsPlugin.phptests/discogs-catno.unit.php
…ne quoting Two Major findings from CodeRabbit on #102: 1. Cat# propagation into EAN field: fetchFromDiscogs() was passing the original input ($isbn) as fallbackBarcode to searchMusicBrainz() and fetchDiscogsReleaseFromSearchResult(). Both eventually reach mapReleaseToPinakes() / mapMusicBrainzToPinakes() which do `$releaseBarcode = $fallbackBarcode ?? extractBarcodeFromRelease(...)`, meaning a Cat# like "CDP 7912682" could end up persisted in the book's EAN column. Now compute $fallbackBarcode = $kind === 'barcode' ? $isbn : null and propagate that instead — Cat# inputs rely on extractBarcodeFromRelease() pulling a real EAN from the upstream payload. 2. MusicBrainz Lucene query split on spaces: `catno:CDP 7912682` is parsed by Lucene as `catno:CDP AND 7912682`, returning false-positive matches on unrelated releases. Per the MusicBrainz Indexed Search Syntax docs, multi-word phrases must be double-quoted. Extracted the field/value construction into a new static buildMusicBrainzQuery() that quotes catno values and strips non-digits from barcode values. rawurlencode() is applied to the full field:value fragment (not just the value) so the `:` stays URL-safe and the quotes survive transit. Test coverage: tests/discogs-catno.unit.php grew from 22 to 28 assertions covering the new buildMusicBrainzQuery() contract. CodeRabbit's curl verification confirmed the unquoted query returns wrong results.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@storage/plugins/discogs/DiscogsPlugin.php`:
- Around line 370-379: La variabile $fallbackBarcode viene assegnata
direttamente dall'input quando identifierKind($isbn) restituisce 'barcode',
permettendo che valori formattati tipo "5099902-988023" vengano preferiti e
salvati in ean; invece normalizza il valore prima di assegnarlo: quando
identifierKind(...) === 'barcode' passa $isbn attraverso la routine di
canonicalizzazione (es. rimuovere caratteri non numerici / conversione a EAN
canonico, o riusare extractBarcodeFromRelease se adatta) e assegna il risultato
solo se la normalizzazione produce un barcode valido; aggiorna la assegnazione
di $fallbackBarcode e assicurati che mapReleaseToPinakes riceva il valore
normalizzato o null se non valido.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: e6b14be6-4d04-46df-98e0-a343194f8a85
📒 Files selected for processing (2)
storage/plugins/discogs/DiscogsPlugin.phptests/discogs-catno.unit.php
…t round 3)
CodeRabbit round 3 flagged a residual issue in the round-2 fix: when
identifierKind() returns 'barcode', the raw input was assigned to
$fallbackBarcode. For inputs like '5099902-988023' (hyphenated EAN
accepted via isNumericBarcode's digit-stripping check), the hyphenated
form would then flow through mapReleaseToPinakes() into the `ean`
column — persisting a non-canonical value.
Extracted canonicalSearchIdentifier(string, string) as a shared public
static helper that:
- For kind='barcode': strips non-digits ('5099902-988023' → '5099902988023')
- For kind='catno' / 'unknown': trims whitespace only (Cat# spaces are
semantically part of the identifier, e.g. 'CDP 7912682')
fetchFromDiscogs() now uses canonicalSearchIdentifier() for BOTH the
Discogs search URL and the fallbackBarcode value, ensuring the form
passed to mapReleaseToPinakes()'s `$fallbackBarcode ?? extract...`
chain is always canonical digits.
Test coverage grew to 33 assertions — 5 new cases exercise
canonicalSearchIdentifier directly.
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
28 regression assertions on the DDL emitted by ArchivesPlugin (the strings that will hit the production database when an admin activates the plugin) + reflection-based checks on the DI contract. The .gitignore whitelist mirrors the one added on feature/discogs-catno-support (PR #102) so future plugin unit tests can live alongside the Playwright *.spec.js files without requiring case-by-case exceptions. Run: php tests/archives-plugin.unit.php → Passed: 28 Failed: 0
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@storage/plugins/discogs/DiscogsPlugin.php`:
- Around line 172-183: buildMusicBrainzQuery currently treats an identifierKind
of 'unknown' as a barcode which can produce malformed MusicBrainz queries;
change buildMusicBrainzQuery to detect identifierKind($input) === 'unknown' and
immediately return an empty string (or alternatively throw an
InvalidArgumentException) so callers won't generate queries like "barcode:abc",
and update the function docblock to state that it returns an empty string for
unknown identifiers and callers must filter that case; reference the
buildMusicBrainzQuery and identifierKind symbols when applying the fix.
- Around line 81-92: The isNumericBarcode function currently treats any string
composed of digits, spaces and hyphens as a barcode; tighten the validation in
isNumericBarcode by replacing the permissive /^[\s0-9\-]+$/ check with a pattern
that allows only digits with optional internal hyphen-separated groups and
optional surrounding spaces (e.g. /^\s*\d+(?:-\d+)*\s*$/) while keeping the
existing digit-count check (12 or 13 after stripping non-digits); update the
inline comment in isNumericBarcode to document this tradeoff and the allowed
hyphen format (so pure hyphenated catnos like 5099902-988023 remain valid but
arbitrary hyphen/space-only inputs do not get misclassified).
In `@tests/discogs-catno.unit.php`:
- Around line 22-24: The file loads Hooks and HookManager with require_once but
the comment says Hooks isn't autoloaded and activate() is never invoked, so
remove the redundant require_once lines for Hooks.php and HookManager.php (leave
the require for storage/plugins/discogs/DiscogsPlugin.php). If you prefer to
keep them as a precaution, instead update the comment above lines referencing
Hooks/HookManager to state they are intentionally required for parsing/type
usage (mention Hooks and HookManager by name) so the code and comment are
consistent.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: b23b9ba1-2d1a-493c-a4c2-a35335e7baca
📒 Files selected for processing (3)
.gitignorestorage/plugins/discogs/DiscogsPlugin.phptests/discogs-catno.unit.php
…empty query on unknown - isNumericBarcode: regex ristretta a /^\s*\d+(?:-\d+)*\s*$/ per escludere input con whitespace interno o pathologici fatti di soli separatori. 5099902-988023 (EAN hyphenated) resta valido. - buildMusicBrainzQuery: ritorna '' per identifierKind()==='unknown' invece di emettere barcode:<gibberish>. Rimosso else per soddisfare PHPMD. - searchMusicBrainz: guard early-return se query è vuota (no HTTP round-trip con query= malformato). - tests/discogs-catno.unit.php: rimossi require_once inutili per Hooks.php e HookManager.php (commento già indicava che activate() non è invocato). Aggiunti 2 test di regressione per il nuovo contratto 'unknown' → ''.
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
1 similar comment
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@storage/plugins/discogs/DiscogsPlugin.php`:
- Around line 81-97: The isNumericBarcode() check currently rejects barcodes
with internal whitespace (e.g. "0777 7836 4627") even after verifying 12/13
digits, so update the regex in isNumericBarcode to allow spaces as valid
separators between digit-groups while still rejecting letters and pathological
separators; specifically change the pattern used in preg_match from
'/^\s*\d+(?:-\d+)*\s*$/' to accept either a single space or hyphen between
groups (for example '/^\s*\d+(?:[ -]\d+)*\s*$/'), keeping the leading/trailing
whitespace handling and relying on the existing digit-count check.
- Around line 184-196: La funzione buildMusicBrainzQuery usa l'input grezzo per
i catalog numbers (catno) causando query con spazi non trim-mati; invece di
concatenare $input direttamente nell'if ($kind === 'catno') usa
canonicalSearchIdentifier($input) per ottenere la versione
normalizzata/trim-data, poi applica addcslashes su quel valore normalizzato e
costruisci la stringa catno:"..."; mantieni il flusso esistente per
identifierKind/ barcode (cioè lascia il branch barcode così com'è) e sostituisci
solo l'uso di $input nel ramo catno con la chiamata a canonicalSearchIdentifier.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 19dfdd63-bf09-4e5a-9acf-910851e5c8a6
📒 Files selected for processing (2)
storage/plugins/discogs/DiscogsPlugin.phptests/discogs-catno.unit.php
Add reusable Cat#/barcode seed fixtures and Playwright coverage for persistent music documents. Keep raw plugin-accepted identifiers through scrape hooks while preserving canonical ISBN handling.
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
…in MB query
- isNumericBarcode: regex ora /^\s*\d+(?:(?:-|\s+)\d+)*\s*$/ per ammettere
UPC-A con spazi di raggruppamento (es. "0777 7836 4627") oltre a quelli
hyphenated. Prima, dopo il round 4, finivano in isCatalogNumber().
- buildMusicBrainzQuery: ramo catno ora passa per canonicalSearchIdentifier()
per trim di whitespace padding. Allineato alla canonicalizzazione già in
uso sul ramo barcode (" SRX-6272 " → catno:"SRX-6272").
- tests: aggiunti regression test per UPC-A spaziato e catno con padding.
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/discogs-catno-documents.spec.js`:
- Around line 125-136: The spec currently asserts database values written by the
seeder (via SELECT from libri) so the expectation "kind==='catno' → ean===''"
only verifies seedDocument behavior, not DiscogsPlugin::fetchFromDiscogs or
ScrapeController::byIsbn; update the test to either (a) call the real pipeline
(DiscogsPlugin::fetchFromDiscogs → ScrapeController::byIsbn) and assert the
resulting libri row, or (b) keep it as a persistent fixture/smoke UI spec but
add a clear top-of-file comment and test description stating this is a fixture
persistence check (not a regression test for DiscogsPlugin), referencing the
variables/functions used here (dbQuery, seedDocument, doc, and the libri table)
so future readers understand the intent.
- Around line 49-83: The UPDATE in seedDocument currently uses WHERE titolo =
... which can affect non-seed rows; change the seed logic to target a
deterministic seed key: when running the UPDATE/SELECT/INSERT use the E2E key
embedded in note_varie (e.g. add WHERE note_varie LIKE '%E2E seed key:
${doc.key}%' to the UPDATE and subsequent SELECT) or add/consume a dedicated
seed_key column and filter by it; update the dbExec UPDATE, the dbQuery SELECT
id lookup, and the INSERT to set that same seed identifier so the operation
becomes idempotent and only touches seeded rows.
In `@tests/seeds/discogs-catno-documents.json`:
- Line 11: The fixture entry E2E_CATNO_010 is misclassified as kind:"catno"
because its identifier "00946 3 21438 2 9" now matches the updated
isNumericBarcode() and will be treated as a barcode; either change E2E_CATNO_010
to kind:"barcode" and set canonical:"0094632143829" (and ensure ean/related
fields reflect barcode usage), or replace the identifier with a non-numeric Cat#
(e.g., "EMI CDP 5099902") and keep kind:"catno"; make the same corresponding
update in the paired identifiers fixture (the discogs-catno-identifiers entry
for E2E_CATNO_010) so both seeds remain consistent with production behavior.
In `@tests/seeds/discogs-catno-identifiers.json`:
- Around line 74-81: The seed "EMI spaced numeric Cat# with 13 digits" is being
matched by isNumericBarcode (/^\s*\d+(?:(?:-|\s+)\d+)*\s*$/) and thus
identifierKind, canonicalSearchIdentifier and buildMusicBrainzQuery treat it as
a barcode; either (A) update this seed to be a barcode-consistent test by
changing its kind to "barcode", setting canonical to the digits-only form
"0094632143829" and musicBrainzQuery to "barcode:0094632143829" (so
identifierKind -> "barcode" matches expectations), or (B) preserve catno
semantics by changing the input string to avoid a pure 13-digit spaced sequence
(e.g. prefix with "EMI " -> "EMI 00946 3 21438 2 9") so isNumericBarcode no
longer matches and identifierKind returns "catno" and
canonicalSearchIdentifier/buildMusicBrainzQuery remain as the catno values;
apply one of these fixes and run the unit tests referencing isNumericBarcode,
identifierKind, canonicalSearchIdentifier, and buildMusicBrainzQuery to verify.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: b3d76485-bb1a-4ef2-93c5-0929cae71406
📒 Files selected for processing (8)
.gitignoreapp/Controllers/ScrapeController.phpstorage/plugins/discogs/DiscogsPlugin.phptests/discogs-catno-documents.spec.jstests/discogs-catno-seeds.unit.phptests/discogs-catno.unit.phptests/seeds/discogs-catno-documents.jsontests/seeds/discogs-catno-identifiers.json
…code reclass Risolve 3 findings CodeRabbit su PR #102: 1. tests/discogs-catno-documents.spec.js (Minor): UPDATE/SELECT di seedDocument ora filtra per il marker "E2E seed key: <doc.key>" embedded in note_varie invece che per titolo. Prima, un titolo non-seed identico a quello del fixture (o un soft-deleted con stesso titolo) veniva mutato dalla UPDATE, producendo seed non idempotenti su DB pre-esistenti. Ora la cycle UPDATE→SELECT→INSERT tocca esclusivamente righe seed-owned. 2. tests/seeds/discogs-catno-documents.json (Major): E2E_CATNO_010 "00946 3 21438 2 9" era classificato kind="catno" ma con la regex aggiornata di isNumericBarcode (che ammette spazi come separatori fra digit-group per UPC-A spaziati) identifierKind() ora lo restituisce come "barcode". Riclassificato kind="barcode", canonical="0094632143829" per allinearsi al comportamento reale del plugin — altrimenti il seeder scriveva ean='' per un input che in produzione finirebbe regolarmente in ean=0094632143829. 3. tests/seeds/discogs-catno-identifiers.json (Critical): Stessa inconsistenza sul gemello "EMI spaced numeric ... 13 digits". Riclassificato come barcode: canonical digit-stripped + MB query barcode:0094632143829. Senza questo fix discogs-catno-seeds.unit.php falliva 3 assertion (identifierKind, canonical, buildMusicBrainzQuery) sul case #10. Verifiche: seeds unit 21/21 OK, main unit 38/38 OK, PHPStan clean.
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@storage/plugins/discogs/DiscogsPlugin.php`:
- Around line 106-119: La funzione isCatalogNumber() attualmente classifica
stringhe come "080442957X" come catno anche se sono ISBN-10 validi; update
isCatalogNumber(string $input) per escludere ISBN-10 validi prima di restituire
true: implementa o riusa una funzione helper isValidIsbn10(string $s) che
normalizzi il valore e verifichi checksum (supportando 'X' come check digit) e,
se isValidIsbn10($trimmed) è true, far tornare false da isCatalogNumber;
aggiungi un unit test che verifica che identifierKind('080442957X') non ritorni
'catno' (usa identifierKind or the codepath that maps inputs to 'catno') per
prevenire ricerche Discogs sbagliate.
In `@tests/discogs-catno-documents.spec.js`:
- Around line 7-8: Aggiorna l'intestazione del test in
tests/discogs-catno-documents.spec.js che documenta il comando Playwright:
sostituisci il riferimento a /tmp/run-e2e.sh e aggiungi l'opzione --workers=1
così che il comando esegua il wrapper per credenziali DB/admin e forzi
l'esecuzione seriale; assicurati di modificare la riga contenente il
suggerimento "npx playwright test tests/discogs-catno-documents.spec.js
--config=tests/playwright.config.js" per includere entrambi i vincoli richiesti.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6038b00f-842e-4ab1-b56c-a9e04af57554
📒 Files selected for processing (8)
.gitignoreapp/Controllers/ScrapeController.phpstorage/plugins/discogs/DiscogsPlugin.phptests/discogs-catno-documents.spec.jstests/discogs-catno-seeds.unit.phptests/discogs-catno.unit.phptests/seeds/discogs-catno-documents.jsontests/seeds/discogs-catno-identifiers.json
| private static function isCatalogNumber(string $input): bool | ||
| { | ||
| $trimmed = trim($input); | ||
| $len = strlen($trimmed); | ||
| if ($len < 4 || $len > 30) { | ||
| return false; | ||
| } | ||
| if (preg_match('/^[A-Za-z0-9][A-Za-z0-9 .\-_\/]*[A-Za-z0-9]$/', $trimmed) !== 1) { | ||
| return false; | ||
| } | ||
| $hasLetter = preg_match('/[A-Za-z]/', $trimmed) === 1; | ||
| $hasDigit = preg_match('/[0-9]/', $trimmed) === 1; | ||
| $hasSeparator = preg_match('/[ .\-_\/]/', $trimmed) === 1; | ||
| return $hasDigit && ($hasLetter || $hasSeparator); |
There was a problem hiding this comment.
Escludi gli ISBN-10 validi dal rilevamento Cat#.
isCatalogNumber() accetta valori come 080442957X come catno; è un ISBN-10 valido e contiene cifre + X. Nel flusso controller questo può far partire una ricerca Discogs catno=080442957X e fondere metadati musicali in una scheda libro.
🐛 Fix proposto
private static function isCatalogNumber(string $input): bool
{
$trimmed = trim($input);
$len = strlen($trimmed);
if ($len < 4 || $len > 30) {
return false;
}
+ if (self::isIsbn10($trimmed)) {
+ return false;
+ }
if (preg_match('/^[A-Za-z0-9][A-Za-z0-9 .\-_\/]*[A-Za-z0-9]$/', $trimmed) !== 1) {
return false;
}
@@
return $hasDigit && ($hasLetter || $hasSeparator);
}
+
+private static function isIsbn10(string $input): bool
+{
+ $value = strtoupper(trim($input));
+ if (preg_match('/^\d{9}[\dX]$/', $value) !== 1) {
+ return false;
+ }
+
+ $sum = 0;
+ for ($i = 0; $i < 9; $i++) {
+ $sum += (int) $value[$i] * (10 - $i);
+ }
+ $checkDigit = (11 - ($sum % 11)) % 11;
+ $expected = $checkDigit === 10 ? 'X' : (string) $checkDigit;
+
+ return $value[9] === $expected;
+}Aggiungi anche un test per 080442957X affinché identifierKind() non ritorni catno.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@storage/plugins/discogs/DiscogsPlugin.php` around lines 106 - 119, La
funzione isCatalogNumber() attualmente classifica stringhe come "080442957X"
come catno anche se sono ISBN-10 validi; update isCatalogNumber(string $input)
per escludere ISBN-10 validi prima di restituire true: implementa o riusa una
funzione helper isValidIsbn10(string $s) che normalizzi il valore e verifichi
checksum (supportando 'X' come check digit) e, se isValidIsbn10($trimmed) è
true, far tornare false da isCatalogNumber; aggiungi un unit test che verifica
che identifierKind('080442957X') non ritorni 'catno' (usa identifierKind or the
codepath that maps inputs to 'catno') per prevenire ricerche Discogs sbagliate.
| * Playwright. Run with: | ||
| * npx playwright test tests/discogs-catno-documents.spec.js --config=tests/playwright.config.js |
There was a problem hiding this comment.
Aggiorna il comando E2E con wrapper e worker seriale.
Questa spec usa credenziali DB/admin e fixture persistenti: il comando documentato deve passare da /tmp/run-e2e.sh e includere --workers=1.
🧪 Fix proposto
- * npx playwright test tests/discogs-catno-documents.spec.js --config=tests/playwright.config.js
+ * /tmp/run-e2e.sh npx playwright test tests/discogs-catno-documents.spec.js --config=tests/playwright.config.js --workers=1As per coding guidelines, tests/**: "I test E2E richiedono /tmp/run-e2e.sh per credenziali DB/admin" e "--workers=1 obbligatorio per esecuzione seriale."
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| * Playwright. Run with: | |
| * npx playwright test tests/discogs-catno-documents.spec.js --config=tests/playwright.config.js | |
| * Playwright. Run with: | |
| * /tmp/run-e2e.sh npx playwright test tests/discogs-catno-documents.spec.js --config=tests/playwright.config.js --workers=1 |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/discogs-catno-documents.spec.js` around lines 7 - 8, Aggiorna
l'intestazione del test in tests/discogs-catno-documents.spec.js che documenta
il comando Playwright: sostituisci il riferimento a /tmp/run-e2e.sh e aggiungi
l'opzione --workers=1 così che il comando esegua il wrapper per credenziali
DB/admin e forzi l'esecuzione seriale; assicurati di modificare la riga
contenente il suggerimento "npx playwright test
tests/discogs-catno-documents.spec.js --config=tests/playwright.config.js" per
includere entrambi i vincoli richiesti.
Summary
Fixes Hans' report on issue #101: Discogs plugin rejected
CDP 7912682(Bonnie Raitt — Nick of Time, Capitol 1989) because the oldvalidateBarcode()only accepted 12/13-digit numeric barcodes. Any release older than barcode adoption, or any promo/limited pressing without a printed EAN, couldn't be scraped.Discogs identifies releases through three distinct keys: EAN/UPC (barcode on sleeve), Cat# a.k.a. catalog number (printed on disc/spine/label — alphanumeric), and free-text (title + artist). MusicBrainz exposes the same split via its Lucene
barcode:vscatno:filters.Changes
validateBarcode()now accepts Cat# patterns via newisCatalogNumber()helper.identifierKind()classifies inputs asbarcode/catno/unknown.fetchFromDiscogs()routescatnoinputs to?catno=(not?barcode=).searchMusicBrainz()routescatnoinputs tocatno:Lucene field (notbarcode:).74321 66847 2) are accepted viahasDigit && (hasLetter || hasSeparator)heuristic; bare 12/13-digit strings still classify as barcode (checked first, so no regression).Test plan
php tests/discogs-catno.unit.php— 22/22 assertions green (ISBN, EAN-13, UPC-A, Cat# variants, rejection edge cases, classification)CDP 7912682in a real install with Discogs plugin active (user verification)5099902988023etc.)Notes
tests/*.unit.phpwhitelisted in.gitignoreso future plugin unit tests can live alongside*.spec.jsE2E tests.phpstan.neongot astorage/pluginsentry inscanDirectorieslocally (file is gitignored), which lets PHPStan resolve plugin classes from test files.@coderabbitai review
Summary by CodeRabbit
Miglioramenti
Test
Chore