feat(archives): Dublin Core + EAD3 export + OAI-PMH 2.0#127

Merged
fabiodalez-dev merged 11 commits into main from feat/archives-interop-standards
May 4, 2026

@fabiodalez-dev fabiodalez-dev commented May 1, 2026

Summary

Related roadmap (future)

Test plan

  • GET /admin/archives/{id}/dc.xml: verify the Dublin Core XML is well formed and uses the correct namespaces
  • GET /admin/archives/export.ead3: download the EAD3 file and validate the XML structure
  • GET /archives/oai?verb=Identify: Identify response with repository info
  • GET /archives/oai?verb=ListMetadataFormats: returns oai_dc and ead3
  • GET /archives/oai?verb=ListRecords&metadataPrefix=oai_dc: pagination with resumptionToken
  • GET /archives/oai?verb=GetRecord&identifier=oai:pinakes:archival_unit:1&metadataPrefix=oai_dc: single record
  • GET /archives/oai?verb=ListSets: noSetHierarchy response
  • "Esporta EAD3" and "OAI-PMH" buttons visible in the archives list (active admin only)
  • <link rel="alternate" type="application/rdf+xml"> present in the <body> of the detail page

@coderabbitai full review

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features
    • Bulk export in EAD3 format
    • Public and admin OAI-PMH 2.0 endpoint
    • Dublin Core export for individual archival units
    • New UI links/actions for reaching the exports and the OAI Identify response
  • UI
    • The layout can now include additional links in the head for alternative metadata
  • Translations
    • Added IT, EN, and DE labels for "Esporta EAD3" and "OAI-PMH"
  • Tests
    • Added regression tests for OAI-PMH, the resumption token, and headLinks rendering

- Dublin Core XML: GET /admin/archives/{id}/dc.xml with full ISAD(G)→DC
  crosswalk; discovery <link rel="alternate"> in detail view (#124)
- EAD3 export: GET /admin/archives/export.ead3 downloads full collection
  as EAD3 XML with ISAD(G)→EAD3 crosswalk (#125)
- OAI-PMH 2.0: public GET|POST /archives/oai endpoint — Identify,
  ListMetadataFormats (oai_dc + ead3), ListIdentifiers, ListRecords,
  GetRecord, ListSets; cursor-based pagination via resumptionToken (#126)
- Export buttons added to archives list view
- Translations added to it_IT, en_US, de_DE

Closes #124, #125, #126

coderabbitai Bot commented May 1, 2026

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 97024591-a916-438f-b7b6-0b501ed98f39

📥 Commits

Reviewing files that changed from the base of the PR and between 6d5ae9c and 942d7a8.

📒 Files selected for processing (7)
  • app/Views/frontend/layout.php
  • app/Views/layout.php
  • locale/de_DE.json
  • locale/en_US.json
  • storage/plugins/archives/ArchivesPlugin.php
  • storage/plugins/archives/plugin.json
  • tests/archives-plugin.unit.php
📝 Walkthrough

The PR adds three interoperability endpoints (per-unit Dublin Core, bulk EAD3 export, public OAI‑PMH endpoint), extends the layouts to emit links in <head> via the $headLinks variable, adds UI links to the archives index, and adds localization keys (IT/EN/DE). A test exercises OAI‑PMH and $headLinks.

Changes

Archival Interoperability and OAI‑PMH

| Layer / File(s) | Summary |
|---|---|
| Routing / Registration: storage/plugins/archives/ArchivesPlugin.php (lines ~598–640) | Registers four routes: /admin/archives/{id}/dc.xml, /archives/{id}/dc.xml, /admin/archives/export.ead3, /archives/oai (GET/POST). |
| UI, head link injection: storage/plugins/archives/ArchivesPlugin.php (lines ~1028–1031) | showAction() adds a headLinks entry with a <link rel="alternate"> pointing to the unit's public Dublin Core record. |
| Template, head links rendering (frontend): app/Views/frontend/layout.php (lines ~209–221) | After assets.head, the layout iterates over $headLinks when it is a non-empty array and prints a <link> with rel, type, title, href. |
| Template, head links rendering (admin/layout): app/Views/layout.php (lines ~61–75) | Same $headLinks rendering logic in the main layout; values are HTML-escaped. |
| Core implementation, Dublin Core export: storage/plugins/archives/ArchivesPlugin.php (lines ~4042–4148) | exportDublinCoreAction() builds oai_dc:dc XML for a single unit (title, creators filtered by the creator role, identifier, dates, description, type, format, language, rights). |
| Core implementation, EAD3 bulk export: storage/plugins/archives/ArchivesPlugin.php (lines ~4150–4448) | exportEad3CollectionAction() serializes up to 1000 non-deleted units into an <eadlist> container; writeEad3Document() emits a namespaced <ead> and maps fields/units/authorities to the corresponding elements. |
| Core implementation, OAI‑PMH dispatcher: storage/plugins/archives/ArchivesPlugin.php (lines ~4449–4889) | oaiPmhAction() implements the OAI verbs (Identify, ListMetadataFormats, ListRecords/ListIdentifiers, GetRecord, ListSets), supports oai_dc/marcxml/ead3, filters out deleted units, applies from/until to updated_at and set to level, and paginates with a cursor carried in resumptionToken, with helpers for tokens and errors. |
| UI, index actions: storage/plugins/archives/views/index.php (lines ~131–138) | Adds two buttons to the header: "Esporta EAD3" (/admin/archives/export.ead3) and "OAI‑PMH" (/archives/oai?verb=Identify, opens in a new tab). |
| Localization: locale/en_US.json (lines ~4437–4438), locale/de_DE.json (lines ~4437–4438), locale/it_IT.json (lines ~237–238) | Adds translation keys: Esporta EAD3 and OAI‑PMH. |
| Tests: tests/archives-plugin.unit.php (lines ~72–83, 98–113) | Adds assertions that the public DC route exists, that the resumptionToken is decoded before validation, and that ListIdentifiers does not wrap headers in <record>; adds a reflection-based encode→decode test for the private resumption token methods; verifies that the layouts honor $headLinks. |

Sequence Diagram

sequenceDiagram
    actor Client
    participant OAI as OAI-PMH Handler
    participant DB as Database
    participant Serializer as DC/EAD3 Serializer
    Client->>OAI: GET /archives/oai?verb=Identify
    OAI->>OAI: Validate verb & params
    OAI->>Serializer: Build Identify response
    Serializer-->>OAI: XML Identify element
    OAI-->>Client: 200 + OAI Identify XML

    Client->>OAI: GET /archives/oai?verb=ListRecords&metadataPrefix=oai_dc
    OAI->>DB: Query non-deleted units (from/until/set filters, cursor)
    DB-->>OAI: Unit rows + cursor
    OAI->>Serializer: Build DC record per unit
    Serializer-->>OAI: oai_dc:dc elements
    OAI->>OAI: Encode resumptionToken(cursor, prefix, filters)
    OAI-->>Client: 200 + ListRecords (records + resumptionToken)

    Client->>OAI: GET /archives/oai?verb=ListRecords&resumptionToken=...
    OAI->>OAI: Decode resumptionToken → cursor, prefix, filters
    OAI->>DB: Query units from cursor with filters
    DB-->>OAI: Next batch + new cursor
    OAI->>Serializer: Build next batch of records
    Serializer-->>OAI: Record elements
    OAI-->>Client: 200 + ListRecords (next batch)
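
The paging flow in the diagram can be sketched from the harvester's side. This is a minimal sketch, not code from the PR: `harvestIdentifiers()` and the `$fetch` callable (query parameters in, response XML out) are hypothetical names, and the only protocol facts relied on are the OAI-PMH 2.0 namespace and the rule that `resumptionToken` is sent as the sole argument besides `verb` on continuation requests.

```php
<?php
declare(strict_types=1);

/**
 * Collect every record identifier from a ListRecords harvest,
 * following resumptionToken until the server returns an empty one.
 *
 * $fetch is a hypothetical transport: fn(array $queryParams): string (XML).
 */
function harvestIdentifiers(callable $fetch, string $metadataPrefix): array
{
    $ids = [];
    $params = ['verb' => 'ListRecords', 'metadataPrefix' => $metadataPrefix];

    do {
        $xml = new SimpleXMLElement($fetch($params));
        $xml->registerXPathNamespace('oai', 'http://www.openarchives.org/OAI/2.0/');

        foreach ($xml->xpath('//oai:header/oai:identifier') ?: [] as $node) {
            $ids[] = (string) $node;
        }

        // An absent or empty <resumptionToken> means the list is complete.
        $tokens = $xml->xpath('//oai:resumptionToken') ?: [];
        $token = $tokens ? trim((string) $tokens[0]) : '';

        // Continuation requests carry only verb + resumptionToken.
        $params = ['verb' => 'ListRecords', 'resumptionToken' => $token];
    } while ($token !== '');

    return $ids;
}
```

In practice `$fetch` could wrap an HTTP GET against `/archives/oai` with `http_build_query($params)`; keeping the transport injectable makes the loop testable without a running server.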

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 72.22%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and accurately describes the three main features implemented: Dublin Core export, EAD3 export, and the OAI-PMH 2.0 endpoint. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

- oaiIdentify: read repositoryName and adminEmail from ConfigStore
  instead of hardcoded placeholders (breaks every non-default install)
- oaiListRecords: filter from/until on updated_at, not created_at;
  OAI-PMH datestamps track last-modified, so harvesters using
  incremental selects would miss updated records
- ListMetadataFormats: add ead3 as a supported prefix (the EAD3 writer
  existed but was not advertised or reachable via OAI); handle
  metadataPrefix=ead3 in both oaiListRecords and oaiGetRecord
- marcxml schema URL: remove duplicate namespace prefix that was
  incorrectly embedded in the <schema> element value
- exportEad3CollectionAction: document <eadlist> as a Pinakes-specific
  container (not part of EAD3 schema) so consuming tools know to strip
  the outer wrapper

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@storage/plugins/archives/ArchivesPlugin.php`:
- Around line 4575-4593: The resumption token handling is done after validating
$metadataPrefix, so a continuation request that supplies only resumptionToken
triggers cannotDisseminateFormat; move token parsing ahead of the
$metadataPrefix validation so a token can override/restore
metadataPrefix/from/until/set; specifically, in the block handling $token and
$cursor (variables $token, $cursor, $parts) parse and decode the token first to
populate $metadataPrefix, $from, $until, $set and only then validate
$metadataPrefix via the existing in_array check and oaiError call; also replace
the fragile colon-joined token generation/consumption used around the code that
builds tokens (currently at the token creation around lines 4709-4714) with a
robust encoding (e.g., urlencoded components, JSON or base64-encoded payload) so
date/time and values containing ':' are safe to split and restore.
- Around line 4670-4702: The code incorrectly wraps a <header> inside a <record>
even when $identifiersOnly is true (i.e. $verbElement = 'ListIdentifiers'),
which breaks OAI-PMH schema; change the loop so that when $identifiersOnly is
true you do NOT call $xw->startElement('record') / $xw->endElement() and instead
write the <header> (identifier, datestamp, optional setSpec) as a direct child
of the top-level verb element; keep the existing metadata branch
(fetchAuthoritiesForArchivalUnit, writeDublinCoreRecord, writeEad3Document,
writeArchivalUnitMarcRecord) only inside the !$identifiersOnly path and ensure
corresponding $xw->startElement('metadata') / $xw->endElement() calls remain
balanced.

In `@storage/plugins/archives/views/show.php`:
- Around line 502-504: The <link rel="alternate" ... href="<?=
$e(url('/admin/archives/' . $id . '/dc.xml')) ?>"> tag is currently rendered in
the view body and should be moved into the page <head>; remove the inline
emission from show.php and instead emit the same link element into the template
engine's head slot/layout (use the head/layout slot API your renderer provides)
so the exact href expression $e(url('/admin/archives/' . $id . '/dc.xml')) is
output in the document head rather than the body.
- Line 504: The link in storage/plugins/archives/views/show.php currently points
to the protected route "/admin/archives/{id}/dc.xml" via href="<?=
$e(url('/admin/archives/' . $id . '/dc.xml')) ?>", which prevents
unauthenticated harvesters from retrieving the DC record; update the view to use
a public endpoint (e.g. "/archives/{id}/dc.xml"), or create an equivalent public
controller/route that exposes the Dublin Core record, and then replace the call
to url('/admin/archives/' . $id . '/dc.xml') with url('/archives/' . $id .
'/dc.xml') (or implement and wire up an explicit machine-to-machine strategy if
protection must be kept), keeping the $id variable and the url() function/$e
escaper.
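
The ListIdentifiers finding above comes down to where `<header>` sits: inside `<record>` for ListRecords, directly under the verb element for ListIdentifiers. The following is a minimal sketch of that branching with `XMLWriter`, not the plugin's code; `writeOaiList()`, the unit array keys, and the placeholder `<title>` payload are assumptions made for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Serialize units as ListIdentifiers (bare headers) or ListRecords
 * (header + metadata wrapped in <record>). Hypothetical helper.
 */
function writeOaiList(array $units, bool $identifiersOnly): string
{
    $xw = new XMLWriter();
    $xw->openMemory();
    $xw->startElement($identifiersOnly ? 'ListIdentifiers' : 'ListRecords');

    foreach ($units as $unit) {
        if (!$identifiersOnly) {
            $xw->startElement('record');
        }

        $xw->startElement('header');
        $xw->writeElement('identifier', $unit['identifier']);
        $xw->writeElement('datestamp', $unit['datestamp']);
        $xw->endElement(); // header

        if (!$identifiersOnly) {
            $xw->startElement('metadata');
            $xw->writeElement('title', $unit['title']); // placeholder payload
            $xw->endElement(); // metadata
            $xw->endElement(); // record
        }
    }

    $xw->endElement(); // ListIdentifiers | ListRecords
    return $xw->outputMemory();
}
```

Opening and closing `<record>` under the same condition keeps the start/end calls balanced in both modes.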
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 47f3c9fd-b7b8-4bb3-a70f-e10d57533130

📥 Commits

Reviewing files that changed from the base of the PR and between 18c168c and 876f7f0.

📒 Files selected for processing (6)
  • locale/de_DE.json
  • locale/en_US.json
  • locale/it_IT.json
  • storage/plugins/archives/ArchivesPlugin.php
  • storage/plugins/archives/views/index.php
  • storage/plugins/archives/views/show.php

Comment thread storage/plugins/archives/ArchivesPlugin.php
Comment thread storage/plugins/archives/ArchivesPlugin.php
Comment thread storage/plugins/archives/views/show.php Outdated
Comment on lines +502 to +504
<?php /* Dublin Core discovery link — discoverable by aggregators and web crawlers. */ ?>
<link rel="alternate" type="application/rdf+xml" title="Dublin Core"
href="<?= $e(url('/admin/archives/' . $id . '/dc.xml')) ?>">

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Move the discovery link into the page <head>

At lines 502-504 the <link rel="alternate"> tag is emitted in the view body, so metadata consumers may ignore it. For discovery it should be published in the <head> (via the template engine's layout/head slot).

Comment thread storage/plugins/archives/views/show.php Outdated

<?php /* Dublin Core discovery link — discoverable by aggregators and web crawlers. */ ?>
<link rel="alternate" type="application/rdf+xml" title="Dublin Core"
href="<?= $e(url('/admin/archives/' . $id . '/dc.xml')) ?>">

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Discovery URL under /admin is not compatible with public harvesters

At line 504 the href uses /admin/archives/{id}/dc.xml: if the route is protected, unauthenticated aggregators and crawlers cannot read the Dublin Core record. An equivalent public endpoint is needed (or an explicit machine-to-machine access strategy).

@fabiodalez-dev
Owner Author

@coderabbitai full review

coderabbitai Bot commented May 2, 2026

✅ Actions performed

Full review triggered.

@fabiodalez-dev
Owner Author

All three CodeRabbit findings (reviewed against 876f7f0) are already resolved in the follow-up commit 9d07ce2 (fix(archives): harden OAI-PMH and DC discovery):

| Finding | Status |
|---|---|
| resumptionToken decoded after metadataPrefix validation | ✅ Fixed: decodeOaiResumptionToken() is now called at line 4599, before the in_array check at line 4612. Token uses base64url-JSON, not colon-delimited string. |
| ListIdentifiers wraps <header> inside <record> | ✅ Fixed: $xw->startElement('record') / endElement() are now conditional on !$identifiersOnly (lines 4703–4729). <header> is emitted as a direct child of <ListIdentifiers>. |
| <link rel="alternate"> in <body>, pointing to admin-only route | ✅ Fixed: the inline tag was removed from show.php; both app/Views/layout.php and app/Views/frontend/layout.php now expose a $headExtra slot (lines 64–65 / 210–211); renderShowAction() injects <link rel="alternate"> pointing to the public /archives/{id}/dc.xml route (added alongside the admin route at line 609) via that slot. |
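
For readers unfamiliar with the base64url-JSON token format mentioned in the first row, here is a minimal sketch of how such a token can round-trip. The function names and the state keys (`cursor`, `metadataPrefix`, `from`, `until`, `set`) are illustrative assumptions; the plugin's private encode/decode methods may differ.

```php
<?php
declare(strict_types=1);

/** Encode harvest state as URL-safe base64 over JSON (hypothetical helper). */
function encodeResumptionToken(array $state): string
{
    $json = json_encode($state, JSON_THROW_ON_ERROR);
    // base64url: swap '+/' for '-_' and drop '=' padding.
    return rtrim(strtr(base64_encode($json), '+/', '-_'), '=');
}

/** Decode a token; returns null for anything malformed. */
function decodeResumptionToken(string $token): ?array
{
    $b64 = strtr($token, '-_', '+/');
    $b64 .= str_repeat('=', (4 - strlen($b64) % 4) % 4); // restore padding

    $raw = base64_decode($b64, true);
    if ($raw === false) {
        return null;
    }

    try {
        $state = json_decode($raw, true, 8, JSON_THROW_ON_ERROR);
    } catch (JsonException $e) {
        return null;
    }

    return is_array($state) ? $state : null;
}
```

Unlike a colon-joined string, this format survives values that themselves contain `:`, such as ISO 8601 timestamps in `from`/`until`. A `null` result from the decoder would map naturally to the OAI-PMH `badResumptionToken` error.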

@fabiodalez-dev
Owner Author

@coderabbitai full review

coderabbitai Bot commented May 2, 2026

✅ Actions performed

Full review triggered.


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/Views/frontend/layout.php`:
- Around line 210-212: Do not echo $headExtra directly: remove the XSS sink and
instead accept/convert structured data (e.g. an array of link/meta entries) and
render each element explicitly, escaping its attributes; replace the echo of
$headExtra with a renderer that iterates over something like
$headExtra['links'] / $headExtra['metas'] and builds <link> and <meta> tags with
every attribute passed through htmlspecialchars(..., ENT_QUOTES, 'UTF-8') (e.g.
for href/action/src use htmlspecialchars(url(...), ENT_QUOTES, 'UTF-8')), or
remove the free-form injection if dynamic data is not needed; refer to
$headExtra in the existing code when migrating.

In `@app/Views/layout.php`:
- Around line 64-66: Stop printing the raw $headExtra string: replace the direct
echo in layout.php with safe rendering that accepts structured data (e.g. an
array $headExtraItems or a renderHeadExtra function) rather than arbitrary HTML;
for each item, check that the type is allowed (e.g. 'link') and that it contains
only permitted attributes ('rel','href','type','title'), sanitize each attribute
with htmlspecialchars($value, ENT_QUOTES, 'UTF-8'), and validate URLs with
filter_var(..., FILTER_VALIDATE_URL) before inserting them; if compatibility
with legacy strings must be kept, do not print them raw but run htmlspecialchars
over the whole content, or deprecate them.

In `@storage/plugins/archives/ArchivesPlugin.php`:
- Around line 4061-4063: The response is using the wrong MIME for OAI-DC: change
the Content-Type header set via response->withHeader('Content-Type',
'application/rdf+xml; charset=utf-8') to use application/xml (or another
OAI-PMH-consistent type) while keeping charset; also update the discovery links
(the link elements referenced around the discovery code paths that currently
declare type="application/rdf+xml") to use the same corrected type attribute so
the payload (<oai_dc:dc>) is served and advertised consistently; ensure the
Content-Disposition filename logic (the slug) remains unchanged.
- Around line 4110-4113: The in_array check inside the loop over $authorities
that maps roles to dc:creator must be tightened: remove 'photographer' and
'associated' so only 'creator' maps to dc:creator (update the condition in the
block that writes dc:creator where $auth['role'] is checked), and drop any
reference to the dead 'photographer' role (it’s not in AUTHORITY_ROLES); ensure
the writer call $xw->writeElementNs('dc','creator',...) only runs for role
'creator' to conform with Dublin Core crosswalk rules.
- Around line 4210-4215: The exported file is labeled as `.ead3.xml` but the
code emits a Pinakes-specific `<eadlist>` wrapper
(`$xw->startElement('eadlist')` / `$xw->writeAttribute('xmlns:xsi', ...)`) which
makes the XML non-conformant with EAD3; update the export behaviour to avoid
misleading consumers by either (A) removing the custom wrapper and emitting a
single valid `<ead>` per export (or producing multiple separate EAD3 files/ZIP)
or (B) keep the wrapper but change the filename/Content-Type to a generic `.xml`
(or append a `.nonstandard`/`-eadlist.xml` marker) and document the non-standard
wrapper in headers; locate the EAD bulk-export logic in ArchivesPlugin.php where
`$xw->startElement('eadlist')` is called and implement one of these two fixes
consistently for filename generation and HTTP Content-Type/header output.
- Around line 4429-4435: The logic that maps ISAAR authorities to EAD3 needs
updating: instead of treating everything that is not "corporate" as "persname",
detect the value "family" in $authType and use the "famname" tag for those
cases; update the encodinganalog attribute accordingly (e.g. use '600' for
family), leaving the other attributes unchanged ($role -> relator, text from
authorised_form). Change the tag/encoding selection near the $authType and $tag
variables and the $xw->writeAttribute('encodinganalog') call so that
'corporate' => 'corpname'/'710', 'family' => 'famname'/'600', otherwise
'persname'/'700'.
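
The tag selection requested in the last finding can be sketched as a three-way mapping. This is an illustrative sketch only: `ead3NameTag()` is a hypothetical helper name, and it deliberately covers just the element name, leaving the `encodinganalog` value to whatever the maintainers settle on.

```php
<?php
declare(strict_types=1);

/**
 * Pick the EAD3 name element for an authority type (hypothetical helper).
 * 'corporate' and 'family' are the type values named in the finding;
 * every other type falls back to <persname>.
 */
function ead3NameTag(string $authType): string
{
    return match ($authType) {
        'corporate' => 'corpname',
        'family'    => 'famname',
        default     => 'persname',
    };
}
```

A `match` expression with an explicit `default` keeps the person fallback visible, where the earlier binary corporate/other check hid it.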
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: c2f0a636-540a-4c84-af32-1b6ba45cea4d

📥 Commits

Reviewing files that changed from the base of the PR and between 18c168c and 9d07ce2.

📒 Files selected for processing (8)
  • app/Views/frontend/layout.php
  • app/Views/layout.php
  • locale/de_DE.json
  • locale/en_US.json
  • locale/it_IT.json
  • storage/plugins/archives/ArchivesPlugin.php
  • storage/plugins/archives/views/index.php
  • tests/archives-plugin.unit.php

Comment thread app/Views/frontend/layout.php Outdated
Comment thread app/Views/layout.php Outdated
Comment thread storage/plugins/archives/ArchivesPlugin.php
Comment thread storage/plugins/archives/ArchivesPlugin.php
Comment thread storage/plugins/archives/ArchivesPlugin.php
Comment thread storage/plugins/archives/ArchivesPlugin.php
- headExtra raw-echo XSS sink: replace string-based $headExtra with
  structured $headLinks array in both layouts; each link's attributes
  are individually escaped with htmlspecialchars(..., ENT_QUOTES, 'UTF-8')
  — no arbitrary HTML injection possible
- MIME type: change application/rdf+xml → application/xml for Dublin
  Core XML downloads and discovery <link> elements; oai_dc:dc is not
  RDF/XML and rdf+xml causes parse failures in strict clients
- dc:creator crosswalk: remove 'photographer' (dead code, not in
  AUTHORITY_ROLES) and 'associated' (semantically wrong per DC spec)
  from the role filter; only role='creator' maps to dc:creator
- Tests: update headExtra checks → headLinks; add negative assertion
  that raw headExtra echo is absent
- famname: family-type authorities now emit <famname> in EAD3
  controlaccess instead of <persname>; encodinganalog stays 700 (added
  entry) for both person and family, 710 for corporate — unchanged
- EAD3 bulk export filename: rename archives_export.ead3.xml →
  archives_export.xml to avoid claiming pure EAD3 compliance when the
  root element is the Pinakes-specific <eadlist> container
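
The structured `$headLinks` rendering described in the first bullet of this commit message might look roughly like the sketch below. It is an assumption-laden illustration, not the layout code itself: `renderHeadLinks()` is a hypothetical function, and the attribute allow-list simply mirrors the four attributes named in the walkthrough (rel, type, title, href).

```php
<?php
declare(strict_types=1);

/**
 * Render <link> tags from structured data instead of echoing raw HTML.
 * Only allow-listed attributes are emitted, each escaped individually.
 */
function renderHeadLinks(array $headLinks): string
{
    $allowed = ['rel', 'type', 'title', 'href'];
    $out = '';

    foreach ($headLinks as $hl) {
        if (!is_array($hl)) {
            continue; // ignore anything that is not a structured entry
        }
        $attrs = '';
        foreach ($allowed as $name) {
            if (isset($hl[$name]) && is_string($hl[$name]) && $hl[$name] !== '') {
                $attrs .= ' ' . $name . '="'
                    . htmlspecialchars($hl[$name], ENT_QUOTES, 'UTF-8') . '"';
            }
        }
        if ($attrs !== '') {
            $out .= '<link' . $attrs . ">\n";
        }
    }

    return $out;
}
```

Because unknown keys such as `onload` are never read, a caller cannot smuggle extra attributes or markup into the head through this slot.
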
@fabiodalez-dev
Owner Author

@coderabbitai full review

coderabbitai Bot commented May 2, 2026

✅ Actions performed

Full review triggered.

fabiodalez-dev added a commit that referenced this pull request May 2, 2026
…AD3, fix broken links

- Fix broken links: create guida/arricchimento.md and en/guide/bulk-enrichment.md
- Add guida/dischi.md + en/guide/discs.md: media types, dynamic labels, Discogs/MusicBrainz/Deezer plugins
- Add guida/collane.md + en/guide/series.md: series hierarchy (cycles, seasons, spin-offs) from v0.5.9.6
- Update guida/archivi.md + en/guide/archives.md: unified search (PR #120) + interoperability section (OAI-PMH 2.0, Dublin Core XML, EAD3 bulk export — PR #127)
- Update _sidebar.md + en/_sidebar.md: add Dischi/Discs and Collane/Series entries
- Update changelog: add v0.5.9.6 entry (series hierarchy) and "In arrivo" section (PR #118/119/120/127)
- Update tecnico/requisiti.md + en/technical/requirements.md: note on zip extension installer check (PR #118)
@fabiodalez-dev
Owner Author

@coderabbitai full review

coderabbitai Bot commented May 3, 2026

✅ Actions performed

Full review triggered.

Missing closing `>` caused unclosed <link rel="alternate"> tags on
every public archival unit detail page. Admin layout already correct.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/Views/frontend/layout.php`:
- Around line 210-221: The output loop over $headLinks renders $hl['href']
without URL-normalization/validation; update the loop that builds $out (the
foreach over $headLinks and inner foreach using $hl) to pass the href through
url() (or at minimum filter_var(..., FILTER_VALIDATE_URL)) before escaping, i.e.
compute a validated/normalizedHref from $hl['href'] and then use
htmlspecialchars(normalizedHref, ENT_QUOTES, 'UTF-8') when appending the href
attribute; ensure you only emit the href attribute when validation succeeds to
match the fix applied in app/Views/layout.php (the same $headLinks/$hl
handling).

In `@app/Views/layout.php`:
- Around line 64-75: The href attribute in the head links loop is only escaped
with htmlspecialchars but must be validated and normalized as a URL before
output; update the foreach block that processes $headLinks/$hl so that for
'href' you first verify filter_var($hl['href'], FILTER_VALIDATE_URL) (and/or
ensure an allowed scheme like http/https) or normalize via your url(...) helper,
then pass the result through htmlspecialchars(url(...), ENT_QUOTES, 'UTF-8')
before appending to $out—leave other attributes escaped as-is; ensure invalid or
non-absolute href values are skipped or sanitized to an empty value to prevent
injection.

In `@storage/plugins/archives/ArchivesPlugin.php`:
- Around line 4562-4585: oaiListMetadataFormats currently ignores an optional
identifier — update the oaiListMetadataFormats(XMLWriter $xw) implementation to
accept and check the incoming identifier parameter, lookup/validate the
corresponding item (use the existing repository/item lookup methods used
elsewhere in this class), and when identifier is present: if item not found
write an <error code="idDoesNotExist">…</error>, if item exists but has no
supported metadata formats write an <error code="noMetadataFormats">…</error>,
otherwise output only the metadataFormat elements supported for that item; keep
the current repository-wide behavior when identifier is not supplied. Ensure you
reference oaiListMetadataFormats and reuse the class’s existing item
lookup/format-detection utilities so responses conform to OAI-PMH v2.0.
- Around line 4475-4504: The <request> element is being written (via $xw on
XMLWriter) with verb/argument attributes before OAI-PMH argument validation,
causing non-conformant responses for badVerb/badArgument; fix by deferring
serialization of the <request> element until after you validate $verb and other
OAI args (or alternatively write the <request> inside each verb handler after
its validation), and ensure oaiError() responses produce a <request> containing
only the base URL (no attributes) when emitting badVerb/badArgument errors;
update the code that currently starts/writes/ends the 'request' element (the
$xw->startElement('request') / writeAttribute / $xw->endElement() sequence) so
it executes after validation or is moved into handlers, and adjust
oaiError()/error paths to render the attribute-less <request>.
- Around line 4292-4295: The value passed to eventdatetime/@standarddatetime may
be in DB TIMESTAMP format ("YYYY-MM-DD HH:MM:SS"); change the code in
ArchivesPlugin.php where $createdAt is built (using $row['created_at'],
startElement('eventdatetime') and writeAttribute) so it always normalizes to
ISO 8601 UTC: if $row['created_at'] is in "Y-m-d H:i:s", create a
DateTime/DateTimeImmutable object with the UTC timezone and format it with
"Y-m-d\\TH:i:s\\Z"; if it is already ISO, leave it as is; then use that string
for both writeAttribute('standarddatetime', $createdAt) and text($createdAt).

In `@tests/archives-plugin.unit.php`:
- Around line 74-75: The assertions are fragile: for the first
(decodeOaiResumptionToken), check not only presence but also order relative to
the metadataPrefix validation: use strpos on the source content to get the
positions of "decodeOaiResumptionToken($token)" and of the metadataPrefix
validation call, and assert that the position of decodeOaiResumptionToken is
smaller; for the second (if (!$identifiersOnly) { ...
$xw->startElement('record')), remove the dependency on indentation and the
literal newline and use a permissive regex or two strpos calls with an order
check (e.g. make sure "if (!$identifiersOnly)" appears and that, from that
position onward, "$xw->startElement('record')" is present, allowing variable
whitespace), so that the check verifies the logical structure rather than the
indentation.
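
The eventdatetime normalization asked for in the ArchivesPlugin.php finding above can be sketched as follows. This is a minimal sketch under one loud assumption: the DB TIMESTAMP values are stored in UTC. `toIso8601Utc()` is a hypothetical helper name, not a method from the plugin.

```php
<?php
declare(strict_types=1);

/**
 * Normalize a DB TIMESTAMP ("Y-m-d H:i:s", ASSUMED to be UTC) or an
 * ISO 8601 string to "YYYY-MM-DDTHH:MM:SSZ". Returns null if unparsable.
 */
function toIso8601Utc(string $value): ?string
{
    $utc = new DateTimeZone('UTC');

    // Try the DB layout first; '!' resets unspecified fields.
    $dt = DateTimeImmutable::createFromFormat('!Y-m-d H:i:s', $value, $utc);

    if ($dt === false) {
        try {
            // Fall back to general parsing (handles ISO 8601 with offsets).
            $dt = new DateTimeImmutable($value, $utc);
        } catch (Exception $e) {
            return null;
        }
    }

    return $dt->setTimezone($utc)->format('Y-m-d\TH:i:s\Z');
}
```

The same string would then be used for both the `standarddatetime` attribute and the element text, so the two cannot drift apart.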
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: e76ef7d0-7551-4941-829a-5a1446c5753d

📥 Commits

Reviewing files that changed from the base of the PR and between 9d07ce2 and 6d5ae9c.

📒 Files selected for processing (4)
  • app/Views/frontend/layout.php
  • app/Views/layout.php
  • storage/plugins/archives/ArchivesPlugin.php
  • tests/archives-plugin.unit.php

Comment thread app/Views/frontend/layout.php
Comment thread app/Views/layout.php
Comment thread storage/plugins/archives/ArchivesPlugin.php Outdated
Comment thread storage/plugins/archives/ArchivesPlugin.php
Comment thread storage/plugins/archives/ArchivesPlugin.php Outdated
Comment thread tests/archives-plugin.unit.php Outdated
- layout.php + frontend/layout.php: validate href with filter_var +
  scheme check before htmlspecialchars (blocks javascript:/data: URIs)
- ArchivesPlugin: normalize EAD3 eventdatetime/@standarddatetime to
  ISO 8601 UTC via DateTimeImmutable (DB TIMESTAMP has space not T)
- ArchivesPlugin: defer <request> attributes until after verb validation
  so badVerb responses emit attribute-less <request> per OAI-PMH §2.2
- ArchivesPlugin: oaiListMetadataFormats now accepts identifier param
  and returns idDoesNotExist when the record is not found
- tests: replace fragile whitespace-sensitive str_contains with regex
  for the identifiersOnly record-wrapper assertion
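
The href check in the first bullet of this commit message (filter_var plus a scheme check before escaping) could be sketched like this. `safeHeadHref()` is a hypothetical helper and the http/https allow-list is an assumption; the actual layouts may accept a different set of schemes or handle relative paths differently.

```php
<?php
declare(strict_types=1);

/**
 * Return an HTML-escaped href when the value is an absolute http(s) URL,
 * or null when it should not be emitted at all.
 */
function safeHeadHref(string $href): ?string
{
    if (filter_var($href, FILTER_VALIDATE_URL) === false) {
        return null; // not a syntactically valid absolute URL
    }

    $scheme = strtolower((string) parse_url($href, PHP_URL_SCHEME));
    if (!in_array($scheme, ['http', 'https'], true)) {
        return null; // blocks javascript:, data:, and other schemes
    }

    return htmlspecialchars($href, ENT_QUOTES, 'UTF-8');
}
```

Note that this sketch also rejects relative paths such as `/archives/1/dc.xml`, so a caller would need to pass absolute URLs (for example the output of a URL helper).
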
@fabiodalez-dev
Owner Author

@coderabbitai full review

coderabbitai Bot commented May 3, 2026

✅ Actions performed

Full review triggered.

Replace existence-only str_contains with strpos order check: verify that
decodeOaiResumptionToken($token) appears before 'cannotDisseminateFormat'
in the plugin source, ensuring the structural invariant (decode before
metadataPrefix validation) is enforced and not just presence-checked.
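
As an illustration of the order check this commit describes, a source-level assertion can compare `strpos` offsets instead of testing presence alone. The helper name `appearsBefore()` and the sample source string are hypothetical; the real test reads the plugin file.

```php
<?php
declare(strict_types=1);

/** True when both needles occur and $first starts before $second. */
function appearsBefore(string $source, string $first, string $second): bool
{
    $a = strpos($source, $first);
    $b = strpos($source, $second);

    return $a !== false && $b !== false && $a < $b;
}

// Stand-in for the plugin source (hypothetical excerpt).
$source = <<<'PHP'
$state = $this->decodeOaiResumptionToken($token);
if (!in_array($metadataPrefix, $supported, true)) {
    return $this->oaiError($xw, 'cannotDisseminateFormat');
}
PHP;

$ok = appearsBefore($source, 'decodeOaiResumptionToken($token)', 'cannotDisseminateFormat');
```

The check only looks at first occurrences, so it guards the ordering invariant without depending on indentation or line breaks.
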
@fabiodalez-dev
Owner Author

@coderabbitai full review

coderabbitai Bot commented May 3, 2026

✅ Actions performed

Full review triggered.
