issue-cas-7733: DwC export pipeline Phases 4-6 (DwCA, EML, GBIF, RSS) #8038

Open
foozleface wants to merge 7 commits into issue-cas-7710 from issue-cas-7733
@foozleface
Collaborator

Summary

PR 7 of 7 in CAS DwC export stack. Based on #8037. This is the top of the stack.

Implements the DwC export pipeline (Phases 4–6 of the foundation work):

  • DwC Archive (DwCA) generation from cache tables (dwca_from_cache.py)
  • Attachment URL construction for media references
  • EML editor frontend with metadata serialization
  • GBIF validator integration
  • RSS feed publishing for export packages
  • Export Package frontend (PackageForm, ClonePackage, CopyRssUrl)
  • update_feed_v2 management command for the scheduled pipeline
  • occurrenceID uniqueness validation
  • cleanup_orphan_caches for cache table maintenance

Covers Specify issues: #7721, #7728, #7730, #7732, #7733, #7734, #7735, #7736, #7740, #7741, #7742, #7743, #7744.
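
For reviewers unfamiliar with the archive layout, the DwCA generation item above can be pictured with a minimal sketch. This is not the code in dwca_from_cache.py: the function name `build_dwca_sketch`, the `occurrence.csv` file name, and the assumption that every column is a term in the `http://rs.tdwg.org/dwc/terms/` namespace are all illustrative. It only shows the general shape of an archive: a core data file plus a `meta.xml` descriptor.

```python
import csv
import io
import xml.etree.ElementTree as ET
import zipfile

DWC_NS = "http://rs.tdwg.org/dwc/terms/"  # assumption: all columns are dwc: terms


def build_dwca_sketch(columns, rows):
    """Return the bytes of a minimal archive: occurrence.csv + meta.xml."""
    # Core data file: one header line, then one line per record.
    data = io.StringIO()
    writer = csv.writer(data, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)

    # meta.xml describes the file layout and maps each column to a term.
    archive = ET.Element("archive", xmlns="http://rs.tdwg.org/dwc/text/")
    core = ET.SubElement(archive, "core", {
        "encoding": "UTF-8",
        "fieldsTerminatedBy": ",",
        "linesTerminatedBy": "\\n",  # literal backslash-n, as meta.xml expects
        "fieldsEnclosedBy": '"',
        "ignoreHeaderLines": "1",
        "rowType": DWC_NS + "Occurrence",
    })
    files = ET.SubElement(core, "files")
    ET.SubElement(files, "location").text = "occurrence.csv"
    # The record identifier column; here it is always occurrenceID.
    ET.SubElement(core, "id", index=str(columns.index("occurrenceID")))
    for i, name in enumerate(columns):
        ET.SubElement(core, "field", index=str(i), term=DWC_NS + name)

    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("occurrence.csv", data.getvalue())
        zf.writestr("meta.xml", ET.tostring(archive, encoding="unicode"))
    return buf.getvalue()
```

A real archive would normally also carry an `eml.xml` metadata file and possibly extension files; both are left out here to keep the sketch short.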

Stack

  1. Predecessors:
       • issue-cas-7746: extensions join table + vocabulary on Schemamapping #8032
       • issue-cas-7737: CacheTableMeta model + cache table infrastructure #8033
       • issue-cas-7714: DwC schema terms vocabulary + permissions #8034
       • issue-cas-7709: Schema Mapper UI shell + Schema Config DwC section #8035
       • issue-cas-7712: Clone endpoint + list APIs + Export Packages shell #8036
       • issue-cas-7710: Mapping UI features (NewMappingDialog, autoMap, toolbar) #8037
  2. issue-cas-7733 — DwC export pipeline Phases 4-6 (this PR)

Test plan

  • python manage.py test specifyweb.backend.export — 33/33 passing (test_dwca, test_feed, test_attachment_urls)
  • DwCA archive generates correctly from a populated cache table
  • EML metadata round-trips through the editor
  • RSS feed updates on schedule
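
When checking the RSS item by hand, it may help to know roughly what a feed entry for an export package looks like. The sketch below builds an RSS 2.0 document with the standard library only. The function name `build_feed_sketch`, the package keys (`title`, `url`, `updated`), and the example URLs are assumptions for illustration and are not taken from the feed code in this PR.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_feed_sketch(feed_title, feed_link, packages):
    """Return an RSS 2.0 document (as text) with one <item> per package."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = "Darwin Core Archive exports"
    for pkg in packages:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = pkg["title"]
        ET.SubElement(item, "link").text = pkg["url"]
        ET.SubElement(item, "guid").text = pkg["url"]
        # RSS dates use the RFC 822 style; normalize to UTC first.
        updated = pkg["updated"].astimezone(timezone.utc)
        ET.SubElement(item, "pubDate").text = format_datetime(updated, usegmt=True)
    return ET.tostring(rss, encoding="unicode")
```

A harvester polling the feed would typically compare `pubDate` (or `guid`) against what it last saw to decide whether to download the archive again.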

@coderabbitai

coderabbitai Bot commented Apr 27, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

Adds the export-app cache infrastructure:
- CacheTableMeta model + migration tracking build state per (mapping, collection)
- export.models shim: re-exports Caroline's Schemamapping/Exportdataset/
  Exportdatasetextension under PascalCase aliases for use throughout the package
- cache.py: get_cache_table_name, create_cache_table, drop_cache_table,
  _build_single_cache, _execute_and_populate, _infer_column_type, build_cache_tables
- dwca_utils.py: shared sanitize/build helpers used by cache and archive code
- Tests for SchemaMapping, ExportDataSet, ExportDataSetExtension, CacheTableMeta,
  and cache table operations

Fixes #7737. Closes overlap with the cache mechanism part of c381907
on dwc/foundation; remaining cache features (orphan cleanup, signal handlers,
build API, progress callbacks) ship in later atomic PRs.
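
To make the cache helpers listed above more concrete, here is a hedged sketch of two of the ideas: a deterministic table name per (mapping, collection) pair and a column type guessed from sample values. The names `sketch_cache_table_name` and `sketch_infer_column_type`, the `dwc_cache` prefix, and the generic SQL type names are assumptions. The signatures and behavior of the real get_cache_table_name and _infer_column_type are not shown in this PR text.

```python
import re
from datetime import date, datetime
from decimal import Decimal
from typing import Iterable


def sketch_cache_table_name(mapping_id: int, collection_id: int,
                            prefix: str = "dwc_cache") -> str:
    """Build a deterministic table name for one (mapping, collection) pair."""
    name = f"{prefix}_{int(mapping_id)}_{int(collection_id)}"
    # Accept identifier characters only, so the name is safe to place in DDL.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]{0,63}", name):
        raise ValueError(f"unsafe cache table name: {name!r}")
    return name


def sketch_infer_column_type(values: Iterable[object]) -> str:
    """Pick a generic SQL column type from sample values; None is ignored."""
    kinds = {type(v) for v in values if v is not None}
    if not kinds:
        return "TEXT"  # nothing to go on, fall back to the widest type
    if kinds <= {bool}:
        return "BOOLEAN"
    if kinds <= {int, bool}:
        return "BIGINT"
    if kinds <= {int, bool, float, Decimal}:
        return "DOUBLE"
    if kinds <= {date, datetime}:
        return "DATETIME" if datetime in kinds else "DATE"
    return "TEXT"  # mixed or string data
```

Validating the name before use matters because table names cannot be passed as bound query parameters and end up interpolated into the statement text.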

…ssions

Adds the DwC vocabulary infrastructure:
- schema_terms.json — full DwC term list with descriptions and types
  (Occurrence core, Identification, MeasurementOrFact, Multimedia, etc.)
- get_schema_terms API endpoint serving the JSON
- SchemaMappingPT and ExportPackagePT PermissionTarget classes restricting
  Schema Mapper and Export Packages tools to institution admins
- Frontend vocabulary helper module that loads the terms JSON

Fixes #7714 (schema terms JSON files), #7727 (admin-only access).

Adds the frontend shell for the Schema Mapper tool:
- SchemaMapper/index.tsx — main UI shell with mapping list and editor wiring
- MappingList.tsx — list view of mappings
- TermDropdown.tsx — searchable DwC term picker
- VocabularyDialog.tsx — vocabulary key selection dialog
- types.ts — shared types
- userToolDefinitions: register Schema Mapper as a User Tool
- OverlayRoutes: add the Schema Mapper route
- SchemaConfig/Field.tsx: add 'Darwin Core' section to schema config
- localization/header.ts: localized strings for the new tool

Fixes #7709 (Schema Mapper tool), #7713 (vocabulary dialog),
#7715 (Term mapping column), #7729 (Schema Config DwC section).

Adds the first set of mapping/dataset CRUD endpoints and the Export Packages
tool entry point:
- list_mappings — returns all schema mappings as JSON
- list_export_datasets — returns all export datasets as JSON
- clone_mapping — deep-copies a SchemaMapping (new SpQuery, all SpQueryFields,
  new SchemaMapping pointing to the new query)
- urls.py: register the three new endpoints
- OverlayRoutes: register Export Packages route
- ExportPackages/index.tsx: list view shell for Export Packages tool

Field references adapted to Caroline's schemamapping schema (mapping_type,
is_default), and clone_mapping sets specifyuser/createdbyagent (required by
schema). Restored test_clone_mapping which depends on this layer's endpoint.

Fixes #7712 (cloning of schema mapping queries), #7723 (Export Packages tool).
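
The deep-copy behavior described for clone_mapping (new query, copies of every query field, new mapping pointing at the new query) can be sketched without Django. The dataclasses below are stand-ins for SpQuery, SpQueryField, and SchemaMapping, and the " (copy)" suffix and the reset of `is_default` are assumptions, not confirmed behavior of the endpoint.

```python
from dataclasses import dataclass, field, replace
from itertools import count
from typing import List

_ids = count(1000)  # stand-in for database-assigned primary keys


@dataclass
class QueryField:
    id: int
    field_path: str
    term: str


@dataclass
class Query:
    id: int
    name: str
    fields: List[QueryField] = field(default_factory=list)


@dataclass
class Mapping:
    id: int
    name: str
    query: Query
    is_default: bool = False


def clone_mapping_sketch(source: Mapping) -> Mapping:
    """Copy the mapping, its query, and every query field under new ids."""
    new_fields = [replace(f, id=next(_ids)) for f in source.query.fields]
    new_query = Query(id=next(_ids), name=f"{source.query.name} (copy)",
                      fields=new_fields)
    # The clone points at the new query, never at the source's query.
    return Mapping(id=next(_ids), name=f"{source.name} (copy)",
                   query=new_query, is_default=False)
```

The point of copying the query as well is that later edits to the clone's fields leave the original mapping untouched.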

Adds the mapping editor's interactive features:
- NewMappingDialog — dialog to create a new mapping (blank or from query)
- CloneMapping — clone-mapping wrapper component
- TermTooltip — info tooltip showing DwC term description
- Toolbar — mapping editor toolbar (auto-map, save, etc.)
- autoMap.ts — automatic field-to-term mapping logic
- MappingList enhancements (delete, clone, edit actions)
- TermDropdown enhancements (custom IRI input, static value)
- Localization strings for all new UI

Fixes #7710 (New Mapping dialog), #7716 (term info icon),
#7717 (toolbar), #7718 (auto-mapping), #7719 (custom IRI),
#7720 (static text values), #7722 (locked occurrenceID row),
#7731 (duplicate term validation).
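
As a rough illustration of the auto-mapping and duplicate-term validation items above, the following Python sketch matches field names to DwC terms by normalized name and then reports terms used more than once. The actual logic lives in autoMap.ts and may use different rules. The helper names and the "ignore case and punctuation" matching rule are assumptions.

```python
from collections import Counter
from typing import Dict, Iterable, List


def _normalize(name: str) -> str:
    """Lower-case and drop everything except letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def auto_map_sketch(field_names: Iterable[str],
                    dwc_terms: Iterable[str]) -> Dict[str, str]:
    """Map each field to the DwC term whose normalized name matches exactly."""
    by_key = {_normalize(term): term for term in dwc_terms}
    result = {}
    for name in field_names:
        term = by_key.get(_normalize(name))
        if term is not None:
            result[name] = term  # unmatched fields are left for the user
    return result


def duplicate_terms_sketch(mapping: Dict[str, str]) -> List[str]:
    """Return terms assigned to more than one field, sorted by name."""
    counts = Counter(mapping.values())
    return sorted(term for term, n in counts.items() if n > 1)
```

For example, `catalog_number` and `Scientific Name` would map to `catalogNumber` and `scientificName`, while a field with no matching term is simply skipped.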

The apostrophe in 'Обов'язкове' terminated the JS string literal early,
breaking the webpack production build. Switched the outer quotes to double
quotes, the same workaround already used on the fr-fr line above.
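
The quoting problem is easy to reproduce outside the build. The localization file itself is TypeScript and is not reproduced here. The snippet below is only a Python analogue, relying on the fact that Python string literals follow the same rule for an unescaped apostrophe. The helper name `literal_compiles` is made up for this illustration.

```python
def literal_compiles(source: str) -> bool:
    """Return True if `source` parses as a single expression."""
    try:
        compile(source, "<localization>", "eval")
        return True
    except SyntaxError:
        return False


# Single-quoted: the apostrophe inside the word closes the literal early.
single_quoted = "'Обов'язкове'"

# Double-quoted: the apostrophe is ordinary text inside the literal.
double_quoted = '''"Обов'язкове"'''

broken = literal_compiles(single_quoted)
fixed = literal_compiles(double_quoted)
```

Escaping the apostrophe with a backslash would also work; switching the outer quotes keeps the string readable.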

…, scheduling

- Add DwCA archive generation from cache tables (dwca_from_cache.py)
- Add attachment URL construction for media references
- Add EML editor frontend with metadata serialization
- Add GBIF validator integration
- Add RSS feed publishing for export packages
- Add Export Package frontend (PackageForm, ClonePackage, CopyRssUrl)
- Add update_feed_v2 management command for scheduled pipeline
- Add occurrenceID uniqueness validation
- Add cleanup_orphan_caches for cache table maintenance
- Add post_delete receiver to drop cache tables on SchemaMapping deletion
- Tests: test_dwca, test_feed, test_attachment_urls (33 tests passing)

Covers Specify issues #7721, #7728, #7730, #7732, #7733, #7734, #7735,
#7736, #7740, #7741, #7742, #7743, #7744. Source: dwc/foundation
commit c4d5178.
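
The occurrenceID uniqueness validation listed in this commit can be pictured as a single pass over the exported rows. The sketch below is an assumption about the general approach, not the PR's code: the function name `check_occurrence_ids` and the row shape (dicts keyed by DwC term name) are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Mapping, Optional, Tuple


def check_occurrence_ids(
    rows: Iterable[Mapping[str, Optional[str]]],
) -> Tuple[List[str], int]:
    """Return (sorted duplicate IDs, number of rows with a blank or missing ID)."""
    counts: Counter = Counter()
    missing = 0
    for row in rows:
        oid = (row.get("occurrenceID") or "").strip()
        if oid:
            counts[oid] += 1
        else:
            missing += 1  # blank IDs are reported separately, not as duplicates
    duplicates = sorted(oid for oid, n in counts.items() if n > 1)
    return duplicates, missing
```

Running a check like this before packaging lets an export fail early with a list of offending IDs instead of producing an archive that a downstream validator rejects.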