Add DOI metadata renderer with DataCite and CrossRef support, citation styles, and enhanced metadata extraction#355

Merged
maximiliani merged 9 commits into main from copilot/render-doi-metadata
Feb 4, 2026

Contributor

Copilot AI commented Feb 4, 2026

Implements comprehensive DOI (Digital Object Identifier) rendering to display rich metadata for academic resources. DOIs are Handle PIDs whose prefix starts with "10."; their metadata is resolved via the DataCite or CrossRef APIs.

Implementation

Modular renderer architecture (rendererModules/DOI/):

  • DOI.ts - Detection via /^10\.\d{4,9}\/[-._;()/:A-Za-z0-9]+$/, handles doi: and doi.org URL prefixes
  • DataCiteInfo.ts - Dedicated DataCite metadata parser with schema-specific logic
  • CrossRefInfo.ts - Dedicated CrossRef metadata parser with JATS syntax support
  • DOIInfo.ts - Lightweight wrapper combining both sources with fallback
  • DOIType.tsx - Renders preview with logos and citation styles, generates metadata table
  • CitationStyles.ts - Citation formatting utilities (APA, Chicago, IEEE, Harvard, Anglia Ruskin)
  • ResourceTypeIcons.tsx - Resource type icons and logo components
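
The detection step described for DOI.ts can be sketched as follows. This is a minimal illustration, not the code from the PR: the regular expression is the one quoted above, while the helper name normalizeDoi and the exact prefix-stripping rules are assumptions.

```typescript
// Pattern quoted in the PR description for a bare DOI.
const DOI_PATTERN = /^10\.\d{4,9}\/[-._;()/:A-Za-z0-9]+$/;

// Hypothetical helper: strips a "doi:" or doi.org URL prefix, then validates
// the remainder against the pattern. Returns null when the input is not a DOI.
function normalizeDoi(input: string): string | null {
  const stripped = input
    .trim()
    .replace(/^doi:/i, "")
    .replace(/^https?:\/\/(dx\.)?doi\.org\//i, "");
  return DOI_PATTERN.test(stripped) ? stripped : null;
}
```

Under these assumptions, "doi:10.17487/rfc3650" and "https://doi.org/10.1038/nature12373" both reduce to their bare DOI, while a non-DOI Handle such as "21.11152/abc" is rejected.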

Renderer priority: Set to 2 (before HandleType at 3) to catch DOIs before generic Handle processing.
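
Why the priority matters can be shown with a small sketch. The registry shape, the matcher functions, and pickRenderer are all hypothetical; the actual registration in utils.ts may look different. The point is only that a DOI also looks like a Handle, so the more specific renderer has to be tried first.

```typescript
// Hypothetical registry entry; only the priorities (2 and 3) come from the PR.
interface RendererEntry {
  key: string;
  priority: number;
  matches: (value: string) => boolean;
}

const entries: RendererEntry[] = [
  // Generic Handle check (illustrative pattern, not the library's).
  { key: "HandleType", priority: 3, matches: (v) => /^\d+(\.\d+)*\/.+$/.test(v) },
  // DOI check: a Handle whose prefix starts with "10.".
  { key: "DOIType", priority: 2, matches: (v) => /^10\.\d{4,9}\/.+$/.test(v) },
];

// Tries renderers in ascending priority order and returns the first match.
function pickRenderer(registry: RendererEntry[], value: string): string | undefined {
  return [...registry]
    .sort((a, b) => a.priority - b.priority)
    .find((entry) => entry.matches(value))?.key;
}
```

With this ordering, "10.5281/zenodo.1234567" is claimed by DOIType even though the Handle matcher would accept it as well.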

Enhanced metadata extraction:

  • ORCiD identifiers from creator nameIdentifiers (DataCite) and ORCID field (CrossRef)
  • ROR identifiers from affiliationIdentifier in affiliations
  • ISO8601 formatted dates (YYYY-MM-DD)
  • Corresponding author identification (ContactPerson in DataCite, first author in CrossRef)
  • JATS syntax parsing in CrossRef abstracts
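
Two of the extraction steps above can be sketched in a few lines. The interfaces are trimmed-down assumptions modelled on the DataCite creator object (nameIdentifiers with a nameIdentifierScheme) and on the CrossRef "date-parts" array; the function names are hypothetical and the PR's parsers may handle more cases.

```typescript
// Reduced view of a DataCite creator (assumed subset of the real schema).
interface NameIdentifier {
  nameIdentifier: string;
  nameIdentifierScheme?: string;
}
interface DataCiteCreator {
  name: string;
  nameIdentifiers?: NameIdentifier[];
}

// Returns the bare ORCiD of a creator, if one is listed.
function orcidOf(creator: DataCiteCreator): string | undefined {
  const hit = creator.nameIdentifiers?.find(
    (id) => id.nameIdentifierScheme?.toUpperCase() === "ORCID",
  );
  return hit?.nameIdentifier.replace(/^https?:\/\/orcid\.org\//i, "");
}

// Formats CrossRef-style date parts ([year, month?, day?]) as an ISO 8601 string.
function isoDate(parts: number[]): string {
  return parts
    .map((part, index) => String(part).padStart(index === 0 ? 4 : 2, "0"))
    .join("-");
}
```

For example, isoDate([2026, 2, 4]) yields "2026-02-04", and a year-only record such as [2013] stays "2013" rather than inventing a month and day.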

Metadata fields displayed:

  • Title, individual creators (each with separate item), corresponding author
  • Publisher, publication date (ISO8601), resource type with icons
  • Description/abstract (JATS-parsed for CrossRef)
  • Individual subject items (not joined)
  • Actions: Open resource, resolve via doi.org, view raw metadata

Citation styles:

  • 5 configurable formats: APA (default), Chicago, IEEE, Harvard, Anglia Ruskin
  • Smart title truncation at word boundaries
  • Configurable via settings: {"type":"DOIType","values":[{"name":"citationStyle","value":"APA"}]}
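
Truncation at a word boundary, as mentioned in the list above, might look like the following. This is a sketch under assumptions: the name truncateAtWord and the "..." suffix are illustrative, and the PR's own truncateTitle may behave differently.

```typescript
// Cuts a title to at most maxLength characters without splitting a word,
// then appends "..." to signal the omission.
function truncateAtWord(title: string, maxLength: number): string {
  if (title.length <= maxLength) return title;
  const slice = title.slice(0, maxLength);
  const lastSpace = slice.lastIndexOf(" ");
  // Fall back to a hard cut when the slice contains no space at all.
  const cut = lastSpace > 0 ? slice.slice(0, lastSpace) : slice;
  return `${cut.trimEnd()}...`;
}
```

Titles that already fit are returned unchanged, so short titles never gain a suffix.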

Storybook examples: 8 stories covering DataCite (journal papers, software, RFC, slides, preprints) and CrossRef (journal papers, books) examples.

Usage

<pid-component value="10.5281/zenodo.1234567"></pid-component>
<pid-component value="https://doi.org/10.1038/nature12373"></pid-component>
<pid-component 
  value="10.5445/IR/1000185135" 
  settings='[{"type":"DOIType","values":[{"name":"citationStyle","value":"Chicago"}]}]'
></pid-component>

Technical notes

  • Uses existing cachedFetch for API calls with proper error handling
  • Follows GenericIdentifierType pattern from ORCIDType/HandleType
  • Full TypeScript typing for DataCite/CrossRef response schemas
  • Modular architecture with separate parsers for each metadata source
  • Each parser has its own generateItems() method for schema-specific rendering
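
The fallback between the two metadata sources can be sketched as below. Everything here is an assumption for illustration: the fetcher is injected instead of using the library's cachedFetch (whose signature is not shown in this PR), the endpoint URLs are the public DataCite and CrossRef REST paths as commonly documented, and the DataCite-first order is a guess.

```typescript
// Injected JSON fetcher; expected to reject on network errors or non-2xx responses.
type JsonFetcher = (url: string) => Promise<unknown>;

interface ResolvedMetadata {
  source: "DataCite" | "CrossRef";
  data: unknown;
}

// Tries each source in order and returns the first successful response,
// or null when neither source knows the DOI.
async function fetchDoiMetadata(
  doi: string,
  fetchJson: JsonFetcher,
): Promise<ResolvedMetadata | null> {
  const candidates = [
    { source: "DataCite", url: `https://api.datacite.org/dois/${encodeURI(doi)}` },
    { source: "CrossRef", url: `https://api.crossref.org/works/${encodeURI(doi)}` },
  ] as const;

  for (const { source, url } of candidates) {
    try {
      return { source, data: await fetchJson(url) };
    } catch {
      // This source failed; fall through to the next candidate.
    }
  }
  return null;
}
```

Injecting the fetcher keeps the sketch testable without network access; in the component, a wrapper around cachedFetch would presumably take its place.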
Original prompt

This section details the original issue you should resolve

<issue_title>Render DOI metadata</issue_title>
<issue_description># Problem
Many academic resources provide DOIs for persistent identification and resolution. DOIs are technically based on Handle records, which are already processed by the PID-Component. DOIs are easily distinguished because their PID prefix starts with "10.". They are usually resolved via https://doi.org/<put DOI here>. When a DOI is resolved, metadata following the DataCite or CrossRef metadata schema is usually returned. This information should be made accessible in the PID component.

Tasks

  • Create new rendererModule and necessary components within the existing structure
  • Detect DOIs
  • Resolve DOIs
  • Display DataCite metadata in the data table
  • Represent CrossRef metadata in the data table
  • Add an action that refers to the DataCite/CrossRef metadata file
  • Add an action that refers to the data resource or its landing page (the article, some library, Zenodo, etc.)
  • optionally: render preview using the "renderBody()" method

Further information

Comments on the Issue (you are @copilot in this section)


Copilot AI and others added 3 commits February 4, 2026 13:30
Co-authored-by: maximiliani <43142805+maximiliani@users.noreply.github.com>
…aces

Co-authored-by: maximiliani <43142805+maximiliani@users.noreply.github.com>
Co-authored-by: maximiliani <43142805+maximiliani@users.noreply.github.com>
Copilot AI changed the title [WIP] Add DOI metadata rendering functionality Add DOI metadata renderer with DataCite and CrossRef support Feb 4, 2026
Copilot AI requested a review from maximiliani February 4, 2026 13:40
Member

@maximiliani maximiliani left a comment

This is a good start. Please add some Storybook stories that demonstrate multiple examples. See comments for more stuff to change... Please separate the item generation and parsing logic for DataCite and Crossref.

Some example DOIs:

  • Journal paper, DataCite: 10.5445/IR/1000185135
  • Journal paper, CrossRef: 10.1109/eScience65000.2025.00022
  • Software on Zenodo, DataCite: https://doi.org/10.5281/zenodo.13629109
  • RFC, DataCite: doi:10.17487/rfc3650
  • Book, CrossRef: 10.1007/978-1-4419-8598-9
  • Slides, DataCite: 10.5445/IR/1000178054
  • Preprint on arXiv, DataCite: 10.48550/ARXIV.2505.16550

Comment thread packages/stencil-library/src/rendererModules/DOI/DOIInfo.ts
Comment thread packages/stencil-library/src/rendererModules/DOI/DOIInfo.ts Outdated
Comment thread packages/stencil-library/src/rendererModules/DOI/DOIInfo.ts Outdated
Comment thread packages/stencil-library/src/rendererModules/DOI/DOIType.tsx
Comment thread packages/stencil-library/src/rendererModules/DOI/DOIType.tsx Outdated
Comment thread packages/stencil-library/src/rendererModules/DOI/DOIType.tsx Outdated
Comment thread packages/stencil-library/src/rendererModules/DOI/DOIType.tsx Outdated
Comment thread packages/stencil-library/src/rendererModules/DOI/DOIType.tsx Outdated
Comment thread packages/stencil-library/src/rendererModules/DOI/DOIType.tsx Outdated
Comment thread packages/stencil-library/src/rendererModules/DOI/DOIType.tsx Outdated
…d citation styles, icons, and Storybook examples

Co-authored-by: maximiliani <43142805+maximiliani@users.noreply.github.com>
Contributor Author

Copilot AI commented Feb 4, 2026

This is a good start. Please add some Storybook stories that demonstrate multiple examples. See comments for more stuff to change... Please separate the item generation and parsing logic for DataCite ...

Added 8 Storybook stories demonstrating all example DOIs:

  • DataCite: Journal paper (10.5445/IR/1000185135), Software (10.5281/zenodo.13629109), RFC (10.17487/rfc3650), Slides (10.5445/IR/1000178054), Preprint (10.48550/ARXIV.2505.16550)
  • CrossRef: Journal paper (10.1109/eScience65000.2025.00022), Book (10.1007/978-1-4419-8598-9)
  • Plus citation styles example

DataCite and CrossRef logic now fully separated into dedicated parser files.

Commit: d4c31ec

Copilot AI changed the title Add DOI metadata renderer with DataCite and CrossRef support Add DOI metadata renderer with DataCite and CrossRef support, citation styles, and enhanced metadata extraction Feb 4, 2026
Copilot AI requested a review from maximiliani February 4, 2026 19:16
Signed-off-by: Maximilian Inckmann <maximilian.inckmann@kit.edu>
Signed-off-by: Maximilian Inckmann <maximilian.inckmann@kit.edu>
Signed-off-by: Maximilian Inckmann <maximilian.inckmann@kit.edu>
Signed-off-by: Maximilian Inckmann <maximilian.inckmann@kit.edu>
@maximiliani maximiliani marked this pull request as ready for review February 4, 2026 22:04
Copilot AI review requested due to automatic review settings February 4, 2026 22:04
@maximiliani maximiliani merged commit 60ede31 into main Feb 4, 2026
11 checks passed
@maximiliani maximiliani deleted the copilot/render-doi-metadata branch February 4, 2026 22:10
Contributor

Copilot AI left a comment

Pull request overview

This pull request implements comprehensive DOI (Digital Object Identifier) metadata rendering for academic resources. The implementation adds a modular DOI renderer that fetches metadata from DataCite or CrossRef APIs, displays rich citation information with configurable styles, and presents structured metadata including authors, publication dates, resource types, and descriptions.

Changes:

  • Added modular DOI renderer architecture with separate parsers for DataCite and CrossRef metadata schemas
  • Implemented 5 citation styles (APA, Chicago, IEEE, Harvard, Anglia Ruskin) with configurable settings
  • Enhanced metadata extraction including ORCiD identifiers, ROR affiliations, and JATS syntax parsing for CrossRef abstracts
  • Integrated DOI renderer into the renderer priority system (priority 2, before HandleType at 3)
  • Added comprehensive Storybook examples covering 8 different DOI scenarios
  • Updated documentation in README and MDX files

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| packages/stencil-library/src/utils/utils.ts | Registers DOIType renderer at priority 2 and adjusts priorities for subsequent renderers |
| packages/stencil-library/src/rendererModules/DOI/DOI.ts | Core DOI class with validation, parsing, and URL generation |
| packages/stencil-library/src/rendererModules/DOI/DataCiteInfo.ts | DataCite-specific metadata parser with schema-aware field extraction |
| packages/stencil-library/src/rendererModules/DOI/CrossRefInfo.ts | CrossRef-specific metadata parser with JATS syntax support |
| packages/stencil-library/src/rendererModules/DOI/DOIInfo.ts | Wrapper combining DataCite and CrossRef sources with fallback logic |
| packages/stencil-library/src/rendererModules/DOI/DOIType.tsx | Main renderer implementing GenericIdentifierType with preview and citation display |
| packages/stencil-library/src/rendererModules/DOI/CitationStyles.ts | Citation formatting utilities for 5 academic citation styles |
| packages/stencil-library/src/rendererModules/DOI/ResourceTypeIcons.tsx | Resource type mapping, beautification, and DataCite/CrossRef logo SVG components |
| packages/stencil-library/src/components/pid-component/pid-component.stories.ts | 8 Storybook examples demonstrating DOI rendering with different sources and configurations |
| packages/stencil-library/src/components/pid-component/pid-component.mdx | Documentation updates explaining DOI support, citation styles, and usage examples |
| packages/stencil-library/src/components/pid-pagination/readme.md | Table delimiter alignment fix (auto-generated) |
| packages/stencil-library/src/components/pid-data-table/readme.md | Table delimiter alignment fix (auto-generated) |
| packages/stencil-library/src/components/pid-component/readme.md | Table delimiter alignment fix (auto-generated) |
| packages/stencil-library/src/components/pid-actions/readme.md | Table delimiter alignment fix (auto-generated) |
| README.md | Added DOI support documentation including citation styles and configuration options |

const titleTrunc = truncateTitle(title, 60);
const yearPart = year ? `, ${year.split('-')[0]}` : '';

return `${initial} ${authorName}${etAl}, "${truncate ? titleTrunc : title}"${yearPart}`;

Copilot AI Feb 4, 2026

When the author has no givenName, the initial variable will be an empty string, resulting in a citation starting with a space: " AuthorName et al., ...". Consider adjusting the formatting to handle the case when there's no initial: ${initial}${initial ? ' ' : ''}${authorName}${etAl}

Suggested change
- return `${initial} ${authorName}${etAl}, "${truncate ? titleTrunc : title}"${yearPart}`;
+ return `${initial}${initial ? ' ' : ''}${authorName}${etAl}, "${truncate ? titleTrunc : title}"${yearPart}`;
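
An alternative to inline conditionals is to join only the parts that are present. The helper below is a hypothetical sketch, not part of the PR; it avoids both the leading space flagged here and stray separators in similar templates.

```typescript
// Joins the non-empty parts with the given separator, skipping empty,
// whitespace-only, and undefined entries.
function joinNonEmpty(parts: Array<string | undefined>, separator = " "): string {
  return parts
    .filter((part): part is string => Boolean(part && part.trim()))
    .join(separator);
}
```

Under this approach, joinNonEmpty([initial, authorName]) gives "J. Doe" when an initial exists and just "Doe" when it does not. The same helper with ", " as separator would cover templates that place the initials after the family name.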

};

if (!result.name && corresponding.givenName && corresponding.familyName) {
result.name = `${corresponding.givenName} ${corresponding.familyName}`;

Copilot AI Feb 4, 2026

The corresponding author name construction logic is incomplete. If corresponding.name is empty and only one of givenName or familyName is present (but not both), the result.name will remain empty. Add an else if (!result.name) block after line 168 to handle the case where only one name component is available, similar to the logic in the creators getter at lines 119-120.

Suggested change
- result.name = `${corresponding.givenName} ${corresponding.familyName}`;
+ result.name = `${corresponding.givenName} ${corresponding.familyName}`;
+ } else if (!result.name) {
+ result.name = corresponding.givenName || corresponding.familyName || '';
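
The fallback chain can also be written as a single expression. The sketch below assumes a simplified person shape and a hypothetical function name; it is meant to show the intended precedence (full name, then given plus family name, then whichever one exists), not to mirror the parser's actual types.

```typescript
// Simplified person record (assumed shape).
interface PersonName {
  name?: string;
  givenName?: string;
  familyName?: string;
}

// Resolves a display name: explicit name first, otherwise the available
// name components joined by a space, otherwise an empty string.
function displayName(person: PersonName): string {
  return person.name || [person.givenName, person.familyName].filter(Boolean).join(" ");
}
```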

export function beautifyResourceType(resourceType: string): string {
  const normalized = resourceType
    .toLowerCase()
    .replace("_", "").replace("-", "");

Copilot AI Feb 4, 2026

The replace method only replaces the first occurrence of "_" and "-" characters. If a resource type contains multiple underscores or hyphens (e.g., "journal_article_preprint"), only the first underscore would be removed. Use replaceAll() or a global regex instead: .replace(/_/g, "").replace(/-/g, "") to replace all occurrences.

Suggested change
- .replace("_", "").replace("-", "");
+ .replace(/_/g, "").replace(/-/g, "");
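
The difference between a string pattern and a global regex is easy to reproduce in isolation. The function name below is hypothetical, and the character class is just a compact equivalent of the two chained global replacements.

```typescript
// Lowercases a resource type and removes every underscore and hyphen.
function normalizeResourceType(resourceType: string): string {
  return resourceType.toLowerCase().replace(/[_-]/g, "");
}

// A string pattern only replaces the first occurrence:
const firstOnly = "journal_article_preprint".replace("_", "");
// firstOnly === "journalarticle_preprint"

// The global regex removes all of them:
const allRemoved = normalizeResourceType("journal_article_preprint");
// allRemoved === "journalarticlepreprint"
```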

Comment on lines +70 to +88
private parseJATS(text: string): string {
  if (!text) return text;

  // Remove common JATS tags
  return text
    .replace(/<jats:p>/g, '')
    .replace(/<\/jats:p>/g, '\n')
    .replace(/<jats:italic>/g, '<i>')
    .replace(/<\/jats:italic>/g, '</i>')
    .replace(/<jats:bold>/g, '<b>')
    .replace(/<\/jats:bold>/g, '</b>')
    .replace(/<jats:sub>/g, '<sub>')
    .replace(/<\/jats:sub>/g, '</sub>')
    .replace(/<jats:sup>/g, '<sup>')
    .replace(/<\/jats:sup>/g, '</sup>')
    .replace(/<jats:title>/g, '<strong>')
    .replace(/<\/jats:title>/g, '</strong>')
    .replace(/\n\n+/g, '\n\n')
    .trim();

Copilot AI Feb 4, 2026

The JATS parsing converts XML tags to HTML tags (e.g., <jats:italic> to <i>, <jats:bold> to <b>), which are then included in the description text. If this text is later rendered as HTML without proper sanitization, it could potentially lead to XSS vulnerabilities. Ensure that the component rendering this description properly sanitizes or escapes HTML content, or consider removing HTML tags entirely and using plain text with formatting indicators instead.
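
One way to address this concern is to escape the whole abstract first and only then re-enable a small allowlist of attribute-free tags. The sketch below is an assumption about how that could be done, not what the PR implements; the function names and the tag mapping are illustrative, and whether the output is rendered as HTML at all depends on the component.

```typescript
// Escapes the characters that are significant in HTML text and attributes.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Allowlist of JATS inline tags and their HTML counterparts.
const JATS_TO_HTML: Record<string, string> = {
  italic: "i",
  bold: "b",
  sub: "sub",
  sup: "sup",
};

// Escapes everything, then restores only allowlisted, attribute-free JATS tags.
function jatsToSafeHtml(text: string): string {
  return escapeHtml(text).replace(
    /&lt;(\/?)jats:(italic|bold|sub|sup)&gt;/g,
    (_match: string, slash: string, tag: string) => `<${slash}${JATS_TO_HTML[tag]}>`,
  );
}
```

Because escaping happens before the conversion, any other markup in the abstract (for example a script element) stays inert text, while the allowlisted formatting survives.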

const authorName = author.familyName || author.name.split(' ').pop() || author.name;
const etAl = count > 1 ? ' et al.' : '';
const yearPart = year ? ` (${year.split('-')[0]})` : '';
const titleTrunc = truncateTitle(title, 60);

Copilot AI Feb 4, 2026

The title truncation length (60) is hardcoded in multiple citation format functions. Consider extracting this as a constant (e.g., const MAX_TITLE_LENGTH = 60) at the top of the file to make it easier to maintain and adjust if needed. This follows the DRY principle and makes the magic number more explicit.

const yearPart = year ? `, ${year.split('-')[0]}` : '';
const titleTrunc = truncateTitle(title, 60);

return `${authorName}, ${initials}${etAl}${yearPart}. ${truncate ? titleTrunc : title}`;

Copilot AI Feb 4, 2026

When the author has no givenName, the initials variable will be an empty string, resulting in a citation like "AuthorName, , Year. Title" with an extra comma and space. Consider adding a check to omit the initials and extra comma when empty, or adjust the formatting logic: ${authorName}${initials ? `, ${initials}` : ''}${etAl}${yearPart}

Suggested change
- return `${authorName}, ${initials}${etAl}${yearPart}. ${truncate ? titleTrunc : title}`;
+ return `${authorName}${initials ? `, ${initials}` : ''}${etAl}${yearPart}. ${truncate ? titleTrunc : title}`;

Development

Successfully merging this pull request may close these issues.

Render DOI metadata

3 participants