feat: improve PDF.js integration with custom canvas renderer #3648

Merged: hmiguim merged 8 commits into development from claude/pdf-viewer-improvements on Apr 17, 2026

luis100 (Member) commented Apr 15, 2026

Summary

Replaces the iframe-based PDF.js viewer (`web/viewer.html`) with a self-contained PDF renderer built on the PDF.js high-level viewer components (`PDFViewer` + `PDFFindController`), the same approach used by the official Mozilla PDF.js demo.

Changes

  • Dependency: Switched from `org.webjars:pdf-js:5.3.31` to `org.webjars.npm:pdfjs-dist:5.5.207` (npm-published build with full viewer API)
  • `rodaPdfViewer.js`: Custom viewer using PDF.js `PDFViewer`, `PDFFindController`, `PDFLinkService`, and `EventBus` from `web/pdf_viewer.mjs`
    • Lazy page rendering via `PDFViewer`'s built-in page management
    • Sticky toolbar with zoom, rotation, and page navigation
    • Full-text search with native highlight support via `PDFFindController`
    • Thumbnail sidebar for quick page navigation
    • Fullscreen support
    • All `pdf_viewer.css` rules scoped to `.rodaPdfViewer` (no global CSS pollution)
  • `BitstreamPreview.java`: Replaced iframe widget with a `FlowPanel` that calls `RodaPdfViewer.init()`/`destroy()` via GWT JSNI on attach/detach
  • `JavascriptUtils.java`: Added `initRodaPdfViewer` and `destroyRodaPdfViewer` JSNI bridge methods
  • `main.gss`: Layout CSS for the viewer container (`.rodaPdfViewer`, `.rodaPdfContent`, `.rodaPdfScrollContainer`, etc.)
  • `Main.html`: Added `<script>` tag to load `rodaPdfViewer.js`
  • `logback_wui.xml`: Suppressed noisy WARN logs from Chrome DevTools probing `/.well-known/appspecific/com.chrome.devtools.json`
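
As a rough illustration of how the components named in the `rodaPdfViewer.js` item fit together, the sketch below follows the wiring used in the PDF.js component examples. It is not the code from this PR: `createPdfViewer` and `viewerLib` are hypothetical names, `viewerLib` stands for the namespace exported by `web/pdf_viewer.mjs`, and `pdfDocument` is assumed to be a document already loaded through the PDF.js core library.

```javascript
// Sketch only, not the PR's implementation.
// viewerLib: the exports of web/pdf_viewer.mjs (assumed to be loaded by the caller).
// container: an absolutely positioned element whose first child is <div class="pdfViewer">.
// pdfDocument: a document proxy obtained from the PDF.js core (getDocument(...).promise).
function createPdfViewer(viewerLib, container, pdfDocument) {
  // One EventBus is shared by every component so that they can talk to each other.
  const eventBus = new viewerLib.EventBus();
  const linkService = new viewerLib.PDFLinkService({ eventBus });
  const findController = new viewerLib.PDFFindController({ eventBus, linkService });

  const pdfViewer = new viewerLib.PDFViewer({
    container,
    eventBus,
    linkService,
    findController,
  });

  // The link service needs the viewer to resolve internal links and page jumps.
  linkService.setViewer(pdfViewer);

  // Hand the document to both the viewer and the link service.
  pdfViewer.setDocument(pdfDocument);
  linkService.setDocument(pdfDocument, null);

  return { eventBus, linkService, findController, pdfViewer };
}
```

Passing the library namespace in as a parameter keeps the sketch independent of how the webjar module is actually loaded in `Main.html`.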

Motivation

The old iframe approach required URL-encoding workarounds and loaded the entire PDF.js web application (with its own UI chrome), making styling and integration difficult. The new approach:

  • Uses PDF.js's own `PDFViewer` component, which correctly handles the CSS custom property-based text layer layout introduced in PDF.js v5
  • Delegates search and highlight entirely to `PDFFindController` (same as the official viewer)
  • Scopes `pdf_viewer.css` at runtime (replacing `:root` with `.rodaPdfViewer`) to prevent dark-mode and other global variable declarations from leaking into the host application

Test plan

  • Open a PDF file in the RODA file browser and verify it renders correctly
  • Test zoom in/out and rotation controls
  • Test page navigation (previous/next and direct page input)
  • Test full-text search: enter a search term, verify words are highlighted in the document
  • Test search next/previous navigation
  • Test thumbnail sidebar navigation
  • Test fullscreen mode
  • Verify no CSS regressions on other preview types (image, text, HTML, TIFF)
  • Verify the viewer is destroyed cleanly when navigating away
  • Check that dark-mode OS preference does not affect non-PDF UI components

🤖 Generated with Claude Code

luis100 changed the base branch from master to development April 16, 2026 07:41
Remove extracted pdf_viewer.mjs and pdf_viewer.mjs.map files that were
used to inspect the PDF.js webjar during development. These files are
not part of the actual implementation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
luis100 changed the title from "WIP: Improving PDF viewer" to "feat: improve PDF.js integration with custom canvas renderer" Apr 16, 2026
luis100 and others added 5 commits April 16, 2026 09:08
The previous approach modified span.innerHTML to insert <mark> elements
inside PDF.js text layer spans. This broke the scaleX transforms that
PDF.js sets on each span, causing marks to be mispositioned or invisible.

New approach:
- Use the Range API to locate the exact rendered rect of each text match
- Create absolutely-positioned <div class="rodaPdfHighlight"> overlays
  inside the page container (position:relative), not inside the text spans
- Falls back to the span bounding rect when text node content doesn't
  match exactly (e.g. nested elements)
- clearHighlights now simply removes the overlay divs
- focusSearchMatch scrolls precisely to the selected highlight element
- CSS updated: .rodaPdfHighlight is now a positioned div, removing the
  inline mark-specific margin/padding/color properties
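
The overlay technique described in this commit message can be sketched roughly as follows. This is an illustration under assumptions, not the committed code: `toPageRelative` and `highlightMatch` are hypothetical names, and the sketch assumes the match lies inside a single text node and that the page container has no border offset.

```javascript
// Pure helper: convert a viewport rect into coordinates relative to the page container.
function toPageRelative(rect, pageRect) {
  return {
    left: rect.left - pageRect.left,
    top: rect.top - pageRect.top,
    width: rect.width,
    height: rect.height,
  };
}

// Browser-only sketch: place overlay divs over the characters [start, end) of textNode.
// pageEl is assumed to be the position:relative page container.
function highlightMatch(pageEl, textNode, start, end) {
  const range = document.createRange();
  range.setStart(textNode, start);
  range.setEnd(textNode, end);

  const pageRect = pageEl.getBoundingClientRect();
  const overlays = [];
  // A range that wraps across lines yields one rect per line box.
  for (const rect of Array.from(range.getClientRects())) {
    const box = toPageRelative(rect, pageRect);
    const div = document.createElement("div");
    div.className = "rodaPdfHighlight";
    div.style.position = "absolute";
    div.style.left = box.left + "px";
    div.style.top = box.top + "px";
    div.style.width = box.width + "px";
    div.style.height = box.height + "px";
    pageEl.appendChild(div);
    overlays.push(div);
  }
  return overlays;
}
```

Because the overlays are siblings of the text layer rather than children of its spans, removing them again does not touch the transforms PDF.js applies to the spans.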

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The previous manual canvas+TextLayer approach was incompatible with
PDF.js v5's CSS-custom-property-based text layer layout
(--total-scale-factor, --font-height, --scale-x, etc.), which is only
set up correctly by pdf_viewer.css.  This caused search highlights to
appear in completely wrong positions.

This commit switches to the official PDF.js high-level components,
exactly as used by the mozilla.github.io/pdf.js/web/viewer.html demo:

- PDFViewer handles page rendering (canvas + text layer) correctly
- PDFFindController handles text search and highlight via the EventBus
- pdf_viewer.css is dynamically loaded once from the webjar so that the
  text layer CSS custom properties are defined
- Our toolbar dispatches EventBus 'find'/'findbarclose' events instead
  of custom DOM manipulation
- Removed all manual canvas rendering, TextLayer construction, Range-
  based highlight overlays, and presentation-mode overlay code
- CSS updated: added rodaPdfScrollWrapper/rodaPdfScrollContainer layout,
  removed rodaPdfPages/rodaPdfPage/rodaPdfCanvas/rodaPdfTextLayer/
  rodaPdfHighlight/presentation-mode rules (pdf_viewer.css owns those)
- PDFViewer container is position:absolute as required by the library
- Viewer height is 80vh (min 400px); fullscreen remains 100vh
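
For illustration, dispatching the `find` and `findbarclose` events from a custom toolbar might look like the sketch below. The helper names `dispatchFind` and `closeFind` are hypothetical, and the payload fields mirror what the stock PDF.js find bar sends as far as I know; they should be checked against the installed `pdfjs-dist` version.

```javascript
// Sketch only: eventBus is the EventBus instance shared with PDFFindController.
// source identifies the sender (for example the toolbar object).
function dispatchFind(eventBus, source, query, options = {}) {
  const { again = false, findPrevious = false } = options;
  eventBus.dispatch("find", {
    source,
    // An empty type starts a new search; "again" steps to the next or previous match.
    type: again ? "again" : "",
    query,
    caseSensitive: false,
    entireWord: false,
    highlightAll: true,
    findPrevious,
    matchDiacritics: false,
  });
}

// Tell the find controller that the search UI was closed so it can clear its highlights.
function closeFind(eventBus, source) {
  eventBus.dispatch("findbarclose", { source });
}
```

A "next match" button would then call `dispatchFind(eventBus, toolbar, query, { again: true })`, and a "previous match" button would add `findPrevious: true`.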

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
PDFViewer checks getComputedStyle(container).position === 'absolute'
at construction time. The CSS class alone is not reliable (possible
GWT class pruning or loading-order race). Set all critical layout
properties inline on scrollContainer, scrollWrapper, and content so
the check always passes regardless of CSS class availability.
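
A minimal sketch of this inline-style approach is shown below. The function name `applyCriticalLayout` and the exact property values are assumptions for illustration (the 80vh and 400px figures are taken from the earlier commit message); only the absolute positioning of the scroll container is dictated by the `PDFViewer` check.

```javascript
// Sketch only: set layout-critical properties inline before constructing PDFViewer,
// so the result does not depend on a stylesheet class being present yet.
function applyCriticalLayout({ content, scrollWrapper, scrollContainer }) {
  // Assumed values: the outer content box defines the viewer height.
  Object.assign(content.style, {
    height: "80vh",
    minHeight: "400px",
  });
  // The wrapper is the positioning context for the absolutely positioned container.
  Object.assign(scrollWrapper.style, {
    position: "relative",
    height: "100%",
  });
  // PDFViewer requires its container to be absolutely positioned.
  Object.assign(scrollContainer.style, {
    position: "absolute",
    top: "0",
    right: "0",
    bottom: "0",
    left: "0",
    overflow: "auto",
  });
}
```

Inline styles take effect as soon as they are set, so `getComputedStyle(scrollContainer).position` reports `absolute` once the element is attached, whether or not the class rules have loaded.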

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…kage

pdf_viewer.css sets :root CSS custom properties (including ones defined
inside dark-mode media queries, such as --csstools-color-scheme--light)
that leaked into the host application and affected unrelated components.

Instead of loading the file via a global <link>, we now fetch the CSS
text, replace every ':root' selector with '.rodaPdfViewer', and inject
the result as a <style> element. Since all PDF.js DOM elements are
descendants of .rodaPdfViewer, custom properties defined there are still
inherited by .pdfViewer, .page, .textLayer, and .highlight — but they
no longer pollute the rest of the page.

Falls back to a global <link> if fetch fails (e.g. strict CSP).
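
The fetch, rewrite, and inject steps described above could be sketched as follows. The names `scopeRootSelectors` and `loadScopedCss` are hypothetical and the regular expression is a simple assumption; this is not the code from the commit.

```javascript
// Pure helper: rewrite every ':root' selector to the given scope selector.
function scopeRootSelectors(cssText, scope) {
  return cssText.replace(/:root\b/g, scope);
}

// Browser-only sketch: fetch the stylesheet, scope it, and inject it as a <style> element.
// Falls back to a plain global <link> if the fetch fails (for example under a strict CSP).
async function loadScopedCss(url, scope) {
  try {
    const response = await fetch(url);
    if (!response.ok) {
      throw new Error("HTTP " + response.status);
    }
    const style = document.createElement("style");
    style.textContent = scopeRootSelectors(await response.text(), scope);
    document.head.appendChild(style);
  } catch (err) {
    const link = document.createElement("link");
    link.rel = "stylesheet";
    link.href = url;
    document.head.appendChild(link);
  }
}
```

Note that `:root` rules nested inside `@media` blocks are rewritten as well, which is what keeps the dark-mode variables confined to `.rodaPdfViewer`.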

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Chrome DevTools probes /.well-known/appspecific/com.chrome.devtools.json
on every page load, which causes Spring's ExceptionHandlerExceptionResolver
to log a WARN for every request. Set that logger to ERROR level so the
harmless NoResourceFoundException no longer fills the logs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
luis100 requested a review from hmiguim April 16, 2026 08:40
luis100 marked this pull request as ready for review April 16, 2026 08:40
dosubot bot added the size:XXL (this PR changes 1000+ lines, ignoring generated files) and enhancement labels Apr 16, 2026
luis100 added this to the 6.2.0 milestone Apr 16, 2026
hmiguim modified the milestones: 6.2.0, 6.0.3 Apr 17, 2026
hmiguim merged commit 90c3e4f into development Apr 17, 2026
1 of 3 checks passed
hmiguim deleted the claude/pdf-viewer-improvements branch April 17, 2026 08:28