
Cache compiled schematron Templates to avoid repeated XSLT compilation #1

Merged
jordanpadams merged 1 commit into main from devin/1775083236-cache-schematron-templates
Apr 1, 2026

@devin-ai-integration devin-ai-integration bot commented Apr 1, 2026

Summary

Adds an in-memory cache of compiled javax.xml.transform.Templates objects to SchematronTransformer, keyed by the raw schematron source string. This avoids re-running the expensive ISO schematron → XSLT compilation for every label when the same schematron is used repeatedly (which is the common case).

Mechanism: The existing compilation logic is extracted into a private compileSchematron(Source, ProblemHandler) method that returns Templates instead of Transformer. The transform(String, ProblemHandler) overload now checks a HashMap<String, Templates> before compiling; on a cache hit it calls Templates.newTransformer() directly. The transform(Source, ProblemHandler) overload (used less frequently, and not keyed by a stable string) is not cached; it always compiles fresh.

Cache lifetime is tied to the SchematronTransformer instance. LabelValidator.clear() already creates a new instance (line 219), so no changes are needed there. A clearCache() method is provided for explicit invalidation if needed.
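For readers without the diff at hand, the caching pattern described above can be sketched roughly as follows. This is a minimal illustration, not the merged code: the class name CachingTransformerSketch, the compileCount field, and the single-argument transform(String) are hypothetical, the ProblemHandler parameter is omitted, and a plain XSLT stylesheet compiled with TransformerFactory.newTemplates stands in for the ISO schematron → XSLT compilation step.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical stand-in for the cached transform(String, ...) path. */
class CachingTransformerSketch {
  private final TransformerFactory factory = TransformerFactory.newInstance();

  // Cache keyed by the raw source string, as the PR describes.
  private final Map<String, Templates> cache = new HashMap<>();

  // Illustration only: counts how often the expensive step runs.
  int compileCount = 0;

  // Assumption: plain XSLT compilation stands in for compileSchematron().
  private Templates compile(String source) throws TransformerConfigurationException {
    compileCount++;
    return factory.newTemplates(new StreamSource(new StringReader(source)));
  }

  Transformer transform(String source) throws TransformerConfigurationException {
    Templates templates = cache.get(source);
    if (templates == null) {
      // Cache miss: compile once and remember the result.
      templates = compile(source);
      cache.put(source, templates);
    }
    // Hit or miss, each caller gets its own Transformer from the shared Templates.
    return templates.newTransformer();
  }

  // Explicit invalidation; discarding the whole instance has the same effect.
  void clearCache() {
    cache.clear();
  }
}
```

Calling transform twice with the same string compiles once and returns two separate Transformer objects; after clearCache() the next call compiles again.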

Ref: NASA-PDS#1565

Review & Testing Checklist for Human

  • Thread safety of HashMap: The cache uses a plain HashMap. If SchematronTransformer is ever accessed from multiple threads concurrently (e.g., parallel label validation), this is a data race. Verify whether a ConcurrentHashMap is needed by checking how LabelValidator / validation rules use this class.
  • ProblemHandler ignored on cache hit: On a cache hit, the handler parameter passed to transform(String, ProblemHandler) is silently unused — no error listener is set. Confirm this is acceptable (compilation already succeeded on first call, so there are no transform errors to report on the cached path).
  • Memory footprint of cache keys: The cache key is the full schematron XML source string, which can be large. Verify that the number of distinct schematron files per run is small enough that this is not a concern.
  • Run the existing test suite (mvn test) and confirm identical validation results on a representative set of labels (e.g., a small bundle). The behavioral contract should be unchanged — same errors/warnings, same SVRL output — just faster on repeated schematron compilations.
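If review of the first three checklist items finds they do matter, one possible shape for a follow-up is sketched below. It is an assumption-laden illustration rather than a proposed patch: the class name ConcurrentTemplatesCacheSketch, the keyFor helper, and the compileCount counter are hypothetical, javax.xml.transform.ErrorListener stands in for the project's ProblemHandler, and plain XSLT compilation again stands in for the schematron compilation. It uses ConcurrentHashMap.computeIfAbsent for thread safety, a SHA-256 digest of the source as a fixed-size key, and sets the listener on every returned Transformer, including on cache hits.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import javax.xml.transform.ErrorListener;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical thread-safe variant of the Templates cache. */
class ConcurrentTemplatesCacheSketch {
  private final ConcurrentMap<String, Templates> cache = new ConcurrentHashMap<>();

  // Illustration only: counts compilations across threads.
  final AtomicInteger compileCount = new AtomicInteger();

  // Fixed-size key (44 Base64 characters) instead of the full XML source.
  static String keyFor(String source) {
    try {
      MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
      byte[] digest = sha256.digest(source.getBytes(StandardCharsets.UTF_8));
      return Base64.getEncoder().encodeToString(digest);
    } catch (NoSuchAlgorithmException e) {
      // SHA-256 is a required algorithm on every Java platform.
      throw new IllegalStateException(e);
    }
  }

  // Assumption: plain XSLT compilation stands in for the schematron step.
  // A fresh factory per call avoids sharing a TransformerFactory across threads.
  private Templates compile(String source) {
    compileCount.incrementAndGet();
    try {
      return TransformerFactory.newInstance()
          .newTemplates(new StreamSource(new StringReader(source)));
    } catch (TransformerConfigurationException e) {
      // Unchecked wrapper so the method can be used inside computeIfAbsent.
      throw new IllegalArgumentException("stylesheet failed to compile", e);
    }
  }

  Transformer transform(String source, ErrorListener listener)
      throws TransformerConfigurationException {
    // computeIfAbsent runs the compile function at most once per key.
    Templates templates = cache.computeIfAbsent(keyFor(source), key -> compile(source));
    Transformer transformer = templates.newTransformer();
    if (listener != null) {
      // Attach the listener on hits as well as misses.
      transformer.setErrorListener(listener);
    }
    return transformer;
  }
}
```

The digest key trades a hash computation on every lookup for smaller map keys; whether that is worthwhile depends on how many distinct schematron sources a run actually sees, which the checklist leaves open.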

Notes

  • Only the transform(String, ...) path is cached. The transform(Source, ...) path compiles every time since Source objects are not stably comparable/hashable.
  • No new tests were added; the change is intended to be a transparent optimization verified by existing coverage.

Link to Devin session: https://nasa-jpl-demo.devinenterprise.com/sessions/e8943ca2e9de4879856766ebf367604c



Add a HashMap<String, Templates> cache to SchematronTransformer so that
the expensive ISO schematron XSLT compilation is performed only once per
unique schematron source string. Subsequent calls to transform(String)
return a new Transformer from the cached Templates object.

- Extract compilation logic into private compileSchematron() returning Templates
- Add cache lookup in transform(String, ProblemHandler)
- Add clearCache() method (naturally reset when LabelValidator.clear() creates a new instance)
- Add debug logging for cache hits/misses

Fixes: NASA-PDS#1565
Co-Authored-By: jordan.h.padams <jordan.h.padams@jpl.nasa.gov>


@devin-ai-integration devin-ai-integration bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 3 additional findings.


@jordanpadams jordanpadams merged commit 521c9ee into main Apr 1, 2026
3 checks passed