Warn (don't error) when optional services (entity-fishing, glutton) are unavailable #13
Merged
Entity-fishing (disambiguation) and Glutton (ref consolidation) are optional external services. When unconfigured or unreachable, datastet should warn clearly and continue, not log ERROR stacktraces or throw GrobidException.

Entity-fishing (DatasetDisambiguator):
- Single WARN at startup when host is blank or unreachable
- All runtime failures → WARN with short reason, not ERROR/stacktrace
- disambiguate() short-circuits with !serverStatus guard
- Replaced all e.printStackTrace() with LOGGER.warn()

Glutton (DatasetParser):
- Gate all 3 Consolidation.getInstance() call sites with gluttonHost blank check (previously only 1 site was gated)
- When gluttonHost unset: WARN once per JVM (AtomicBoolean flag) that consolidation will be skipped
- When glutton call throws: WARN per-request instead of GrobidException (which crashed the whole request)

Result: clean deployment logs when optional services are absent, but the user is still informed what was skipped and why.

https://claude.ai/code/session_018EBZhK2RtGtsvN4E1rp2tF
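The Glutton gating described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the class `GluttonGateSketch`, the method `consolidateOrSkip`, the warning texts, and the in-memory `warnings` list (a stand-in for `LOGGER.warn()`) are all hypothetical. The flag is an instance field here so the sketch is easy to exercise; a static flag would give the once-per-JVM behaviour the commit message describes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

// Hypothetical sketch of "gate the call site, warn once, never throw".
class GluttonGateSketch {
    // Stand-in for a real logger so the behaviour can be inspected.
    final List<String> warnings = new ArrayList<>();

    // Assumption: a static field in real code would make this once per JVM.
    private final AtomicBoolean gluttonWarningLogged = new AtomicBoolean(false);

    void warn(String message) {
        warnings.add(message);
        System.err.println("WARN " + message);
    }

    /**
     * Returns the consolidated reference when Glutton is configured and the
     * call succeeds; otherwise returns the raw reference unchanged.
     */
    String consolidateOrSkip(String gluttonHost, String rawReference,
                             Supplier<String> consolidationCall) {
        if (gluttonHost == null || gluttonHost.trim().isEmpty()) {
            // compareAndSet guarantees exactly one caller logs the warning.
            if (gluttonWarningLogged.compareAndSet(false, true)) {
                warn("glutton host not configured, consolidation will be skipped");
            }
            return rawReference;
        }
        try {
            return consolidationCall.get();
        } catch (RuntimeException e) {
            // Per-request warning with a short reason, no stack trace.
            warn("glutton consolidation failed, skipping: " + e.getMessage());
            return rawReference;
        }
    }

    public static void main(String[] args) {
        GluttonGateSketch gate = new GluttonGateSketch();
        gate.consolidateOrSkip("", "raw ref", () -> "consolidated ref");
        gate.consolidateOrSkip("", "raw ref", () -> "consolidated ref");
        System.out.println("warnings logged: " + gate.warnings.size());
    }
}
```

Passing the consolidation call as a `Supplier` keeps the sketch free of any real Glutton or GROBID API; in the actual code the guarded call is `Consolidation.getInstance()`.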
lfoppiano pushed a commit that referenced this pull request Apr 14, 2026
The fix in #13 made DatasetDisambiguator.disambiguate() gracefully short-circuit via serverStatus when entity-fishing is unavailable, but that guard never ran: DatasetParser.disambiguator was always null, because the private constructor only takes `configuration` while the three collaborator assignments were field-to-field self-assignments (no such parameters exist on the ctor), and every caller of getInstance(...) passes `null, null, null`. A request with disambiguate=true therefore NPE'd at DatasetParser:276 before the internal short-circuit could help.

- DatasetParser constructor now obtains the existing @singleton via DatasetDisambiguator.getInstance(configuration.getDatastetConfiguration()), so #13's serverStatus guard does its job when entity-fishing is not configured.
- Add a warn-once AtomicBoolean guard at the disambiguate() call site mirroring the existing gluttonWarningLogged pattern, so any future null disambiguator path also degrades to a single WARN instead of crashing the request.

Result: one WARN at startup ("entity-fishing host not configured, dataset disambiguation will be skipped"), no stacktrace, requests return 200.

https://claude.ai/code/session_014ZyP9j7nprZKKtJMtbWmvA
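The self-assignment bug and the two-part fix described in this commit message can be illustrated with a small sketch. Everything here is hypothetical scaffolding (`ParserWiringSketch`, `BuggyParser`, `FixedParser`, the nested `Disambiguator`, the `warnings` list and the warning text); it only mirrors the shape of the problem, not the real DatasetParser code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

class ParserWiringSketch {
    // Hypothetical collaborator; only null vs. non-null matters here.
    static class Disambiguator {
        private static final Disambiguator INSTANCE = new Disambiguator();
        static Disambiguator getInstance(String configuration) { return INSTANCE; }
        List<String> disambiguate(List<String> mentions) { return mentions; }
    }

    static class BuggyParser {
        Disambiguator disambiguator;

        BuggyParser(String configuration) {
            // There is no constructor parameter named 'disambiguator', so the
            // right-hand side is the field itself: it stays null.
            this.disambiguator = disambiguator;
        }

        List<String> process(List<String> mentions) {
            return disambiguator.disambiguate(mentions); // throws NullPointerException
        }
    }

    static class FixedParser {
        final List<String> warnings = new ArrayList<>(); // stand-in for a logger
        private final AtomicBoolean disambiguatorWarningLogged = new AtomicBoolean(false);
        private final Disambiguator disambiguator;

        // Fix 1: obtain the collaborator from its own factory.
        FixedParser(String configuration) {
            this(Disambiguator.getInstance(configuration));
        }

        // Lets the sketch exercise the defensive null path below.
        FixedParser(Disambiguator disambiguator) {
            this.disambiguator = disambiguator;
        }

        List<String> process(List<String> mentions) {
            // Fix 2: a null collaborator degrades to a single warning.
            if (disambiguator == null) {
                if (disambiguatorWarningLogged.compareAndSet(false, true)) {
                    warnings.add("disambiguator not available, disambiguation will be skipped");
                }
                return mentions;
            }
            return disambiguator.disambiguate(mentions);
        }
    }

    public static void main(String[] args) {
        FixedParser parser = new FixedParser((Disambiguator) null);
        parser.process(List.of("mention"));
        parser.process(List.of("mention"));
        System.out.println("warnings logged: " + parser.warnings.size());
    }
}
```

The field in `BuggyParser` is deliberately non-final: with a `final` field the compiler would reject the self-assignment as a read of an uninitialized variable, which is one reason such bugs tend to survive only on mutable fields.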
4 tasks
lfoppiano added a commit that referenced this pull request Apr 14, 2026
…#18)

The fix in #13 made DatasetDisambiguator.disambiguate() gracefully short-circuit via serverStatus when entity-fishing is unavailable, but that guard never ran: DatasetParser.disambiguator was always null, because the private constructor only takes `configuration` while the three collaborator assignments were field-to-field self-assignments (no such parameters exist on the ctor), and every caller of getInstance(...) passes `null, null, null`. A request with disambiguate=true therefore NPE'd at DatasetParser:276 before the internal short-circuit could help.

- DatasetParser constructor now obtains the existing @singleton via DatasetDisambiguator.getInstance(configuration.getDatastetConfiguration()), so #13's serverStatus guard does its job when entity-fishing is not configured.
- Add a warn-once AtomicBoolean guard at the disambiguate() call site mirroring the existing gluttonWarningLogged pattern, so any future null disambiguator path also degrades to a single WARN instead of crashing the request.

Result: one WARN at startup ("entity-fishing host not configured, dataset disambiguation will be skipped"), no stacktrace, requests return 200.

https://claude.ai/code/session_014ZyP9j7nprZKKtJMtbWmvA

Co-authored-by: Claude <noreply@anthropic.com>
4 tasks
Summary
Supersedes #12. Both entity-fishing (disambiguation) and Glutton (reference consolidation) are optional external services. When unreachable or unconfigured, datastet should warn clearly and continue, not log ERROR stacktraces or throw `GrobidException`.

Entity-fishing (`DatasetDisambiguator`)
- Single `WARN` at startup when host is blank or unreachable (not repeated per request)
- All runtime failures → `WARN` with short reason, not `ERROR`/stacktrace
- `disambiguate()` short-circuits with `!serverStatus` guard (avoids wasted HTTP calls)
- Replaced all `e.printStackTrace()` with `LOGGER.warn()`

Glutton (`DatasetParser`)
- Gate all 3 `Consolidation.getInstance()` call sites with the `gluttonHost` blank check (previously only 1 site was gated; the other 2 in `processPDF` would crash with `GrobidException` if glutton was unreachable)
- When `gluttonHost` unset: `WARN` once per JVM (via `AtomicBoolean` flag) that consolidation will be skipped, not on every request
- When glutton call throws: `WARN` per-request instead of `GrobidException` (which was crashing the whole request)

Result
Clean deployment logs when optional services are absent, but the user is still informed what was skipped and why.
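The entity-fishing behaviour listed in the summary (one startup warning, then a `!serverStatus` short-circuit) might look roughly like the sketch below. It is an assumption-laden illustration: `EntityFishingGuardSketch`, the injected `ping` predicate, the `httpCalls` counter, the "not reachable" message, and the fake disambiguation result are hypothetical and stand in for the real HTTP client and logger.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch: check the optional service once, then short-circuit.
class EntityFishingGuardSketch {
    final List<String> warnings = new ArrayList<>(); // stand-in for a logger
    int httpCalls = 0;                               // counts simulated remote calls
    private final boolean serverStatus;

    EntityFishingGuardSketch(String host, Predicate<String> ping) {
        boolean up = false;
        if (host == null || host.trim().isEmpty()) {
            warnings.add("entity-fishing host not configured, dataset disambiguation will be skipped");
        } else if (!ping.test(host)) {
            warnings.add("entity-fishing not reachable at " + host
                    + ", dataset disambiguation will be skipped");
        } else {
            up = true;
        }
        this.serverStatus = up;
    }

    List<String> disambiguate(List<String> mentions) {
        // Guard: no remote call and no further logging when the service is down.
        if (!serverStatus) {
            return mentions;
        }
        httpCalls++; // a real implementation would issue the HTTP request here
        List<String> result = new ArrayList<>();
        for (String mention : mentions) {
            result.add(mention + " [disambiguated]");
        }
        return result;
    }

    public static void main(String[] args) {
        EntityFishingGuardSketch guard = new EntityFishingGuardSketch("", host -> true);
        guard.disambiguate(List.of("dataset A"));
        guard.disambiguate(List.of("dataset B"));
        System.out.println(guard.warnings.size() + " warning(s), " + guard.httpCalls + " HTTP call(s)");
    }
}
```

Because the status is decided once in the constructor, repeated requests against an unconfigured service cost neither a network round trip nor an extra log line.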
Test plan
- `./gradlew test` passes

https://claude.ai/code/session_018EBZhK2RtGtsvN4E1rp2tF