Fix DOCTYPE is disallowed error in NCBI taxonomy parsing #623
Open
vagisha wants to merge 1 commit into release26.3-SNAPSHOT from
…tils.getScientificNames threw PxException on NCBI eSummary lookup because the response begins with a <!DOCTYPE> declaration that XmlBeansUtil.DOCUMENT_BUILDER_FACTORY rejects. Switch to the new DOCUMENT_BUILDER_FACTORY_ALLOWING_DOCTYPE.
- Added unit test
private static Map<Integer, String> parseScientificNames(InputStream in)
        throws ParserConfigurationException, SAXException, IOException
{
    Document doc = getDocumentBuilder().parse(in);
We may need to suppress this as a false positive if it's still showing up post-merge.
labkey-jeckels approved these changes Apr 8, 2026
labkey-jeckels pushed a commit to LabKey/platform that referenced this pull request Apr 9, 2026
#### Rationale

`XmlBeansUtil.DOCUMENT_BUILDER_FACTORY` sets `disallow-doctype-decl=true` for XXE protection, which causes parsers to fail on any XML with a `<!DOCTYPE>` declaration. This is a problem for the Panorama Public code that parses NCBI's `esummary.fcgi` response, which begins with `<!DOCTYPE eSummaryResult PUBLIC ... esummary-v1.dtd>`.

#### Related Pull Requests

- LabKey/MacCossLabModules#605
- LabKey/MacCossLabModules#623

#### Changes

- Added `DOCUMENT_BUILDER_FACTORY_ALLOWING_DOCTYPE` to `XmlBeansUtil`, mirroring the existing `SAX_PARSER_FACTORY_ALLOWING_DOCTYPE`. The DOCTYPE declaration is permitted, but every other XXE mitigation stays in place.
- Extracted a private `documentBuilderFactory(boolean allowDocType)` helper, mirroring the existing `saxParserFactory(boolean)` helper.
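
For readers who want to see the idea in isolation, here is a minimal sketch of a factory helper that takes an `allowDocType` flag. It is not the code from `XmlBeansUtil`: the class name `DoctypeFactorySketch`, the method `rootElementName`, the exact set of feature flags, and the placeholder DOCTYPE identifiers are all assumptions made for illustration. Only the `disallow-doctype-decl` feature and the `documentBuilderFactory(boolean allowDocType)` signature come from the description above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXParseException;

// Hypothetical illustration class; not part of LabKey.
public class DoctypeFactorySketch
{
    // Assumed set of hardening flags. The real helper may configure more or different ones.
    static DocumentBuilderFactory documentBuilderFactory(boolean allowDocType)
            throws ParserConfigurationException
    {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // The only flag that differs between the strict and the DOCTYPE-tolerant factory.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", !allowDocType);
        // The remaining mitigations stay on in both cases.
        factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
        factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory;
    }

    // Parses the XML string and returns the name of its root element.
    static String rootElementName(String xml, boolean allowDocType) throws Exception
    {
        Document doc = documentBuilderFactory(allowDocType)
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement().getNodeName();
    }

    public static void main(String[] args) throws Exception
    {
        // Placeholder public and system identifiers, not the real NCBI ones.
        String xml = "<!DOCTYPE eSummaryResult PUBLIC \"-//EXAMPLE//DTD placeholder//EN\" "
                + "\"http://example.invalid/placeholder.dtd\"><eSummaryResult/>";

        // The tolerant factory parses the document without fetching the DTD.
        System.out.println(rootElementName(xml, true));

        // The strict factory rejects the same input at the DOCTYPE declaration.
        try
        {
            rootElementName(xml, false);
        }
        catch (SAXParseException e)
        {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Keeping a single helper with a boolean flag means the strict and tolerant factories cannot drift apart on the other settings, which appears to be the motivation for mirroring `saxParserFactory(boolean)`.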
Rationale
NcbiUtils.getScientificNames throws an exception on NCBI eSummary lookup because the response begins with a <!DOCTYPE> declaration that XmlBeansUtil.DOCUMENT_BUILDER_FACTORY rejects. The exception:
SAXParseException: DOCTYPE is disallowed when the feature "http://apache.org/xml/features/disallow-doctype-decl" set to true
Related Pull Requests
Changes
XmlBeansUtil.DOCUMENT_BUILDER_FACTORY_ALLOWING_DOCTYPE