New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SOLR-16296: Load elevate.xml in a more secure way #962
Conversation
solr/core/src/java/org/apache/solr/handler/component/QueryElevationComponent.java
Outdated
Show resolved
Hide resolved
solr/core/src/java/org/apache/solr/handler/component/QueryElevationComponent.java
Outdated
Show resolved
Hide resolved
solr/core/src/java/org/apache/solr/handler/component/QueryElevationComponent.java
Outdated
Show resolved
Hide resolved
solr/core/src/java/org/apache/solr/handler/component/QueryElevationComponent.java
Outdated
Show resolved
Hide resolved
solr/core/src/java/org/apache/solr/handler/component/QueryElevationComponent.java
Outdated
Show resolved
Hide resolved
solr/core/src/java/org/apache/solr/handler/component/QueryElevationComponent.java
Outdated
Show resolved
Hide resolved
solr/core/src/java/org/apache/solr/handler/component/QueryElevationComponent.java
Show resolved
Hide resolved
inputStream = core.getResourceLoader().openResource(configFileName); | ||
xmlDocument = SafeXMLParsing.parseUntrustedXML(log, inputStream); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I completely overlooked something (my bad). I was initially concerned about using XmlConfigFile because it did not use Solr's SafeXMLParsing utility, and I could also see it enabling some XInclude & entity resolver stuff, which concerned me in its use for elevate.xml. But this doesn't mean it's not "safe". Upon further inspection, it appears to be doing some of the same stuff that SafeXMLParsing does in the parseConfigXML
method, which is relatively safe as it doesn't escape the resource loader. Do you see @HaythemKh ? If you agree, do you think XmlConfigFile could be modified to call SafeXMLParsing.parseConfigXml
so that the safe-ness is clear? If not, we could at least leave a comment there.
At this place in QEC, we could replace these two lines with SafeXMLParsing.parseConfigXml
and we would in fact continue to support these somewhat exotic XML features in a "safe" way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- About modifying XmlConfigFile with
SafeXMLParsing#parseConfigXML
; It is indeed possible to apply these changes. However, inXmlConfigFile
, we authorize the non-availability of SystemId so as not to activate xinclude for this purpose. UnlikeSafeXMLParsing#parseConfigXML
where we enable it by default. And, XmlConfigFile is used for reading Solr config files (seeSolrConfig#readXml
). would that have some impact? SafeXMLParsing#parseConfigXML
enables xinclude by default, whereasSafeXMLParsing#parseUntrustedXML
doesn't. It doesn't seem very "safe", is it? I think if we don't need to activate it we don't have to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@uschindler -- you added the SafeXMLParsing
library. My instinct is use use the parseConfigXML
method for the elevate.xml
because it conveniently takes a SolrResourceLoader. It supports x:include too but so-what; right? It's scoped to the SRL so I don't see how it could be a problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, i think that's the way to go. parseConfigXML is for stuff that is part of Solr's config. parseUntrustedXML is for stuff coming over the network like a POSTed document.
As all Solr config files supoort xinclude, it should also be supported for elevantion.xml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- About modifying XmlConfigFile with
SafeXMLParsing#parseConfigXML
; It is indeed possible to apply these changes. However, inXmlConfigFile
, we authorize the non-availability of SystemId so as not to activate xinclude for this purpose. UnlikeSafeXMLParsing#parseConfigXML
where we enable it by default. And, XmlConfigFile is used for reading Solr config files (seeSolrConfig#readXml
). would that have some impact?
We have a valid system id because the ResourceLoader provides one. XInclude will then go through SystemIdResolver provided.
SafeXMLParsing#parseConfigXML
enables xinclude by default, whereasSafeXMLParsing#parseUntrustedXML
doesn't. It doesn't seem very "safe", is it? I think if we don't need to activate it we don't have to.
If it is a config file, xinclude should be allowed, as this is the standard for SOlr config files. There's also no security risk in this because the xinclude only allows to load files from resource loader and not somewhere outside (it uses SystemIdResolver to resolve includes, see
solr/solr/core/src/java/org/apache/solr/util/SystemIdResolver.java
Lines 154 to 156 in c99af20
throw new IOException( | |
"Cannot resolve absolute systemIDs / external entities (only relative paths work): " | |
+ systemId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course if the elevate.xml is coming over the network as part of the request, then it should be parsed with untrusted.
I am not familar with the code, but the rule for config files is:
- if the config file is XML in ResourceLoader, it should use SafeXMLParsing#parseConfigXML with xinclude enabled for consistency
- if the elevation file is passed as part of request body, then parse it using the request input stream using SafeXMLParsing#parseUntrustedXML
I'd tune XmlConfigFile to use those APIs. Back at that time it was too much work to me, so I left that one out. I just added the SystemIdResolver.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @uschindler for the detailed context!
solr/core/src/java/org/apache/solr/schema/IndexSchemaFactory.java
Outdated
Show resolved
Hide resolved
assertThat( | ||
config.getConfigSetBaseDirectory().toAbsolutePath(), | ||
is(Paths.get("/path/to/solr/home/configsets").toAbsolutePath())); | ||
|
||
NodeConfig absConfig = | ||
SolrXmlConfig.fromString( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we still need a XmlConfigFile constructor taking an inputstream as input
I have done some more local refactorings but I'm thinking about either keeping an XmlConfigFile constructor taking an inputstream as input or continuing the refactorings. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have done some more local refactorings but I'm thinking about either keeping an XmlConfigFile constructor taking an inputstream as input or continuing the refactorings.
If the PR doesn't get out of hand too much (i.e. increases scope a lot). Your latest change definitely did but maybe it was WIP/draft.
solr/test-framework/src/java/org/apache/solr/util/TestHarness.java
Outdated
Show resolved
Hide resolved
solr/core/src/test/org/apache/solr/metrics/reporters/SolrSlf4jReporterTest.java
Outdated
Show resolved
Hide resolved
solr/core/src/test/org/apache/solr/metrics/reporters/SolrGraphiteReporterTest.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for scaling back the changes to just a few files. Tests pass for me; I think this is ready to merge.
Refactorings to simplify the unwieldily constructor parameters of XmlConfigFile can happen in follow-on work. And it's inconsistent/weird that XmlConfigFile only sets the zkVersion
field when it's passed a fileSupplier in the constructor.
Waiting for some input from @noblepaul to deconflict his recent PRs. |
@noblepaul asked:
Do you mean, to not use XmlConfigFile at all? It's on a case-by-case basis what work is involved there. I think your XPath removing work could have incorporated the dis-use of XmlConfigFile because the former logic used many methods of XmlConfigFile whereas afterwards it's only used to pass along the XML Document. At that point, there's little need of XmlConfigFile; it could be replaced with some utility method on, say, DOMConfigNode that takes some of the same parameters of XmlConfigFile's constructors. Any way, we could have a new JIRA issue for exactly that. |
Update: I will limit the scope of this work to only removing the use of XmlConfigFile from QEC following up on this discussion. And once that is merged I will open a new PR for the removal of the use of XPath. |
Thanks for your persistence on this @heythm.... |
@noblepaul - I want your opinion on merging this PR? This now only includes not using XmlConfigFile in QEC, but also a small cleanup of XmlConfigFile/SafeXMLParsing for handling untrusted/trusted files. Even if we want to delete XmlConfigFile in the long run, it wouldn't hurt not to lose those changes. |
solr/core/src/java/org/apache/solr/handler/component/QueryElevationComponent.java
Show resolved
Hide resolved
…ponent (#962) Really a refactoring; no change in behavior or actual safety. Co-authored-by: David Smiley <dsmiley@apache.org>
https://issues.apache.org/jira/browse/SOLR-16296
This removes the use of XmlConfigFile by QueryElevationComponent.