Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SOLR-16296: Load elevate.xml in a more secure way #962

Merged
merged 34 commits into from Sep 7, 2022

Conversation

heythm
Copy link
Contributor

@heythm heythm commented Aug 4, 2022

https://issues.apache.org/jira/browse/SOLR-16296

This removes the use of XmlConfigFile by QueryElevationComponent.

Comment on lines 384 to 385
inputStream = core.getResourceLoader().openResource(configFileName);
xmlDocument = SafeXMLParsing.parseUntrustedXML(log, inputStream);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I completely overlooked something (my bad). I was initially concerned about using XmlConfigFile because it did not use Solr's SafeXMLParsing utility, and I could also see it enabling some XInclude & entity resolver stuff, which concerned me in its use for elevate.xml. But this doesn't mean it's not "safe". Upon further inspection, it appears to be doing some of the same stuff that SafeXMLParsing does in the parseConfigXML method, which is relatively safe as it doesn't escape the resource loader. Do you see @HaythemKh ? If you agree, do you think XmlConfigFile could be modified to call SafeXMLParsing.parseConfigXml so that the safe-ness is clear? If not, we could at least leave a comment there.

At this place in QEC, we could replace these two lines with SafeXMLParsing.parseConfigXml and we would in fact continue to support these somewhat exotic XML features in a "safe" way.

Copy link
Contributor Author

@heythm heythm Aug 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • About modifying XmlConfigFile with SafeXMLParsing#parseConfigXML; It is indeed possible to apply these changes. However, in XmlConfigFile, we authorize the non-availability of SystemId so as not to activate xinclude for this purpose. Unlike SafeXMLParsing#parseConfigXML where we enable it by default. And, XmlConfigFile is used for reading Solr config files (see SolrConfig#readXml). would that have some impact?
  • SafeXMLParsing#parseConfigXML enables xinclude by default, whereas SafeXMLParsing#parseUntrustedXML doesn't. It doesn't seem very "safe", is it? I think if we don't need to activate it we don't have to.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@uschindler -- you added the SafeXMLParsing library. My instinct is use use the parseConfigXML method for the elevate.xml because it conveniently takes a SolrResourceLoader. It supports x:include too but so-what; right? It's scoped to the SRL so I don't see how it could be a problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, i think that's the way to go. parseConfigXML is for stuff that is part of Solr's config. parseUntrustedXML is for stuff coming over the network like a POSTed document.
As all Solr config files supoort xinclude, it should also be supported for elevantion.xml

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • About modifying XmlConfigFile with SafeXMLParsing#parseConfigXML; It is indeed possible to apply these changes. However, in XmlConfigFile, we authorize the non-availability of SystemId so as not to activate xinclude for this purpose. Unlike SafeXMLParsing#parseConfigXML where we enable it by default. And, XmlConfigFile is used for reading Solr config files (see SolrConfig#readXml). would that have some impact?

We have a valid system id because the ResourceLoader provides one. XInclude will then go through SystemIdResolver provided.

  • SafeXMLParsing#parseConfigXML enables xinclude by default, whereas SafeXMLParsing#parseUntrustedXML doesn't. It doesn't seem very "safe", is it? I think if we don't need to activate it we don't have to.

If it is a config file, xinclude should be allowed, as this is the standard for SOlr config files. There's also no security risk in this because the xinclude only allows to load files from resource loader and not somewhere outside (it uses SystemIdResolver to resolve includes, see

throw new IOException(
"Cannot resolve absolute systemIDs / external entities (only relative paths work): "
+ systemId);
).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course if the elevate.xml is coming over the network as part of the request, then it should be parsed with untrusted.

I am not familar with the code, but the rule for config files is:

  • if the config file is XML in ResourceLoader, it should use SafeXMLParsing#parseConfigXML with xinclude enabled for consistency
  • if the elevation file is passed as part of request body, then parse it using the request input stream using SafeXMLParsing#parseUntrustedXML

I'd tune XmlConfigFile to use those APIs. Back at that time it was too much work to me, so I left that one out. I just added the SystemIdResolver.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @uschindler for the detailed context!

uschindler
uschindler previously approved these changes Aug 27, 2022
solr/CHANGES.txt Outdated Show resolved Hide resolved
solr/core/src/java/org/apache/solr/core/XmlConfigFile.java Outdated Show resolved Hide resolved
@uschindler uschindler dismissed their stale review August 29, 2022 17:45

updated code, needs new review.

assertThat(
config.getConfigSetBaseDirectory().toAbsolutePath(),
is(Paths.get("/path/to/solr/home/configsets").toAbsolutePath()));

NodeConfig absConfig =
SolrXmlConfig.fromString(
Copy link
Contributor Author

@heythm heythm Aug 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we still need a XmlConfigFile constructor taking an inputstream as input

@heythm
Copy link
Contributor Author

heythm commented Aug 29, 2022

I have done some more local refactorings but I'm thinking about either keeping an XmlConfigFile constructor taking an inputstream as input or continuing the refactorings.

Copy link
Contributor

@dsmiley dsmiley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have done some more local refactorings but I'm thinking about either keeping an XmlConfigFile constructor taking an inputstream as input or continuing the refactorings.

If the PR doesn't get out of hand too much (i.e. increases scope a lot). Your latest change definitely did but maybe it was WIP/draft.

Copy link
Contributor

@dsmiley dsmiley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for scaling back the changes to just a few files. Tests pass for me; I think this is ready to merge.

Refactorings to simplify the unwieldily constructor parameters of XmlConfigFile can happen in follow-on work. And it's inconsistent/weird that XmlConfigFile only sets the zkVersion field when it's passed a fileSupplier in the constructor.

solr/CHANGES.txt Outdated Show resolved Hide resolved
@dsmiley
Copy link
Contributor

dsmiley commented Sep 2, 2022

Waiting for some input from @noblepaul to deconflict his recent PRs.

@dsmiley
Copy link
Contributor

dsmiley commented Sep 6, 2022

@noblepaul asked:

BTW can you please take a look at how the work done in #962 can be applied to the other XML files as well

Do you mean, to not use XmlConfigFile at all? It's on a case-by-case basis what work is involved there. I think your XPath removing work could have incorporated the dis-use of XmlConfigFile because the former logic used many methods of XmlConfigFile whereas afterwards it's only used to pass along the XML Document. At that point, there's little need of XmlConfigFile; it could be replaced with some utility method on, say, DOMConfigNode that takes some of the same parameters of XmlConfigFile's constructors. Any way, we could have a new JIRA issue for exactly that.

@heythm
Copy link
Contributor Author

heythm commented Sep 6, 2022

Update: I will limit the scope of this work to only removing the use of XmlConfigFile from QEC following up on this discussion. And once that is merged I will open a new PR for the removal of the use of XPath.

@epugh
Copy link
Contributor

epugh commented Sep 6, 2022

Thanks for your persistence on this @heythm....

@heythm
Copy link
Contributor Author

heythm commented Sep 6, 2022

@noblepaul - I want your opinion on merging this PR? This now only includes not using XmlConfigFile in QEC, but also a small cleanup of XmlConfigFile/SafeXMLParsing for handling untrusted/trusted files.

Even if we want to delete XmlConfigFile in the long run, it wouldn't hurt not to lose those changes.

@dsmiley dsmiley merged commit 66c784f into apache:main Sep 7, 2022
@heythm heythm deleted the hkhiri/elevate branch September 7, 2022 16:40
dsmiley added a commit that referenced this pull request Sep 7, 2022
…ponent (#962)

Really a refactoring; no change in behavior or actual safety.

Co-authored-by: David Smiley <dsmiley@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants