Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SOLR-14915: Prometheus-exporter does not depend on Solr-core any longer #1972

Merged
merged 12 commits into from Nov 27, 2020

Conversation

dsmiley
Copy link
Contributor

@dsmiley dsmiley commented Oct 11, 2020

And enhanced gradle config to operate as an application
https://issues.apache.org/jira/browse/SOLR-14915

It's cool that I can now run this thing via gw run :-)
The Java-Application plugin for Gradle has other nice distribution facilities that render our attempts to hand-edit startup scripts obsolete. However, using it would mean a much larger Solr distribution as it'd have duplicated libs that Solr has. Any way, it's nice to leave here for people who might want to activate it manually.

As-is there is something not quite right. I modified the solr-exporter bash script to not add all of the Solr server libs to the classpath. However, the assembly task only populates a small number of dependencies into the "lib" here. In particular I don't see transitive dependencies such as Jackson (used by jackson-jq). @dweiss do you have suggestions here? Alternatively, I could remove my changes to solr-exporter scripts and accept that they are going to add a ton more dependencies than actually needed. That's okay.

lucene-solr/solr/packaging/build/solr-9.0.0-SNAPSHOT/contrib/prometheus-exporter

├── bin
│   ├── main
│   ├── solr-exporter
│   ├── solr-exporter.cmd
│   └── test
├── conf
│   ├── grafana-solr-dashboard.json
│   └── solr-exporter-config.xml
└── lib
    ├── argparse4j-0.8.1.jar
    ├── disruptor-3.4.2.jar
    ├── jackson-jq-0.0.8.jar
    ├── simpleclient-0.2.0.jar
    ├── simpleclient_common-0.2.0.jar
    └── simpleclient_httpserver-0.2.0.jar

Enhanced gradle config to operate as an application
@dsmiley dsmiley requested a review from dweiss October 11, 2020 05:09
log.warn("{} XML parser doesn't support XInclude option", file);
}

final Document document = dbf.newDocumentBuilder().parse(file.toFile());
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

XXE_DOCUMENT: The use of DocumentBuilder.parse(...) (DocumentBuilder) is vulnerable to XML External Entity attacks (details)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged, and doesn't matter when the XML file is a local configuration file as opposed to external input.

@dweiss
Copy link
Contributor

dweiss commented Oct 12, 2020

Well, yes. I know it's tempting to use gradle's "ready-to-use" plugins but the situation is different when you have a stand-alone project and a multi-module aggregate. Let me explain.

gradle plugins add tons of stuff (configurations, dependencies, tasks, conventions) and hide what's going on behind the scenes. This has pros and cons. While you can use the automatically-added 'run' task, the subproject is also populated with stuff that will increase the overall build time (like the stand-alone distributions on assemble) and may conflict with other parts of the multi-module build.

For this reason I typically prefer to be explicit about packaging and distribution aspects, even if some things need to be done manually. My preference would be to just keep using java-library and add corresponding entries to the JAR's manifest; the "run" task is still achievable with plain java-library - Lucene's Luke configuration has an example of how this can be done. Please do as you please though - I don't even know what this particular contrib is :)

As for your question why not all JARs are copied - this is a larger question that applies to the current Solr packaging. I tried to emulate the way ant worked (in packaging.gradle) - this means copying the module's "unique" set of JARs to avoid duplication. This isn't easy and makes configuration management fairly hairy. I think stand-alone tools like prometheus-exporter or luke should have full executable distribution in the binary release - even if this means duplicating some JARs (including Lucene JARs). This makes it easier to reason about what they need, makes the configuration simpler... at the cost of increased distribution size.

The alternative is to hand-edit the build and include what's needed or hand-edit the scripts to point at the right JARs.

I'm really not sure which way is right (I believe the full set of artifacts required to run a stand-alone tool is better but it's my personal opinion).

@dsmiley
Copy link
Contributor Author

dsmiley commented Nov 15, 2020

Based on your advise, I put back the "java-library" plugin in place of the application one, but added a "run" task.

The prometheus-exporter is fundamentally dissimilar to the other contribs. The others have Solr plugins -- they run inside Solr, and thus don't need "lib" directories containing any of the libs that are in Solr. But I don't think the prometheus-exporter should have such a limited list. I see gradle/solr/packaging.gradle and it treats all contribs as the same. So I added more assemblePackage code here that ends up adding other libs like jackson because it does not exclude all so-called "Solr platform libs". Also, I ensured all server/lib/ext JARs (which are all logging) don't get copied to any contrib's libs.

Copy link
Contributor Author

@dsmiley dsmiley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI:

├── bin
│   ├── solr-exporter
│   └── solr-exporter.cmd
├── conf
│   ├── grafana-solr-dashboard.json
│   ├── log4j2.xml
│   └── solr-exporter-config.xml
└── lib
    ├── argparse4j-0.8.1.jar
    ├── caffeine-2.8.4.jar
    ├── jackson-annotations-2.10.1.jar
    ├── jackson-core-2.10.1.jar
    ├── jackson-databind-2.10.1.jar
    ├── jackson-jq-0.0.8.jar
    ├── simpleclient-0.2.0.jar
    ├── simpleclient_common-0.2.0.jar
    └── simpleclient_httpserver-0.2.0.jar

3 directories, 14 files

@@ -76,7 +76,8 @@ configure(allprojects.findAll {project -> project.path.startsWith(":solr:contrib
return true
}
}
return externalLibs - configurations.solrPlatformLibs
// libExt has logging libs, which we don't want. Lets users decide what they want.
return externalLibs - configurations.solrPlatformLibs - project(':solr:server').configurations.getByName('libExt')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be considered hacky to reference a configuration of a specific sub-project. Maybe libExt could be declared more top-level and be called "loggingLibs" or something.

@@ -123,8 +111,6 @@ else
GC_TUNE="$GC_TUNE"
fi

EXTRA_JVM_ARGUMENTS="-Dlog4j.configurationFile=file:"$BASEDIR"/../../server/resources/log4j2-console.xml"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead, /conf is put on the classpath, and thus "log4j2.xml" is loaded automatically based on Log4j2's automatic resolution.


testImplementation ('org.apache.httpcomponents:httpcore')
testImplementation ('org.eclipse.jetty:jetty-servlet')
runtimeOnly 'org.apache.logging.log4j:log4j-api'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logging dependencies in Java is a pain. In a separate issue/PR, we should explore this approach: https://blog.gradle.org/addressing-logging-complexity-capabilities

@@ -68,7 +66,7 @@
private static final String[] ARG_CONFIG_FLAGS = {"-f", "--config-file"};
private static final String ARG_CONFIG_METAVAR = "CONFIG";
private static final String ARG_CONFIG_DEST = "configFile";
private static final String ARG_CONFIG_DEFAULT = "solr-exporter-config.xml";
private static final String ARG_CONFIG_DEFAULT = "conf/solr-exporter-config.xml";
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@HoustonPutman you removed the conf/ and I'm putting it back. I'm not sure why it worked with it gone. It may be related to other changes in this PR that it's needed again. I tested that this works here both with gw run and via executing normally from the built assembly. I didn't test Docker yet.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the reason I removed the conf/ is because I was reading the file off of the classpath. That way you can run solr-exporter from anywhere, not just contrib/prometheus-exporter.

The reason it doesn't work anymore is that we aren't using the SolrResourceLoader in this PR. I have a fix made that should allow for both the default config to be read off of the classpath or other configs specified via a file location. I can push it to this branch if you don't mind.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure; push if you want. I don't think it's important to read off the classpath so long as it reads it from somewhere... but if it works, that's nice.

@dsmiley
Copy link
Contributor Author

dsmiley commented Nov 20, 2020

This has been ready I think; I'll merge Monday.

return MetricsConfiguration.from(config);
private static MetricsConfiguration loadMetricsConfiguration(String configPath) {
try {
return MetricsConfiguration.from(configPath);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PATH_TRAVERSAL_IN: This API (java/io/File.(Ljava/lang/String;)V) reads a file whose location might be specified by user input (details)

@HoustonPutman
Copy link
Contributor

I didn't know there was a stigma around File, but made the change to use Path and Files instead.

@dsmiley dsmiley merged commit 021de9f into apache:master Nov 27, 2020
@dsmiley dsmiley deleted the solr-14915-prometheusDeps branch November 27, 2020 20:08
ctargett pushed a commit to ctargett/lucene-solr that referenced this pull request Dec 3, 2020
…e#1972)

* Reduced dependencies from Solr server down to just SolrJ.  Don't add WEB-INF/lib.
* Was missing some dependencies in lib/; now has all except SolrJ & logging.
* Can run via gradle, "gradlew run"
* Has own log4j2.xml now

Has own CHANGES.md now.
epugh pushed a commit to epugh/lucene-solr-1 that referenced this pull request Jan 15, 2021
…e#1972)

* Reduced dependencies from Solr server down to just SolrJ.  Don't add WEB-INF/lib.
* Was missing some dependencies in lib/; now has all except SolrJ & logging.
* Can run via gradle, "gradlew run"
* Has own log4j2.xml now

Has own CHANGES.md now.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants