-
Notifications
You must be signed in to change notification settings - Fork 509
METRON-2149: Shaded jar classifier is not consistent #1436
Conversation
I did a crude grep through our Stellar function classes looking for non-java, non-metron dependencies, and this is what I came up with. This is a hit-list of functions that are probably worth a test of some sort or another to verify we haven't broken anything. It might be worth adding a more robust automated test at some point that we can spin up as a topology and include in release validation. EDIT - even this list is not sufficient because in instances of more complex functions we don't have a direct dependency on the 3rd party jar. E.g. HLLP uses helper classes, and those don't show up here.
Just the files without the import context (there are currently 43 that I've found)
|
As I worked through resolving the final failing test I realized there is a use case that is not properly handled by the original changes in this PR. There are 2 versions of guava required by different classes (Stellar and HBase testing utility) so we need a way to relocate one of them. It's not possible to do this without depending on a shaded module because transitive dependencies have already been resolved (meaning only 1 version remains) by the time the final shaded jar is built. Relocating at this point just relocates the single remaining version. At this point I want to summarize my findings and present some options. Here are the requirements I see:
Using classifiers on all modules solves 1 and 2 but does not support 3 as described above. We currently support 3 but not 1 and 2 because lower level modules (like metron-common) do not use a classifier. This means other modules that depend on it inherit relocated classes. The problem is that transitive dependencies from these modules overwrite other dependencies, making it harder to determine which versions end up in the final uber jar. To get to a point where we can satisfy all 3 requirements above, I can think of a couple options. Both of these options are based on the assumption that most class version conflicts involve Stellar classes. Stellar contains most of our business logic and contains a long list of dependencies, including several that commonly conflict with other projects (guava, log4j, jackson, etc). The idea behind both of these options involves isolating Stellar from the rest of the project. Here they are:
I tested both options and was able to get both working for the these use cases:
This is all fairly complex so if anything is not clear I can elaborate. Are there other options I'm not thinking of? Thoughts? |
I really like your 3 proposed main requirements - those sound reasonable to me, with varying degrees of importance depending on how you look at them. I'd say that 3 is the most important in practice, with 1 and 2 supporting that goal. First big question I have is what about Stellar functions in metron-analytics? As I pointed out in this comment those are not core Stellar functions, but I suspect there's potential trouble there as well. Other thoughts Would it make sense to shade and relocate all of the core libs for Stellar and mark all others as scope
The catch with still using an uber jar (ie dep with the "uber" jar classifier) is that modules depending on it would still need to exclude the transitive dependencies that were relocated. The short of it being that the uber jar would have the relocated packages (e.g. org.apache.metron.jackson), but the transitive dependency still carries through as com.fasterxml.jackson when our uber jar is referenced as a dep. This is effectively very very similar to what you outlined and solved in approach 2 with the separate uber jar deploy, but should also cover the broader project classpath for all cases. We spent multiple hours working on this together offline, so I won't rehash all of that here - you've done a reasonable job of summarizing a lot of the main bits. Thanks @merrimanr! |
Thanks for the feedback. A module that includes Stellar functions is not necessarily a problem. The reason I focused on stellar-common is because it contains a substantial amount of custom code with specific dependency versions that often do not line up with versions required by other 3rd party dependencies (HBase, etc). Moving all Stellar functions to stellar-common is an option but I don't think it's necessary right now. We may want to do this for some classes but it can be done on a case-by-case basis as conflicts arise. I think the important thing is to isolate stellar-common now since we know there are several conflicts and have a strategy in place for when conflicts in other modules happen. The more extreme option would be to move all Stellar code to their own separate modules but I think that's overkill. I agree with your approach of relocating core libraries and marking others as |
This sounds sensible to me. I think the challenge with any approach we take here is that it's going to require us to effectively get it 80% of the way done before we find out if there are any gotchas we didn't think of. |
Due the the extensive testing that will be required for this, I am planning on closing and reopening against the METRON-2088-support-hdp-3.1 feature branch. These changes should directly help with that effort and take advantage of the testing that will be required. |
I think you are going to have to look at the zepplin stellar stuff, we have to add the jar's to the zeppelin configuration by hand etc, so you may break things there |
Thanks @ottobackwards. I am developing a test plan right now and will be sure to include this. |
It might just be the instructions or screenshots, but worth a look. |
The latest commits implement the change I proposed above (option 2). For clarity's sake, here it is again:
I have finished some fairly exhaustive testing and will update the original PR description with that test plan next. This approach was fairly straightforward with the exception of an integration test in |
|
I am getting a lot of flakey tests results.
EDIT - |
Any diagnostics or further detail on this? We don't appear to be having this failure in Travis, at least in the latest run. |
@MohanDV and I spent some time testing this PR on a multi node cluster. We were able to run the following and all tests passed :
+1 based on the above. |
@merrimanr How did you get through the instructions for https://github.com/apache/metron/tree/master/use-cases/typosquat_detection? I just ran up to the Stellar Bloom Filter test and am running into a stacktrace from the
I took a spin back through #1399 and it looks like the README documentation about the config options, neither local to the Stellar function nor globally, ever made it into a PR. Added a bug to track it here https://issues.apache.org/jira/browse/METRON-2228https://issues.apache.org/jira/browse/METRON-2228 |
Note for anyone else testing the typosquat usecase. The only way I was able to get this working was to add the following property to global.json.
The file size of the summarized object is larger than the default configured maximum of 1MB.
|
Increasing the OBJECT_CACHE size is in the Typosquat testing instructions of this PR. There is a curl command that sets it higher than the bloom filter object. |
Based on getting through that issue in the typosquat use case (and confirming we didn't introduce a new regression of some sort on this PR), and the added full range test comfort from @anandsubbu and @MohanDV, I'm giving this my +1. |
Contributor Comments
This PR updates the appropriate Maven pom.xml files to use a classifier when building shaded jars. After this change all shaded jars will follow the pattern
${project.artifactId}-${project.version}-uber.jar
whereas before some jars would follow this pattern while others would follow${project.artifactId}-${project.version}.jar
.I believe this will provide a substantial benefit to the project because it will alleviate 2 significant issues. When a module that uses the shaded plugin without a classifier is added to another module as a dependency:
These issues make it extremely difficult to troubleshoot and resolve classpath version problems.
Changes Included
stellar-common
maven dependencies were changed toprovided
metron-data-management
pom file needed a plugin to include stellar as an external dependencyTesting
Since this PR effectively changes the contents and ordering of dependencies in our jars, exhaustive regressing testing is needed. I have started this and will update these instructions I get think of more and get further along. Please feel free to add other tests. These instructions have all been performed on full dev.
Smoke Test
This is a quick test to verify things are working normally:
Typosquat Test
The use case is detailed here.
I had to perform these extra steps:
wget
withyum install -y wget
OBJECT_GET
cache size in the global config:REST Test
I issued several requests against REST and verified there were no errors:
Profiler Test
I followed the instructions here and verified everything worked correctly.
Zeppelin Test
I followed the instructions here and verified everything worked correctly.
Pull Request Checklist
Thank you for submitting a contribution to Apache Metron.
Please refer to our Development Guidelines for the complete guide to follow for contributions.
Please refer also to our Build Verification Guidelines for complete smoke testing guides.
In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following:
For all changes:
For code changes:
Have you included steps to reproduce the behavior or problem that is being changed or addressed?
Have you included steps or a guide to how the change may be verified and tested manually?
Have you ensured that the full suite of tests and checks have been executed in the root metron folder via:
Have you written or updated unit tests and or integration tests to verify your changes?
If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?
For documentation related changes:
Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and the verify changes via
site-book/target/site/index.html
:Have you ensured that any documentation diagrams have been updated, along with their source files, using draw.io? See Metron Development Guidelines for instructions.
Note:
Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.
It is also recommended that travis-ci is set up for your personal repository such that your branches are built there before submitting a pull request.