Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MSHADE-366] - "Access denied" during 'minimizeJar' (supersedes #83) #104

Closed
wants to merge 2 commits into from

Conversation

kriegaex
Copy link
Contributor

@kriegaex kriegaex commented Jul 3, 2021

I wanted to avoid re-creating @JanMosigItemis's PR #83 and make sure he gets the recognition he deserves for taking initiative and creating a PR for the change I suggested in my comment in MSHADE-366. So I have committed on top of his changes, also retaining a refactored version of the new tests he created. Thank you, Jan!

I took the chance not only to fix actual the problem, but also to factor out two helper methods from removeServices, because that method was way too big to still be readable.

Supersedes #83.

@rmannibucau, please continue reviewing here. Let's get this thing merged ASAP.

Now ignoring directories when scanning the classpath for services.
@kriegaex kriegaex changed the title [MSHADE-366] - Improve #83 [MSHADE-366] - "Access denied" during 'minimizeJar' (supersedes #83) Jul 3, 2021
@kriegaex
Copy link
Contributor Author

kriegaex commented Jul 3, 2021

Hint for reviewers: If you wish to hide whitespace changes, simply do this:
image

- Simplify Jan's solution from apache#83 in order to use 'continue' instead of
  nested 'if-else'.
- Factor out two helper methods from 'removeServices', because that
  method was way too big to still be readable.
- DRY-refactor Jan's new test cases into one checking two conditions.
Copy link
Contributor

@rmannibucau rmannibucau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Warning looks more explicit than today so ok for me.

side note: a real fix would be to handle folders no, does not sound crazy?

@kriegaex
Copy link
Contributor Author

kriegaex commented Jul 3, 2021

side note: a real fix would be to handle folders no, does not sound crazy?

I agree, which is why I raised the question in the original PR and also CC in the Jira issue:

The more important question I have is: Is it necessary to do the same analysis and exclusion for used services during minification for files in a target directory on the classpath, which currently happens for JAR files, or is it the right thing to just ignore it? Is a classpath directory a valid use case here? It could very well be the case, because the current module's class files are usually part of the uber JAR if not explicitly excluded by a filter. So we would have to mimic the same logic here. I can try doing that, starting a fresh PR superseding this one. But I want a maintainer's qualified answer before I start potentially wasting time with a superfluous feature.

Also in the Jira issue, @JanMosigItemis replied:

I'd say, to make things more straight forward, we should do the "quick fix" for the warning and the (maybe) new logic in different PRs, so that we get fast results but do not mix things up here.

I wanted to respect his suggestion and still am. To expand this functionality to class file directories too, would still be a goal for the future. This is e.g. why when factoring out method scanServiceProviderConfigFile, I decided for it not to take a JarEntry parameter, but simply a BufferedReader, so later it can be reused for service loader config files found in file system directories. Probably, further refactoring to unify the two approaches is possible.

@Tibor17
Copy link
Contributor

Tibor17 commented Jul 19, 2021

@kriegaex
@rmannibucau
Is this work complete?

@kriegaex
Copy link
Contributor Author

@Tibor17, if you take time to read our discussion above, you will see that opinions are diverging about the scope of this PR and the connected Jira issue. While I think it is complete and a new issue should be created for what @rmannibucau wants, he argues that

  • functionality should be extended to cover handling directories and/or
  • here the existence of a directory in the classpath as such is a smell which should be fixed while building/filtering the cp, and not issuing a warning is covering up that problem.

Maybe you two guys can figure out together what you want to do here. Just please do not let this PR rot. Either add commits here (I allowed it in the PR settings) or create a new PR with what you believe is the real solution. Or merge it and simultaneously create a new issue, specifying what needs to be done to improve the situation further. Then nothing is covered up or forgotten. Just please give users a chance to not having to see this warning anymore. It gets on everyone's nerves, and users are afraid they made a mistake, which they didin't.

@rmannibucau
Copy link
Contributor

Well to be fair there is a clear bug which is about using https://github.com/apache/maven/blob/bb916d0784c7631866167928e4d0615df3317567/maven-core/src/main/java/org/apache/maven/project/MavenProject.java#L412 and not handling directories so I tend to think we should either use directories or ensure we use the .jar and not target/classes which tend to lead to not solve the original issue by a better error message but an actual support of the shade configuration.

@kriegaex
Copy link
Contributor Author

kriegaex commented Jul 20, 2021

@rmannibucau, I am not disagreeing about your assessment that it might be a bug. After all, I was the first person both here and in Jira to point out that probably something ought to be done with regard to handling directories the same way as JARs. The packaging type of something on the classpath should not determine how it is dealt with. But for now directories are simply not supported and users see an irritating warning instead of either better functionality or being left in peace. My little PR gives them peace for now. I never had the ambition to make this into a new feature myself.

If you can fix the bug or expand the functionality to cover directories, please by all means feel free to do so. If you cannot do it right away within the confines of this PR, because you are busy or need to think about it first, simply create a new issue and do it separately, letting this just be a first cosmetic iteration which stops harassing users with a cryptic warning which only causes support effort.

Neither did I introduce the bug of adding a directory to the classpath nor am I responsible for the historic decision to handle JARs, but not the own module's class directory in the first place. I replaced the cryptic warning by a log message reminding developers of this open issue, refactored a spaghetti code method into 3 smaller ones - not perfect, but better than before - and think we are done here. You said before, you do not mean to block this. So just don't. Accept a small improvement, create a new issue with exactly the task description you like, sign up to it and fix it as you see fit. I would welcome the new functionality, because like I said, I was the first one to ask for it. What happened to kaizen?

Sorry for reiterating, but it seems nobody cares about the messages I wrote yesterday, last week or last month. We are on the same side, we ultimately want the same. I just think that big chunks of work can be split into smaller ones. I am making nothing worse than before. You cannot seriously claim that the warning in the current release makes any kind of sense or that keeping it around provides any value to users. The warning is not there to provide value to the developers, being a visible reminder of a to-do list item. They have Jira and GitHub PRs in order to manage their work. That should not be user-facing.

@rmannibucau
Copy link
Contributor

@kriegaex just to be clear - assuming github comments looked ruder than it should - I don't blame you and understand the path you take but right now the PR is not just about changing the message error (just check the diff) and the work needed to handle directory requires kind of the same amount of code change, this is why I think we should skip this step which will not help users at the end functionally and just handle it properly. I'm fine doing it next month or so if you think it is too much work (but trust me, you already did more than needed with this PR ;)).
Also agree we don't need to handle JPMS services in these iterations.

@kriegaex
Copy link
Contributor Author

kriegaex commented Jul 20, 2021

My code changes are simple enough. Like I said several times before: I started with the one-line change, but then found the code too convoluted and difficult to understand, so I did the natural thing every developer should do: I refactored the code I touched a little bit according to the boyscout rule to "leave the camp gound behind a little bit cleaner" than I met it. I did not refactor excessively, just a little bit structurally. My refactoring is going to help you when you take the next step, like I explained at the end of #104 (comment). Why would you want to start over? You refactored one of my previous PRs before and I refactored your refactoring again. This worked beautifully. Starting over is just waste.

I think we should skip this step which will not help users at the end functionally

I still disagree. If 3.3.0 is released before you get around to do the next improvement - and there is always the chance that the release does not wait for you or you are simply too busy - at least they shall not be bothered by the cryptic and misleading message anymore. I cannot count anymore how many people asked me about this and I had to explain to them not to worry about it. As an Agile Coach, I am coaching developers regularly, and they are always puzzled by this message when using Shade. So yes, I think it provides user value, as humble an amount as it might be.


As for the future PR, I have a question in advance already, which also gives you one of the reasons why I did not tackle that change myself yet: Are there any Maven APIs which provide a unified view of classpath resources in a transparent way, i.e. a single way to handle directories and JARs with regard to iterating over and/or filtering class files and other resources? I did not want to re-invent the wheel. Not being a regular Maven contributor, I do not know the API, othwerwise I might have implemented the change myself already. Any hints, @rmannibucau?

@rmannibucau
Copy link
Contributor

Are there any Maven APIs which provide a unified view of classpath resources in a transparent way, i.e. a single way to handle directories and JARs with regard to iterating over and/or filtering class files and other resources?

Noone the shade plugin should use IMO since it is highly plugin dependent what will happen next. Indeed a visitor could be abstracted but it would bring more constraints than solution from my past experience.

I'll try to bring a PR up soon but to kind of answer you concern, I prefer to release no change in next release than this code which will be totally rewritten in the next minor to handle directories, gives more stability to source code which is not a bad thing for end users.

@kriegaex
Copy link
Contributor Author

Thanks to all reviewers to make it feel really worthwhile contributing here. Close the issue if you think you have to. I just don't get your reasoning, but thank God we don't have to agree. Like I said a minute ago in the other issue: You have the power to merge, I don't. At the end of the day, you win, I lose. Whatever makes you feel good.

@rmannibucau
Copy link
Contributor

@kriegaex there is no winner or loser there, goal is to make the plugin better, nothing more. Pushed a quick PR to share the idea I spoke about and what I think makes the plugin solving the issue more than just changing the error message and not resolving the underlying issue: #108. Hope it makes sense. I will let you judge what is best now with this issue but thought code would show better that we don't need a quick workaround.

@kriegaex
Copy link
Contributor Author

@rmannibucau, thanks for the new PR. Thank you so much for letting code speak and presenting an alternative solution instead of simply blocking this PR and calling it inadequate. I quickly looked at (not formally reviewed) the code and am happily conceding that your solution indeed eliminates the irritating warning in a much more fundamental way than my quick fix, namely by adding the new feature of handling input directories and input JARs uniformly with regard to service provider minification. Therefore, #108 is clearly superior to this PR, which is why I am closing it in favor of yours.

@kriegaex kriegaex closed this Jul 21, 2021
@kriegaex kriegaex reopened this Jul 21, 2021
@kriegaex
Copy link
Contributor Author

kriegaex commented Jul 21, 2021

Temporarily reopening this PR, until #108 handles minification uniformly. Presently, it only addresses the default shader, not JAR minification. As soon as that is done and avoiding warning for directories is no longer necessary, I shall close this PR again.

For now, I am even suggesting to merge this PR first due to the better handling of the current limitation and also because of the contained tests and refactorings. #108 can then be rebased on it and build upon the refactorings. It will be easier to port the uniform resource handling from default shader to minification filter on top of the previous refactoring.

@kriegaex
Copy link
Contributor Author

I experimentally cherry-picked the commit from #108 into this PR (not pushed yet) and added directory scanning. The scan is called and seems to do the right thing, but the current Maven module in which Shade is executed is not included when populating MinijarFilter.removable. That means,

  • minimisation does not work in this case, i.e. all classes, no matter if service provider stuff or not, are kept, the own module is never minimised,
  • a directory scanning implementation is useless.

So before adding such an implementation to Shade, we need to discuss the bigger issue of whether we want minimisation to encompass the own module, because this would be a breaking change. Only then would it make any sense to add directory scanning to the minimisation filter, because otherwise it simply would have no effect. If we answer that question with "yes", we can decide if self-minimisation (I just invented the term) should be done

  • mandatorily,
  • optionally, defaulting to no (opt-in) for backward compatibility or
  • optionally, defaulting to yes (opt-out) for people who do not like the new default.

WDYT, @rmannibucau? @rfscholte, you implemented the service filtering part of minification. Your opinion would also be valuable.

@rmannibucau
Copy link
Contributor

@kriegaex well to be honest minimisation never works on "not trivial apps" (ie directly wired which is the case of most apps these days) so I tend to think selfminimisation (i like it ;)) must work but that minimisation as a feature is also not critical since everything implied by minimisation is handled by transformer configuration already in a more reliable way than auto detection which relies on jdependency which is not clever enough (it is a very hard task and require to know most framework) to be reliable IMHO. So overall i'm for the consistency of handling of classpath element as a first step and if possible drop of minijar feature with a documentation replacement for N+1 version (major indeed).

@kriegaex
Copy link
Contributor Author

kriegaex commented Jul 21, 2021

@rmannibucau, you just completely lost me. I do not understand what you are talking about. MSHADE-366 is specifically about minimisation, your own PR #108 is unrelated. So I am returning to the actual main topic again, asking if we want similar behaviour for service handling during minimisation for directories (i.e. for the own module) as it already exists for dependency JARs. So far, so good.

Are you suggesting to completely remove or deprecate the minimizeJar option? Then all the debates of the previous weeks would have been for nothing. I like that feature and many, many projects use it. So we cannot simply retire or remove it without providing a migration path to something at least at powerful, covering all existing use cases, maybe more (optionally). Are you saying that there are transformer configurations which can minimise JARs? I have zero experience with transformers, so please explain what you mean, or stop me and clarify if I completely misunderstood you. In any case, we should not drop minification. Why remove a useful feature just because it is not perfect? But if you like to explain its limitations better in the documentation, we can of course do that.


Update for my previous experiment with self-minimisation: Commenting out this line of code

removable.removeAll( artifactUnit.getClazzes() );

makes the directory filter effective, but of course we have a boot-strapping problem now, because Shade does not know which class(es) to treat as entry points, consequentially treating all classes as potentially removable, including e.g. classes with main methods, even if there is an active ManifestResourceTransformer setting a mainClass attribute. Ergo, too many classes are being removed and only the service provider interfaces and implementations found via SPI config scanning are being kept. This is, of course, not what we want. But at least I could chack that my new directory scanner MinijarFilter in actually works. If you like to take a look, I can push the commit. I can always fix it later and squash with a subsequent commit.

@rmannibucau
Copy link
Contributor

@kriegaex oki, let me try to list more clearly the points:

  1. I think we want filters to be consistent with the shader (which is the central feature of the plugin) so yes we should handle dirs in both cases
  2. if minification exists it must also exists for the main artifact - for the cases of libraries like spring which put all integrations in the same module but can run without most of them (compile scope vs runtime more or less)
  3. as of today minification rarely works so - for me - 1 is the minimum requirement on it, 2 is a nice to have but we can also - my preference - drop the feature and replace it by plain transformers registration in the configuration which would be more powerful and generic. This point is only doable in next major but I mentionned it because it justifies why I don't want to put too much effort on minification now. So for next releases we must fix this dir issue and long term the discussion can be to rethink if it needs to exists, if not no big deal except doc to write, if yes it must be completely reimplemented and more configurable (which makes it back to transformers and why i think it is not a big feature of the plugin for the future).

Side note: your main example is exactly what I talked about, you assume the main is in the manifest but it is not uncommon to have a generic main launching subcommands (take any cli) so you need to track these commands which are often missed by the minijarfilter as of today because it requires some knowledge about how to register a command (let illustrate it by a META-INF/commands.txt file). So you end up needing to do the minification with a filter which excludes what you don't want.
Java has too much indirection in the code to let jdependency be relevant in a generic manner, only works for simple application with no indirections. From the ones I'm working on, it is very few cases, this is why I think we should review this param for next major version.

Hope it makes sense.

@kriegaex
Copy link
Contributor Author

kriegaex commented Jul 21, 2021

In that case, both #108 and my additional, local commits on top of this PR, both adding directory handling to minification, should not be merged, because it would be unused code. The only case in which a directory needs to be handled at all is self-minification.

Then we are back to square 1, and I suggest to merge this PR as-is, because at least until you do the "grand refactoring" for the next major, it does the right thing for the self-directory, not logging the ugly warning.

BTW, this morning (local time in Asia) I was surprised to see that someone created a 3.3.0 release already, but have not seen it on Central or on the plugin Maven site yet. But probably releasing a 3.3.1 with both this PR and my other PR concerning improved source code filtering would be a good idea.

Alternative to merge as-is: A simple way of self-minification, which could be implemented without too much hassle until you do the "grand refactoring", would be to add two specific options such as minifySelf and a list of minifySelfEntryPoints with the same syntax as includes (e.g. <minifySelfEntryPoint>org/acme/**/*Application.class</minifySelfEntryPoint>. (Never mind the option names, we can change them.) Then we would have enough information to populate MinijarFilter.removable correctly and get a simple version of self-minification running, including correct handling of services.

@rmannibucau
Copy link
Contributor

@kriegaex 3.3.0 got cancelled. Agree listing the classes to track and start the dependency management from can be a good option to solve it and it can even be done before next major I guess so let's 1. handle dirs, 2. look in that direction, wdyt?

@gnodet gnodet closed this Oct 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants