Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize class loading #37

Closed
wants to merge 2 commits into from

Conversation

kriegaex
Copy link
Contributor

@kriegaex kriegaex commented Mar 17, 2021

Closes #9
Relates to https://bugs.eclipse.org/bugs/show_bug.cgi?id=565450

I rebased Steve's old commit/patch StrongSteve/org.aspectj@07fcd35 on current master, removed his whitespace changes and commented-out parts of the code in order to minimise the patch and make it easier for @aclement to review and merge it.

Disclaimer: I have no idea what this patch does, but wanted to assist technically in order to get this into 1.9.7. Original commit comment below, slightly adjusted for better readability.


In our project we found out that during the build up of the Spring context the class loading takes a very long time. Root cause is the huge amount of file I/O during pointcut class loading. We are talking about ~250k file loads.

With these changes we managed to cut down the starting time by around 50%.

What we found out is that IMHO - the clear method of ClassLoaderRepository is called far too often -> in our settings this resulted in not a single cache hit as the cache got cleared permanently. Therefore, we deactived the cache clear calls inside ClassLoaderRepository.

Secondly, we changed Java15AnnotationFinder in a way to not always create new objects for the ClassLoaderRepository but re-use one static instance. Otherwise we experienced >100k objects being created.

Last but not least, we introduced a cache for unavailable classes so that they do not have to be looked up using file I/O over and over again.

The whole behavior is configurable via

  • org.aspectj.apache.bcel.useSingleRepositoryInstance (default: true)
  • org.aspectj.apache.bcel.useUnavailableClassesCache (default: true)
  • org.aspectj.apache.bcel.ignoreCacheClearRequests (default: true)

@aclement
Copy link
Contributor

Do you know if the defaults for those properties offer the old behaviour (and so you have to turn on these optimizations) or offer the new behaviour (and so you have to turn it off if it doesn't work). I presume the latter?

@aclement
Copy link
Contributor

no tests is a bit concerning.

@kriegaex
Copy link
Contributor Author

kriegaex commented Mar 25, 2021

Do you know if the defaults for those properties offer the old behaviour (and so you have to turn on these optimizations) or offer the new behaviour (and so you have to turn it off if it doesn't work). I presume the latter?

They are supposed to, judging from the code. I have not fully reviewed or analysed it, though, you are in a better position to judge that. (Update: The new behaviour is the default, see other comments below.)

no tests is a bit concerning.

I agree. @StrongSteve, can you please write some tests for both scenarios?


What I did notice is that the CI build failed in different tests than the builds immediately before and after on master. So there might be a problem here. Because I have fixed so many test problems on my way from Java 14 through 15 to 16, I am going to rebase this PR on my latest Java 16 branch (which is green on all Java versions and also works on Windows), push and get clearer results with the next CI build for this one. As long as the Java 16 branch is not merged, I can always rebase it back on master, if it should happen that it is ready to be merged earlier than the Java 16 branch.

@kriegaex
Copy link
Contributor Author

One more thing: @StrongSteve does not seem to have signed an ECA (Eclipse Contributor Agreement) or maybe with another e-mail address than the one he used to push his commit. Or maybe he just forgot to sign off the commit, which I could do for him by rewriting the commit with the corresponding footer in the commit message. Can someone check that, please?

@kriegaex
Copy link
Contributor Author

OK, the builds failed quite fast and definitely in a caching test. That implies that something is not 100% backwards compatible, i.e. that the if-else statements need to be adjusted in order to achieve that compatibility. After that, one or more test variants for the new feature should be added.

@kriegaex
Copy link
Contributor Author

One more thing: @StrongSteve does not seem to have signed an ECA (Eclipse Contributor Agreement) or maybe with another e-mail address than the one he used to push his commit. Or maybe he just forgot to sign off the commit, which I could do for him by rewriting the commit with the corresponding footer in the commit message. Can someone check that, please?

OK, I have experimentally added "Signed-off-by: ..." for Steve and it is clear now that his e-mail address is not covered by an ECA according to a click on "Details" beside the last check:

image

@kriegaex
Copy link
Contributor Author

kriegaex commented Mar 25, 2021

I have taken a closer look now and I see that the 3 newly introduced system properties in the 2 changed classes all default to true, as implied by the commit comment and the explanation in Bugzilla. In this case the behaviour changes. @aclement, what is your preference?

  • Make the new behaviour the default if it really is an optimisation without any drawbacks on AspectJ behaviour other than improved performance? In that case we could leave the code in the commit as is and adjust the tests accordingly.
  • Make the existing behaviour the default, staying 100% backward compatible but forcing users to set up to 3 additional system properties (which they probably will not do because they do not know about it)?
    In either of the two cases there should be some kind of documentation, at least something mentioned in the release notes for 1.9.7.

Copy link
Contributor Author

@kriegaex kriegaex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code review is strictly with regard to 100% backward compatibility, not about the fuctionality as such.

Update: I have verified locally that, with or without the changes suggested in my code review comments, the failing ClassloaderRepositoryTest passes if I set all 3 properties to "false" on the command line via -D.

@StrongSteve
Copy link
Contributor

hello ;)
sorry for the late reply
somehow i did not get notifications about this conversation :(

to be honest I am fairly new to committing stuff here - to be precise - have never done it ;)

I have now signed the My Eclipse Contributor Agreement 3.1.0 -> anything else I need to do? ;)

@kriegaex
Copy link
Contributor Author

kriegaex commented May 7, 2021

I have now signed the My Eclipse Contributor Agreement 3.1.0 -> anything else I need to do? ;)

I just rebased this branch on master and force-pushed, also in order to see if the licence check passes. It seems to, so I guess you used the correct e-mail address for signing the ECA:

image

You could tell me if you agree with the changes I suggested in my code review in order to make the changes 100% backward compatible. I could then commit and push them for you in order to save you the trouble of re-doing this patch by yourself. If then the tests for the legacy mode would pass again, the only thing missing thing would be additional tests for the new settings which you introduced. It would be nice if you would take some time to contribute them, based on the changes here in this branch, and next time to be careful to avoid commmitting lots of white-space changes like last time (I mostly fixed that for you already).

@StrongSteve
Copy link
Contributor

I have now signed the My Eclipse Contributor Agreement 3.1.0 -> anything else I need to do? ;)

I just rebased this branch on master and force-pushed, also in order to see if the licence check passes. It seems to, so I guess you used the correct e-mail address for signing the ECA:

image

You could tell me if you agree with the changes I suggested in my code review in order to make the changes 100% backward compatible. I could then commit and push them for you in order to save you the trouble of re-doing this patch by yourself. If then the tests for the legacy mode would pass again, the only thing missing thing would be additional tests for the new settings which you introduced. It would be nice if you would take some time to contribute them, based on the changes here in this branch, and next time to be careful to avoid commmitting lots of white-space changes like last time (I mostly fixed that for you already).

changes look fine to me
sorry for the whitespaces

i will try to contribute some tests in the next days but to be honest i can not promise when i will find a timeslot ;)

@kriegaex
Copy link
Contributor Author

kriegaex commented May 7, 2021

OK, I guess my changes re-established backward compatibility. All tests are green. Now we only need tests for the new opt-in (not opt-out) behaviour.

@kriegaex
Copy link
Contributor Author

kriegaex commented May 20, 2021

i will try to contribute some tests in the next days but to be honest i can not promise when i will find a timeslot ;)

@StrongSteve, it is quite obvious that you are (too?) busy, so can you at least explain how to create a situation in order to test the new feature toggles and see some real differences from the default behaviour? How would you go about testing it? More precisely: How did you test it? You must have tested it somehow, because you developed this in order to scratch your own itch. Maybe you have a sample application I can look at and evolve it into some kind of performance test.

You talked about a Spring application before. Did you use native AspectJ LTW there or was this something happening during pointcut parsing for Spring components anhanced by proxy-based Spring AOP?

@kriegaex kriegaex force-pushed the steve-patch-565450 branch 2 times, most recently from 7145b39 to 6d3dd79 Compare August 10, 2021 00:48
StrongSteve and others added 2 commits October 9, 2021 09:19
In our project we found out that during the build up of the spring context
the class loading takes a very long time.
Root cause is the huge amount of file I/O during pointcut class loading.
We are taking about ~250k file loads.

With these changes we managed to cut down the starting time by around 50%.

What we found out is that IMHO - the clear method of the ClassLoaderRepository
is called far too often -> in our settings this resulted in not a single cache
hit as the cache got cleared permanently.
Therefore we de-actived the cache clear calls inside the ClassLoaderRepository.

Secondly we changed the Java15AnnotationFinder in a way to not always create
new objects for the ClassLoaderRepository but re-use one static instance.
Otherwise we experienced >100k objects being created.

Last but not least we introduced a cache for unavailable classes so that
they do not have to be looked up using file I/O over and over again.

The whole behavior is configurable via
+ org.aspectj.apache.bcel.useSingleRepositoryInstance (default: true)
+ org.aspectj.apache.bcel.useUnavailableClassesCache (default: true)
+ org.aspectj.apache.bcel.ignoreCacheClearRequests (default: true)

Signed-off-by: Stefan Starke <stefan@starkeweb.org>
Signed-off-by: Alexander Kriegisch <Alexander@Kriegisch.name>
Now the defaults are:
+ org.aspectj.apache.bcel.useSingleRepositoryInstance (default: false)
+ org.aspectj.apache.bcel.useUnavailableClassesCache (default: false)
+ org.aspectj.apache.bcel.ignoreCacheClearRequests (default: false)

I.e. the new caching optimisations are opt-in instead of opt-out as
originally designed. This might change in the future, but for now
without any additional tests and experience with the new feature let us
be conservative and make the build green first.

I also added a few more code review findings concerning backward
compatibility, which was less than 100% even with all three flags
deactivated.

Signed-off-by: Alexander Kriegisch <Alexander@Kriegisch.name>
@kriegaex
Copy link
Contributor Author

kriegaex commented Jan 14, 2022

Because the commits were directly on master and do not come from this PR, let me mention for reference that @aclement added these changes: 1b3cead...3227aad which are probably going to superseed the PR and eventuall close #9. Andy was also so kind to add basic tests. 😊

@aclement
Copy link
Contributor

I have manually merged this PR with a couple of basic tests:

Last but not least we introduced a cache for unavailable classes so that
they do not have to be looked up using file I/O over and over again.

I added a basic test for this. I'm tempted to turn it on by default but at least in the codebase maybe we can get a few folks to give it a kick around. My only concern is that you could fill this cache up by resolving nonsensical names but no realistic system should be doing that.

What we found out is that IMHO - the clear method of the ClassLoaderRepository
is called far too often -> in our settings this resulted in not a single cache
hit as the cache got cleared permanently.
Therefore we de-actived the cache clear calls inside the ClassLoaderRepository.

I added a basic test for this too. I don't know the real environment in which this was running to create a more realistic scenario that represents a situation where the cache is being cleared too frequently, but I could believe it may happen.

Secondly we changed the Java15AnnotationFinder in a way to not always create
new objects for the ClassLoaderRepository but re-use one static instance.
Otherwise we experienced >100k objects being created.

I started writing a basic test for this but it felt too artificial. I think it needs a more realistic setup that shows a system with many Java15 based reflection delegates, either matching those with point cuts that need to expose annotations all over the members or by directly exercising the code that would fetch annotations on all the members - but then what do you measure? We don't typically measure heap usage or object allocations in tests. I actually wonder about raising the sharing concept to a higher level, instead of sharing repositories across annotation finders, associate an annotation finder with a world and have all reflection delegates within that world share the same annotation finder (and thus the same repository).

My concern here might be that suppose we create 10000 delegates to try and match point cuts and we don't need 9500 of them because they don't match anything - we may not be able to recover all the memory used due to digging out annotations for the 9500 because the state is mixed into the static shared repository underneath all the annotation finders - particularly if you run with the other change (the second one) that prevents the repository cache from clearing up after itself.

So I think it is a reasonable call to make these opt-in for now, make sure people know they can try them and we are looking for feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Review/port performance enhancements proposed in bugzilla issue
3 participants