Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow concurrent local indexing while downloading remote indexes. #5290

Merged
merged 1 commit into from
Jan 13, 2023

Conversation

mbien
Copy link
Member

@mbien mbien commented Jan 13, 2023

extracted from #4999

This aims to improve the indexing experience by allowing remote index downloads while local .m2 indices are processed.

concurrent_dl_while_indexing

CI should produce a dev build for this one for easier testing.

@mbien mbien added Maven [ci] enable "build tools" tests ci:all-tests [ci] enable all tests ci:dev-build [ci] produce a dev-build zip artifact (7 days expiration, see link on workflow summary page) labels Jan 13, 2023
@mbien mbien added this to the NB17 milestone Jan 13, 2023
@mbien mbien removed the ci:all-tests [ci] enable all tests label Jan 13, 2023
@mbien
Copy link
Member Author

mbien commented Jan 13, 2023

all tests green. removing all-tests label for subsequent syncs.

* This isn't perfect since a download can finish before local repos finished indexing, tests showed
* performance doesn't degrade significantly on solid state drives.
*/
private static final ReentrantLock localIndexingPermit = new ReentrantLock();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I see a problem here when two local repostories are being indexed and a new remote repository is requested. This will be stalled until the first local repository is done indexing. Basicly like this:

  • Local Repository R1 is submitted into RP to be indexed and locks localIndexingPermit
  • Local Repository R2 is submitted into RP to be indexed and tries to lock localIndexingPermit (this thread will block at Line 658)
  • Remote Repository R3 is submitted into RP to be indexed - it does not execute, as R1 and R2 are "running" in the processor and R3 is queued.

Alternative idea: use two requestprocessors with a through put of 1. One for remote repos, one for local repos. If I understand your intention correctly, this should accomplish the same.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially wanted to add the lock to the main indexLoadedRepo method, which would be the most correct implementation. However the method is already large enough and it deals with mutexes already. Probably worth splitting up one day.

So I opted to keep it as simple as possible - but your suggestion is even simpler. We can still switch to locks later if this doesn't work well enough.

Copy link
Contributor

@matthiasblaesing matthiasblaesing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks sane to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci:dev-build [ci] produce a dev-build zip artifact (7 days expiration, see link on workflow summary page) Maven [ci] enable "build tools" tests performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants