Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix CPU hog in config save for large users.conf #1364

Merged

Conversation

Curly060
Copy link
Contributor

If the realm file (users.conf) exceeds a certain size (about 30000 lines) saving this file causes a massive CPU hog. I have provided a branch that demonstrates the issue:
https://github.com/Curly060/gitblit/tree/show-user-config-save-cpu-hog-bug
master...Curly060:show-user-config-save-cpu-hog-bug

So simply check out this branch and run the ConfigUserService class from there. But beware: Depending on the CPU you have, this might take from 20s up to several minutes!

This patch here simply replaces the usage of JGits Config classes with a much more efficient and far more simple class.

I know that Gitblit was not made for big workgroups and clearly this is a big users.conf. Nevertheless, saving 60000 lines should not hog a CPU like that.

@Curly060 Curly060 changed the title bugfix: fix CPU hog bug in config save for large users.conf Fix CPU hog in config save for large users.conf Mar 20, 2021
@flaix
Copy link
Member

flaix commented Mar 21, 2021

Hi Ingo!

Thanks you for bringing this to attention and providing a fix. I will have a look at it as soon as I find the time.
Out of curiosity, do you run Gitblit for such large workgroups or how did you encounter that behaviour?

@Curly060
Copy link
Contributor Author

Gitblit is used at the company I work at and this company is indeed a very big global company. We use a forked version of the Gitblit Atlassian Crowd Plugin which syncs users from Crowd to Gitblit. Unfortunately the plugin only ever adds new users and never removes dead ones, so over time the users.conf has grown quite large (~62000 lines).
So recently gitblit had started to hang for several minutes quite often. Eventually this got so frustrating that I finally investigated the problem in detail.
tl;dr: JGit internally uses an array to handle Config files and about 1 billion(!!) loop runs occur when storing those 62k lines.

Now the gory details ;)
When Gitblit saves the users.conf, it rebuilds the config from scratch like this:
for (user : users) {
config.setXXX(); // with up to 18 such setXXX() methods
}
for (team : teams) {
config.setXXX(); //with up to 8 such setXXX() methods
}
These methods all internally call the setStringList() method which calls replaceStringList(). The JGit Config class holds an array of ConfigLine items. And for every setXXX method it first creates a new array of size "old array size + string list size",copies all entries and then always(!) runs through that entire(!) array doing some string comparisons ("am I in the right section and subsection?").
While this may be fine for small config files like they exist in typical git repositories, it certainly is not when the files grow larger. As I was saying, 62k lines result in approx. 1 billion internal loop runs in the Config class.

What makes the whole situation even worse is that the write method of ConfigUserService is synchronized! So if this method takes a minute, all other attempts to write need to wait this minute. And the next write of course takes another minute again. It's quite deadly ;)

So that's why I decided to replace the config save logic with a much simpler one. Now saving takes 200ms or so.

@flaix
Copy link
Member

flaix commented Mar 22, 2021

Great!

Have you tested your class and could you provide some unit tests for it? I am trying to increase the test coverage to make supporting Gitblit easier in the future.
Have you ever encountered something like in #938? Are you sure that your class is not susceptible to data loss?

That sounds like an impressive setup. If you have forked the Crowd plugin, would it be easy to make it also delete users? With a quick search I found two plugins, but didn't have a closer look.
Can I ask how large your developer base in Gitblit is, i.e. actual developers, not dead user entries? If you have many developers maybe there are other problems that you have come across with Gitblit?

@Curly060
Copy link
Contributor Author

I have made the following tests, that were all successful:

  • Load existing config file and write it again with both, old and new logic, then sort and compare => both files were identical
  • run the patched version of Gitblit in our test environment (with Crowd plugin enabled) and make operations that trigger saving the users.conf
  • run a local patched version against some of my own repositories with some local tests users and make operations that trigger saving the users.conf

But you are quite right, this may not be enough testing. After all, our users do not have all properties of the UserModel set. So yes, unit tests are probably a very good idea. However, I am not that deep in Gitblit source that I may catch all cases - and I will have to find the time to write those. So this might take a while.

About issue #938:
I wouldn't expect my class to cause such behaviour. It simply replaces StoredConfig / FileBasedConfig of JGit with a different implementation. The rest of the logic of GitBlit stays the same: build whole config first, then save (to a temp file) and only if that temp file exists and is not empty, do the rename.

About the crowd plugin: I haven't had a look at the code there yet, but yes, it should be possible to delete users again. However, fixing the config saving seemed like easiest thing to do :)

About the developer base: Hard to say, but in the last 7 days there were commits from ~180 different committers. Using git shortlog and some bash scripting: ~380 commiters in the past year. Amount of repos: 660

So far, Gitblit has served well for the past couple of years, I'd say. This config save thing was the only major issue that I ways aware of in the past 3 years that I work there.

@Curly060
Copy link
Contributor Author

I just noticed: The UserServiceTest class already implicitly tests pretty much everything. It creates various attributes in UserModel and TeamModel, saves them to the file individually, reads the file again and checks if those attributes are still the same. Would that not be sufficient testing then?

@flaix
Copy link
Member

flaix commented Mar 22, 2021

Thanks for your answer and for enduring my questions. :)
Yeah, I guess you are right that the existing tests should suffice.

@martinspielmann
Copy link
Contributor

Really nice catch and cool implementation, Ingo!

@Curly060
Copy link
Contributor Author

We have built gitblit with this patch and deployed it on prod a week ago. This has solved the CPU hogging problem completely and now gitblit runs just fine.

@Curly060
Copy link
Contributor Author

This change has now been active for over 3 months in our company and it runs perfectly fine and without any issues. Any chance this gets merged?

@flaix flaix merged commit e04cad2 into gitblit-org:master Jul 5, 2021
@flaix
Copy link
Member

flaix commented Jul 5, 2021

Ya, I had an issue with it, why it didn't use the StoredConfig interface, but since I haven't gotten around to comment or change it since then, and since it is proven to work and fixes a problem, I just merged right now.

@Curly060
Copy link
Contributor Author

Curly060 commented Jul 5, 2021

About the StoredConfig:
I have done this on purpose to clearly show that in this place using the JGit Config class is not applicable at all!
The trouble with StoredConfig is that it unfortunately is not really an interface, it's an abstract class extending the Config class from JGit.
If I had extended from StoredConfig not only this would have forced me to override load() (which makes no sense in my implementation), no, it would have pulled in all the setXXX methods which are responsible for this CPU hog in the first place.
But the whole point of this fix was to avoid exactly these methods, hence, a completely new implementation.

Yes, I know this is not ideal, but actually it is the fault of JGit which does not provide clean interfaces for its Config classes...

Oh, and thanks a lot for merging :)

@flaix
Copy link
Member

flaix commented Jul 6, 2021

Ah, yes, that rings a bell. I guess I had found that out, too, but then for some reason forgot about the PR again. So thanks for the reminder.

@flaix
Copy link
Member

flaix commented Jul 6, 2021

Or, checking the builds, maybe it was because it uses Java 8, so it would need to be part of 1.10, as I keep 1.9 on Java 7.

@flaix flaix added this to the 1.9.2 milestone Jul 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants