Server-side of HTTP admin interface uses a lot of resources on large clusters #1660

Closed · danielmewes opened this issue Nov 19, 2013 · 13 comments

@danielmewes (Member)

On a 32-node cluster with something like 200 tables, just opening the HTTP admin interface can max out a core on the server. It remains maxed-out until a moment after I close the browser window.

One or more operations there seem to become very expensive on large clusters.
I will profile this at some point.

ghost assigned danielmewes on Nov 19, 2013

neumino (Member) commented Nov 19, 2013

There is just too much data in /ajax, some of which is not really useful for the web interface.

Sending diffs would be a quick workaround (and not too hard?).
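
A rough sketch of what diff-based updates could look like (purely illustrative: the snapshot type, key names, and functions here are invented for the example and are not the actual /ajax implementation). The idea is to keep the last snapshot sent to a client and transmit only the entries that changed since then:

```cpp
// Rough sketch of diff-based /ajax updates (hypothetical, not the actual
// implementation): keep the last snapshot that was sent to the client and
// send only the keys whose serialized value changed or disappeared.
#include <iostream>
#include <map>
#include <string>
#include <utility>
#include <vector>

using snapshot_t = std::map<std::string, std::string>;  // key -> serialized JSON

// Entries that are new or changed, plus keys that were removed.
std::pair<snapshot_t, std::vector<std::string>>
compute_diff(const snapshot_t &prev, const snapshot_t &curr) {
    snapshot_t changed;
    std::vector<std::string> removed;
    for (const auto &kv : curr) {
        auto it = prev.find(kv.first);
        if (it == prev.end() || it->second != kv.second) {
            changed.insert(kv);
        }
    }
    for (const auto &kv : prev) {
        if (curr.count(kv.first) == 0) {
            removed.push_back(kv.first);
        }
    }
    return {changed, removed};
}

int main() {
    snapshot_t prev = {{"table_a", "{\"shards\":4}"}, {"table_b", "{\"shards\":2}"}};
    snapshot_t curr = {{"table_a", "{\"shards\":8}"}, {"table_c", "{\"shards\":1}"}};
    auto diff = compute_diff(prev, curr);
    // Only table_a (changed), table_c (new) and the removal of table_b would
    // need to go over the wire instead of the whole snapshot.
    std::cout << diff.first.size() << " changed/new, "
              << diff.second.size() << " removed\n";
    return 0;
}
```

On a cluster where only a handful of tables change between polls, this keeps each response proportional to the amount of change rather than to the total cluster size.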

@danielmewes (Member, Author)

Thanks for chiming in @neumino.
I could imagine that there are other, even easier things we could optimize though. Let's wait for the profiling results and then see how best to proceed from there.

neumino (Member) commented Nov 19, 2013

Oh, there are also the stats and the distribution requests, which could be expensive (there is a timeout for the stats though).

@danielmewes (Member, Author)

Also, when I click "Tables" to get a table listing, Firefox complains about a JS script that is taking too long, e.g. "http://magneto:8082/cluster-min.js?v=foo:16246" (foo is the version of my server). The exact line number varies, though.
@neumino: At some point, maybe we could have a look at that together?

neumino (Member) commented Nov 19, 2013

Hmm, my guess is that we just create too many Backbone views.
We create one per table and two per database. If you have 500 tables, I can imagine it breaking.

I would tend to think that the current architecture of the code in admin/static/coffee/namespaces/index.coffee is just over-engineered and creates too many views. I can try a quick/dirty change to reduce the number of views, and we can check whether it works better.

neumino (Member) commented Nov 19, 2013

And we compute the state of each namespace, which can be expensive (especially if you have tons of shards).

@danielmewes (Member, Author)

The profiler shows that most of the server-side time is spent in progress_app_t::handle(). This is when viewing the dashboard, not the table view.
The function seems to be so slow because it copies std::map<peer_id_t, cluster_directory_metadata_t> a lot, which makes sense. I think I can optimize this quite easily by making use of some of the new functions that I added for improving overall directory scalability. I will give it a try.
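
For illustration, here is a minimal sketch of the kind of change this amounts to. Only the type names peer_id_t and cluster_directory_metadata_t come from the issue; the struct contents, function names, and main are invented for the example and are not RethinkDB's actual code. Passing the directory map by const reference instead of by value avoids a full copy of all peer metadata on every call:

```cpp
// Hypothetical illustration of the copying problem; peer_id_t and
// cluster_directory_metadata_t stand in for the real RethinkDB types,
// everything else is invented for this sketch.
#include <cstddef>
#include <map>
#include <string>

struct peer_id_t {
    int id;
    bool operator<(const peer_id_t &other) const { return id < other.id; }
};

struct cluster_directory_metadata_t {
    std::string serialized_state;  // stands in for the real metadata fields
};

using directory_t = std::map<peer_id_t, cluster_directory_metadata_t>;

// Before: the directory is passed by value, so every call copies all n
// entries (and every nested field), once per request and per helper call.
std::size_t count_peers_by_value(directory_t directory) {
    return directory.size();
}

// After: a const reference reads the same data without copying anything.
std::size_t count_peers_by_ref(const directory_t &directory) {
    return directory.size();
}

int main() {
    directory_t directory;
    for (int i = 0; i < 32; ++i) {
        directory[peer_id_t{i}] = cluster_directory_metadata_t{"..."};
    }
    // Same result, but the by-value version pays for a full copy of the map.
    return count_peers_by_value(directory) == count_peers_by_ref(directory) ? 0 : 1;
}
```

The real fix in branch daniel_1660 works against the actual directory types and the scalability helpers mentioned above; the sketch only shows why the repeated map copies are the expensive part.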

@danielmewes (Member, Author)

A fix for the main server-side problem (excessive copying, up to something like O(n^2 log n)) is in code review 1036 by @neumino, and implemented in branch daniel_1660.

The client-side inefficiency when opening the "Tables" page is a different problem, and I'm going to open a separate issue for that.

@danielmewes (Member, Author)

In next as of f242d98

@josephglanville

I have been using the /ajax endpoint for doing cluster monitoring.

It strikes me that this could be split up into multiple endpoints rather than just being a massive 'everything in one go' sort of thing.

neumino (Member) commented Mar 16, 2014

The content in /ajax used to be split into multiple endpoints, but they were merged at some point.

Having its content spread across multiple URLs would mean that the web interface would have to make more HTTP requests to retrieve all the data it wants, which isn't a good thing.

@coffeemug (Contributor)

@josephglanville -- FYI, you can hit a specific path to do monitoring. For example, you can curl /ajax/directory to get only the contents of the directory.

On a different note, the monitoring experience currently leaves a lot to be desired. Would you mind writing up your experiences in #1392? That is: what was it like to set up monitoring, what's good about it, what's bad about it, what could be improved, etc. If you could take a few minutes to do that, it would help immensely (and would make monitoring better for you!).

@josephglanville

@coffeemug Cheers for the heads up.
