Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should pas.plugins.ldap really load all LDAP user "keys" for every request? #4

Closed
datakurre opened this issue Jan 31, 2014 · 16 comments
Closed
Labels

Comments

@datakurre
Copy link

@datakurre datakurre commented Jan 31, 2014

Hi, am I missing something here?

All the plugin methods seem to invoke self.users, which is only cached per request, and ends up initializing something called Ugm, e.g.:

https://github.com/collective/pas.plugins.ldap/blob/master/src/pas/plugins/ldap/plugin.py#L185

Initializing Ugm-object eventually calls context._load_keys in

https://github.com/bluedynamics/node.ext.ldap/blob/master/src/node/ext/ldap/ugm/_api.py#L442

and that triggers non-paged search for all the available users at

https://github.com/bluedynamics/node.ext.ldap/blob/master/src/node/ext/ldap/_node.py#L599

And once I paged that search, it took about 15+ seconds for our 90 000 principals. Per request.

What am I missing? Should context._load_keys really load all the available users, or have we just managed to misconfigure the plugin somehow?

Thanks!

@rnixx

This comment has been minimized.

Copy link
Member

@rnixx rnixx commented Feb 17, 2014

@jensens - are all the results cached on request? you mentioned that caching is sane?

if results are cached at request only, we should add possibility to cache in ram, or memcached, or elsewhere less volatile than on request, this also would make it possible to use LDAP search paging

@jensens

This comment has been minimized.

Copy link
Member

@jensens jensens commented Feb 17, 2014

Well, at PAS-level its only cached on request, invoking ugm each time. But on node.ext.ldap level the ldap results of the ldap queries are cached i.e. using memcached if configured.

I know here is much room for optimizations. The major problem is not caching (thats easy) but an active cache-invalidation (passive i.e timeouts are again simple). I'd like to have invalidation pluggable - so if you dont use it it not in your way.

Another issue/enhancement in this context are large ugm-trees. At the moment we have no feature to load them partial. I'am not sure if we want to have 90k users at once in RAM!

Any ideas/ implementations are welcome!

@jensens jensens added the enhancement label Feb 17, 2014
@datakurre

This comment has been minimized.

Copy link
Author

@datakurre datakurre commented Feb 17, 2014

@jensens I tried also with memcached and saw it being filled with LDAP-entries, but it didn't prevent _load_keys-being called once for each request. Should memcached prevent it (and I'm still doing something wrong)? For me it seemed like only queries for single users are being memcached.

The old LDAPUserFolder does not query all the users, which makes it usable also with larger ugm-trees (of course, user searches can still be quite heavy with it).

I think, it would be ok to store any amount of queries/users in memcached as long as it's possible to share the cache between sites.

I'm not yet familiar enough with node.ext.ldap to know, how deeply it requires to know all users before doing anything and what should be fixed to make it work more lazily and with larger trees.

@jensens

This comment has been minimized.

Copy link
Member

@jensens jensens commented Feb 17, 2014

Yes, also the keys should be cached. But i'am not sure if this is the solution. I'd say if a single user is requested its enough to load this one user and dont first ask for all keys. I also have to take some time to get back into the implementation details. Also caching in ugm tree should be done for the single user and not for the whole tree. @rnix and I talked about this in the past, but never implemented it.

@rnix

This comment has been minimized.

Copy link

@rnix rnix commented Feb 17, 2014

@jensens @rnixx dos equis :)

@jensens

This comment has been minimized.

Copy link
Member

@jensens jensens commented Feb 17, 2014

@rnix and @rnixx sorry for confusion :D

@chaoflow

This comment has been minimized.

Copy link
Member

@chaoflow chaoflow commented Jun 16, 2014

@jensens, @rnixx, @datakurre any new insights on this?

@rnixx

This comment has been minimized.

Copy link
Member

@rnixx rnixx commented Feb 17, 2015

hannes just did some work on caching the UGM tree in RAM instead of building it for each request, which speeds up the whole story a lot. Nevertheless it does not solve the underlying problem of loading all keys for a search base in node.ext.ldap. This refactoring still misses a project/budget

@thet

This comment has been minimized.

Copy link
Member

@thet thet commented Feb 23, 2015

users/groups caching implemented here: #13
still, the first call is expensive

@tdesvenain

This comment has been minimized.

Copy link
Member

@tdesvenain tdesvenain commented Feb 23, 2015

Lol) NN
Le 31 janv. 2014 15:16, "Asko Soukka" notifications@github.com a écrit :

Hi, am I missing something here?

All the plugin methods seem to invoke self.users, which is only cached
per request, and ends up initializing something called Ugm, e.g.:

https://github.com/collective/pas.plugins.ldap/blob/master/src/pas/plugins/ldap/plugin.py#L185

Initializing Ugm-object eventually calls context._load_keys in

https://github.com/bluedynamics/node.ext.ldap/blob/master/src/node/ext/ldap/ugm/_api.py#L442

and that triggers non-paged search for all the available users at

https://github.com/bluedynamics/node.ext.ldap/blob/master/src/node/ext/ldap/_node.py#L599

And once I paged that search, it took about 15+ seconds for our 90 000
principals. Per request.

What am I missing? Should context._load_keys really load all the
available users, or have we just managed to misconfigure the plugin somehow?

Thanks!

Reply to this email directly or view it on GitHub
#4.

@rnixx

This comment has been minimized.

Copy link
Member

@rnixx rnixx commented Feb 23, 2015

@rnixx

This comment has been minimized.

Copy link
Member

@rnixx rnixx commented Apr 27, 2015

The underlying problem of loading all keys is basically solved in https://github.com/bluedynamics/node.ext.ldap/tree/performance. It's not completely finished yet, but some might have time and motivation creating also a performance branch of p.p.ldap and start playing around with the new node.ext.ldap branch.

cheers

@rnixx

This comment has been minimized.

Copy link
Member

@rnixx rnixx commented Apr 27, 2015

@pilz

This comment has been minimized.

Copy link
Member

@pilz pilz commented Apr 29, 2015

@rnixx thanks, very neat!! Will soon look into that!

@jensens

This comment has been minimized.

Copy link
Member

@jensens jensens commented Apr 29, 2015

i already created a performance branch and at a first look all works using it.
i did not checked user listing related features, i'am sure there is some work left.
but overall it works much better, even w/o any caching its fast.

@rnixx

This comment has been minimized.

Copy link
Member

@rnixx rnixx commented Jan 2, 2016

node.ext.ldap 1.0b1 is now released which solves this issue.

@rnixx rnixx closed this Jan 2, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
8 participants
You can’t perform that action at this time.