Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try to prevent millions of queries #25

Closed
wants to merge 1 commit into from
Closed

Conversation

ivissani
Copy link

Permission check was slowing down django admin to the point that it was unusable.
I transformed _get_tenant_perms into a @cached_property to try to prevent literally thousands of repeated queries.
Let me know what you think. I didn't give too much thought to whether this can introduce any problems somewhere else.

@Viatrak
Copy link
Contributor

Viatrak commented Nov 16, 2018

Thanks for contributing this. It seems like a good idea. I need to think about the implications a bit.

@Viatrak
Copy link
Contributor

Viatrak commented Jan 20, 2019

If get_tenant_perms is a cached_property, what is the benefit in making is_staff and is_superuser cached_property?

@Viatrak
Copy link
Contributor

Viatrak commented Jan 20, 2019

If the current tenant gets switched while an instance of a user exists this cached property will end up reflecting the permissions of the wrong tenant. It needs to be reset/cleared if the tenant is changed. See https://docs.djangoproject.com/en/2.2/ref/utils/#django.utils.functional.cached_property where they use del to reset the cached property. Thoughts?

@ivissani
Copy link
Author

If get_tenant_perms is a cached_property, what is the benefit in making is_staff and is_superuser cached_property?

Probably none. We should get rid of this and let _get_tenant_perms remain as @cached_property

@ivissani
Copy link
Author

If the current tenant gets switched while an instance of a user exists this cached property will end up reflecting the permissions of the wrong tenant. It needs to be reset/cleared if the tenant is changed. See https://docs.djangoproject.com/en/2.2/ref/utils/#django.utils.functional.cached_property where they use del to reset the cached property. Thoughts?

I think that we could drop using @cached_property and implement our own caching logic instead. With this we could have the cache indexed by tenant, then, when the _get_tenant_perm property is hit we query (a key in a dictionary) our internal cache to see whether a cached result for the currently active tenant exists. If it does then we return that value, otherwise we hit the database to get the permissions and store them in the cache.

What do you think?

@Viatrak
Copy link
Contributor

Viatrak commented Feb 8, 2019

Yeah I agree. I think implementing our own caching logic that's tenant aware would solve this.

@@ -18,59 +19,60 @@ class Meta:
# permissions matching the current schema, which means that this
# user has no authorization, so we catch this exception and return
# the appropriate False or empty set
@cached_property
def _get_tenant_perms(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit but a property should not have a get prefix imo

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, thanks for pointing that out

@boatcoder
Copy link

When I read the title, I thought it was an exaggeration, but noooooo . It appears it isn't

image

This has been open more than a year and this still isn't merged.... Is there some reason for the delay?

@sshahbaj
Copy link

sshahbaj commented Sep 27, 2020

So I was looking into the source code of cached_property to understand how we can implement our own caching mechanism and from what I can understand is that we will need to implement our own __get__ method and rest of the stuff in cached_property can remain the same. This is what I've come up with so far:

def __get__(self, instance, cls=None):
    if instance is None:
        return self

    current_schema = connection.get_schema()

    if current_schema not in instance.__dict__.keys():
        instance.__dict__[current_schema] = {}
        res = instance.__dict__[current_schema][self.name] = self.func(instance)
        return res

    elif self.name not in instance.__dict__[current_schema].keys():
        res = instance.__dict__[current_schema][self.name] = self.func(instance)
        return res

    res = instance.__dict__[current_schema][self.name]
    return res

My approach is the check if the current schema is present in __dict__ of the instance and if its not the to create a dict inside with it's key as the current schema and within that dict store the return value of the decorated function.

If the current schema is present then check if method that cached_property is decorating is present within the dict. If it is not present then store the return value of the decorated function.

If both the schema value and the name of the decorated function is present then just return the stored value. Below is an example:

class Foo(object):

    @tenant_cached_property
    def bar_a(self):
        return random.randint(1, 20)

    @tenant_cached_property
    def bar_b(self):
        return random.randint(51, 70)

connection.set_schema('public')

obj = Foo()
print(obj.bar_a)
print(obj.bar_b)

print(obj.__dict__)  # Data is stored in cache

print(obj.bar_a)  # Data is returned from the cache
print(obj.bar_b)  # Data is returned from the cache'

print('###########################################')

connection.set_schema('thor_1589546057') # Simulate schema change

print(obj.bar_a)  # New value is returned and then stored.
print(obj.bar_b)  # New value is returned and then stored.

print(obj.__dict__)  # Confirm that the data is cached within the respective schemas.

Here is the output:

7
57
{'public': {'bar_a': 7, 'bar_b': 57}}
7
57
###########################################
11
67
{'public': {'bar_a': 7, 'bar_b': 57}, 'thor_1589546057': {'bar_a': 11, 'bar_b': 67}}

The problem with above approach is that when you try to do del obj.bar_a then it throws an AttributeError because bar_a was never present in __dict__ of the obj.

I would appreciate any inputs on this. Let me know what you guys think of it.

@foarsitter
Copy link

Looks good to me, but i'm not in the position to test it this week.

My proposal is to to turn the keys current_schema and self.name in __dict__. This fixes the del obj.bar_a since bar_a is present in the dictionary, if i'm correct?

Give it a go and write a pull-request with a testcase like you did with the print statements. Then we have a nice caching solution!

@Viatrak
Copy link
Contributor

Viatrak commented Sep 30, 2020

@sshahbaj @foarsitter Thanks for the contributions. This looks like a good potential solution that addresses the underlying tenant switching / caching problem identified above that prevented this merge originally.

@ivissani @philippbosch Do you have time to weigh in on these ideas / review / test?

@sshahbaj
Copy link

sshahbaj commented Oct 2, 2020

My proposal is to to turn the keys current_schema and self.name in dict. This fixes the del obj.bar_a since bar_a is present in the dictionary, if i'm correct?

@foarsitter That's the catch actually. The cached_property's __get__ works by adding self.name into the __dict__ of the instance with it's value as the data returned by the decorated function. What this does is that when you first access the property it hits the cached_propery's __get__ method and after that any subsequent calls DO NOT hit the __get__ method because it has been replaced. Check this docstring from the source code.

Call the function and put the return value in instance.__dict__ so that subsequent attribute access on the instance returns the cached value instead of calling cached_property.__get__().

But in our case we need that __get__ to be hit so that we can determine for which tenant the data needs to be returned.

@philippbosch
Copy link
Contributor

philippbosch commented Apr 9, 2021

@ivissani @philippbosch Do you have time to weigh in on these ideas / review / test?

@Viatrak I think this is the right approach. I created a new PR #69 based on the work of @Dresdn and @sshahbaj which IMHO is ready to merge. I'm running the code in production already, and the performance enhancements are way bigger than expected. I always thought my own code was slow and started adding caching here and there with little effect. Now with the caching in django-tenant-users I removed all my own caching, and my app is blazing fast. 🎉

@Viatrak Viatrak closed this in #69 Apr 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants