ShihabYasin/django-performance-measure
Django Performance Testing Automation

Find the code in the ShihabYasin/django-performance-measure repository.

N+1 Queries

Say, for example, you're working with a Django application that has the following models:

```python
# courses/models.py

from django.db import models


class Author(models.Model):
    name = models.CharField(max_length=100)

    def __str__(self):
        return self.name


class Course(models.Model):
    title = models.CharField(max_length=100)
    author = models.ForeignKey(Author, on_delete=models.CASCADE)

    def __str__(self):
        return self.title
```

Now, if you're tasked with creating a new view for returning a JSON response of all courses with the title and author name, you could write the following code:

```python
# courses/views.py

from django.http import JsonResponse

from courses.models import Course


def all_courses(request):
    queryset = Course.objects.all()

    courses = []
    for course in queryset:
        courses.append(
            {"title": course.title, "author": course.author.name}
        )

    return JsonResponse(courses, safe=False)
```

This code will work, but it's very inefficient since it makes far too many database queries:

  • 1 query to obtain all the courses
  • N queries to obtain the author in each iteration

Before addressing this, let's look at just how many queries are made and measure the execution time.
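To get a feel for the cost, the same 1 + N pattern can be reproduced outside Django with a standalone sqlite3 sketch. The tables, the seed data, and the query-counting `run` helper below are hypothetical stand-ins for the project, not part of it:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
query_count = 0


def run(sql, params=()):
    """Execute a query and count it, loosely mimicking connection.queries."""
    global query_count
    query_count += 1
    return conn.execute(sql, params)


conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
""")
for i in range(10):
    conn.execute("INSERT INTO author (id, name) VALUES (?, ?)", (i, f"Author {i}"))
for i in range(100):
    conn.execute(
        "INSERT INTO course (id, title, author_id) VALUES (?, ?, ?)",
        (i, f"Course {i}", i % 10),
    )

# N+1: one query for all courses, then one query per course for its author
courses = []
for course_id, title, author_id in run("SELECT id, title, author_id FROM course").fetchall():
    (author_name,) = run("SELECT name FROM author WHERE id = ?", (author_id,)).fetchone()
    courses.append({"title": title, "author": author_name})

print(query_count)  # 101 queries: 1 + N
```

With 100 courses, the loop alone issues 100 extra queries on top of the initial one.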

Metrics Middleware

You'll notice that the project includes custom middleware that calculates and logs the execution time of each request:

```python
# core/middleware.py

import logging
import time

from django.db import connection, reset_queries


def metric_middleware(get_response):
    def middleware(request):
        reset_queries()

        # Get beginning stats
        start_queries = len(connection.queries)
        start_time = time.perf_counter()

        # Process the request
        response = get_response(request)

        # Get ending stats
        end_time = time.perf_counter()
        end_queries = len(connection.queries)

        # Calculate stats
        total_time = end_time - start_time
        total_queries = end_queries - start_queries

        # Log the results
        logger = logging.getLogger("debug")
        logger.debug(f"Request: {request.method} {request.path}")
        logger.debug(f"Number of Queries: {total_queries}")
        logger.debug(f"Total time: {(total_time):.2f}s")

        return response

    return middleware
```
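The middleware above is the standard decorator pattern applied to a request handler: record stats, call the wrapped callable, record stats again, log the difference. The same bookkeeping can be sketched in plain Python; the `metric_wrapper` name and the string "request" here are made up for illustration:

```python
import time


def metric_wrapper(get_response):
    """Wrap any callable the way metric_middleware wraps get_response."""
    def wrapper(request):
        start_time = time.perf_counter()            # beginning stats
        response = get_response(request)            # process the "request"
        total_time = time.perf_counter() - start_time  # ending stats
        print(f"Request: {request} | Total time: {total_time:.2f}s")
        return response
    return wrapper


handler = metric_wrapper(lambda request: f"response to {request}")
result = handler("GET /courses/")
print(result)  # -> "response to GET /courses/"
```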

Run the database seed command to add 10 authors and 100 courses to the database:

```sh
$ python manage.py seed_db
```
    

With the Django development server up and running, navigate to http://localhost:8000/courses/ in your browser. You should see the JSON response. Back in your terminal, take note of the metrics:

```
Request: GET /courses/
Number of Queries: 101
Total time: 0.10s
```

That's a lot of queries! This is very inefficient: each additional course added requires yet another database query, so performance will continue to degrade as the database grows. Fortunately, the fix is quite simple: add the select_related method to create a SQL JOIN that includes the authors in the initial database query.

```python
queryset = Course.objects.select_related("author").all()
```

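Under the hood, select_related turns the per-row lookups into a single SQL JOIN. A standalone sqlite3 sketch (hypothetical tables and rows mirroring the models, not the project's actual schema) shows the same course/author data coming back in one query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
    INSERT INTO author VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO course VALUES (1, 'Django', 1), (2, 'SQL', 2);
""")

# One query, roughly the shape select_related("author") generates
rows = conn.execute("""
    SELECT course.title, author.name
    FROM course
    INNER JOIN author ON author.id = course.author_id
    ORDER BY course.id
""").fetchall()

courses = [{"title": title, "author": author} for title, author in rows]
print(courses)
```

One round trip to the database, regardless of how many courses there are.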
Before making any code changes, let's first start with some tests.

Performance Tests

Start with the following test, which uses the django_assert_num_queries pytest fixture to ensure that the database is hit only once when one or more author and course records are present in the database:

```python
import json

import pytest
from django.test import override_settings
from faker import Faker

from courses.models import Author, Course


@pytest.mark.django_db
def test_number_of_sql_queries_all_courses(client, django_assert_num_queries):
    fake = Faker()

    author_name = fake.name()
    author = Author(name=author_name)
    author.save()
    course_title = fake.sentence(nb_words=4)
    course = Course(title=course_title, author=author)
    course.save()

    with django_assert_num_queries(1):
        res = client.get("/courses/")
        data = json.loads(res.content)

    assert res.status_code == 200
    assert len(data) == 1

    author_name = fake.name()
    author = Author(name=author_name)
    author.save()
    course_title = fake.sentence(nb_words=4)
    course = Course(title=course_title, author=author)
    course.save()

    res = client.get("/courses/")
    data = json.loads(res.content)

    assert res.status_code == 200
    assert len(data) == 2
```

Not using pytest? Use the assertNumQueries test method in place of django_assert_num_queries.
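Both tools boil down to counting the queries executed inside a with block and failing if the count is off. A rough pure-Python imitation of that idea, where the `execute` stub and `executed` list are invented stand-ins rather than Django APIs:

```python
from contextlib import contextmanager

executed = []  # stand-in for connection.queries


def execute(sql):
    """Pretend to run a query; just record it."""
    executed.append(sql)


@contextmanager
def assert_num_queries(expected):
    """Fail if the with block runs a different number of queries."""
    start = len(executed)
    yield
    actual = len(executed) - start
    assert actual == expected, f"expected {expected} queries, got {actual}"


with assert_num_queries(1):
    execute("SELECT course.title, author.name FROM course JOIN author ...")
```

A second `execute` call inside the block would trip the assertion, which is exactly how an N+1 regression surfaces in the real test.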

What's more, we can also use nplusone to prevent the introduction of future N+1 queries. After installing the package and adding it to the settings file, you can add it to your tests with the @override_settings decorator:

```python
...
@pytest.mark.django_db
@override_settings(NPLUSONE_RAISE=True)
def test_number_of_sql_queries_all_courses(client, django_assert_num_queries):
...
```

Or, if you'd like to enable nplusone automatically across the entire test suite, add the following to your test root conftest.py file:

```python
from django.conf import settings


def pytest_configure(config):
    settings.NPLUSONE_RAISE = True
```

Hop back to the sample app and run the tests. You should see the following error:

```
nplusone.core.exceptions.NPlusOneError: Potential n+1 query detected on `Course.author`
```

Now, make the recommended change -- adding the select_related method -- and then run the tests again. They should now pass.