ShihabYasin/django-performance-measure
Django Performance Testing Automation

Find the code in the ShihabYasin/django-performance-measure repository.

N+1 Queries

Say, for example, you're working with a Django application that has the following models:

```python
# courses/models.py

from django.db import models


class Author(models.Model):
    name = models.CharField(max_length=100)

    def __str__(self):
        return self.name


class Course(models.Model):
    title = models.CharField(max_length=100)
    author = models.ForeignKey(Author, on_delete=models.CASCADE)

    def __str__(self):
        return self.title
```

Now, if you're tasked with creating a new view for returning a JSON response of all courses with the title and author name, you could write the following code:

```python
# courses/views.py

from django.http import JsonResponse

from courses.models import Course


def all_courses(request):
    queryset = Course.objects.all()

    courses = []
    for course in queryset:
        courses.append(
            {"title": course.title, "author": course.author.name}
        )

    return JsonResponse(courses, safe=False)
```

This code will work, but it's very inefficient since it makes far too many database queries:

  • 1 query to obtain all the courses
  • N queries to obtain the author in each iteration

Before addressing this, let's look at just how many queries are made and measure the execution time.
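To get a feel for the cost, the same 1 + N pattern can be reproduced outside Django with a standalone sqlite3 sketch. The tables, the seed data, and the query-counting `run` helper below are hypothetical stand-ins for the project, not part of it:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
query_count = 0


def run(sql, params=()):
    """Execute a query and count it, loosely mimicking connection.queries."""
    global query_count
    query_count += 1
    return conn.execute(sql, params)


conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
""")
for i in range(10):
    conn.execute("INSERT INTO author (id, name) VALUES (?, ?)", (i, f"Author {i}"))
for i in range(100):
    conn.execute(
        "INSERT INTO course (id, title, author_id) VALUES (?, ?, ?)",
        (i, f"Course {i}", i % 10),
    )

# N+1: one query for all courses, then one query per course for its author
courses = []
for course_id, title, author_id in run("SELECT id, title, author_id FROM course").fetchall():
    (author_name,) = run("SELECT name FROM author WHERE id = ?", (author_id,)).fetchone()
    courses.append({"title": title, "author": author_name})

print(query_count)  # 101 queries: 1 + N
```

With 100 courses, the loop alone issues 100 extra queries on top of the initial one.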

Metrics Middleware

You'll notice that the project includes custom middleware that calculates and logs the execution time of each request:

```python
# core/middleware.py

import logging
import time

from django.db import connection, reset_queries


def metric_middleware(get_response):
    def middleware(request):
        reset_queries()

        # Get beginning stats
        start_queries = len(connection.queries)
        start_time = time.perf_counter()

        # Process the request
        response = get_response(request)

        # Get ending stats
        end_time = time.perf_counter()
        end_queries = len(connection.queries)

        # Calculate stats
        total_time = end_time - start_time
        total_queries = end_queries - start_queries

        # Log the results
        logger = logging.getLogger("debug")
        logger.debug(f"Request: {request.method} {request.path}")
        logger.debug(f"Number of Queries: {total_queries}")
        logger.debug(f"Total time: {(total_time):.2f}s")

        return response

    return middleware
```
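The middleware above is the standard decorator pattern applied to a request handler: record stats, call the wrapped callable, record stats again, log the difference. The same bookkeeping can be sketched in plain Python; the `metric_wrapper` name and the string "request" here are made up for illustration:

```python
import time


def metric_wrapper(get_response):
    """Wrap any callable the way metric_middleware wraps get_response."""
    def wrapper(request):
        start_time = time.perf_counter()            # beginning stats
        response = get_response(request)            # process the "request"
        total_time = time.perf_counter() - start_time  # ending stats
        print(f"Request: {request} | Total time: {total_time:.2f}s")
        return response
    return wrapper


handler = metric_wrapper(lambda request: f"response to {request}")
result = handler("GET /courses/")
print(result)  # -> "response to GET /courses/"
```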

Run the database seed command to add 10 authors and 100 courses to the database:

```sh
$ python manage.py seed_db
```
    

With the Django development server up and running, navigate to http://localhost:8000/courses/ in your browser. You should see the JSON response. Back in your terminal, take note of the metrics:

```
Request: GET /courses/
Number of Queries: 101
Total time: 0.10s
```

That's a lot of queries! This is very inefficient: each additional course added requires yet another database query, so performance will continue to degrade as the database grows. Fortunately, the fix is quite simple: add the select_related method to create a SQL JOIN that includes the authors in the initial database query.

```python
queryset = Course.objects.select_related("author").all()
```

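Under the hood, select_related turns the per-row lookups into a single SQL JOIN. A standalone sqlite3 sketch (hypothetical tables and rows mirroring the models, not the project's actual schema) shows the same course/author data coming back in one query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
    INSERT INTO author VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO course VALUES (1, 'Django', 1), (2, 'SQL', 2);
""")

# One query, roughly the shape select_related("author") generates
rows = conn.execute("""
    SELECT course.title, author.name
    FROM course
    INNER JOIN author ON author.id = course.author_id
    ORDER BY course.id
""").fetchall()

courses = [{"title": title, "author": author} for title, author in rows]
print(courses)
```

One round trip to the database, regardless of how many courses there are.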
Before making any code changes, let's first start with some tests.

Performance Tests

Start with the following test, which uses the django_assert_num_queries pytest fixture to ensure that the database is hit only once when one or more author and course records are present in the database:

```python
import json

import pytest
from django.test import override_settings
from faker import Faker

from courses.models import Author, Course


@pytest.mark.django_db
def test_number_of_sql_queries_all_courses(client, django_assert_num_queries):
    fake = Faker()

    author_name = fake.name()
    author = Author(name=author_name)
    author.save()
    course_title = fake.sentence(nb_words=4)
    course = Course(title=course_title, author=author)
    course.save()

    with django_assert_num_queries(1):
        res = client.get("/courses/")
        data = json.loads(res.content)

    assert res.status_code == 200
    assert len(data) == 1

    author_name = fake.name()
    author = Author(name=author_name)
    author.save()
    course_title = fake.sentence(nb_words=4)
    course = Course(title=course_title, author=author)
    course.save()

    res = client.get("/courses/")
    data = json.loads(res.content)

    assert res.status_code == 200
    assert len(data) == 2
```

Not using pytest? Use the assertNumQueries test method in place of django_assert_num_queries.
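Both tools boil down to counting the queries executed inside a with block and failing if the count is off. A rough pure-Python imitation of that idea, where the `execute` stub and `executed` list are invented stand-ins rather than Django APIs:

```python
from contextlib import contextmanager

executed = []  # stand-in for connection.queries


def execute(sql):
    """Pretend to run a query; just record it."""
    executed.append(sql)


@contextmanager
def assert_num_queries(expected):
    """Fail if the with block runs a different number of queries."""
    start = len(executed)
    yield
    actual = len(executed) - start
    assert actual == expected, f"expected {expected} queries, got {actual}"


with assert_num_queries(1):
    execute("SELECT course.title, author.name FROM course JOIN author ...")
```

A second `execute` call inside the block would trip the assertion, which is exactly how an N+1 regression surfaces in the real test.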

What's more, we can also use nplusone to prevent the introduction of future N+1 queries. After installing the package and adding it to the settings file, you can add it to your tests with the @override_settings decorator:

```python
...
@pytest.mark.django_db
@override_settings(NPLUSONE_RAISE=True)
def test_number_of_sql_queries_all_courses(client, django_assert_num_queries):
...
```

Or, if you'd like to enable nplusone automatically across the entire test suite, add the following to your test root conftest.py file:

```python
from django.conf import settings


def pytest_configure(config):
    settings.NPLUSONE_RAISE = True
```

Hop back to the sample app and run the tests. You should see the following error:

```
nplusone.core.exceptions.NPlusOneError: Potential n+1 query detected on `Course.author`
```

Now, make the recommended change -- adding the select_related method -- and then run the tests again. They should now pass.