DRIVERS-2917 - Standardized Performance Testing of ODMs and Integrations #366
base: main
Conversation
In general, awesome! Re:

> Can this be integrated with existing tests such that some established testing conventions are followed? For example, …

IOW, if the …

Running the performance tests alongside the Django tests would add too much runtime to be practical. Putting the … Is there a way to independently run tests on the …

I'm inclined to agree that the performance tests should be in their own directory. Probably integrating with Django's …
tests/performance/perftest/tests.py
Outdated
```python
class TestSmallFlatDocCreation(SmallFlatDocTest, TestCase):
    def do_task(self):
        for doc in self.documents:
            model = SmallFlatModel(**doc)
            model.save()

    def after(self):
        SmallFlatModel.objects.all().delete()
```
I'm a little uncertain how this whole thing fits together. On the one hand, you inherit `django.test.TestCase`, which has its own setUp/tearDown strategy; on the other hand, you seem to implement your own scheme with an `after()` method. Some classes use `after()`, others `tearDown()`... is it intentional? (edit: perhaps answered by the benchmark spec.)
The ODM benchmark spec (draft PR here: mongodb/specifications#1828) is helpful here, but the short answer is yes, we need the ability to do setup or teardown both before and after every iteration of a test as well as the entire set of iterations.
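The lifecycle described above can be sketched in plain Python. This is only an illustration of the before/after-per-iteration plus once-per-run hooks the spec calls for; all names here are illustrative, not the actual harness API:

```python
import time

class Benchmark:
    """Sketch of the benchmark lifecycle: one-time set_up/tear_down
    around the whole run, plus per-iteration before()/after() hooks.
    Names are illustrative, not the real harness API."""

    max_iterations = 3

    def set_up(self):        # once, before all iterations
        pass

    def before(self):        # before each iteration (untimed)
        pass

    def do_task(self):       # the timed body of the benchmark
        raise NotImplementedError

    def after(self):         # after each iteration (untimed)
        pass

    def tear_down(self):     # once, after all iterations
        pass

    def run(self):
        timings = []
        self.set_up()
        try:
            for _ in range(self.max_iterations):
                self.before()
                start = time.monotonic()
                self.do_task()
                timings.append(time.monotonic() - start)
                self.after()
        finally:
            self.tear_down()
        return timings
```

Only `do_task()` is timed, so per-iteration cleanup in `after()` never pollutes the measurement the way `tearDown()`-style hooks interleaved with Django's test runner might.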
tests/performance/perftest/tests.py
Outdated
```python
class TestSmallFlatDocCreation(SmallFlatDocTest, TestCase):
    def do_task(self):
        for doc in self.documents:
            model = SmallFlatModel(**doc)
```
Typically for this pattern you would use `Model.objects.create()`, which does the same without a need to call `save()`.
Would the typical Django user be aware of that pattern? It's important to align our benchmark code with how the average user would expect to use the backend.
Yep, it's a common pattern for users to call `Model.objects.create()`.
tests/performance/perftest/tests.py
Outdated
```python
# Copyright 2025-present MongoDB, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
```
This is the same license as the rest of the repo. Not sure if we're supposed to put this boilerplate on all our files?
@Jibola do we have a different policy for integrations such as this? PyMongo has this license in every single file.
Since this performance test is within the repo and there's no outside dependency, it's fine to remove it.
```json
{
  "field1": "kj9$mxz#p2qw8r*vn4@h7c&u1s",
  "field2": "x3@9zf#mk7w$qp2n8v*r6j4h&c1u5s0a",
  "field3": "p9#m2x$k7z@w3q8v*n4r&j6h1c5u0s",
  "field4": "z8@x3m#k9w$p2q7v*r4n6j&h1c5u0s",
  "field5": "m7#k9x$z3w@p8q2v*n4r6j&h1c5u0s",
  "field6": "k6$x9m#z7w3p@q8v2n*r4j6h&c1u5s0",
  "field7": "x5@m9k#z6w$p3q7v*n2r8j&h4c1u0s",
  "field8": "m4#x8k$z9w6p@q3v*n7r2j&h5c1u0s",
  "field9": "k3$m7x#z8w9p@q6v*n2r4j&h1c5u0s",
  "field10": "x2@k6m#z7w8p$q9v*n3r1j&h4c5u0s",
  "field11": "m1#x5k$z6w7p@q8v*n9r2j&h3c4u0s",
```
The "Django-ey" solution is to use fixtures, but I don't insist on it. It might be more trouble than it's worth, especially if Django's `TestCase` machinery isn't actually running (see other comment).
The intended workflow here is that the tests will clone the specifications repo and extract these datasets for each run of the suite. The datasets are used across all of the implementing ODMs, so using a Django-specific feature like a fixture seems out-of-scope here (despite looking nice and clean).
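Since the datasets are plain JSON cloned from the specifications repo, a framework-agnostic loader is only a few lines of stdlib code. This is a sketch: the one-document-per-line file layout and the function name are assumptions, and the byte size is captured because the benchmarks score throughput in MB/s:

```python
import json
from pathlib import Path

def load_documents(path):
    """Load one JSON document per line (assumed dataset layout) and
    report the raw byte size used for the MB/s score."""
    raw = Path(path).read_text(encoding="utf-8")
    documents = [json.loads(line) for line in raw.splitlines() if line.strip()]
    return documents, len(raw.encode("utf-8"))
```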
```python
"""Tests for the MongoDB ODM Performance Benchmark Spec.
See https://github.com/mongodb/specifications/blob/master/source/benchmarking/odm-benchmarking.md
```
For other reviewers: this file is not yet merged, see https://github.com/mongodb/specifications/pull/1828/files#diff-a9fdadbc97ab19c91a9b780e5a14287a9ef83485e36916c93a0b6e977e501871
tests/performance/perftest/tests.py
Outdated
```python
    def after(self):
        SmallFlatModel.objects.all().delete()
```
I'm a little uncertain how this whole thing fits together. On the one hand, you inherit `django.test.TestCase`, which has its own setUp/tearDown strategy; on the other hand, you seem to implement your own scheme with an `after()` method. Reading the spec, it looks like some `after()` methods should be `tearDown()`. In this case, if you delete the objects after the first iteration, future iterations are actually creating new models rather than updating existing ones. Check your other tests too!
Good catch! Definitely a bug here. I'll have to modify the benchmark to ensure that each iteration's `updated_value` is unique so the database actually performs an update operation.
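One way to guarantee a distinct value per iteration is a monotonically increasing suffix. This is a hypothetical sketch of the idea only; the real benchmark's handling of `updated_value` may differ:

```python
from itertools import count

# Module-level counter so every call yields a fresh value; this forces
# the database to perform a real update rather than a no-op write.
_counter = count()

def next_updated_value(base="updated"):
    """Return a value that differs on every call, e.g. 'updated-0'."""
    return f"{base}-{next(_counter)}"
```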
.evergreen/config.yml
Outdated
```yaml
include_expansions_in_env:
  - requester
  - revision_order_id
  - project_id
  - version_id
  - build_variant
  - parsed_order_id
  - task_name
  - task_id
  - execution
  - is_mainline
```
Evergreen nit: for consistency, can these be written as an array?
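If the suggestion is a flow-style sequence, the same list can be written on one line; both forms are equivalent YAML sequences, so this is just a style choice:

```yaml
include_expansions_in_env: [requester, revision_order_id, project_id, version_id,
  build_variant, parsed_order_id, task_name, task_id, execution, is_mainline]
```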
```shell
# Install django-mongodb-backend
/opt/python/3.10/bin/python3 -m venv venv
. venv/bin/activate
```
Nit: whilst I'd rather not add a new dependency on drivers-evergreen-tools, to future-proof the binary usage you could add in the `find_python` function.
This was copied over from the existing run-tests.sh script here. Let's update them both in a separate ticket.
.evergreen/run_perf_test.py
Outdated
```python
    "end": int(end_time.timestamp()),
    "elapsed": elapsed_secs,
}
report = {"failures": 0, "results": [results]}
```
If we expect no failures, what's the point of providing the key? Is it just required to be passed?
I believe so, I'll remove it and see.
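The experiment is easy to stage by making the key optional when building the envelope. This sketch follows the shape visible in the excerpt above; anything beyond that (e.g. what the `results` entries contain) is an assumption:

```python
import json

def build_report(results, include_failures=True):
    """Build the report envelope from the excerpt above. Whether
    Evergreen actually requires the "failures" key is the open question;
    flipping include_failures lets us test both variants."""
    report = {"results": [results]}
    if include_failures:
        report["failures"] = 0
    return json.dumps(report)
```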
```python
# Always omit the performance benchmarking suite.
if x.name != "gis_tests_" and x.name != "performance"
```
💯
```
@@ -0,0 +1 @@
{"field1":"miNVpaKW","field2":"CS5VwrwN","field3":"Oq5Csk1w","field4":"ZPm57dhu","field5":"gxUpzIjg","field6":"Smo9whci","field7":"TW34kfzq","field8":55336395,"field9":41992681,"field10":72188733,"field11":46660880,"field12":3527055,"field13":74094448}
```
Can this get formatted?
This file won't be in the repo. It'll live entirely in the specifications repo and be cloned for each run of the benchmark. It's included here for ease of review.
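If formatting were ever wanted for local inspection, a stdlib round-trip is enough (the helper name here is hypothetical):

```python
import json

def pretty(raw: str) -> str:
    """Reformat a single-line JSON document with two-space indentation."""
    return json.dumps(json.loads(raw), indent=2)
```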
```python
name = self.__class__.__name__[4:]
median = self.percentile(50)
megabytes_per_sec = self.data_size / median / 1000000
print(  # noqa: T201
```
Can we use `LOGGER` here?
`logging.info` doesn't play nicely with tests here. For perf tests I don't think it's worth the effort to fix.
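The score in the excerpt is straightforward to express with the stdlib. A sketch, noting that `percentile(50)` is simply the median of the per-iteration timings:

```python
import statistics

def megabytes_per_sec(data_size_bytes, timings_secs):
    """Score as in the excerpt: dataset size divided by the median
    (50th-percentile) wall-clock time, reported in MB/s."""
    median = statistics.median(timings_secs)
    return data_size_bytes / median / 1_000_000
```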
tests/performance/perftest/tests.py
Outdated
```python
for doc in self.documents:
    model = SmallFlatModel(**doc)
    model.save()
    self.ids.append(model.id)
```
Since creation isn't part of benchmark time, can we place this in a bulk_create or a threadpool to reduce the time consumption?
Using `bulk_create` wherever possible makes sense.
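The win from `bulk_create` is that rows are grouped into far fewer round trips. The chunking idea in isolation, as a stdlib sketch (names hypothetical):

```python
def batched(items, size):
    """Yield fixed-size slices of items, so each slice can become a
    single bulk insert rather than one round trip per row."""
    for i in range(0, len(items), size):
        yield items[i:i + size]
```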
tests/performance/perftest/tests.py
Outdated
```python
for doc in self.documents:
    model = SmallFlatModel(**doc)
    model.save()
self.models = list(SmallFlatModel.objects.all())
```
`self.models.append(model)` should suffice, especially if you use `Model.objects.create`. That removes the need for a database call.
Purpose
This is the first draft of the Django implementation of the MongoDB ODM Benchmarking suite. It contains a small suite of benchmark tests designed to measure django-mongodb-backend performance across data sizes and structures for what we expect to be common user operations.
This is NOT intended to be a comprehensive test suite for every operation, only the most common and widely applicable ones. It is also not intended to be Django-specific: each of these tests must be implementable across all of our ODMs, up to the features supported by each library.
Structure
The benchmarks are contained within a separate Django application in tests/performance/perftest. This application exists only to execute the benchmarking suite.
Running Locally
After starting a local MongoDB server, the tests can be run with `python manage.py test` from the tests/performance directory. The full suite run locally is expected to take approximately 10 minutes. For faster benchmarks, pass `FASTBENCH=1` as an environment variable to tests/performance/perftest/tests.py.
Review Items