Add developer documentation for benchmarking #11122

Merged: 12 commits into rapidsai:branch-22.10 on Aug 5, 2022

Conversation

@vyasr (Contributor) commented Jun 17, 2022

This PR documents best practices for writing cuDF Python benchmarks. It includes an overview of the various fixtures provided by our benchmarking suite to all benchmarks and indicates how best to make use of them. It also discusses the various features of our benchmarking suite (including easy comparison to pandas and running in CI) and what developers must do to maintain compatibility with those features.

A PR to incorporate the cudf_benchmarks repo into cudf proper is imminent, but this documentation PR can be reviewed (and merged) independently.

@vyasr vyasr added the 3 - Ready for Review, doc, Python, and improvement labels Jun 17, 2022
@vyasr vyasr added this to PR-WIP in v22.08 Release via automation Jun 17, 2022
@vyasr vyasr self-assigned this Jun 17, 2022
@github-actions github-actions bot removed the Python label Jun 17, 2022
@vyasr vyasr added the non-breaking label and removed the improvement label Jun 17, 2022
@vyasr vyasr requested review from shwina and isVoid June 24, 2022 00:27
rapids-bot bot pushed a commit that referenced this pull request Jun 27, 2022
This PR ports the benchmarks in https://github.com/vyasr/cudf_benchmarks, adding official benchmarks to the repository. The new benchmarks are designed from the ground up to make the best use of pytest, pytest-benchmark, and pytest-cases to simplify writing and maintaining benchmarks. Extended discussions of various previous design questions may be found on [the original repo](https://github.com/vyasr/cudf_benchmarks). Reviewers may also benefit from reviewing the companion PR creating documentation for how to write benchmarks, #11122.

Tests will not pass here until rapidsai/integration#492 is merged.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Bradley Dice (https://github.com/bdice)
  - Michael Wang (https://github.com/isVoid)
  - GALI PREM SAGAR (https://github.com/galipremsagar)
  - Matthew Roeschke (https://github.com/mroeschke)

URL: #11125
@wence- (Contributor) left a comment

Minor queries

docs/cudf/source/developer_guide/benchmarking.md (review comments, resolved)
codecov bot commented Jul 18, 2022

Codecov Report

❗ No coverage uploaded for pull request base (branch-22.10@acadcf2).
The diff coverage is n/a.

@@               Coverage Diff               @@
##             branch-22.10   #11122   +/-   ##
===============================================
  Coverage                ?   86.47%           
===============================================
  Files                   ?      144           
  Lines                   ?    22856           
  Branches                ?        0           
===============================================
  Hits                    ?    19765           
  Misses                  ?     3091           
  Partials                ?        0           


@wence- (Contributor) left a comment

Some more minor comments, but overall looks good I think.

docs/cudf/source/developer_guide/benchmarking.md (review comments, resolved)
@vyasr vyasr requested review from isVoid and wence- July 22, 2022 21:56
@vyasr (Contributor, Author) commented Jul 22, 2022

@shwina @wence- @isVoid thanks for your patience here. I think that I have addressed all the comments. I left open the few discussions that I thought still required some response. I also adapted what we discussed in #11180 (comment) into the discussion on how to handle parametrization in different scenarios.

Note that there is significant overlap in some parts of this document with discussions on testing. However, since I anticipate this PR being merged before #11199, I am fine with getting this done and then consolidating the work in that PR.


When it comes to parametrizing tests, we have a number of options at our disposal.
One option is fixtures, while a second is using `pytest.mark.parametrize`.
A third option is provided by the [`pytest_cases`](https://smarie.github.io/python-pytest-cases/) `pytest` plugin.
Contributor:

Did we come to a conclusion about whether we want to advocate for the use of cases v/s straight parametrize/fixtures? It's not super clear from the thread. cc: @wence-

Contributor:

We concluded there's a hierarchy of complexity, which is illustrated with some guidance a little bit further down this paragraph. It's not fully cut-and-dried, which is why this guidance is soft rather than hard.

Contributor:

My fear is that unless it's cut-and-dried, most developers would rather not deal with complexity.

If I develop a feature and need to write a test or benchmark for it, I'm going to choose the path of least resistance, which could include:

  • writing as little code as possible
  • touching as few files as possible
  • learning as few new tools as possible

Contributor:

Sorry to pile on this discussion further (and late!).

I think the current set of rules for choosing a method of parametrization is well-written and as concise as it can be.

But, it's quite mind-bending to think about the intersectionality of parameters for someone wanting to develop a feature and write incremental tests or benchmarks for it. Incorporating fixtures into this workflow is easy and familiar as one identifies reusable components between tests. Incorporating cases seems more difficult and really only possible in retrospect - or am I thinking of this all wrong?

Contributor:

I am not really sufficiently familiar with cases to say, but I think that is reasonable. Probably you would start out writing fixtures, and then, as they became more baroque, wonder if there were a better way (which I believe cases offer).

Contributor (Author):

I don't think that anything that you've said is wrong or untrue, but I think you're classifying as a difference in kind something that is really only a difference in degree.

I would argue that incorporating fixtures effectively is also really only possible in retrospect. As our test suite clearly demonstrates, most people's starting point (at least historically) is to write an unparametrized test with some hardcoded objects, then abuse pytest.mark.parametrize to test as many possibilities as you can. Getting everyone to switch to using fixtures already requires some amount of either 1) foresight into what tests you want to write, or 2) refactoring tests in hindsight. Typically the latter occurs as part of PR review, although over the past year or two I think many developers have become more accustomed to thinking in terms of reusable fixtures and reaching for them first.

Cases are just one more step up the complexity ladder. Instead of just thinking about how you might write a bunch of tests that take the exact same arguments, in order to start off with using cases you need to start thinking about tests that might have only partial overlap in parametrization. Otherwise, you refactor tests in hindsight. The review process will train people in those best practices in the same way that we've improved our fixture usage.

> I think the current set of rules for choosing a method of parametrization is well-written and as concise as it can be.

Just to make sure that I understand: am I interpreting this as saying that you're not yet convinced that we should use cases at all?

Contributor:

I did a poor job communicating what changes are required here.

I'm not unilaterally opposed to the use of cases, but I feel I need a better understanding of how much friction, if any, we would be adding to the process of developing benchmarks and - more importantly - tests, if we require them.

I think before moving forward with this recommendation, it'd be great to have a deeper discussion about cases offline. Perhaps with the rest of the team so we can get their feedback as well?

Contributor:

FWIW, bringing my experience from other projects that heavily used parameterized fixtures (but not pytest cases). I think there's a tradeoff between generality (and lack of code repetition) that the more complex approaches bring, and ease of debugging (and extraction) of a particular test when something goes wrong and you need to fix things. Although it is relatively easy to run a single test with pytest test_file::test_name after it fails for debugging purposes, I have often found that what I end up doing is pulling that test out into a single file that I can run with none of the pytest machinery in the way. If it then depends on a bunch of fixtures this is painful because you have to find all those and so forth.

Contributor (Author):

I'll leave this conversation unresolved for now so that it's easy to find when we finally come back to this discussion.

@wence- (Contributor) left a comment

Trivial grammar nit, but otherwise looks good, thanks!

docs/cudf/source/developer_guide/benchmarking.md (review comment, resolved)
@bdice bdice changed the base branch from branch-22.08 to branch-22.10 August 2, 2022 21:43
@bdice bdice removed this from PR-WIP in v22.08 Release Aug 2, 2022
@bdice bdice added this to PR-WIP in v22.10 Release via automation Aug 2, 2022
@vyasr (Contributor, Author) commented Aug 4, 2022

In the interest of getting this PR merged sooner rather than later, I'm removing the discussion of cases for now until we can have a discussion and come to some consensus on how they should be used. I'm going to copy the exact relevant text out of the current document into this comment and then remove it from the doc so that we can merge and then revisit.

Discussion of cases

In the second case, fixtures are really functioning as parameters, which we discuss in the next section.

Parametrization: custom fixtures, pytest-cases and pytest.mark.parametrize

When it comes to parametrizing benchmarks, we have a number of options at our disposal.
One option is fixtures, while a second is using pytest.mark.parametrize.
A third option is provided by the pytest_cases pytest plugin.
Our benchmarks make extensive use of this plugin to handle complex parametrization.
Specifically, it provides some syntactic sugar around

@pytest.mark.parametrize("num", [1, 2, 3])
def bench_foo(benchmark, num):
    # benchmark() takes a callable to time
    benchmark(lambda: num * 2)

for when the parameters are nontrivial and require complex initialization.
This is common for benchmarks of functions accepting cuDF objects, such as cudf.concat.
With pytest_cases, the different cases are instead placed into separate functions and automatically made available.

# bench_foo_cases.py
def case_1():
    return 1

def case_2():
    return 2

def case_3():
    return 3

# bench_foo.py
import pytest_cases

@pytest_cases.parametrize_with_cases("num")
def bench_foo(benchmark, num):
    benchmark(lambda: num * 2)

pytest-cases allows developers to put complex initialization into named, documented functions.
That becomes especially valuable when benchmarking APIs whose performance can vary drastically based on parameters.
Additionally, cases, like fixtures, are lazily evaluated.
Initializing complex objects inside a pytest.mark.parametrize can dramatically slow down test collection,
or even lead to out of memory issues if too many complex cases are collected.
Using lazy case functions ensures that the associated objects are only created on an as-needed basis.
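
As an illustration of the difference (the sizes and operations below are arbitrary sketches rather than benchmarks from the suite, and the cases are assumed to live in the same file as the benchmark), compare an eagerly parametrized benchmark with a lazy, case-based equivalent:

import cudf
import numpy as np
import pytest
import pytest_cases

# Eager: every DataFrame in this list is constructed at collection time,
# even for benchmarks that are deselected or never run.
@pytest.mark.parametrize(
    "df", [cudf.DataFrame({"a": np.arange(n)}) for n in (100, 10_000, 1_000_000)]
)
def bench_sum_eager(benchmark, df):
    benchmark(df.sum)

# Lazy: each case function only runs when a benchmark that uses it executes.
def case_small():
    return cudf.DataFrame({"a": np.arange(100)})

def case_large():
    return cudf.DataFrame({"a": np.arange(1_000_000)})

@pytest_cases.parametrize_with_cases("df", cases=".")
def bench_sum_lazy(benchmark, df):
    benchmark(df.sum)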

When writing cases, just as in writing custom fixtures, developers should make use of the config variables.
Cases should import the NUM_ROWS and/or NUM_COLS variables from the config module and use them to define data sizes.
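
For instance, a case that scales over the configured row counts might look like the following sketch; the import path of the config module shown here is illustrative rather than exact:

import cudf
import numpy as np
import pytest_cases

from config import NUM_ROWS  # illustrative path to the shared config module

@pytest_cases.parametrize("nrows", NUM_ROWS)
def case_default_sizes(nrows):
    # Data sizes come from the shared configuration, not hardcoded constants.
    return cudf.DataFrame({"a": np.arange(nrows)})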

Given the plethora of options for parametrization,
we codify here some best practices for how each should be employed.
In general, these approaches are applicable to parametrizations of different complexity.
For the purpose of this discussion,
we define a "simple parametrization" as parametrization using a list (possibly nested) of primitive objects.
Examples include a list of integers or a list of lists of strings.
This does not include e.g. cuDF or pandas objects.

With that in mind, here are some ground rules for how to parametrize.

Use pytest.mark.parametrize when:

  • One test must be run on many inputs and those inputs are simple to construct.

Use fixtures when:

  • One or more tests must be run on the same set of inputs,
    and all of those inputs can be constructed with simple parametrizations.
    In practice, that means that it is acceptable to use a fixture like this:
        @pytest.fixture(params=['a', 'b'])
        def foo(request):
            if request.param == 'a':
                # Some complex initialization
                ...
            elif request.param == 'b':
                # Some other complex initialization
                ...
    In other words, the construction of the fixture may be complex,
    as long as the parametrization of that construction is simple.

Use pytest_cases.parametrize_with_cases when:

  • One or more tests must be run on the same set of inputs,
    and at least one of those inputs requires complex parametrizations.
  • Given a set of cases, different tests need to run on different subsets with a nonempty intersection (see the sketch below).
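
The last situation might look like the following sketch, assuming cases accepts an explicit list of case functions (all names here are invented for illustration):

import cudf
import pytest_cases

def case_series():
    return cudf.Series([1, 2, 3])

def case_dataframe():
    return cudf.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})

def case_wide_dataframe():
    return cudf.DataFrame({f"col{i}": list(range(10)) for i in range(50)})

# bench_repr runs on all three cases, while bench_transpose only uses the
# DataFrame cases: the two subsets overlap without being identical.
@pytest_cases.parametrize_with_cases(
    "obj", cases=[case_series, case_dataframe, case_wide_dataframe]
)
def bench_repr(benchmark, obj):
    benchmark(repr, obj)

@pytest_cases.parametrize_with_cases(
    "obj", cases=[case_dataframe, case_wide_dataframe]
)
def bench_transpose(benchmark, obj):
    benchmark(lambda: obj.T)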

@vyasr vyasr requested a review from shwina August 4, 2022 21:59
v22.10 Release automation moved this from PR-WIP to PR-Reviewer approved Aug 4, 2022
rapids-bot bot pushed a commit that referenced this pull request Aug 4, 2022
This PR adds a primary developer guide for Python. It provides a more complete and informative landing page for new developers. When #11217, #11199, and #11122 are merged, they will all be linked from this page to provide a complete set of developer documentation.

There is one main point of discussion that I would like reviewer comments on, and that is the section on directory and file organization. How do we want that aspect of cuDF to look?

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Matthew Roeschke (https://github.com/mroeschke)
  - Lawrence Mitchell (https://github.com/wence-)
  - Ashwin Srinath (https://github.com/shwina)

URL: #11235
@vyasr (Contributor, Author) commented Aug 5, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 493d96b into rapidsai:branch-22.10 Aug 5, 2022
v22.10 Release automation moved this from PR-Reviewer approved to Done Aug 5, 2022
@vyasr vyasr deleted the docs/python_benchmarking branch August 5, 2022 15:35
Labels: 3 - Ready for Review, doc, non-breaking
Projects: none open
Linked issues: none
4 participants