
Feature cProfile #59

Merged
merged 8 commits into from
Oct 9, 2016
Conversation

Artimi commented Sep 9, 2016:
Hello again,

this PR adds cProfile support to pytest-benchmark. With cProfile we can see which subfunctions of a benchmarked function are the most demanding. When you run py.test with the --benchmark-cprofile option, you get the top 10 functions for every benchmarked function, as shown here:
[screenshot: pytest_benchmark_cprofile]

These statistics are also saved to storage, so you can compare them later.
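For context, the underlying idea can be sketched with only the stdlib cProfile and pstats modules (the helper name and the single-call profiling are illustrative assumptions, not the plugin's actual internals):

```python
import cProfile
import pstats


def profile_top_functions(func, top_n=10):
    """Profile one call of func and return the top_n hottest rows by cumtime."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    stats = pstats.Stats(profiler)
    # stats.stats maps (filename, lineno, funcname) ->
    #   (callcount, primitive calls, tottime, cumtime, callers)
    rows = list(stats.stats.items())
    rows.sort(key=lambda item: item[1][3], reverse=True)  # index 3 is cumtime
    return rows[:top_n]
```

The cumtime value sits at index 3 of each pstats tuple, which is what the sort key above relies on.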

@@ -168,6 +169,7 @@ def __init__(self, fixture, iterations, options):
self.group = fixture.group
self.param = fixture.param
self.params = fixture.params
self.cprofile_stats = fixture.cprofile_stats

self.iterations = iterations
self.stats = Stats()
ionelmc (Owner):
Now it becomes really obvious that I picked a confusing name for these two stats attributes. If you have naming ideas let me know (eg: what if it were rawstats or data? or maybe rename BenchmarkStats to BenchmarkData, and so on?)

Artimi (Author):

Yes, it's a bit confusing. What about
BenchmarkStats -> BenchmarkMetadata
and
Stats -> BenchmarkStats?
And you can probably keep self.stats.

ionelmc (Owner):

Sounds good, but we should strip the "Benchmark" prefix; there's no sense in having it everywhere (it's obvious the stuff is going to be about benchmarks, right?).

ionelmc (Owner) commented Sep 9, 2016:

This saves the cProfile top ten into the storage, right? I was wondering: what if we saved the full data, so that the user could sort differently later, without re-running?

Artimi (Author) commented Sep 9, 2016:

Yes, it saves the top ten functions according to the current sorting column. I would rather keep it limited, because there can be hundreds to thousands of functions and you are usually not interested in most of them. Also, as we discussed in the team, we mostly use the cumtime column and sometimes maybe tottime, and the top functions of these two columns should be the same most of the time.

ionelmc (Owner) commented Sep 9, 2016:

What if we'd save just the top 10 for all the columns? That can't be that much data.

It's just that I usually have very slow suites, and it's a pain to rerun everything just because you wanted to see a different column. I'd expect it's the same for other people.

Artimi (Author) commented Sep 9, 2016:

OK, I will think about the best way to store all the needed information.

Artimi (Author) commented Sep 12, 2016:

I updated the code so we now store the top ten functions for every column (without redundancy). The list is still ordered by cprofile_sort_by, and functions for the remaining columns are added only if they are not already present in the list. It's not optimal right now because I have to scan the list sequentially, but there are at most 70 dictionaries, so it won't be computationally demanding. With this implementation, if you later choose a different column you will see all the needed results.
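The merging strategy described above could look roughly like this sketch (the column names and dict-based rows are assumptions about the data shape, not the PR's exact code):

```python
def collect_top_functions(cprofile_functions, sort_by="cumtime", top_n=10):
    """Keep the top_n entries for every stats column, without duplicates.

    Entries for the primary sort column come first; entries that already
    made the list via an earlier column are skipped.
    """
    stats_columns = ["ncalls", "tottime", "cumtime"]  # assumed column set
    # Put the primary sort column first so its top entries lead the output.
    columns = [sort_by] + [c for c in stats_columns if c != sort_by]
    selected = []
    for column in columns:
        ranked = sorted(cprofile_functions, key=lambda f: f[column], reverse=True)
        for entry in ranked[:top_n]:
            if entry not in selected:  # linear scan; list stays small (~70 max)
                selected.append(entry)
    return selected
```

The `entry not in selected` check is the sequential scan mentioned in the comment; with at most a few dozen rows it is cheap enough in practice.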

ionelmc (Owner) left a comment:
This is pretty good, there are just a few nits.

@@ -215,6 +215,13 @@ def pytest_addoption(parser):
help="Fail test if performance regresses according to given EXPR"
" (eg: min:5%% or mean:0.001 for number of seconds). Can be used multiple times."
)
group.addoption(
"--benchmark-cprofile",
metavar="COLUMN", default=None,
ionelmc (Owner):
This option should have some validation.
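One way to add that validation (a sketch; the actual option wiring in the plugin may differ) is an argparse type callback that rejects unknown sort columns early:

```python
import argparse

# Columns cProfile/pstats output can reasonably be sorted by;
# this exact tuple is an assumption for the sketch.
CPROFILE_COLUMNS = ("ncalls", "tottime", "percall", "cumtime", "function")


def parse_cprofile_column(value):
    """argparse `type=` callback: accept only known cProfile columns."""
    if value not in CPROFILE_COLUMNS:
        raise argparse.ArgumentTypeError(
            "Invalid column %r. Must be one of: %s"
            % (value, ", ".join(CPROFILE_COLUMNS))
        )
    return value
```

It would then be hooked up with something like `group.addoption("--benchmark-cprofile", type=parse_cprofile_column, ...)`, so a typo fails at parse time instead of deep inside the reporting code.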

@@ -196,7 +199,8 @@ def __getitem__(self, key):
def has_error(self):
return self.fixture.has_error

def as_dict(self, include_data=True, flat=False, stats=True):
def as_dict(self, include_data=True, flat=False, stats=True,
cprofile_sort_by="cumtime", cprofile_all_columns=False):
ionelmc (Owner):
I would try to reduce the cprofile arguments to a single cprofile argument (because this call boilerplate is all over the place). If it's specified, it means we want to display the cProfile stats (sorted by that column). If it's not specified, it means we want all the columns for storage.

Eg:

  • .as_dict(cprofile=getoption(...)) if we display the stats in table
  • .as_dict() if we generate json for storage
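The consolidation being proposed could be sketched like this (a minimal stand-in class; the real as_dict has many more parameters and fields):

```python
class BenchmarkReport:
    """Minimal sketch: one `cprofile` parameter replaces the previous two."""

    def __init__(self, cprofile_stats):
        self.cprofile_stats = cprofile_stats  # list of per-function dicts

    def as_dict(self, cprofile=None):
        """If `cprofile` names a column, return the top rows sorted by it
        (display mode); if it is None, return everything (storage mode)."""
        if cprofile is None:
            rows = list(self.cprofile_stats)  # keep all rows for storage
        else:
            rows = sorted(
                self.cprofile_stats,
                key=lambda row: row[cprofile],
                reverse=True,
            )[:10]
        return {"cprofile": rows}
```

The single optional argument encodes both intents: `.as_dict(cprofile="cumtime")` for table display, plain `.as_dict()` for the JSON storage path.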

stats_columns.insert(0, cprofile_sort_by)
for column in stats_columns:
cprofile_functions.sort(key=operator.itemgetter(column), reverse=True)
for cprofile_function in cprofile_functions[:10]:
ionelmc (Owner):
Maybe default to 25 rows? Just in case we need more info later.

@@ -207,6 +211,20 @@ def as_dict(self, include_data=True, flat=False, stats=True):
(k, funcname(v) if callable(v) else v) for k, v in self.options.items()
)
}
if self.cprofile_stats:
result["cprofile"] = []
ionelmc (Owner):
I would use an intermediate variable (eg cprofile = result["cprofile"] = []) for the loops below.
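The chained-assignment idiom being suggested, in isolation (the loop contents are made up):

```python
# Bind the same list once, under two names: `cprofile` for appends inside
# the loops, `result["cprofile"]` for the returned dict.
result = {}
cprofile = result["cprofile"] = []
for row in ({"cumtime": 0.5}, {"cumtime": 0.1}):  # hypothetical rows
    cprofile.append(row)
# Both names refer to the same list object, so no final re-assignment is needed.
assert result["cprofile"] is cprofile
```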

Artimi (Author) commented Oct 6, 2016:

I've addressed your suggestions. It should be OK now. Please review and merge.

codecov-io commented:
Current coverage is 60.92% (diff: 73.13%)

No coverage report found for master at e21d193.

Powered by Codecov. Last update e21d193...f078f60

@ionelmc ionelmc merged commit ee2f36b into ionelmc:master Oct 9, 2016
ionelmc (Owner) commented Oct 9, 2016:

Alright, thanks for these changes. I know a few people have been asking for the cProfile stats.
