@palimondo
Contributor

@palimondo palimondo commented Apr 24, 2017

Also fix phantom “number” test result parsed from Totals.

Resolves SR-4590.

Broken out of #8793.

@palimondo
Contributor Author

@gottesmm Please review

@gottesmm
Contributor

@palimondo Can you put in a warning for the new tests that were added? I think you would just do a set subtraction and do a simple print.
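The set subtraction suggested above could look roughly like this. It is a minimal sketch, not code from the PR: the function names `added_and_removed` and `warn_about_changes` are hypothetical, and the inputs are assumed to be plain lists of test names taken from the old and new result files.

```python
def added_and_removed(old_names, new_names):
    """Return (added, removed) test names, each sorted for stable output."""
    old, new = set(old_names), set(new_names)
    # Set subtraction: names present on only one side of the comparison.
    return sorted(new - old), sorted(old - new)


def warn_about_changes(old_names, new_names):
    """Print a simple warning for tests that exist in only one result set."""
    added, removed = added_and_removed(old_names, new_names)
    if added:
        print("WARNING: added tests: " + ", ".join(added))
    if removed:
        print("WARNING: removed tests: " + ", ".join(removed))


if __name__ == "__main__":
    warn_about_changes(["Ackermann", "ArrayAppend"], ["ArrayAppend", "Calculator"])
```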

@gottesmm
Contributor

Also, why is the digit test not necessary any more?

@gottesmm
Contributor

Another small concern of mine is about the row column numbers. Are you sure you got those correct? @atrick is more familiar with this script, so he may be the right person to just quickly confirm that.

@atrick
Contributor

atrick commented Apr 24, 2017

I glanced at the diff but I don't understand how removing all that logic for finding the min/max scores pertains to this bug.
Please make sure that any changes you make to this script continue to handle the concatenated output from multiple runs of the benchmark driver.

Incidentally, I once had an earlier version of the script that handled added and removed tests. That shouldn't be hard.

@palimondo
Contributor Author

New tests that were added are coming in the fix for SR-4601 - separate PR, after we land the refactoring.

That whole code around row[MIN].isdigit() must have been some legacy from who-knows-when. It looks like there was a time when Benchmark_Driver wasn't reporting aggregate stats, and repeated tests were printed out multiple times? Given the current output from Benchmark_Driver, this works just fine.

I wasn't changing row column numbers… if you mean going from len(row) > 7 to len(row) > 8 in that test: MEDIAN = 7 and it is followed by MAX_RSS, which makes for a total of 9 columns.
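To make the column arithmetic concrete, here is a hedged sketch of the length check. Only MEDIAN = 7 and the trailing MAX_RSS column come from the comment above; the names and order of the other columns, the sample header, and the `parse_rows` helper are assumptions for illustration, not the script's actual code.

```python
import csv
import io

# Assumed layout: 9 columns, indices 0..8. Only MEDIAN = 7 followed by
# MAX_RSS is stated in the thread; the remaining names are illustrative.
NUM, TEST, SAMPLES, MIN, MAX, MEAN, SD, MEDIAN, MAX_RSS = range(9)


def parse_rows(text):
    """Map test name -> (min, median, max_rss) for every full data row."""
    results = {}
    for row in csv.reader(io.StringIO(text)):
        # A full row has 9 columns, so index 8 (MAX_RSS) must exist.
        if len(row) > 8:
            try:
                results[row[TEST]] = (
                    int(row[MIN]), int(row[MEDIAN]), int(row[MAX_RSS]))
            except ValueError:
                # Non-numeric cells, e.g. a header line; skip it.
                continue
    return results
```

With this check, a short summary line has too few columns to be mistaken for a test result.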

@palimondo
Contributor Author

@atrick:

Please make sure that any changes you make to this script continue to handle the concatenated output from multiple runs of the benchmark driver.

What?! Where, how? I don't understand. When is it invoked like that?

@atrick
Contributor

atrick commented Apr 24, 2017

The compare script needs to handle multiple invocations of the driver. That's literally the only way that I use the driver.

1/benchmark_driver > out1
1/benchmark_driver >> out1
...

2/benchmark_driver > out2
2/benchmark_driver >> out2
...

compare out1 out2

Also, I usually only rerun some subset of tests and concatenate those to the same output.
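One way to cope with concatenated output like this is to merge repeated rows per test name, keeping the lowest minimum and the highest maximum seen across runs. The sketch below is an assumption about what such merging could look like, not the script's legacy logic; `merge_runs` and its `(name, min, max)` input tuples are hypothetical.

```python
def merge_runs(rows):
    """Merge repeated measurements of the same test.

    rows: iterable of (test_name, min_value, max_value) tuples, possibly
    containing the same test several times because several driver runs
    were appended to one file.
    """
    merged = {}
    for name, lo, hi in rows:
        if name in merged:
            old_lo, old_hi = merged[name]
            # Keep the extremes across all runs of this test.
            merged[name] = (min(old_lo, lo), max(old_hi, hi))
        else:
            merged[name] = (lo, hi)
    return merged
```

Tests that were rerun only in a later invocation simply update their existing entry, so a partial rerun appended to the same file is handled the same way as a full one.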

@palimondo
Contributor Author

Oh, my… I can see how compare_perf_test.py is used by other scripts. There was no documentation about your use case anywhere. 🤷‍♂️🙎‍♂️

@palimondo
Contributor Author

@atrick What you describe here seems like a manual version of the --rerun option you suggested in SR-4669. Am I correct?

@atrick
Contributor

atrick commented Apr 24, 2017

Well, compare_perf_test was around long before the driver. I've always used a script similar to that. Mishal made it work with CI.
As I've been saying (over and over), I always run multiple invocations of the driver. I will actually use compare_perf_test on just a single set of results just to aggregate the information, but that's not so common.

@palimondo
Contributor Author

palimondo commented Apr 25, 2017

That was just great. 🙅‍♂️
You should have taken https://bugs.swift.org/browse/SR-4590 when I filed it.

@palimondo palimondo closed this Apr 25, 2017
@palimondo palimondo deleted the SR-4590 branch April 25, 2017 00:14
@palimondo
Contributor Author

@atrick Please go ahead and do the book-keeping in Jira, too. Thanks!

@palimondo
Contributor Author

Sorry @atrick, I thought the conflict was due to your commit, but it was @moiseev.

Why didn't you take SR-4590 when I filed it 11 days ago? I filed PR #8793 some 10 days ago. Now you jump in and mix it all up. Have you seen the refactoring I did there? We are duplicating efforts… Why?

@atrick
Contributor

atrick commented Apr 25, 2017

@palimondo I think you misunderstood. I'm not fixing SR-4590. I'm saying I'm surprised that it's broken because I expected it to be able to handle added/removed benchmarks.

The confusing thing is that your bug title and this PR title are unrelated to the functionality that you're removing.

@palimondo
Contributor Author

palimondo commented Apr 25, 2017

@atrick It wasn't you. This PR is broken out from #8793 by request from @gottesmm to land it per-partes. @moiseev just landed the fix for SR-4590 in #8923. And a logical change in sorting order. 👍

I'm upset because my changes in #8793 do that too, as part of a much bigger refactoring of compare_perf_test.py - please have a look, both of you! To keep such a massive change safe, I've been using diff to validate that all outputs from my refactored script match those of the legacy version.
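The diff-based validation mentioned above can be approximated in Python with the standard difflib module. This is only a sketch of the idea, assuming the two outputs have already been captured as strings; `diff_outputs` and the file labels are hypothetical names.

```python
import difflib


def diff_outputs(legacy_text, refactored_text):
    """Return unified-diff lines; an empty list means the outputs match."""
    return list(difflib.unified_diff(
        legacy_text.splitlines(),
        refactored_text.splitlines(),
        fromfile="legacy",
        tofile="refactored",
        lineterm="",
    ))


if __name__ == "__main__":
    for line in diff_outputs("TEST,MIN\nA,100\n", "TEST,MIN\nA,101\n"):
        print(line)
```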

Your quick fixes make this unnecessarily hard, as I must rebase on top of your changes. The legacy script contains a ton of little quirks I had to replicate; now I need to redo that again. It's quite upsetting.

@palimondo
Contributor Author

I thought that filing bugs that nobody took, self-assigning them, and opening a PR with fixes was enough to coordinate work on this part of the code and prevent conflicts. I was apparently mistaken.

@atrick
Contributor

atrick commented Apr 25, 2017

Ok. Thanks for working at this. Rebasing is always frustrating. But what you're experiencing is pretty normal. The reality is that I miss a lot of bug and commit activity, especially when I'm on vacation for a week then working on a deadline.
