Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make json and csv output consistent. #1662

Merged
merged 4 commits into from
Sep 26, 2023
Merged

Make json and csv output consistent. #1662

merged 4 commits into from
Sep 26, 2023

Conversation

andreas-abel
Copy link
Contributor

Currently, the --benchmark_format=csv option does not output the correct value for the cv statistics. Also, the json output should not contain a time unit for the cv statistics.

Currently, the --benchmark_format=csv option does not output the correct value for the cv statistics. Also, the json output should not contain a time unit for the cv statistics.
Copy link
Collaborator

@LebedevRI LebedevRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the json output should not contain a time unit for the cv statistics.

That statement is lacking motivational proof.
Please drop the JSON change, it is intentional.

The CSV reporter is deprecated, not sure how much we care about it,
but the change seems innoculus...

@andreas-abel
Copy link
Contributor Author

andreas-abel commented Sep 14, 2023

The cv is the "coefficient of variation" (https://en.wikipedia.org/wiki/Coefficient_of_variation), which is a dimensionless number (https://en.wikipedia.org/wiki/Dimensionless_quantity).

So the time unit should either be removed, or it should be empty. Currently, it contains "ns", which is definitely wrong.

@LebedevRI
Copy link
Collaborator

The cv is the "coefficient of variation" (https://en.wikipedia.org/wiki/Coefficient_of_variation), which is a dimensionless number (https://en.wikipedia.org/wiki/Dimensionless_quantity).

Yep, i know that, i added it.

So the time unit should either be removed, or it should be empty. Currently, it contains "ns", which is definitely wrong.

I'm sorry, this is not productive. I'm telling you it is there intentionally, and you ignored that comment.
Perhaps i would recommend to, instead of saying what should be done, to say what you believe goes wrong
with the things they are now.

@andreas-abel
Copy link
Contributor Author

andreas-abel commented Sep 14, 2023

I'm telling you it is there intentionally, and you ignored that comment.

I'm sorry. Would you mind sharing why the time unit for the cv is intentionally "ns" even though it is a dimensionless quantity?

Also note that the console output does not contain a time unit for the cv. Is the console output intentionally inconsistent with the json output?

@LebedevRI
Copy link
Collaborator

I'm telling you it is there intentionally, and you ignored that comment.

I'm sorry. Would you mind sharing why the time unit for the cv is intentionally "ns" even though it is a dimensionless quantity?

tool/compare.py (namely partition_benchmarks() in tools/gbench/report.py)
requires time_unit to be present to correctly distinguish the repetitions of the same benchmark.

Also note that the console output does not contain a time unit for the cv. Is the console output intentionally inconsistent with the json output?

time_unit does not mean "the dimension/unit of the whatever we happened to cram into
the 'real_time'/`cpu_time' fields" (i mean, cv is clearly not time :)).
We don't have a field that is intentionally designed to supply such information.
I'm guessing that is what you were looking for?

@LebedevRI
Copy link
Collaborator

Ok, apparently i forgot. There is aggregate_unit = percentage field for aggregates.

@andreas-abel
Copy link
Contributor Author

tool/compare.py (namely partition_benchmarks() in tools/gbench/report.py) requires time_unit to be present to correctly distinguish the repetitions of the same benchmark.

It works just fine, though, if I set the time_unit to the empty string instead of removing it.

I have reverted the JSON change for now, though I'm not yet really convinced yet that there isn't a better option.

time_unit does not mean "the dimension/unit of the whatever we happened to cram into the 'real_time'/`cpu_time' fields"

This is very surprising (and begs the question what it does mean then). In particular, given that there is a field with the identical name time_unit in the Run struct in benchmark.h, and two similarly named fields real_accumulated_time and cpu_accumulated_time, and that there is a function with the comment Return a value representing the real time per iteration in the unit specified by 'time_unit'.

It would be really helpful if such non-obvious and non-intuitive design choices could be documented somewhere.

Copy link
Collaborator

@LebedevRI LebedevRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The csv change seems fine to me, but we really are ought to make a decision
on the future of that reporter. Is it deprecated with no new functionality
to be accepted or not? @dmah42

@dmah42
Copy link
Member

dmah42 commented Sep 15, 2023

the plan of record is: don't add new functionality because we should instead rely on the JSON output and have a separate tool to create the CSV from JSON.

this got stuck behind an aborted effort to have a v2 that had some deeper API changes. i don't have a strong opinion on whether we should continue to try to deprecate the CSV output or not, tbh. while it's here and in use, we should at least make sure it's correct though.

@dmah42 dmah42 merged commit 7736df0 into google:main Sep 26, 2023
54 of 60 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants