
Refactor calibration logs #2074

Merged · 10 commits into unitaryfund:master from issue_2012 on Dec 7, 2023

Conversation

@kozhukalov (Contributor) commented Oct 26, 2023

Introduce two different ways of logging calibration results:

  • As a 3-column table:
benchmark_0 | strategy_0 | performance_00
benchmark_0 | strategy_1 | performance_01
benchmark_1 | strategy_0 | performance_10
benchmark_1 | strategy_1 | performance_11
  • As a cartesian product of benchmark and strategy. In this case the performance sits at the intersection of a particular benchmark and a particular strategy:
|            | benchmark_0    | benchmark_1    |
| strategy_0 | performance_00 | performance_01 |
| strategy_1 | performance_10 | performance_11 |

Benchmarks, strategies, and performances are multiline stringified dicts. This makes the logging easy to maintain for all new benchmarks and strategies, since it only requires implementing a to_dict method for each; a sketch of the idea follows.
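As a minimal sketch of this convention (the class and field names here are illustrative, not mitiq's actual implementation):

```python
from typing import Any, Dict


class Benchmark:
    """Hypothetical benchmark used only to illustrate the to_dict convention."""

    def __init__(self, problem_type: str, num_qubits: int) -> None:
        self.problem_type = problem_type
        self.num_qubits = num_qubits

    def to_dict(self) -> Dict[str, Any]:
        return {"type": self.problem_type, "num_qubits": self.num_qubits}

    def __str__(self) -> str:
        # Render one "Key: value" pair per line so the result can be
        # dropped into a table cell as a multiline string.
        return "\n".join(
            f"{key.replace('_', ' ').capitalize()}: {value}"
            for key, value in self.to_dict().items()
        )


print(Benchmark("rb", 2))
# Type: rb
# Num qubits: 2
```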

Fixes #2012

License

  • I license this contribution under the terms of the GNU GPL, version 3 and grant Unitary Fund the right to provide additional permissions as described in section 7 of the GNU GPL, version 3.


codecov bot commented Oct 29, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (638e31b) 98.30% compared to head (54c1427) 98.21%.
Report is 9 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2074      +/-   ##
==========================================
- Coverage   98.30%   98.21%   -0.09%     
==========================================
  Files          87       87              
  Lines        4140     4156      +16     
==========================================
+ Hits         4070     4082      +12     
- Misses         70       74       +4     


@kozhukalov force-pushed the issue_2012 branch 4 times, most recently from d908d7f to 095741f on October 29, 2023
@natestemen (Member) left a comment

Awesome work here. I really like the cartesian logging. That's a really great start to the sort of "group by" feature we were thinking about.

I've left a few small comments/suggestions. Feel free to respond directly on those.

BTW, please do not force push every commit. It's helpful to see what comes in follow-on commits!

Finally, if you're interested in getting more involved in Mitiq development, feel free to join our Discord at http://discord.unitary.fund, as well as our weekly Mitiq development calls on Fridays at noon ET.

@@ -126,6 +126,13 @@ def to_dict(self) -> Dict[str, Any]:
    def __repr__(self) -> str:
        return str(self.to_dict())

    def __str__(self) -> str:
        result: str = ""
Member

Please remove unnecessary typehints (such as this one) where possible.
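For illustration, the kind of change being requested (a hypothetical snippet, not a verbatim diff from this PR):

```python
result: str = ""  # redundant: the annotation repeats what the literal already says
result = ""       # preferred: the type is inferred from the assignment
```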

Contributor Author

Done.

Comment on lines 125 to 132
lines = str(ghz_problem).split("\n")
ghz_problem_dict = ghz_problem.to_dict()
for line in lines:
    [title, value] = line.split(":", 1)
    key = title.lower().replace(" ", "_")
    value = value.strip()
    assert key in ghz_problem_dict
    assert value == str(ghz_problem_dict[key])
Member

Let's move this block (along with some of the changes below) into its own test that checks the code related to string formatting. That way we can keep the tests small and modular.

Contributor Author

Done

Comment on lines 388 to 396
mcount = 0
ncount = 0
for line in captured.out.split("\n"):
    if "Mitigated error: " in line:
        mcount += 1
    if "Noisy error: " in line:
        ncount += 1
assert mcount == (len(cal.strategies) * len(cal.problems))
assert ncount == (len(cal.strategies) * len(cal.problems))
Member

This is checked in other parts of the tests. Feel free to drop it.

Contributor Author

This check makes sure that the logging output contains mitigated and noisy errors for every strategy/problem combination, i.e. one performance record per combination. I reworked this a little.

Comment on lines 311 to 299
def run(self, log: bool = False) -> None:
def run(self, log: bool = False, log_cartesian: bool = False) -> None:
Member

What if we make log a string which can either be "all" or "cartesian"? Do you think this is a simpler interface, or more complicated?
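A sketch of the suggested string-valued interface (the class body, validation, and parameter semantics here are illustrative, not mitiq's actual API):

```python
from typing import Optional


class Calibrator:
    def run(self, log: Optional[str] = None) -> None:
        # None keeps the run silent; "all" prints the flat three-column
        # table; "cartesian" prints the benchmark-by-strategy grid.
        if log not in (None, "all", "cartesian"):
            raise ValueError(f"unknown log format: {log!r}")
        ...
```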

Contributor Author

IMO it wasn't a very good idea to call this parameter "log", because it is not logging but rather the output of calibration results in some form.

I have a couple of ideas regarding this.

The calibration is nothing more than running a set of experiments with different executors/circuits/techniques. So in the end you have a bunch of results (currently 3 numpy arrays) which can be represented in plenty of different ways. We could wrap these arrays into a pandas DataFrame and get a very convenient SQL-like query interface for these data; see the sketch below. This could also be used for persistent storage (I am thinking about #1934).
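A sketch of that idea, assuming the results live in three parallel numpy arrays (all names and values here are hypothetical):

```python
import numpy as np
import pandas as pd

benchmarks = np.array(["ghz", "ghz", "rb", "rb"])
strategies = np.array(["zne-richardson", "zne-linear"] * 2)
performances = np.array([0.0105, 0.0132, 0.0871, 0.0904])

df = pd.DataFrame(
    {
        "benchmark": benchmarks,
        "strategy": strategies,
        "performance": performances,
    }
)

# SQL-like queries over the flat table:
print(df.query("benchmark == 'rb'"))

# The "cartesian" view is then just a pivot:
print(df.pivot(index="strategy", columns="benchmark", values="performance"))
```

A DataFrame would also serialize easily (e.g. to CSV or SQL) for the persistence use case mentioned above.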

Of course, we can provide a predefined set of calibration result representations; for now that can be just "flat" (rather than "all") and "cartesian". But users can still use the pandas interface to do their own analysis.

I am also thinking about the case where we run all these experiments periodically on various hardware and simulators and watch for changes (improvements/degradations). In that case I would like to put the calibration results into a database from which I can easily gather the necessary data.

BTW, having plain old logs while the calibrator is working can also be useful, and for that it is better to rely on the standard log levels.

│ benchmark │ strategy │ performance │
├────────────────────────────────────────────┼────────────────────────────────┼─────────────────────────┤
│ Type: rb │ Technique: ZNE │ ✔ │
│ Ideal distribution: {'00': 1.0} │ Factory: RichardsonFactory │ Noisy error: 0.1053 │
Member

I don't think it's super useful to the user to see the "Ideal distribution" so let's drop that here (and below in the cartesian logging).

Contributor Author

For me it is useful since I can easily see what the result should look like in the ideal case. But you are right that it could make the output too cluttered. Removed.

Member

Yeah, I can see when it would be helpful, but I'm also worried that the table will become less and less readable as you take larger and larger benchmarks (e.g. if you have 4 qubits and many possible outcome states).

mitiq/calibration/calibrator.py (outdated, resolved)
@kozhukalov (Contributor Author)

@natestemen Thanks a lot for the review and the valuable comments. And thank you for the invitation to the Mitiq dev call on Friday; I am definitely interested in joining.

@kozhukalov force-pushed the issue_2012 branch 2 times, most recently from ec04601 to a57ed60 on November 1, 2023
@kozhukalov (Contributor Author)

The last force push is due to a rebase onto master.

@Misty-W modified the milestones: 0.31.0, 0.32.0 on Nov 1, 2023
StrEnum class is available in the standard library since 3.11.
@natestemen (Member) left a comment

Thanks so much for your patience on this. Everything is looking really good! Just one code change requested, and one small question. Otherwise I think it's ready to go!

Comment on lines 37 to 39
class OutputForm(StrEnum):
    flat = auto()
    cartesian = auto()
Member

Just to check my understanding: the reason we need to use StrEnum here, as opposed to enum.StrEnum, is that enum.StrEnum was added in Python 3.11 and we want to make sure this works with Python versions below that. Is that right?
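For context, a common compatibility pattern for this situation (mitiq's exact import arrangement may differ):

```python
import sys

if sys.version_info >= (3, 11):
    from enum import StrEnum  # standard library from Python 3.11 on
else:
    from strenum import StrEnum  # third-party backport for older Pythons
```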

mitiq/calibration/tests/test_calibration.py (outdated, resolved)
@natestemen (Member)

Oh, and we should update the calibration tutorial here:

@Misty-W (Contributor) left a comment

Added as suggestions since I can't push to the branch directly.

dev_requirements.txt (outdated, resolved)
mitiq/calibration/calibrator.py (outdated, resolved)
mitiq/calibration/calibrator.py (outdated, resolved)
mitiq/calibration/calibrator.py (outdated, resolved)
mitiq/calibration/calibrator.py (outdated, resolved)
mitiq/calibration/calibrator.py (outdated, resolved)
requirements.txt (outdated, resolved)
mitiq/calibration/calibrator.py (outdated, resolved)
mitiq/calibration/tests/test_calibration.py (outdated, resolved)
@Misty-W (Contributor) commented Dec 5, 2023

The docs failure is due to a bug in Qiskit Aer: Qiskit/qiskit-aer#2008.

@natestemen (Member)

> The docs failure is due to a bug in Qiskit Aer: Qiskit/qiskit-aer#2008.

Rebasing/merging this branch onto master should solve the issue, as we are on the latest version of Qiskit on master.

@natestemen (Member) left a comment

Misty and I went over this and it looks good. Got a few remaining things wrapped up. Thanks a lot, Vladimir!

@natestemen merged commit 36c5a14 into unitaryfund:master on Dec 7, 2023
15 of 16 checks passed
Successfully merging this pull request may close: Create group_by_strategy function for Calibrator (#2012)

3 participants