Refactor calibration logs #2074
Conversation
Force-pushed from 78dc388 to 4022f5f
Force-pushed from 06533a6 to d660784
Codecov report: all modified and coverable lines are covered by tests ✅

```
@@            Coverage Diff             @@
##           master    #2074      +/- ##
==========================================
- Coverage   98.30%   98.21%   -0.09%
==========================================
  Files          87       87
  Lines        4140     4156      +16
==========================================
+ Hits         4070     4082      +12
- Misses         70       74       +4
```

☔ View full report in Codecov by Sentry.
Force-pushed from d908d7f to 095741f
Awesome work here. I really like the cartesian logging. That's a really great start to the sort of "group by" feature we were thinking about.
I've left a few small comments/suggestions. Feel free to respond directly on those.
BTW, please do not force-push every commit. It's helpful to see what comes in follow-on commits!
Finally, if you're interested in getting more involved in Mitiq development, feel free to join our Discord at http://discord.unitary.fund, as well as our weekly Mitiq development calls on Fridays at noon ET.
mitiq/calibration/settings.py (outdated)

```
@@ -126,6 +126,13 @@ def to_dict(self) -> Dict[str, Any]:
    def __repr__(self) -> str:
        return str(self.to_dict())

    def __str__(self) -> str:
        result: str = ""
```
Please remove unnecessary type hints (such as this one) where possible.
done
```python
lines = str(ghz_problem).split("\n")
ghz_problem_dict = ghz_problem.to_dict()
for line in lines:
    [title, value] = line.split(":", 1)
    key = title.lower().replace(" ", "_")
    value = value.strip()
    assert key in ghz_problem_dict
    assert value == str(ghz_problem_dict[key])
```
Let's move this block (along with some of the changes below) into its own test that checks the code related to string formatting. That way we can keep the tests small and modular.
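Such a standalone test could look roughly like the sketch below. `FakeProblem` is a made-up stand-in here, since the real benchmark-problem class isn't shown in this snippet; it only illustrates the `to_dict`/`__str__` contract the assertions rely on:

```python
class FakeProblem:
    """Hypothetical stand-in for a benchmark problem with to_dict/__str__."""

    def to_dict(self):
        return {"circuit_type": "ghz", "num_qubits": 2}

    def __str__(self):
        # One "Title: value" line per dict entry, mirroring the PR's format.
        return "\n".join(
            f"{k.replace('_', ' ').capitalize()}: {v}"
            for k, v in self.to_dict().items()
        )


def test_problem_str_matches_dict():
    problem = FakeProblem()
    problem_dict = problem.to_dict()
    for line in str(problem).split("\n"):
        title, value = line.split(":", 1)
        key = title.lower().replace(" ", "_")
        assert key in problem_dict
        assert value.strip() == str(problem_dict[key])
```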
Done
```python
mcount = 0
ncount = 0
for line in captured.out.split("\n"):
    if "Mitigated error: " in line:
        mcount += 1
    if "Noisy error: " in line:
        ncount += 1
assert mcount == (len(cal.strategies) * len(cal.problems))
assert ncount == (len(cal.strategies) * len(cal.problems))
```
This is checked in other parts of the tests. Feel free to drop it.
This check makes sure that the logging output contains mitigated and noisy errors for every strategy/problem combination, i.e. one performance record per combination. I reworked this a little bit.
mitiq/calibration/calibrator.py (outdated)

```diff
-    def run(self, log: bool = False) -> None:
+    def run(self, log: bool = False, log_cartesian: bool = False) -> None:
```
What if we make `log` a string which can be either `"all"` or `"cartesian"`? Do you think this is a simpler interface, or more complicated?
IMO it wasn't a very good idea to call this parameter "log", because it is not logging but rather the output of calibration results in some form.
I have a couple of ideas regarding this.
The calibration is nothing more than running a set of experiments with different executors/circuits/techniques. So in the end you have a bunch of results (currently three numpy arrays) which can be represented in plenty of different ways. We could wrap these arrays into a pandas DataFrame and get a very convenient SQL-like query interface for the data. This could also be used as persistent storage (I am thinking about #1934).
Of course, we can provide a predefined set of calibration result representations, like the current "flat" (not "all") and "cartesian". But users could still use the pandas interface to do their own analysis.
I am also thinking about the case where we run all these experiments periodically on various hardware and simulators and check for changes (improvements/degradations). In that case I would like to put the calibration results into a database from which I can easily gather the necessary data.
BTW, having plain old logs while the calibrator is working can also be useful, and it is better to rely on ordinary log levels for that.
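The pandas idea can be sketched as follows. The column names and values below are purely illustrative (not mitiq's actual result schema); the point is that a flat table supports both "group by"-style queries and the cartesian view via a pivot:

```python
import numpy as np
import pandas as pd

# Hypothetical calibration output: one row per (benchmark, strategy) pair.
benchmarks = np.array(["ghz", "ghz", "rb", "rb"])
strategies = np.array(
    ["zne-richardson", "zne-linear", "zne-richardson", "zne-linear"]
)
errors = np.array([0.10, 0.12, 0.05, 0.07])

df = pd.DataFrame(
    {"benchmark": benchmarks, "strategy": strategies, "error": errors}
)

# SQL-like query: the best (lowest-error) strategy per benchmark.
best = df.loc[df.groupby("benchmark")["error"].idxmin()]

# The "cartesian" view is just a pivot of the flat table.
cartesian = df.pivot(index="strategy", columns="benchmark", values="error")
```

A persistent store then falls out almost for free, e.g. `df.to_sql(...)` or `df.to_csv(...)` on the same flat table.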
mitiq/calibration/calibrator.py (outdated)

```
│ benchmark                                  │ strategy                       │ performance             │
├────────────────────────────────────────────┼────────────────────────────────┼─────────────────────────┤
│ Type: rb                                   │ Technique: ZNE                 │ ✔                       │
│ Ideal distribution: {'00': 1.0}            │ Factory: RichardsonFactory     │ Noisy error: 0.1053     │
```
I don't think it's super useful to the user to see the "Ideal distribution" so let's drop that here (and below in the cartesian logging).
For me it is useful since I can easily see what it should look like in the ideal case. But you are right, it could make the output too cluttered. Removed.
Yeah, I can see when it would be helpful, but I'm also worried that the table would become less and less readable with larger and larger benchmarks (e.g. if you have 4 qubits and many possible outcome states).
@natestemen Thanks a lot for the review and the valuable comments. And thank you for the invitation to the Mitiq dev call on Friday. I am definitely interested in joining.
Force-pushed from ec04601 to a57ed60
The last force push is due to a rebase onto master.
Introduce two different ways of logging calibration results:

* As a 3-column table:

```
benchmark_0 | strategy_0 | performance_00
benchmark_0 | strategy_1 | performance_01
benchmark_1 | strategy_0 | performance_10
benchmark_1 | strategy_1 | performance_11
```

* As a cartesian product of benchmark/strategy. In this case the performance sits at the intersection of a particular benchmark and a particular strategy:

```
|            | benchmark_0    | benchmark_1    |
| strategy_0 | performance_00 | performance_01 |
| strategy_1 | performance_10 | performance_11 |
```

Benchmarks, strategies, and performances are multiline stringified dicts. This makes it easy to maintain the calibration logging for all new benchmarks/strategies, as it only requires implementing to_dict methods for them.

Fixes unitaryfund#2012
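The two layouts can be built generically from `to_dict`-style data along these lines. The dicts and performance values below are illustrative placeholders, not mitiq's real output:

```python
# Hypothetical to_dict outputs for two benchmarks and two strategies.
benchmarks = [{"type": "ghz"}, {"type": "rb"}]
strategies = [{"technique": "zne"}, {"technique": "pec"}]
# performance keyed by (benchmark index, strategy index).
performances = {(0, 0): "0.10", (0, 1): "0.12", (1, 0): "0.05", (1, 1): "0.07"}

# Flat layout: one row per benchmark/strategy pair.
flat_rows = [
    (str(b), str(s), performances[(i, j)])
    for i, b in enumerate(benchmarks)
    for j, s in enumerate(strategies)
]

# Cartesian layout: strategies as rows, benchmarks as columns.
header = [""] + [str(b) for b in benchmarks]
cartesian_rows = [
    [str(s)] + [performances[(i, j)] for i, b in enumerate(benchmarks)]
    for j, s in enumerate(strategies)
]
```

Because both layouts only consume `to_dict` output, any new benchmark or strategy that implements `to_dict` is logged with no further changes.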
The StrEnum class has been available in the standard library since Python 3.11.
Force-pushed from a57ed60 to bd0daba
Thanks so much for your patience on this. Everything is looking really good! Just one code change requested, and one small question. Otherwise I think it's ready to go!
mitiq/calibration/calibrator.py (outdated)

```python
class OutputForm(StrEnum):
    flat = auto()
    cartesian = auto()
```
Just to check my understanding: the reason we need to define our own `StrEnum` here, as opposed to using `enum.StrEnum`, is that `enum.StrEnum` was added in Python 3.11 and we want to make sure this works with Python versions below that. Is that right?
Oh, and we should update the calibration tutorial here:
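A version-compatible fallback can be sketched like this. This is an illustrative shim, not necessarily mitiq's exact implementation; it mimics the 3.11 behavior where `auto()` yields the lowercased member name:

```python
import enum
import sys

if sys.version_info >= (3, 11):
    from enum import StrEnum
else:
    class StrEnum(str, enum.Enum):
        """Minimal backport of enum.StrEnum for Python < 3.11."""

        def _generate_next_value_(name, start, count, last_values):
            # auto() produces the lowercased member name, as in 3.11.
            return name.lower()

        def __str__(self):
            return str(self.value)


class OutputForm(StrEnum):
    flat = enum.auto()
    cartesian = enum.auto()
```

Either way, `str(OutputForm.flat)` gives `"flat"`, and members compare equal to plain strings, which keeps call sites like `run(log="cartesian")` simple.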
Added as suggestions since I can't push to the branch directly.
The docs failure is due to a bug in Qiskit Aer: Qiskit/qiskit-aer#2008.
Rebasing/merging this branch on master should solve the issue, as we are on the latest version of Qiskit on master.
Misty and I went over this and it looks good. Got a few things wrapped up. Thanks a lot, Vladimir, for this!
Introduce two different ways of logging calibration results:
Benchmarks, strategies, and performances are multiline stringified dicts. This makes it easy to maintain the logging for all new benchmarks and strategies, as it only requires implementing to_dict methods for them.
Fixes #2012