Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add multi-group curve analysis #824

Merged

Conversation

nkanazawa1989
Copy link
Collaborator

Summary

This is the final piece of group curve analysis #737. This PR adds MultiGroupCurveAnalysis. This curve analysis takes a single set of models and reuses it for multiple (independent) datasets. This analysis simplifies the workflow of analysis for experiments that runs the same circuits with different conditions, such as two-qubit experiment with different control qubit states, e.g. Ham tomo, HEAT, JAZZ, etc...

close #737

Details and comments

CrossResonanceHamiltonianAnalysis has been updated as an example use case. This drastically reduces execution time of the analysis.

With this PR

{0} test.test_cross_resonance_hamiltonian.TestCrossResonanceHamiltonian.test_integration_1__1000000_0__2000000_0__1000_0___3000000_0___2000000_0__10000_0_ [5.978807s] ... ok
{0} test.test_cross_resonance_hamiltonian.TestCrossResonanceHamiltonian.test_integration_2___1000000_0___2000000_0__1000_0__3000000_0__2000000_0__10000_0_ [6.443986s] ... ok
{0} test.test_cross_resonance_hamiltonian.TestCrossResonanceHamiltonian.test_integration_3__10000_0__20000_0__1000_0__5000000_0__1000000_0__2000_0_ [23.903643s] ... ok
{0} test.test_cross_resonance_hamiltonian.TestCrossResonanceHamiltonian.test_integration_4__10000_0___1000_0__1000_0__500000_0__1000_0___1000_0_ [5.246760s] ... ok
{0} test.test_cross_resonance_hamiltonian.TestCrossResonanceHamiltonian.test_integration_5___100000_0__120000_0__1000_0__150000_0___110000_0___1000_0_ [4.878105s] ... ok

Original

{0} test.test_cross_resonance_hamiltonian.TestCrossResonanceHamiltonian.test_integration_1__1000000_0__2000000_0__1000_0___3000000_0___2000000_0__10000_0_ [119.991389s] ... ok
{0} test.test_cross_resonance_hamiltonian.TestCrossResonanceHamiltonian.test_integration_2___1000000_0___2000000_0__1000_0__3000000_0__2000000_0__10000_0_ [167.573162s] ... ok
{0} test.test_cross_resonance_hamiltonian.TestCrossResonanceHamiltonian.test_integration_3__10000_0__20000_0__1000_0__5000000_0__1000000_0__2000_0_ [181.881969s] ... ok
{0} test.test_cross_resonance_hamiltonian.TestCrossResonanceHamiltonian.test_integration_4__10000_0___1000_0__1000_0__500000_0__1000_0___1000_0_ [76.284761s] ... ok
{0} test.test_cross_resonance_hamiltonian.TestCrossResonanceHamiltonian.test_integration_5___100000_0__120000_0__1000_0__150000_0___110000_0___1000_0_ [43.422981s] ... ok

fix #668

Perhaps multi-thread execution of group fit can further improve performance?

@nkanazawa1989 nkanazawa1989 added the Changelog: New Feature Include in the "Added" section of the changelog label Jun 9, 2022
@nkanazawa1989 nkanazawa1989 added this to the Release 0.4 milestone Jun 9, 2022
Copy link
Contributor

@eggerdj eggerdj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot. Looks mostly fine.

qiskit_experiments/curve_analysis/base_curve_analysis.py Outdated Show resolved Hide resolved
Co-authored-by: Daniel J. Egger <38065505+eggerdj@users.noreply.github.com>
Copy link
Contributor

@wshanks wshanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comments are all suggestions for making the class easier to understand, but I think the implementation looks good.

in different measurement basis ("model") for each setting.
:class:`MultiGroupCurveAnalysis` class is convenient to write a fit model for such experiment.

Here is an example code to define a :class:`.MultiGroupCurveAnalysis` instance.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about subclassing? Does one do about the same call but to super().__init__() instead of to MultiGroupCurveAnalysis (vs. defining class variables)?

Coming back to this after looking at the Ham tomo analysis, it looks like the main thing one would do besides calling super().__init__() for a subclass would be implementing _generate_fit_guesses().

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we need to implement subclass if you need to compute new quantity, and I think this is main use case of group fit. So I think this class is not directly used in the most case in practice.

releasenotes/notes/curve-analysis-02a702a81e014adf.yaml Outdated Show resolved Hide resolved
releasenotes/notes/curve-analysis-02a702a81e014adf.yaml Outdated Show resolved Hide resolved
releasenotes/notes/curve-analysis-02a702a81e014adf.yaml Outdated Show resolved Hide resolved
from .utils import analysis_result_to_repr, eval_with_uncertainties


class MultiGroupCurveAnalysis(BaseCurveAnalysis):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thing I wonder about is if it should be easy to use a CurveAnalysis class in a MultiGroupCurveAnalysis class. Other than specifying the groups, the only difference seems to be the optional _create_composite_analysis_results method. For example, if one wants to run three separate sinusoidal measurements together, could there be a way to use OscillationAnalysis and name the group keys rather than respecify the model and the fit guess logic from OscillationAnalysis? I am just wondering. I don't think it is critical if there is not an easy way. With all the recent fitting work, making a new class is not that much work (and some of the initial guess logic is factored out into reusable functions already, making it even easier to specify the models and initial guess method).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, indeed this is what I was thinking of at the beginning, i.e

analysis = MultiGroupGurveAnalysis(
    analyses=[OscillationAnalysis(name="curve1"), OscillationAnalysis(name="curve2")],
    group_keys = {"curve1": {"option": 1}, "curve2": {"option": 2}},
)

Seems like this is much cleaner, but actually implementation of drawer is challenging because the logic to draw figure is in each CurveAnalysis (so this generates two separate figure), but MultiGroupGurveAnalysis may want a single figure with two lines in the same canvas. This looked bit complicated and I gave up to simplify the logic, but I'll look into this direction if I can come up with some clean logic.

…ly calling _run_analysis can solve the issue of drawing, i.e. it gives back the control of drawer to composite instance. implementation is bit redundant since many method returns NotImplemented, but this will be solved by promoting CurveData to dataframe, where we can manage data blongs to different group in the same object. new standard analysis BlochTrajectoryAnalysis is added as a basis of CR hamiltonian analysis, which could be used outside the context of CR Hamiltonian.
@nkanazawa1989
Copy link
Collaborator Author

Thanks for reviewing. I made major update in 32e0fcf based on discussion in #824 (comment). I was thinking to call _run_analysis of child curve analysis, resulting in separate curve figures rather than having one figure with two curves. This issue is addressed by calling subclass hook methods without generating the figure. Now group fit takes analysis instance, so we can reuse existing fit library.

In addition, CR Hamtomo analysis is now split into two pieces BlochTrajectoryAnalysis and CrossResonanceHamiltonianAnalysis. Indeed CR Hamtomo experiment is investigation of target qubit dynamics, and the BlochTrajectoryAnalysis can be used outside the context of CR, e.g. it can analyze single qubit Rabi oscillation trajectory in 3D Bloch sphere. In this sense, this core analysis class is moved to the standard library.

Currently implementation of CompositeCurveAnalysis is bit inefficient because many hook methods return NotImplemented and actual logic is hard-coded in _run_analysis which developer cannot override to customize. We can move logic to each hook by introducing dataframe as a replacement of CurveData, i.e. dataframe can manage dataset consisting of multiple group by adding new column for group tag.

For example, we can still write

    def _run_data_processing(
        self,
        raw_data: List[Dict],
        models: List[lmfit.Model],
    ) -> Dict[str, CurveData]:
        out = {}
        for analysis in self._analyses:
            out[analysis.name] = analysis._run_data_processing(raw_data, analysis.models)
        return out

but developer must be careful for data type, because it returns dictionary rather than CurveData. With dataframe it always returns a single dataframe regardless of composite or single curve analysis.

Copy link
Contributor

@wshanks wshanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. I wonder though if all the abstract methods that need to be overridden but unused point to the BaseCurveAnalysis design being off from what is really needed for the base class.

One thing that is kind of neat. You could possibly use this to run multiple independent analyses on the same data set. I am not sure what in practice you would want to do that for.

List of fit options that are passed to the fitter function.
"""
# This method is delegated to self.analyses
return NotImplemented
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all these NotImplemented methods were pushed down from CurveAnalysis to BaseCurveAnalysis for the sake of CompositeCurveAnalysis. If we like how CompositeCurveAnalysis works now, should we consider lifting them back up?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe CompositeCurveAnalysis just should inherit from BaseAnalysis? Is it using any code from BaseCurveAnalysis?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. This NotImplemented is because of missing group context in CurveData. For example,

  • _format_data
  • _run_data_processing
  • _run_curve_fit
  • _create_curve_data

These methods can be implemented on composite class by introducing pandas data frame as a replacement of CurveData class. This is because data frame keeps metadata and we can store experiment results of different groups in the same object, and later filter the results by using the metadata. I'm going to upgrade CurveData in follow-up.

Perhaps _generate_fit_guesses is tied to each sub analysis, because FitOptions is not aware of groups. So this method can move to CurveAnalysis.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation of several methods (_format_data, _run_data_processing, _run_curve_fit) are moved to CurveAnalysis, and _generate_fit_guesses is removed from BaseCurveAnalysis because this is called inside _run_curve_fit. By replacing CurveData with data frame, we can implement these methods in CompositeAnalysis rather than implementing everything in _run_analysis as in this PR. This allows users to easily customize the behavior of data processing, formatting, and analysis without entirely overriding the _run_analysis method.

Please check d3a5de4

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you imagine _format_data, _run_data_processing, and _run_curve_fit being implemented in CompositeCurveAnalysis with the updated CurveData dataframe? Will you remove the loop over analyses from _run_analysis and have each smaller method do its own loop over groups? What is a scenario for which a user might want to override one of these methods? Most of the real logic would be in the individual analyses. I think the user would be more likely to override the methods of the classes of the analyses.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense. These methods are removed in 9b97df6. Now composite curve analysis is a direct subclass of BaseAnalysis.

releasenotes/notes/curve-analysis-02a702a81e014adf.yaml Outdated Show resolved Hide resolved
Copy link
Contributor

@eggerdj eggerdj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks okay to me. I'm approving but please consider the remaining suggestions for the docs.

Comment on lines 159 to 160
# This method is delegated to self.analyses
return NotImplemented
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this goes back to Will's comment on the base class. I'm fine with this but perhaps we can simplify the docs and just write

"""This method is delegated to self.analyses"""

Seems a bit too much to have Args and Returns to then have a NotImplemented. By the way should this not be raise NotImplemented? I'm not familiar with return NotImplemented.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NotImplementedis mainly used to indicate that special methods like __add__ are not implemented so that Python can try __radd__ instead (see here). For a custom method like this, raising NotImplementedError might be more standard.

qiskit_experiments/curve_analysis/curve_analysis.py Outdated Show resolved Hide resolved
# This code respects the ordering of parameters so that it matches with
# the signature of fit function and it is backward compatible.
# In principle this should not matter since LMFIT maps them with names
# rather than index. Need more careful investigation.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to open issue for this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looping is not significant overhead compared with other lines (I've done some profiling). So this is okey for now with this comment.

\end{align}

where :math:`t' = t + t_{\rm offset}` with :math:`t` is pulse duration to scan
and :math:`t_{\rm offset}` is an extra fit parameter that may represent the edge effect.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
and :math:`t_{\rm offset}` is an extra fit parameter that may represent the edge effect.
and :math:`t_{\rm offset}` is an extra fit parameter that may represent a edge effect.

Perhaps clarify what you mean by edge effect.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, Done in 7ae1d01

Copy link
Contributor

@wshanks wshanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some phrasing suggestions. Otherwise, this looks good to me. I still am not sure about the split in methods between BaseCurveAnalysis and CompositeCurveAnalysis. It feels like CompositeCurveAnalysis is mainly a manager of other BaseCurveAnalysis instances and does not itself have much that needs to be overridden.

The motivation for supporting all the methods within CompositeCurveAnalysis would be if we wanted CompositeCurveAnalysis to be able to take CompositeCurveAnalysis instances in its analyses inputs in addition to CurveAnalysis instances.

qiskit_experiments/curve_analysis/base_curve_analysis.py Outdated Show resolved Hide resolved
qiskit_experiments/curve_analysis/base_curve_analysis.py Outdated Show resolved Hide resolved
qiskit_experiments/curve_analysis/curve_analysis.py Outdated Show resolved Hide resolved
qiskit_experiments/curve_analysis/curve_analysis.py Outdated Show resolved Hide resolved
qiskit_experiments/curve_analysis/curve_analysis.py Outdated Show resolved Hide resolved
qiskit_experiments/curve_analysis/curve_analysis.py Outdated Show resolved Hide resolved
This method creates analysis results for important fit parameters
that might be defined by analysis options ``result_parameters``.

.. rubric:: _create_curve_data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a list of methods the user might override, right? Why would a user override this method?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. User can update the method to format data structure as they want. However, this will be removed once we switch to data frame because it doesn't need formatting and can be saved as-is.

qiskit_experiments/curve_analysis/curve_analysis.py Outdated Show resolved Hide resolved
releasenotes/notes/curve-analysis-02a702a81e014adf.yaml Outdated Show resolved Hide resolved
Comment on lines 159 to 160
# This method is delegated to self.analyses
return NotImplemented
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NotImplementedis mainly used to indicate that special methods like __add__ are not implemented so that Python can try __radd__ instead (see here). For a custom method like this, raising NotImplementedError might be more standard.

nkanazawa1989 and others added 2 commits July 9, 2022 08:01
Co-authored-by: Will Shanks <willshanks@us.ibm.com>
Co-authored-by: Daniel Egger <38065505+eggerdj@users.noreply.github.com>
…89/qiskit-experiments into feature/group_curve_analysis
Copy link
Contributor

@wshanks wshanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

@nkanazawa1989 nkanazawa1989 merged commit d63a6b1 into Qiskit-Extensions:main Jul 11, 2022
@nkanazawa1989 nkanazawa1989 deleted the feature/group_curve_analysis branch July 11, 2022 13:54
@nkanazawa1989 nkanazawa1989 restored the feature/group_curve_analysis branch July 12, 2022 03:06
@nkanazawa1989 nkanazawa1989 deleted the feature/group_curve_analysis branch October 27, 2022 06:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Changelog: New Feature Include in the "Added" section of the changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CurveAnalysis refactoring Cross-resonance integration tests fail in my environment
3 participants