Curve fitting analysis helper functions #19
Conversation
I can suggest lmfit as a really good and flexible package we've used. It has very nice models. lmfit builds on and extends many of the optimization methods of scipy.optimize, has been mature for some time, handles many nonlinear fit methods, and provides useful enhancements.
Just a quick comment: how easily can we extend this base class for analysis of multiple experiments or a series of experiments, such as IRB? I think having data in I can write MVP :)
I'm also asking myself the same question as @nkanazawa1989. I expect something more flexible that allows multiple fittings and some processing of them within one analysis result. Some of this sort of thing is already supported as a
For the calibration module we have our own base analysis class which looks very similar: https://github.com/Qiskit/qiskit-experiments/pull/20/files#diff-746d34cd1c5ff8ee88c96d2914cacbf4c98b02d3b9acb3ba9c0958da5df5d0ac I'd be happy to subclass from
After discussing offline a bit with Oliver, I want to rethink the metadata for some more complicated cases he suggested, including multiple objective functions. This is probably more along the lines of what you suggest, @nkanazawa1989. @eggerdj @nkanazawa1989 @yaelbh, before going any further with this PR let's discuss more offline and try to come up with something that also includes everything you need for the curve fitting portion of the
(force-pushed from 55e7ef3 to f3cb555)
I refactored this into some library functions for curve fitting and plotting. These functions are:

The
(force-pushed from cd23fc1 to 271995a)
eggerdj left a comment:

Here are some comments to emphasize where the relations between this PR and #22 are located in the code.
I updated this PR again to remove the actual

If sigma is provided, the mean and variance are computed using inverse-variance weighting; otherwise they are computed as the sample mean and biased sample variance.
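A minimal numpy sketch of that weighting rule (the function name and signature here are illustrative, not the PR's actual `mean_xy_data` interface):

```python
import numpy as np

def weighted_mean_var(y, sigma=None):
    """Combine repeated measurements of one x-value into a mean and variance.

    With sigma: inverse-variance weighting, mean = sum(y/s^2) / sum(1/s^2)
    and var = 1 / sum(1/s^2).  Without sigma: sample mean and biased sample
    variance (divides by len(y), not len(y) - 1).
    """
    y = np.asarray(y, dtype=float)
    if sigma is None:
        return y.mean(), y.var()
    w = 1.0 / np.asarray(sigma, dtype=float) ** 2   # inverse-variance weights
    return np.sum(w * y) / np.sum(w), 1.0 / np.sum(w)

mean, var = weighted_mean_var([1.0, 3.0], sigma=[1.0, 1.0])
# equal sigmas reduce to the plain mean; var = 1 / (1/1 + 1/1) = 0.5
```

With equal sigmas the weighted mean collapses to the ordinary mean, which is a handy sanity check on the weighting.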
```python
# Check data set has at least 1 more value than model parameters
num_params = len(param_p0)
if len(ydata) <= num_params:
```
This should probably be num_params + 1, otherwise chisq diverges.
```python
xdata_range = [min(xdata), max(xdata)]
```

```python
result = {
    "popt": popt,
```
Question: if the user passed in a dict, should we return a dict here?
To avoid having two different return types, I left this as an array; if p0 was supplied as a dict, popt_keys will include the corresponding key labels for each param.
| """ | ||
| # Format p0 parameters if specified as dictionary | ||
| if isinstance(p0, dict): | ||
| param_keys = list(p0.keys()) |
I think this will make the ordering of p0 as a vector unstable, which will probably mess someone up when they assume the covariance matrix is always ordered the same. Maybe sort alphabetically by key or something?
Currently the popt, popt_keys, popt_err, pcov ordering will follow the ordering of dict.keys. popt_keys is intended to remove any ambiguity about which params these are. I don't think it makes much more sense to order alphabetically rather than in the order the dict was originally constructed if p0 is a dict.
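For context on the ordering concern: since Python 3.7, dicts preserve insertion order by language guarantee, so building the parameter vector from a p0 dict is deterministic for a given construction order (a quick illustration, not the PR's code):

```python
# Python >= 3.7 guarantees dicts iterate in insertion order, so the
# parameter vector built from a p0 dict has a stable, reproducible order.
p0 = {"amp": 1.0, "tau": 0.5, "base": 0.0}
param_keys = list(p0.keys())            # ["amp", "tau", "base"]
param_p0 = [p0[k] for k in param_keys]  # [1.0, 0.5, 0.0]
```

The ordering only becomes "unstable" across callers who construct their p0 dicts in different orders, which is what popt_keys is there to disambiguate.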
* Fix error in level2_probabilities multiplying by shots instead of dividing
* Add explicit method kwarg for mean_xy_data
```python
for i, datum in enumerate(filter_data):
    metadata = datum["metadata"]
    xdata[i] = metadata[x_key]
    y_mean, y_var = data_processor(datum)
```
The data processor should be aware of whether sigma is absolute.
This one is not a mistake. The computation of sigma is different for different values of absolute_sigma.
The choice of setting absolute_sigma = False/True should depend on how sigma was calculated, not the other way around. absolute_sigma is a property of the curve fit function for how it calculates pcov (and hence how we deduce popt_err). According to the scipy docs: pcov(absolute_sigma=False) = pcov(absolute_sigma=True) * chisq(popt) / (M - N).

By defaulting to True we are assuming that sigma is usually specified in the same units as y, so in that sense we should make sure any data processors calculate sigma in the same units as y by default as well.
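The quoted scipy relation can be checked numerically; a small sketch on synthetic data (the line model and sample sizes are illustrative only):

```python
import numpy as np
from scipy.optimize import curve_fit

def line(x, a, b):
    return a * x + b

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
sigma = np.full_like(x, 0.1)
y = line(x, 2.0, 1.0) + rng.normal(0.0, 0.1, x.size)

# Same data and weights, two interpretations of sigma
_, pcov_abs = curve_fit(line, x, y, sigma=sigma, absolute_sigma=True)
popt, pcov_rel = curve_fit(line, x, y, sigma=sigma, absolute_sigma=False)

# Reduced chi-square: M = 20 points, N = 2 parameters
red_chisq = np.sum(((line(x, *popt) - y) / sigma) ** 2) / (x.size - 2)
# pcov_rel should equal pcov_abs * red_chisq, per the scipy docs
```

In other words, absolute_sigma=False rescales the covariance so the fit's reduced chi-square is effectively one, while absolute_sigma=True trusts the provided sigmas as-is.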
```python
residues = (yfits - ydata) ** 2
if sigma is not None:
    residues = residues / (sigma ** 2)
chisq = np.sum(residues) / (len(yfits) - num_params)
```
Do we still want to name this field of the analysis result chisq, or change it to reduced_chisq? Also, maybe update the function documentation to note that the reduced chi-square is calculated.
We've had a discussion: if xdata consists of n points, and each was measured for s shots, should we fit n points, using the n means of the s measurements, or should we fit n*s points? I argued that both are equivalent, because the objective function minimized by the least-squares algorithm is the same. This is still true (assuming I was right in the first place); however, after the fitting the returned reduced chi-square is different.
For your second question: after discussing this a bit with @oliverdial, I think the answer is that they are only equivalent under some assumptions about the underlying noise models being the same, e.g. sampling noise vs. single-shot measurement noise, whether it is Gaussian or not, etc. In general the sigma values could be different in the two cases: P(0) for N shots is a binomial distribution, but sigma for each single shot depends on the measurement discriminator error, so this can affect the chi-squared and pcov.
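The first point (same minimizer, different reduced chi-square) is easy to see numerically; a sketch assuming unweighted least squares and a linear model:

```python
import numpy as np
from scipy.optimize import curve_fit

def line(x, a, b):
    return a * x + b

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 5)     # n = 5 x-values
shots = 50                        # s = 50 repeats per x-value
raw = line(x, 2.0, 1.0) + rng.normal(0.0, 0.2, (shots, x.size))

# Fit all n*s raw points vs. fit the n per-x means: the raw objective is
# s * (means objective) + a popt-independent constant, so popt agrees ...
popt_raw, _ = curve_fit(line, np.tile(x, shots), raw.ravel())
popt_mean, _ = curve_fit(line, x, raw.mean(axis=0))
# ... but the residual sums and degrees of freedom (n*s - 2 vs. n - 2)
# differ, so the reduced chi-squares of the two fits do not agree.
```

This illustrates the equivalence claim under identical (here: implicit, unweighted) noise assumptions; as noted above, different per-shot sigma models break it.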
* Change chisq -> reduced_chisq in result and docs
* Add dof (degrees of freedom) to returned results
* Add documentation about sigma by default being absolute sigma
```python
# Run linearized curve_fit
analysis_result = curve_fit(f, xdata, ydata, p0, sigma=wsigma, bounds=bounds, **kwargs)
```
The code of this function is difficult to follow in the absence of internal documentation. Also, in the docstring, the meaning of the series argument is not clear. The text at the beginning of the docstring implies that xdata, ydata, and sigma are expected to be 2D arrays, but then it turns out that they are 1D arrays, and the connection is not clear.
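On the unclear series argument: one plausible reading of the 1D layout (purely illustrative; `combined`, `f0`, `f1`, and the tuple-shaped xdata are assumptions, not the PR's actual multi_curve_fit interface) is a flat xdata/ydata pair with a parallel integer array selecting which model each point belongs to:

```python
import numpy as np
from scipy.optimize import curve_fit

def f0(x, a, b):            # model for series 0
    return a * x + b

def f1(x, a, b):            # model for series 1, sharing parameters a, b
    return a * x**2 + b

def combined(xs, a, b):
    """Dispatch on the series index so one fit covers both curves."""
    x, s = xs
    return np.where(s == 0, f0(x, a, b), f1(x, a, b))

x = np.tile(np.linspace(0.0, 1.0, 10), 2)   # flattened x-values, both series
s = np.repeat([0, 1], 10)                    # series index for each point
y = combined((x, s), 2.0, 1.0)               # noise-free synthetic data

popt, pcov = curve_fit(combined, (x, s), y, p0=[1.0, 0.0])
```

This "linearized" layout lets a single scipy.optimize.curve_fit call fit shared parameters across series, which may be what the 2D-vs-1D wording in the docstring is gesturing at.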
* Add CurveFitAnalysis class
* Get outcome label from metadata
* Fix import error message if no matplotlib
* Add option for fitting means of batched data sets
* Reorganize helper functions into separate file
* Add curve_fit, multi_curve_fit, and plotting functions for xy data
* Refactor example CurveFitAnalysis class to use library functions
* Add curve fitting data
* Remove CurveFitAnalysis class and leave as library functions
* Linting
* Clean up curve_fit_data
* Remove unneeded check for meas level now that data processor function is an arg
* Add optional kwargs for x and series metadata keys for extracting data
* Run black
* Add filters to curve fit data
* Reorganize files
* Move curve_fit_data and multi_curve_fit_data to curve_fitting.py
* Move other processing functions to `data_processing.py`
* Remove unneeded init import
* Initial review edits
* Separate x and series data for multi curve fit; rename curve_fit_data to process_curve_data and multi_curve_fit_data to process_multi_curve_data
* Allow passing p0 and bounds as dict to curve fit
* Add sigma to mean_xy_data: if sigma is provided, the mean and variance are computed using inverse-variance weighting; otherwise they are computed as the sample mean and biased sample variance
* Set default absolute_sigma = True
* Fix logical check on np array
* Use reduced chi-squared calculation
* Fix up data processing
* Fix error in level2_probabilities multiplying by shots instead of dividing
* Add explicit method kwarg for mean_xy_data
* Add basic tests for curve fitting
* More review edits
* Change chisq -> reduced_chisq in result and docs
* Add dof (degrees of freedom) to returned results
* Add documentation about sigma by default being absolute sigma

Co-authored-by: Helena Zhang <helena@astbb.com>
* RB experiment class based on Chris' draft
* Unifying sampling procedures
* Some more unification
* Style
* Style fixes
* Analysis now works directly with curve_fit functions
* Added plotting (ad-hoc solution for now)
* Linting
* Adding error bars to plots
* Fixed link in readme (#15)
* Curve fitting analysis helper functions (#19)
* Example notebook
* Linting and documentation
* Some more linting
* Small change
* Linting

Co-authored-by: Yael Ben-Haim <yaelbh@il.ibm.com>
Co-authored-by: Christopher J. Wood <cjwood@us.ibm.com>
Co-authored-by: Helena Zhang <helena@astbb.com>
Summary
Adds an analysis class wrapping scipy.optimize.curve_fit for reuse in other experiments. Adds helper functions for writing curve fitting analysis classes. These include functions for extracting single- or multi-series x, y, sigma data, fitting functions for single or multi curve fits, and plotting functions.
Details and comments
As an example of this being used for RB analysis, see this notebook.