
Conversation

@chriseclectic (Collaborator) commented Sep 24, 2021

Summary

This PR reworks the ExperimentData class so that data-processing callbacks can be attached to the job future objects via a new add_processing_callback method, which is built on the add_done_callback method of the Python Future API.
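For illustration, a minimal sketch of that pattern using Python's concurrent.futures API (the job stand-in and callback names here are hypothetical, not this PR's actual code):

```python
from concurrent import futures

def fetch_result(job_id: str) -> dict:
    # Stand-in for blocking on a backend job's result
    return {"job_id": job_id, "counts": {"00": 512, "11": 512}}

executor = futures.ThreadPoolExecutor(max_workers=1)
future = executor.submit(fetch_result, "job-0")

def process(fut: futures.Future) -> None:
    # Runs once the future completes; fut.result() re-raises
    # any exception raised inside fetch_result
    data = fut.result()
    print("processing", data["job_id"])

# The Future API hook that add_processing_callback builds on
future.add_done_callback(process)
executor.shutdown(wait=True)
```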

It also combines multiple jobs added in a single add_data call into a single future object. This is necessary to support job splitting for backends whose maximum number of circuits per job is smaller than the number of circuits an experiment generates.

Splitting an experiment with a large number of circuits into multiple jobs is also implemented in this PR.
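A minimal sketch of the splitting logic (the helper below is illustrative, not this PR's actual implementation):

```python
def split_circuits(circuits: list, max_experiments: int) -> list:
    # Chunk a circuit list so no job exceeds max_experiments circuits
    return [
        circuits[i : i + max_experiments]
        for i in range(0, len(circuits), max_experiments)
    ]

# One job per chunk, e.g.:
# jobs = [backend.run(chunk) for chunk in split_circuits(circuits, max_experiments)]
```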

Details and comments

The run_analysis method of BaseExperiment uses this new callback mechanism. The goal is that calling run_analysis no longer requires blocking on results to work correctly, so the following should be equivalent:

expdata = exp.run(backend)

# equivalent to:
expdata = exp.run(backend, analysis=False)
exp.run_analysis(expdata)

These changes seem to work correctly for most experiments, except the recently added CR Hamiltonian tomography experiment (I'm not sure why only that one fails) and the result DB tests, which explicitly use the old callback API and are all failing (I'm not yet sure how they should be updated).

I would like to get feedback on this approach before going any further with trying to fix the tests.

This doesn't fix the failing tests, but it stops them from erroring due to the change in the `add_data` signature.
@jyu00 (Contributor) commented Sep 27, 2021

> It also combines multiple jobs added in a single add_data call into a single future object. This is necessary to support job splitting for backends whose maximum number of circuits per job is smaller than the number of circuits an experiment generates.

backend.run() in qiskit-ibm now does the splitting/combining for you. But if you want to support legacy backends I suppose you'd still need this.

> The run_analysis method of BaseExperiment uses this new callback mechanism. The goal is that calling run_analysis no longer requires blocking on results to work correctly

I'm confused by this. How does run_analysis work if job results are not yet available? If it's to do analysis on existing results, that can already be done (e.g. by just calling run_analysis()).

> which is built on the add_done_callback method of the Python Future API.

There are a couple of caveats about using add_done_callback. One is that the future is considered "done" even while the function you pass to add_done_callback is still running, making it difficult to ascertain the "true" status of an experiment. Similarly, exceptions raised by this function will not be captured. These are the reasons the old code calls job_done_callback in _wait_for_job() instead of using add_done_callback.
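A toy demonstration of both caveats (illustrative only, not Qiskit code):

```python
import time
from concurrent import futures

executor = futures.ThreadPoolExecutor(max_workers=1)
future = executor.submit(time.sleep, 0.1)  # stand-in for a job

def slow_callback(fut):
    time.sleep(0.5)                # "data processing" still running...
    raise RuntimeError("dropped")  # ...and this is only logged, never re-raised

future.add_done_callback(slow_callback)
time.sleep(0.2)
print(future.done())  # True, even though slow_callback is still running
future.result()       # returns None; the callback's exception is not captured
executor.shutdown(wait=True)
```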

@nkanazawa1989 (Collaborator) commented:

> How does run_analysis work if job results are not yet available?

This crashes the analysis with a no-data-entry error, because expdata.data() returns immediately (with no data). This is why all unittests need block_for_results.
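For example, a minimal reproduction in the style of the snippet above (hypothetical fragment; exp and backend are assumed to exist):

```python
expdata = exp.run(backend, analysis=False)
exp.run_analysis(expdata)    # may see expdata.data() == [] -> no-data entry error
expdata.block_for_results()  # what the unittests do before analysis
exp.run_analysis(expdata)    # results are now available
```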

@nkanazawa1989 (Collaborator) left a comment:

I like the add_processing_callback method. However, it can be called multiple times. What is the expected behavior if different callbacks are added? (Since there is no error handling, is that acceptable behavior?)

        self.backend,
    )
-   self._backend = data.backend()
+   self._backend = job.backend()
@nkanazawa1989 (Collaborator):

maybe

Suggested change
-   self._backend = job.backend()
+   if not self._backend:
+       self._backend = job.backend()

?

    else:
        raise TypeError(f"Invalid data type {type(data)}.")
    # Get the last added future and add a done callback
    _, future = self._job_futures[-1]
@nkanazawa1989 (Collaborator):

What happens to self._job_futures[:-1]? Are job submission and job completion always in the same order? I guess this is not guaranteed, because jobs are scheduled according to some formula that prioritizes them by cost, i.e. number of circuits, shots, etc.

-   job = backend.run(qobj)
+   # Run experiment jobs
+   max_experiments = getattr(backend.configuration(), "max_experiments", None)
+   if max_experiments and len(circuits) > max_experiments:
@nkanazawa1989 (Collaborator) commented Sep 28, 2021:

This should also be taken from run options. If we use a pulse gate and scan a parameter of a very long pulse, the job will quickly consume the waveform memory on the hardware and raise a memory overflow error on the backend. To prevent this, we often split a job to keep the total number of waveform data points sufficiently small. For example, CR Hamiltonian tomography often runs into this situation.
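A sketch of the suggested lookup order (the run-option name max_experiments and the self.run_options attribute are assumptions here, not this PR's code):

```python
# Prefer a user-supplied limit from the experiment's run options,
# falling back to the backend configuration
max_experiments = getattr(self.run_options, "max_experiments", None)
if max_experiments is None:
    max_experiments = getattr(backend.configuration(), "max_experiments", None)
```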

@chriseclectic (Collaborator, Author) commented:

@jyu00 @nkanazawa1989 This is now replaced by #407, which is refactored to use separate futures for analysis, avoiding add_done_callback for the reasons Jessie mentioned.

@jyu00 The job splitting is now in #402; it was requested by @mtreinish so that experiments can work with third-party providers, not just IBM backends.

@chriseclectic deleted the expdata-callback branch on March 3, 2022 at 22:43