
Conversation

@chriseclectic (Collaborator) commented Sep 24, 2021

Summary

This PR reworks the ExperimentData class so that data-processing callbacks can be attached to the job future objects via a new add_processing_callback method, which is built on the add_done_callback method of the Python Future API.
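For illustration, a minimal sketch of that pattern using Python's concurrent.futures API (the job stand-in and callback names here are hypothetical, not this PR's actual code):

```python
from concurrent import futures

def fetch_result(job_id: str) -> dict:
    # Stand-in for blocking on a backend job's result
    return {"job_id": job_id, "counts": {"00": 512, "11": 512}}

executor = futures.ThreadPoolExecutor(max_workers=1)
future = executor.submit(fetch_result, "job-0")

def process(fut: futures.Future) -> None:
    # Runs once the future completes; fut.result() re-raises
    # any exception raised inside fetch_result
    data = fut.result()
    print("processing", data["job_id"])

# The Future API hook that add_processing_callback builds on
future.add_done_callback(process)
executor.shutdown(wait=True)
```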

It also combines multiple jobs added in a single add_data call into a single future object. This is necessary to support job splitting for backends whose maximum number of circuits per job is smaller than the number of circuits an experiment generates.

Splitting an experiment with a large number of circuits into multiple jobs is also implemented in this PR.
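A minimal sketch of the splitting logic (the helper below is illustrative, not this PR's actual implementation):

```python
def split_circuits(circuits: list, max_experiments: int) -> list:
    # Chunk a circuit list so no job exceeds max_experiments circuits
    return [
        circuits[i : i + max_experiments]
        for i in range(0, len(circuits), max_experiments)
    ]

# One job per chunk, e.g.:
# jobs = [backend.run(chunk) for chunk in split_circuits(circuits, max_experiments)]
```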

Details and comments

The run_analysis method of BaseExperiment uses this new callback mechanism. The goal is that calling run_analysis no longer requires blocking on results to work correctly, so the following should be equivalent:

expdata = exp.run(backend)

# equivalent to:
expdata = exp.run(backend, analysis=False)
exp.run_analysis(expdata)

These changes seem to work correctly for most experiments, except the recently added CR Hamiltonian tomography experiment (I'm not sure why only that one fails) and the result DB tests, which explicitly use the old callback API and are all failing (I'm not yet sure how they should be updated).

I would like to get feedback on this approach before going any further with trying to fix the tests.

This doesn't fix the failing tests, but it stops them from erroring due to the change in the `add_data` signature.
@jyu00 (Contributor) commented Sep 27, 2021

> It also combines multiple jobs added in a single add_data call into a single future object. This is necessary to support job splitting for backends whose maximum number of circuits per job is smaller than the number of circuits an experiment generates.

backend.run() in qiskit-ibm now does the splitting/combining for you. But if you want to support legacy backends I suppose you'd still need this.

> The run_analysis method of BaseExperiment uses this new callback mechanism. The goal is that calling run_analysis no longer requires blocking on results to work correctly

I'm confused by this. How does run_analysis work if job results are not yet available? If it's to do analysis on existing results, that can already be done (e.g. by just calling run_analysis()).

> which is built on the add_done_callback method of the Python Future API.

There are a couple of caveats about using add_done_callback. One is that the future is considered "done" even while the function you pass to add_done_callback is still running, making it difficult to ascertain the "true" status of an experiment. Similarly, exceptions raised by this function will not be captured. These are the reasons the old code calls job_done_callback in _wait_for_job() instead of using add_done_callback.
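A toy demonstration of both caveats (illustrative only, not Qiskit code):

```python
import time
from concurrent import futures

executor = futures.ThreadPoolExecutor(max_workers=1)
future = executor.submit(time.sleep, 0.1)  # stand-in for a job

def slow_callback(fut):
    time.sleep(0.5)                # "data processing" still running...
    raise RuntimeError("dropped")  # ...and this is only logged, never re-raised

future.add_done_callback(slow_callback)
time.sleep(0.2)
print(future.done())  # True, even though slow_callback is still running
future.result()       # returns None; the callback's exception is not captured
executor.shutdown(wait=True)
```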

@nkanazawa1989 (Collaborator) commented:

> How does run_analysis work if job results are not yet available?

This crashes the analysis with a no-data-entry error, because expdata.data() returns immediately (with no data). This is why all unittests need block_for_results.
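For example, a minimal reproduction in the style of the snippet above (hypothetical fragment; exp and backend are assumed to exist):

```python
expdata = exp.run(backend, analysis=False)
exp.run_analysis(expdata)    # may see expdata.data() == [] -> no-data entry error
expdata.block_for_results()  # what the unittests do before analysis
exp.run_analysis(expdata)    # results are now available
```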

@nkanazawa1989 (Collaborator) left a comment:

I like the add_processing_callback method. However, it can be called multiple times. What is the expected behavior if different callbacks are added? (Since there is no error handling, is that acceptable behavior?)

        self.backend,
    )
-   self._backend = data.backend()
+   self._backend = job.backend()
@nkanazawa1989 (Collaborator):

maybe

Suggested change
-   self._backend = job.backend()
+   if not self._backend:
+       self._backend = job.backend()

?

    else:
        raise TypeError(f"Invalid data type {type(data)}.")
    # Get the last added future and add a done callback
    _, future = self._job_futures[-1]
@nkanazawa1989 (Collaborator):

What happens to self._job_futures[:-1]? Are job submission and job completion always in the same order? I guess this is not guaranteed, because jobs are scheduled according to some formula that prioritizes them by cost, i.e. number of circuits, shots, etc.

-   job = backend.run(qobj)
+   # Run experiment jobs
+   max_experiments = getattr(backend.configuration(), "max_experiments", None)
+   if max_experiments and len(circuits) > max_experiments:
@nkanazawa1989 (Collaborator) commented Sep 28, 2021:

This should also be taken from run options. If we use a pulse gate and scan a parameter of a very long pulse, the job will quickly consume the waveform memory on the hardware and raise a memory overflow error on the backend. To prevent this, we often split a job to keep the total number of waveform data points sufficiently small. For example, CR Hamiltonian tomography often runs into this situation.
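A sketch of the suggested lookup order (the run-option name max_experiments and the self.run_options attribute are assumptions here, not this PR's code):

```python
# Prefer a user-supplied limit from the experiment's run options,
# falling back to the backend configuration
max_experiments = getattr(self.run_options, "max_experiments", None)
if max_experiments is None:
    max_experiments = getattr(backend.configuration(), "max_experiments", None)
```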

@chriseclectic (Collaborator, Author) commented:

@jyu00 @nkanazawa1989 This is now replaced by #407, which is refactored to use separate futures for analysis, avoiding add_done_callback for the reasons Jessie mentioned.

@jyu00 The job splitting is now in #402; it was requested by @mtreinish so that experiments can work with third-party providers, not just IBM backends.

@chriseclectic deleted the expdata-callback branch on March 3, 2022 at 22:43