
Conversation

@yaelbh (Collaborator) commented May 23, 2021

T1 is working except for composite experiments.
@chriseclectic @coruscating @jyu00 Please guide on what has to be done before we proceed to work with Sanket.

@yaelbh (Collaborator, Author) commented May 23, 2021

This PR is similar to #29. The main difference is that the individual experiments will not work directly with ExperimentDataV1; instead we have a local ExperimentData class for the experiments, which inherits from ExperimentDataV1, and stores the experiment.
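
For illustration, a minimal sketch of that arrangement (the base-class import path and the attribute/property names below are assumptions for the sketch, not taken from the branch):

# Sketch only: ExperimentDataV1 comes from the companion Terra PR; the import
# path here is an assumption and may differ in the actual branch.
from qiskit.providers.experiment import ExperimentDataV1

class ExperimentData(ExperimentDataV1):
    """Local container that also keeps a reference to the experiment that produced it."""

    def __init__(self, experiment, **kwargs):
        super().__init__(**kwargs)
        self._experiment = experiment  # illustrative attribute name

    @property
    def experiment(self):
        """Return the experiment object this data belongs to."""
        return self._experiment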

@jyu00 (Contributor) commented May 25, 2021

@yaelbh One thing I noticed is that BaseExperiment calls analysis.run() immediately after adding the job data. This causes a race condition: the analysis class could be referencing data that has not yet been added to the ExperimentData.

You can instead do

experiment_data.add_data(job, post_processing_callback=self.__analysis_class__().run, **kwargs)

add_data() takes a callback function that will be called once the job finishes.

Also, most of the code still uses a mixture of AnalysisResultV1 (from Terra) and a local AnalysisResult (a dictionary), which may be what's causing the issue with the composite experiment?

Lastly, add_data(), when passed a job, waits for the job to finish asynchronously, so some of the data (and the analysis results, if you use the callback) may not be immediately available. You can use block_for_jobs() to wait for the jobs.

I can create a PR to your branch or have a separate PR to make these updates, if you'd like.

@yaelbh (Collaborator, Author) commented May 26, 2021

@jyu00, thanks for reviewing. Some (possibly somewhat confused) thoughts following your comments:

  • Race condition: I'm pretty confused about the flow now that we allow code sections to continue running before previous sections have terminated. The code of BaseExperiment.run creates a single job, runs it on a backend, and performs analysis. Do we envision a scenario where multiple jobs need to terminate before the analysis? Is it necessary for the result to be ready before the function ends? If we use block_for_jobs, where do we position the call and why? Maybe put it right after add_data and before the analysis, and not use the post-processing call for the analysis? Most of all, I'm worried about testing. Do you have a systematic way to test asynchronous runs? For example, can we write a test where running the analysis fails because the data has not been added yet, then fix the code, see that the test passes after the fix, and keep the test in our permanent test suite?
  • Mixture of AnalysisResult and AnalysisResultV1: it's OK and intentional. AnalysisResult is a dictionary that will later be plugged in as the result_data dictionary of AnalysisResultV1. I think composite experiments are doable; I'm going to do it.
  • I'm equally fine with all three options: I make the fixes in this PR, or you make them, either in a PR to my branch or as a separate PR. Which one do you prefer?

@jyu00 (Contributor) commented May 26, 2021

  • Race condition: The code of BaseExperiment.run creates a single job, runs it on a backend, and performs analysis.

The current code for this flow is basically

job = backend.run(circuits, **run_options)
experiment_data.add_data(job)
analysis.run()  # this references experiment_data.data()

The problem here is that add_data() waits asynchronously for the job to finish before adding its results to _data. Because the wait is done asynchronously, add_data() returns immediately, and therefore there is a good chance analysis.run() will get control before the job results become available. In other words, experiment_data.data() could return an empty list if results are not yet available.
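
For contrast, a sketch of the callback-based version suggested earlier in the thread (reusing the same illustrative names as the flow just shown):

job = backend.run(circuits, **run_options)
# analysis.run is invoked only after the job finishes and its data has been
# added, so it never sees an empty experiment_data.data().
experiment_data.add_data(job, post_processing_callback=analysis.run)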

Do we envision a scenario where multiple jobs need to terminate before the analysis?

One scenario I can think of is when the circuits are too long and need to be divided into multiple jobs. I was hoping to get the IBMQ Job Manager replacement (which does the division for you under the covers) done before the qiskit-experiments release, but that's not likely to happen :(

Is it necessary for the result to be ready before the function ends? If we use block_for_jobs, where do we position the call and why?

It's currently designed to allow both the job and analysis to run asynchronously in the background, and the function could return before they are done. So when you call experiment.run(), it returns before the processing actually finishes. Some people might want to block until the result becomes available, and that's what block_for_jobs() is for (perhaps the function can use a better name). For example, one might do

exp = T1Experiment(0, delays, unit="dt")
exp_data = exp.run( ... )  # this could return before result is available
exp_data.block_for_jobs()  # this waits for the result to become available
result = exp_data.analysis_result(0)  # now it's safe to access the result

Maybe put it right after add_data and before the analysis, and not use the post-processing call for the analysis?

We can do that, but then the experiment.run() function will block until the job finishes, which could take a very long time.
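
That blocking variant would look roughly like this (a sketch, reusing the illustrative names from the flow above):

job = backend.run(circuits, **run_options)
experiment_data.add_data(job)
experiment_data.block_for_jobs()  # blocks inside experiment.run() until the job finishes
analysis.run()                    # safe now, but run() may take a long time to return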

Most of all, I'm worried about testing. Do you have a systematic way to test asynchronous runs? For example, can we write a test where running the analysis fails because the data has not been added yet, then fix the code, see that the test passes after the fix, and keep the test in our permanent test suite?

Yeah, we can mock a job instance so that job.result() only returns when a certain condition is set. Then we can make sure running the analysis fails before the condition is set and succeeds after it. However, if we call analysis.run() as a post-processing callback, the analysis will always happen after the data is added.
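
A minimal sketch of that mocking idea, assuming a job-like object whose result() blocks on a threading.Event (FakeJob and release() are illustrative names, not part of the package):

import threading

class FakeJob:
    """Job stand-in whose result() only returns once the test releases it."""

    def __init__(self, result):
        self._result = result
        self._released = threading.Event()

    def result(self, timeout=None):
        # Block until the test sets the condition (or raise on timeout).
        if not self._released.wait(timeout):
            raise TimeoutError("job result not yet available")
        return self._result

    def release(self):
        """Simulate the backend job finishing."""
        self._released.set()

A test could then assert that the analysis sees no data before release() is called and produces a result after it.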

  • Mixture of AnalysisResult and AnalysisResultV1: it's OK and intentional. AnalysisResult is a dictionary that will later be plugged in as the result_data dictionary of AnalysisResultV1. I think composite experiments are doable; I'm going to do it.

Can we call it something else then (maybe AnalysisResultData)? Having names this similar is very confusing, especially since in Terra SomeClass is always the base class of SomeClassV1.

  • I'm equally fine with all three options: I make the fixes in this PR, or you make them, either in a PR to my branch or as a separate PR. Which one do you prefer?

Since you already did a lot of work on this PR, I'll just continue working on it, if that's ok with you.

yaelbh and others added 8 commits May 27, 2021 12:33
Merge commit resolving conflicts in:
	qiskit_experiments/analysis/plotting.py
	qiskit_experiments/base_experiment.py
	qiskit_experiments/composite/composite_analysis.py
	qiskit_experiments/randomized_benchmarking/interleaved_rb_analysis.py
	qiskit_experiments/randomized_benchmarking/rb_analysis.py
	test/test_t1.py
Terra base classes integration
@yaelbh (Collaborator, Author) commented May 31, 2021

I've just added changes made by @jyu00. Thanks Jessie! @nkanazawa1989, please note that Jessie replaced AnalysisResult with a dictionary. I think you expressed an opinion in favor of AnalysisResult, for the purpose of validity checks. Please let us know if you think we should revert to AnalysisResult.

@yaelbh (Collaborator, Author) commented Jun 2, 2021

I merged with the main branch this morning, and the code is now broken (some tests fail when running with the correct Terra branch). I'm working on fixing it.

@yaelbh (Collaborator, Author) commented Jun 2, 2021

In fact, these tests might have been failing before the merge with main as well. Looking into it anyway.

@jyu00 (Contributor) commented Jun 2, 2021

This PR depends on the Terra PR, which is not installed in the CI. We can either install it in the CI, or wait for the Terra PR to be merged and install Terra master in CI.

@yaelbh (Collaborator, Author) commented Jun 2, 2021

The tests are failing in my local environment, which is using the Terra PR. It appears that _run_analysis doesn't see the analysis options correctly.

@yaelbh (Collaborator, Author) commented Jun 2, 2021

Yes, the post-processing callback needs to be changed from the analysis class's run method to BaseExperiment.run_analysis.
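
In other words, roughly (a sketch of the change inside BaseExperiment.run; run_analysis as named above):

# Before: the callback was bound to a fresh analysis instance and did not see
# the experiment's analysis options.
# experiment_data.add_data(job, post_processing_callback=self.__analysis_class__().run)

# After: route the callback through BaseExperiment.run_analysis so the
# analysis options are applied.
experiment_data.add_data(job, post_processing_callback=self.run_analysis)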

@yaelbh (Collaborator, Author) commented Jun 2, 2021

All the tests are passing now. @eggerdj Please check the last commit.

Comment on lines 392 to 393
if __name__ == "__main__":
    unittest.main()
Contributor

Do we need this? Up to now I didn't need this when running tests locally and CI seemed fine without it.

Collaborator Author

@eggerdj If I want to run a single test file, I usually write:

python <file name>

and it works if these two lines appear at the end of the test file. I could, of course, instead specify the test I want via a flag to stestr or python -m unittest.

Is there a reason to prefer omitting these lines? @mtreinish

Contributor

I prefer to omit them, but I am happy to hear guidance from @mtreinish or @scasagrande.

Contributor

It's not really needed with modern test runners, since they pretty much all let you run by file path natively. This is only necessary with the stdlib unittest runner, which doesn't provide a useful UI for test selection. It doesn't harm anything by being there, but I would say running with a more sophisticated test runner is in your best interest. Personally I never add this, since most people don't manually invoke the stdlib runner, and especially with larger test suites it's just easier to use a single entrypoint from the runner that also gives more useful output.

For example, if you use pytest it would just be pytest $path, or with stestr, which we use via tox (and also in CI), you can do stestr run -n $path. The tox configuration is set up so you can just do tox -epy -- -n $path, which calls stestr natively this way inside the tox venv.

Contributor

I typically use pytest <path> as well, so my inclination would be to drop it.

@coruscating added this to the Release 1 milestone Jun 14, 2021
@yaelbh (Collaborator, Author) commented Jun 17, 2021

Currently this PR is in a partial state. I will open a new PR based on #113.

@yaelbh (Collaborator, Author) commented Jun 21, 2021

Closing because this work was moved to #115

@yaelbh closed this Jun 21, 2021