Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add script to compare commits and additional benchmarks #568

Merged
merged 83 commits into from May 2, 2020

Conversation

johannesjmeyer
Copy link
Contributor

@johannesjmeyer johannesjmeyer commented Apr 1, 2020

Context:
The PR #567 needs some tests to see if it works.

Description of the Change:
This PR introduces a script to benchmark across commits.
Furthermore it adds new benchmarks.

There now exists a script benchmark_revisions.py that accepts a list of git revisions (i.e. branches, tags or commits). It downloads those commits and then runs benchmark.py with all the other parameters it got using the downloaded revision of pennylane.

Revisions are stored to decrease the number of necessary downloads.

Furthermore, a new benchmark bm_iqp_circuit, benchmarking mostly diagonal gates, was added and other benchmarks fixed.

@johannesjmeyer johannesjmeyer added WIP 🚧 Work-in-progress performance ⏲️ Benchmarking and performance improvements labels Apr 1, 2020
@codecov
Copy link

codecov bot commented Apr 1, 2020

Codecov Report

Merging #568 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master     #568   +/-   ##
=======================================
  Coverage   98.92%   98.92%           
=======================================
  Files          82       82           
  Lines        5124     5124           
=======================================
  Hits         5069     5069           
  Misses         55       55           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a7b3b37...a7b3b37. Read the comment docs.

@johannesjmeyer johannesjmeyer changed the title [WIP] Add additional benchmarks, script to compare commits [WIP] Add script to compare commits and additional benchmarks Apr 1, 2020
@co9olguy
Copy link
Member

co9olguy commented Apr 1, 2020

This will be a great tool @johannesjmeyer!

@co9olguy
Copy link
Member

I'm curious about some of the underlying logic. This script only works for revisions which are on the remote server, correct? It doesn't seem to recognize any commits/branches I have locally

@johannesjmeyer
Copy link
Contributor Author

I now also fixed some other git stuff that caused the wrong revision to be downloaded and that fixed the download of revisions that are "too new".

@co9olguy Yes, the script only runs for revisions on the remote server. I wouldn't necessarily know where to look on your local machine. One could of course add this possibility and give a directory, but I don't think that this merits the work

@josh146 I didn't put it into tmp (which is possible platform-independently) because I want the files to rest there (otherwise I have to clone them anytime I benchmark something which might take long). Do you think I should include a --clean switch that removes the folder?

@co9olguy
Copy link
Member

@co9olguy Yes, the script only runs for revisions on the remote server. I wouldn't necessarily know where to look on your local machine. One could of course add this possibility and give a directory, but I don't think that this merits the work

Since you mentioned above that the benchmark should be run out of the pennylane "mother" directory, and that will always be itself a git repository, then it should be enough to just query from the benchmark folder itself, no?

@johannesjmeyer
Copy link
Contributor Author

Since you mentioned above that the benchmark should be run out of the pennylane "mother" directory, and that will always be itself a git repository, then it should be enough to just query from the benchmark folder itself, no?

Yes, I will add "here" as a special command that uses git to determine the root of your current git folder.

Copy link
Member

@co9olguy co9olguy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks @johannesjmeyer, looking forward to giving this tool some use :)

Everything is now working as advertised. The only thing I wonder is if it is necessary to create separate folders for every revision during testing. Couldn't we make one folder (or perhaps just use the "mother" folder!), then just use git to switch between the different commits?

.github/CHANGELOG.md Outdated Show resolved Hide resolved
benchmark/README.rst Outdated Show resolved Hide resolved

python3 benchmark_revisions.py -r master,0c8e90a -d default.qubit,default.tensor time bm_mutable_rotations

The chosen revisions will be downloaded and cached into the ``revisions`` subdirectory of the benchmarking folder.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now goes to home directory, correct?

benchmark/bm_iqp_circuit.py Outdated Show resolved Hide resolved
# PL repository. Instead we just copy the first revision in the directory
# checkout will then get the correct revision
first_revision = next((x for x in revisions_directory.iterdir() if x.is_dir()))
print(">>> Revision not found locally, copying...")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
print(">>> Revision not found locally, copying...")
print(">>> Revision found locally, copying...")

revision,
"-q",
],
check=True,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what allowed me to figure out the --noinfo error I was seeing. Otherwise I think it would have silently failed

with cd(pl_directory):
subprocess.run(["git", "checkout", "master", "-q"], check=True)
subprocess.run(["git", "fetch", "-q"], check=True)
res = subprocess.run(["git", "checkout", revision, "-q"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no check=True here? Is that because you want to inspect res.returncode below?

benchmark_env = os.environ.copy()
benchmark_env["PYTHONPATH"] = str(pl_directory) + ";" + benchmark_env["PATH"]
subprocess.run(
["python3", str(benchmark_file_path)] + unknown_args + ["--noinfo"],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so "--noinfo" is always turned on for benchmarking revisions?

print(
col(
">>> An error occured during execution of the script. "
+ "Deleting the current revision folder to not leave junk.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
+ "Deleting the current revision folder to not leave junk.",
+ "Deleting the current revision folder.",

env=benchmark_env,
check=True,
)
except:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will section ever get with check=True? (I'm not too experienced with using subprocess.run, just checking

Copy link
Contributor

@antalszava antalszava left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! 🚀 Thanks @johannesjmeyer

Copy link
Member

@josh146 josh146 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, and tested locally and still works well :)

The revisions -- including branches, tags and commits -- are specified via ``-r revision1[,revision2[,revision3...]]``,
as in

.. code-block:: bash

python3 benchmark_revisions.py -r master,0c8e90a -d default.qubit,default.tensor time bm_mutable_rotations

The chosen revisions will be downloaded and cached into the ``revisions`` subdirectory of the benchmarking folder.
The chosen revisions will be downloaded and cached into the ``pennylane.benchmark/revisions`` subdirectory of your user home directory.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Member

@co9olguy co9olguy Apr 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like it's now .pennylane/benchmarks/revisions?

@co9olguy
Copy link
Member

Thanks @johannesjmeyer, i am happy to merge this in (I had already approved).

Not something that's important for this PR, but I'm still a bit puzzled why we can't just perform the benchmarking in-place. What is the requirement that forces us to create the new folder $HOME/.pennylane/benchmarks/revisions, which contains an entire copy of the pennylane codebase, rather than just checking out the commits from within the user's active pennylane folder (which we can assume is already version-controlled)?

@co9olguy
Copy link
Member

I guess the primary reason is that we wouldn't be able to test any commits that didn't yet have the benchmark subfolder? Is that correct?

# Make really sure we don't reset the current git
if toplevel == Path.cwd():
raise Exception(
"Git accidently ended up in the current directory. Stopping to not cause any harm."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 🚀

@antalszava
Copy link
Contributor

Nice, working as before! 😉 👍

@johannesjmeyer
Copy link
Contributor Author

I guess the primary reason is that we wouldn't be able to test any commits that didn't yet have the benchmark subfolder? Is that correct?

No, actually my principal reason is that I really don't want to mess with the current git. I mean, we could surely implement some logic that stashes pending changes if there are some and then recovers them. But imagine there is some error during the script execution and your changes are gone. So I figured it is better to do it in a sandboxed environment. Also PennyLane source is 43MB, this is not hurting anyone.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance ⏲️ Benchmarking and performance improvements review-ready 👌 PRs which are ready for review by someone from the core team.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants