Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pytest and TMPDIR housekeeping #6036

Closed
pavoljuhas opened this issue Mar 14, 2023 · 6 comments
Closed

pytest and TMPDIR housekeeping #6036

pavoljuhas opened this issue Mar 14, 2023 · 6 comments
Assignees
Labels
good first issue This issue can be resolved by someone who is not familiar with the codebase. A good starting issue. kind/health For CI/testing/release process/refactoring/technical debt items no QC knowledge needed Want to contribute to Cirq, but don't know quantum computing? This issue is for you. triage/accepted there is consensus amongst maintainers that this is a real bug or a reasonable feature to add

Comments

@pavoljuhas
Copy link
Collaborator

pavoljuhas commented Mar 14, 2023

pytest leaves behind a LOT of temporary files

After executing all cirq unit tests with pytest the root temporary directory contains 31 leftover files and 38 leftover directories taking about 30G of disk space in total. The majority of that is taken by /tmp/cirq-pytest which has 59 virtual environments - 3 base environments and 56 per each isolated notebook test.

Some ideas how to reduce disk usage in isolated_notebook_test

  • instead of creating new venv-s for each notebook, use one base environment with "PIP_TARGET" and "PYTHONPATH" pointing to a temporary directory (similar as in Prevent notebook tests from changing the pytest Python environment #6032). pip install commands run in the notebook should then keep their base environment unchanged.
  • alternatively, instead of cloning a new venv for each tested notebook, create an rsync backup of the initial base venv and restore the base with rsync -a --delete base-backup/ base/ before each test. This would probably need several base environments for each parallel pytest worker.
  • make sure all tests clean up their temporaries after successful completion

Here is how I checked for the leftover files (docker tests are not set up on my workstation).
Note: my work tree had a change in the isolated_notebook_test.py which makes it execute all isolated notebook tests.

$ touch t0; check/pytest -m 'slow or not slow' -k 'not docker'; touch t1
$ find /tmp -maxdepth 1 -user $USER -newer t0 -not -newer t1

Cirq version

1.2.0.dev at 8b97aa5

@pavoljuhas pavoljuhas added kind/health For CI/testing/release process/refactoring/technical debt items good first issue This issue can be resolved by someone who is not familiar with the codebase. A good starting issue. no QC knowledge needed Want to contribute to Cirq, but don't know quantum computing? This issue is for you. labels Mar 14, 2023
@tanujkhattar tanujkhattar added the triage/accepted there is consensus amongst maintainers that this is a real bug or a reasonable feature to add label Jul 19, 2023
@shawn-peng
Copy link
Contributor

I would like to work on this

@NoureldinYosri
Copy link
Collaborator

@shawn-peng assigned

shawn-peng added a commit to shawn-peng/Cirq that referenced this issue Aug 9, 2023
* remove temporary isolated venv after isolated notebook/packaging test done

* remove temporary venv after cloned env test done

* remove temporary directories for figures after heatmap testings

Fix pytest and TMPDIR housekeeping (quantumlib#6036)
@shawn-peng
Copy link
Contributor

shawn-peng commented Aug 10, 2023

This PR simply removes the temporary directories. Here are the options we have,

  1. "PIP_TARGET" and "PYTHONPATH" pointing to a temporary directory
  2. rsync -a --delete base-backup/ base/
  3. pip-sync to recover env after the test
  4. remove the temporary directory

Here is some reasoning for the options we have for the isolated test cases.

First, if I understand correctly, these isolated test cases are testing these notebooks or other test cases to be able to run from a base environment by automatically installing packages by themselves. So, the environment shouldn't have all packages set up.

The second method takes about 10s to restore each environment. It's slower than just removing and copying a new one (about 5s). The pip-sync also takes about 8s still worse than simply copying a new one.

@tanujkhattar
Copy link
Collaborator

@pavoljuhas to review the PR and @shawn-peng to verify that all files are cleaned up as per the last command in the original issue.

@shawn-peng
Copy link
Contributor

shawn-peng commented Aug 16, 2023

Thanks @tanujkhattar,

@pavoljuhas Here is what's left,

# starting usage
tmpfs           8.0G  8.0K  8.0G   1% /tmp

[shawn@fedora Cirq]$ find /tmp -maxdepth 1 -user $USER -newer t0 -not -newer t1
/tmp/.com.google.Chrome.7cImqR
/tmp/.com.google.Chrome.nh9f2y
/tmp/.com.google.Chrome.LUF5T6
/tmp/.com.google.Chrome.HnJ5lD
/tmp/tmpdua947ty
/tmp/tmp51napeqp
/tmp/tmphm9_jjio
/tmp/tmpqjtxm2mn
/tmp/cirq-pytest
/tmp/pytest-of-shawn

# ending usage
tmpfs           8.0G  1.1G  7.0G  13% /tmp

cirq-pytest takes most of the usage, here are what's left,

[shawn@fedora cirq-pytest]$ du -h --max-depth=1 
480M	./isolated_packages
17M	./test_isolated
539M	./isolated_notebook_tests
1.1G	.

These will be reused later, so I think it would be better to keep them.

Inside pytest-of-<username>

[shawn@fedora pytest-of-shawn]$ ls
pytest-0  pytest-2  pytest-3  pytest-4  pytest-current
[shawn@fedora pytest-of-shawn]$ du -h --max-depth=1
12K	./pytest-4
240K	./pytest-3
184K	./pytest-2
6.1M	./pytest-0
6.5M	.

4 tmp* dir is empty, 4 browser temp dirs take little space

[shawn@fedora tmp]$ du -h --max-depth=1 .com.google.Chrome.*
12K	.com.google.Chrome.7cImqR
4.0K	.com.google.Chrome.HnJ5lD
12K	.com.google.Chrome.LUF5T6
8.0K	.com.google.Chrome.nh9f2y

@shawn-peng
Copy link
Contributor

The top usage with 4 threads/workers is less than 5GB. Each isolated env is deleted immediately after the test is done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue This issue can be resolved by someone who is not familiar with the codebase. A good starting issue. kind/health For CI/testing/release process/refactoring/technical debt items no QC knowledge needed Want to contribute to Cirq, but don't know quantum computing? This issue is for you. triage/accepted there is consensus amongst maintainers that this is a real bug or a reasonable feature to add
Projects
None yet
Development

No branches or pull requests

4 participants