
@seanh seanh commented Jul 13, 2022

This re-adds the non-boilerplate files that #5 (which this PR is based on) removed in order to facilitate code review. Also adds tests.

Once again a review of the meat is required (README.md, src/* and tests/*) and the cookiecutter stuff (.cookiecutter/*) can be reviewed when we review the cookiecutter itself.

How to test manually

The pip-sync-faster concept is pretty simple: it calls pip-sync and stashes hashes of the requirements files in a pip_sync_faster.json file within the venv. If you run it again with the same requirements files and none of them has changed, it won't call pip-sync again. If any of the requirements files has changed, or if you call it with a different set of requirements files (even a subset of the ones you called it with last time), it will call pip-sync again.

The CLI of pip-sync-faster is the same as that of pip-sync: it passes all command line options and arguments blindly through to pip-sync, and it exits with pip-sync's exit status.
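The decision logic above can be sketched roughly like this (the function name `files_changed` and the cache handling here are illustrative, not the actual source; in the real tool the hashes would only be saved after pip-sync succeeds):

```python
import hashlib
import json
from pathlib import Path


def files_changed(src_files, cache_path):
    """Return True if the hashes of src_files differ from the cached ones.

    When they differ, store the new hashes so the next run can skip pip-sync.
    """
    hashes = {
        str(path): hashlib.sha512(Path(path).read_bytes()).hexdigest()
        for path in src_files
    }
    try:
        cached = json.loads(Path(cache_path).read_text(encoding="utf-8"))
    except FileNotFoundError:
        cached = None

    if hashes == cached:
        # Same set of files, same contents: nothing to do.
        return False

    Path(cache_path).write_text(json.dumps(hashes), encoding="utf-8")
    return True
```

Because the cache maps file paths to hashes, calling it with a different *set* of files (even a subset) also makes the comparison fail, matching the behaviour described above.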

To test it:

  • Create and activate a venv

  • Create some test requirements files

  • Install pip-sync-faster in the venv:

    pip install -e /path/to/pip-sync-faster
    
  • It should be obvious from the output that it's calling pip-sync:

    pip-sync-faster requirements.txt
    
  • There is one testing gotcha: pip-sync-faster must itself be in the requirements file, otherwise it'll uninstall itself (it'll call pip-sync requirements.txt which'll uninstall anything that's not in requirements.txt, including pip-sync-faster).

    In testing this is a double gotcha: even if requirements.txt contains pip-sync-faster, pip-sync will replace your local development version with a copy from PyPI.

    So in testing a local development version you have to reinstall it every time:

    pip install -e /path/to/pip-sync-faster
    
  • If you run pip-sync-faster again with the same requirements file it won't call pip-sync:

    pip-sync-faster requirements.txt
    

Other things to test:

  • If you modify the requirements file and then re-call pip-sync-faster it will call pip-sync

  • If you call it with a different requirements file it will call pip-sync. If you then call it with the first requirements file again it'll call pip-sync again

  • Call it with multiple requirements files. If you call it again without changing any of the files it won't call pip-sync; if you change any of them it will

  • If you call it with a different set of requirements files (even a subset of what you previously called it with) it will call pip-sync

@seanh seanh changed the title from Revert "Remove all non-boilerplate files" to Re-add non-boilerplate code and add tests Jul 14, 2022
@seanh seanh force-pushed the re-add-non-boilerplate-files-and-add-tests branch from 87995cd to b429198 on July 14, 2022 at 19:56
@seanh seanh requested a review from jon-betts July 14, 2022 20:01
@seanh seanh marked this pull request as ready for review July 14, 2022 20:01
@seanh seanh marked this pull request as draft July 15, 2022 09:58
@seanh seanh force-pushed the re-add-non-boilerplate-files-and-add-tests branch from b429198 to 89f5331 on July 15, 2022 at 14:44
@seanh seanh requested a review from marcospri July 15, 2022 14:51
@seanh seanh marked this pull request as ready for review July 15, 2022 14:52
@seanh seanh force-pushed the re-add-non-boilerplate-files-and-add-tests branch from d2bab8e to 368d67b on July 18, 2022 at 11:12

@jon-betts jon-betts left a comment


Would this benefit from following a more standard script pattern?

I think there's an opportunity here to make it clearer that this is a script, which would help with understanding how it works and would also avoid all of the testing woes noted in the ticket.

We have a pretty common pattern in a lot of our scripts (along with chmod +x for extra points) which I think we can follow here:

#!/usr/bin/env python

from argparse import ArgumentParser

parser = ArgumentParser(description="A tool for ...")
...

def main():
    args = parser.parse_args()
    ...

if __name__ == '__main__':
    main()

That won't work verbatim here as there are more complex arg parsing requirements, but the general structure of the script could be followed. This has a few nice features:

  • It's obvious it's intended to be used as a script (not a library)
  • The docs for how to use it are integrated and front and center
  • You can use it directly on the command line, or test it

This would avoid having to install it into the env in order to test it.
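A minimal sketch of that pattern applied here (names and help strings are illustrative, not the PR's actual code); `parse_known_args()` keeps any unrecognised options in a separate list so they could be forwarded to pip-sync untouched:

```python
#!/usr/bin/env python
"""Run pip-sync, but skip it when no requirements file has changed."""
from argparse import ArgumentParser

parser = ArgumentParser(
    description="Run pip-sync, but skip it when no requirements file has changed."
)
parser.add_argument(
    "src_files", nargs="*", help="the requirements.txt files to synchronize"
)


def main(argv=None):
    # parse_known_args() returns (known_args, unrecognised_args); the
    # unrecognised ones can be passed straight through to pip-sync.
    args, extra = parser.parse_known_args(argv)
    return args.src_files, extra


if __name__ == "__main__":
    main()
```

Run directly on the command line it parses sys.argv; tests can call `main([...])` with an explicit argument list instead.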

Other stuff

The rest is mostly just random naming suggestions.

The only thing I'd say is definitely worth thinking about is whether pre-baked hashes in the tests are a better way to go. At the moment the code that is indirectly under test is used to set up the fixture.

Another approach might be to have tests which explicitly drive it from the outside and call the method twice, changing things in between or not. This way you'd only be testing the externally observable behavior, without having to bake in any hashes if that made you uncomfortable.
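Such an outside-in test could look roughly like this, with a stand-in sync() and unittest.mock counting the pip-sync invocations (all names here are hypothetical, not the PR's test code):

```python
import hashlib
import json
import subprocess
from pathlib import Path
from unittest import mock


def sync(src_files, cache_path):
    """Stand-in for pip_sync_faster's sync(): call pip-sync only when
    the requirements files' hashes differ from the cached ones."""
    hashes = {f: hashlib.sha512(Path(f).read_bytes()).hexdigest() for f in src_files}
    try:
        cached = json.loads(Path(cache_path).read_text(encoding="utf-8"))
    except FileNotFoundError:
        cached = None
    if hashes != cached:
        subprocess.run(["pip-sync", *src_files], check=True)
        Path(cache_path).write_text(json.dumps(hashes), encoding="utf-8")


def pip_sync_call_count(src_files, cache_path, runs=2):
    """Drive sync() from the outside and count how often pip-sync was run."""
    with mock.patch("subprocess.run") as run:
        for _ in range(runs):
            sync(src_files, cache_path)
    return run.call_count
```

Only the externally observable behaviour (how many times pip-sync is invoked) is asserted on; no hashes appear in the test at all.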

parallel = true
source = ["pip_sync_faster", "tests/unit"]
omit = [
"*/pip_sync_faster/__main__.py",
Contributor Author


This'll go into the cookiecutter: __main__.py will always be omitted from coverage.


seanh commented Jul 23, 2022

This is ready for re-review:

  • I've added a __main__.py so that the package can be executed directly with python3 -m pip_sync_faster <args>. Added manual testing instructions to HACKING.md that use this.
  • Renamed pip_sync_maybe() to just sync()
  • Added help strings to the CLI (see tox -qe dev --run-command 'pip-sync-faster --help')
  • The tests now use pre-baked hashes
  • Accepted various naming suggestions (handle etc)
  • Accepted various other minor suggestions

File layout

I spent some time thinking about the best file layout for the cookiecutter to use for packages like this one that have a command line interface. This is what I came up with:

pip-sync-faster/
  setup.cfg     <- Declares a console_script that calls cli.py::cli()
  pip_sync_faster/
    __init__.py <- Empty (except maybe some imports)
    __main__.py <- Just imports and calls cli.py::cli()
    cli.py      <- Contains the argparse code and imports and calls something from core.py
    core.py     <- The actual code
  tests/
    unit/
      pip_sync_faster/
        cli_test.py
        core_test.py

If you install the package and run pip-sync-faster it goes setup.cfg -> cli.py::cli() -> core.py, and __main__.py isn't involved. If you run the package directly with python3 -m pip_sync_faster it's __main__.py -> cli.py::cli() -> core.py and setup.cfg isn't involved.
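Concretely, the console script half of this wiring is a setup.cfg fragment like the following (matching the entry point shown in the diff below):

```ini
# setup.cfg fragment: the installed pip-sync-faster command calls
# cli.py::cli() directly; __main__.py is not involved on this path.
[options.entry_points]
console_scripts =
    pip-sync-faster = pip_sync_faster.cli:cli
```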

For packages that don't have a CLI the cookiecutter will use the same layout but with the CLI-related files missing:

pip-sync-faster/
  setup.cfg
  pip_sync_faster/
    __init__.py
    core.py
  tests/
    unit/
      pip_sync_faster/
        core_test.py

Here's why I went for this layout:

  • Empty __init__.py file. I know that some people don't like code in __init__.py, and people new to Python often find it surprising. I don't want to have a file named __init___test.py, and I don't want to have to open __init__.py often because it plays badly with autocomplete (since there are lots of files with that name). Packages might want to put some imports in __init__.py in order to hoist them into the package's top-level namespace; the cookiecutter won't add any of these for you, but it won't delete any that you put there.

  • Minimal __main__.py with no tests. This is what the Python docs say is the idiomatic use of __main__.py: when Python developers see a __main__.py file their eyes can pass right over it, because they can expect it to contain no code except to import and call an entry point function. This is how the runnable stdlib packages do it. It's also nice to avoid a __main___test.py.

  • Separate cli.py file that contains the argparse code and imports and calls things from core.py. Reasons for separating cli.py out from core.py:

    • Two smaller modules instead of one bigger one. Not really necessary for a small package like pip-sync-faster but scales well as the code size grows
    • A file named cli.py makes it obvious where the CLI code is going to be found
    • You can cookiecutter a package without a CLI and then add one later: change console_script to "yes" in cookiecutter.json and run make template. All the cookiecutter has to do is add the console_script to the templated setup.cfg file and add the __main__.py, cli.py and cli_test.py files. It doesn't need to modify core.py and core_test.py, which it can't really touch because the user will have changed those files. If the CLI code were in core.py you couldn't really use make template to add a CLI later like this.
  • I chose the name core.py because main.py (what it was previously called) would clash with __main__.py. I also didn't want to call it pip_sync_faster.py: that name would have to vary between projects, and it makes for awkward imports, especially if the package wants to have a function named pip_sync_faster(), since the existence of a module with the same name blocks hoisting the function into __init__.py. You can end up with stuff like from pip_sync_faster.pip_sync_faster import pip_sync_faster.

    core.py is expected to be changed by the user including perhaps splitting it up into multiple files or just keeping it as a single file but renaming it. Here in pip-sync-faster I've renamed it to sync.py as that seems a more appropriate name specific to this project. In tox plugins it might get renamed to plugin.py, etc.

I like that we can change around the internals of the package without breaking its interface. For example we could change the contents of cli.py or rename that file etc and as long as we update setup.cfg and __main__.py the pip-sync-faster and python3 -m pip_sync_faster commands won't change.

@seanh seanh force-pushed the re-add-non-boilerplate-files-and-add-tests branch from a9bf7c3 to dddaf1b on July 23, 2022 at 12:55
@seanh seanh requested a review from jon-betts July 23, 2022 13:14
seanh added 4 commits July 23, 2022 14:51
It's enough for a console script entry point to just return an int and
setuptools makes sure that the command exits with that int as its
status.

This does mean that we have to add a sys.exit() to __main__.py so that
it mirrors the same behaviour.
A more appropriate/specific name for this particular project.
[options.entry_points]
console_scripts =
-    pip-sync-faster = pip_sync_faster.main:entry_point
+    pip-sync-faster = pip_sync_faster.cli:cli
Contributor Author


I'll update this in the cookiecutter: main.py::entry_point() renamed to cli.py::cli()

Comment on lines +1 to +5
import sys

from pip_sync_faster.cli import cli

sys.exit(cli())
Contributor Author

@seanh seanh Jul 25, 2022


It'll be a good idea to move this __main__.py into the cookiecutter where it can be implemented correctly once and for all. For one thing we can implement __main__.py in the idiomatic way (as here), and we can omit it from test coverage. Also it's actually not as trivial to implement as you might expect:

cli() is used as both the setuptools entry point function (specified in setup.cfg) and the top-level function for __main__.py to call. A setuptools entry point function is expected to return something appropriate for being passed to sys.exit(): None or 0 to exit successfully, an int to exit with that error code, or any printable object (such as a string) that will be printed to stderr before exiting with 1. You can see this by looking at the pip-sync-faster script that setuptools generates when you install the package:

$ cat .tox/dev/bin/pip-sync-faster
#!/home/seanh/Projects/pip-sync-faster/.tox/dev/bin/python
# -*- coding: utf-8 -*-
import re
import sys
from pip_sync_faster.cli import cli
if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
    sys.exit(cli())

A __main__.py or an if __name__ == "__main__" block should do the same thing: sys.exit(cli()).

This means that cli() itself shouldn't call sys.exit(): it should just return None or 0 to exit successfully, or return 1 or return "Some error message" to exit with an error.
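A minimal sketch of that return-value convention (cli here is a toy stand-in, not the real pip_sync_faster.cli):

```python
import sys


def cli(argv=None):
    # A setuptools entry point function returns values rather than
    # calling sys.exit() itself:
    #   None or 0 -> exit successfully
    #   an int    -> exit with that status code
    #   a string  -> printed to stderr, then exit status 1
    if argv and argv[0] == "--fail":
        return "something went wrong"
    return 0


def main():
    # Both the generated console script and __main__.py wrap the call
    # in sys.exit() so the two paths behave identically.
    sys.exit(cli(sys.argv[1:]))
```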

Contributor


TIL about sys.exit with a function, nice.

from pip_sync_faster.sync import sync


def cli(_argv=None): # pylint:disable=inconsistent-return-statements
Contributor Author


PyLint will force you to put return 0 or return None in every branch of the function (including at the very bottom) which just seems annoying for an entry point function.

"src_files", nargs="*", help="the requirements.txt files to synchronize"
)

args = parser.parse_known_args(_argv)
Contributor Author

@seanh seanh Jul 25, 2022


_argv (which is used by the tests) defaults to None, and parse_known_args(None) causes argparse to parse the args from sys.argv.

# Replace the cached hashes file with one containing the correct hashes for
# the requirements files that pip-sync-faster was called with this time.
with open(cached_hashes_path, "w", encoding="utf-8") as handle:
json.dump(hashes, handle)
Contributor Author


When you see it split out into its own file like this the actual core logic of pip-sync-faster is very simple (and only about 40 lines of code without comments and docstrings)

Contributor

@marcospri marcospri left a comment


This works as expected. It's easy to understand and comes with good docs 👍




def sync(src_files):
cached_hashes_path = Path(environ["VIRTUAL_ENV"]) / "pip_sync_faster.json"
Contributor


I reckon environ["VIRTUAL_ENV"] is the right place to store it. It would be just noise in the project's root.


def get_hash(path):
"""Return the hash of the given file."""
hashobj = hashlib.sha512()
Contributor


You could probably squeeze out a few microseconds by switching to sha256; if git considers that future-proof, it should be enough for this.

It probably doesn't matter, though.
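A chunked file-hashing helper along those lines, with the algorithm parameterised so sha512 and sha256 are easy to compare (a sketch, not the PR's exact code):

```python
import hashlib


def get_hash(path, algorithm="sha512", chunk_size=65536):
    """Return the hex digest of the file at `path`.

    The file is read in chunks so large files don't have to fit in memory.
    """
    hashobj = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            hashobj.update(chunk)
    return hashobj.hexdigest()
```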

@seanh seanh merged commit 3eb4740 into remove-non-boilerplate-files Jul 30, 2022
@seanh seanh deleted the re-add-non-boilerplate-files-and-add-tests branch July 30, 2022 15:50
seanh added a commit that referenced this pull request Jul 30, 2022
This commit applies suggestions from the code review in #6