Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for python 3.11 #403

Merged
merged 25 commits into from Jan 4, 2023
Merged

Add support for python 3.11 #403

merged 25 commits into from Jan 4, 2023

Conversation

rbailey-godaddy
Copy link
Contributor

@rbailey-godaddy rbailey-godaddy commented Nov 21, 2022

To help us get this pull request reviewed and merged quickly, please be sure to include the following items:

  • Tests (if applicable)
  • Documentation (if applicable)
  • Changelog entry
  • A full explanation here in the PR description of the work done

PR Type

What kind of change does this PR introduce?

  • Bugfix
  • Feature
  • Code style update (formatting, local variables)
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • CI related changes
  • Documentation content changes
  • Tests
  • Other

Backward Compatibility

Is this change backward compatible with the most recently released version? Does it introduce changes which might change the user experience in any way? Does it alter the API in any way?

  • Yes (backward compatible)
  • No (breaking changes)

Issue Linking

What's new?

Update actions to latest versions

  • actions/checkout bumped to v3.2.0
  • actions/cache bumped to v3.2.0
  • actions/setup-python bumped to v4.3.1

Update dependencies

  • pygit2 bumped to 1.11.x
  • poetry bumped to 1.2.2
  • Assorted additional bumps due to poetry update

The pygit2 version is the first one that supports python 3.11. An explicit option change was made to maintain compatibility with the behavior of pygit2 version 1.9.1 and earlier (used by existing tartufo versions), with respect to allowing repositories owned by other users to be scanned.

The calculate_entropy() method was changed to static, as it does not utilize any instance or class data and the latest pylint was offended by the prospect of caching self instances.

@rbailey-godaddy
Copy link
Contributor Author

All checks technically pass, but I think this should be held up until after #400 is merged and we can ignore python 3.6.

There is a latent problem (examine the commit history for this branch LOL) wherein after I update poetry.lock in my local environment using poetry 1.2.2, the old poetry 1.1.5 required for python 3.6 starts misbehaving and removes setuptools, basically brain-damaging python 3.6-based environments. I tried several different ways to finesse this issue, without success.

If somebody who still has poetry 1.1.5 (or earlier) installed wants to try a poetry lock and then pushing the result to this branch, it would be interesting to see what happens. I am not invested enough to want to downgrade my poetry, and I can't get pyenv to build python 3.6 successfully so local testing isn't feasible.

As it stands, pyproject.toml and poetry.lock are slightly out of sync with each other (with respect to pygit2) and I do not have a high degree of confidence that the stack being tested by tox actually matches the stack being deployed by docker. Better to wait until we know for sure.

@rbailey-godaddy
Copy link
Contributor Author

Rebased and cleaned up for a python3.6-free world. No problems with dependency bumps, although pylint came up with a few new callouts. I verified the pygit2 bump does indeed allow us to bypass compilation in the docker image, and Dockerfile is streamlined accordingly.

@rbailey-godaddy rbailey-godaddy marked this pull request as ready for review November 23, 2022 18:45
@rbailey-godaddy rbailey-godaddy marked this pull request as draft November 23, 2022 19:27
@rbailey-godaddy rbailey-godaddy changed the title Update actions to latest versions Add support for python 3.11 Nov 23, 2022
@rbailey-godaddy
Copy link
Contributor Author

There is a subtle problem baked into this PR somewhere. By way of a baseline, the recently-issued 3.3.1 docker container works as expected (and is based on python 3.10). The container generated here (based on python 3.11) is bogus -- I have extended the CI workflow to include a functionality test that catches this problem (and would have caught the 3.3.0 problem that needed 3.3.1 for a fix, if we'd had this at the time).

The issue is that this:

import pygit2
foo = pygit2.Repository(".")

works with the old image but fails with "no repository found" with the new image. I have not yet figured out why...

  • It's not python 3.11, because changing the build simply to use 3.10 as a base does not fix the problem
  • It's not pygit2 1.11.x because backing it down to 1.10.x (with python 3.10) does not fix the problem -- note 1.10.x is not claimed to support python 3.11

I manually created a bare virtual environment with python 3.11 and pygit2 1.11.1 and the test code above works, so whatever the problem is doubtless will turn out to be something subtle.

When the docker CI test passes, we should be good to go.

@rbailey-godaddy
Copy link
Contributor Author

I still haven't found a fix for the misbehavior, but here is a very suggestive test case:

(tartufo-py3.10) sbailey@WIN-2I7EHEB5T6B:/tmp$ git clone https://github.com/godaddy/tartufo /tmp/foobar
Cloning into '/tmp/foobar'...
remote: Enumerating objects: 3426, done.
remote: Counting objects: 100% (189/189), done.
remote: Compressing objects: 100% (102/102), done.
remote: Total 3426 (delta 107), reused 129 (delta 87), pack-reused 3237
Receiving objects: 100% (3426/3426), 1.51 MiB | 8.80 MiB/s, done.
Resolving deltas: 100% (2283/2283), done.
(tartufo-py3.10) sbailey@WIN-2I7EHEB5T6B:/tmp$ docker run --network host --rm -it -v /tmp/foobar:/git --entrypoint /bin/sh tartufo
# git status /git
On branch main
Your branch is up to date with 'origin/main'.

nothing to commit, working tree clean
# /venv/bin/tartufo scan-local-repo /git
Traceback (most recent call last):
  File "/venv/bin/tartufo", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/venv/lib/python3.11/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/click/decorators.py", line 38, in new_func
    return f(get_current_context().obj, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/tartufo/commands/scan_local_repo.py", line 50, in main
    scanner = GitRepoScanner(options, git_options, str(repo_path))
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/tartufo/scanner.py", line 810, in __init__
    super().__init__(global_options, repo_path)
  File "/venv/lib/python3.11/site-packages/tartufo/scanner.py", line 715, in __init__
    self._repo = self.load_repo(self.repo_path)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/tartufo/scanner.py", line 824, in load_repo
    repo = pygit2.Repository(repo_path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/pygit2/repository.py", line 1606, in __init__
    path_backend = init_file_backend(path, flags)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_pygit2.GitError: Repository not found at /git
# git clone https://github.com/godaddy/tartufo /foo
Cloning into '/foo'...
remote: Enumerating objects: 3426, done.
remote: Counting objects: 100% (189/189), done.
remote: Compressing objects: 100% (102/102), done.
remote: Total 3426 (delta 107), reused 129 (delta 87), pack-reused 3237
Receiving objects: 100% (3426/3426), 1.51 MiB | 8.65 MiB/s, done.
Resolving deltas: 100% (2283/2283), done.
# /venv/bin/tartufo scan-local-repo /foo
/venv/lib/python3.11/site-packages/tartufo/scanner.py:662: DeprecationWarning: Signature 358a1ae040ab71b3aafeb4318e5d9e318ff1b681c5bf0e99d381584a33ee7833 was generated by an old version of tartufo and is deprecated. tartufo 4.x will not recognize this signature. Please update your configuration to use signature 38d3398905c382a79857e884f33062a5a38e4e3f8b5d54f269879601b2e847fc instead.
  warnings.warn(
/venv/lib/python3.11/site-packages/tartufo/scanner.py:662: DeprecationWarning: Signature 358a1ae040ab71b3aafeb4318e5d9e318ff1b681c5bf0e99d381584a33ee7833 was generated by an old version of tartufo and is deprecated. tartufo 4.x will not recognize this signature. Please update your configuration to use signature 38d3398905c382a79857e884f33062a5a38e4e3f8b5d54f269879601b2e847fc instead.
  warnings.warn(
/venv/lib/python3.11/site-packages/tartufo/scanner.py:662: DeprecationWarning: Signature 6c70b4a712ada69f2a6c919edc783cf485b6637e3c5429de376ef1a451a042e4 was generated by an old version of tartufo and is deprecated. tartufo 4.x will not recognize this signature. Please update your configuration to use signature d26b223efb2087d4a3db7b3ff0f65362372c46190fd74b8061db8099f4a2ee60 instead.
  warnings.warn(
/venv/lib/python3.11/site-packages/tartufo/scanner.py:662: DeprecationWarning: Signature 357841766ca69979608e4e7fb1da2addeafe6d15cade15466c39d903cc0488a8 was generated by an old version of tartufo and is deprecated. tartufo 4.x will not recognize this signature. Please update your configuration to use signature 94c45f9acae0551a2438fa8932fac466e903fd9a09bd2e11897198b133937ccd instead.
  warnings.warn(
/venv/lib/python3.11/site-packages/tartufo/scanner.py:662: DeprecationWarning: Signature 88152f1d10077f417ee5aea5fa86adaa5b74d3f2de2e87ce4b4aa5cd79c7b2f5 was generated by an old version of tartufo and is deprecated. tartufo 4.x will not recognize this signature. Please update your configuration to use signature 482e7b31188f1e59dd6eb5b21c46f7467449ffcbeba9a4ca412dfcd8ea9ef66f instead.
  warnings.warn(
/venv/lib/python3.11/site-packages/tartufo/scanner.py:662: DeprecationWarning: Signature 23996f9309c042d37958d28c2cc0c16a060a03376ee075b5b077609e2a299e44 was generated by an old version of tartufo and is deprecated. tartufo 4.x will not recognize this signature. Please update your configuration to use signature 482e7b31188f1e59dd6eb5b21c46f7467449ffcbeba9a4ca412dfcd8ea9ef66f instead.
  warnings.warn(
/venv/lib/python3.11/site-packages/tartufo/scanner.py:662: DeprecationWarning: Signature 85109edb8081a60f3a44139a98b643b8cacb777f4d5c814276ae729d50fe414c was generated by an old version of tartufo and is deprecated. tartufo 4.x will not recognize this signature. Please update your configuration to use signature bf48e316ecad0a37071ed81cc54a6b629a1a3bf73cf52d06a77c9fbaf91bc833 instead.
  warnings.warn(
Time: 2022-12-14T21:46:30.482292
All clear. No secrets detected.
#

That is,

  • I clone tartufo into a fresh working directory
  • I run the container
  • Within the container, git believes the working directory is valid
  • tartufo (actually pygit2) thinks the working directory is not valid
  • I repeat the clone (this time inside the container) to a new scratch directory
  • tartufo does succeed in scanning this second directory -- even though it should be identical to the first

Not shown, I embedded rsync so I could copy /git to /foo2 and it turns out that /foo2 is not scannable either.

This suggests the problem is somehow with the directory mapping, but that's really bizarre. (I am experimenting in WSL.) I have tried backing off to base image python:3.10-slim while changing nothing else, but that fails the same way. Then backing off pygit2 from 1.11.1 to 1.10.1 also (the older version isn't supported with python 3.11), and that still fails.

I am completely flummoxed by what is going on. I would be interested to hear if anybody can reproduce these results, or obtain contrary results (that would point the finger at my screwed-up environment).

@rbailey-godaddy
Copy link
Contributor Author

It looks like our problem may be a regression in pygit2 1.10.0. Here is the simple test to reproduce:

  • Checkout main, build container, verify container works
  • Hack Dockerfile to change version for pygit2 only:
diff --git a/Dockerfile b/Dockerfile
index 1318c92..18d49e0 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -13,7 +13,7 @@ RUN pip --no-cache-dir install "poetry==$POETRY_VERSION"
 RUN python -m venv /venv

 COPY pyproject.toml poetry.lock ./
-RUN poetry export -f requirements.txt | /venv/bin/pip install -r /dev/stdin
+RUN poetry export --without-hashes -f requirements.txt | sed '/pygit2/s/1.9.1/1.10.1/' | /venv/bin/pip install -r /dev/stdin

 COPY . .
 RUN poetry build && /venv/bin/pip install dist/*.whl
  • Build container, verify container fails as described earlier ("Repository not found at ...")
  • Compare pip output from both builds, confirming version numbers are identical except for pygit2 per above

@rbailey-godaddy
Copy link
Contributor Author

Further bisection using the technique I just outlined narrows the problem space to somewhere between pygit2 1.9.1 (which works) and pygit2 1.9.2 (which fails)

@rbailey-godaddy
Copy link
Contributor Author

Veeeerrrry interestingly, the failure message in a manual diagnostic for 1.9.2 is much more informative than later versions:

$ docker run --rm -it -v $PWD:/git --entrypoint /venv/bin/python tartufo:pygit2-1.9.1
Python 3.10.9 (main, Dec 21 2022, 08:51:48) [GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pygit2
>>> foo = pygit2.Repository(".")
>>>

$ docker run --rm -it -v $PWD:/git --entrypoint /venv/bin/python tartufo:pygit2-1.9.2
Python 3.10.9 (main, Dec 21 2022, 08:51:48) [GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pygit2
>>> foo = pygit2.Repository(".")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/venv/lib/python3.10/site-packages/pygit2/repository.py", line 1620, in __init__
    path_backend = init_file_backend(path, flags)
_pygit2.GitError: .: repository path '/git/' is not owned by current user
>>>

The delta in pygit2 between 1.9.1 and 1.9.2 is extremely small, and the only consequential change in this context is a version bump from libgit2 from 1.4.2 to 1.4.3. Therefore, I'm now investigating what happened between those two releases.

@rbailey-godaddy
Copy link
Contributor Author

You just can't win. I added commit cb2cd60 which addresses the change of behavior in libgit2 that was breaking this PR, but now the linter actions all fail with some weird error, even though they all pass locally. In any case, I am moving this PR out of draft as I believe it now is ready for merging.

@rbailey-godaddy rbailey-godaddy marked this pull request as ready for review December 21, 2022 21:50
* actions/checkout bumped to v3.1.0
* actions/cache bumped to v3.0.11
* actions/setup-python bumped to v4.3.0
docker/setup-buildx-action now v2.2.1 (was v1.6.0)
* Adjust pygit2 requirements to match latest support matrix
* Bump poetry to latest stable version
Force pip to install a version of poetry new enough to handle setuptools
correctly, instead of whatever old crummy version it happens to find
lying around in cache.
So forcing a latest-and-greatest poetry for python 3.6 doesn't work
because the latest and greatest version isn't great enough. Fall back to
letting pip install the poetry version it wants, but then additionally
install setuptools explicitly; this way, if poetry fumbles and thinks it
won't be needed, it's okay because we already installed it.
Forcing setuptools installation manually before running poetry is
ineffective, because the old poetry for python 3.6 removes the
"unnecessary" package.

In the key area, namely the python unit test section of tox.ini, try
telling poetry we want setuptools also to see if we can avoid this
behavior.

FWIW, I am beginning to suspect this entire situation is due to
rebuilding poetry.lock with a new poetry version and then letting an old
poetry version try to interpret it. The problem only arises with python
3.6, because other pythons support versions new enough to not be
braindamaged.
* Add 3.11 to various test matrix
* Bump Dockerfile base from 3.10 to 3.11
* Drop references to python 3.6 in test matrices
* Incorporate all of the bot PR version bumps
* Skip GitPython over all of the yanked versions
* Bump pylint to latest version -- eliminates numerous false positive
  detections when run with python 3.11
* Bump minimum python to 3.7.2 to placate pylint after update
* Remove exclusions for deprecated/deleted pylint checks
* Minor reworking of some tests to eliminate new pylint complaints
We need the latest version for python 3.11 support, so bump version but
pin python 3.7 to previous version, which is the last one that supports
the old python.
We've had several instances where the docker build is "successful" but
the resulting container image is completely brain-damaged. Add a step
that exercises it by running a scan of the working directory used to
build the container.
* Instead of having poetry tell pip how to install packages, just make
  it do the job itself.
* Since we don't consume tarball dists, don't generate them.

If you are wondering why we don't just do a single `poetry install` to
do all of the prerequisites *AND* tartufo itself, it's because it
doesn't quite work. I think there are linkages to the original content
in /app that isn't carried through to the final image; making that work
more sanely is left as an exercise for the student.
Latest tagged version understands our updated signatures
Consolidate steps so we actually have the image available for testing
Hopefully this will save the image so we can test it
libgit2 version 1.4.3 introduced a number of new sanity checks aimed at
thwarting attacks where an untrusted entity could inject configuration
files into a repository that might be read and acted upon by an unwary
consumer. By default, it refuses to read a repository that is not owned
by the current user.

This change became visible in pygit2 1.9.2, which bumped its internal
libgit2 linkage from 1.4.2 to 1.4.3. Initially, 1.9.2 was kind enough to
report explicitly when raising a GitError that "repository path ... is
not owned by current user" -- but later versions did not include this
information, making it very difficult to determine why the "[r]epository
not found at ..." error was occurring.

A new option, `GIT_OPT_SET_OWNER_VALIDATION`, was added to libgit2 to
allow users to opt out of this behavior, but it was not included in
pygit2 1.9.2; luckily by the time we get to pygit2 1.11.x, it is
visible.

This commit explicitly forces `GIT_OPT_SET_OWNER_VALIDATION` to `0`,
restoring compatibility with pygit2 1.9.1 and earlier.

While there are many cases where the default behavior might be
appropriate, as a practical manner we must opt out to support use from a
docker container, where the owner of the host files will almost never be
the same as the process running tartufo inside the container. It's much
more straightforward to preserve existing behavior in all cases.

If you are a paranoid, do not run scans of repositories you do not own.
Note that remote repositories were not affected by this change, because
they always caused a clone operation where the resulting working
directory is created (and therefore owned) by the current user.
Update github actions to latest released versions, to see if this fixes
inscrutable linter workflow failures
`whitelist_externals` -> `allowlist_externals`
@rbailey-godaddy rbailey-godaddy merged commit 085d9ea into main Jan 4, 2023
@rbailey-godaddy rbailey-godaddy deleted the action-versions branch January 4, 2023 18:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants