DRIVERS-2497 Fix paths on Cygwin and Python package dependencies #244
LGTM
If my confusion is unfounded, then feel free to ignore it, otherwise consider clarifying the comment a bit.
.evergreen/venv-utils.sh
Outdated
"$bin" -m "$mod" --system-site-packages "$real_path" || continue
;;
virtualenv)
# -p: ensure correct Python binary is used by virtual environment.
Is -p actually needed here? -p defaults to the current version of Python, so this seems redundant.
Yes, it is required, as some old versions of virtualenv do not correctly select the Python binary used to create the virtual environment. This is documented by this comment in the old utils.sh script, but I observed it to be an issue on more than just Debian 10 distros. I wanted to link to a relevant bug report, but could not find one.
Oh, can you add the comment from the old script? It's much more informative.
Will do. 👍
.evergreen/find-python3.sh
Outdated
local -r real_path="$(cygpath -aw "$tmp")" || return
"$bin" -m venv "$real_path" || return
else
"$bin" -m venv "$tmp" || return
Can this be refactored to avoid duplicating "$bin" -m venv "$tmp" || return?
I opted for a dedicated real_path variable only when required, but I can refactor it to reduce duplication instead.
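One way to remove the duplication discussed above is to compute the path once and invoke venv a single time. The following is a minimal sketch, not the script's actual code: native_path and create_venv are hypothetical names, and checking OSTYPE is an assumed way of detecting Cygwin. The sketch also assigns real_path separately from its declaration, because `local var="$(cmd)" || return` reports the exit status of `local` rather than of `cmd`, so a cygpath failure would otherwise go unnoticed.

```bash
# Hypothetical helper: print a path that the Python binary can consume.
# On Cygwin, convert to an absolute Windows path; elsewhere, pass it through.
native_path() {
  local path="${1:?missing path}"
  case "${OSTYPE:-}" in
    cygwin*) cygpath -aw "$path" ;;
    *) printf '%s\n' "$path" ;;
  esac
}

# Hypothetical wrapper: a single venv invocation serves both cases.
create_venv() {
  local bin="${1:?missing Python binary}"
  local tmp="${2:?missing venv directory}"
  local real_path
  # Declared above, assigned here, so a conversion failure is not masked.
  real_path="$(native_path "$tmp")" || return
  "$bin" -m venv "$real_path" || return
}
```

With this shape, the Cygwin branch only affects how the path is computed, and the command that creates the environment appears once.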
.evergreen/find-python3.sh
Outdated
# Sanity check: on some environments (such as Cygwin) creation of the virtual
# environment may succeed but place the environment in an unexpected location.
if [[ -n "$(find "$tmp" -maxdepth 0 -type d -empty 2>/dev/null)" ]]; then
Can you show an example of this happening? Regardless can we remove this check because it's already handled by the if/elif/else below?
The example is as described in the PR description under "Paths on Cygwin".
I suppose it could be considered redundant due to the checks below. The intent of this check was to test if there are any files placed in the intended directory at all, which I felt to be different enough from whether or not an activation script could be found. I can remove/simplify if preferable.
Yeah I would prefer removing it because it simplifies the script and we don't do anything special for an empty dir.
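To illustrate why the empty-directory check is redundant, here is a minimal sketch of an activation-script lookup. The function name find_activate_script is hypothetical, and the if/elif/else in the actual script may differ. The sketch relies only on the fact that the activation script is placed under bin/ on POSIX platforms and under Scripts/ when a Windows Python binary creates the environment. An empty directory contains neither, so it falls through to the else branch and is reported as an error without a separate check.

```bash
# Hypothetical helper: print the activation script path for a virtual
# environment directory, or fail with a message if none can be found.
find_activate_script() {
  local venv_dir="${1:?missing venv directory}"
  if [ -f "$venv_dir/bin/activate" ]; then
    printf '%s\n' "$venv_dir/bin/activate"
  elif [ -f "$venv_dir/Scripts/activate" ]; then
    printf '%s\n' "$venv_dir/Scripts/activate"
  else
    # Also covers an empty or misplaced environment directory.
    echo "Could not find an activation script in $venv_dir!" >&2
    return 1
  fi
}
```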
.evergreen/find-python3.sh
Outdated
if [[ -n "$(find "$tmp" -maxdepth 0 -type d -empty 2>/dev/null)" ]]; then
echo "$tmp is empty despite successful creation of virtual environment!"
return 1
fi
Same comments as above.
if [[ "$windows_os_name" =~ 2016 ]]; then
# Avoid `RuntimeError: Could not determine home directory.` on
# windows-64-2016. See BUILD-16233.
Is this only windows-64-2016? What about windows-64-vsMulti-small (Microsoft Windows Server 2019 Datacenter)?
Update: I reproduced the same issue there. This probably hits all windows hosts on evergreen.
According to the patch testing windows-64-2019, there appeared to be no issue. I was not aware of windows-64-vsMulti-small. It is unclear to me what the difference between these distros may be.
I'm not sure why windows-64-2019 works but windows-64-vsMulti-small doesn't. Either way the issue needs to be fixed on windows-64-vsMulti-small too because that's what we test on in pymongo.
Just documenting what we discussed via other channels: windows-64-vsMulti-small was added to the test suite but did not demonstrate the failure that was observed when testing on a spawn host, and this issue could be related to BUILD-12392.
fi
fi
# Avoid `error: can't find Rust compiler`.
What if instead of trying to pinpoint which platforms need cryptography<3.4 we just try to install the latest version and if that fails fallback to cryptography<3.4? Like this:
python -m pip install cryptography || python -m pip install 'cryptography<3.4' || ...
python -m pip install -U "${packages[@]}" || ...
This is simpler and should work on more platforms.
It is indeed simpler, but I deliberately opted for the current approach in order to be very explicit about conditions that require workarounds and as narrow as possible in the application of said workarounds.
This was motivated by the status quo where generally-applied workarounds such as pinning cryptography to ~=3.4.8 or using CRYPTOGRAPHY_DONT_BUILD_RUST=1 continued to demonstrate unexpected failures, and the conditions for said failures appeared to be inconsistent and opaque. It was unclear to me whenever I encountered such a failure whether it was already known, a new problem, or where the blame should be assigned (did I break it, or did the environment change without my knowing?).
My hope was that being explicit in this manner would make it easier to maintain this script moving forward, with simplifications/removals of special-casing being applied in a controlled and targeted manner.
I'd prefer the generic one to avoid needing to tweak and maintain these 30 extra lines which may or may not cover all the hosts drivers test on. I think a good compromise would be to use the generic approach but add an informative comment that specifically explains why the workaround exists like:
# Installing newer versions of cryptography requires rust when a wheel is not available.
# Fallback to an older version that does not require rust if the install fails. This is needed
# for at least the RHEL 6.2, powerpc64le, zSeries, and power8 hosts.
python -m pip install cryptography || python -m pip install 'cryptography<3.4' || ...
What do you think?
I think that is an acceptable compromise. Would appreciate other reviewers' thoughts on this before committing to the refactor.
I have a slight preference for the compromise. That may require fewer changes to this script as distros undergo changes or more distros are added.
Done. Verified by this patch.
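The generic fallback agreed on above can also be wrapped in a small helper that reports which requirement was actually installed, which may make a later failure easier to attribute. This is only a sketch: install_first_available is a hypothetical name, not a function in the script, and it assumes python resolves to the interpreter of the active virtual environment.

```bash
# Hypothetical helper: try each requirement specifier in order and stop at
# the first one that pip installs successfully.
# Example call: install_first_available cryptography 'cryptography<3.4'
install_first_available() {
  local spec
  for spec in "$@"; do
    if python -m pip install "$spec"; then
      echo "Installed: $spec"
      return 0
    fi
    echo "Failed to install: $spec" >&2
  done
  # Every candidate failed; let the caller decide how to handle it.
  return 1
}
```

The order of the arguments encodes the preference: the unpinned package comes first so that hosts with an available wheel keep receiving up-to-date versions.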
Description
This PR is a followup to #236 and applies a critical patch to DRIVERS-2497 required by Windows distros.
This PR is verified by this patch.
Paths on Cygwin
The tests used to validate behavior in #236 did not account for pre commands that modify the environment so that it no longer accurately reflects the environment used by Drivers, for example by modifying the $PATH variable to prefer binaries provided by $MONGODB_BINARIES, such as mktemp. This hid the Cygwin path conversion required by Python binaries on Windows when creating the virtual environment, as demonstrated on a windows-64-vs2017-large distro (virtualenv is used to obtain informative output; venv demonstrates similar behavior but without any output).
A sanity check was added to the is_venv_capable and is_virtualenv_capable functions to ensure correct behavior, as well as an explicit check for the presence of an activation script to provide more informative error messages if one still cannot be found.
Seed Packages
In addition to ensuring venvcreate handles paths correctly on Cygwin, the venvcreate function was updated to ensure all three "seed" packages pip, setuptools, and wheel are consistently installed in the virtual environment, as default behavior is inconsistent depending on venv vs. virtualenv and their respective versions.
A drive-by fix to correctly pass -p "$bin" when using the virtualenv module was also applied.
The --no-cache-dir argument was removed due to lack of necessity.
The --system-site-packages argument was added to improve script performance.
Error handling of the venvcreate function was improved to ensure the virtual environment is only activated on success. deactivate is only possible/necessary if venvcreate was successful.
kmstlsvenv Packages
As a result of ensuring up-to-date pip, setuptools, and wheel packages in the virtual environment, some distros began to encounter issues with installing required packages for the kmstlsvenv virtual environment. I took this opportunity to strictly narrow down the scope and conditions when default behavior does not suffice for successful installation.
The actual packages required by kmstlsvenv scripts, boto3 and pykmip, are still pinned to ~=1.19.0 and ~=0.10.0 respectively. All additional, conditionally pinned packages are dependencies required by these two packages.
The greenlet package is conditionally pinned to <2.0 to avoid build failures on macos-1012.
The setuptools package is conditionally pinned to <65.0 to avoid build failures on windows-64-2016 (see BUILD-16233).
The cryptography package is conditionally pinned to <3.4 to avoid dependency on the presence of a Rust compiler when a cryptography wheel is not available. The associated conditions were narrowed down as much as possible to allow/encourage use of up-to-date packages whenever possible.
As with venvcreate, error handling of the activate_kmstlsvenv function was improved to ensure the virtual environment is only activated on success. deactivate is only possible/necessary if activate_kmstlsvenv was successful.