Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add hygienic alternatives to Python virtual environment scripts #236

Merged
merged 18 commits into from
Nov 3, 2022

Conversation

eramongodb
Copy link
Contributor

Description

This PR is the first of several PRs whose goal is to improve the robustness and hygiene of DET shell scripts by taking full advantage of the features provided by Bash.

This PR is focused on providing improved alternatives to scripts that pertain to Python virtual environment creation and activation. This PR does not change the behavior of any existing scripts. Followup PRs will attempt to incorporate these new scripts into existing scripts, which may change their behavior.

find-python3.sh

This file defines four functions:

  • is_python3
  • is_venv_capable
  • is_virtualenv_capable
  • find_python3

The primary purpose of this file is to provide find_python3, which is implemented in terms of the other three functions.

find_python3 aims to serve as a reliable and comprehensive utility function to replace the numerous instances of "find a python3 binary capable of creating a virtual environment" conditional logic found in many Evergreen scripts, both in the DET repository (e.g. activate_venv.sh) and in many other Drivers repositories. All demonstrate discrepancies from one another and are often not comprehensive in their handling of various variant-specific idiosyncracies.

Most notably, find_python3 and other functions provided by find-python3.sh ensure no variables used within the functions are leaked into the parent shell. The name of the resulting python3 binary (and only the name) is printed to stdout on success; all diagnostic and error messages are printed to stderr and may be silenced by redirecting 2>/dev/null. This allows for convenient patterns such as the following:

$ cd ./.evergreen/csfle
$ . ../find-python3.sh
$ PYTHON="$(find_python3)" ./activate_venv.sh

This is unlike, for example, the current utils.sh, which "leaks" the $PYTHON variable upon invocation of venvcreate and can inadvertently affect the behavior of other scripts if one is not careful, such as when subsequently using set-temp-creds.sh:

$ cd ./.evergreen/csfle
$ . ./activate-venv.sh  # Sets `PYTHON=/path/to/python` within venvcreate; installs boto3 package to venv.
$ . ./set-temp-creds.sh # Uses PYTHON set by venvcreate instead of virtual environment's python!
                        # boto3 package was installed using venv, not the system!
...
ModuleNotFoundError: No module named 'boto3'

This error can be worked-around by explicitly setting PYTHON=python after activate_venv.sh to ensure the venv python is used instead of any prior value of $PYTHON. However, it is preferable if such variable leaks did not occur in the first place.

venv-utils.sh

This file is a hygienic alternative to utils.sh.

As described above, the $PYTHON variable (among others) is leaked by the venvcreate and venvactivate functions defined by utils.sh. Furthermore, their behavior is affected by the $OS environment variable parameter, which is often set and used by other Evergreen scripts in manners inconsistent to what utils.sh expects. This can have the effect of indirectly breaking the behavior of utils.sh functions. Lastly, venvcreate is defined to exit 1 on error (terminates the shell), which limits error handling options by the user.

The venvcreate and venvactivate functions provided by venv-utils.sh resolves each of these issues. All parameters to each function is explicitly documented and asserted. The function's behavior depends solely on the provided arguments with no dependency on environment variables. On error, each function immediately returns a non-zero exit code to allow callers to handle errors as appropriate to their needs.

activate-kmstlsvenv.sh

This file is a hygienic alternative to activate_venv.sh.

As with activate_venv.sh, this script is meant to be invoked by the caller to create and activate the kmstlsvenv virtual environment. It is implemented in terms of both find-python3.sh and venv-utils.sh to avoid the variable leakage problems described above. Furthermore, on error, it immediately returns a non-zero exit code instead of continuing execution as is currently done by activate_venv.sh.

ShellCheck

All three files were analyzed by ShellCheck using the command shellcheck -x ./path/to/file.sh from the root directory of the DET repository. Automated ShellCheck integration is outside the scope of this PR, but hopefully in the near future more DET scripts will be analyzed by ShellCheck to improve their quality.

@ShaneHarvey
Copy link
Contributor

ShaneHarvey commented Oct 7, 2022

I would prefer that we fix the seemingly small bugs in the current scripts rather than writing everything from scratch. Would these problems be solved by changing venvcreate/venvactivate to use private variables (eg $_python instead of $PYTHON) and changing set-temp-creds.sh to use "python" instead of "$PYTHON"?

@eramongodb
Copy link
Contributor Author

@ShaneHarvey I considered replacement as improvement, but due to the nature of the issues being addressed, I am concerned that doing so may have an unbounded scope of potentially breaking changes to existing users of these scripts. As the version of DET scripts being used by Drivers are often not pinned to a given commit (just a git pull), this could have the undesirable effect of unexpectedly and unconditionally breaking Drivers' Evergreen builds that may be deliberately or unknowningly depending on the current behavior of these scripts (leak of environment variables, use of environment variables, error handling and shell termination behavior, etc.). Thus the emphasis on, "This PR does not change the behavior of any existing scripts".

By providing these scripts separately, Drivers will be given the opportunity to gradually test and migrate to using these hygienic scripts before further behavior-changing modifications are introduced to DET scripts (I am eyeing the start-orchestration.sh script in particular). Eventually, the current scripts can then be deprecated and removed in a controlled manner once these new alternatives have been sufficiently battle-tested and adopted.

@ShaneHarvey
Copy link
Contributor

If this was a change that we knew would break all driver testing I might agree with this approach but I don't think that's the position we're in. The minor tweaks to avoid shadowing or unintentionally reusing $PYTHON seem like small fixes that have low chance of breaking anyone. The plus side of my suggestion is that in the best case we fix these bugs and drivers don't need to change anything at all.

The normal approach we'd use is to fix the bug, test it in 1 or 2 drivers, merge, and then send an announcement in #drivers that teams might see python env setup failures in CSFLE/MONGODB-AWS test suites with a suggestion of how to fix it.

Copy link
Contributor

@rcsanchez97 rcsanchez97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can understand the concerns of Shane, though I don't share them. It seems like a step in the right direction to make our builds more hygienic and robust and this approach is superior to incremental bug fixes to the existing scripts. That said, I'll wait on approving to see what others might have to say.


local -r bin="${1:?'is_venv_capable requires a name or path of a python binary to test'}"

# Use a temporary directory to avoid polluting the caller's enviornment.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Use a temporary directory to avoid polluting the caller's enviornment.
# Use a temporary directory to avoid polluting the caller's environment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.


local -r bin="${1:?'is_virtualenv_capable requires a name or path of a python binary to test'}"

# Use a temporary directory to avoid polluting the caller's enviornment.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Use a temporary directory to avoid polluting the caller's enviornment.
# Use a temporary directory to avoid polluting the caller's environment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

}

# /opt/mongodbtoolchain/vX/bin/python
append_bins "/opt/mongodbtoolchain" "v[0-9]*" "bin/python3" "bin/python"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be reconsidered. We use the toolchain in various places and those uses have the potential to become problematic. Specifically, the toolchain is intended for the server project, so it can break without warning. It may be that the way this script checks for presence and capability of Python is resilient against unexpected changes to the toolchain. If that's the case then a comment to that effect might be good here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prioritization of toolchain over system binaries was motivated by older tests where the selection of a binary under /opt/python failed where /opt/mongodbtoolchain did not. Revisiting the tests, this does not appear to be the case for any of the currently supported variants. Given this is the case, I think I am willing to revert the ordering to the "prefer system binaries first, toolchain binaries last" order I had initially preferred.

Regarding resiliance, the is_venv_capable and is_virtualenv_capable tests ensure that regardless of the state of the toolchain, it will correctly deduce whether the associated binaries can create a virtual environment. Unless the toolchain breaks the currently established pattern of placing Python binaries under /opt/mongodbtoolchain/vX/bin for a given value of X (currently 2 through 4, I believe), this script should continue to behave as intended.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upon further testing, I rediscovered that prioritizing system binaries (e.g. plain python or python3) may lead to cryptography installation failures on rhel81-power8-small. In response, I have demoted the priority of system Python binaries to be tested after all other "explicitly-managed" Python binaries have been tested first.

Copy link
Contributor

@benjirewis benjirewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These scripts LGTM; cool bash coding, thanks @eramongodb 🧑‍🔧 .

I do wonder if we could just replace the old scripts with these new scripts as part of this PR, though. Doing it in separate PRs just seems like prolonging the possible breaking of drivers' reliance on the scripts.

Copy link
Contributor

@kevinAlbs kevinAlbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM.

If this was a change that we knew would break all driver testing I might agree with this approach but I don't think that's the position we're in. The minor tweaks to avoid shadowing or unintentionally reusing $PYTHON seem like small fixes that have low chance of breaking anyone.

Testing the activate-kmstlsvenv.sh script with Go suggests there are required changes to update Evergreen config to explicitly use bash:

[2022/10/10 17:17:53.902] sh: 17: ./activate-kmstlsvenv.sh: [[: not found
[2022/10/10 17:17:53.902] sh: 52: ../find-python3.sh: Syntax error: "(" unexpected
[2022/10/10 17:17:53.903] Command ''shell.exec' in "start-cse-servers"' failed: shell script encountered problem: exit code 2.

I suggest filing a DRIVERS ticket to request drivers use the new scripts. Once all drivers have implemented the required changes, the existing scripts can be safely updated.

.evergreen/find-python3.sh Outdated Show resolved Hide resolved
.evergreen/find-python3.sh Outdated Show resolved Hide resolved
.evergreen/find-python3.sh Outdated Show resolved Hide resolved
@ShaneHarvey
Copy link
Contributor

Testing the activate-kmstlsvenv.sh script with Go suggests there are required changes to update Evergreen config to explicitly use bash:

This was the point I was trying to make. I don't think we should make all teams update their scripts to use this new approach. Instead we can make the small $PYTHON bug fixes to the existing scripts and teams (most likely) wouldn't need to do any work.

I do like the modern bash and nice error messages though.

@kevinAlbs
Copy link
Contributor

This was the point I was trying to make. I don't think we should make all teams update their scripts to use this new approach. Instead we can make the small $PYTHON bug fixes to the existing scripts and teams (most likely) wouldn't need to do any work.

I do like the modern bash and nice error messages though.

I see. I misinterpreted the point. I am not opposed to the small fixes in the existing scripts. But IMO this PR is still an overall improvement. I do not consider it a requirement to update the existing scripts to merge.

@eramongodb
Copy link
Contributor Author

eramongodb commented Oct 10, 2022

we can make the small $PYTHON bug fixes to the existing scripts and teams (most likely) wouldn't need to do any work

This may be somewhat feasible for activate-kmstlsvenv.sh, but not for venv-utils.sh, and certainly not for find-python3.sh.

These scripts were motivated by external discussions suggesting that Evergreen supports Bash on all distros, even on Windows via Cygwin (see also the shell parameter for the shell.exe Evergreen project command). This guarantee is what permitted me to make full use of Bash features in order to accomplish the goal(s) of this PR.

Running ShellCheck on activate-kmstlsvenv.sh with the shebang set to invoke sh yields the following:

In .evergreen/csfle/activate-kmstlsvenv.sh line 17:
  if [[ -d kmstlsvenv ]]; then
     ^-----------------^ SC2039: In POSIX sh, [[ ]] is undefined.

For more information:
  https://www.shellcheck.net/wiki/SC2039 -- In POSIX sh, [[ ]] is undefined.

Similarly, running ShellCheck on venv-utils.sh with the shebang set to invoke sh yields the following (repeats of warnings omitted for brevity):

In .evergreen/venv-utils.sh line 27:
  local -r bin="${1:?'venvcreate requires a Python binary to use for the virtual environment'}"
  ^----------^ SC2039: In POSIX sh, 'local' is undefined.


In .evergreen/venv-utils.sh line 30:
  if "$bin" -m venv -h &>/dev/null; then
                       ^---------^ SC2039: In POSIX sh, &> is undefined.


In .evergreen/venv-utils.sh line 44:
  if [[ -f "$path/Scripts/activate" ]]; then
     ^-- SC2039: In POSIX sh, [[ ]] is undefined.

The [[ ]] warnings can be easily addressed by demoting the conditional construct used to the Bourne shell builtin [ ] and test (the Bash builtin [[ ]] was used in these scripts as the new, improved version of [ ] and test). Similarly, the &> warnings can be addressed by using the equivalent 1>/dev/null 2>&1 instead of &>/dev/null.

However, the local warnings are not addressable, as POSIX sh does not support local scoping of variables. None of the workarounds seem applicable (subshells prevent use of functions as source) or recommendable (complex boilerplate, not scalable).

Given the primary goal of this PR is to avoid variable leakage, being unable to use Bash in full due to semi-backwards-compatibility requirements with current use of activate_venv.sh and utils.sh (as observed when testing with Go) would significantly reduce the value of this PR.

Copy link
Contributor

@ShaneHarvey ShaneHarvey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that we don't need to use bash local variables, for example we can prefix the vars to avoid clashing in practice:

venvcreate () {
+    __PYTHON="$1"
-    PYTHON="$1"

Another suitable option would be to remove the var altogether and use "$1" instead of "$PYTHON".

@@ -0,0 +1,29 @@
#!/usr/bin/env bash
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't shebangs meaningless (ignored) for scripts that are intended to be sourced? Eg changing this to #!/usr/bin/env sh would have no effect.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason is twofold:

  • To document that the scripts are designed to be used in Bash shells.
  • To indicate to ShellCheck that the script should be analyzed as a Bash script.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha thanks for explaining that.

trap 'rm -rf "$tmp"' EXIT

# Evaluate the result of this function.
"$bin" -m venv "$tmp" 1>&2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recall that in some broken environments python -m venv/virtualenv can succeed but the activate script fails. But no need to change anything now. Let's wait until we actually encounter such a failure before we try to workaround it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is fairly trivial to add activation to the test, so I do not mind including it. However, I did not uncover during testing any distros where activation failed when venv creation did not.

More importantly, your comment made me realize that venvcreate needs to account for possible failure during venv creation + activation when falling back to using virtualenv to match the capability detection logic. I have updated it accordingly.

Copy link
Contributor Author

@eramongodb eramongodb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can prefix the vars to avoid clashing in practice

This is not a proper solution, as it does not address the issue of variable leakage. This just changes the name of the variable being leaked. The environment will still be polluted by variables. This is not scalable.

Another suitable option would be to remove the var altogether and use "$1" instead of "$PYTHON".

This is also not a scalable solution. Forbidding the use of named variables in non-subshell functions in order to continue supporting the Bourne shell (where local is not available) does not align with the goals of this PR, which as stated earlier, "is to improve the robustness and hygiene of DET shell scripts by taking full advantage of the features provided by Bash".

@@ -0,0 +1,29 @@
#!/usr/bin/env bash
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason is twofold:

  • To document that the scripts are designed to be used in Bash shells.
  • To indicate to ShellCheck that the script should be analyzed as a Bash script.

@eramongodb
Copy link
Contributor Author

Latest changes are verified by this patch. The set of distros tested are (best-effort) all non-EOL'd distros as currently listed in the MongoDB Platform Roadmap. The additions to the Evergreen config are omitted from this PR, but may be included if desirable.

The root cause of the test failure on macos-1012 is being tracked by BUILD-16106.

local -r version_str="$(perl -lne 'print $1 if m/.*([0-9]+\.[0-9]+\.[0-9]*)/' -- <(printf "%s" "$version_output"))"

# Evaluate 3.0.0 <= x.y.z.
sort -CV -- <(printf "%s\n%s\n" "3.0.0" "$version_str")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI this logic can be implemented much simpler in python. For example:

if python -c "import sys; exit(sys.version_info[0] < 3)"; then
   echo "python 3"
else
   echo "python 2"
fi

this would avoid the dependency on "sort".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verified this suggestion works as suggested. Thank you!

However, it does not completely remove the GNU Coreutils sort dependency as it is used below to prioritize newer MongoDB Toolchain and Python versions under /opt. It seems like macos-1012 is circumstantially passing due to system python3 satisfying the requirements despite sort command failures when searching for binaries under /opt.

@eramongodb
Copy link
Contributor Author

Unless there are any remaining concerns, I think I am comfortable merging this PR given all distros currently supported according to the MongoDB Platform Roadmap are currently passing despite BUILD-16106.

The plan is to then create a DRIVERS Task instructing Drivers to replace use of activate_venv.sh and utils.sh with csfle/activate-kmstlsvenv.sh and venv_utils.sh respectively. Once all Drivers have completed this task, the old activate_venv.sh and utils.sh scripts can be safely removed from this repo.

@eramongodb eramongodb merged commit f5d1c04 into mongodb-labs:master Nov 3, 2022
@eramongodb eramongodb deleted the det-scripts branch November 3, 2022 21:17
@eramongodb
Copy link
Contributor Author

The plan is to then create a DRIVERS Task instructing Drivers to replace use of activate_venv.sh and utils.sh with csfle/activate-kmstlsvenv.sh and venv_utils.sh respectively.

DRIVERS-2497 has been created.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants