env: clear environment variables that interfere with Python #375

Closed
wants to merge 1 commit into from
30 changes: 23 additions & 7 deletions src/build/env.py
@@ -13,7 +13,7 @@
 import tempfile

 from types import TracebackType
-from typing import Callable, Iterable, List, Optional, Tuple, Type
+from typing import Callable, Dict, Iterable, List, Optional, Tuple, Type

 import packaging.requirements
 import packaging.version
@@ -84,8 +84,17 @@ def _subprocess(cmd: List[str]) -> None:
 class IsolatedEnvBuilder:
     """Builder object for isolated environments."""

+    _ENV_VARS_TO_CLEAR = (
+        'PYTHONHOME',
+        'PYTHONPATH',
+        'PYTHONPLATLIBDIR',
+        'PYTHONSTARTUP',
+        'PYTHONNOUSERSITE',
+    )
+
     def __init__(self) -> None:
         self._path: Optional[str] = None
+        self._old_env_values: Dict[str, Optional[str]] = {}

     def __enter__(self) -> IsolatedEnv:
         """
@@ -102,16 +111,19 @@ def __enter__(self) -> IsolatedEnv:
             else:
                 self.log('Creating venv isolated environment...')
                 executable, scripts_dir = _create_isolated_env_venv(self._path)
-            return _IsolatedEnvVenvPip(
-                path=self._path,
-                python_executable=executable,
-                scripts_dir=scripts_dir,
-                log=self.log,
-            )
         except Exception:  # cleanup folder if creation fails
             self.__exit__(*sys.exc_info())
             raise

+        self._old_env_values = {name: os.environ.pop(name, None) for name in self._ENV_VARS_TO_CLEAR}
+
+        return _IsolatedEnvVenvPip(
+            path=self._path,
+            python_executable=executable,
+            scripts_dir=scripts_dir,
+            log=self.log,
+        )
+
     def __exit__(
         self, exc_type: Optional[Type[BaseException]], exc_val: Optional[BaseException], exc_tb: Optional[TracebackType]
     ) -> None:
@@ -122,6 +134,10 @@ def __exit__(
         :param exc_val: The value of exception raised (if any)
         :param exc_tb: The traceback of exception raised (if any)
         """
+        for name, old_value in self._old_env_values.items():
+            if old_value is not None:
+                os.environ[name] = old_value
Contributor

Ugh, this makes the entire class not thread safe. IMHO we should instead create a copy of os.environ, alter that one and pass it down to the subprocess calls we end up invoking.
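
A minimal sketch of the approach suggested in this comment, not code from the PR. The helper names `sanitized_env` and `run_isolated` are hypothetical and do not exist in `build`. Instead of mutating `os.environ`, the sketch copies it, drops the interfering variables from the copy, and passes the copy to `subprocess.run`:

```python
import os
import subprocess
import sys
from typing import Dict, List, Mapping, Optional

# Same variables the diff clears; repeated here so the sketch is self-contained.
ENV_VARS_TO_CLEAR = (
    'PYTHONHOME',
    'PYTHONPATH',
    'PYTHONPLATLIBDIR',
    'PYTHONSTARTUP',
    'PYTHONNOUSERSITE',
)


def sanitized_env(base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of ``base`` (default: os.environ) without the interfering variables."""
    env = dict(os.environ if base is None else base)
    for name in ENV_VARS_TO_CLEAR:
        env.pop(name, None)
    return env


def run_isolated(cmd: List[str]) -> 'subprocess.CompletedProcess[str]':
    """Hypothetical helper: run ``cmd`` with the sanitized copy; os.environ is left untouched."""
    return subprocess.run(cmd, env=sanitized_env(), check=True, capture_output=True, text=True)


if __name__ == '__main__':
    result = run_isolated([sys.executable, '-c', 'import os; print("PYTHONPATH" in os.environ)'])
    print(result.stdout.strip())
```

Because nothing process-wide is modified, other threads never observe the variables disappearing; the cost is that every subprocess call has to go through the helper (or receive the copied mapping explicitly).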

Member Author
@FFY00 Oct 10, 2021

In that case we need to add a subprocess helper and require people to always use it, which is very limiting and requires people to change their code. Most people are using it in single-threaded code, so it would be very disruptive IMO.

What about adding a keep_env/skip_env argument, defaulting to False, to disable the environment variable modification, and adding both a subprocess helper and an env attribute with the env that should be used in subprocess invocations? The downside is that people running this in multi-threaded/parallel situations would have to opt in, but this way it would not disrupt existing code and would keep the API simple for single-threaded code, which is most of it. I feel this compromise is reasonable, what do you think?

Contributor

Sounds reasonable if that's the case 👍 I thought this might be easier considering we already overwrite the pep517 package's subprocess invocation with our own, as far as I remember 🤔 (so we should know that part).

Member Author

Hum, actually, even if a bit disruptive, I am now leaning towards an update_env argument, with the opposite function. The API I proposed above motivates? (I don't remember the word I was looking for, something along those lines but that made sense in this sentence) non-thread-safe APIs, which is something we should probably avoid. Worst-case scenario, people will get the exact same behavior as currently, they just might run into #373. And this would technically be a breaking change in thread safety, even though the line there is a bit blurry.
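
A rough sketch of what an opt-in `update_env` argument could look like, under the assumption that the builder always exposes a sanitized `env` mapping and only touches `os.environ` when asked. `EnvScope` is a hypothetical stand-in reduced to the environment handling; it is not the actual `IsolatedEnvBuilder`, and the `env` attribute is likewise an assumption taken from the proposal above:

```python
import os
from types import TracebackType
from typing import Dict, Optional, Type

_ENV_VARS_TO_CLEAR = (
    'PYTHONHOME',
    'PYTHONPATH',
    'PYTHONPLATLIBDIR',
    'PYTHONSTARTUP',
    'PYTHONNOUSERSITE',
)


class EnvScope:
    """Hypothetical context manager showing only the environment handling."""

    def __init__(self, update_env: bool = False) -> None:
        self._update_env = update_env
        self._old_env_values: Dict[str, Optional[str]] = {}
        self.env: Dict[str, str] = {}

    def __enter__(self) -> 'EnvScope':
        # Always build a sanitized copy that callers can pass to subprocess calls.
        self.env = {k: v for k, v in os.environ.items() if k not in _ENV_VARS_TO_CLEAR}
        if self._update_env:
            # Opt-in only: mutating os.environ is process-wide and not thread safe.
            self._old_env_values = {name: os.environ.pop(name, None) for name in _ENV_VARS_TO_CLEAR}
        return self

    def __exit__(
        self, exc_type: Optional[Type[BaseException]], exc_val: Optional[BaseException], exc_tb: Optional[TracebackType]
    ) -> None:
        # Restore only the variables that were actually set before entering.
        for name, old_value in self._old_env_values.items():
            if old_value is not None:
                os.environ[name] = old_value
        self._old_env_values = {}
```

With the default `update_env=False`, existing callers keep today's behavior and thread safety, and only callers that pass `update_env=True` accept the process-wide mutation.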

Member

We 'overwrite' it in the project builder for all builds. If we overwrite it in the isolated env class, then users will have to pass the isolated env's runner to the project builder, and we'd need to provide some sort of function to wrap user-provided subprocess runners if they are to be used in conjunction with an isolated env. We should definitely not add a flag AND a subprocess runner wrapper AND an env attribute AND optionally mutate os.environ.

Member

It shouldn't need to interface with the builder, only with the isolated env, right? So you could slot in any isolated env you like:

with IsolatedEnvBuilder(...) as isolated_env:
    ProjectBuilder.from_isolated_env(isolated_env)

# Or...

with MyCustomEnvBuilderWhichReturnsAnIsolatedEnvSubclass() as isolated_env:
    ProjectBuilder.from_isolated_env(isolated_env)

Member Author

> It shouldn't need to interface with the builder, only with the isolated env, right? So you could slot in any isolated env you like:

Hum, sure. That looks good to me.

I am not sure if it would make sense to make it public API, probably not.

Actually, as long as we keep the API simple, I think it would be alright.

Contributor

> Actually, as long as we keep the API simple, I think it would be alright.

Then we cannot go down the path of adopting my pep-517 implementation. The API is purposefully not simple because it encourages maximum flexibility. The entire frontend is public and non-trivial: https://github.com/tox-dev/tox/blob/rewrite/src/tox/util/pep517/frontend.py#L1

Member
@layday Oct 12, 2021

Could you create a new issue explaining how this differs from pep517 and how to proceed with adopting it?

Contributor

Sorry, I don't have time for that. At its core it differs in that it:

  • allows keeping the backend alive to reuse it between commands
  • provides stdout/stderr for the commands executed
  • has a frontend that is Python 3 only and type hinted 👍

+        self._old_env_values = {}
         if self._path is not None and os.path.exists(self._path):  # in case the user already deleted skip remove
             shutil.rmtree(self._path)

23 changes: 22 additions & 1 deletion tests/test_env.py
@@ -112,7 +112,7 @@ def test_isolated_env_log(mocker, caplog, package_test_flit):
         ('INFO', 'Installing packages in isolated environment... (something)'),
     ]
     if sys.version_info >= (3, 8):  # stacklevel
-        assert [(record.lineno) for record in caplog.records] == [105, 103, 194]
+        assert [(record.lineno) for record in caplog.records] == [105, 112, 210]


 @pytest.mark.isolated
@@ -165,3 +165,24 @@ def test_venv_symlink(mocker, has_symlink):
     build.env._fs_supports_symlink.cache_clear()

     assert supports_symlink is has_symlink
+
+
+def test_clear_env_vars(monkeypatch, mocker):
+    mocker.patch('build.env._create_isolated_env_venv', return_value=(None, None))
+
+    keys = (
+        'PYTHONHOME',
+        'PYTHONPATH',
+        'PYTHONPLATLIBDIR',
+        'PYTHONSTARTUP',
+        'PYTHONNOUSERSITE',
+    )
+    for key in keys:
+        monkeypatch.setenv(key, 'hello!')
+
+    with build.env.IsolatedEnvBuilder():
+        for key in keys:
+            assert key not in os.environ
+
+    for key in keys:
+        assert os.environ[key] == 'hello!'