Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

main: build in a virtual env by default #18

Merged
merged 5 commits into from
Jun 27, 2020
Merged

main: build in a virtual env by default #18

merged 5 commits into from
Jun 27, 2020

Conversation

FFY00
Copy link
Member

@FFY00 FFY00 commented Jun 12, 2020

As discussed in [1] python-build should now build inside a virtual
environment, as recommended in PEP517. This patch implements that
functionality and sets it as the default behavior. It also introduces a
--no-isolation flag to disable this.

[1] https://discuss.python.org/t/moving-python-build-to-pypa/4390

Signed-off-by: Filipe Laíns lains@archlinux.org

@codecov
Copy link

codecov bot commented Jun 12, 2020

Codecov Report

Merging #18 into master will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##            master       #18    +/-   ##
==========================================
  Coverage   100.00%   100.00%            
==========================================
  Files            6         9     +3     
  Lines          256       382   +126     
==========================================
+ Hits           256       382   +126     

@FFY00
Copy link
Member Author

FFY00 commented Jun 12, 2020

@pganssle @gaborbernat could you review?

@FFY00
Copy link
Member Author

FFY00 commented Jun 12, 2020

I choose --no-isolation over something like --isolation pip as I don't think we will be providing different isolation types. I don't really think it makes sense here. But if that happens in the future we can still keep --no-isolation as an equivalent of --isolation none for backward compatibility.

@FFY00 FFY00 force-pushed the envbuild branch 2 times, most recently from 90e0401 to ee9a9fd Compare June 12, 2020 22:07
README.md Outdated Show resolved Hide resolved
build/__main__.py Outdated Show resolved Hide resolved
build/__main__.py Outdated Show resolved Hide resolved
docs/source/index.rst Outdated Show resolved Hide resolved
tests/test_main.py Outdated Show resolved Hide resolved
@FFY00
Copy link
Member Author

FFY00 commented Jun 13, 2020

I refactored the patch based on the comments, should be better now.

@FFY00 FFY00 force-pushed the envbuild branch 4 times, most recently from 7b5c426 to 8db9c11 Compare June 13, 2020 14:56
build/__init__.py Outdated Show resolved Hide resolved
tests/test_main.py Outdated Show resolved Hide resolved
@pganssle
Copy link
Member

I think overall this PR is good, but I think it's a bit under-tested considering the heavy use of mocks.

I've never found it terribly easy to test CLI applications, but I think I would get rid of .github/workflows/build.yml and move that whole test into some parameterized pytest tests that call the function via subprocess or another system call.

I also think that it would be better to use fake packages designed to test a single thing, rather than real packages that you pull from PyPI (though presumably — at least to start with — you can still hit PyPI during the tests anyway). You can see we do something like that in setuptools; in that case, we define the files in code and then construct the package in a temporary location as part of the test, but you could just as easily have a "tests/packages" directory or something with a bunch of stub packages in it.

@FFY00
Copy link
Member Author

FFY00 commented Jun 13, 2020

.github/workflows/build.yml are just sanity checks to make sure we can actually build real projects, they are not meant as a replacement for other tests. I am not sure if they should be in pytest, especially since they do take a bit of time to run. If anything, we could add them in nox? But since they are just sanity tests, I feel like they are better fit for the CI.

I think adding proper tests with fake packages is great, it's something that I have in my TODO.

@FFY00 FFY00 force-pushed the envbuild branch 2 times, most recently from 9515d4a to 7ed64ba Compare June 13, 2020 20:08
@gaborbernat
Copy link
Contributor

.github/workflows/build.yml are just sanity checks to make sure we can actually build real projects, they are not meant as a replacement for other tests. I am not sure if they should be in pytest, especially since they do take a bit of time to run. If anything, we could add them in nox? But since they are just sanity tests, I feel like they are better fit for the CI.

I think adding proper tests with fake packages is great, it's something that I have in my TODO.

IMHO they should be part of the test suite. They are integration tests, and as such should be marked as such in the tests, giving you the option to not run them if you want a fast test.

@FFY00
Copy link
Member Author

FFY00 commented Jun 13, 2020

Okay, I can add a --run-integration flag to pytest to opt in. But I believe this should be done in a separate PR.

@gaborbernat
Copy link
Contributor

What I tend to do is to make it auto on by default for nox, but disabled if you just invoke pytest. If you check tox/virtualenv you'll find exact how to of this.

@FFY00
Copy link
Member Author

FFY00 commented Jun 13, 2020

I agree with this :). On by default in nox but off in pytest.

build/__main__.py Outdated Show resolved Hide resolved
tests/test_main.py Outdated Show resolved Hide resolved

builder.build(dist, outdir)
if isolation:
with pep517.envbuild.BuildEnvironment() as env:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just had a conversation that made me realize that this may actually not be sufficient to achieve proper isolation. I was under the impression that BuildEnvironment() would create an isolated build environment, but it actually adds a layer on top of your existing environment. For example, if I create a project with the following setup.py:

from setuptools import setup

try:
    import attr
    raise ValueError("Attrs should not be available!")
except ImportError:
    pass

setup()

If I run python -m build on the project without installing attrs it works fine, but if I pip install attrs into the same environment that build is installed in, it raises the exception.

That's especially troubling because build itself has runtime dependencies, so any undeclared dependencies on packaging, pep517, pyparsing, toml, etc, would go unnoticed.

I don't know if we should try and fix this here or in pep517. I assume @gaborbernat will have a better understanding of the options than I do.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK (@takluyver would know for sure) it's the caller responsibility to ensure that on the host python only the standard library is available. So I'd say the responsibility would be on this lib to fix this. In practice, this would imply that you would need to remove from sys.path all paths that are not stdlib (you can get this via https://docs.python.org/3/library/sysconfig.html#sysconfig.get_path and passing in stdlib).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"The caller" as in the user of python-build, or "The caller" as in python-build?

I think the way it is now we're basically doing the equivalent of building a virtualenv with --system-site-packages, which is not a good default, so I feel strongly that we should not be doing that. How does virtualenv achieve this?

I suppose this explains why pip has their own logic for build isolation and doesn't directly use pep517's. We may want to copy them (though their approach does bring in at least one problem).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does virtualenv achieve this?

It knows the interpreter deeply to be able to tell how to achieve. There's no single global answer, various interpreters do it differently. Here's a list https://github.com/pypa/virtualenv/blob/master/src/virtualenv/discovery/py_info.py#L35-L103 of options that can influence things. Either pick pips way or pull in virtualenv package to create a virtual environment the way I see it. I agree we should not pull in global site package.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're thinking of removing BuildEnvironment from pep517 (along with all the higher-level code that uses it) - it's not very good, as you've noticed, and it distracts people from the useful part of the pep517 package. See pypa/pyproject-hooks#83

I wrote BuildEnvironment and the code using it as a kind of example of the simplest possible PEP 517 frontend that puts all the pieces together (installing dependencies, calling the hooks with those dependencies available). The spec only says that the build environment SHOULD be isolated, so that 'overlay' model is technically compliant, but it's not best practice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having this high level code would be in-scope for this project, we just need to be careful with what we choose to export as public API.

So I propose for now to move BuildEnvironment here as private API, and then in the future think about exporting it.

I wrote BuildEnvironment and the code using it as a kind of example of the simplest possible PEP 517 frontend that puts all the pieces together (installing dependencies, calling the hooks with those dependencies available).

I have expressed this opinion previously, but I'll reiterate as I think it's relevant. Nothing you consider an "example" should be ever exported publicly, being this API, CLI, etc. Once you make something public, people will use it. Providing code as an example forces people to copy it, tailor it to their needs, and most importantly, lets you off the hook 😄.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm right with you there. I must have been feeling rather gung-ho when I decided to make it part of the library. ;-)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the issue here was if it's private or not. The issue is that it's not working, so we should use something else, which can be private. I'd say perhaps virtualenv, but I'm happy with venv too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the current pep517 code should be hard to fix up, we just need to adjust sys.path, no? I think virtualenv would be the direction forward if we needed to deal with custom interpreters, but we don't. I would like to avoid adding a new dependency.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the current pep517 code should be hard to fix up, we just need to adjust sys.path, no?

Well sure, in theory, you just* need to fix the sys.path; however, in practice, it's non-trivial how to fix it up. Things differ per python implementation what needs to stay and what needs to go. virtualenv contains this complex logic. Hence why suggesting we use the slower venv on python 3, as that does not pull in any dependency 🤷

Copy link
Contributor

@gaborbernat gaborbernat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will need to be rebased on master. Also from the looks of it, we'd need to improve our isolation build environments, or switch to venv/virtualenv.

Copy link
Contributor

@gaborbernat gaborbernat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like here some more integration tests, where the operations run within a Python that has no packages installed at all. So I'll block this until we move the integration tests within the test suite.

@FFY00
Copy link
Member Author

FFY00 commented Jun 26, 2020

The integration tests already run in the CI, and they are passing. This PR is already big, I don't think adding more stuff is a good idea. I see no reason to block it due to the integration tests. IMO we should merge it and then work on the integration tests. I would definitely block a release though.

@gaborbernat
Copy link
Contributor

gaborbernat commented Jun 26, 2020

The integration tests already run in the CI, and they are passing.

I have a distinct feeling that they are passing by chance, not by design; because how they are set up (e.g. the reason why a sdist is unusable on the master, but the CI still passes; or how even without using syconfig.get_scheme_names the CI was passing - this shows we're missing tests here). I'm not saying you need to do more in this PR. I'm saying we should not merge until we have the infrastructure in place to fully validate that what you made is 100% right.



def test_isolated_environment(mocker):
with build.env.IsolatedEnvironment() as env:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this test should run in an isolated python environment (one that has no pip/setuptools/etc installed) so there's really no chance things are being pulled from some other path on the python path. E.g. is this past failing if you change from sys. get_scheme_names to that manual thing you defined before? It should if it's an accurate test.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That "manual thing" I had before was hardcoding the scheme names, it should work the same. Why would this be failing?

really no chance things are being pulled from some other path on the python path

What things being pulled? There is nothing being pulled. In this test I check if the environment is correct, after that I check if pip is called correctly, but we never actually call pip. I look at the path and make sure it is correct.

For eg. I import sysconfig before entering the isolated environment and make sure purelib or platlib are not there.

for path in ('purelib', 'platlib'):
    assert sysconfig.get_path(path) not in os.environ['PYTHONPATH'].split(os.pathsep)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had before was hardcoding the scheme names, it should work the same.

It should have been failing on pypy, as pypy uses different scheme names then what you hardcoded. The fact that it did not, shows something is not totally correct in this sense.

For eg. I import sysconfig before entering the isolated environment and make sure purelib or platlib are not there.

To be fair I'm not sure this is enough. I'd expect we'd isolate from any potential PYTHONPATH the user might have set, or additional paths the user injects via user/sitecustomize.

Copy link
Member Author

@FFY00 FFY00 Jun 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shouldn't.

>>>> import sysconfig
>>>> sysconfig.get_path('purelib')
'/opt/pypy/site-packages'
>>>> sysconfig.get_path('platlib')
'/opt/pypy/site-packages'

If you read the code you'll notice I remove both the default paths and the different schemes.

https://github.com/FFY00/python-build/blob/e5efbcca2df30b7de5e14477cdbd4fb83f1d932c/build/env.py#L55

The test is correct and working as expected.

@FFY00
Copy link
Member Author

FFY00 commented Jun 26, 2020

The integration tests are definitely not passing by chance. I agree that we should have more of them but I keep my position this should not block this PR. We have unit tests which should fail when someone changes the behavior, and we have integration tests that build both this project and pip. To me, this should be enough to merge this PR.

@gaborbernat gaborbernat self-requested a review June 26, 2020 14:42
Copy link
Contributor

@gaborbernat gaborbernat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't mind me, feel free to proceed as you want.

@FFY00
Copy link
Member Author

FFY00 commented Jun 26, 2020

@pganssle can you have a look?

Copy link
Member

@pganssle pganssle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm inclined to say that @gaborbernat's review is enough, because he's really the expert here — considering he maintains multiple virtualenv-related CLI tools and I primarily maintain libraries — and I doubt that I'll be able to grasp the implications of all of these changes just reading through the code in the limited time I have available for this sort of thing.

Is there anything particularly controversial or any major decisions that I'm glossing over here? I can probably form opinions on individual questions a lot easier than I can grok the entire change.


builder = build.ProjectBuilder()

assert builder._build_system == build._DEFAULT_BACKEND
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not really an expert in testing CLIs, so maybe this is the most expedient way to do this, but my general rule for tests is that I only test the public API, so anything that's making assertions about the internal private structure is usually off-limits (or I mark it with a comment or something to indicate why this is the only / best way to test it).

Put another way, if you did an otherwise safe refactoring of the ProjectBuilder class, you would need to change this test, but that defeats one of the purposes of tests — allowing you to refactor with confidence, because you've made assertions about the behavior that should remain invariant.

Of course, I rarely have to test CLIs, so I think the rules may be a little different there (when there's no public API available from Python anyway).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not really an expert in testing CLIs, so maybe this is the most expedient way to do this, but my general rule for tests is that I only test the public API, so anything that's making assertions about the internal private structure is usually off-limits (or I mark it with a comment or something to indicate why this is the only / best way to test it).

We are not testing CLI here, we are testing ProjectBuilder, which is public API. python-build provides a CLI but also a minimal API to build projects, which is ProjectBuilder.

Put another way, if you did an otherwise safe refactoring of the ProjectBuilder class, you would need to change this test, but that defeats one of the purposes of tests — allowing you to refactor with confidence, because you've made assertions about the behavior that should remain invariant.

If I do a safe refactoring of ProjectBuilder these tests should not fail. They might break if I change ProjectBuilder's public API.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I do a safe refactoring of ProjectBuilder these tests should not fail. They might break if I change ProjectBuilder's public API.

Well, if you removed or renamed build._DEFAULT_BACKEND or ProjectBuilder._build_backend, these tests would fail, but presumably those are not part of the public interface.

It seems like our public interface exposes a hook attribute, which would be the PEP517HookCaller object, and that does expose a .build_backend attribute, publicly.

I think you'd need to ease off the mocking a bit, but you can define a top-level DEFAULT_BACKEND in this test file, and then do:

assert builder.hook.build_backend == DEFAULT_BACKEND

Unless it is unintentional that builder.hook is part of the public API, or that it is guaranteed to be a Pep517HookCaller object, in which case probably mocking out the default backend, calling .build and then using Mock.assert_called_once or Mock.assert_called_once_with would be more approriate.

@gaborbernat
Copy link
Contributor

Is there anything particularly controversial or any major decisions that I'm glossing over here? I can probably form opinions on individual questions a lot easier than I can grok the entire change.

IMHO it's a matter of taste at the end of the day. Looking through the code I'm not confident that we have a good enough test suite to validate the approach here. So I was advocating for more tests; ideally on top of moving our current integration tests from CI to pytest. @FFY00 wants to move ahead as is now. The current state at a glance looks ok, so I'll not block it, though the tests we have does not give me enough confidence to approve it either.

def test_missing_backend(mocker, pyproject_no_backend_mock):
mocker.patch('importlib.import_module')
mocker.patch('{}.open'.format(build_open_owner), pyproject_no_backend_mock)
mocker.patch('pep517.wrappers.Pep517HookCaller')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this mocked out? Because the constructor will try and import the build backend? If so, can we try a more targeted mock that would simply provide the build backend?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this can be removed when we refactor the test suite to use proper packages.

@pganssle
Copy link
Member

Looking through the code I'm not confident that we have a good enough test suite to validate the approach here. So I was advocating for more tests; ideally on top of moving our current integration tests from CI to pytest. @FFY00 wants to move ahead as is now. The current state at a glance looks ok, so I'll not block it, though the tests we have does not give me enough confidence to approve it either.

I tend to agree that the testing I see here is not really adequate given that this is the core functionality of this module. I am very dubious about the heavy use of mocks.

If it were me, I'd take the time to do the testing beforehand, because I know that I very rarely go back to add more tests, and if I don't add the tests stuff will be fundamentally broken, but that's my workflow.

I don't think we should do a production release of this software without adding the tests, but if there's some compelling reason to merge first and refactor the tests later (like avoiding annoying merge conflicts if/when we move to the src/ layout), I'm fine with merging and then opening a release-blocker bug for "add significantly more thorough tests" here.

def test_default_backend(mocker, empty_file_mock):
mocker.patch('importlib.import_module')
mocker.patch('{}.open'.format(build_open_owner), empty_file_mock)
mocker.patch('pep517.wrappers.Pep517HookCaller')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not super familiar with this mocker fixture. Is there any way to use something equivalent to autospec with it? If we really need to mock this stuff out, we should be as surgical with the mocking as possible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This plugin provides a mocker fixture which is a thin-wrapper around the patching API provided by the mock package

https://github.com/pytest-dev/pytest-mock/

Everything supported by unittest.mock.path is supported by mocker.patch.

@FFY00
Copy link
Member Author

FFY00 commented Jun 26, 2020

I don't think we should do a production release of this software without adding the tests, but if there's some compelling reason to merge first and refactor the tests later (like avoiding annoying merge conflicts if/when we move to the src/ layout), I'm fine with merging and then opening a release-blocker bug for "add significantly more thorough tests" here.

Yes, I want to avoid having to keep this in sync with new changes, especially since it is such a big PR. The current test suite is not optimal, but it does its job. I want to refactor it and remove a lot of the mocks, but that will have a lot of conflicts with this.
As I said before, we should not release without reworking the test suite. But stalling this until the test suite gets refactor will only slow things down as it will force me to spend time maintaining this in parallel.

@gaborbernat
Copy link
Contributor

But stalling this until the test suite gets refactor will only slow things down as it will force me to spend time maintaining this in parallel.

You would not need to if you switch to improving the test suite first and then coming back to this 🤔

Copy link
Contributor

@gaborbernat gaborbernat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/FFY00/python-build/issues/52 is follow-up ticket for this, I feel like we'll need to fix this PR soon 👍 but seems @FFY00 wants to do no more with it.

@gaborbernat gaborbernat merged commit 39166d7 into master Jun 27, 2020
@gaborbernat gaborbernat deleted the envbuild branch June 27, 2020 13:21
@FFY00
Copy link
Member Author

FFY00 commented Jun 27, 2020

I was gonna open a PR to this branch but would be a bit of a weird review process, I prefer it this way. We can always revert the commits if needed.

@gaborbernat
Copy link
Contributor

We can always revert the commits if needed.

Not really once we do the refactor needed, e.g. src layout, the revert would cause merge conflicts.

@FFY00
Copy link
Member Author

FFY00 commented Jun 27, 2020

Which can be solved. Only if you build on top of the current code it might become an issue. There are no plans for that right now, what we have in hand is nox > tox (in.c src layout change) and refactoring the tests.

@gaborbernat
Copy link
Contributor

You forgot about https://github.com/FFY00/python-build/issues/19 which is bound to make changes everywhere

@FFY00
Copy link
Member Author

FFY00 commented Jun 27, 2020

I would stall it until all this is sorted out.

@gaborbernat
Copy link
Contributor

I would stall it until all this is sorted out.

In the end no need for that. Reverting it has no value at the end of the day because it's not like we'd go ahead without a similar feature. If it's not working we can just fix-it in a follow-up PR. In the end, the decision is with you as primary maintainer, but it would likely prevent me from doing PRs. Changing code that's not auto-formatted raises too many potential style arguments I'm not willing to engage as part of my free time.

@FFY00
Copy link
Member Author

FFY00 commented Jun 27, 2020

In the end no need for that. Reverting it has no value at the end of the day because it's not like we'd go ahead without a similar feature. If it's not working we can just fix-it in a follow-up PR.

Hence the "if needed".

but it would likely prevent me from doing PRs

You can make your PRs and after approving I can push a commit fixing the style and squash merge. You don't need to worry about this, feel free to send code PRs without worrying about the style. This would be temporary until we adopt black.

Anyway, this conversation is not going anywhere, we should focus on actually writing the code 😛.

@gaborbernat
Copy link
Contributor

No thanks I'll wait until black is in place 👍will take a look at tox/nox though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants