Allow passing arbitrary args to PEX invocation when building FaaS artifacts (Cherry-pick of #20237) (#20280)

This adds a `pex3_venv_create_extra_args` field to all FaaS targets
(`python_aws_lambda_function`, `python_aws_lambda_layer`,
`python_google_cloud_function`). The field accepts arbitrary extra
arguments, which are passed to the `pex3 venv create --layout=flat-zipped
...` invocation that creates the final zip file.
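
A minimal sketch of the pass-through idea, not the actual Pants implementation. The helper name `venv_create_argv` is hypothetical, the real invocation carries further arguments (the `...` above), and the position of the extra arguments on the command line is an assumption. It only illustrates that the field's values are forwarded verbatim and that an unset field adds nothing.

```python
from __future__ import annotations

from typing import Optional, Sequence

# Only the part of the command line that the commit message spells out.
BASE_ARGV = ["pex3", "venv", "create", "--layout=flat-zipped"]


def venv_create_argv(extra_args: Optional[Sequence[str]]) -> list[str]:
    """Hypothetical helper: append user-supplied extra args to the base command.

    `extra_args or ()` treats an unset (None) or empty field as "no extra args".
    """
    return [*BASE_ARGV, *(extra_args or ())]


with_flag = venv_create_argv(["--collisions-ok"])
without_extra = venv_create_argv(None)
print(with_flag)
print(without_extra)
```

In a BUILD file this corresponds to setting `pex3_venv_create_extra_args=["--collisions-ok"]` on one of the FaaS targets listed above.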

The most pressing motivation is making it possible to pass the
`--collisions-ok` flag. This allows working around dependencies that are
packaged with files outside a namespaced directory, commonly a
LICENCE or README file or a `tests/` directory. When several dependencies
ship a file at the same path, the contents will usually differ. A command
like `pip install ...` will happily install them and have one of the files
"win" arbitrarily, while PEX is more correct and flags that it doesn't
know what to do in that circumstance.
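
To make the collision condition concrete, here is a rough sketch of the check described above. It is not PEX's code: the function `find_collisions`, the distribution names, and the file contents are all made up for illustration. The only assumption taken from the docs change in this commit is that a collision means the same path with different `sha1` hashes.

```python
from __future__ import annotations

import hashlib
from collections import defaultdict


def find_collisions(distributions: dict[str, dict[str, bytes]]) -> dict[str, list[str]]:
    """Map each colliding path to the sorted names of the distributions providing it.

    A path collides when more than one distribution ships it with different contents.
    """
    hashes_by_path: dict[str, dict[str, str]] = defaultdict(dict)
    for dist_name, files in distributions.items():
        for path, content in files.items():
            hashes_by_path[path][dist_name] = hashlib.sha1(content).hexdigest()
    return {
        path: sorted(by_dist)
        for path, by_dist in hashes_by_path.items()
        if len(set(by_dist.values())) > 1
    }


# Hypothetical distributions: both ship a top-level LICENSE outside their own directory.
dists = {
    "example-pkg": {"example_pkg/__init__.py": b"", "LICENSE": b"MIT"},
    "other-pkg": {"other_pkg/__init__.py": b"", "LICENSE": b"Apache-2.0"},
}
collisions = find_collisions(dists)
print(collisions)
```

Files that stay inside each package's own directory never share a path, so only the top-level `LICENSE` is reported here. Passing `--collisions-ok` accepts such a collision instead of failing the build.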

Fixes #20224

This is marked for cherry picking back to 2.18 because it can block
adoption of the new layout, and the old (lambdex) layout is deprecated
and using it is noisy in 2.18.

Co-authored-by: Huon Wilson <huon@exoflare.io>
WorkerPants and huonw committed Dec 11, 2023
1 parent 2b9e758 commit 8486d36
Showing 12 changed files with 306 additions and 10 deletions.
8 changes: 8 additions & 0 deletions docs/markdown/Python/python-integrations/awslambda-python.md
@@ -80,6 +80,14 @@ Wrote dist/project/lambda.zip
>
> If this happens, you must either change your dependencies to only use dependencies with pre-built [wheels](https://pythonwheels.com) or find a Linux environment to run `pants package`.
> 🚧 "Encountering collisions" errors and failing to build?
>
> If a build fails with an error like `Encountered collisions populating ... from PEX at faas_repository.pex:`, listing one or more files with different `sha1` hashes, this likely means your dependencies package files in unexpected locations, outside their "scoped" directory (for instance, a package `example-pkg` typically only includes files within `example_pkg/` and `example_pkg-*.dist-info/` directories). When multiple dependencies do this, those files can have exactly matching file paths but different contents, and so it is impossible to create a Lambda artifact: which of the files should be installed and which should be ignored? Resolving this requires human intervention to understand whether any of those files are important, and hence PEX emits an error rather than making an (arbitrary) choice that may result in confusing and/or broken behaviour at runtime.
>
> Most commonly this seems to happen with metadata like a README or LICENSE file, or test files (in a `tests/` subdirectory), which are likely not important at runtime. In these cases, the collision can be worked around by adding [a `pex3_venv_create_extra_args=["--collisions-ok"]` field](doc:reference-python_aws_lambda_function#codepex3_venv_create_extra_argscode) to the `python_aws_lambda_...` targets.
>
> A better solution is to work with the dependencies to stop them from packaging files outside their scoped directories.
Step 4: Upload to AWS
---------------------

@@ -82,6 +82,15 @@ Wrote dist/project/cloud_function.zip
>
> If this happens, you must either change your dependencies to only use dependencies with pre-built [wheels](https://pythonwheels.com) or find a Linux environment to run `pants package`.
> 🚧 "Encountering collisions" errors and failing to build?
>
> If a build fails with an error like `Encountered collisions populating ... from PEX at faas_repository.pex:`, listing one or more files with different `sha1` hashes, this likely means your dependencies package files in unexpected locations, outside their "scoped" directory (for instance, a package `example-pkg` typically only includes files within `example_pkg/` and `example_pkg-*.dist-info/` directories). When multiple dependencies do this, those files can have exactly matching file paths but different contents, and so it is impossible to create a GCF artifact: which of the files should be installed and which should be ignored? Resolving this requires human intervention to understand whether any of those files are important, and hence PEX emits an error rather than making an (arbitrary) choice that may result in confusing and/or broken behaviour at runtime.
>
> Most commonly this seems to happen with metadata like a README or LICENSE file, or test files (in a `tests/` subdirectory), which are likely not important at runtime. In these cases, the collision can be worked around by adding [a `pex3_venv_create_extra_args=["--collisions-ok"]` field](doc:reference-python_google_cloud_function#codepex3_venv_create_extra_argscode) to the `python_google_cloud_function` target.
>
> A better solution is to work with the dependencies to stop them from packaging files outside their scoped directories.

Step 4: Upload to Google Cloud
------------------------------

9 changes: 8 additions & 1 deletion src/python/pants/backend/awslambda/python/rules.py
@@ -15,7 +15,11 @@
PythonAwsLambdaLayerDependenciesField,
PythonAwsLambdaRuntime,
)
from pants.backend.python.util_rules.faas import BuildPythonFaaSRequest, PythonFaaSCompletePlatforms
from pants.backend.python.util_rules.faas import (
BuildPythonFaaSRequest,
PythonFaaSCompletePlatforms,
PythonFaaSPex3VenvCreateExtraArgsField,
)
from pants.backend.python.util_rules.faas import rules as faas_rules
from pants.core.goals.package import BuiltPackage, OutputPathField, PackageFieldSet
from pants.core.util_rules.environments import EnvironmentField
@@ -31,6 +35,7 @@ class _BaseFieldSet(PackageFieldSet):
include_requirements: PythonAwsLambdaIncludeRequirements
runtime: PythonAwsLambdaRuntime
complete_platforms: PythonFaaSCompletePlatforms
pex3_venv_create_extra_args: PythonFaaSPex3VenvCreateExtraArgsField
output_path: OutputPathField
environment: EnvironmentField

@@ -65,6 +70,7 @@ async def package_python_aws_lambda_function(
output_path=field_set.output_path,
include_requirements=field_set.include_requirements.value,
include_sources=True,
pex3_venv_create_extra_args=field_set.pex3_venv_create_extra_args,
reexported_handler_module=PythonAwsLambdaHandlerField.reexported_handler_module,
),
)
@@ -84,6 +90,7 @@ async def package_python_aws_lambda_layer(
output_path=field_set.output_path,
include_requirements=field_set.include_requirements.value,
include_sources=field_set.include_sources.value,
pex3_venv_create_extra_args=field_set.pex3_venv_create_extra_args,
# See
# https://docs.aws.amazon.com/lambda/latest/dg/configuration-layers.html#configuration-layers-path
#
68 changes: 67 additions & 1 deletion src/python/pants/backend/awslambda/python/rules_test.py
@@ -7,13 +7,18 @@
import subprocess
from io import BytesIO
from textwrap import dedent
from typing import Any
from unittest.mock import Mock
from zipfile import ZipFile

import pytest

from pants.backend.awslambda.python.rules import (
PythonAwsLambdaFieldSet,
PythonAwsLambdaLayerFieldSet,
_BaseFieldSet,
package_python_aws_lambda_function,
package_python_aws_lambda_layer,
)
from pants.backend.awslambda.python.rules import rules as awslambda_python_rules
from pants.backend.awslambda.python.target_types import PythonAWSLambda, PythonAWSLambdaLayer
@@ -26,6 +31,10 @@
PythonSourcesGeneratorTarget,
)
from pants.backend.python.target_types_rules import rules as python_target_types_rules
from pants.backend.python.util_rules.faas import (
BuildPythonFaaSRequest,
PythonFaaSPex3VenvCreateExtraArgsField,
)
from pants.core.goals import package
from pants.core.goals.package import BuiltPackage
from pants.core.target_types import (
@@ -40,7 +49,7 @@
from pants.engine.internals.scheduler import ExecutionError
from pants.engine.target import FieldSet
from pants.testutil.python_rule_runner import PythonRuleRunner
from pants.testutil.rule_runner import QueryRule
from pants.testutil.rule_runner import MockGet, QueryRule, run_rule_with_mocks


@pytest.fixture
@@ -342,3 +351,60 @@ def test_layer_must_have_dependencies(rule_runner: PythonRuleRunner) -> None:
expected_extra_log_lines=(" Runtime: python3.7",),
layer=True,
)


@pytest.mark.parametrize(
    ("rule", "field_set_ty", "extra_field_set_args"),
    [
        pytest.param(
            package_python_aws_lambda_function, PythonAwsLambdaFieldSet, ["handler"], id="function"
        ),
        pytest.param(
            package_python_aws_lambda_layer,
            PythonAwsLambdaLayerFieldSet,
            ["dependencies", "include_sources"],
            id="layer",
        ),
    ],
)
def test_pex3_venv_create_extra_args_are_passed_through(
    rule: Any, field_set_ty: type[_BaseFieldSet], extra_field_set_args: list[str]
) -> None:
    # Setup
    addr = Address("addr")
    extra_args = (
        "--extra-args-for-test",
        "distinctive-value-E40B861A-266B-4F37-8394-767840BE9E44",
    )
    extra_args_field = PythonFaaSPex3VenvCreateExtraArgsField(extra_args, addr)
    field_set = field_set_ty(
        address=addr,
        include_requirements=Mock(),
        runtime=Mock(),
        complete_platforms=Mock(),
        output_path=Mock(),
        environment=Mock(),
        **{arg: Mock() for arg in extra_field_set_args},
        pex3_venv_create_extra_args=extra_args_field,
    )

    observed_calls = []

    def mocked_build(request: BuildPythonFaaSRequest) -> BuiltPackage:
        observed_calls.append(request.pex3_venv_create_extra_args)
        return Mock()

    # Exercise
    run_rule_with_mocks(
        rule,
        rule_args=[field_set],
        mock_gets=[
            MockGet(
                output_type=BuiltPackage, input_types=(BuildPythonFaaSRequest,), mock=mocked_build
            )
        ],
    )

    # Verify
    assert len(observed_calls) == 1
    assert observed_calls[0] is extra_args_field
2 changes: 2 additions & 0 deletions src/python/pants/backend/awslambda/python/target_types.py
@@ -13,6 +13,7 @@
PythonFaaSDependencies,
PythonFaaSHandlerField,
PythonFaaSKnownRuntime,
PythonFaaSPex3VenvCreateExtraArgsField,
PythonFaaSRuntimeField,
)
from pants.backend.python.util_rules.faas import rules as faas_rules
@@ -150,6 +151,7 @@ class _AWSLambdaBaseTarget(Target):
PythonAwsLambdaIncludeRequirements,
PythonAwsLambdaRuntime,
PythonFaaSCompletePlatforms,
PythonFaaSPex3VenvCreateExtraArgsField,
PythonResolveField,
EnvironmentField,
)
@@ -12,7 +12,11 @@
PythonGoogleCloudFunctionRuntime,
PythonGoogleCloudFunctionType,
)
from pants.backend.python.util_rules.faas import BuildPythonFaaSRequest, PythonFaaSCompletePlatforms
from pants.backend.python.util_rules.faas import (
BuildPythonFaaSRequest,
PythonFaaSCompletePlatforms,
PythonFaaSPex3VenvCreateExtraArgsField,
)
from pants.backend.python.util_rules.faas import rules as faas_rules
from pants.core.goals.package import BuiltPackage, OutputPathField, PackageFieldSet
from pants.core.util_rules.environments import EnvironmentField
@@ -30,6 +34,7 @@ class PythonGoogleCloudFunctionFieldSet(PackageFieldSet):
handler: PythonGoogleCloudFunctionHandlerField
runtime: PythonGoogleCloudFunctionRuntime
complete_platforms: PythonFaaSCompletePlatforms
pex3_venv_create_extra_args: PythonFaaSPex3VenvCreateExtraArgsField
type: PythonGoogleCloudFunctionType
output_path: OutputPathField
environment: EnvironmentField
@@ -47,6 +52,7 @@ async def package_python_google_cloud_function(
complete_platforms=field_set.complete_platforms,
runtime=field_set.runtime,
handler=field_set.handler,
pex3_venv_create_extra_args=field_set.pex3_venv_create_extra_args,
output_path=field_set.output_path,
include_requirements=True,
include_sources=True,
@@ -7,11 +7,15 @@
import subprocess
from io import BytesIO
from textwrap import dedent
from unittest.mock import Mock
from zipfile import ZipFile

import pytest

from pants.backend.google_cloud_function.python.rules import PythonGoogleCloudFunctionFieldSet
from pants.backend.google_cloud_function.python.rules import (
PythonGoogleCloudFunctionFieldSet,
package_python_google_cloud_function,
)
from pants.backend.google_cloud_function.python.rules import (
rules as python_google_cloud_function_rules,
)
@@ -25,6 +29,10 @@
PythonSourcesGeneratorTarget,
)
from pants.backend.python.target_types_rules import rules as python_target_types_rules
from pants.backend.python.util_rules.faas import (
BuildPythonFaaSRequest,
PythonFaaSPex3VenvCreateExtraArgsField,
)
from pants.core.goals import package
from pants.core.goals.package import BuiltPackage
from pants.core.target_types import (
@@ -37,7 +45,7 @@
from pants.engine.addresses import Address
from pants.engine.fs import DigestContents
from pants.testutil.python_rule_runner import PythonRuleRunner
from pants.testutil.rule_runner import QueryRule
from pants.testutil.rule_runner import MockGet, QueryRule, run_rule_with_mocks


@pytest.fixture
@@ -233,3 +241,44 @@ def handler(event, context):
assert "mureq/__init__.py" in names
assert "foo/bar/hello_world.py" in names
assert zipfile.read("main.py") == b"from foo.bar.hello_world import handler as handler"


def test_pex3_venv_create_extra_args_are_passed_through() -> None:
    # Setup
    addr = Address("addr")
    extra_args = (
        "--extra-args-for-test",
        "distinctive-value-1EE0CE07-2545-4743-81F5-B5A413F73213",
    )
    extra_args_field = PythonFaaSPex3VenvCreateExtraArgsField(extra_args, addr)
    field_set = PythonGoogleCloudFunctionFieldSet(
        address=addr,
        handler=Mock(),
        runtime=Mock(),
        complete_platforms=Mock(),
        type=Mock(),
        output_path=Mock(),
        environment=Mock(),
        pex3_venv_create_extra_args=extra_args_field,
    )

    observed_calls = []

    def mocked_build(request: BuildPythonFaaSRequest) -> BuiltPackage:
        observed_calls.append(request.pex3_venv_create_extra_args)
        return Mock()

    # Exercise
    run_rule_with_mocks(
        package_python_google_cloud_function,
        rule_args=[field_set],
        mock_gets=[
            MockGet(
                output_type=BuiltPackage, input_types=(BuildPythonFaaSRequest,), mock=mocked_build
            )
        ],
    )

    # Verify
    assert len(observed_calls) == 1
    assert observed_calls[0] is extra_args_field
@@ -10,6 +10,7 @@
PythonFaaSCompletePlatforms,
PythonFaaSDependencies,
PythonFaaSHandlerField,
PythonFaaSPex3VenvCreateExtraArgsField,
PythonFaaSRuntimeField,
)
from pants.backend.python.util_rules.faas import rules as faas_rules
@@ -116,6 +117,7 @@ class PythonGoogleCloudFunction(Target):
PythonGoogleCloudFunctionRuntime,
PythonFaaSCompletePlatforms,
PythonGoogleCloudFunctionType,
PythonFaaSPex3VenvCreateExtraArgsField,
PythonResolveField,
EnvironmentField,
)
18 changes: 18 additions & 0 deletions src/python/pants/backend/python/util_rules/faas.py
@@ -60,6 +60,7 @@
InvalidFieldException,
InvalidTargetException,
StringField,
StringSequenceField,
)
from pants.engine.unions import UnionRule
from pants.source.source_root import SourceRoot, SourceRootRequest
@@ -68,6 +69,21 @@
logger = logging.getLogger(__name__)


class PythonFaaSPex3VenvCreateExtraArgsField(StringSequenceField):
    alias = "pex3_venv_create_extra_args"
    default = ()
    help = help_text(
        """
        Any extra arguments to pass to the `pex3 venv create` invocation that is used to create the
        final zip file.
        For example, `pex3_venv_create_extra_args=["--collisions-ok"]`, if using packages that have
        colliding files that aren't required at runtime (errors like "Encountered collisions
        populating ...").
        """
    )


class PythonFaaSHandlerField(StringField, AsyncFieldMixin):
alias = "handler"
required = True
@@ -405,6 +421,7 @@ class BuildPythonFaaSRequest:
handler: None | PythonFaaSHandlerField
output_path: OutputPathField
runtime: PythonFaaSRuntimeField
pex3_venv_create_extra_args: PythonFaaSPex3VenvCreateExtraArgsField

include_requirements: bool
include_sources: bool
@@ -495,6 +512,7 @@ async def build_python_faas(
layout=PexVenvLayout.FLAT_ZIPPED,
platforms=platforms.pex_platforms,
complete_platforms=platforms.complete_platforms,
extra_args=request.pex3_venv_create_extra_args.value or (),
prefix=request.prefix_in_artifact,
output_path=Path(output_filename),
description=f"Build {request.target_name} artifact for {request.address}",