Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: expose python package entrypoints via an opaque struct #1189

Closed
wants to merge 11 commits into from

Conversation

aignas
Copy link
Collaborator

@aignas aignas commented Apr 30, 2023

Fixes #958.

For bzlmod users we did not have a working alternative for the entry_point macro for the non-bzlmod users. The reason it is tricky is that in order to know the name of the entry_point available to the user one needs to download the Python package and parse the metadata. This works around the limitation by exposing the list of entrypoints back to the hub repo as an opaque struct, which the users can use in their code.

Summary:

  • Added support for users setting incompatible_generate_aliases to True. They can use the bin struct exposed via @<hub_rep>//<dep>:bin.bzl.
  • Added support for bzlmod users not setting incompatible_generate_aliases. They can use it via use_repo("<hub_repo>_<dep") and then importing the same struct via @<hub_repo>_<dep>//:bin.bzl.
  • The legacy behaviour of the entry_point macro is unchanged for the remaining users for now.

Design notes:

  • Expose the struct in separate .bzl files in order to have no eager fetches.
  • Exposing it via a struct will give users an error message if the target with the specified name does not exist and it will tell the available struct attributes.
  • The inspiration comes from rules_js which @alexeagle has pointed to.

Open questions:

  • I know that @groodt was toying with an idea of removing the repository macros (requirement, et. al) and I am wondering if this may be the last piece that we need in order to give our users an alternative.
  • If the design looks good, I am wondering about the documentation we should have. Is the current addition to the bzlmod example sufficient?

@aignas
Copy link
Collaborator Author

aignas commented Apr 30, 2023

Depends on #1190 being merged to main. For now I have merged the contents of that PR into this branch to ensure that CI is happy.

@@ -475,6 +495,7 @@ def _pip_repository_impl(rctx):
]),
"%%ANNOTATIONS%%": _format_dict(_repr_dict(annotations)),
"%%CONFIG%%": _format_dict(_repr_dict(config)),
"%%ENTRYPOINT_REPO%%": "//{pkg}" if rctx.attr.incompatible_generate_aliases else "_{pkg}//",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this %ENTRYPOINT_REPO% used? I'm not seeing it.

@@ -475,6 +495,7 @@ def _pip_repository_impl(rctx):
]),
"%%ANNOTATIONS%%": _format_dict(_repr_dict(annotations)),
"%%CONFIG%%": _format_dict(_repr_dict(config)),
"%%ENTRYPOINT_REPO%%": "//{pkg}" if rctx.attr.incompatible_generate_aliases else "_{pkg}//",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something about this looks weird.

The variable is named "repo", but "//{pkg}" isn't a repo-name.

The "_{pkg}//" value: I dont't repos can start with underscore? I got an error when I tried to do that recently. Shouldn't it also have "@" in it?

def _generate_bin_bzl_contents(entrypoints: dict[str, str]) -> str:
return textwrap.dedent(
"""\
# Autogenerated by wheel_installer.py
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Autogenerated by wheel_installer.py
# Autogenerated by wheel_installer.py#_generate_bin_bzl_contents

Thanks for adding the "generated by" text. Those comments have been very helpful as I've tried to trace code around. I just a small suggestion to point directly to the part doing it. I'm also +1 on including any other extra info in the comment that makes it easier for maintainers.

# Autogenerated by wheel_installer.py
bin = struct(**{})
""".format(
repr({name: Label(target=target) for name, target in entrypoints.items()})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use a dedicate formatting function instead of relying on repr(). The output of Python's repr isn't guaranteed to be stable between Python releases, nor is it guaranteed to be compatible with Starlark.

# Autogenerated by wheel_installer.py
bin = struct(**{})
""".format(
repr({name: Label(target=target) for name, target in entrypoints.items()})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please sort the values on output. This just helps avoid any determinism issues from the dict ordering. While Python dicts now have guaranteed insertion order, it's easy for some other code to change the insertion order of the dict (e.g. someone calls set on some values earlier). We can avoid that problem by just sorting the output.

# This is using the struct defined by the spoke repo 'yamllint' and it is
# re-exported by the hub repo named 'pip'. This allows bzlmod and
# non-bzlmod users to access the entry point targets.
actual = yamllint_bin.yamllint,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really follow this part -- why does it have to be put into a bzl file and loaded like this? This is pointing to a target, right? So why can't it just be "@pip//yamllint:bin"?

(I'm guessing the answer has something to do with bzlmod, repo mapping, and repo visibility, but I'd like to understand more).

# Prefix the re-export with a dep name in order to avoid collisions
# during loading of multiple bin structs at the same time.
bin_content = """\
\"\"\"Autogenerated by 'pip_parse'.\"\"\"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: be a bit more specific in what generated it. "pip_parse" is a bit vague and there's several pieces to it.

@@ -32,6 +33,14 @@
from python.pip_install.tools.wheel_installer import namespace_pkgs, wheel


@dataclass
class Label:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we call this LabelWrapper or something? I just want it to be easier to tell when it's Python code calling the dataclass named Label vs generated bzlcode in the Python code calling the bzl Label. The behavior of Label() is confusing enough; adding "Which label object is this?" is going to make it very confusing.

Actually, can we remove this dataclass entirely? It looks like it's just used once?

@@ -32,6 +33,14 @@
from python.pip_install.tools.wheel_installer import namespace_pkgs, wheel


@dataclass
class Label:
target: str
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "target_name" instead of just "target". A Target is a 3-tuple of (workspace, package, and name), and this is only the name portion. That, or add a comment to indicate it's only the name portion of a target.

@rickeylev
Copy link
Contributor

I am wondering about the documentation we should have.

Probably also update the "entry_point" function docs. If I'm trying to use an entry_point in a certain way, I'll consult its reference documentation to see how to use it. I'll look at examples when I'm first using a library, but after I use it a few times, I'm unlikely to look at the examples (they'll just tell me most of what I know).

@aignas
Copy link
Collaborator Author

aignas commented May 1, 2023

@rickylev, thanks for the great comments. It seems that the documentation for the bzlmod extensions is non-existent right now so I added minimal text to the right place even though it is not generated right now.

It will take me some time to fully understand how to best add the extensions.bzl documentation generation. I'll see how far I can get and I'll maybe include to this PR.

@aignas
Copy link
Collaborator Author

aignas commented May 1, 2023

So I have generated some docs for bzlmod extensions specifically, but I am not sure if how this is turning out is the best way it can be. See the link here.

It may be a good start, but without stardoc generating docs automatically for module_extension, there is just no good way to have consistent docs for this. Putting the docs to the non-bzlmod macros may be as, if not more, confusing.

@aignas aignas marked this pull request as ready for review May 1, 2023 13:34
@aignas
Copy link
Collaborator Author

aignas commented May 1, 2023

Depends on #1190, but marking as ready for review as I have addressed the first round of comments.

rickeylev pushed a commit that referenced this pull request May 1, 2023
…tc (#1190)

It seems that the macros for specifying the requirements break when the
user starts using `incompatible_generate_aliases=True`. This PR fixes
this.

Testing done:
1. Modify the example:
   ```
   $ git diff
diff --git a/examples/bzlmod/MODULE.bazel b/examples/bzlmod/MODULE.bazel
   index ce91228..1750210 100644
   --- a/examples/bzlmod/MODULE.bazel
   +++ b/examples/bzlmod/MODULE.bazel
   @@ -26,6 +26,7 @@ register_toolchains(
    pip = use_extension("@rules_python//python:extensions.bzl", "pip")
    pip.parse(
        name = "pip",
   +    incompatible_generate_aliases=True,
        requirements_lock = "//:requirements_lock.txt",
        requirements_windows = "//:requirements_windows.txt",
    )
   ```
2. Run `bazel build ...` and check that it is still working.

I noticed this when working on #1189 and creating a separate PR for
easier cherry-picking if we wanted to make a patch release which
includes this. I am not sure how I could make an automated test for this
other than creating a separate example.
aignas and others added 11 commits May 4, 2023 15:41
* Added support for users setting `incompatible_generate_aliases` to
  `True`. They can use the `bin` struct exposed via
  `@<hub_rep>//<dep>:bin.bzl`.
* Added support for bzlmod users not setting
  `incompatible_generate_aliases`. They can use it via
  `use_repo("<hub_repo>_<dep")` and then importing the same struct via
  `@<hub_repo>_<dep>//:bin.bzl`.
* The legacy behaviour of the `entry_point` macro is unchanged for the
  remaining users.

Design notes:
* Expose the struct in separate `.bzl` files in order to have no eager
  fetches.
* Exposing it via a struct will give users an error message if the
  target with the specified name does not exist and it will tell the
  available struct attributes.
* The inspiration comes from `rules_js` which @alexeagle has pointed to.

Fixes bazelbuild#958.
@aignas aignas marked this pull request as draft May 4, 2023 06:42
@aignas
Copy link
Collaborator Author

aignas commented May 4, 2023

Whilst trying to add tests for this I have noticed that it would be better to wait for a few PRs already in the queue, so that I can add better tests. What is more, maybe splitting the documentation for the extensions.bzl into its own PR would be a good idea.

@aignas
Copy link
Collaborator Author

aignas commented May 8, 2023

I am wondering if this is really the best way to solve this. The reason for it is that if the user would like to use a select statement to select the entrypoint based on a configuration option, it would inevitably eagerly fetch all of the repos used in the select statement. If we had an entry point macro that is in rules_python, it might be a different story.

ianpegg-bc pushed a commit to ianpegg-bc/rules_python that referenced this pull request May 12, 2023
…tc (bazelbuild#1190)

It seems that the macros for specifying the requirements break when the
user starts using `incompatible_generate_aliases=True`. This PR fixes
this.

Testing done:
1. Modify the example:
   ```
   $ git diff
diff --git a/examples/bzlmod/MODULE.bazel b/examples/bzlmod/MODULE.bazel
   index ce91228..1750210 100644
   --- a/examples/bzlmod/MODULE.bazel
   +++ b/examples/bzlmod/MODULE.bazel
   @@ -26,6 +26,7 @@ register_toolchains(
    pip = use_extension("@rules_python//python:extensions.bzl", "pip")
    pip.parse(
        name = "pip",
   +    incompatible_generate_aliases=True,
        requirements_lock = "//:requirements_lock.txt",
        requirements_windows = "//:requirements_windows.txt",
    )
   ```
2. Run `bazel build ...` and check that it is still working.

I noticed this when working on bazelbuild#1189 and creating a separate PR for
easier cherry-picking if we wanted to make a patch release which
includes this. I am not sure how I could make an automated test for this
other than creating a separate example.
@aignas
Copy link
Collaborator Author

aignas commented May 12, 2023

I am going to close this as #1220 is the approach I would recommend for now.

@aignas aignas closed this May 12, 2023
@aignas aignas deleted the feat/opaque_entry_point branch May 13, 2024 06:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[bzlmod bug/regression] entry_point no longer working with blzmod
2 participants