Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

load() arbitary file #10009

Open
pauldraper opened this issue Oct 14, 2019 · 10 comments
Open

load() arbitary file #10009

pauldraper opened this issue Oct 14, 2019 · 10 comments
Labels
P4 This is either out of scope or we don't have bandwidth to review a PR. (No assignee) team-Starlark-Integration Issues involving Bazel's integration with Starlark, excluding builtin symbols type: feature request

Comments

@pauldraper
Copy link
Contributor

pauldraper commented Oct 14, 2019

Description of the problem / feature request:

load_content() loads the content of the file label into a variable. Syntax would be restricted to top-level load-ish restrictions.

Example usage:

load_content("//example:file.json, "example")

example_parsed = json_parse(example) # https://github.com/erickj/bazel_json

Alternatively, the load() function could be augmented in some way.

Feature requests: what underlying problem are you trying to solve with this feature?

I want to import data sets into WORKSPACE/BUILD files: dependency lock files, bazel query-type results.

Current approaches:

  • Use a repository rule to read the content, but this can cause extra levels of WORKSPACE complication, e.g. Why an extra layer of indirection? rules_jvm_external#265

  • Only store data in Starlark syntax. However, (1) this is harder for other tools to generate (2) this is harder for other tools to analyze.

Have you found anything relevant by searching the web?

https://stackoverflow.com/a/56034788/1212596

#3766

#7309

Any other information, logs, or outputs that you want to share?

What reasons are there for build-time data is restricted to *.bzl files/Starlark syntax? AFAIK there is none, other than consistency.

@laurentlb laurentlb added P3 We're not considering working on this, but happy to review a PR. (No assignee) type: feature request and removed untriaged labels Dec 16, 2019
@iksaif
Copy link

iksaif commented May 18, 2020

This would be very useful for me too. Currently I have big dict that contain things to generate targets, but I would love those to be out of .bzl file to easily edit them with other code (using buildozer would be too painful)

@brandjon
Copy link
Member

Currently, outside of WORKSPACE evaluation and repo fetching, the loading phase of Bazel depends only on BUILD and .bzl files. This would break that invariant, and blur the line between things needed for package loading and things needed for the build itself. For instance, this feature could be used to modify the build structure based on parsing source files.

Such a change would add a lot of expressivity to the build, and we certainly wouldn't want to do that without a lot of thought. At the same time, there exist straightforward (if a little cumbersome) workarounds to accomplish the end goal requested here.

@pauldraper
Copy link
Contributor Author

pauldraper commented Feb 20, 2021

For instance, this feature could be used to modify the build structure based on parsing source files.

That's....exactly the requested functionality.

For example, package-lock.json has data that should be turned into targets. It is fundamentally build structure.

Currently, you have to run a translation step to convert it into package-lock.bzl. Outside a fetish for converting the world to Skylark, there's no advantage or superior principle gained by requiring .json -> .bzl just to load the exact same data into the build.

@brandjon
Copy link
Member

It's not that we're gung ho for only allowing users to express data in the form of Starlark. It's that we're wary of allowing that data to come from arbitrary sources.

For instance, if a user wrote a .bzl file that read a Java source file in its top-level logic, then any modification to that Java file would trigger a reevaluation of all .bzl and BUILD files downstream from it. (And of course we'd have to update the .bzl loading logic to properly record these dependencies where they occur -- trivial if it's in a load() though.)

In general, we try to avoid adding big new features that increase Bazel's expressivity, particularly when a workaround in how you invoke Bazel can accomplish the same thing.

@pauldraper
Copy link
Contributor Author

pauldraper commented Feb 21, 2021

For instance, if a user wrote a .bzl file that read a Java source file in its top-level logic, then any modification to that Java file would trigger a reevaluation of all .bzl and BUILD files downstream from it.

Yes, agreed. Not seeing any issue there.

And of course we'd have to update the .bzl loading logic to properly record these dependencies where they occur -- trivial if it's in a load() though.

No? Dependencies still come from load()/load_content()....there literally nothing "special" about any of this, except -- hey, don't run it through a Starlark interpreter first.

we try to avoid adding big new features

It's literally less work and less complication than what is currently being done.

@ulfjack
Copy link
Contributor

ulfjack commented Mar 5, 2021

@brandjon, how does it increase Bazel's expressivity if it's already possible to do this?

@brandjon
Copy link
Member

brandjon commented Mar 7, 2021

Using this feature, you could write a macro to generate a java_library target who's deps are populated by parsing the source file for imports, instead of running an external tool and checking its output into source control prior to running Bazel. This is not possible today.

The dependency bloat story might not be as bad as I feared because the load()s don't necessarily have to happen in a .bzl file, they could be in the BUILD file defining the target. Still, I don't take it for granted that this is safe. I'll reopen this issue for now but remain skeptical.

@brandjon brandjon reopened this Mar 7, 2021
@brandjon brandjon added P4 This is either out of scope or we don't have bandwidth to review a PR. (No assignee) team-Starlark-Interpreter Issues involving the Starlark interpreter used by Bazel and removed P3 We're not considering working on this, but happy to review a PR. (No assignee) labels Mar 7, 2021
@brandjon
Copy link
Member

brandjon commented Mar 7, 2021

To any future implementors of such a feature: This could be done in two ways: as a Starlark module defining a single symbol, or with a new Bazel builtin.

# First method: Any load of a non-.bzl extension does a file read and produces a module defining `content`.
load("//some:file.txt", "content")
print(content)
# Second method: new function taking the file to read, returns resulting value.
content = load_content("some:file.txt")
print(content)

Note that in the second case, from the Starlark interpreter's POV, load_content would be a function, not a piece of a language syntax. We'd need extra static validation over the AST prior to evaluation to ensure that the arguments are string literals and that the load_content calls appear at the top-level (and probably right below the normal load() statements). This validation would be in Bazel, not the interpreter, similarly to how BUILD file syntactic restrictions are enforced.

@brandjon brandjon added team-Starlark-Integration Issues involving Bazel's integration with Starlark, excluding builtin symbols and removed team-Starlark-Interpreter Issues involving the Starlark interpreter used by Bazel labels Oct 31, 2022
Copy link

github-actions bot commented Jan 5, 2024

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 90 days unless any other activity occurs or one of the following labels is added: "not stale", "awaiting-bazeler". Please reach out to the triage team (@bazelbuild/triage) if you think this issue is still relevant or you are interested in getting the issue resolved.

@github-actions github-actions bot added the stale Issues or PRs that are stale (no activity for 30 days) label Jan 5, 2024
@pauldraper
Copy link
Contributor Author

I imagine the first is preferred, as it requires relatively little adjustment.

@github-actions github-actions bot removed the stale Issues or PRs that are stale (no activity for 30 days) label Jan 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P4 This is either out of scope or we don't have bandwidth to review a PR. (No assignee) team-Starlark-Integration Issues involving Bazel's integration with Starlark, excluding builtin symbols type: feature request
Projects
None yet
Development

No branches or pull requests

6 participants