Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

query: expose is_executable #15948

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

illicitonion
Copy link
Contributor

It can be useful to know this, and currently the best proxy appears to
be to look for attributes.$is_executable (which isn't documented), and
or-ing that with a match on rule_class.ends_with("_test").

Instead, let's have a clearly documented API here.

It can be useful to know this, and currently the best proxy appears to
be to look for `attributes.$is_executable` (which isn't documented), and
or-ing that with a match on `rule_class.ends_with("_test")`.

Instead, let's have a clearly documented API here.
@lberki lberki requested review from comius and removed request for lberki July 22, 2022 18:28
@sgowroji sgowroji added team-Build-Language awaiting-review PR is awaiting review from an assigned reviewer labels Jul 25, 2022
@KushalP
Copy link

KushalP commented Sep 6, 2022

Any ETA on merging this change?

@brandjon brandjon added team-Core Skyframe, bazel query, BEP, options parsing, bazelrc and removed team-Build-Language labels Nov 5, 2022
@brandjon
Copy link
Member

brandjon commented Nov 5, 2022

Not sure whether to route this to Core (@haxorz), Configurability (@gregestren), or Rules-API (@comius), so CCing TLs from all three.

@lberki
Copy link
Contributor

lberki commented Nov 7, 2022

Since this is in the twilight zone between all these teams, I'll step in: why is this logic useful for you? This at least doesn't work with aliases (and can't, since bazel query cannot resolve them) and doesn't match the logic in RunCommand.isExecutable().

So I don't immediately see how this is superior to looking at the $is_executable attribute.

@illicitonion
Copy link
Contributor Author

So I don't immediately see how this is superior to looking at the $is_executable attribute.

It's mostly superior in terms of being documented (and, by being a proto-field, being self-documenting). As a consumer, it's currently unclear whether $is_executable is a stable public API, or an accidental implementation details.

Also, being logic-free - right now the check is that a rule either has the $is_executable attr or that its rule_class ends up _test, which is logic which may need writing multiple times.

I'd be happy to instead document $is_executable as a known attribute folks can rely on, and ideally send out a change setting that attr of test targets too (to avoid needing to write the or-ing logic), if that'd be preferable

@haxorz
Copy link
Contributor

haxorz commented Nov 7, 2022

I'd prefer the core technical parts of internal issues 28639562, 119498716, and 31180215 first get resolved then the binaries query function can be open-sourced in Bazel.

Here's the [proposed] spec of binaries (iirc from the technical discussion in 119498716):

If E evaluates to S, then binaries(E) evalutes to the subset of S containing all the "binary" targets. A target is considered a "binary" target if it's an instance of a native rule class whose name ends in "_binary" or an instance of a Starlark rule class with executable = True.

@illicitonion wdyt about this binaries function? Would it obsolete this PR or at least complement it?


Due to some internal systems, implementing this binaries function is not as trivial as it sounds. The core technical issue is discussed in #6 - #20 in issue 119498716. This issue was never handled when the binaries function was added in ~2016; the implementation doesn't do the 'an instance of a Starlark rule class with executable = True' part of the spec, instead opting for the hacky-but-easy approach of hardcoding some names of common binary-ish Starlark rule classes.

This query function was never open-sourced in Bazel because of this tech debt. I prefer we resolve this tech debt.


[@lberki ] This at least doesn't work with aliases (and can't, since bazel query cannot resolve them)

Yeah, and ditto for the proposed binaries query function. See comment #7 in internal issue 28639562. But I don't think we should support alias; see comment #8.

@illicitonion
Copy link
Contributor Author

wdyt about this binaries function? Would it obsolete this PR or at least complement it?

Yes, a binaries function would obviate my need for this PR.


I don't have access to the internal bugs, but I'd be interested in the context of not wanting to support alias (in particular, for not wanting to support alias in cquery - I can see why it's hard to support in the context of query). It's cquery where I actually want this information, where I think we should be able to resolve through an alias...

@haxorz
Copy link
Contributor

haxorz commented Nov 7, 2022

Yes, a binaries function would obviate my need for this PR.

Cool, thanks for the info! I added a note to the internal issues.

... (in particular, for not wanting to support alias in cquery... It's cquery where I actually want this information, where I think we should be able to resolve through an alias...

Yeah cquery is technically capable of resolving alias chains.

I would be fine with cquery having different binaries semantics than query. Might be best to do this as a followup FR to the internal issues though.

I don't have access to the internal bugs, but I'd be interested in the context of not wanting to support alias... I can see why it's hard to support in the context of query

Yeah sorry for the lack of transparency. I gave those links mostly for @lberki and @brandjon.

The context is that if binaries followed alias chains (in query-land at least) then there's an internal system at Google whose incremental mode of operation would become unsound due to it not being technically feasible to know when an alias chain has changed. This system isn't doing top-down things like binaries(//p/...) (that is "please tell me all the binary targets in all packages under directory p") but instead doing bottom-up incremental things like "for a given repo diff D, tell me if any of the top-down things in my universe I care about have changed"). It's not technically feasible for this system to have access to sufficient information for doing this for alias chains.

@illicitonion
Copy link
Contributor Author

This system isn't doing top-down things like binaries(//p/...) (that is "please tell me all the binary targets in all packages under directory p") but instead doing bottom-up incremental things like "for a given repo diff D, tell me if any of the top-down things in my universe I care about have changed"). It's not technically feasible for this system to have access to sufficient information for doing this for alias chains.

Interesting - the thing that prompted me to file this PR is actually a very similar-sounding system: https://github.com/bazel-contrib/target-determinator - its goal is to be able to tell you which targets (that you care about) changed between two git commits.

Roughly the form this tools operation takes is: Given some query pattern:

  1. Run a cquery deps($query)
  2. Run a cquery $query
  3. Come up with a hash for every target in the matching set, including via digesting input files and their deps

We do this at both the "before" and the "after" commit, and compare hashes to find the affected targets.

I've seen a number of tools try to do similar analysis by crafting careful queries without actually winding back to the prior commit and performing the queries there too, and they have all missed important possible changes. Is that roughly what's going on in your internal tool too? I'd be curious as to ways we can generally making solving this kind of problem easier (either inside or outside of Bazel)...

@haxorz
Copy link
Contributor

haxorz commented Nov 7, 2022

Interesting - the thing that prompted me to file this PR is actually a very similar-sounding system

Yes, very interesting!

Roughly the form this tools operation takes is: Given some query pattern...

[Since I know the alias discussion wasn't your central point, this is an aside...]

If cquery's binaries follows alias chains, then I think you're in trouble in that niche situation.

Imagine //p1:t1 is an alias to //p2:t2. Imagine the pattern of interest is binaries(//p1:t1). Then step 2's cquery "deps(binaries(//p1:t1))" will be exactly the same as cquery "deps(//p2:t2)" and step 3's cquery "binaries(//p1:t1)" will be exactly the same as cquery "//p2:t2" i.e. steps 2 & 3 will not be helpful for detecting changes to the alias target of //p1:t1.

I've seen a number of tools try to do similar analysis by crafting careful queries without actually winding back to the prior commit and performing the queries there too, and they have all missed important possible changes.

Yeah agree that this is using an algorithmic smell.

Is that roughly what's going on in your internal tool too?

Similar goal but we take a bottom-up approach. Scale is too large for us to do the top-down approach of deps(T) for all T at every commit.

@comius comius removed their request for review September 6, 2023 14:09
@meteorcloudy meteorcloudy requested a review from a team as a code owner November 13, 2023 10:19
@meteorcloudy meteorcloudy requested review from sdtwigg and removed request for a team November 13, 2023 10:19
@katre katre added the team-Rules-API API for writing rules/aspects: providers, runfiles, actions, artifacts label Nov 13, 2023
@katre
Copy link
Member

katre commented Nov 13, 2023

I've deleted my previous comments because they were inaccurate.

This PR has no impact on Configurability and is a request for existing data to be exposed via the ctx API, so it should be handled by the Rules API.

I see that @comius un-assigned this a few months ago: it should either be assigned to someone on that team or closed because we don't plan to merge it.

Sorry for the confusion, @illicitonion.

@katre katre requested review from comius and removed request for sdtwigg November 13, 2023 17:08
@comius
Copy link
Contributor

comius commented May 15, 2024

Is there still interest to get this in?

@comius comius added awaiting-user-response Awaiting a response from the author and removed awaiting-review PR is awaiting review from an assigned reviewer labels May 15, 2024
@illicitonion
Copy link
Contributor Author

Yeah, I'd happily get this merged - let me know if I should rebase

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting-user-response Awaiting a response from the author team-Core Skyframe, bazel query, BEP, options parsing, bazelrc team-Rules-API API for writing rules/aspects: providers, runfiles, actions, artifacts
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants