Automatically map paths in Args#map_each
#21952
*/
@CheckReturnValue
default StarlarkSemantics storeIn(StarlarkSemantics semantics) {
  return new StarlarkSemantics(semantics.toBuilder().set(SEMANTICS_KEY, this).build()) {
Is it possible to avoid creating a new instance of `StarlarkSemantics` for the NOOP case?
I can do that, but could you confirm that the benchmarks are good before we make that change?
The benchmark is still running. I will let you know asap
Here is the result:

| metric | median | Δ | 1-pval |
|---|---|---|---|
| cpu | 16852.300s ±505.3s | +442.1s, +2.7% | 0.92 (weakly significant) |
| memory | 11110MB ±1.6MB | -2.0MB, -0.0% | 0.13 (not significant) |
| system | 1078.910s ±55.8s | -5.8s, -0.5% | 0.13 (not significant) |
| wall | 1402.026s ±53.6s | +3.5s, +0.3% | 0.13 (not significant) |
I wonder whether avoiding new instances of `StarlarkSemantics` in the NOOP case would help reduce the regression even further.
Do you have a flamegraph that can attribute the additional CPU time? We can try with the default semantics, but if that avoids the regression, we should also be able to do better for the path mapping case (we may have missed another cache).
Unfortunately Since Starlark I don't yet know what to do about this.
@aranguyen I decided to first submit a fix for the action key staleness issue for existing functionality in #21999. When that has been merged and we are happy with the benchmarks here, I will add more logic and tests here to ensure that we don't run into the same kind of issue.
When a command line item is subject to special default stringification behavior (currently, `Label` and `File` instances are), it does not suffice to fingerprint the information that computes the custom stringification; it is also necessary to track whether each individual item is subject to this special formatting. Otherwise, as demonstrated by numerous new test cases, command line items have identical fingerprints to their naive stringification, but may result in distinct command lines. This is fixed by tracking element types via additional UUID markers, either individually or for entire `NestedSet`s based on the `Depset` type.
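The fingerprinting problem described above can be illustrated with a small, self-contained toy model. This is a sketch under stated assumptions, not Bazel's implementation: `FileItem`, `fingerprint`, and the `FILE_MARKER` UUID are hypothetical names invented for the example, and SHA-256 merely stands in for whatever fingerprint function is actually used.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;
import java.util.UUID;

class FingerprintMarkerDemo {
  /** Hypothetical stand-in for an item with special default stringification, such as a File. */
  static final class FileItem {
    final String execPath;

    FileItem(String execPath) {
      this.execPath = execPath;
    }
  }

  // Arbitrary fixed marker chosen for illustration; not a UUID used by Bazel.
  static final UUID FILE_MARKER = UUID.fromString("3f2b8c1e-7a4d-4e9b-9c55-0d6a1b2c3d4e");

  /** Fingerprints the items, optionally prefixing specially formatted items with a type marker. */
  static byte[] fingerprint(List<?> items, boolean markTypes) {
    MessageDigest digest;
    try {
      digest = MessageDigest.getInstance("SHA-256");
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e); // SHA-256 is required on every Java platform
    }
    for (Object item : items) {
      String text;
      if (item instanceof FileItem) {
        if (markTypes) {
          // Record that this item is subject to special stringification.
          digest.update(FILE_MARKER.toString().getBytes(StandardCharsets.UTF_8));
        }
        text = ((FileItem) item).execPath;
      } else {
        text = String.valueOf(item);
      }
      digest.update(text.getBytes(StandardCharsets.UTF_8));
      digest.update((byte) 0); // separator between items
    }
    return digest.digest();
  }

  public static void main(String[] args) {
    List<Object> asString = List.of("bazel-out/k8-fastbuild/bin/a.o");
    List<Object> asFile = List.of(new FileItem("bazel-out/k8-fastbuild/bin/a.o"));
    // Without markers, the file and its naive stringification collide.
    System.out.println(Arrays.equals(fingerprint(asString, false), fingerprint(asFile, false)));
    // With markers, the two inputs get distinct fingerprints.
    System.out.println(Arrays.equals(fingerprint(asString, true), fingerprint(asFile, true)));
  }
}
```

In this model the first comparison prints `true` and the second prints `false`: only the marked variant distinguishes a plain string from an item whose final command-line form may later differ from its naive stringification.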
@bazel-io fork 7.2.0
When path mapping is enabled, `File` objects accessed in a user-supplied callback passed to `Args#map_each` automatically have their paths mapped.

Automatic rewriting is preferable to e.g. passing a rewriter object into the callback: All paths emitted into command lines must be rewritten with path mapping as inputs and outputs are staged at the mapped locations, so the user would need to manually map all paths - there is no choice. As an added benefit, the automatic rewriting ensures that existing rules relying on `map_each` do not need to be modified to use path mapping.

This is a reland of 955b31e, which got rolled back due to a 7% CPU time increase on a benchmark caused by frequent comparisons of equal but not reference equal `StarlarkSemantics` used as keys in the `StarlarkClassDescriptor` cache in `CallUtils`. Instead of overriding `equals` and `hashCode` for `PathMapper`, this change subclasses `StarlarkSemantics` to provide a different, reference equal instance as the cache key. This is safe since the value associated with the `path_mapper` key does not affect the availability of any Starlark field or method, just the behavior of their implementations.

Work towards bazelbuild#6526

Closes bazelbuild#21952.

PiperOrigin-RevId: 629546010
Change-Id: Ib21fa2371a28a02f0c868523b410c5a40c2c6c82
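To make the cache-key reasoning in the commit message above concrete, here is a minimal toy model of why equal but not reference-equal keys are costly in a hash-based cache. It is a sketch only: `Semantics`, `IdentitySemantics`, `countDeepComparisons`, and the `"strip_config"` option value are hypothetical stand-ins, and the real `StarlarkSemantics` subclass in this change may be structured differently.

```java
import java.util.HashMap;
import java.util.Map;

class SemanticsKeyDemo {
  /** Toy stand-in for StarlarkSemantics: equality is defined by option values. */
  static class Semantics {
    static int deepComparisons = 0; // counts expensive value comparisons

    final Map<String, String> options;

    Semantics(Map<String, String> options) {
      this.options = Map.copyOf(options);
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) return true;
      if (!(o instanceof Semantics)) return false;
      deepComparisons++;
      return options.equals(((Semantics) o).options);
    }

    @Override
    public int hashCode() {
      return options.hashCode();
    }
  }

  /** Subclass that is only ever equal to itself, so lookups never compare option maps. */
  static final class IdentitySemantics extends Semantics {
    IdentitySemantics(Map<String, String> options) {
      super(options);
    }

    @Override
    public boolean equals(Object o) {
      return this == o;
    }

    @Override
    public int hashCode() {
      return System.identityHashCode(this);
    }
  }

  /** Performs the given number of cache hits and returns how many deep comparisons they cost. */
  static int countDeepComparisons(Semantics stored, Semantics probe, int lookups) {
    Map<Semantics, String> cache = new HashMap<>();
    cache.put(stored, "descriptor");
    Semantics.deepComparisons = 0;
    for (int i = 0; i < lookups; i++) {
      if (cache.get(probe) == null) {
        throw new IllegalStateException("unexpected cache miss");
      }
    }
    return Semantics.deepComparisons;
  }

  public static void main(String[] args) {
    Map<String, String> opts = Map.of("path_mapper", "strip_config");
    // Equal but not reference-equal keys: every hit pays for a deep comparison.
    System.out.println(countDeepComparisons(new Semantics(opts), new Semantics(opts), 1000));
    // One shared identity-keyed instance: hits are resolved by reference equality.
    Semantics shared = new IdentitySemantics(opts);
    System.out.println(countDeepComparisons(shared, shared, 1000));
  }
}
```

In this model the first call reports one deep comparison per lookup and the second reports none, because `HashMap` checks reference equality before falling back to `equals`. The trade-off is that an identity-keyed instance only helps when the same instance is reused across lookups, which is also why the review thread asks about the NOOP case.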
Hi all, is this still needed for 7.2?
When path mapping is enabled, `File` objects accessed in a user-supplied callback passed to `Args#map_each` automatically have their paths mapped.

Automatic rewriting is preferable to e.g. passing a rewriter object into the callback: All paths emitted into command lines must be rewritten with path mapping as inputs and outputs are staged at the mapped locations, so the user would need to manually map all paths - there is no choice. As an added benefit, the automatic rewriting ensures that existing rules relying on `map_each` do not need to be modified to use path mapping.

This is a reland of 955b31e, which got rolled back due to a 7% CPU time increase on a benchmark caused by frequent comparisons of equal but not reference equal `StarlarkSemantics` used as keys in the `StarlarkClassDescriptor` cache in `CallUtils`. Instead of overriding `equals` and `hashCode` for `PathMapper`, this change subclasses `StarlarkSemantics` to provide a different, reference equal instance as the cache key. This is safe since the value associated with the `path_mapper` key does not affect the availability of any Starlark field or method, just the behavior of their implementations.

Work towards bazelbuild#6526

Closes bazelbuild#21952.

PiperOrigin-RevId: 629546010
Change-Id: Ib21fa2371a28a02f0c868523b410c5a40c2c6c82

Closes bazelbuild#22221