New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update libstarlark to latest bazel branch #152
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Changelog below: - Eliminate raw field from the Lexer. This is a micro-optimization to avoid eagerly allocating a String every time we process a token, when only IntLiteral and FloatLiteral care about the raw text. [227da61a0a522e](bazelbuild/bazel@227da61#diff-37cd73cae488b2421089c2b94c9a869cfb4a5f109753f60602b1dd20959bc573) - Intern identifier and keyword Strings in the lexer. This makes it so that the lexer returns a (weakly) canonical String instance for all occurrences of a given identifier or keyword token. This change was prompted by ilist@'s observation that there are many duplicate Strings that appear in profiling, consisting of common Starlark identifiers such as "name" and "ctx". Benchmark results: Interning identifiers+keywords was observed to save 3-4% on a `bazel query deps(...)` invocation of bazel itself, and almost 1% on a much larger target. Static strong interning was a bit better than static weak interning, which in turn was a bit better than non-static (strong) interning. We went with static weak since static strong would constitute a memory leak for long-running Bazel servers; this follows precedent elsewhere in Bazel. We considered interning bufferSlice() as well, which would intercept all conversions of the lexer's source text to String objects, but this showed no noticeable improvement and actually regressed RAM usage in some cases. The parser already interns string literals; that is unaffected by this CL. Note that we use Guava interners, not BlazeInterners, because this is Starlark interpreter code that must not depend on bazelisms. [9bf5a14850685](bazelbuild/bazel@9bf5a14#diff-37cd73cae488b2421089c2b94c9a869cfb4a5f109753f60602b1dd20959bc573) - Catch an unchecked exception which may be thrown by Strings.repeat(). To provide a nicer error message for the user. Otherwise, we'd get an UncheckedEvalException without a Starlark stack trace. [1bd78126b14f](bazelbuild/bazel@1bd7812#diff-37cd73cae488b2421089c2b94c9a869cfb4a5f109753f60602b1dd20959bc573) - Catch EvalException in toIterable calls, ensuring that error line numbers are set correctly.[592c38445e42](bazelbuild/bazel@592c384#diff-37cd73cae488b2421089c2b94c9a869cfb4a5f109753f60602b1dd20959bc573) - Include the Starlark stack trace as part of the crash's stack trace for all unhandled throwables originating in Starlark evaluation. In keeping with tradition of not wrapping `Error` in `AbstractParallelEvaluator`, different exception types are used for `RuntimeException` and `Error`. Additionally embed a context supplier in `StarlarkThread`. [86b32a64f596](bazelbuild/bazel@86b32a6#diff-37cd73cae488b2421089c2b94c9a869cfb4a5f109753f60602b1dd20959bc573) - We want to restrict the type to a Sequence in Java (to avoid an unsafe cast) and to document it as a sequence of unspecified type in Starlark API docs (unfortunately, ParamType annotations currently don't allow us to express a Sequence containing 2 different allowed types possibly intermixed). [8a43e0cb973ab9](bazelbuild/bazel@8a43e0c#diff-37cd73cae488b2421089c2b94c9a869cfb4a5f109753f60602b1dd20959bc573) - Fix toolchains to support type lookup. [b120d4febc571](bazelbuild/bazel@b120d4f#diff-37cd73cae488b2421089c2b94c9a869cfb4a5f109753f60602b1dd20959bc573) - Create a new interface to allow Starlark objects to get a thread when getIndex is called. [68effbecba59](bazelbuild/bazel@68effbe#diff-37cd73cae488b2421089c2b94c9a869cfb4a5f109753f60602b1dd20959bc573)
mjallday
approved these changes
Sep 13, 2021
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description of changes in release / Impact of release:
Import latest changes for Starlark from bazel upstream: bazelbuild/bazel@50c8f32
Documentation
eagerly allocating a String every time we process a token, when only IntLiteral
and FloatLiteral care about the raw text. 227da61a0a522e
lexer returns a (weakly) canonical String instance for all occurrences of a
given identifier or keyword token. This change was prompted by ilist@'s
observation that there are many duplicate Strings that appear in profiling,
consisting of common Starlark identifiers such as "name" and "ctx". Benchmark
results: Interning identifiers+keywords was observed to save 3-4% on
a
bazel query deps(...)
invocation of bazel itself, and almost 1% on a much largertarget. Static strong interning was a bit better than static weak interning,
which in turn was a bit better than non-static (strong) interning. We went with
static weak since static strong would constitute a memory leak for long-running
Bazel servers; this follows precedent elsewhere in Bazel. We considered
interning bufferSlice() as well, which would intercept all conversions of the
lexer's source text to String objects, but this showed no noticeable improvement
and actually regressed RAM usage in some cases. The parser already interns
string literals; that is unaffected by this CL. Note that we use Guava
interners, not BlazeInterners, because this is Starlark interpreter code that
must not depend on bazelisms. 9bf5a14850685
provide a nicer error message for the user. Otherwise, we'd get an
UncheckedEvalException without a Starlark stack trace. 1bd78126b14f
set correctly.592c38445e42
unhandled throwables originating in Starlark evaluation. In keeping with
tradition of not wrapping
Error
inAbstractParallelEvaluator
, differentexception types are used for
RuntimeException
andError
. Additionally embeda context supplier in
StarlarkThread
. 86b32a64f596and to document it as a sequence of unspecified type in Starlark API
docs (unfortunately, ParamType annotations currently don't allow us to express a
Sequence containing 2 different allowed types possibly intermixed). 8a43e0cb973ab9
is called. 68effbecba59
Risks of this release
None
Is this a breaking change?
Is there a way to disable the change?