Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update libstarlark to latest bazel branch #152

Merged
merged 7 commits into from Sep 13, 2021
Merged

Update libstarlark to latest bazel branch #152

merged 7 commits into from Sep 13, 2021

Conversation

mahmoudimus
Copy link
Collaborator

@mahmoudimus mahmoudimus commented Aug 4, 2021

Description of changes in release / Impact of release:

Import latest changes for Starlark from bazel upstream: bazelbuild/bazel@50c8f32

Documentation

  • Eliminate raw field from the Lexer. This is a micro-optimization to avoid
    eagerly allocating a String every time we process a token, when only IntLiteral
    and FloatLiteral care about the raw text. 227da61a0a522e
  • Intern identifier and keyword Strings in the lexer. This makes it so that the
    lexer returns a (weakly) canonical String instance for all occurrences of a
    given identifier or keyword token. This change was prompted by ilist@'s
    observation that there are many duplicate Strings that appear in profiling,
    consisting of common Starlark identifiers such as "name" and "ctx". Benchmark
    results: Interning identifiers+keywords was observed to save 3-4% on
    a bazel query deps(...) invocation of bazel itself, and almost 1% on a much larger
    target. Static strong interning was a bit better than static weak interning,
    which in turn was a bit better than non-static (strong) interning. We went with
    static weak since static strong would constitute a memory leak for long-running
    Bazel servers; this follows precedent elsewhere in Bazel. We considered
    interning bufferSlice() as well, which would intercept all conversions of the
    lexer's source text to String objects, but this showed no noticeable improvement
    and actually regressed RAM usage in some cases. The parser already interns
    string literals; that is unaffected by this CL. Note that we use Guava
    interners, not BlazeInterners, because this is Starlark interpreter code that
    must not depend on bazelisms. 9bf5a14850685
  • Catch an unchecked exception which may be thrown by Strings.repeat(). To
    provide a nicer error message for the user. Otherwise, we'd get an
    UncheckedEvalException without a Starlark stack trace. 1bd78126b14f
  • Catch EvalException in toIterable calls, ensuring that error line numbers are
    set correctly.592c38445e42
  • Include the Starlark stack trace as part of the crash's stack trace for all
    unhandled throwables originating in Starlark evaluation. In keeping with
    tradition of not wrapping Error in AbstractParallelEvaluator, different
    exception types are used for RuntimeException and Error. Additionally embed
    a context supplier in StarlarkThread. 86b32a64f596
  • We want to restrict the type to a Sequence in Java (to avoid an unsafe cast)
    and to document it as a sequence of unspecified type in Starlark API
    docs (unfortunately, ParamType annotations currently don't allow us to express a
    Sequence containing 2 different allowed types possibly intermixed). 8a43e0cb973ab9
  • Fix toolchains to support type lookup. b120d4febc571
  • Create a new interface to allow Starlark objects to get a thread when getIndex
    is called. 68effbecba59

Risks of this release

None

Is this a breaking change?

  • Yes
  • No

Is there a way to disable the change?

  • Use previous release
  • Use a feature flag
  • No

Changelog below:

- Eliminate raw field from the Lexer. This is a micro-optimization to avoid
eagerly allocating a String every time we process a token, when only IntLiteral
and FloatLiteral care about the raw text. [227da61a0a522e](bazelbuild/bazel@227da61#diff-37cd73cae488b2421089c2b94c9a869cfb4a5f109753f60602b1dd20959bc573)
- Intern identifier and keyword Strings in the lexer. This makes it so that the
lexer returns a (weakly) canonical String instance for all occurrences of a
given identifier or keyword token. This change was prompted by ilist@'s
observation that there are many duplicate Strings that appear in profiling,
consisting of common Starlark identifiers such as "name" and "ctx". Benchmark
results: Interning identifiers+keywords was observed to save 3-4% on a `bazel
query deps(...)` invocation of bazel itself, and almost 1% on a much larger
target. Static strong interning was a bit better than static weak interning,
which in turn was a bit better than non-static (strong) interning. We went with
static weak since static strong would constitute a memory leak for long-running
Bazel servers; this follows precedent elsewhere in Bazel. We considered
interning bufferSlice() as well, which would intercept all conversions of the
lexer's source text to String objects, but this showed no noticeable improvement
and actually regressed RAM usage in some cases. The parser already interns
string literals; that is unaffected by this CL. Note that we use Guava
interners, not BlazeInterners, because this is Starlark interpreter code that
must not depend on bazelisms. [9bf5a14850685](bazelbuild/bazel@9bf5a14#diff-37cd73cae488b2421089c2b94c9a869cfb4a5f109753f60602b1dd20959bc573)
- Catch an unchecked exception which may be thrown by Strings.repeat(). To
provide a nicer error message for the user. Otherwise, we'd get an
UncheckedEvalException without a Starlark stack trace. [1bd78126b14f](bazelbuild/bazel@1bd7812#diff-37cd73cae488b2421089c2b94c9a869cfb4a5f109753f60602b1dd20959bc573)
- Catch EvalException in toIterable calls, ensuring that error line numbers are
set correctly.[592c38445e42](bazelbuild/bazel@592c384#diff-37cd73cae488b2421089c2b94c9a869cfb4a5f109753f60602b1dd20959bc573)
- Include the Starlark stack trace as part of the crash's stack trace for all
unhandled throwables originating in Starlark evaluation. In keeping with
tradition of not wrapping `Error` in `AbstractParallelEvaluator`, different
exception types are used for `RuntimeException` and `Error`. Additionally embed
a context supplier in `StarlarkThread`. [86b32a64f596](bazelbuild/bazel@86b32a6#diff-37cd73cae488b2421089c2b94c9a869cfb4a5f109753f60602b1dd20959bc573)
- We want to restrict the type to a Sequence in Java (to avoid an unsafe cast)
and to document it as a sequence of unspecified type in Starlark API
docs (unfortunately, ParamType annotations currently don't allow us to express a
Sequence containing 2 different allowed types possibly intermixed). [8a43e0cb973ab9](bazelbuild/bazel@8a43e0c#diff-37cd73cae488b2421089c2b94c9a869cfb4a5f109753f60602b1dd20959bc573)
- Fix toolchains to support type lookup. [b120d4febc571](bazelbuild/bazel@b120d4f#diff-37cd73cae488b2421089c2b94c9a869cfb4a5f109753f60602b1dd20959bc573)
- Create a new interface to allow Starlark objects to get a thread when getIndex
is called. [68effbecba59](bazelbuild/bazel@68effbe#diff-37cd73cae488b2421089c2b94c9a869cfb4a5f109753f60602b1dd20959bc573)
@mjallday mjallday merged commit 42270b0 into master Sep 13, 2021
@mjallday mjallday deleted the update-starlark branch September 13, 2021 21:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants