Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

REPLCompletions: replace get_type by the proper inference #49206

Merged
merged 1 commit into from
Apr 3, 2023

Conversation

aviatesk
Copy link
Sponsor Member

@aviatesk aviatesk commented Mar 31, 2023

This PR generalizes the idea from #49199 and uses inference to analyze
the types of REPL expression. This approach offers several advantages
over the current get_[value|type]-based implementation:

Analysis on arbitrary REPL expressions can be done by the following steps:

  • Lower a given expression
  • Form a top-level MethodInstance from the lowered expression
  • Run inference on the top-level MethodInstance

This PR implements REPLInterpreter, a custom AbstractInterpreter that:

  • aggressively resolves global bindings to enable reasonable completions
    for lines like Mod.a.| (where | is the cursor position)
  • aggressively concrete-evaluates :inconsistent calls to provide
    reasonable completions for cases like Ref(Some(42))[].|
  • does not optimize the inferred code, as REPLInterpreter is only used
    to obtain the type or constant information of given expressions

Aggressive binding resolution presents challenges for REPLInterpreter's
cache validation (since #40399 hasn't been resolved yet). To avoid cache
validation issue, REPLInterpreter only allows aggressive binding
resolution for top-level frame representing REPL input code
(repl_frame) and for child getproperty frames that are
constant propagated from the repl_frame. This works, since
1.) these frames are never cached, and
2.) their results are only observed by the non-cached repl_frame

REPLInterpreter also aggressively concrete-evaluates :inconsistent
calls within repl_frame, allowing it to get get accurate type
information about complex expressions that otherwise can not be constant
folded, in a safe way, i.e. it still doesn't evaluate effectful
expressions like pop!(xs). Similarly to the aggressive binding
resolution, this aggressive concrete evaluation doesn't present any cache
validation issues because it is limited to repl_frame that is never cached.

Also note that the code cache for REPLInterpreter is separated from the
native code cache, ensuring that code caches produced by REPLInterpreter,
where bindings are aggressively resolved and the code is not really
optimized, do not affect the native code execution. A hack has
also been added to avoid serializing CodeInstancess produced by
REPLInterpreter during precompilation to workaround #48453.

closes #36437
replaces #49199

@aviatesk
Copy link
Sponsor Member Author

@nanosoldier runbenchmarks(!"scalar", vs=":master")

@oscardssmith
Copy link
Member

Can you make it so this is able to concrete eval through non-consistent code?

@aviatesk
Copy link
Sponsor Member Author

aviatesk commented Mar 31, 2023

It might be reasonable to enable concrete eval for inconsistent calls limitedly, but we first need to finish #47154 in order to constant fold arrays.

stdlib/REPL/test/replcompletions.jl Outdated Show resolved Hide resolved
stdlib/REPL/test/replcompletions.jl Outdated Show resolved Hide resolved
@staticfloat
Copy link
Sponsor Member

as a fun little bonus, this approach is actually more correct when analyzing the degenerate case:

struct UnstableFoo
end
function Base.propertynames(::UnstableFoo)
    if rand() > 0.5
        return (:a, :b)
    else
        return (:a, :b, :c)
    end
end

struct Wrap
    a
end
julia> Wrap(UnstableFoo()).a.
a  b  c
julia> Wrap(UnstableFoo()).a.
a  b

@staticfloat
Copy link
Sponsor Member

Hmmm, playing around with this a bit, I did notice that unusual symbol names (such as those that might require use of var"" to access them) can confuse the tab autocompletion, since the var"" syntax looks like a string, and so it starts to auto-suggest filenames, rather than symbol names:

julia> struct WeirdNames
       end
       Base.propertynames(::WeirdNames) = (Symbol("oh no!"), Symbol("oh yes!"))

julia> WeirdNames().var"oh 
.buildkite-external-version  .clang-format                .clangd                      .codecov.yml                 .devcontainer/               .git-blame-ignore-revs
.git/                        .gitattributes               .github/                     .gitignore                   .mailmap                     CITATION.bib
CITATION.cff                 CONTRIBUTING.md              HISTORY.md                   LICENSE.md                   Make.inc                     Makefile
NEWS.md                      README.md                    THIRDPARTY.md                VERSION                      base/                        cli/
contrib/                     deps/                        doc/                         etc/                         julia                        julia.spdx.json
pkgimage.mk                  src/                         stdlib/                      sysimage.mk                  test/                        usr-staging/
usr/

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

@aviatesk
Copy link
Sponsor Member Author

aviatesk commented Apr 1, 2023

I did notice that unusual symbol names (such as those that might require use of var"" to access them) can confuse the tab autocompletion, since the var"" syntax looks like a string, and so it starts to auto-suggest filenames, rather than symbol names:

Yes, you're right. This specific issue isn't related to this PR since it comes from our handling of REPL input code, which occurs before the type analysis step. We should definitely improve it, likely in a separate PR.

@aviatesk
Copy link
Sponsor Member Author

aviatesk commented Apr 1, 2023

Can you make it so this is able to concrete eval through non-consistent code?

Implemented. Now we can get completions for cases like Ref(Some(42))[].|. Combined with #47154, we will be able to get completions for cases like Any[Some(42)].| also.

@aviatesk
Copy link
Sponsor Member Author

aviatesk commented Apr 1, 2023

@nanosoldier runbenchmarks("linalg" || "inference" || "sparse", vs=":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

This PR generalizes the idea from #49199 and uses inference to analyze
the types of REPL expression. This approach offers several advantages
over the current `get_[value|type]`-based implementation:
- The need for various special cases is eliminated, as lowering normalizes
  expressions, and inference handles all language features.
- Constant propagation allows us to obtain accurate completions for complex
  expressions safely (see #36437).

Analysis on arbitrary REPL expressions can be done by the following steps:
- Lower a given expression
- Form a top-level `MethodInstance` from the lowered expression
- Run inference on the top-level `MethodInstance`

This PR implements `REPLInterpreter`, a custom `AbstractInterpreter` that:
- aggressively resolve global bindings to enable reasonable completions
  for lines like `Mod.a.|` (where `|` is the cursor position)
- aggressively concrete evaluates `:inconsistent` calls to provide
  reasonable completions for cases like `Ref(Some(42))[].|`
- does not optimize the inferred code, as `REPLInterpreter` is only used
  to obtain the type or constant information of given expressions

Aggressive binding resolution presents challenges for `REPLInterpreter`'s
cache validation (since #40399 hasn't been resolved yet). To avoid cache
validation issue, `REPLInterpreter` only allows aggressive binding
resolution for top-level frame representing REPL input code
(`repl_frame`) and for child `getproperty` frames that are
constant propagated from the `repl_frame`. This works, since
1.) these frames are never cached, and
2.) their results are only observed by the non-cached `repl_frame`

`REPLInterpreter` also aggressively concrete evaluate `:inconsistent`
calls within `repl_frame`, allowing it to get get accurate type
information about complex expressions that otherwise can not be constant
folded, in a safe way, i.e. it still doesn't evaluate effectful
expressions like `pop!(xs)`. Similarly to the aggressive binding
resolution, aggressive concrete evaluation doesn't present any cache
validation issues because `repl_frame` is never cached.

Also note that the code cache for `REPLInterpreter` is separated from the
native code cache, ensuring that code caches produced by `REPLInterpreter`,
where bindings are aggressively resolved and the code is not really
optimized, do not affect the native code execution. A hack has
also been added to avoid serializing `CodeInstances`s produced by
`REPLInterpreter` during precompilation to workaround #48453.

closes #36437
replaces #49199
@StefanKarpinski
Copy link
Sponsor Member

What a cool and practical use of our effect analysis!

@oscardssmith oscardssmith merged commit 98988d8 into master Apr 3, 2023
@oscardssmith oscardssmith deleted the avi/36437 branch April 3, 2023 20:29
@oscardssmith oscardssmith added the stdlib:REPL Julia's REPL (Read Eval Print Loop) label Apr 3, 2023
Xnartharax pushed a commit to Xnartharax/julia that referenced this pull request Apr 19, 2023
…g#49206)

This PR generalizes the idea from JuliaLang#49199 and uses inference to analyze
the types of REPL expression. This approach offers several advantages
over the current `get_[value|type]`-based implementation:
- The need for various special cases is eliminated, as lowering normalizes
  expressions, and inference handles all language features.
- Constant propagation allows us to obtain accurate completions for complex
  expressions safely (see JuliaLang#36437).

Analysis on arbitrary REPL expressions can be done by the following steps:
- Lower a given expression
- Form a top-level `MethodInstance` from the lowered expression
- Run inference on the top-level `MethodInstance`

This PR implements `REPLInterpreter`, a custom `AbstractInterpreter` that:
- aggressively resolve global bindings to enable reasonable completions
  for lines like `Mod.a.|` (where `|` is the cursor position)
- aggressively concrete evaluates `:inconsistent` calls to provide
  reasonable completions for cases like `Ref(Some(42))[].|`
- does not optimize the inferred code, as `REPLInterpreter` is only used
  to obtain the type or constant information of given expressions

Aggressive binding resolution presents challenges for `REPLInterpreter`'s
cache validation (since JuliaLang#40399 hasn't been resolved yet). To avoid cache
validation issue, `REPLInterpreter` only allows aggressive binding
resolution for top-level frame representing REPL input code
(`repl_frame`) and for child `getproperty` frames that are
constant propagated from the `repl_frame`. This works, since
1.) these frames are never cached, and
2.) their results are only observed by the non-cached `repl_frame`

`REPLInterpreter` also aggressively concrete evaluate `:inconsistent`
calls within `repl_frame`, allowing it to get get accurate type
information about complex expressions that otherwise can not be constant
folded, in a safe way, i.e. it still doesn't evaluate effectful
expressions like `pop!(xs)`. Similarly to the aggressive binding
resolution, aggressive concrete evaluation doesn't present any cache
validation issues because `repl_frame` is never cached.

Also note that the code cache for `REPLInterpreter` is separated from the
native code cache, ensuring that code caches produced by `REPLInterpreter`,
where bindings are aggressively resolved and the code is not really
optimized, do not affect the native code execution. A hack has
also been added to avoid serializing `CodeInstances`s produced by
`REPLInterpreter` during precompilation to workaround JuliaLang#48453.

closes JuliaLang#36437
replaces JuliaLang#49199
Pangoraw added a commit to Pangoraw/FuzzyCompletions.jl that referenced this pull request Aug 7, 2023
This updates the code taken from `REPL.REPLCompletions` with the changes
introduced in JuliaLang/julia#49206.

Otherwise, `using FuzzyCompletions` will complain about `get_type` and
`get_value` being undefined when precompiling and completely fail when
being used on the upcoming Julia 1.10 release.

Co-authored-by: Shuhei Kadowaki <aviatesk@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stdlib:REPL Julia's REPL (Read Eval Print Loop)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use const propagating inference in REPL tab completions
5 participants