Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Vector.new for fill and tuning... #3744

Merged
merged 14 commits into from
Nov 30, 2022
Merged

Use Vector.new for fill and tuning... #3744

merged 14 commits into from
Nov 30, 2022

Conversation

jdunkerley
Copy link
Member

@jdunkerley jdunkerley commented Sep 28, 2022

Pull Request Description

  • Make Vector.fill use the Vector.new method.
  • Tuning of some Range methods to try and get better performance.
Test Old Current Change
New Vector 77.5 72.5 94%
Fill Constant 71.8 42.1 59%
Fill Random 156.5 124.2 79%
Append Single 13.3 3.9 29%
Append Large 13.0 4.9 38%
Sum 146.4 122.3 84%
Drop First 20 and Sum 148.0 132.7 90%
Drop Last 20 and Sum 145.3 138.0 95%
Filter 79.4 68.5 86%
Filter With Index 152.9 158.5 104%
Map & Filter 438.0 440.7 101%
Partition 256.4 296.7 116%
Partition With Index 410.0 392.0 96%
Each 117.4 103.8 88%

Checklist

Please include the following checklist in your PR:

  • The documentation has been updated if necessary.
  • All code conforms to the
    Scala,
    Java,
    and
    Rust
    style guides.
  • All code has been tested:
    • Unit tests have been written where possible.
    • If GUI codebase was changed: Enso GUI was tested when built using BOTH
      ./run ide build and ./run ide watch.

hubertp added a commit that referenced this pull request Oct 7, 2022
This change stems from the investigation into ellusive performance
degradation that became visible when implementing `Builder.to_vector`
with `slice` to not copy the store in
#3744.
We were seeing about 40% drop for a simple `filter` method calls
without any reason.

IGV debugging was fun, but didn't reveal much. Profiling with VisualVM
revealed that we were previously inlining most of nodes, while with the
new version that was no longer the case. Still, it was hard to figure
out why.

Decided to use
```
-Dpolyglot.engine.TraceDeoptimizeFrame=true
-Dpolyglot.engine.BackgroundCompilation=false
-Dpolyglot.engine.TraceCompilation=true
-Dpolyglot.engine.TraceInlining=true
```
which revealed a more than expected number of deoptimizations. With
```
-Dpolyglot.engine.TraceTransferToInterpreter=true
```
we were able to see the source of it:
```
[info] [2022-10-07T13:13:12.98Z] [engine] transferToInterpreter at
  Builder.to_vector<arg-2>(built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Base/0.0.0-dev/src/Data/Vector.enso:1091) <split-c6d56df>
    Builder.to_vector(built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Base/0.0.0-dev/src/Data/Vector.enso:1091) <split-6500912a>
    Vector.filter(built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Base/0.0.0-dev/src/Data/Vector.enso:347) <split-660ff9b4>
    ...
    ...
  org.graalvm.truffle/com.oracle.truffle.api.CompilerDirectives.transferToInterpreterAndInvalidate(CompilerDirectives.java:90)
    org.enso.interpreter.node.callable.resolver.MethodResolverNodeGen.execute(MethodResolverNodeGen.java:47)
    org.enso.interpreter.node.callable.resolver.HostMethodCallNode.getPolyglotCallType(HostMethodCallNode.java:146)
    org.enso.interpreter.node.callable.InvokeMethodNodeGen.execute(InvokeMethodNodeGen.java:86)
    org.enso.interpreter.node.callable.InvokeCallableNode.invokeDynamicSymbol(InvokeCallableNode.java:254)
    org.enso.interpreter.node.callable.InvokeCallableNodeGen.execute(InvokeCallableNodeGen.java:54)
    org.enso.interpreter.node.callable.ApplicationNode.executeGeneric(ApplicationNode.java:99)
    org.enso.interpreter.node.ClosureRootNode.execute(ClosureRootNode.java:90)
```
So the problem was in the new implementation of `to_vector` but not in
`slice`, as we expected, but in `list.size`.

Problem was that `MethodResolver` would always deoptimize for polyglot
array method calls because newly created `UnresolvedSymbol` in
[HostMethodCallNode](https://github.com/enso-org/enso/blob/ea60cd5fab48d9f6b16923cea21620b97216a898/engine/runtime/src/main/java/org/enso/interpreter/node/callable/resolver/HostMethodCallNode.java#L145)
wouldn't `==` to the cached one in
[MethodResolver](https://github.com/enso-org/enso/blob/ea60cd5fab48d9f6b16923cea21620b97216a898/engine/runtime/src/main/java/org/enso/interpreter/node/callable/resolver/MethodResolverNode.java#L38).

Fixed by providing a custom `equals` implementation for
`UnresolvedSymbol` and `s/==/equals` in `MethodResolverNode` cache
check.
hubertp added a commit that referenced this pull request Oct 7, 2022
This change stems from the investigation into elusive performance
degradation that became visible when implementing `Builder.to_vector`
with `slice` to not copy the store in
#3744.
We were seeing about 40% drop for a simple `filter` method calls
without any reason.

IGV debugging was fun, but didn't reveal much. Profiling with VisualVM
revealed that we were previously inlining most of nodes, while with the
new version that was no longer the case. Still, it was hard to figure
out why.

Decided to use
```
-Dpolyglot.engine.TraceDeoptimizeFrame=true
-Dpolyglot.engine.BackgroundCompilation=false
-Dpolyglot.engine.TraceCompilation=true
-Dpolyglot.engine.TraceInlining=true
```
which revealed a more than expected number of deoptimizations. With
```
-Dpolyglot.engine.TraceTransferToInterpreter=true
```
we were able to see the source of it:
```
[info] [2022-10-07T13:13:12.98Z] [engine] transferToInterpreter at
  Builder.to_vector<arg-2>(built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Base/0.0.0-dev/src/Data/Vector.enso:1091) <split-c6d56df>
    Builder.to_vector(built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Base/0.0.0-dev/src/Data/Vector.enso:1091) <split-6500912a>
    Vector.filter(built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Base/0.0.0-dev/src/Data/Vector.enso:347) <split-660ff9b4>
    ...
    ...
  org.graalvm.truffle/com.oracle.truffle.api.CompilerDirectives.transferToInterpreterAndInvalidate(CompilerDirectives.java:90)
    org.enso.interpreter.node.callable.resolver.MethodResolverNodeGen.execute(MethodResolverNodeGen.java:47)
    org.enso.interpreter.node.callable.resolver.HostMethodCallNode.getPolyglotCallType(HostMethodCallNode.java:146)
    org.enso.interpreter.node.callable.InvokeMethodNodeGen.execute(InvokeMethodNodeGen.java:86)
    org.enso.interpreter.node.callable.InvokeCallableNode.invokeDynamicSymbol(InvokeCallableNode.java:254)
    org.enso.interpreter.node.callable.InvokeCallableNodeGen.execute(InvokeCallableNodeGen.java:54)
    org.enso.interpreter.node.callable.ApplicationNode.executeGeneric(ApplicationNode.java:99)
    org.enso.interpreter.node.ClosureRootNode.execute(ClosureRootNode.java:90)
```
So the problem was in the new implementation of `to_vector` but not in
`slice`, as we expected, but in `list.size`.

Problem was that `MethodResolver` would always deoptimize for polyglot
array method calls because newly created `UnresolvedSymbol` in
[HostMethodCallNode](https://github.com/enso-org/enso/blob/ea60cd5fab48d9f6b16923cea21620b97216a898/engine/runtime/src/main/java/org/enso/interpreter/node/callable/resolver/HostMethodCallNode.java#L145)
wouldn't `==` to the cached one in
[MethodResolver](https://github.com/enso-org/enso/blob/ea60cd5fab48d9f6b16923cea21620b97216a898/engine/runtime/src/main/java/org/enso/interpreter/node/callable/resolver/MethodResolverNode.java#L38).

Fixed by providing a custom `equals` implementation for
`UnresolvedSymbol` and `s/==/equals` in `MethodResolverNode` cache
check.
hubertp added a commit that referenced this pull request Oct 10, 2022
This change stems from the investigation into elusive performance
degradation that became visible when implementing `Builder.to_vector`
with `slice` to not copy the store in
#3744.
We were seeing about 40% drop for a simple `filter` method calls
without any reason.

IGV debugging was fun, but didn't reveal much. Profiling with VisualVM
revealed that we were previously inlining most of nodes, while with the
new version that was no longer the case. Still, it was hard to figure
out why.

Decided to use
```
-Dpolyglot.engine.TraceDeoptimizeFrame=true
-Dpolyglot.engine.BackgroundCompilation=false
-Dpolyglot.engine.TraceCompilation=true
-Dpolyglot.engine.TraceInlining=true
```
which revealed a more than expected number of deoptimizations. With
```
-Dpolyglot.engine.TraceTransferToInterpreter=true
```
we were able to see the source of it:
```
[info] [2022-10-07T13:13:12.98Z] [engine] transferToInterpreter at
  Builder.to_vector<arg-2>(built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Base/0.0.0-dev/src/Data/Vector.enso:1091) <split-c6d56df>
    Builder.to_vector(built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Base/0.0.0-dev/src/Data/Vector.enso:1091) <split-6500912a>
    Vector.filter(built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Base/0.0.0-dev/src/Data/Vector.enso:347) <split-660ff9b4>
    ...
    ...
  org.graalvm.truffle/com.oracle.truffle.api.CompilerDirectives.transferToInterpreterAndInvalidate(CompilerDirectives.java:90)
    org.enso.interpreter.node.callable.resolver.MethodResolverNodeGen.execute(MethodResolverNodeGen.java:47)
    org.enso.interpreter.node.callable.resolver.HostMethodCallNode.getPolyglotCallType(HostMethodCallNode.java:146)
    org.enso.interpreter.node.callable.InvokeMethodNodeGen.execute(InvokeMethodNodeGen.java:86)
    org.enso.interpreter.node.callable.InvokeCallableNode.invokeDynamicSymbol(InvokeCallableNode.java:254)
    org.enso.interpreter.node.callable.InvokeCallableNodeGen.execute(InvokeCallableNodeGen.java:54)
    org.enso.interpreter.node.callable.ApplicationNode.executeGeneric(ApplicationNode.java:99)
    org.enso.interpreter.node.ClosureRootNode.execute(ClosureRootNode.java:90)
```
So the problem was in the new implementation of `to_vector` but not in
`slice`, as we expected, but in `list.size`.

Problem was that `MethodResolver` would always deoptimize for polyglot
array method calls because newly created `UnresolvedSymbol` in
[HostMethodCallNode](https://github.com/enso-org/enso/blob/ea60cd5fab48d9f6b16923cea21620b97216a898/engine/runtime/src/main/java/org/enso/interpreter/node/callable/resolver/HostMethodCallNode.java#L145)
wouldn't `==` to the cached one in
[MethodResolver](https://github.com/enso-org/enso/blob/ea60cd5fab48d9f6b16923cea21620b97216a898/engine/runtime/src/main/java/org/enso/interpreter/node/callable/resolver/MethodResolverNode.java#L38).

Fixed by providing a custom `equals` implementation for
`UnresolvedSymbol` and `s/==/equals` in `MethodResolverNode` cache
check.
@jdunkerley jdunkerley force-pushed the wip/jd/vector-fill-new branch 4 times, most recently from 0fd1030 to b2bfae0 Compare October 17, 2022 15:27
@jdunkerley jdunkerley added the CI: No changelog needed Do not require a changelog entry for this PR. label Nov 28, 2022
@jdunkerley jdunkerley changed the title Avoid copying for Vector.Builder and Vector.fill Use Vector.new for fill and tuning... Nov 28, 2022
@jdunkerley jdunkerley marked this pull request as ready for review November 28, 2022 22:28
@jdunkerley jdunkerley force-pushed the wip/jd/vector-fill-new branch 2 times, most recently from fba70e4 to a6cddc3 Compare November 29, 2022 06:45
Copy link
Member

@radeusgd radeusgd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

While we are at it - seems that Vector.fill docs are completely out of date. Can we update them?

test/Benchmarks/src/Vector/Vector.enso Outdated Show resolved Hide resolved
@jdunkerley jdunkerley added the CI: Ready to merge This PR is eligible for automatic merge label Nov 29, 2022
jdunkerley and others added 7 commits November 29, 2022 11:03
Oops...

Oops...

Try using Range for Vector loops.

Before trying to sort Range...

Update Fold and Reduce...

Update Fold and Reduce...

Doh!

Release snapshot.

Don't use slice for builder

Allow slice to be whole Vector and then avoid copies for builder.

Use new instead of a Builder.

Add a fill test 2.

Add a fill test.
@@ -87,10 +87,8 @@ type Vector a

Vector.fill length=50 item=42
fill : Integer -> Any -> Vector Any
fill length ~item =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out removing ~ caused this regression #10142 (comment), CCing @radeusgd and @jdunkerley

Previously the ~item was evaluated again and again for each element of the array. Now it is evaluated just once.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I find the new fill semantic better. Use Vector.new if you want to create vector with different values.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I find the new fill semantic better. Use Vector.new if you want to create vector with different values.

Yes, I too think that current semantics is better.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI: No changelog needed Do not require a changelog entry for this PR. CI: Ready to merge This PR is eligible for automatic merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants