Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support string elements in array proxy. #831

Closed
wants to merge 6 commits into from

Conversation

laithsakka
Copy link
Contributor

@laithsakka laithsakka commented Jan 4, 2022

ONLY LAST COMMIT BELONGS TO THIS DIFF

  • This diff support string type elements for array proxy.
  • std::enable_if should be dependent on a function template argument,
    hence some changes on this diff fixes that for the std::like interface
    of the array proxy.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jan 4, 2022
velox/expression/tests/ArrayProxyTest.cpp Show resolved Hide resolved
velox/expression/tests/ArrayProxyTest.cpp Outdated Show resolved Hide resolved
auto& arrayProxy = writer.current();
// General interface only is allowed for arrays of strings.
{
auto& string = arrayProxy.add_item();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a push_back that takes a std::string_view doesn't seem like it would be too bad.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would do the string copy i assume, the users might then not used write strings in the most efficient way

@facebook-github-bot
Copy link
Contributor

@laithsakka has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

laithsakka added a commit to laithsakka/velox that referenced this pull request Jan 25, 2022
Summary:
- This diff support string type elements for array proxy.
- std::enable_if should be dependent on a function template argument,
hence some changes on this diff fixes that for the std::like interface
of the array proxy.

Pull Request resolved: facebookincubator#831

Differential Revision: D33758295

Pulled By: laithsakka

fbshipit-source-id: 5f1562443b5b70ae92ccca704ac924010b4620f7
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D33758295

Differential Revision: D33687044

fbshipit-source-id: 1c2057505c23f958708d66ce262ec5c759bd97ee
Differential Revision: D33710171

fbshipit-source-id: ca47e4192cc908ba5bdcd83332dad8bedc59e0db
Differential Revision: D33710831

fbshipit-source-id: 6c1ee357d66b11fa1d1412940f7e4fa448ddbb96
Summary:
- There was 30% gap between vector and simpleWithResize
in the case of no nulls. This diff reduces the gap to 18%.

- The change also makes the "with nulls" case slightly faster.

- The change simplify the construction of the PrimitiveWriterProxy.
- The diff also remove FOLLY_ALWAYS_INLINE since it did not  effect performance,

```
no nulls:

before:

VectorBasic                                                160.02ns    6.25M
VectorResizeOptimized                                       76.78ns   13.02M
SimpleProxyWithResize                                      207.59ns    4.82M

after:

VectorBasic                                                160.02ns    6.25M
VectorResizeOptimized                                       76.78ns   13.02M
SimpleProxyWithResize                                      188.90ns    5.29M
```

```
with nulls

VectorBasic                                                405.50ns    2.47M
VectorResizeOptimized                                      313.68ns    3.19M
SimpleProxyWithResize                                      392.09ns    2.55M
```

Differential Revision: D33715941

fbshipit-source-id: e663dac619f778b79508fe086cf73ffa3897996b
Summary:
- This diff closes the gap in the performance of arrayProxy and vectorBasic
(make it even faster).
- Furthermore, it enhances the general interface when used with primitives by
100n. (around 30%)

Runtime without nulls:
```
============================================================================
velox/functions/prestosql/benchmarks/ArrayProxyBenchmark.cpprelative  time/iter  iters/s
============================================================================
VectorBasic                                                170.56ns    5.86M
VectorResizeOptimized                                       79.12ns   12.64M
SimpleProxyWithResize                                      156.16ns    6.40M
SimpleProxyPushBack                                        202.21ns    4.95M
SimpleGeneral                                              196.13ns    5.10M
SimpleOld                                                  476.05ns    2.10M
StdBasedImplementation                                     126.69ns    7.89M
============================================================================
```

Runtime with nulls:
```
============================================================================
velox/functions/prestosql/benchmarks/ArrayProxyBenchmark.cpprelative  time/iter  iters/s
============================================================================
VectorBasic                                                408.90ns    2.45M
VectorResizeOptimized                                      316.10ns    3.16M
SimpleProxyWithResize                                      367.67ns    2.72M
SimpleProxyPushBack                                        401.93ns    2.49M
SimpleGeneral                                              409.87ns    2.44M
SimpleOld                                                  641.79ns    1.56M
StdBasedImplementation                                     137.16ns    7.29M
============================================================================
```

It has two main changes:
1. Track the capacity of the underlying vector instead
a virtual capacity for the currently written array.

2. Localize the proxy by forcing an on-stack copy before calling the call function.
[have a larger effect on perf].

3. It removes the init call to proxy at each iteration and track the offset only in the proxy
instead of twice in the writer and the proxy.

I made sure this is not adding overhead when the written arrays are small.
The numbers below show the performance when the max array size is 10.
```
============================================================================
velox/functions/prestosql/benchmarks/ArrayProxyBenchmark.cpprelative  time/iter  iters/s
============================================================================
VectorBasic                                                  4.63ns  215.87M
VectorResizeOptimized                                        2.07ns  481.95M
SimpleProxyWithResize                                        4.69ns  213.34M
SimpleProxyPushBack                                          4.85ns  206.39M
SimpleGeneral                                                4.86ns  205.94M
SimpleOld                                                    9.43ns  106.05M
StdBasedImplementation                                     134.13ns    7.46M
============================================================================
```

Differential Revision: D33727971

fbshipit-source-id: 54db2442c98f8a3b79945749d5951154a58163d6
Summary:
- This diff support string type elements for array proxy.
- std::enable_if should be dependent on a function template argument,
hence some changes on this diff fixes that for the std::like interface
of the array proxy.

Pull Request resolved: facebookincubator#831

Differential Revision: D33758295

Pulled By: laithsakka

fbshipit-source-id: 57d1b98d12a3a0d4f1654dbfa6a580a2828b9cd4
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D33758295

laithsakka added a commit to laithsakka/velox that referenced this pull request Jan 25, 2022
Summary:
- This diff support string type elements for array proxy.
- std::enable_if should be dependent on a function template argument,
hence some changes on this diff fixes that for the std::like interface
of the array proxy.

Pull Request resolved: facebookincubator#831

Differential Revision: D33758295

Pulled By: laithsakka

fbshipit-source-id: b793f4478f43b75fb81317b147af8ede14c531fb
laithsakka added a commit to laithsakka/velox that referenced this pull request Jan 25, 2022
Summary:
- This diff support string type elements for array proxy.
- std::enable_if should be dependent on a function template argument,
hence some changes on this diff fixes that for the std::like interface
of the array proxy.

Pull Request resolved: facebookincubator#831

Differential Revision: D33758295

Pulled By: laithsakka

fbshipit-source-id: 719aa0451aabdf77421ecd6ef2eb0db7cac9f3a4
laithsakka added a commit to laithsakka/velox that referenced this pull request Jan 25, 2022
Summary:
- This diff support string type elements for array proxy.
- std::enable_if should be dependent on a function template argument,
hence some changes on this diff fixes that for the std::like interface
of the array proxy.

Pull Request resolved: facebookincubator#831

Differential Revision: D33758295

Pulled By: laithsakka

fbshipit-source-id: b83c7558dc374f962230117a8d621b1abf2bad0d
laithsakka added a commit to laithsakka/velox that referenced this pull request Jan 25, 2022
Summary:
- This diff support string type elements for array proxy.
- std::enable_if should be dependent on a function template argument,
hence some changes on this diff fixes that for the std::like interface
of the array proxy.

Pull Request resolved: facebookincubator#831

Differential Revision: D33758295

Pulled By: laithsakka

fbshipit-source-id: 242e688f5dd625481ce5471202398a9399abe3d7
laithsakka added a commit to laithsakka/velox that referenced this pull request Jan 31, 2022
…bator#831)

Summary:
- This diff support string type elements for array proxy.
- std::enable_if should be dependent on a function template argument,
hence some changes on this diff fixes that for the std::like interface
of the array proxy.

Pull Request resolved: facebookincubator#831

Differential Revision: D33758295

Pulled By: laithsakka

fbshipit-source-id: 14205aa9a8cc2b7510aa04631f4f4f5acfa20387
laithsakka added a commit to laithsakka/velox that referenced this pull request Jan 31, 2022
…bator#831)

Summary:
- This diff support string type elements for array proxy.
- std::enable_if should be dependent on a function template argument,
hence some changes on this diff fixes that for the std::like interface
of the array proxy.

Pull Request resolved: facebookincubator#831

Differential Revision: D33758295

Pulled By: laithsakka

fbshipit-source-id: 41fbfd514a111c352023a8621cef22cb6d371352
laithsakka added a commit to laithsakka/velox that referenced this pull request Jan 31, 2022
…bator#831)

Summary:
- This diff support string type elements for array proxy.
- std::enable_if should be dependent on a function template argument,
hence some changes on this diff fixes that for the std::like interface
of the array proxy.

Pull Request resolved: facebookincubator#831

Differential Revision: D33758295

Pulled By: laithsakka

fbshipit-source-id: f130f1b1430b0d6db33b1a60d172dee903f68426
laithsakka added a commit to laithsakka/velox that referenced this pull request Jan 31, 2022
…bator#831)

Summary:
- This diff support string type elements for array proxy.
- std::enable_if should be dependent on a function template argument,
hence some changes on this diff fixes that for the std::like interface
of the array proxy.

Pull Request resolved: facebookincubator#831

Differential Revision: D33758295

Pulled By: laithsakka

fbshipit-source-id: 6fb3c227a52d42b38e4c8609c23b6b086643f295
rui-mo pushed a commit to rui-mo/velox that referenced this pull request Mar 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants