SIMD for decimal data type #1010

Closed
liukun4515 opened this issue Dec 7, 2021 · 10 comments

@liukun4515
Contributor

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

Following this DataFusion issue, we are adding the decimal data type to DataFusion.
However, the basic operations we have implemented so far, such as + and -, do not use SIMD.

To speed up these calculations, we need SIMD support for the decimal data type.

I have not figured out how to implement it.
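
For context, here is a minimal sketch of the scalar operation in question. It assumes each decimal is stored as an unscaled `i128` (for example, 1.23 at scale 2 is stored as 123) and that both inputs share the same scale. The function name is hypothetical and is not part of arrow-rs or DataFusion.

```rust
/// Element-wise addition of two decimal columns that share the same scale.
/// Values are unscaled integers, so adding the integers adds the decimals.
/// Returns `None` if the lengths differ or any addition overflows `i128`.
/// (Hypothetical helper for illustration; not an arrow-rs API.)
pub fn add_decimal_same_scale(left: &[i128], right: &[i128]) -> Option<Vec<i128>> {
    if left.len() != right.len() {
        return None;
    }
    // Collecting an iterator of `Option<i128>` into `Option<Vec<i128>>`
    // stops at the first overflow and yields `None`.
    left.iter()
        .zip(right)
        .map(|(l, r)| l.checked_add(*r))
        .collect()
}
```

The per-element overflow check is one reason a loop like this may not vectorize as readily as a plain wrapping add, so any SIMD version would need to decide how overflow is handled.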

Describe the solution you'd like
TODO

Describe alternatives you've considered
TODO

Additional context
TODO

@liukun4515 liukun4515 added the enhancement Any new improvement worthy of a entry in the changelog label Dec 7, 2021
@liukun4515
Contributor Author

please assign this to me.

@alamb
Contributor

alamb commented Dec 9, 2021

FYI @liukun4515 -- before implementing explicit SIMD versions of these kernels, it may be worth doing some profiling / disassembly of what rustc creates for the current kernels.
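
As a rough illustration of what such an experiment could look like, the sketch below is a standalone loop over `i128` slices that can be compiled with optimizations and inspected in the emitted assembly. It is not one of the existing arrow kernels; the function name and the choice of wrapping arithmetic are assumptions made for the example.

```rust
/// Adds `left` and `right` element-wise into `out`, wrapping on overflow.
/// Only the first `min(left.len(), right.len(), out.len())` elements are written.
/// `#[inline(never)]` keeps the function as a separate symbol so it is easy
/// to find in the disassembly. (Hypothetical kernel for illustration.)
#[inline(never)]
pub fn add_i128_wrapping(left: &[i128], right: &[i128], out: &mut [i128]) {
    // Zipped iterators avoid per-element bounds checks in the loop body.
    for ((o, l), r) in out.iter_mut().zip(left).zip(right) {
        *o = l.wrapping_add(*r);
    }
}
```

Whether rustc emits vector instructions for this loop depends on the target CPU features and optimization level (for example `-C opt-level=3` together with `-C target-cpu=native`), which is exactly what looking at the disassembly would show.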

Perhaps a good start would be to add Decimal support to the existing kernels in https://docs.rs/arrow/6.3.0/arrow/compute/kernels/aggregate/index.html if it doesn't already exist.
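
To make the idea concrete, a decimal sum aggregate could be sketched roughly as follows. This is a simplified stand-in that works on plain slices rather than arrow arrays: the validity slice, the function name, and the choice to return `None` for an all-null input are all assumptions for the example, not the arrow API.

```rust
/// Sums the non-null entries of a decimal column stored as unscaled `i128`
/// values. `validity[i] == false` marks entry `i` as null.
/// Returns `None` if there are no valid entries, if the slice lengths
/// differ, or if the running sum overflows `i128`.
/// (Hypothetical helper for illustration; not an arrow-rs API.)
pub fn sum_decimal(values: &[i128], validity: &[bool]) -> Option<i128> {
    if values.len() != validity.len() {
        return None;
    }
    let mut acc: Option<i128> = None;
    for (v, valid) in values.iter().zip(validity) {
        if *valid {
            // `?` returns `None` from the function on overflow.
            acc = Some(acc.unwrap_or(0).checked_add(*v)?);
        }
    }
    acc
}
```

A real kernel would likely report overflow separately from "all null"; the two are folded into `None` here only to keep the sketch short.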

I have not reviewed the code in the datafusion aggregate functions for a while, so I am not familiar with how much they do / don't use the arrow compute kernels.

@liukun4515
Contributor Author

> disassembly

Thanks for your suggestion, I will follow this.

@alamb
Contributor

alamb commented Dec 11, 2021

FWIW I believe https://rust.godbolt.org/ is a popular tool for such work

@chadbrewbaker

https://ispc.github.io etc are probably the way to go. For any expensive query you will want to shell out to LLVM and custom compile the worker binaries before the run - also probably blast the query plan with an SMT solver to reduce the expense/runtime depending on your constraints. I would avoid dynamic linking like the plague - be like the IBM Blue Gene/L.

@chadbrewbaker

Before mucking with SIMD, it is probably a good idea to have a repo, like Rust coreutils, that leverages existing regression tests for the SQL engine, especially perf regressions.

https://github.com/postgres/postgres/tree/master/src/test/regress

https://github.com/s3team/Squirrel

@chadbrewbaker

Here is the Google SIMD assembler https://github.com/google/zetasql/blob/master/zetasql/base/mathutil.h

@liukun4515
Contributor Author

@chadbrewbaker thanks.

@tustvold
Contributor

tustvold commented Nov 1, 2022

Implemented by #2881

@tustvold tustvold closed this as completed Nov 1, 2022
@liukun4515
Contributor Author

> Implemented by #2881

Thanks @tustvold
