Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consolidate benchmarks #34

Merged
merged 6 commits into from
Apr 23, 2021
Merged

Conversation

andygrove
Copy link
Member

Follows on from #23.

There is now a single benchmark crate that can run against DataFusion or Ballista.

Integration tests pass.

Closes #6.

@codecov-commenter
Copy link

codecov-commenter commented Apr 22, 2021

Codecov Report

Merging #34 (84d755d) into master (74cdf6f) will increase coverage by 0.41%.
The diff coverage is 11.11%.

❗ Current head 84d755d differs from pull request most recent head 78279fc. Consider uploading reports for the commit 78279fc to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master      #34      +/-   ##
==========================================
+ Coverage   75.83%   76.24%   +0.41%     
==========================================
  Files         135      134       -1     
  Lines       23152    23023     -129     
==========================================
- Hits        17557    17554       -3     
+ Misses       5595     5469     -126     
Impacted Files Coverage Δ
benchmarks/src/bin/tpch.rs 35.15% <11.11%> (-3.71%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 74cdf6f...78279fc. Read the comment docs.

@andygrove
Copy link
Member Author

@alamb @jorgecarleitao @Dandandan This is rebased and ready for review

*
from
orders
where
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 👍

Copy link
Contributor

@Dandandan Dandandan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, I think this looks good

Copy link
Member

@jorgecarleitao jorgecarleitao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Brutal ❤️ 💯

@andygrove andygrove merged commit 32951c3 into apache:master Apr 23, 2021
@andygrove andygrove deleted the consolidate-benchmarks branch April 23, 2021 19:34
@houqp houqp added the development-process Related to development process of arrow-datafusion label Jul 29, 2021
andygrove added a commit that referenced this pull request Jan 12, 2023
* Initial commit

* initial commit

* failing test

* table scan projection

* closer

* test passes, with some hacks

* use DataFrame (#2)

* update README

* update dependency

* code cleanup (#3)

* Add support for Filter operator and BinaryOp expressions (#4)

* GitHub action (#5)

* Split code into producer and consumer modules (#6)

* Support more functions and scalar types (#7)

* Use substrait 0.1 and datafusion 8.0 (#8)

* use substrait 0.1

* use datafusion 8.0

* update datafusion to 10.0 and substrait to 0.2 (#11)

* Add basic join support (#12)

* Added fetch support (#23)

Added fetch to consumer

Added limit to producer

Added unit tests for limit

Added roundtrip_fill_none() for testing when None input can be converted to 0

Update src/consumer.rs

Co-authored-by: Andy Grove <andygrove73@gmail.com>

Co-authored-by: Andy Grove <andygrove73@gmail.com>

* Upgrade to DataFusion 13.0.0 (#25)

* Add sort consumer and producer (#24)

Add consumer

Add producer and test

Modified error string

* Add serializer/deserializer (#26)

* Add plan and function extension support (#27)

* Add plan and function extension support

* Removed unwraps

* Implement GROUP BY (#28)

* Add consumer, producer and tests for aggregate relation

Change function extension registration from absolute to relative anchor
(reference)

Remove operator to/from reference

* Fixed function registration bug

* Add test

* Addressed PR comments

* Changed field reference from mask to direct reference (#29)

* Changed field reference from masked reference to direct reference

* Handle unsupported case (struct with child)

* Handle SubqueryAlias (#30)

Fixed aggregate function register bug

* Add support for SELECT DISTINCT (#31)

Add test case

* Implement BETWEEN (#32)

* Add case (#33)

* Implement CASE WHEN

* Add more case to test

* Addressed comments

* feat: support explicit catalog/schema names in ReadRel (#34)

* feat: support explicit catalog/schema names in ReadRel

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix: use re-exported expr crate

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* move files to subfolder

* RAT

* remove rust.yaml

* revert .gitignore changes

* tomlfmt

* tomlfmt

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Daniël Heres <danielheres@gmail.com>
Co-authored-by: JanKaul <jankaul@mailbox.org>
Co-authored-by: nseekhao <37189615+nseekhao@users.noreply.github.com>
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
development-process Related to development process of arrow-datafusion
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Consolidate TPC-H benchmarks
5 participants