chore: temporary branch for IOx update (11-30-2023 to 12-09-2023) #8543

@appletreeisyellow (Contributor) commented Dec 14, 2023

⚠️ This draft PR is not intended to be merged. It is a temporary branch for updating DataFusion in InfluxData IOx.

Dec. 14, 2023

DataFusion commit history:

The head of this branch is at:

Cherry-picked two bug fixes:

Cherry-picked the commits from the main branch:

Dec. 15, 2023

Cherry-picked the commits from the main branch:

@github-actions github-actions bot added logical-expr Logical plan and expressions optimizer Optimizer rules core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) labels Dec 14, 2023
appletreeisyellow and others added 2 commits December 14, 2023 11:03
… different schemas and statistics (apache#8533)

* Add test for schema evolution

* Fix reading parquet statistics

* Update tests for fix

* Add comments to help explain the test

* Add another test
Weijun-H and others added 9 commits December 14, 2023 15:42
* support LargeList in array_empty

* update err info
* feat: test queries for to_timestamp(float) WIP

* feat: Float64 input for to_timestamp

* cargo fmt

* clippy

* docs: double input type for to_timestamp

* feat: cast floats to timestamp

* style: cargo fmt

* fix: float64 cast for timestamp nanos only
* Support User Defined Table Function

Signed-off-by: veeupup <code@tanweime.com>

* fix comments

Signed-off-by: veeupup <code@tanweime.com>

* add udtf test

Signed-off-by: veeupup <code@tanweime.com>

* add file header

* Simplify table function example, add some comments

* Simplify exprs

* make clippy happy

* Update datafusion/core/tests/user_defined/user_defined_table_functions.rs

---------

Signed-off-by: veeupup <code@tanweime.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* document timestamp input limits

* fix text

* prettier

* remove doc for nanoseconds

* Update datafusion/physical-expr/src/datetime_expressions.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* fix: make ntile work in some corner cases

* fix comments

* minor

* Update datafusion/sqllogictest/test_files/window.slt

Co-authored-by: Mustafa Akur <106137913+mustafasrepo@users.noreply.github.com>

---------

Co-authored-by: Mustafa Akur <106137913+mustafasrepo@users.noreply.github.com>
Given that group keys inherently have few repeated values, especially
when grouping on a single column, the use of dictionary encoding is
unlikely to yield significant returns.
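
The reasoning in this commit message can be illustrated with a rough size estimate. The sketch below is not DataFusion or Arrow code: `plain_size` and `dictionary_size` are hypothetical helpers, and the cost model (a 4-byte offset per stored string, a 4-byte key per row) is a simplifying assumption chosen only to show the trend.

```python
def plain_size(values):
    """Estimated bytes for a plain string column.

    Assumed model: the UTF-8 bytes of every value plus one 4-byte offset per value.
    """
    return sum(len(v.encode("utf-8")) for v in values) + 4 * len(values)


def dictionary_size(values, key_width=4):
    """Estimated bytes for a dictionary-encoded string column.

    Assumed model: one fixed-width key per row, plus each distinct value
    stored once as a plain string.
    """
    distinct = set(values)
    return key_width * len(values) + plain_size(distinct)


# Group keys: every value is distinct, so the dictionary stores all of them
# and the per-row keys are pure overhead.
group_keys = [f"host-{i}" for i in range(1000)]

# A heavily repeated column: the dictionary stores a single value.
repeated = ["host-1"] * 1000

if __name__ == "__main__":
    print("group keys:", plain_size(group_keys), "plain vs", dictionary_size(group_keys), "dictionary")
    print("repeated:  ", plain_size(repeated), "plain vs", dictionary_size(repeated), "dictionary")
```

Under these assumptions the dictionary form is larger than the plain form when every key is distinct and smaller only when values repeat, which is consistent with the argument above.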
* done

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* add more test

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* cleanup

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

---------

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
@github-actions github-actions bot added sql SQL Planner physical-expr Physical Expressions labels Dec 14, 2023
alamb and others added 12 commits December 14, 2023 16:41
* Minor: Improve the documentation on `ScalarValue`

* Update datafusion/common/src/scalar.rs

Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>

* Update datafusion/common/src/scalar.rs

Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>

---------

Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>
* add benchmark

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* fmt

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* address clippy

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* cleanup

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* fix comment

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

---------

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* minor changes

* PipelineStatePropagator tree refactor

* Remove duplications by children_unbounded()

* Remove on-the-fly tree construction

* Minor changes

---------

Co-authored-by: Mustafa Akur <mustafa.akur@synnada.ai>
…8121)

* feat: support LargeList in make_array and array_length

* chore: add tests

* fix: update tests for nested array

* use usize_as

* add new_large_list

* refactor array_length

* add comment

* update test in sqllogictest

* fix ci

* fix macro

* use usize_as

* update comment

* return based on data_type in make_array
…che#8404)

- remove `unalias` TableScan filters
- refactor  CreateExternalTable
- fix typo
…ls (apache#8400)

* fix: transforming LogicalPlan::Explain using TreeNode::transform fails

* Update datafusion/expr/src/logical_plan/plan.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* Minor: Improve the document format of JoinHashMap

* Docs: Fix `array_except` documentation example
* Minor: Improve the document format of JoinHashMap

* support named query parameters

* cargo fmt

* add `ParamValues` conversion

* improve doc
…tests for ORDER BY cases with RANGE frame (apache#8410)

* fix: RANGE frame can be regularized to ROWS frame only if the ORDER BY clause is empty

* Fix flaky test

* Update test comment

* Add code comment

* Update
mustafasrepo and others added 24 commits December 15, 2023 14:03
* Relax schema check for optimize projections.

* Minor changes

* Update datafusion/optimizer/src/optimize_projections.rs

Co-authored-by: jakevin <jakevingoo@gmail.com>

---------

Co-authored-by: jakevin <jakevingoo@gmail.com>
* list cmp

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* remove cfg

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

---------

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* Support parquet_metadata for datafusion-cli

Signed-off-by: veeupup <code@tanweime.com>

* make tomlfmt happy

* display like duckdb

Signed-off-by: veeupup <code@tanweime.com>

* add test & fix single quote

---------

Signed-off-by: veeupup <code@tanweime.com>
* Fix nested count optimization

* fmt

* extend comment

* Clippy

* Update datafusion/optimizer/src/optimize_projections.rs

Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>

* Add sqllogictests

* Fmt

---------

Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Bumps [actions/setup-python](https://github.com/actions/setup-python) from 4 to 5.
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](actions/setup-python@v4...v5)

---
updated-dependencies:
- dependency-name: actions/setup-python
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* change get zero to first()

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* wake clone to wake_by_ref

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* more first()

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* try_from() to from()

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
…e treated as constant sort (apache#8445)

* fix: RANGE frame for corner cases with empty ORDER BY clause should be treated as constant sort

* fix

* Make the test not flaky

* fix clippy
* fix: don't unify projections if expr is non-trivial

* Update datafusion/core/src/physical_optimizer/projection_pushdown.rs

Co-authored-by: Alex Huang <huangweijun1001@gmail.com>

---------

Co-authored-by: Alex Huang <huangweijun1001@gmail.com>
* Minor: Add new bloom filter tests

* fmt
* Aggregate rewrite for dataframe API.

* Simplifications

* Minor changes

* Minor changes

* Add new test

* Add new tests

* Minor changes

* Add rule, for aggregate simplification

* Simplifications

* Simplifications

* Simplifications

* Minor changes

* Simplifications

* Add new test condition

* Tmp

* Push requirement below aggregate

* Add join and subquery alias

* Add cross join support

* Minor changes

* Add logical plan repartition support

* Add union support

* Add table scan

* Add limit

* Minor changes, buggy

* Add new tests, fix existing bugs

* change concat type array_concat

* Resolve some of the bugs

* Comment out a rule

* All tests pass, when single distinct is closed

* Fix aggregate bug

* Change analyze and explain implementations

* All tests pass

* Resolve linter errors

* Simplifications, remove unnecessary code

* Comment out tests

* Remove pushdown projection

* Pushdown empty projections

* Fix failing tests

* Simplifications

* Update comments, simplifications

* Remove eliminate projection rule, Add method for group expr len aggregate

* Simplifications, subquery support

* Update comments, add unnest support, simplifications

* Remove eliminate projection pass

* Change name

* Minor changes

* Minor changes

* Add comments

* Fix failing test

* Minor simplifications

* update

* Minor

* Remove ordering

* Minor changes

* add merge projections

* Add comments, resolve linter errors

* Minor changes

* Minor changes

* Minor changes

* Minor changes

* Minor changes

* Minor changes

* Minor changes

* Minor changes

* Review Part 1

* Review Part 2

* Fix quadratic search, Change trim_expr impl

* Review Part 3

* Address reviews

* Minor changes

* Review Part 4

* Add case expr support

* Review Part 5

* Review Part 6

* Finishing touch: Improve comments

---------

Co-authored-by: berkaysynnada <berkay.sahin@synnada.ai>
Co-authored-by: Mehmet Ozan Kabak <ozankabak@gmail.com>
* refactor date_trunc

* fix cast to timestamp array

* fix cast to timestamp scalar

* fix doc
* implement distinct func

implement slt & proto

fix null & empty list

* add comment for slt

Co-authored-by: Alex Huang <huangweijun1001@gmail.com>

* fix largelist

* add largelist for slt

* Use collect for rows & init capacity for offsets.

* fixup: remove useless match

* fix fmt

* fix fmt

---------

Co-authored-by: Alex Huang <huangweijun1001@gmail.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
…e#8377)

* Add `evaluate_demo` and `range_analysis_demo` to Expr examples

* Prettier

* Update datafusion-examples/examples/expr_api.rs

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>

* rename ExprBoundaries::try_new_unknown --> ExprBoundaries::try_new_unbounded

---------

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
…ont/back` (apache#8401)

* array_element done

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* clippy

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* replace array_slice

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* fix get_indexed_field_empty_list

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* replace pop front and pop back

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* clippy

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* add doc and comment

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* fmt

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

---------

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
* refactor trim

* add fmt for TrimType

* fix closure

* update comment
* add PlaceHolderRowExec

* Change produce_one_row=true calls to use PlaceHolderRowExec

* remove produce_one_row from EmptyExec, changes in proto serializer, working tests

* PlaceHolder => Placeholder

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
@appletreeisyellow appletreeisyellow changed the title chore: temporary branch for IOx update chore: temporary branch for IOx update (11-30-2023 to 12-09-2023) Jan 2, 2024
@appletreeisyellow
Contributor Author

Closing this as the update is done

@appletreeisyellow appletreeisyellow deleted the chunchun/temp-fix-12-14-2023 branch January 24, 2024 16:52
Labels
core (Core DataFusion crate); documentation (Improvements or additions to documentation); logical-expr (Logical plan and expressions); optimizer (Optimizer rules); physical-expr (Physical Expressions); sql (SQL Planner); sqllogictest (SQL Logic Tests (.slt)); substrait