ARROW-14062: [Format] Initial arrow-internal specification of compute IR #10934

Closed
wants to merge 96 commits into from
96 commits
f2ba6e4
draft
bkietz Aug 13, 2021
760c804
[RFC] Arrow Compute Serialized Intermediate Representation draft for …
bkietz Aug 13, 2021
13c5bba
add interactive_output relation
bkietz Aug 13, 2021
8cd9894
Update format/ComputeIR.fbs
bkietz Aug 13, 2021
8cd6c46
add explicit shaping for Literals
bkietz Aug 13, 2021
7bcb504
Update format/ComputeIR.fbs
bkietz Aug 13, 2021
f2e319a
allow join_kind to be an arbitrary string
bkietz Aug 13, 2021
134faee
ensure Blob.bytes are aligned
bkietz Aug 14, 2021
213964a
use a single enum to specify ordering
bkietz Aug 14, 2021
194d51b
Update format/ComputeIR.fbs
bkietz Aug 13, 2021
a6b107b
Use Buffers for byte blobs
bkietz Aug 14, 2021
38731e2
clarify interactive_output by adding an rpc_service
bkietz Aug 14, 2021
4e08757
add common, union relations
bkietz Aug 15, 2021
58176d8
add ComputeIR.rst with basic introduction to Compute IR
bkietz Aug 16, 2021
62ec6c9
add gRPC generation, don't require vector-of-unions
bkietz Aug 16, 2021
de14e46
make common name required
bkietz Aug 16, 2021
3b4d33c
make aggregate keys required (may still be empty)
bkietz Aug 16, 2021
e91e8dc
remove Shape suffix
bkietz Aug 16, 2021
f19326b
clarify that namespaces end in ::
bkietz Aug 17, 2021
9b80f8a
clarify JoinOptions.join_name
bkietz Aug 17, 2021
d8b2f14
Make byte blobs inline so we don't need a sidecar buffer
bkietz Aug 17, 2021
6de1e8a
Use Field instead of Type to accommodate nested types
bkietz Aug 17, 2021
3ed5d5d
provide an explicit enum for canonical functions
bkietz Aug 17, 2021
8dbcf68
typo, add comment re vector-of-union
bkietz Aug 17, 2021
f4633c9
refactor InlineBuffer to avoid need for reinterpret_cast
bkietz Aug 18, 2021
333b58e
Move compute to experimental directory
cpcloud Aug 26, 2021
8d2af01
Add generated compute IR Python code
cpcloud Aug 26, 2021
4953530
Clean up shell script and add Python compute IR compilation
cpcloud Aug 26, 2021
ee1e696
Add generated C++ code
cpcloud Aug 26, 2021
f25a282
Remove generated RPC code
cpcloud Aug 26, 2021
a603b66
Make sure InlineBuffer buffer is the root_type for its definition file
cpcloud Aug 26, 2021
d1be0b3
Try ignoring generated flatbuffers Python code
cpcloud Aug 26, 2021
dfc1e08
Ignore generated C++ in code review
cpcloud Aug 26, 2021
7800393
Fix generated C++ code after setting root type for InlineBuffer
cpcloud Aug 26, 2021
5fc1e89
Remove generated code until review is done
cpcloud Aug 26, 2021
76d7701
Remove generated C++ code for now
cpcloud Aug 26, 2021
6f973bd
Revert generated C++ for now
cpcloud Aug 26, 2021
06d51ae
Default relation index to 0
cpcloud Aug 27, 2021
263b598
Remove derived_plan for now
cpcloud Aug 27, 2021
7ad4bcc
Change aggregations field name to measures
cpcloud Aug 27, 2021
5380448
Change default to else in Case expressions
cpcloud Aug 27, 2021
67ab276
Add support for multiple groupings during aggregation
cpcloud Aug 27, 2021
898e49b
Add clustered collation and rename Ordering to Collation
cpcloud Aug 27, 2021
01b7b35
Remove predicate from AggregateCall
cpcloud Aug 27, 2021
b204a46
Add offset to limit relation
cpcloud Aug 27, 2021
b4729af
Remove CTE relation for now
cpcloud Aug 27, 2021
01be0b0
Remove noncanonical setops
cpcloud Aug 27, 2021
b39ff56
Remove required from primitive type
cpcloud Aug 27, 2021
4e2a432
Make literal relations support multiple rows
cpcloud Aug 27, 2021
243ba5d
Remove example extensions for now
cpcloud Aug 27, 2021
e42581e
Remove custom relations for now
cpcloud Aug 27, 2021
44166ac
Inline relations
cpcloud Aug 27, 2021
2955a1e
Remove TODO around window/asof
cpcloud Aug 27, 2021
4d81652
Remove custom joins for now
cpcloud Aug 27, 2021
b1a0e55
Remove cast and extract
cpcloud Aug 27, 2021
d4c5f7d
Remove InlineBuffer for now
cpcloud Aug 27, 2021
8e1dee3
Remove Write for now
cpcloud Aug 27, 2021
4d41f04
Comment on LiteralColumn
cpcloud Aug 27, 2021
33a5976
Split out case into conditional and simple
cpcloud Aug 27, 2021
535e140
Note about the meaning of name in Field on Expressions
cpcloud Aug 27, 2021
8413792
Describe why Type is not required
cpcloud Aug 27, 2021
50c4655
Remove unused variants
cpcloud Aug 27, 2021
f4f90c8
Remove Common relation
cpcloud Aug 27, 2021
bf406a7
Remove AggregateCall and move orderings to `Call` as an optional field
cpcloud Aug 27, 2021
55caa15
Remove (required) on JoinKind
cpcloud Aug 27, 2021
e1c4872
Move aggregates ids back to function ids
cpcloud Aug 27, 2021
2aa4764
Add comments about window bounds
cpcloud Aug 27, 2021
1ca4f49
Fix typo
cpcloud Aug 27, 2021
c715bba
Inline Bound
cpcloud Aug 27, 2021
e8e9c85
Move frame down a field
cpcloud Aug 27, 2021
9019397
Rename ArrayLiteral to ListLiteral
cpcloud Aug 31, 2021
a96110c
Leave the Decimal door open
cpcloud Aug 31, 2021
d0e46b5
Add DurationLiteral
cpcloud Aug 31, 2021
64266aa
Add units for time-related objects
cpcloud Aug 31, 2021
3399d03
Rename ordering to collation
cpcloud Aug 31, 2021
076ff73
Remove the type field from Expression
cpcloud Aug 31, 2021
e8afb42
Replace Read with more abstract Table
cpcloud Sep 14, 2021
6bb3306
Punt on function registry for now
cpcloud Sep 14, 2021
7f7db42
Check in generated code
cpcloud Sep 14, 2021
1ec041d
Move ir to experimental/computeir
cpcloud Sep 15, 2021
25db445
Remove generated python code
cpcloud Sep 15, 2021
856e518
Move computeir docs to developers section
cpcloud Sep 20, 2021
b1e9f93
Allow any type of value in MapLiteral key
cpcloud Sep 20, 2021
4e70dda
Reuse Time/DateUnit from Schema.fbs
cpcloud Sep 20, 2021
a89302b
Make DecimalLiteral scale/precision signed
cpcloud Sep 20, 2021
0f7b261
Add interval literal with nanos
cpcloud Sep 20, 2021
902e73f
Add FixedSizeBinaryLiteral
cpcloud Sep 20, 2021
427372d
Add comment about endianness to DecimalLiteral bytes
cpcloud Sep 20, 2021
88b3d25
Fix outdated comments on FieldRef
cpcloud Sep 20, 2021
bc3766d
Rename Collation to Ordering and remove CLUSTERED
cpcloud Sep 20, 2021
c2e916a
MapKey expression should allow any type for the key
cpcloud Sep 20, 2021
b6513d7
Audit use of (required) in Literal.fbs
cpcloud Sep 20, 2021
36ce349
Restrict StructLiteral fields to string keys
cpcloud Sep 20, 2021
48da544
Rename FieldName to FieldIndex and use it in Remap
cpcloud Sep 20, 2021
5ac6cbd
Remove python from .gitattributes file
cpcloud Sep 20, 2021
2b7e5d0
Remove intervalnanos for now
cpcloud Sep 20, 2021
2 changes: 1 addition & 1 deletion .gitattributes
@@ -3,4 +3,4 @@ r/R/arrowExports.R linguist-generated=true
r/src/RcppExports.cpp linguist-generated=true
r/src/arrowExports.cpp linguist-generated=true
r/man/*.Rd linguist-generated=true

cpp/src/generated/*.h linguist-generated=true
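The `linguist-generated=true` attribute marks matching paths as generated so that GitHub collapses them in diffs. One way to check which paths a rule covers is `git check-attr`. The following is a minimal sketch, assuming `git` is installed; it runs in a throwaway repository, and both header names are hypothetical placeholders rather than files from this PR.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Scratch repository so the check does not depend on an Arrow checkout.
repo="$(mktemp -d)"
git init --quiet "$repo"
cd "$repo"

# Same rule as the last line of the hunk above.
printf '%s\n' 'cpp/src/generated/*.h linguist-generated=true' > .gitattributes

# The paths need not exist; git only evaluates the attribute rules.
# Both file names are made up for illustration.
result="$(git check-attr linguist-generated -- \
  cpp/src/generated/Example_generated.h \
  cpp/src/arrow/example.h)"
echo "$result"
```

The first path should be reported with the value `true`, and the second as `unspecified`, because it lies outside `cpp/src/generated/`.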
1 change: 1 addition & 0 deletions cpp/build-support/lint_exclusions.txt
@@ -1,4 +1,5 @@
*_generated*
*.grpc.fb.*
*parquet_constants.*
*parquet_types.*
*windows_compatibility.h
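The entries in this file are glob patterns. How the lint tooling applies them is not shown in this diff, so the sketch below only assumes shell-style glob matching against the whole path. The helper `is_excluded` and the three sample paths are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Patterns copied from the hunk above.
patterns=(
  '*_generated*'
  '*.grpc.fb.*'
  '*parquet_constants.*'
  '*parquet_types.*'
  '*windows_compatibility.h'
)

# Hypothetical helper: succeed if the path matches any exclusion pattern.
is_excluded() {
  local path="$1" pat
  for pat in "${patterns[@]}"; do
    # An unquoted right-hand side of == inside [[ ]] is treated as a glob,
    # and * also matches "/" in this context.
    if [[ "$path" == $pat ]]; then
      return 0
    fi
  done
  return 1
}

# Sample paths are made up for illustration.
for f in src/generated/Example_generated.h src/rpc/Example.grpc.fb.cc src/arrow/example.cc; do
  if is_excluded "$f"; then
    echo "skip  $f"
  else
    echo "lint  $f"
  fi
done
```

Under that assumption the first two sample paths are skipped and the third is linted.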
45 changes: 27 additions & 18 deletions cpp/build-support/update-flatbuffers.sh
@@ -20,22 +20,31 @@

 # Run this from cpp/ directory. flatc is expected to be in your path
 
+set -euo pipefail
+
 CWD="$(cd "$(dirname "${BASH_SOURCE[0]:-$0}")" && pwd)"
-SOURCE_DIR=$CWD/../src
-FORMAT_DIR=$CWD/../../format
-FLATC="flatc -c --cpp-std c++11"
-
-$FLATC -o $SOURCE_DIR/generated \
-  --scoped-enums \
-  $FORMAT_DIR/Message.fbs \
-  $FORMAT_DIR/File.fbs \
-  $FORMAT_DIR/Schema.fbs \
-  $FORMAT_DIR/Tensor.fbs \
-  $FORMAT_DIR/SparseTensor.fbs \
-  src/arrow/ipc/feather.fbs
-
-$FLATC -o $SOURCE_DIR/plasma \
-  --gen-object-api \
-  --scoped-enums \
-  $SOURCE_DIR/plasma/common.fbs \
-  $SOURCE_DIR/plasma/plasma.fbs
+SOURCE_DIR="$CWD/../src"
+PYTHON_SOURCE_DIR="$CWD/../../python"
+FORMAT_DIR="$CWD/../../format"
+TOP="$FORMAT_DIR/.."
+FLATC="flatc"
+
+OUT_DIR="$SOURCE_DIR/generated"
+FILES=($(find $FORMAT_DIR -name '*.fbs'))
+FILES+=("$SOURCE_DIR/arrow/ipc/feather.fbs")
+
+# add compute ir files
+FILES+=($(find "$TOP/experimental/computeir" -name '*.fbs'))
+
+$FLATC --cpp --cpp-std c++11 \
+  --scoped-enums \
+  -o "$OUT_DIR" \
+  "${FILES[@]}"
+
+PLASMA_FBS=("$SOURCE_DIR"/plasma/{plasma,common}.fbs)
+
+$FLATC --cpp --cpp-std c++11 \
+  -o "$SOURCE_DIR/plasma" \
+  --gen-object-api \
+  --scoped-enums \
+  "${PLASMA_FBS[@]}"
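The updated script collects schema files into a bash array with `find` and passes the whole array to a single `flatc` call. The sketch below shows the same pattern in isolation. It is a variation, not what the PR does: it assumes bash 4.4 or newer for `mapfile -d`, uses NUL-delimited `find` output so paths containing spaces survive, and echoes the command instead of running `flatc`. The helper `collect_fbs`, the temporary tree, and `Example.fbs` are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print every *.fbs file under the given directories,
# NUL-delimited and sorted so the result is deterministic.
collect_fbs() {
  find "$@" -name '*.fbs' -print0 | sort -z
}

# Throwaway tree standing in for format/ and experimental/computeir/.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/format" "$demo_root/experimental/computeir"
touch "$demo_root/format/Schema.fbs" \
      "$demo_root/experimental/computeir/Example.fbs" \
      "$demo_root/experimental/computeir/notes.txt"

# mapfile -d '' splits on NUL bytes (bash 4.4 or newer).
mapfile -d '' -t FILES < <(collect_fbs "$demo_root/format" "$demo_root/experimental/computeir")

# Dry run: print the flatc command line rather than executing it.
echo flatc --cpp --cpp-std c++11 --scoped-enums -o "$demo_root/generated" "${FILES[@]}"
```

Only the two `.fbs` files end up in `FILES`; `notes.txt` is filtered out by the `-name '*.fbs'` test.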