-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-658: [C++] Implement a prototype in-memory arrow::Tensor type #438
Conversation
@@ -43,7 +43,7 @@ class Status; | |||
/// of bytes that where allocated for the buffer in total. | |||
/// | |||
/// The following invariant is always true: Size < Capacity | |||
class ARROW_EXPORT Buffer : public std::enable_shared_from_this<Buffer> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit of a red herring for this patch, but I ran across this cruft and decided to remove it. I can back out this change if there's concern
rebased |
cpp/src/arrow/tensor.h
Outdated
|
||
/// Constructor with non-negative strides | ||
Tensor(const std::shared_ptr<DataType>& type, const std::shared_ptr<Buffer>& data, | ||
std::vector<int64_t> shape, std::vector<int64_t> strides); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const references for the vector
attributes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
cpp/src/arrow/tensor.h
Outdated
std::vector<int64_t> shape, std::vector<int64_t> strides); | ||
|
||
/// Constructor with strides and dimension names | ||
Tensor(const std::shared_ptr<DataType>& type, const std::shared_ptr<Buffer>& data, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see above
Change-Id: I3b28249f15ed649abc6ddde47de451b560117b6d
Change-Id: I7df3536c1663827e720fe5de2d1110e87c52ecc3
Change-Id: I2d6adc5ff961172540177ce02e7ff8a6b14ea385
rebased |
cpp/src/arrow/tensor.cc
Outdated
} | ||
|
||
int64_t Tensor::size() const { | ||
if (shape_.size() == 0) { return 0; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The size of a 0-D array should probably be 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
cpp/src/arrow/tensor.cc
Outdated
int64_t size = 1; | ||
for (int64_t dimsize : shape_) { | ||
size *= dimsize; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using std::accumulate
with std::multiplies
should make this faster and more concise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Change-Id: I21003a73c07ab5c71aee01dafa16a65299e85069
Change-Id: Ife060d87a629f74e83fc33690d92a3d2dfd89433
Change-Id: Ia4562e4235f8b85709b1bbb39721bba0493065c4
Change-Id: I254944a46c1231b9a78605a5bf52a1e13c09055c
+1, merging to take a crack at the shared memory IPC, we can continue to iterate on the Tensor data structures as needed |
Author: Uwe L. Korn <uwelk@xhochy.com> Closes apache#438 from xhochy/PARQUET-1218 and squashes the following commits: 20ba13d [Uwe L. Korn] PARQUET-1218: More informative error message on too short pages
Author: Uwe L. Korn <uwelk@xhochy.com> Closes apache#438 from xhochy/PARQUET-1218 and squashes the following commits: 20ba13d [Uwe L. Korn] PARQUET-1218: More informative error message on too short pages Change-Id: I3b682d1860faf49b514ba660dc1e4e7656e75122
Author: Uwe L. Korn <uwelk@xhochy.com> Closes apache#438 from xhochy/PARQUET-1218 and squashes the following commits: 20ba13d [Uwe L. Korn] PARQUET-1218: More informative error message on too short pages Change-Id: I3b682d1860faf49b514ba660dc1e4e7656e75122
Author: Uwe L. Korn <uwelk@xhochy.com> Closes apache#438 from xhochy/PARQUET-1218 and squashes the following commits: 20ba13d [Uwe L. Korn] PARQUET-1218: More informative error message on too short pages Change-Id: I3b682d1860faf49b514ba660dc1e4e7656e75122
Author: Uwe L. Korn <uwelk@xhochy.com> Closes apache#438 from xhochy/PARQUET-1218 and squashes the following commits: 20ba13d [Uwe L. Korn] PARQUET-1218: More informative error message on too short pages Change-Id: I3b682d1860faf49b514ba660dc1e4e7656e75122
I haven't implemented much beyond the data container and automatically computing row major strides. If we agree on the basics, then I will implement IPC read/writes of this data structure in a follow up patch.
cc @pcmoritz @robertnishihara @JohanMabille @SylvainCorlay