New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multi input concat #2232
Multi input concat #2232
Conversation
0d2c764
to
f8382aa
Compare
@@ -41,6 +41,10 @@ template <class T> void bind_concatenate(py::module &m) { | |||
}, | |||
py::arg("x"), py::arg("y"), py::arg("dim"), | |||
py::call_guard<py::gil_scoped_release>()); | |||
m.def( | |||
"concat", | |||
[](const std::vector<T> &x, const Dim dim) { return concat(x, dim); }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes copies of all inputs. Is it a shallow or a deep copy?
Unfortunately, I don't see a way to avoid this completely (except for ranges, see other comment) because concat
takes a span as argument and Python cannot provide a contiguous memory layout. So we need to at least shallow-copy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copy constructors correspond to shallow copies. Deep copies would be calls to copy()
.
@@ -345,7 +345,7 @@ class TargetBinBuilder { | |||
|
|||
void join(const Dim dim, const Variable &coord) { | |||
m_dims.addInner(dim, 1); | |||
m_joined.emplace_back(concatenate(min(coord), max(coord), dim)); | |||
m_joined.emplace_back(concat(std::vector{min(coord), max(coord)}, dim)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like the function parameter should be something like a pair of pointers (-> a range?) in order to avoid these kinds of allocations. Do we have access to ranges yet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not yet, next step is #2150.
docs/about/release-notes.rst
Outdated
Deprecations | ||
~~~~~~~~~~~~ | ||
|
||
* ``concatenate`` and ``groupby(..).concatenate`` are deprecated. Use ``concat`` and ``groupby(..).concatenate`` instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* ``concatenate`` and ``groupby(..).concatenate`` are deprecated. Use ``concat`` and ``groupby(..).concatenate`` instead. | |
* ``concatenate`` and ``groupby(..).concatenate`` are deprecated. Use ``concat`` and ``groupby(..).concat`` instead. |
return _call_cpp_func(_cpp.buckets.concatenate, self._obj, dim) | ||
raise RuntimeError("Reduction along all dims not supported yet.") | ||
out = _call_cpp_func(_cpp.buckets.concatenate, self._obj, other) | ||
return out |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please mark this function as deprecated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought I'd avoid a deprecation cycle for bins.concatenate
since it likely has extremely few users for now.
Note that the function name itself is actually not deprecated (yet). There are two overloads of bins.concatenate
(and unfortunately there where mixing code here). What I did here was to rename the first overload to bins.concat
. The second overload is semantically different (concatenating from 2 input arrays), and I have (I think) a better plan on how to refactor/improve it. I'll describe it in an issue.
"Either both or neither of the inputs must be bin edges."); | ||
} else if (a_.dims()[dim] == (dimsA.contains(dim) ? dimsA.at(dim) : 1)) { | ||
out.emplace(key, concatenate(a_, b[key], dim)); | ||
"Either all or none of the inputs must have bin edge coordinates."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a test for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, TEST(ConcatenateTest, fail_mixing_point_data_and_histogram) {
, but I'll extend it to also cover more than 2 inputs.
Fixes #2166.
I have chosen to keep behavior of
concatenate
, but deprecate it. Instead this providesconcat
, which matches the naming in the Python Array API specification (and alsoxarray
). I had considered this rename for a while (not least becauseconcatenate
is very long), by doing this now we avoid breaking things twice.Also fix/change so masks cannot be shared with inputs.