Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unify merge and concatenate #3234

Open
5 of 9 tasks
kaedonkers opened this issue Nov 29, 2018 · 9 comments
Open
5 of 9 tasks

Unify merge and concatenate #3234

kaedonkers opened this issue Nov 29, 2018 · 9 comments
Labels
Dragon 🐉 https://github.com/orgs/SciTools/projects/19?pane=info Feature: Merge/Concatenate VATools MetOffice User Group

Comments

@kaedonkers
Copy link
Member

kaedonkers commented Nov 29, 2018

There have been many comments through the years about combining the function of merge and concatenate to reduce the pain of turning many cubes into one. The main argument is that, while each function is doing a distinctly different thing, it should not matter from a user's point of view whether the cubes are merging from a scalar coordinate into a new dimension, or concatenating cubes onto an existing common dimension.

This issue acts as a placeholder to bring these issues together and assign this change to milestone Iris v3.0.
The (potentially) related issues are listed below:

Please edit this list as appropriate.

@kaedonkers
Copy link
Member Author

Ping @dkillick

@DPeterK DPeterK added this to the v3.0.0 milestone Nov 29, 2018
@bjlittle
Copy link
Member

@kaedonkers Just seeking clarification here... Are you proposing that merge and concatenate are combined in some way? Or are you just capturing all the issues regarding merge and concatenate in one place?

@DPeterK
Copy link
Member

DPeterK commented Nov 29, 2018

@bjlittle I'd love to see merge and concatenate become the same operation. Making a distinction between them is somewhat arbitrary, confusing for Iris users and perhaps primarily there to make life easy for devs and not users. Also merge is slow compared to concatenate, so if we could pull the functionality of concatenate in line with merge then we could remove merge and run everything through concatenate instead.

@bjlittle
Copy link
Member

@dkillick I think the issue is that you can get different results given the order that you use i.e. merge + concatenate vs concatenate + merge or some sort of recursive combination of the two.

That's why we originally left it to the user to decide, rather than iris doing something magical.

Keeping them separate also feeds into the custom pipeline approach of processing data...

I'm not against bringing merge + concatenate together, but we'd need to think about that carefully.

Concatenate doesn't create new dimensions, it only works with existing dimensions, whereas merge will create new dimensionality in the cube (just making the point, rather than teaching you to suck 🥚 's)

Otherwise, we could have concatenate with automatic scalar coordinate self-promotion to a new dimension, which might do the trick.... I had an experimental branch that did this years ago, but got shot down at the time. Just thought I'd resurrect that notion as food for thought... 🤔

@rcomer
Copy link
Member

rcomer commented Nov 29, 2018

Would concatenate be able to cope with examples like #2761?

@DPeterK
Copy link
Member

DPeterK commented Nov 29, 2018

@rcomer as it stands, no, because I don't think concatenate can handle scalar anything. The intention here is that if we unified merge and concatenate then the single resultant thing would be able to handle scalars (and get right what merge currently isn't, as per your example).

@bjlittle
Copy link
Member

@kaedonkers @dkillick @rcomer See here... couldn't resist 😉

@bjlittle
Copy link
Member

bjlittle commented Nov 30, 2018

So here's one possible unification proposal... note that, I'm initially aiming to build on what we have-ish, without a massive rewrite, although I'm not discounting that as a possibility.

Let's assume that the current recusive stratagem of concatenate is reasonably sound as a foundation from which to move forward, glazing over the obvious warts. So for me, the two big questions at this point are:

  1. how do we extend the dimensionality?
  2. how do we seed the dimensionality order efficiently?

Up front, I'm going to suppress the temptation to bias the design with any thoughts regarding optimisation through parallelism or sparse constructions of hypercubes. Let's keep it reasonably simple. For me, that's always a good place to start...

So going back to my comment above, I've previously toyed with the concept of concatenate automatically promoting a scalar coordinate to be a new singleton dimension. Then using the naive brute force approach inherent within concatenate, to automatically promote like scalar coordinates on candidate cubes in order to concatenate over the newly promoted dimension.

That's fine and dandy, but this only works for simple candidate cubes that differ by the promoted scalar coordinate. This simple approach doesn't work for a simple merge example such as:

A = [1, 1, 2, 2]
B = [3, 4, 3, 4]

Anyways, scalar coordinate promotion as a mechanism within concatenate addresses question [1.]

The interesting part is addressing question [2.]... and the leap of faith here is borrowing/leaning on the dimensionality smarts of merge to then seed the scalar coordinate promotion of concatenate in order to efficiently direct it to the desired hypercube dimensionality. However, the main point here is that the current concatenation logic requires to deal appropriately with duplicate scalar values on the newly promoted dimension i.e. A = [1, 1, 2, 2] becomes A = [1, 2] whilst ignoring the yet to be promoted scalar coordinates i.e. B = [3, 4, 3, 4] on the candidate cubes.

Given that, we then simply rinse and repeat, by promoting scalar coordinate B, which is the degenerate case for concatenate. Voilà?

The devil is most definitely in the detail, but there may be mileage with this approach... then again, may be not. But it does (to me) suggest there may be a ray of hope for possible unification...

@rcomer
Copy link
Member

rcomer commented Dec 2, 2018

Looks like #2592 and #512 are sub-issues of #1987.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Dragon 🐉 https://github.com/orgs/SciTools/projects/19?pane=info Feature: Merge/Concatenate VATools MetOffice User Group
Projects
Status: 📌 Prioritised
Status: No status
Development

No branches or pull requests

7 participants