Features/174 concatenate #319
Conversation
…PI error but no message is thrown. commit before comp dies
bug: an Isend was started but no data was actually sent; this left an extra send in flight that caused other commands to fail later on (illustrated below)
…r this file, here: revert array to standard
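For illustration only, a minimal mpi4py sketch (not the heat code) of the failure mode described in the Isend commit above: a nonblocking send that is posted but never explicitly matched leaves a stray message in flight, which a later, unrelated receive then consumes. The names and tags here are hypothetical.

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    stray = comm.isend('stale', dest=1, tag=0)  # posted, but never explicitly received
    stray.wait()
    comm.send('payload', dest=1, tag=1)
elif rank == 1:
    # ANY_TAG matches the stray tag-0 message first (messages between the same
    # pair of ranks are non-overtaking), so 'payload' is not what arrives here
    data = comm.recv(source=0, tag=MPI.ANY_TAG)
    print(data)  # prints 'stale', not 'payload'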
Codecov Report
@@            Coverage Diff             @@
##           master     #319      +/-   ##
==========================================
+ Coverage    96.8%   96.88%   +0.08%
==========================================
  Files          53       53
  Lines        8347     8744     +397
==========================================
+ Hits         8080     8472     +392
- Misses        267      272       +5
Continue to review full report at Codecov.
…tz-analytics/heat into features/174-concatenate
previously there was a bug where generating a new DNDarray would crash in __len__ of DNDarray; this also handles the generation of a new array with the same split
a reduction buffer is used to determine whether the array can be formed; the new code generates the proper gshape (previously the reduction buffer would sometimes be off by one)
promote_types is used to determine whether the arrays have the proper dtype; if they do not, they are cast to the proper type (sketched below)
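For reference, a minimal sketch of the promotion step described above, assuming ht.promote_types mirrors numpy.promote_types; the surrounding code is illustrative, not the PR's actual implementation:

import heat as ht

a = ht.ones((2, 3), dtype=ht.int32)
b = ht.ones((2, 3), dtype=ht.float32)

out_dtype = ht.promote_types(a.dtype, b.dtype)  # common dtype for both inputs
if a.dtype != out_dtype:
    a = a.astype(out_dtype)  # cast only when needed
if b.dtype != out_dtype:
    b = b.astype(out_dtype)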
@Markus-Goetz I made some changes to types and factories.array; these might interest you. I couldn't debug that reduction buffer any further, so I left it and call another Allreduce to get the proper shape after the raise statement.
Hi Daniel @coquelin77,
I ran a few tests and as far as I can tell it works fine (except for the negative axis case).
I see that the concatenated tensor inherits the split value from the input tensors and is distributed. That makes sense, but it's not what e.g. maximum() and minimum() do (and probably other reduction operations as well; I still have to check), so if we agree that's what we want, I'll make the relevant changes.
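For instance, a sketch of the inheritance behaviour described above (the expected output assumes the result inherits the inputs' split, as stated):

import heat as ht

a = ht.zeros((2, 4), split=1)
b = ht.zeros((2, 7), split=1)
c = ht.concatenate((a, b), axis=1)
print(c.split, c.gshape)  # expected: 1 (2, 11), i.e. still distributed along axis 1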
if not isinstance(arr0, dndarray.DNDarray) or not isinstance(arr1, dndarray.DNDarray):
    raise TypeError('Both arrays must be DNDarrays')
if not isinstance(axis, int):
    raise TypeError('axis must be an integer, currently: {}'.format(type(axis)))
I suppose you have a good reason not to call sanitize_axis? At the moment ht.concatenate fails if axis<0.
Example with axis > 0, which works fine:
a = ht.array(ht.random.randn(2, 4), split=1)
b = ht.array(ht.random.randn(2, 7), split=1)
c = ht.concatenate((a, b), axis=1)
The equivalent with axis < 0 doesn't run; see the traceback below:
a = ht.array(ht.random.randn(2, 4), split=1)
b = ht.array(ht.random.randn(2, 7), split=1)
c = ht.concatenate((a, b), axis=-1)
mpirun --use-hwthread-cpus -n 2 -tag-output python local_test.py
[1,0]:Traceback (most recent call last):
[1,0]: File "local_test.py", line 8, in <module>
[1,0]: c = ht.concatenate((a, b), axis=-1)
[1,0]: File "/Users/c.comito/HAF/heat/heat/core/manipulations.py", line 108, in concatenate
[1,0]: ' {}, {}'.format(arr0.gshape, arr1.gshape))
[1,0]:ValueError: Arrays cannot be concatenated, gshapes must be the same in every axis except the selected axis: (2, 4), (2, 7)
[1,1]:Traceback (most recent call last):
[1,1]: File "local_test.py", line 8, in <module>
[1,1]: c = ht.concatenate((a, b), axis=-1)
[1,1]: File "/Users/c.comito/HAF/heat/heat/core/manipulations.py", line 108, in concatenate
[1,1]: ' {}, {}'.format(arr0.gshape, arr1.gshape))
[1,1]:ValueError: Arrays cannot be concatenated, gshapes must be the same in every axis except the selected axis: (2, 4), (2, 7)
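For what it's worth, a minimal sketch of the normalization that sanitize_axis presumably performs for negative axes, following numpy's convention (the helper name here is illustrative):

def normalize_axis(axis, ndim):
    # map a negative axis to its positive equivalent, e.g. -1 -> ndim - 1
    if not -ndim <= axis < ndim:
        raise ValueError('axis {} is out of bounds for a {}-dimensional array'.format(axis, ndim))
    return axis % ndim

print(normalize_axis(-1, 2))  # 1, so axis=-1 should behave like axis=1 above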
The reason the result inherits the split axis from the inputs is to minimize communication; I feel that inheriting it makes intuitive sense in this case. I do not handle the case where the inputs have different splits here, as it would require more overhead; instead, I leave it to the user to decide whether they want to resplit the data, and if so, they can do it themselves.
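For example, redistributing afterwards could look like this (a sketch; assumes ht.resplit is available for moving data to a new split axis):

import heat as ht

a = ht.zeros((2, 4), split=1)
b = ht.zeros((2, 7), split=1)
c = ht.concatenate((a, b), axis=1)  # result inherits split=1, no extra communication
c = ht.resplit(c, 0)                # the user explicitly opts into the redistribution cost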
closes issue #174