
Features/174 concatenate #319

Merged
merged 29 commits into from Jul 24, 2019

Conversation

coquelin77
Member

Closes issue #174.

@codecov-io

codecov-io commented Jul 1, 2019

Codecov Report

Merging #319 into master will increase coverage by 0.08%.
The diff coverage is 98.76%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #319      +/-   ##
==========================================
+ Coverage    96.8%   96.88%   +0.08%     
==========================================
  Files          53       53              
  Lines        8347     8744     +397     
==========================================
+ Hits         8080     8472     +392     
- Misses        267      272       +5
Impacted Files Coverage Δ
heat/core/factories.py 100% <100%> (ø) ⬆️
heat/core/tests/test_manipulations.py 100% <100%> (ø) ⬆️
heat/core/types.py 90.86% <100%> (+0.3%) ⬆️
heat/core/dndarray.py 95.05% <100%> (-0.01%) ⬇️
heat/core/manipulations.py 97.67% <96.66%> (-1.01%) ⬇️

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bd28e9d...ac3d0b7.

coquelin77 added 8 commits July 16, 2019 08:04
Previously there was a bug where it would crash in __len__ of DNDarray when generating a new one.
This will also handle the generation of a new array with the same split.
A reduction buffer is used to determine whether the array can be formed.
The new code generates the proper array gshape (previously the reduction buffer would sometimes be off by one).
promote_types is used to determine whether the arrays have the proper dtype; if they do not, they are cast to the proper type.
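The promotion step described in the last commit message can be sketched with NumPy's promotion rules; heat's actual implementation uses its own types module, so `cast_to_common_type` here is a hypothetical helper for illustration only:

```python
import numpy as np

def cast_to_common_type(a, b):
    """Cast two arrays to their common promoted dtype before concatenation.

    Illustrative sketch using np.promote_types; heat resolves the common
    dtype through its own types module rather than NumPy.
    """
    common = np.promote_types(a.dtype, b.dtype)
    if a.dtype != common:
        a = a.astype(common)
    if b.dtype != common:
        b = b.astype(common)
    return a, b

a, b = cast_to_common_type(np.arange(3, dtype=np.int32),
                           np.ones(3, dtype=np.float64))
print(a.dtype, b.dtype)  # both promoted to float64
```

Casting before the concatenation keeps the result dtype deterministic regardless of argument order.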
@coquelin77
Member Author

@Markus-Goetz I made some changes to types and factories.array; these might interest you. I couldn't debug that reduction buffer anymore, so I left it and call another allreduce to get the proper shape after the raise statement.

ClaudiaComito
ClaudiaComito previously approved these changes Jul 23, 2019
Contributor

@ClaudiaComito ClaudiaComito left a comment


Hi Daniel @coquelin77,

I ran a few tests and as far as I can tell it works fine (except for the negative axis case).

I see that the concatenated tensor inherits the split value from the input tensors and stays distributed. That makes sense, but it's not what e.g. maximum() and minimum() do (and probably other reduction operations as well; I still have to check). If we agree that's what we want, I'll make the relevant changes.

if not isinstance(arr0, dndarray.DNDarray) or not isinstance(arr1, dndarray.DNDarray):
    raise TypeError('Both arrays must be DNDarrays')
if not isinstance(axis, int):
    raise TypeError('axis must be an integer, currently: {}'.format(type(axis)))
Contributor


I suppose you have a good reason not to call sanitize_axis? At the moment ht.concatenate fails if axis<0.

Example with axis > 0, which works fine:

a = ht.array(ht.random.randn(2, 4), split=1)
b = ht.array(ht.random.randn(2, 7), split=1)
c = ht.concatenate((a, b), axis=1)

The equivalent with axis < 0 doesn't run; see below:

a = ht.array(ht.random.randn(2, 4), split=1)
b = ht.array(ht.random.randn(2, 7), split=1)
c = ht.concatenate((a, b), axis=-1)

mpirun --use-hwthread-cpus -n 2 -tag-output python local_test.py
[1,0]:Traceback (most recent call last):
[1,0]: File "local_test.py", line 8, in
[1,0]: c = ht.concatenate((a, b), axis=-1)
[1,0]: File "/Users/c.comito/HAF/heat/heat/core/manipulations.py", line 108, in concatenate
[1,0]: ' {}, {}'.format(arr0.gshape, arr1.gshape))
[1,0]:ValueError: Arrays cannot be concatenated, gshapes must be the same in every axis except the selected axis: (2, 4), (2, 7)
[1,1]:Traceback (most recent call last):
[1,1]: File "local_test.py", line 8, in
[1,1]: c = ht.concatenate((a, b), axis=-1)
[1,1]: File "/Users/c.comito/HAF/heat/heat/core/manipulations.py", line 108, in concatenate
[1,1]: ' {}, {}'.format(arr0.gshape, arr1.gshape))
[1,1]:ValueError: Arrays cannot be concatenated, gshapes must be the same in every axis except the selected axis: (2, 4), (2, 7)
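The failure above happens because axis=-1 is never mapped to its non-negative equivalent before the per-dimension gshape comparison. A minimal sketch of that normalization, mirroring what a sanitize_axis-style check would do (this helper is illustrative, not heat's actual code):

```python
def sanitize_axis(shape, axis):
    """Map a possibly negative axis to its non-negative equivalent.

    Hypothetical helper for illustration; heat's sanitize_axis lives in
    the library itself and may differ in detail.
    """
    if not isinstance(axis, int):
        raise TypeError('axis must be an integer, currently: {}'.format(type(axis)))
    if axis < 0:
        axis += len(shape)
    if not 0 <= axis < len(shape):
        raise ValueError('axis {} out of bounds for shape {}'.format(axis, shape))
    return axis

# axis=-1 on a 2D shape resolves to axis=1, so the gshape comparison
# would then correctly skip the concatenation dimension
print(sanitize_axis((2, 4), -1))  # 1
```

With this normalization applied up front, the (2, 4) / (2, 7) example concatenates along the last dimension instead of raising.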

Member Author


The reason I did this was to minimize communication. I feel that it makes intuitive sense to inherit the split axis in this case. I do not handle the separate-split case here, as it would require more overhead; instead, I leave it to the user to decide whether they want to resplit the data, and if so, they can do it themselves.
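The shape logic behind this concatenation (matching global shapes in every dimension except the concatenation axis, which is summed) can be sketched independently of the split handling; `concat_gshape` below is an illustrative name, not heat's internal function:

```python
def concat_gshape(shape0, shape1, axis):
    """Compute the global shape of concatenating two arrays along `axis`.

    Sketch of the shape check and result computation; heat additionally
    handles the distributed (split) bookkeeping, which is omitted here.
    """
    if len(shape0) != len(shape1):
        raise ValueError('arrays must have the same number of dimensions')
    for dim, (s0, s1) in enumerate(zip(shape0, shape1)):
        if dim != axis and s0 != s1:
            raise ValueError(
                'gshapes must be the same in every axis except the selected '
                'axis: {}, {}'.format(shape0, shape1))
    out = list(shape0)
    out[axis] = shape0[axis] + shape1[axis]
    return tuple(out)

# the working example from the review: (2, 4) and (2, 7) along axis 1
print(concat_gshape((2, 4), (2, 7), axis=1))  # (2, 11)
```

Keeping the result split along the same axis as the inputs means each process only holds its own slice of the summed dimension, which is why no redistribution (and hence no extra communication) is needed.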

@coquelin77 coquelin77 merged commit 474d74e into master Jul 24, 2019
@coquelin77 coquelin77 deleted the features/174-concatenate branch July 24, 2019 13:54