Skip to content
This repository was archived by the owner on Feb 2, 2024. It is now read-only.

Conversation

agayev
Copy link

@agayev agayev commented Apr 2, 2018

Currently, it assumes the array that is being indexed is a one-dimensional array of ints and appears to work. Please see if this makes sense. If it does, then I will extend it to support multi-dimensional arrays.

@coveralls
Copy link

coveralls commented Apr 2, 2018

Coverage Status

Coverage increased (+0.6%) to 83.384% when pulling 712e1e8 on agayev:permutation-array into 38b72f9 on IntelLabs:master.

arr_def = guard(get_definition, self.func_ir, index_var)
if (arr_def.op == 'call' and
self._call_table[arr_def.func.name] == ['permutation', 'random', np]):
self._meet_array_dists(lhs, rhs.value.name, array_dists)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

arr_def could be None or any node (not necessarily Expr). This if needs more checks otherwise it fails.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to check rhs.value to be multi-dimensional.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

call_table is gone now. need to use find_callname.

is_multi_dim = True

arr_def = guard(get_definition, self.func_ir, index_var)
if (arr_def.op == 'call' and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same issue with arr_def check before.

if fdef == ('permutation', 'np.random'):
if fdef == ('permutation', 'numpy.random'):
assert len(args) == 1
assert self.typemap[args[0].name] == types.int64
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs removed.

def _run_permutation_array_index(self, lhs, rhs, idx):
scope, loc = lhs.scope, lhs.loc
alloc = mk_alloc(self.typemap, self.calltypes, lhs,
tuple(self._array_sizes[lhs.name]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chunk size of first dimention: self._array_counts[lhs.name][0].

@agayev
Copy link
Author

agayev commented Apr 3, 2018

Fixed the chunk size. You were wrong about the code path not being exercised. It was being exercised, except when you execute your shuffle algorithm with one node, you end up sending all of the elements to yourself in sorted order. The missing thing was shuffle after MPI_Alltoallv. I added a hack to receive make up a PRNG and shuffle with that, but the proper way is to obtain the seed from numba and use that.

@agayev
Copy link
Author

agayev commented Apr 4, 2018

Dead code eliminator removes np.random.seed() call, which makes it hard to write test for this. Fixed it in numba/numba#2873.

}

std::vector<int> find_dest_ranks(int rank, int num_ranks,
int64_t *idx, int idx_len)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs doc.

MPI_INT64_T, MPI_COMM_WORLD);

std::mt19937 rng(0);
std::shuffle(ilhs, ilhs + dest_ranks.size(), rng);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shuffle? idx should determine the order.

B = np.random.permutation(n)
A = A[B]
return A.sum()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a test where two arrays are permuted with the same index array, to make sure they are permuted in the same way.


MPI_Alltoallv(send_buf.data(), send_counts.data(), send_disps.data(),
MPI_INT64_T, ilhs, recv_counts.data(), recv_disps.data(),
MPI_INT64_T, MPI_COMM_WORLD);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs support for non-int types.

@agayev
Copy link
Author

agayev commented Apr 4, 2018

The last patch addresses all the issues, except the unit tests. You can review this, while I work on the unit tests.

// |num_ranks|, finds the destination ranks of indices of the |rank|. For
// example, if |rank| is 1, |num_ranks| is 3, |p_len| is 12, and |p| is the
// following array [ 9, 8, 6, 4, 11, 7, 2, 3, 5, 0, 1, 10], the function returns
// [2, 0, 1, 0].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be [0,2,0,1]?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. Fixed.

void apply_permutation(unsigned char *v, int64_t elem_size,
std::vector<size_t>& p)
{
std::vector<unsigned char> tmp(elem_size);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tmp not used.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

std::vector<unsigned char> tmp(elem_size);
for (size_t i = 0; i < p.size(); ++i) {
auto current = i;
while (i != p[current]) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loop needs doc/reference.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

}

MPI_Alltoallv(send_buf.data(), send_counts.data(), send_disps.data(),
MPI_INT64_T, lhs, recv_counts.data(), recv_disps.data(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs MPI data type of size elem_size.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.


auto begin = p + hpat_dist_get_start(p_len, num_ranks, rank);
auto p1 = arg_sort(begin, dest_ranks.size());
auto p2 = arg_sort(p1.data(), dest_ranks.size());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs explanation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

@ehsantn
Copy link
Contributor

ehsantn commented Apr 5, 2018

Looks good. Merging.

@ehsantn ehsantn merged commit 407ee24 into IntelPython:master Apr 5, 2018
@agayev agayev deleted the permutation-array branch April 6, 2018 19:51
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants