Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Copying a slice to another slice in the same array fails silently #16

Closed
manopapad opened this issue May 6, 2021 · 9 comments
Closed
Assignees
Labels
bug Something isn't working in progress

Comments

@manopapad
Copy link
Contributor

Bug report due to @piyueh

The following code, when run with -lg:numpy:test, prints [False], indicating that the slice has not been updated:

from legate import numpy
a = numpy.random.random((3, 3))
a[:, 0] = a[:, 2]
print(numpy.allclose(a[:, 0], a[:, 2]))

After some digging I found that we skip copies between sub-regions if they're backed by the same field, which is actually only safe if the slices are equivalent: https://github.com/nv-legate/legate.numpy/blob/2b460c5dfdd60b673e37e25231bf625fdf3ead0e/legate/numpy/deferred.py#L101-L105

If we simply skip this check then the copy ends up happening through a CopyTask, which works with subregions of the same base region.

However, the runtime errors out if the two slices overlap, e.g. if we do a[0,0:2] = a[0,1:3] (vanilla NumPy accepts this, and does the expected thing). We should at least check for overlaps in python and produce a reasonable error message.

We also want to add a case for this to the test suite.

@manopapad manopapad added the bug Something isn't working label May 6, 2021
@manopapad manopapad self-assigned this May 6, 2021
piyueh added a commit to piyueh/TorchSWE that referenced this issue May 7, 2021
@lightsighter
Copy link
Contributor

lightsighter commented May 8, 2021

Here's another variation on this problem that we need to handle:

a[ [4,2,5,1,3] ] = b[2:7]

Requires a destination indirect (scatter) copy with an additional transform indirection on the source.

@manopapad
Copy link
Contributor Author

The problem here is that we are missing a translation from NumPy (local) index space (i.e. every view's indices start from 0) to Legion (global) index space (i.e. every view's indices start wherever that view is placed within the base array) when emitting a scatter copy.

The most general case of this pattern (writing through an advanced view) also includes a view on the LHS:

a[10:15][ [4,2,5,1,3] ] = b[2:7]

I believe the best way to handle this case is to set up a full scatter-gather copy as follows:
(1) materialize a field to store the 0:5 -> 2:7 mapping, use that as src_indirect
(2) manually apply the 0:5 -> 10:15 transformation on top of [4,2,5,1,3] to produce [14,12,15,11,13], use that as dst_indirect

The work on StanfordLegion/legion#705 would allow us to avoid materializing an explicit field for (1).

@piyueh
Copy link

piyueh commented May 29, 2021

Hi, is it possible to raise a NotImplementedError when application code uses this kind of in-array copying? It seems it takes time to resolve it. However, this kind of copying is very common in numerical methods for solving PDEs, for example, in periodic boundary conditions and 1st-order approximation of Neumann boundary conditions. So I think raising an error to notify application developers is a better short-term solution.

Actually, even in the last step of the 12-step to CFD (the one used in GTC presentation), we also have this type of copying in pressure_poisson_periodic. (However, in this particular case, the pressure field does not change, so a failure in the copying does not affect the solution's correctness.)

Another example is the code shown in issue #29.

@manopapad
Copy link
Contributor Author

Yes, I meant to tackle this properly after fixing #12, but that's taking longer than expected. I will add some error messages as a stopgap.

@manopapad
Copy link
Contributor Author

As of b997647 the original testcase should be working. The case of overlapping sub-arrays is not working, but we will emit an error message about it. We also detect cases where a copy requires two indirections on a dimension, and error out.

@manopapad
Copy link
Contributor Author

My last change uncovered that some of our tests are doing copies between overlapping sub-arrays, reverting until I can fix those.

@manopapad manopapad reopened this Jun 3, 2021
@manopapad
Copy link
Contributor Author

The original testcase should be properly working now with 5ec5099. @piyueh please let me know if it works for you. If it does I will update the issue to track work on relevant features that are still missing (handling overlapping regions in copies, handling views on advanced copies).

@piyueh
Copy link

piyueh commented Jun 4, 2021

Yes, I confirm the original test case is working now. Thanks!

@manopapad
Copy link
Contributor Author

I have opened issues #39, #40 and #41 to track work on the remaining issues, closing this one.

piyueh added a commit to piyueh/TorchSWE that referenced this issue Jun 11, 2021
piyueh added a commit to piyueh/TorchSWE that referenced this issue Aug 25, 2021
manopapad pushed a commit to manopapad/cunumeric that referenced this issue Feb 14, 2024
* Convolve implementation

* Bump the Legate core commit
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working in progress
Projects
None yet
Development

No branches or pull requests

3 participants