vv_intersect dim-check error #4
I believe the solution for the second paragraph there is to slice on
oof. Thanks @mohawk2 for the report (and for your stackoverflow answer!) -- this does indeed seem to be a bug, but I can't get my head around implicit threading/broadcasting for binary operations like Perhaps I can ask a stupid question to help my understanding: where are implicit thread dimensions created in the dimension list (are they Taking the "needle+haystack" example, I can imagine implicit threading being useful for searching for multiple needle-vectors in a single haystack-set, or to search for a single needle-vector in multiple haystack-sets, or even both. So given a
... but that doesn't work (and may well never have worked, probably due to my misunderstanding). Am I correct in assuming now that the proper (PDL-ish) signatures for this sort of thing would arise from adding the implicit thread dimension (K) to the end of the dimension-lists like so:
?
Yep, Note to self: this may also affect
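To illustrate the point just confirmed (implicit broadcast dimensions are the trailing ones, after the signature dimensions), here is a minimal sketch assuming a stock PDL install. It uses `sumover` and `==` only as stand-ins; neither is the `vv_intersect` signature discussed here.

```perl
use strict;
use warnings;
use PDL;

# sumover consumes only dim 0; any further dims are broadcast over.
my $x = sequence(3, 2, 4);           # dims (3,2,4)
my $s = $x->sumover;                 # dims (2,4)
print join(',', $s->dims), "\n";

# A binary operation broadcasts a single 3-vector against every
# vector of a (3,2) ndarray; the extra dim is again the trailing one.
my $v  = pdl(1, 2, 3);               # dims (3)
my $m  = pdl([1, 2, 3], [4, 5, 6]);  # dims (3,2)
my $eq = ($m == $v);                 # dims (3,2)
print join(',', $eq->dims), "\n";
```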
@mohawk2 based on the assumptions above (= implicitly thread over final dimensions rather than initial ones), I've overhauled the
Lots of changes! I'm not sure we're there yet, but please see if you agree (I added
The non-dummied results aren't right; the possibility of no dim being there needs to be accounted for by adding a "1" dim. The non-intersecting dummied result should give an empty as indicated. I am sure that my previous point about using I found the broadcasting thing a bit head-bending when I last-minute decided to add to that answer with the "multi" thing, so it's no surprise it's taking us a couple of goes. Your note above only allows for K needles and K haystacks, instead of L haystacks, which in my experience shows one is not there yet (I've battled this broadcasting thing quite a bit, and carry the psychological scars ;-). Here is my concept of this, hope it makes sense. My notation uses
I would observe that the tests you added appear superficially very thorough, but that using lots of slices and glues for both the inputs and the "expected" data is a flawed approach because one risks copying a flawed implementation into flawed tests for a false positive. For instance, around line 114 of the setops test, I honestly don't know what the slice/glue expression would produce. I'd suggest instead using explicit test data at least for outputs. Final observation; using
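A rough sketch of what "explicit test data for outputs" could look like, assuming Test::More and an installed PDL::VectorValued. The function under test is `vv_in_vsearch`, copied from later in this thread purely for illustration (it is not the setops test in question), and the expected values are the ones shown in that later session output, written as literals rather than derived with slices or glue.

```perl
use strict;
use warnings;
use PDL;
use PDL::VectorValued::Utils;
use Test::More;

# Function under test, copied from later in this thread.
sub vv_in_vsearch {
  my ($needle, $haystack) = @_;
  return ($haystack->dice_axis(1, $needle->vsearchvec($haystack)) == $needle)->bandover;
}

my $haystack = pdl([1, 2, 3], [4, 5, 6]);

# Each case: label, needle(s), expected result as a literal Perl array.
my @cases = (
  ['single hit',  pdl(1, 2, 3),              [1]],
  ['single miss', pdl(7, 8, 9),              [0]],
  ['miss + hit',  pdl([7, 8, 9], [1, 2, 3]), [0, 1]],
);

for my $case (@cases) {
  my ($label, $needle, $expected) = @$case;
  is_deeply(vv_in_vsearch($needle, $haystack)->unpdl, $expected, $label);
}
done_testing;
```

Because the expected side never passes through PDL slicing, a bug in the implementation cannot be mirrored into the expectation.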
Please note that I am unable to re-open the issue. I assume the repo settings are set to forbid that.
Thanks for reopening! (I gather it's a long-complained-about feature of GH that if the maintainer closes an issue, the OP cannot reopen it, only the maintainer). I thought of an actual (albeit dumb) application for the multi, "needles" with dummying: if the vector is a 3-vector of a colour, you can produce a mask of an image, of pixels with exactly those values.
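The colour-mask idea can be sketched with plain PDL broadcasting, as below. This deliberately does not use `vv_intersect`; it swaps in `==` plus `bandover` to show the intended result. The 2x2 image and the colour values are made-up illustration data, with the colour vector in dim 0.

```perl
use strict;
use warnings;
use PDL;

# Hypothetical RGB image, dims (3, width, height) = (3,2,2).
my $image = pdl([[255, 0, 0], [0, 255, 0]],
                [[255, 0, 0], [0, 0, 255]]);
my $colour = pdl(255, 0, 0);          # dims (3): the single "needle"

# == broadcasts the 3-vector over every pixel; bandover ANDs along
# dim 0, so a pixel is set only if all three components match.
my $mask = ($image == $colour)->bandover;   # dims (2,2)
print $mask;
```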
thanks for the illustrative reply! I'm not sure I've understood everything yet, so please be patient:
I think I've got this kinda working locally with RedoDimsCode (thanks for the hint!)
I believe this was already fixed in v1.0.16.
I see the logic of your suggestion and agree that it's correct, but I don't see how to make it work both efficiently & correctly without replacing the
The scalar-context convenience slicing in PMCode would reshape that using
... so the second (threaded) logical intersection of Using
... but we can't To invoke some Perl dogma: the "easy things" (here, intersecting exactly 2 sets) should be easy (as is the case with I still have to look deeper into it; just some thoughts for now...
Yep, you're right, it is, I'm just lazy.
For the intersection between a 3x2 and a 3x2x2x4, the I think you may be right that actually what's needed to find the size of the
The result I showed was from 1.0.16. Please try it ;-) If you think this stuff is confusing, I've just (FINALLY) fixed the gremlins in the TriD demos (PDLPorters/pdl#337). I decided I'd make the code generating the graphs' axes use ndarrays rather than Perl scalars and for-loops. Using some creative slicing, clumping, and dataflow, it all looked like it would work great. Segfault (the reverse dataflow destination has a NULL data pointer for some reason). More debugging! (luckily it's 100% reproducible)
Thanks for your patience :-) ... I think pretty much everything (with the exception of With respect to:
I had (and I just did again) ;-) ... I think there may be a misunderstanding. The implicit ("convenience") trimming code which returns an
If that's returning non-empty on your setup in scalar context, there's likely something else going on. The 3x1 shape of the first element returned in list context is expected behavior.
I won't have much time for this next week, so I've uploaded the current status quo to CPAN as PDL-VectorValued v1.0.17. Leaving the issue open (and omitting sensitive keywords from commit messages ;-)
It's a pity the slice only happens in scalar (EDIT not list) context.
(presumably a typo): only in scalar context, not in list context.
True, but
I think that v1.0.17 ought to work without the
sub vv_in {
  require PDL::VectorValued::Utils;
  my ($needle, $haystack) = @_;
  # list context: $c holds the intersecting vectors, $nc their count
  my ($c, $nc) = $needle->vv_intersect($haystack);
  $nc;
}
pdl> p $titi = pdl(1,2,3)
[1 2 3]
pdl> p $toto = pdl([1,2,3], [4,5,6])
[
 [1 2 3]
 [4 5 6]
]
pdl> p $notin = pdl(7,8,9)
[7 8 9]
pdl> p vv_in($titi, $toto)
1
pdl> p vv_in($notin, $toto)
0
[EDIT for posterity: corrected broken example code to something which actually works as intended]
As I added to the (3-year-old?) SO thread yesterday, I think a "better" way to do it would be to use
sub vv_in_vsearch {
  require PDL::VectorValued::Utils;
  my ($needle, $haystack) = @_;
  return ($haystack->dice_axis(1, $needle->vsearchvec($haystack)) == $needle)->bandover;
}
pdl> p vv_in_vsearch($titi, $toto)
[1]
pdl> p vv_in_vsearch($notin, $toto)
[0]
pdl> p vv_in_vsearch($toto, $toto)
[1 1]
pdl> p vv_in_vsearch($notin->cat($titi), $toto)
[0 1]
Thank you! I think your SO answer is better than mine, so if you feel this matter is resolved, please feel free to close this issue :-)
Thanks again for the report (and for the upvote :-); I've thrown out the redundant
I added a comment on the GH commit, since that was a convenient place to do so as I was reading it with interest. Thanks for your updates!
As shown in https://stackoverflow.com/a/71446817/3857002, there is what I believe is a logic error in the PMCode for the otherwise-excellent vv_intersect; the "dimension mismatch" error comes from the two dim(-2) being different, while it should just be dim(0) being compared. This error comes to light when broadcasting (formerly known as "threading").
There is also arguably an error in that the first element being returned should be an empty ndarray (vectorlength x 0) on no intersection, while the current implementation does a slice which gets this wrong. Also, using $nc->sclr will cause an error with current PDL if broadcasting, since it will be a multi-element ndarray.
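For reference, a small sketch of the two shapes this report is about, assuming a stock PDL install. It shows what a "vectorlength x 0" empty ndarray looks like, and one possible way to avoid calling `sclr` on a multi-element count. The variable names and the `$nc` values are hypothetical; this is not the module's actual PMCode.

```perl
use strict;
use warnings;
use PDL;

# An empty set of 3-vectors: dim 0 keeps the vector length, dim 1 is 0.
my $veclen = 3;
my $empty  = zeroes($veclen, 0);
print $empty->info, "\n";
print "empty\n" if $empty->isempty;

# Hypothetical intersection counts for two broadcast haystacks.
my $nc = pdl(1, 0);

# sclr is only meaningful for a single-element ndarray, so guard it
# and fall back to a plain Perl list when the count was broadcast.
my @counts = $nc->nelem == 1 ? ($nc->sclr) : $nc->list;
print "@counts\n";
```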