Fixes missing discard of duplicate secondary targets #544
I'll try to look at this closely in the next day. But, the filtering on https://github.com/desihub/desitarget/blob/master/py/desitarget/io.py#L608-L615 Are you perhaps running the functions from within Python rather than running the command-line executables (such as |
Ah, ok. Now I know where the filtering happens (I didn't check |
Thanks, again, for checking my work on this. I have a few requested changes before I merge...
py/desitarget/secondary.py
# APC Remove duplicate targetids from secondary-only targets
alldups = []
for _, dups in duplicates(scxtargs['TARGETID']):
    dups = np.delete(dups, 0)  # Retain the first
I don't think we want to retain the first duplicate among matching secondary targets; I think we want to retain the duplicate with the highest priority. Is it possible to rewrite this to retain the highest-priority secondary-only target?
Good catch, thanks -- will sort this out. Thanks for the review!
Done, except I wasn't quite sure what was meant in the docstring by "the first encountered". I wrote it to use the lowest index in the original table to break ties.
Oh, "the first encountered" here just means that if we find two secondaries with the same priority, we just keep the first one in the table/array/loop. In other words, we don't break ties in any clever way once we've broken ties to keep the highest-priority secondary. That's not critical though. The way we write everything back to the input files and generate consistent TARGETIDs should keep track of what was matched and how.
If you don't think the code does this, just remove the sentence in the docstring. As I noted, it's not critical how we break ties other than on priority.
OK, in that case I'll just use np.argmax directly, since that breaks ties on the lowest index in the list of duplicates. Done now.
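The tie-break rule discussed above (keep the highest-priority row per TARGETID, with ties going to the lowest index) can be sketched roughly as follows. This is a minimal illustration and not the code merged in this PR: the helper name keep_highest_priority and its plain-array inputs are hypothetical, and it relies only on np.argmax returning the first occurrence of the maximum.

```python
import numpy as np


def keep_highest_priority(targetid, priority):
    """Boolean mask keeping one row per TARGETID (hypothetical helper).

    For each duplicated TARGETID the row with the highest priority is
    kept; ties are broken by the lowest index in the input arrays.
    """
    targetid = np.asarray(targetid)
    priority = np.asarray(priority)
    keep = np.ones(len(targetid), dtype=bool)

    # A stable sort groups equal TARGETIDs together while preserving
    # the original row order inside each group.
    order = np.argsort(targetid, kind="stable")
    _, starts, counts = np.unique(
        targetid[order], return_index=True, return_counts=True
    )

    for start, count in zip(starts, counts):
        if count == 1:
            continue  # not duplicated, nothing to discard
        rows = order[start:start + count]
        # np.argmax returns the first maximum, i.e. the lowest original
        # index among rows that share the highest priority.
        best = rows[np.argmax(priority[rows])]
        keep[rows] = False
        keep[best] = True
    return keep


mask = keep_highest_priority([10, 20, 10, 30, 20, 10], [5, 1, 9, 2, 1, 9])
```

In this toy input, TARGETID 10 appears three times and the row at index 2 is kept (priority 9, earlier than the tied row at index 5); TARGETID 20 is tied on priority, so the row at index 1 is kept.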
Behaviour of select_secondary for this option now follows select_targets, i.e. a combined output is written as well as the BRIGHT and DARK files.
Follows docstring in finalize_secondary; ties on max priority broken by lowest index in target list.
The previous commit claimed to do this, but didn't....
@apcooper: I think this looks good, now. If you agree I'll merge this PR. Thanks for putting this effort into getting the secondary targets correct. It's important work and I'm swamped at the moment.
OK, I pushed one last commit to simplify the tie-break part. Ready to merge now.
One last request for good practice: Can you add a line or two to the changes.rst file to describe this PR? Then I'll merge. Thanks!
Done, hope that's enough info in |
Changes were good.
Fixes missing discard of duplicate secondary targets
@geordie666, I think this fixes the issue I raised by email: multiple rows in the output of select_secondary for (a) secondary targets that appear in more than one input catalog or (b) primary targets matched to more than one secondary.
Case (a) is fixed here:
desitarget/py/desitarget/secondary.py
Lines 648 to 654 in 203e392
Case (b) is fixed here:
desitarget/py/desitarget/secondary.py
Line 358 in 203e392
which was previously
desitarget/py/desitarget/secondary.py
Line 357 in 203e392
Question
desitarget/py/desitarget/secondary.py
Lines 547 to 554 in 203e392
I didn't see the PRIORITY_INIT=-1 filtering implemented anywhere and it's not how I've handled case (a) in this PR. Wondering if I missed something about the intention here?
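For readers unfamiliar with the sentinel approach asked about here, a flag-then-filter scheme might look something like the sketch below. This is an assumption about the general pattern only, not the actual desitarget implementation: the helper drop_flagged and the toy table are hypothetical, and the real filtering lives in the io.py lines linked in the conversation.

```python
import numpy as np


def drop_flagged(targs, sentinel=-1):
    """Return only rows whose PRIORITY_INIT is not the sentinel (hypothetical helper)."""
    return targs[targs["PRIORITY_INIT"] != sentinel]


# Toy structured array: TARGETIDs 2 and 4 are assumed to have been
# flagged as duplicates upstream by setting PRIORITY_INIT to -1.
targs = np.array(
    [(1, 100), (2, -1), (3, 90), (4, -1)],
    dtype=[("TARGETID", "i8"), ("PRIORITY_INIT", "i8")],
)

kept = drop_flagged(targs)
```

Under this pattern the duplicates stay in the in-memory table and are only discarded at write time, which would explain why calling the functions directly from Python could still show multiple rows.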