Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes missing discard of duplicate secondary targets #544

Merged
merged 12 commits into from Oct 4, 2019

Conversation

apcooper
Copy link
Contributor

@geordie666 , I think this fixes the issue I raised by email: multiple rows in the output of select_secondary for (a) secondary targets that appear in more than one input catalog or (b) primary targets matched to more than one secondary.

Case (a) is fixed here:

alldups = []
for _, dups in duplicates(scxtargs['TARGETID']):
dups = np.delete(dups, 0) # Retain the first
alldups.append(dups)
alldups = np.hstack(alldups)
log.debug("Removing {} duplicate targetids".format(len(alldups)))
scxtargs = np.delete(scxtargs,alldups)

Case (b) is fixed here:

for _, dups in duplicates(primtargs['TARGETID']):

which was previously

# for _, dups in duplicates(primids):

Question

- Secondaries without `OVERRIDE` are also matched to themselves
Such matches are given the same `TARGETID` (that of the primary
if they match a primary) and the bitwise or of `SCND_TARGET` and
`OBSCONDITIONS` bits across matches. The highest `PRIORITY_INIT`
is retained, and others are set to -1. Only secondaries with
priorities that are not -1 are written to the main file. If
multiple matching secondary targets have the same (highest)
priority, the first one encountered retains its `PRIORITY_INIT`

I didn't see the PRIORITY_INIT=-1 filtering implemented anywhere and it's not how I've handled case (a) in this PR. Wondering if I missed something about the intention here?

@geordie666
Copy link
Contributor

I'll try to look at this closely in the next day. But, the filtering on PRIORITY_INIT of -1 is done in the write-to-file, here:

https://github.com/desihub/desitarget/blob/master/py/desitarget/io.py#L608-L615

Are you perhaps running the functions from within Python rather than running the command-line executables (such as select_secondary)? If so, could you see if your original issue persists when running select_secondary instead of calling the secondary.select_secondary() function directly?

@apcooper
Copy link
Contributor Author

apcooper commented Oct 1, 2019

Ah, ok. Now I know where the filtering happens (I didn't check io.write_secondary) I'll just set PRIORITY_INIT=-1 rather than filtering in finalize_secondary.

Copy link
Contributor

@geordie666 geordie666 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, again, for checking my working on this. I have a few requested changes before I merge...

bin/select_secondary Show resolved Hide resolved
bin/select_secondary Outdated Show resolved Hide resolved
py/desitarget/secondary.py Outdated Show resolved Hide resolved
# APC Remove duplicate targetids from secondary-only targets
alldups = []
for _, dups in duplicates(scxtargs['TARGETID']):
dups = np.delete(dups, 0) # Retain the first
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to retain the first duplicate among matching secondary targets, I think we want to retain the duplicate with the highest priority. Is it possible to rewrite this to retain the highest priority secondary-only target?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thanks -- will sort this out. Thanks for the review!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, except I wasn't quite sure what was meant in the docstring by "the first encountered" . I wrote it to use the lowest index in the original table to break ties.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, "the first encountered" here just means that if we find two secondaries with the same priority, we just keep the first one in the table/array/loop. In other words, we don't break ties in any clever way once we've broken ties to keep the highest-priority secondary. That's not critical though. The way we write everything back to the input files and generate consistent TARGETIDs should keep track of what was matched and how.

If you don't think the code does this, just remove the sentence in the docstring. As I noted, it's not critical how we break ties other than on priority.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, in that case I'll just use np.argmax directly, since that breaks ties on the lowest index in the list of duplicates. Done now.

Behaviour of select_secondary for this option now follows select_targets,
i.e. a combined output is written as well as the BRIGHT and DARK files.
Follows docstring in finalize_secondary; ties on max priority
broken by lowest index in target list.
The previous commit claimed to do this, but didn't....
@geordie666
Copy link
Contributor

@apcooper: I think this looks good, now. If you agree I'll merge this PR.

Thanks for putting this effort into getting the secondary targets correct. It's important work and I'm swamped at the moment.

@apcooper
Copy link
Contributor Author

apcooper commented Oct 4, 2019

OK, I pushed one last commit to simplify the tie-break part. Ready to merge now.

@geordie666
Copy link
Contributor

One last request for good practice:

Can you add a line or two to the changes.rst file to describe this PR. Then I'll merge. Thanks!

@apcooper
Copy link
Contributor Author

apcooper commented Oct 4, 2019

Done, hope that's enough info in changes.rst.

Copy link
Contributor

@geordie666 geordie666 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes were good.

@geordie666 geordie666 merged commit 8fd1513 into master Oct 4, 2019
@geordie666 geordie666 deleted the apc_secondary_fix branch October 4, 2019 14:36
qmxp55 pushed a commit to qmxp55/desitarget that referenced this pull request Feb 12, 2020
qmxp55 pushed a commit to qmxp55/desitarget that referenced this pull request Feb 12, 2020
Fixes missing discard of duplicate secondary targets
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants