DM-37547: HSC exposure ID generator produces invalid IDs in 2023 #46
        self.assertEqual(last_group, int(20110101) * 100_000)

    def test_exposure_id_hsc(self):
        group = "2023011100026"
Why not just a number, avoiding the later int()?
We get a lot of bugs related to integers being assigned as groups and then passed around like that, so I've insisted on other reviews that we always be strict with the types there. I could hardly make an exception for myself, especially since part of the problem is that we tend to think of group IDs as integers when formally they're not.
Broad queries for camera now always fail, so I picked the skymap as another "simple" dataset from the test data.
This table can be used by exposure ID generation algorithms to ensure that their output IDs are always valid.
493072e to c501677
@ktlim, I implemented the revised generator (thanks for catching the 1.75-year caveat!). I'm still testing, but so far output looks reasonable:
Can you take another look and let me know if the code meets with your approval?
Everything looks awesome, except for the commit message, which still says that monotonicity is sacrificed when it no longer is!
The new ID generation algorithm is guaranteed to return an HSC ID in the correct range (0-21,474,800), and is less likely to have collisions from over-uploading. It is only valid until September 2024, but we hope that will be enough.
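For readers who want a feel for how a range-confined generator like this can work, here is a minimal sketch. It is not the code from this PR. It assumes a group is a 13-character string made of a YYYYMMDD date followed by a 5-digit counter, as in the test above. The names `make_hsc_exposure_id`, `EPOCH`, and `SLOTS_PER_NIGHT` and their values are hypothetical, so the validity window of this sketch differs from the one quoted in the commit message.

```python
from datetime import date

HSC_MAX_ID = 21_474_800      # upper bound quoted in the commit message
EPOCH = date(2023, 1, 1)     # hypothetical: first night that must be encodable
SLOTS_PER_NIGHT = 30_000     # hypothetical: number of IDs reserved per night


def make_hsc_exposure_id(group: str) -> int:
    """Map a group such as "2023011100026" to an integer in [0, HSC_MAX_ID].

    The result increases with the group as long as the inputs stay in range.
    """
    # Groups are formally strings, so integers are rejected outright.
    if not isinstance(group, str):
        raise TypeError(f"group must be a str, got {type(group).__name__}")
    if len(group) != 13 or not group.isdigit():
        raise ValueError(f"expected YYYYMMDD plus a 5-digit counter, got {group!r}")

    night = date(int(group[:4]), int(group[4:6]), int(group[6:8]))
    counter = int(group[8:])

    night_index = (night - EPOCH).days
    if night_index < 0:
        raise ValueError(f"{group!r} is earlier than the epoch {EPOCH}")
    if counter >= SLOTS_PER_NIGHT:
        raise ValueError(f"counter {counter} does not fit in one night's slots")

    exposure_id = night_index * SLOTS_PER_NIGHT + counter
    if exposure_id > HSC_MAX_ID:
        raise ValueError(f"{group!r} is past the date range this scheme can encode")
    return exposure_id
```

With these constants, `make_hsc_exposure_id("2023011100026")` returns 300026, and a group dated 2025-01-01 raises `ValueError` rather than producing an out-of-range ID. Failing loudly at the boundary is a design choice of the sketch, not something the PR states.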
c501677 to 1d4b3d6
While we shouldn't need refresh for the local repo, because there is only one Butler for it, the central repo has one Butler per worker, and any individual worker may be long- or short-lived. While it's not a proper synchronization API, call `central_butler.registry.refresh` before operations that read from or write to the central repo.
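The refresh-before-use pattern described in this commit message could be sketched as below. This is an illustration only: `with_refreshed_registry` is a hypothetical helper, not code from this PR, and it assumes nothing about the Butler beyond the `registry.refresh` call named above.

```python
def with_refreshed_registry(central_butler, operation, *args, **kwargs):
    """Refresh the central repo's registry, then run ``operation`` on it.

    ``operation`` is any callable that takes the Butler as its first argument
    (hypothetical convention for this sketch). As the commit message notes,
    refreshing is not a proper synchronization API: another worker can still
    modify the repo between the refresh and the operation.
    """
    central_butler.registry.refresh()
    return operation(central_butler, *args, **kwargs)
```

A caller would wrap each read or write, for example `with_refreshed_registry(central_butler, export_outputs, exposure_ids)`, where `export_outputs` stands in for whatever the worker needs to do against the central repo.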
1e267d3 to 9d972c8
This PR adds a regression test for the bug where HSC exposure IDs were being generated with values of 30,000,000+, and rewrites the ID generation algorithm to confine them to the allowed range.