Change stable ID to sourceID + original page range rather than digital page range #555
Comments
Worth making a note on this separate issue that for the This may change depending on the results and decisions of this issue. It seems safest in this situation to use
We'll want to redirect the old URLs and can adapt existing logic we have for that.
@rlskoeser thank you for the helpful acceptance testing list -- I tested the items and this generally looks good! There are 2 issues I found, and one FYI:
I'm ready whenever you can generate the text corpus to test that last piece.
@mnaydan thanks for testing. responses:
New question, potentially related: we don't currently do any validation on the original page number field, which means it would be possible to add a range that can't actually be used for the public URL. Is this a concern?
Thanks for the screenshots - my eyes totally glazed over that! And thank you for catching this: it is the string method for the digitized work object, which we use to represent instances in various places. Much better for it to use the original page number. Your validation test is good. It worries me that there is a potential to create a 500 error, which would be non-obvious. I got distracted trying to see if there's an easy way to check for this; I'm sure I've done something similar but I forget where. (We can't use database constraints.) I'm going to look a little more on that one. I agree, the other things seem unlikely. I was thinking that if someone entered an inferred original page range with brackets it might end up looking weird, but that's a real edge case.
Almost there! The single page URL issue appears fixed, and the text corpus appears to be using the correct IDs. I retested the 500 error conditions, and they now result in a validation error instead. However, it appears the validation fix is now over-validating. When I change the digital page on a record and try to save it (leaving the original page untouched), it says I can't save the record because that original page number + ID combo is not unique.
@mnaydan thanks for checking! I had logic in the code for that case but it wasn't doing the right thing. Added a test case and fixed, please confirm.
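For anyone following along, here is a minimal sketch of the kind of uniqueness check discussed above, written in plain Python rather than the project's actual validation code. Everything in it is an assumption made for illustration: the `Record` fields, the `first_page` and `validate_unique` helpers, and the idea that uniqueness is keyed on source ID plus the first page of the original range. The point it shows is the fix for the over-validation case: the record being edited is skipped, so changing only the digital page range does not make a record collide with itself.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


class ValidationError(Exception):
    """Stand-in for the framework's validation error (hypothetical)."""


@dataclass
class Record:
    """Hypothetical record; field names are illustrative only."""
    pk: Optional[int]     # None until the record has been saved
    source_id: str
    pages_orig: str       # original page range, e.g. "12-34"
    pages_digital: str    # digital page range, may change after a rescan


def first_page(page_range: str) -> str:
    """Return the first page of a range such as "12-34" or "iv-x"."""
    return page_range.split("-")[0].strip()


def validate_unique(record: Record, existing: Iterable[Record]) -> None:
    """Raise ValidationError if another record shares source ID + first original page."""
    key = (record.source_id, first_page(record.pages_orig))
    for other in existing:
        # Skip the record being edited so it is never compared with itself.
        if record.pk is not None and other.pk == record.pk:
            continue
        if (other.source_id, first_page(other.pages_orig)) == key:
            raise ValidationError(
                f"{key[0]} with original first page {key[1]} already exists"
            )
```

With this shape, saving an existing record after editing only its digital pages passes, while adding a second record with the same source ID and original first page is rejected before it can produce an unusable public URL.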
Fixed! I tested with a Gale item and a Hathi item.
Now that we know that HathiTrust periodically rescans material (leading to changes in the digital page range), source ID + digital page range is no longer a stable ID as we had previously thought. It might make sense for us to move to source ID + original page range as the stable ID instead.
testing notes
dev notes
first_page method to use original pages instead of digital (check ramifications...)
DigitizedWorkDetailView.get_queryset to filter on original page instead of digital; may need to use a regex to make sure we don't match inaccurate partial page range (e.g. p1 should not match p10)
DigitizedWorkDetailView.get_queryset to check for a match using the old logic (digital page range) and set redirect url if found
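The lookup described in the dev notes could be sketched roughly as follows. This is plain Python, not the real view code: the `Excerpt` class, the `starts_with_page`, `stable_id`, and `resolve` helpers, and the `<source ID>-p<first page>` ID format are all assumptions made for illustration. It shows two things: a regex that matches only the complete first page of a range (so page 1 does not match a range starting at 10), and a fallback that checks the old digital page range and reports a redirect to the new ID when it finds a match.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class Excerpt:
    """Hypothetical record; field names are illustrative only."""
    source_id: str
    pages_orig: str      # original page range, e.g. "1-5"
    pages_digital: str   # digital page range, e.g. "30-34"


def starts_with_page(page_range: str, page: str) -> bool:
    """True if page_range begins with exactly this page, not a longer one."""
    # The negative lookahead rejects partial matches such as "1" against "10-20".
    return re.match(rf"{re.escape(page)}(?!\w)", page_range) is not None


def stable_id(excerpt: Excerpt) -> str:
    """Assumed ID format: source ID plus first page of the original range."""
    first = re.split(r"[-,\s]", excerpt.pages_orig.strip(), maxsplit=1)[0]
    return f"{excerpt.source_id}-p{first}"


def resolve(
    excerpts: Iterable[Excerpt], source_id: str, page: str
) -> Tuple[str, Optional[str]]:
    """Look up by original page first, then fall back to the old digital page."""
    candidates = [e for e in excerpts if e.source_id == source_id]
    for excerpt in candidates:
        if starts_with_page(excerpt.pages_orig, page):
            return ("found", stable_id(excerpt))
    for excerpt in candidates:
        # Old-style URL: matched on digital pages, so redirect to the new ID.
        if starts_with_page(excerpt.pages_digital, page):
            return ("redirect", stable_id(excerpt))
    return ("not_found", None)
```

Checking the original page range first means a new-style URL always wins when the same number also happens to be the first digital page of another excerpt from the same source. In a Django queryset, the same pattern could presumably be applied with a `regex` field lookup, though that part is not shown here.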