Add script for unlinking some bad agents. #1153
Is LXL-3919 the correct issue for this?
Urgh.. nope :P It's LXL-3913
I also have a sneaking suspicion that maybe we shouldn't just remove them, but replace them with local entities?
Ah, check. Yes, it's only the linkage that's to be removed; "some" local entity data should be kept (presumably only names and possibly lifeSpan, albeit that might be wrong if it was added on the linked entity after the original match). Ask for clarification/spec of fields.
boolean isABadLink(String candidate) {
    String s = candidate.substring(0, candidate.length()-3) // Trim off the #it
    System.err.println(candidate + " -> " + s)
return s.endsWith("53hlt8kp5swj700") || |
Nit-pick: safer (and slightly faster) to put these in a Set and use that here and to build the select string.
To use a Set I'd instead have to shave off the https://libris{-qa,-stg}/ prefix (which varies in length), which could go wrong. I think I'll keep this as is!
Or use candidate.substring(candidate.lastIndexOf('/') + 1, candidate.lastIndexOf('#'))?
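A minimal sketch of how the Set suggestion and the lastIndexOf suggestion could fit together. It is written in plain Java on the assumption that the Groovy script could adopt it with minor changes. The class name `BadLinkCheck`, the helper `extractId`, the `example.org` host, and the contents of `BAD_IDS` beyond the single ID visible in the diff are all hypothetical.

```java
import java.util.Set;

class BadLinkCheck {
    // Hypothetical list: only this one ID is visible in the reviewed snippet.
    static final Set<String> BAD_IDS = Set.of("53hlt8kp5swj700");

    // Returns the part between the last '/' and the last '#',
    // so the varying host prefix never has to be measured.
    static String extractId(String candidate) {
        int slash = candidate.lastIndexOf('/');
        int hash = candidate.lastIndexOf('#');
        if (hash <= slash) {
            // No fragment after the last slash: take everything after it.
            return candidate.substring(slash + 1);
        }
        return candidate.substring(slash + 1, hash);
    }

    static boolean isABadLink(String candidate) {
        return BAD_IDS.contains(extractId(candidate));
    }

    public static void main(String[] args) {
        // Hypothetical host; the real environments differ in prefix length.
        System.out.println(isABadLink("https://libris-qa.example.org/53hlt8kp5swj700#it")); // prints true
    }
}
```

The same Set could then also be joined into the select string, which is the second half of the nit-pick.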
candidate.split('[#/]')[-2] should work too!
Yes, that is certainly much more succinct (albeit somewhat less performant, as it would parse a regexp (unless cached under the hood), split on every / and build a new list of those substrings; though I bet any performance loss is dwarfed by the I/O going on here).
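For comparison, a sketch of the split variant with the regexp compiled once up front, which sidesteps the question of whether it is cached under the hood. It is Java rather than Groovy, so the `[-2]` index becomes `parts.length - 2`. The names `SplitIdExtract`, `SEPARATORS`, and `extractIdBySplit` are hypothetical, and the sketch assumes the candidate always ends in a non-empty fragment such as `#it`, since `split` drops trailing empty strings.

```java
import java.util.regex.Pattern;

class SplitIdExtract {
    // Compiled once, so the regexp is not re-parsed for every candidate.
    static final Pattern SEPARATORS = Pattern.compile("[#/]");

    // Returns the second-to-last piece after splitting on '#' and '/'.
    static String extractIdBySplit(String candidate) {
        String[] parts = SEPARATORS.split(candidate);
        if (parts.length < 2) {
            // Nothing to split on: fall back to the input unchanged.
            return candidate;
        }
        return parts[parts.length - 2];
    }

    public static void main(String[] args) {
        // Hypothetical host, as the real prefix varies per environment.
        System.out.println(extractIdBySplit("https://libris-qa.example.org/53hlt8kp5swj700#it"));
    }
}
```

Either way the array of substrings is still allocated per call, so the remark about I/O dominating the cost applies here as well.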