Fix KBART URL for Revista española de sanidad penitenciaria using ISSN redirect #84
Co-authored-by: robertatakenaka <505143+robertatakenaka@users.noreply.github.com>
Pull request overview
Updates KBART URL generation to rewrite SciELO Spain journal issue URLs that still use a deprecated ISSN in the pid query parameter, restoring correct access for “Revista española de sanidad penitenciaria”.
Changes:
- Added an `ISSN_URL_REDIRECTS` mapping and precompiled regex patterns to transform deprecated `pid=<issn>` values.
- Applied the redirect transformation during KBART CSV URL generation in `fmt_csv()`.
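
The mechanism summarized above can be sketched roughly as follows. This is a minimal, self-contained illustration rather than the code from `export/kbart.py`: the helper name `apply_issn_redirects` and the example URL are assumptions, and the mapping contains only the one ISSN pair mentioned in this pull request.

```python
import re

# Deprecated ISSN -> current ISSN. Only the pair named in this PR is listed.
ISSN_URL_REDIRECTS = {
    "1575-0620": "2013-6463",
}

# One compiled pattern per deprecated ISSN, built once at import time.
# The ISSN must be the complete value of the pid parameter.
_ISSN_REDIRECT_PATTERNS = {
    old_issn: re.compile(r"([?&]pid=)" + re.escape(old_issn) + r"(&|$)")
    for old_issn in ISSN_URL_REDIRECTS
}


def apply_issn_redirects(url):
    """Hypothetical helper: rewrite deprecated ISSNs in the pid parameter."""
    for old_issn, new_issn in ISSN_URL_REDIRECTS.items():
        pattern = _ISSN_REDIRECT_PATTERNS[old_issn]
        # \g<1> avoids ambiguity between the group number and the ISSN digits.
        url = pattern.sub(r"\g<1>" + new_issn + r"\2", url)
    return url


# Illustrative URL; the host and query layout are assumptions.
example = "http://scielo.isciii.es/scielo.php?script=sci_issues&pid=1575-0620&lng=en"
print(apply_issn_redirects(example))
```

Writing the group reference as `\g<1>` matters here: a plain `\1` followed directly by an ISSN that starts with a digit would be read as a reference to a different, nonexistent group.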
    line.append(data.url().replace('sci_serial', 'sci_issues'))
    # Generate the URL
    url = data.url().replace('sci_serial', 'sci_issues')
There appears to be trailing whitespace on the blank line after the url = ... assignment. Please remove the extra spaces to keep the diff clean and avoid whitespace-only changes.
    # Apply ISSN redirects for journals that changed their ISSN in URLs
    # This is necessary for journals that no longer use their print ISSN
    for old_issn, new_issn in ISSN_URL_REDIRECTS.items():
        # Use pre-compiled regex pattern for better performance
        pattern = _ISSN_REDIRECT_PATTERNS[old_issn]
        url = pattern.sub(r'\g<1>' + new_issn + r'\2', url)
There appears to be trailing whitespace on the blank line after the ISSN-redirect substitution loop. Please remove the extra spaces (or drop the blank line) to avoid whitespace-only diffs.
    # Apply ISSN redirects for journals that changed their ISSN in URLs
    # This is necessary for journals that no longer use their print ISSN
    for old_issn, new_issn in ISSN_URL_REDIRECTS.items():
        # Use pre-compiled regex pattern for better performance
        pattern = _ISSN_REDIRECT_PATTERNS[old_issn]
        url = pattern.sub(r'\g<1>' + new_issn + r'\2', url)
The journal "Revista española de sanidad penitenciaria" (SciELO Spain) discontinued its print ISSN 1575-0620 in favor of 2013-6463. KBART URLs still reference the old ISSN via the `pid` parameter, breaking database access.

Changes

Added an ISSN redirect mechanism in `export/kbart.py`:
- `ISSN_URL_REDIRECTS` dictionary maps deprecated ISSNs to current ones
- Precompiled regex patterns (`_ISSN_REDIRECT_PATTERNS`) for performance
- Redirects applied during URL generation in `fmt_csv()`

The regex pattern precisely targets the `pid` parameter: `([?&]pid=)<old_issn>(&|$)`

Result

Future ISSN changes can be added to `ISSN_URL_REDIRECTS` without modifying logic.
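
To illustrate what the `([?&]pid=)<old_issn>(&|$)` pattern rewrites and what it leaves alone, here is a small check under stated assumptions: the relative URLs are made up for the example, and `rewrite` is a hypothetical stand-in for the substitution performed in `fmt_csv()`.

```python
import re

old_issn, new_issn = "1575-0620", "2013-6463"
pattern = re.compile(r"([?&]pid=)" + re.escape(old_issn) + r"(&|$)")


def rewrite(url):
    """Hypothetical stand-in for the substitution step."""
    return pattern.sub(r"\g<1>" + new_issn + r"\2", url)


# Made-up URLs covering the boundary cases of the pattern.
cases = [
    "scielo.php?script=sci_issues&pid=1575-0620&lng=es",  # pid in the middle
    "scielo.php?pid=1575-0620",                           # pid is the last parameter
    "scielo.php?script=sci_issuetoc&pid=1575-062020200001&lng=en",  # longer pid value
    "scielo.php?xpid=1575-0620",                          # different parameter name
]

for url in cases:
    print(url, "->", rewrite(url))
```

Only the first two URLs change. The third keeps its longer `pid` value because the old ISSN is followed by more digits rather than `&` or the end of the string, and the fourth is untouched because `pid=` is not directly preceded by `?` or `&`.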