Fix #3172 by reducing double square braces in IA ocr text #3177

Merged
sylvieed merged 2 commits into development from 3172-double-braces-ocr
Jun 14, 2022

Conversation

benwbrum
Owner

The underlying problem was a page with three lines of text containing [[, an OCR artifact that we were interpreting as the beginning of a wiki-link. This change reduces any run of repeated [ characters in imported OCR to a single [.
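
The reduction described above can be sketched as a single regular-expression substitution. This is a minimal illustration in Python, not the code from this pull request: the function name `collapse_open_brackets` and the sample strings are hypothetical, and the sketch assumes only runs of the opening bracket need handling, as the description states.

```python
import re


def collapse_open_brackets(ocr_text: str) -> str:
    """Collapse each run of two or more '[' characters into a single '['.

    Hypothetical helper for illustration; the actual fix in this PR may
    be implemented differently.
    """
    # \[{2,} matches two or more consecutive opening square brackets.
    return re.sub(r"\[{2,}", "[", ocr_text)


# Sample OCR-like line (invented for this example).
cleaned = collapse_open_brackets("[[ Bulletin No. 12")
print(cleaned)
```

Because a single `[` never opens a wiki-link, text cleaned this way should no longer be parsed as the start of a link, at the cost of also altering any `[[` that was genuinely present in the source page.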

@coveralls

Coverage Status

Coverage increased (+0.002%) to 78.839% when pulling e264a56 on 3172-double-braces-ocr into 013e4f3 on development.

@sylvieed
Collaborator

This looks good to me; the original archive imported with OCR without any problems.

sylvieed merged commit 87a6e37 into development on Jun 14, 2022
benwbrum linked an issue on Jun 16, 2022 that may be closed by this pull request
sylvieed added a commit that referenced this pull request on Jul 14, 2022
Fix #3172 by reducing double square braces in IA ocr text
benwbrum deleted the 3172-double-braces-ocr branch on December 6, 2022 at 15:20
Development

Successfully merging this pull request may close these issues.

Internet Archive 422 error during import (USDA)
3 participants