Fix bugs in ALTO handling #26

Merged
bitzl merged 2 commits into master from alto-fixes on May 2, 2019

@jbaiter (Member) commented May 2, 2019

  • Get rid of trailing </> tags after HTML stripping
  • Combine all ALTO filtering into AltoCharFilterFactory, no more need for an explicit subsequent HTMLStripCharFilter in the analyzer chain
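
To illustrate why a single ALTO-aware filter can make a later HTML-stripping step unnecessary, here is a minimal sketch in plain Java. It is not the plugin's `AltoCharFilterFactory` implementation: the class `AltoTextSketch` and the method `extractText` are hypothetical names, and the regex approach is an assumption chosen for brevity. A real Lucene char filter also has to keep character offsets so that matches can be mapped back to the source, which this sketch ignores.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the plugin.
public class AltoTextSketch {
    // Matches an ALTO <String .../> element and captures its CONTENT attribute.
    // Assumes attribute values contain no '>' and does not decode XML entities.
    private static final Pattern STRING_CONTENT =
        Pattern.compile("<String\\b[^>]*?\\bCONTENT=\"([^\"]*)\"[^>]*>");

    /** Returns the words of an ALTO fragment joined by single spaces; all markup is dropped. */
    public static String extractText(String alto) {
        List<String> words = new ArrayList<>();
        Matcher m = STRING_CONTENT.matcher(alto);
        while (m.find()) {
            words.add(m.group(1));
        }
        return String.join(" ", words);
    }

    public static void main(String[] args) {
        String alto = "<TextLine><String ID=\"s1\" CONTENT=\"Hello\"/><SP/>"
            + "<String ID=\"s2\" CONTENT=\"world\"/></TextLine>";
        System.out.println(extractText(alto)); // prints "Hello world"
    }
}
```

Because only the `CONTENT` values are emitted, no tag fragments are left over for a second filter to clean up.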

Many thanks to @mbennett-uoe for the helpful discussion!

Fix bugs in ALTO handling, thanks @mbennett-uoe for the helpful discussion

- Get rid of trailing </> tags after HTML stripping
- Combine all ALTO filtering into AltoCharFilterFactory, no more need
  for a subsequent HTMLStripCharFilter

@jbaiter jbaiter force-pushed the alto-fixes branch from 95c06c9 to a1bc7e3 May 2, 2019

Minor file fixes
Missing closing tag in docstring
Missing > in Description regexp
@bitzl bitzl approved these changes May 2, 2019

@bitzl bitzl merged commit 375903f into master May 2, 2019

1 check passed

ci/gitlab/gitlab.com Pipeline passed on GitLab

@jbaiter jbaiter deleted the alto-fixes branch May 2, 2019
