Want semantic tags, don't want positional OCR attribs #18

Open
gsautter opened this issue Jul 26, 2019 · 1 comment
@gsautter

(3) If you want, I can re-add emphasis (you once complained about the complexity it adds to your paths), but number is plainly a pain in the back ... a leftover from a tagger that builds quantities from them. All these elements mark are sequences of digits, which are very easy to discover by other means.
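To illustrate the point that digit sequences are easy to find without a dedicated tag, here is a minimal sketch using a regular expression over the plain text. The function name and the sample string are made up for illustration and are not taken from any Plazi tooling.

```python
import re

def find_digit_runs(text):
    """Return (start, end, digits) for each maximal run of ASCII digits.

    Hypothetical helper: shows one way to locate what a <number> tag
    would otherwise mark.
    """
    return [(m.start(), m.end(), m.group()) for m in re.finditer(r"[0-9]+", text)]

# Invented sample text, not from a real treatment.
sample = "Holotype collected 1987, 3 paratypes"
for start, end, digits in find_digit_runs(sample):
    print(start, end, digits)
```

This only finds bare digit runs; building quantities out of them (units, decimals, ranges) would still need extra logic.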

I do want all the semantic (and formatting) tags, so tags like <bold>, <emphasis>, <underline>, <location>, <typeStatus>, <paragraph> are useful to me. Actually, the only thing that is noise is the box info. But, as I mentioned above, I can easily ignore it, so perhaps it may just be easier to not mess with this and just leave it as it is. We have bigger problems to solve.

In other words, from my side, I would forget about removing any info. I would only focus on adding the important info such as the GUIDs and other state info (more on that in a bit).

Originally posted by @punkish in #14 (comment)

@gsautter

I re-added emphasis and now filter out the box attribute; see http://treatment.plazi.org/GgServer/zenodeo/03AB8782012CFFFCFF1AFDD1FB38EFE9 , or any other treatment.
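For anyone still consuming XML that carries the positional attribute, a client-side filter could look roughly like the sketch below. It assumes the attribute is literally named box and uses Python's standard xml.etree.ElementTree; the sample markup and the attribute values are invented, and this is not how the server does the filtering.

```python
import xml.etree.ElementTree as ET

def strip_attribute(xml_text, attr_name="box"):
    """Remove attr_name from every element, leaving tags and text untouched.

    The default "box" is an assumption about the attribute's name.
    """
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # pop() with a default is a no-op when the attribute is absent
        elem.attrib.pop(attr_name, None)
    return ET.tostring(root, encoding="unicode")

# Invented sample, only loosely modelled on the tags mentioned in this thread.
sample = (
    '<paragraph box="[10,20,30,40]">'
    '<emphasis box="[10,20,15,25]" italics="true">Aus bus</emphasis> sp. nov.'
    '</paragraph>'
)
cleaned = strip_attribute(sample)
print(cleaned)
```

The semantic and formatting tags survive unchanged; only the positional attribute is dropped.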
