Text-to-TEI export enhancements #635

Open
1 task
rsimon opened this issue Aug 19, 2019 · 2 comments

Comments

@rsimon
Member

rsimon commented Aug 19, 2019

  • insert paragraph tags for \n (or \n\n?)
@GusRiva

GusRiva commented Aug 19, 2019

I would insert paragraph tags wherever one or more newline characters follow a period.
Regex: \.\n+
That should cover most cases.
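
For illustration, the suggestion above could be sketched in Python roughly as follows. This is only a sketch under assumptions: "point" is read as a literal period (so the dot is escaped), a lookbehind is used so the period stays with its paragraph, and the names `PARAGRAPH_BREAK`, `split_paragraphs` and `to_tei_paragraphs` are hypothetical, not part of Recogito.

```python
import re
from xml.sax.saxutils import escape

# Assumption: a paragraph ends where one or more newlines follow a literal period.
# The lookbehind (?<=\.) checks for the period without consuming it.
PARAGRAPH_BREAK = re.compile(r"(?<=\.)\n+")

def split_paragraphs(text):
    """Split text at newline runs that directly follow a period."""
    return [p for p in PARAGRAPH_BREAK.split(text) if p]

def to_tei_paragraphs(text):
    """Wrap each paragraph in <p> tags, escaping XML special characters."""
    return "\n".join("<p>" + escape(p) + "</p>" for p in split_paragraphs(text))

sample = "First sentence.\nSecond paragraph, wrapped\nacross lines.\n\nThird."
paragraphs = split_paragraphs(sample)
tei = to_tei_paragraphs(sample)
```

A newline that does not follow a period (a hard-wrapped line, for example) is left inside its paragraph, which is the main difference from splitting on every `\n`.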

@rsimon rsimon changed the title TEI-to-text export enhancements Text-to-TEI export enhancements Sep 4, 2019
rsimon pushed a commit that referenced this issue Sep 4, 2019
@rsimon
Member Author

rsimon commented Sep 4, 2019

I made a first/incomplete pass at this, which splits on \n\n only.

The issue I have is: I need to know how long each delimiting pattern is, so that I can correctly align the annotations within the paragraph. (Annotations on plaintext content are standoff markup inside Recogito, with every annotation recording its offset from the beginning of the text.)

If you have any ideas on how to achieve this, let me know. I think there's something called "Lookahead"/"Lookbehind" which might achieve this (see this thread).
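
One possible approach, sketched in Python rather than in Recogito's own code: iterate over the delimiter matches instead of calling a plain split, since each match reports where it starts and ends, and therefore how long it is. The function names, the `\n\n+` pattern and the sample text below are assumptions for illustration only. Java's `java.util.regex.Matcher` exposes equivalent `start()` and `end()` methods.

```python
import re

# Assumption: paragraphs are separated by two or more newlines.
# A lookbehind pattern such as (?<=\.)\n+ could be swapped in unchanged.
DELIMITER = re.compile(r"\n\n+")

def split_with_offsets(text, delimiter=DELIMITER):
    """Return (start_offset, paragraph_text) pairs, offsets relative to the full text."""
    paragraphs = []
    cursor = 0
    for match in delimiter.finditer(text):
        paragraphs.append((cursor, text[cursor:match.start()]))
        # match.end() - match.start() is the length of this particular delimiter
        cursor = match.end()
    paragraphs.append((cursor, text[cursor:]))
    return paragraphs

def localize(offset, paragraphs):
    """Map a global annotation offset to (paragraph_index, offset_in_paragraph)."""
    for index, (start, body) in enumerate(paragraphs):
        if start <= offset < start + len(body):
            return index, offset - start
    raise ValueError("offset %d falls inside a delimiter or outside the text" % offset)

text = "Alpha beta.\n\nGamma delta.\n\n\nEpsilon."
paragraphs = split_with_offsets(text)
position = localize(19, paragraphs)  # global offset of "delta"
```

Because every paragraph keeps its own start offset, delimiters of varying length (two newlines in one place, three in another) do not shift later annotations.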
