The law history project is just one of many possible projects where the input data is the output of an OCR process.
For such corpora, buzzword should ideally provide an interface that displays the original PDF beside its plaintext. Users should be able to submit changes (i.e. corrections) and/or add metadata tags to the text. In terms of implementation, the text pane on the right should be built on a good high-level editor library (currently martor). Needed features:
- Corpus model updated to handle a parallel 'PDF corpus'
- The `compare/<slug>` endpoint, possibly with `page`/`pages` query params for starting at a certain place
- Navigation between pages; essentially `(<<first <prev next> last>>) [ goto ]` is good enough for now
- Admin interface for viewing/accepting submitted corrections
- Email service to tell users the status of their changes and to tell admins that there are new suggestions
This should be built with the knowledge that the explorer and compare interfaces will eventually be linked via #40.
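
As a rough illustration of the `page` query param and the `(<<first <prev next> last>>) [ goto ]` controls from the feature list, here is a minimal, framework-free sketch in Python. The names `parse_page`, `build_nav` and `PageNav` are hypothetical and not part of buzzword. The sketch assumes 1-based page numbers and a PDF with at least one page; a real view would presumably take `total_pages` from the PDF corpus.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PageNav:
    """Targets for the navigation controls; None means the link is disabled."""
    current: int
    first: Optional[int]
    prev: Optional[int]
    next: Optional[int]
    last: Optional[int]


def parse_page(raw: Optional[str], total_pages: int) -> int:
    """Turn a raw `page` query value (or a goto input) into a valid page number.

    Missing or non-integer values fall back to page 1; out-of-range values
    are clamped to the nearest valid page.
    """
    try:
        page = int(raw) if raw is not None else 1
    except ValueError:
        page = 1
    return max(1, min(page, total_pages))


def build_nav(page: int, total_pages: int) -> PageNav:
    """Compute where each navigation control should point for `page`."""
    at_start = page <= 1
    at_end = page >= total_pages
    return PageNav(
        current=page,
        first=None if at_start else 1,
        prev=None if at_start else page - 1,
        next=None if at_end else page + 1,
        last=None if at_end else total_pages,
    )
```

A `compare/<slug>` view could call `parse_page` on the incoming query string and pass the result of `build_nav` to the template, so that both the PDF pane and the text pane start on the same page. Clamping instead of returning an error is a design choice made here for simplicity; the project may prefer a 404 for out-of-range pages.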