Enhancement/Add viewer pane for source document #269

Merged

3 participants
@c-w
Contributor

commented Jul 1, 2019

Users often use Doccano to annotate text that was extracted from documents such as PDF or Word files, or via OCR. The plain-text extraction process frequently reduces the information available to the annotator, e.g. through loss of formatting or page breaks. Being able to view the source document during annotation is therefore often helpful to the annotator.

As such, this pull request adds a preview pane for PDF and Word documents to the annotation pane. The preview pane is toggled at the document level by adding the documentSourceUrl key to the document's metadata (sample corpus to test the document viewer pane).

Animation showing source document viewer pane
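
To illustrate the per-document toggle described above, here is a minimal sketch of building corpus entries that carry the documentSourceUrl key. Only the key name comes from this pull request; the surrounding "text"/"meta" layout, the helper names make_document and has_source_viewer, and the example.com URL are assumptions made for illustration and may not match the import format your Doccano version expects.

```python
import json


def make_document(text, source_url=None):
    """Build one corpus entry (hypothetical helper, assumed "text"/"meta" layout)."""
    meta = {}
    if source_url is not None:
        # Key name taken from the pull request description.
        meta["documentSourceUrl"] = source_url
    return {"text": text, "meta": meta}


def has_source_viewer(document):
    """Return True when the entry carries a non-empty source URL (hypothetical check)."""
    return bool(document.get("meta", {}).get("documentSourceUrl"))


docs = [
    # Placeholder URL; replace with the location of the real source file.
    make_document("Extracted text of a contract.", "https://example.com/contract.pdf"),
    make_document("Plain text with no source file."),
]

# One JSON object per line, as in a JSONL corpus file.
jsonl = "\n".join(json.dumps(doc) for doc in docs)
print(jsonl)
```

Under these assumptions, only the first entry would show the preview pane, since the second has no documentSourceUrl in its metadata.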

@guillim

Contributor

commented Jul 3, 2019

@psorianom this could be very interesting for your use case

@c-w

Contributor Author

commented Jul 3, 2019

@guillim @psorianom Could you tell me a bit more about your use-cases so that we can make sure to consider them in this implementation? Thanks in advance!

@guillim

Contributor

commented Jul 3, 2019

Hello @c-w,
We are working on legal case annotation. Such texts have very specific indentation and paragraph styling, which matters for understanding the meaning of the text.

In my opinion, annotators would really benefit from such a feature. However, it could be even better if we had a "markdown-like" way to record texts, so that line breaks could be displayed in the text itself (instead of a text plus a text preview).

@c-w c-w force-pushed the CatalystCode:enhancement/render-source-document branch 2 times, most recently from 23d700f to 6f0ec1a Jul 9, 2019

@c-w

Contributor Author

commented Jul 15, 2019

The Docker build failure is due to #290 and is unrelated to the changes in this pull request.

@c-w c-w force-pushed the CatalystCode:enhancement/render-source-document branch from 6f0ec1a to 34028de Jul 16, 2019

@Hironsan Hironsan merged commit e445813 into chakki-works:master Jul 22, 2019

2 checks passed

Codacy/PR Quality Review: Up to standards. A positive pull request.
Travis CI - Pull Request: Build Passed

@c-w c-w deleted the CatalystCode:enhancement/render-source-document branch Jul 22, 2019
