Accurately inform user of document import progress (even on background jobs) #268

cometman · 2015-09-08T15:39:07Z

_Objective_
As a DC user, after uploading a document to DC, I need to be fully informed of any additional processing taking place on my document that could impede my current task.

_History_
Now that entity extraction is decoupled from document imports, documents are imported first, and a background job is then started to begin entity extraction. Today, after a document has been imported, it appears to be complete, even if the entity extraction job is still running.

_Successful test case_
After a user uploads a document and the initial import completes, if the user attempts to click "View Entities" on a document that is still undergoing entity extraction, the user should be informed: "We're still extracting entities from your document. Check back in a little while."

_Technical Change Overview_

- Update the API to include jobs being processed in the response of GET /document/:id
- Add the error message (also localized) to constants
- When the user clicks the "View Entities" action, ensure all dependent jobs have completed.
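
A minimal sketch of how the client-side check could look, assuming the GET /document/:id response gains a `jobs` array. The field names (`jobs`, `kind`, `status`), the `"entity_extraction"` identifier, and the function name are all hypothetical and only illustrate the idea:

```typescript
// Hypothetical shape of the GET /document/:id response once it lists jobs.
// Field names and status values are assumptions, not the actual API.
type JobStatus = "queued" | "processing" | "complete" | "failed";

interface DocumentJob {
  kind: string; // e.g. "entity_extraction" (assumed identifier)
  status: JobStatus;
}

interface DocumentResponse {
  id: number;
  title: string;
  jobs: DocumentJob[];
}

// Message text agreed on in this thread; it would live in the localized constants.
const ENTITIES_STILL_EXTRACTING =
  "We're still extracting entities from your document. Check back in a little while.";

// Returns the message to show the user, or null when entities can be displayed.
function viewEntitiesBlocker(doc: DocumentResponse): string | null {
  const pending = doc.jobs.some(
    (job) =>
      job.kind === "entity_extraction" &&
      (job.status === "queued" || job.status === "processing")
  );
  return pending ? ENTITIES_STILL_EXTRACTING : null;
}
```

The "View Entities" handler would call `viewEntitiesBlocker(doc)` first and show the returned message instead of opening the entity view whenever the result is not null.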

reefdog · 2015-09-08T15:48:30Z

For background: now that a document is usable before being "fully" processed (i.e., now that you can view/edit/annotate/etc. while entities are still processing), we need a redesigned and more robust "what's the status of my document?" interface in the workspace, both in the index (workspace) and show (viewer) screens. That will require a bit more work. This solution lays the technical foundation for that but with a stopgap interface (see "successful test case").

@cometman Rather than "Your document is still being processed", let's go with the more explicit "We're still extracting entities from your document. Check back in a little while."

cometman · 2015-09-08T15:49:57Z

👍 error message text change

knowtheory · 2015-09-08T16:03:49Z

Yeah, just for a bit more background (ha):

Entity Extraction to this point has been a portion of document importation & processing. Processing has been treated as monolithic. Either all processing has been completed, or it hasn't. That processing has included image extraction, text extraction and then entity extraction from the text.

In order to be able to rate limit Entity Extraction, it is necessary to break apart document processing (so that Entity Extraction can be controlled independently from document importation) into multiple backgrounded jobs which may be chained together in sequence.

As a consequence, the DocumentCloud workspace, which currently can only treat documents as Available or Unavailable, must be updated to also reflect a sequence of possible statuses.

Rather than do a whole hog reworking of the workspace with an actual state machine, for the time being, we will provide the workspace with additional information about what jobs are being run for a particular document.

If an entity extraction job is currently being run, entity display tools will not be available for that document in the workspace.

anthonydb · 2015-09-08T18:23:16Z

> As a consequence, the DocumentCloud workspace, which currently can only treat documents as Available or Unavailable, must be updated to also reflect a sequence of possible statuses.

The messages to the user should be framed in terms of "what can I do with the document." So, for example, we might consider status messages that include:

- Ready to annotate.
- Ready to publish.
- Ready to explore entities.

reefdog · 2015-09-08T18:24:45Z

That's an excellent idea, Tony.

anthonydb · 2015-09-08T18:57:27Z

Thanks. To expand the idea further, these statuses could be grayed out and gradually get filled in (or get a check mark) as each step completes. This also serves to educate users as to what's happening to their docs behind the scenes and is better than a simple percent-done indicator.

reefdog · 2015-09-08T19:04:33Z

Right. I like the idea of effectively enabling visible feature flags, versus a progress bar which implies both sequential processing and a mythical concept of "completeness". For instance, users who don't use entities would consider a pre-entity-extracted document as complete, so to generically communicate that a pre-extraction document is "80% ready for use" would be wrong. The model you laid out is more like "hey, you can do A/B/C on this doc, but check back for Z" which communicates personal completeness. I like.
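
As a rough illustration of the capability-style statuses discussed here, the workspace could derive each "Ready to ..." flag from the jobs still running for a document. This is only a sketch: the job kinds, the mapping of jobs to capabilities, and all names below are assumptions, not anything already in DC.

```typescript
// Hypothetical job kinds; the real job names are not specified in this thread.
type JobKind = "import" | "entity_extraction";

interface RunningJob {
  kind: JobKind;
}

interface Capability {
  label: string; // user-facing status text
  ready: boolean; // false renders grayed out, true gets the check mark
}

// Assumed mapping of each capability to the running jobs that block it.
const BLOCKED_BY: { label: string; blockers: JobKind[] }[] = [
  { label: "Ready to annotate.", blockers: ["import"] },
  { label: "Ready to publish.", blockers: ["import"] },
  { label: "Ready to explore entities.", blockers: ["import", "entity_extraction"] },
];

// A capability is ready when none of its blocking jobs is still running.
function capabilitiesFor(runningJobs: RunningJob[]): Capability[] {
  const running = new Set(runningJobs.map((job) => job.kind));
  return BLOCKED_BY.map(({ label, blockers }) => ({
    label,
    ready: !blockers.some((kind) => running.has(kind)),
  }));
}
```

With only an entity extraction job running, the first two capabilities come back ready and the third does not, which matches the "you can do A/B/C on this doc, but check back for Z" model rather than a single percent-done value.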

knowtheory · 2015-09-08T19:17:06Z

Yeah, that's my preferred course of action as well. However, doing that will require taking a look at the document tiles, what information is conveyed there, and how we can integrate that information into the design & interaction.

The thing we're trying to prevent right now is the following:

Once a document has finished processing it is marked as complete. The workspace has only one notion of "completeness" and notifies the user that the document is done. The user then clicks on the completed document and asks for the entities. The workspace reports back that the document has no entities.

The workspace needs to know on a data level what jobs are connected with which documents, and what users are or are not allowed to do.

The start for this is definitely just having the workspace tell users they should come back later for entities if entities are currently being processed.

cometman added feature request 2 - Working labels Sep 8, 2015

cometman self-assigned this Sep 8, 2015

cometman changed the title ~~Document upload progress notification~~ Accurately inform user of document import progress (even on background jobs) Sep 8, 2015

cometman mentioned this issue Sep 23, 2015

Entity extraction ui #273

Open

reefdog unassigned cometman Feb 24, 2016
