proposing subtypes of `TextDocument`

### New Feature Summary

Following a number of recent developments, I'd like to propose more vocab types that are subcategories of `TextDocument` (all names in this proposal are tentative):
- `Transcript`: a subtype of text document that is always aligned to annotations in non-linguistic modalities (audio, vision) and represents a linguistic, "literal" transcript of the source modality (e.g. ASR, TR/OCR).
- `Translation`/`Transformation`/`Extraction`: a subtype of text document that is always aligned to another `TextDocument`-type annotation. The content of this annotation must be a kind of "re-writing" of the source text document (e.g. identity function in text-slicer, structural parsing in RFB, summary in text-summarizer apps).
- `Caption`: similar to `Transcript`, but the content is not a "literal" transcript of the source modality (e.g. an image-based captioning app, or an audio-based summarizer app that handles non-linguistic sounds like dog barking).
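
To make the distinctions above concrete, here is a rough sketch of the three proposed subtypes as a small lookup table. Everything in it is hypothetical: the `TextSubtype` descriptor, the `PROPOSED` table, and `is_subtype` are illustration only and not part of the vocabulary or any SDK, and `Translation` stands in for the three tentative names in the second bullet.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TextSubtype:
    """Hypothetical descriptor for one proposed vocabulary type."""
    name: str
    parent: Optional[str]    # parent type name, None for the root
    aligned_to: str          # what the annotation must be aligned to
    literal: Optional[bool]  # None where the proposal does not say


# Tentative names from the proposal; the subtypes do not exist in the vocabulary yet.
PROPOSED: Dict[str, TextSubtype] = {
    t.name: t
    for t in (
        TextSubtype("TextDocument", None, "n/a", None),
        TextSubtype("Transcript", "TextDocument", "non-linguistic modality", True),
        TextSubtype("Translation", "TextDocument", "TextDocument", None),
        TextSubtype("Caption", "TextDocument", "non-linguistic modality", False),
    )
}


def is_subtype(name: str, ancestor: str) -> bool:
    """Walk parent links to test whether `name` is `ancestor` or a descendant."""
    current: Optional[str] = name
    while current is not None:
        if current == ancestor:
            return True
        current = PROPOSED[current].parent
    return False
```

In this sketch, `Transcript` and `Caption` differ only in the `literal` flag, which is the distinction the proposal draws between them.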




### Related

Adding subtypes of text document would make it easier to identify "app patterns" without relying on a specific app name, and would therefore help generalize I/O specs for any downstream/consumer applications.
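
As a rough illustration of what this could look like for a consumer app, the sketch below selects annotations by type rather than by the name of the app that produced them. The dictionaries are simplified stand-ins and not the real MMIF schema; the `PARENT` map, the `select_by_type` helper, and the app names are all hypothetical.

```python
from typing import Dict, List

# Hypothetical parent map using the tentative names from this proposal.
PARENT: Dict[str, str] = {
    "Transcript": "TextDocument",
    "Translation": "TextDocument",
    "Caption": "TextDocument",
}


def select_by_type(annotations: List[dict], wanted: str) -> List[dict]:
    """Return annotations whose type is `wanted` or a direct subtype of it."""
    return [
        a for a in annotations
        if a["type"] == wanted or PARENT.get(a["type"]) == wanted
    ]


# Simplified stand-ins for annotations; not the real MMIF schema.
annotations = [
    {"id": "a1", "type": "Transcript", "app": "some-asr-app"},
    {"id": "a2", "type": "Caption", "app": "some-captioner-app"},
    {"id": "a3", "type": "TimeFrame", "app": "some-segmenter-app"},
]

# A consumer that needs literal transcripts asks for the type, not the app.
transcripts = select_by_type(annotations, "Transcript")
# A consumer that accepts any text can still ask for the parent type.
any_text = select_by_type(annotations, "TextDocument")
```

The consumer never inspects the `app` field, so swapping one ASR app for another would not require changing the selection logic.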

The issue of view pattern identification has been raised many times, including:

- https://github.com/clamsproject/clams-python/issues/50
- https://github.com/clamsproject/clams-python/issues/262
- https://github.com/clamsproject/clams-python/issues/77

### Alternatives

_No response_

### Additional context

Also see https://github.com/clamsproject/app-role-filler-binder-new/issues/4 for discussion on development of a prototype "app pattern". 

proposing subtypes of `TextDocument` #231