proposing subtypes of TextDocument #231

@keighrim


New Feature Summary

With a number of recent developments, I'd like to propose more vocab types that are subtypes of TextDocument (all names in this proposal are tentative).

  • Transcript: a subtype of text document that is always aligned to annotations in non-linguistic modalities (audio, vision) and represents a linguistic, "literal" transcript of the source modality (e.g. ASR, TR/OCR).
  • Translation/Transformation/Extraction: a subtype of text document that is always aligned to another TextDocument-type annotation. The content of this annotation must be a kind of "re-writing" of the source text document (e.g. the identity function in text-slicer, structural parsing in RFB, summaries in text-summarizer apps).
  • Caption: similar to Transcript, but the content is not a "literal" transcript of the source modality (e.g. an image-based captioning app, or an audio-based summarizer app that handles non-linguistic sounds like dog barking).
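
To make the alignment constraints in the list above concrete, here is a minimal sketch in plain Python. It does not use any existing SDK; `Modality`, `DerivedTextDocument`, `allowed_sources`, and `literal` are hypothetical names introduced only for illustration, and `Transformation` stands in for the tentative Translation/Transformation/Extraction type.

```python
from dataclasses import dataclass
from enum import Enum
from typing import ClassVar, FrozenSet


class Modality(Enum):
    AUDIO = "audio"
    VISION = "vision"
    TEXT = "text"


# Modalities the proposal calls "non-linguistic".
NON_LINGUISTIC = frozenset({Modality.AUDIO, Modality.VISION})


@dataclass
class TextDocument:
    text: str


@dataclass
class DerivedTextDocument(TextDocument):
    """Hypothetical helper base: a text document aligned to some source."""
    source_modality: Modality
    # Which source modalities a subtype may be aligned to (assumption).
    allowed_sources: ClassVar[FrozenSet[Modality]] = frozenset(Modality)

    def __post_init__(self):
        if self.source_modality not in self.allowed_sources:
            raise ValueError(
                f"{type(self).__name__} cannot be aligned to "
                f"{self.source_modality.value} sources"
            )


@dataclass
class Transcript(DerivedTextDocument):
    # Literal transcript of an audio or visual source (e.g. ASR, OCR).
    allowed_sources: ClassVar[FrozenSet[Modality]] = NON_LINGUISTIC
    literal: ClassVar[bool] = True


@dataclass
class Transformation(DerivedTextDocument):
    # Re-writing of another text document (slice, parse, summary).
    allowed_sources: ClassVar[FrozenSet[Modality]] = frozenset({Modality.TEXT})


@dataclass
class Caption(DerivedTextDocument):
    # Non-literal description of an audio or visual source.
    allowed_sources: ClassVar[FrozenSet[Modality]] = NON_LINGUISTIC
    literal: ClassVar[bool] = False
```

With this sketch, `Transcript("hello", Modality.AUDIO)` constructs normally, while `Transformation("hello", Modality.AUDIO)` raises `ValueError` because a re-writing must point at a text source. Whether such constraints belong in the vocabulary itself or in validation tooling is an open design question.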

Related

The addition of subtypes of text document will ease the identification of "app patterns" without relying on a specific app name, and hence help generalize I/O specs for any downstream/consumer applications.
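
As a rough illustration of that point, a consumer could select its inputs by document type alone. The sketch below uses plain dictionaries rather than any real SDK; the `type` and `app` keys, the type names, and the app names are all hypothetical placeholders.

```python
def select_documents(annotations, wanted_types):
    """Return annotations whose type is wanted, regardless of the producing app."""
    return [a for a in annotations if a["type"] in wanted_types]


# Hypothetical annotations produced by three unrelated apps.
annotations = [
    {"type": "Transcript", "app": "some-asr-app", "text": "hello world"},
    {"type": "Caption", "app": "some-captioning-app", "text": "a dog barking"},
    {"type": "Transformation", "app": "some-summarizer-app", "text": "a greeting"},
]

# A consumer that needs literal text from audio or video asks for Transcript
# and never has to know which app produced it.
literal_inputs = select_documents(annotations, {"Transcript"})
```

The consumer's I/O spec then names a type ("I accept Transcript") instead of enumerating every app that happens to emit suitable text.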

The issue of view pattern identification has been raised many times, including

Alternatives

No response

Additional context

Also see clamsproject/app-role-filler-binder#4 for discussion on development of a prototype "app pattern".
