Annotation lists with different target specificities (word, line, paragraph) #758

Closed
glenrobson opened this issue Mar 24, 2016 · 10 comments

@glenrobson
Member

How do I share multiple versions of the same annotation list at different specificities? For example, I might have:

  • a word-level annotation list for harvesting by Europeana
  • a line-level annotation list for use in Mirador
  • a paragraph-level annotation list for OCR correction

I would like to link to each of these options so that the client can decide which ones to use.

@zimeon zimeon changed the title from "Annotation lists with different traget specificities (word, line, paragraph)" to "Annotation lists with different target specificities (word, line, paragraph)" Mar 24, 2016
@jpstroop
Member

The only known spec for hOCR:

https://docs.google.com/document/d/1QQnIQtvdAC_8n92-LhwPcjtAUFwBlzE8EWnKAxlgVf0/preview

The typesetting elements seem relevant, though they're not consistent with any hOCR output that I can find.

Sample:

    <div class='ocr_page' id='page_1' title='image "crowley/ah8_Vol01_0005_original.jpg"; bbox 0 0 5018 3668; ppageno 0'>
      <div class='ocr_carea' id='block_1_1' title="bbox 449 291 2388 3013">
        <p class='ocr_par' dir='ltr' id='par_1_1' title="bbox 474 291 2388 656">
          <span class='ocr_line' id='line_1_1' title="bbox 474 291 2388 372; baseline -0.009 -13">
            <span class='ocrx_word' id='word_1_1' lang='ita' title='bbox 474 356 476 359; x_wconf 83'>
              <strong>.</strong>
            </span>
            <span class='ocrx_word' dir='ltr' id='word_1_2' lang='ita' title='bbox 498 308 637 365; x_wconf 88'>
              <strong>detto,</strong>
            </span>
            <span class='ocrx_word' dir='ltr' id='word_1_3' lang='ita' title='bbox 649 307 675 357; x_wconf 90'>
              <strong>il</strong>
            </span>
            <span class='ocrx_word' dir='ltr' id='word_1_4' lang='ita' title='bbox 690 305 948 368; x_wconf 79'>
              <strong>_Sanl&#39;ouino,</strong>
            </span>
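
To make the granularity levels in the sample concrete, here is a minimal Python sketch (standard library only) that groups hOCR bounding boxes by level. The mapping from hOCR class names to labels such as "paragraph" or "word" is an assumption made for illustration, as are the helper names `boxes_by_level` and `xywh`; the `bbox x0 y0 x1 y1` ordering is taken from the sample above.

```python
import re
from html.parser import HTMLParser

BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")

# hOCR class name -> granularity label (labels are illustrative assumptions)
LEVELS = {
    "ocr_page": "page",
    "ocr_carea": "area",
    "ocr_par": "paragraph",
    "ocr_line": "line",
    "ocrx_word": "word",
}


class HocrBoxes(HTMLParser):
    """Collects (id, bbox) pairs for each granularity level found in hOCR."""

    def __init__(self):
        super().__init__()
        self.boxes = {level: [] for level in LEVELS.values()}

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        level = LEVELS.get(attr.get("class"))
        match = BBOX_RE.search(attr.get("title") or "")
        if level and match:
            bbox = tuple(int(n) for n in match.groups())
            self.boxes[level].append((attr.get("id"), bbox))


def boxes_by_level(hocr):
    """Parse an hOCR string and return {level: [(id, (x0, y0, x1, y1)), ...]}."""
    parser = HocrBoxes()
    parser.feed(hocr)
    return parser.boxes


def xywh(bbox):
    """Convert an (x0, y0, x1, y1) box to an xywh fragment for a canvas target."""
    x0, y0, x1, y1 = bbox
    return f"xywh={x0},{y0},{x1 - x0},{y1 - y0}"


sample = """
<p class='ocr_par' id='par_1_1' title="bbox 474 291 2388 656">
  <span class='ocr_line' id='line_1_1' title="bbox 474 291 2388 372; baseline -0.009 -13">
    <span class='ocrx_word' id='word_1_2' title='bbox 498 308 637 365; x_wconf 88'>detto,</span>
    <span class='ocrx_word' id='word_1_3' title='bbox 649 307 675 357; x_wconf 90'>il</span>
  </span>
</p>
"""

boxes = boxes_by_level(sample)
print(xywh(boxes["line"][0][1]))  # xywh=474,291,1914,81
```

Each level's boxes could then feed a separate annotation list, which is the situation the original question describes.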

@zimeon
Member

zimeon commented Mar 24, 2016

From discussion: suggest adding a hint (likely not called hintyMcHint) to an Annotation List that lists the resources, and their types, referred to by the set of annotations in that list. The hint would apply irrespective of whether a resource is referred to by the annotations, bodies or targets, and would not indicate the predicate connecting these resources. The context should define a set of strings as shorthand for the particular granularity use case (perhaps "page", "area", "paragraph", "line", "word"; see #764). An example might be:

"hintyMcHint" : [
  "line",                    // the annotations are at line level, expands to some URI
  { "@id": "http://example.org/canvas/1", 
    "@type": "sc:Canvas" },
  { "@id": "http://example.org/canvas/2", 
    "@type": "sc:Canvas" }  // canvas/1 and canvas/2 are annotated (see #754)
]
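
As a rough illustration of how a client might consume such a hint, the Python sketch below splits the array into granularity strings and typed resources, then picks the first list that advertises the wanted granularity. Everything here is hypothetical: `hintyMcHint` is only the placeholder name from the proposal, the function names and list URIs are invented for the example, and the set of granularity strings is the "perhaps" list given above.

```python
# Placeholder vocabulary from the proposal; not a defined IIIF term list.
GRANULARITIES = {"page", "area", "paragraph", "line", "word"}


def summarize_hint(annotation_list):
    """Return (granularity strings, {type: [resource ids]}) from the hint array."""
    granularities = []
    resources = {}
    for entry in annotation_list.get("hintyMcHint", []):
        if isinstance(entry, str):
            if entry in GRANULARITIES:
                granularities.append(entry)
        elif isinstance(entry, dict) and "@id" in entry:
            resources.setdefault(entry.get("@type"), []).append(entry["@id"])
    return granularities, resources


def pick_list(annotation_lists, wanted):
    """Return the first annotation list whose hint names the wanted granularity."""
    for annotation_list in annotation_lists:
        if wanted in summarize_hint(annotation_list)[0]:
            return annotation_list
    return None


# Hypothetical annotation lists for the same canvases at two granularities.
lists = [
    {
        "@id": "http://example.org/list/words",
        "hintyMcHint": [
            "word",
            {"@id": "http://example.org/canvas/1", "@type": "sc:Canvas"},
        ],
    },
    {
        "@id": "http://example.org/list/lines",
        "hintyMcHint": [
            "line",
            {"@id": "http://example.org/canvas/1", "@type": "sc:Canvas"},
            {"@id": "http://example.org/canvas/2", "@type": "sc:Canvas"},
        ],
    },
]

chosen = pick_list(lists, "line")
```

A viewer that wants line-level text would select the second list here, while a harvester asking for "word" would get the first.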

@azaroth42
Member

Propose deferring this along with all related issues.

@azaroth42 azaroth42 modified the milestone: Presentation 2.2 Apr 13, 2016
@azaroth42 azaroth42 removed the defer label Oct 7, 2016
@azaroth42 azaroth42 modified the milestones: Presentation 2.2, Presentation 3.0 Sep 20, 2017
@zimeon zimeon removed this from the Presentation 3.0 milestone Sep 20, 2017
@zimeon
Member

zimeon commented Sep 20, 2017

I have removed the Presentation 3.0 milestone for now because we don't know when the Text Granularity TSG will conclude with a recommendation.

@zimeon zimeon added this to the Presentation 3.0 milestone Sep 28, 2017
@azaroth42
Member

@zimeon Do you recall why we added it back into the 3.0 milestone?

@zimeon
Member

zimeon commented Nov 7, 2017

@azaroth42 I think the hope (and likelihood) is that the Text Granularity TSG will conclude with a recommendation in time for 3.0.

@azaroth42 azaroth42 removed this from the Presentation 3.0 milestone Apr 12, 2018
@glenrobson
Member Author

Please leave comments here on the proposed extension, which is available at:

https://preview.iiif.io/api/text-granularity/api/extension/text-granularity/

@zimeon
Member

zimeon commented Oct 9, 2019

TRC review: IIIF/trc#30

@zimeon
Member

zimeon commented May 13, 2020

Closing; this is handled by the Text Granularity Extension.

@zimeon zimeon closed this as completed May 13, 2020
@glenrobson
Member Author

glenrobson commented May 13, 2020

Should this be left open as a Search 3.0 issue? With the text-granularity extension we can say whether an annotation is at word, line or paragraph level, but we cannot say the same for an annotation list.
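
To illustrate the gap, a client currently has to inspect every annotation to learn what a list contains. The Python sketch below does that, assuming each annotation carries the extension's `textGranularity` property and that the list is a Presentation 3 style page with an `items` array. The function name, the "unspecified" fallback label and the example URIs are hypothetical.

```python
from collections import Counter


def list_granularities(annotation_page):
    """Count textGranularity values across the annotations in one page/list."""
    return Counter(
        annotation.get("textGranularity", "unspecified")
        for annotation in annotation_page.get("items", [])
    )


# Hypothetical annotation page mixing two granularities.
page = {
    "id": "http://example.org/page/1",
    "type": "AnnotationPage",
    "items": [
        {"id": "http://example.org/anno/1", "type": "Annotation", "textGranularity": "line"},
        {"id": "http://example.org/anno/2", "type": "Annotation", "textGranularity": "line"},
        {"id": "http://example.org/anno/3", "type": "Annotation", "textGranularity": "word"},
    ],
}

counts = list_granularities(page)
```

This only works after the whole list has been fetched, so it does not let a client choose between lists up front, which is the case this issue was opened for.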

7 participants