As a content editor, I want transcription formatting preserved in search result display but ignored for search text so that I can see where in the transcription matching terms are. #1049

Closed
10 tasks done
rlskoeser opened this issue Aug 31, 2022 · 9 comments

@rlskoeser
Contributor

rlskoeser commented Aug 31, 2022

testing notes

  • transcription line numbers (when present) should display in the search results by default (no keyword search)
  • transcription line numbers (when present) should display with keywords in context if match is near the beginning of the document (won't always display if your match is from the middle of the text; is this acceptable?)
  • transcription keywords in context matches display the correct line number when the match is from the middle of the text

dev notes

  • index html instead of plain text
  • update search result formatting to handle HTML lists instead of plain text with line breaks
  • calculate and add line number to html when indexing
    • parse the HTML to look for ol > li and use "start" attribute if present
    • add either a data attribute or value attribute to each li, doesn't matter which
  • revise css to display pseudo-marker based on line number attribute for search only
  • update styles in admin that are suppressing line numbers
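
A minimal sketch of the "calculate and add line number to html when indexing" step described in the dev notes, not the project's actual implementation. It uses only the Python standard library's `html.parser`; the names `LineNumberer`, `number_lines`, and `format_tag` are hypothetical, and the choice of the `value` attribute over a data attribute is arbitrary, since the notes say either would work. It also assumes the input is an HTML fragment (doctypes and processing instructions are dropped).

```python
from html import escape
from html.parser import HTMLParser


def format_tag(tag, attrs, closing=">"):
    """Serialize a tag name and (name, value) attribute pairs back to HTML."""
    parts = [tag]
    for name, value in attrs:
        # valueless attributes come through from the parser with value None
        parts.append(name if value is None else f'{name}="{escape(value, quote=True)}"')
    return "<" + " ".join(parts) + closing


class LineNumberer(HTMLParser):
    """Re-emit HTML, adding a value attribute to each li inside an ol."""

    def __init__(self):
        # keep character references as written instead of decoding them
        super().__init__(convert_charrefs=False)
        self.output = []
        # one entry per open list: next line number for ol, None for ul
        self.counters = []

    def handle_starttag(self, tag, attrs):
        attrs = list(attrs)
        if tag == "ol":
            # use the "start" attribute if present and numeric, else 1
            try:
                start = int(dict(attrs).get("start") or 1)
            except ValueError:
                start = 1
            self.counters.append(start)
        elif tag == "ul":
            self.counters.append(None)
        elif tag == "li" and self.counters and self.counters[-1] is not None:
            # leave an explicit value attribute alone if one already exists
            if "value" not in dict(attrs):
                attrs.append(("value", str(self.counters[-1])))
            self.counters[-1] += 1
        self.output.append(format_tag(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        self.output.append(format_tag(tag, attrs, " />"))

    def handle_endtag(self, tag):
        if tag in ("ol", "ul") and self.counters:
            self.counters.pop()
        self.output.append(f"</{tag}>")

    def handle_data(self, data):
        self.output.append(data)

    def handle_entityref(self, name):
        self.output.append(f"&{name};")

    def handle_charref(self, name):
        self.output.append(f"&#{name};")

    def handle_comment(self, data):
        self.output.append(f"<!--{data}-->")


def number_lines(html):
    """Return html with an explicit line number on every ol > li."""
    parser = LineNumberer()
    parser.feed(html)
    parser.close()
    return "".join(parser.output)


print(number_lines('<ol start="5"><li>a</li><li>b</li></ol>'))
# prints <ol start="5"><li value="5">a</li><li value="6">b</li></ol>
```

With the number stored on each `li`, a search-only stylesheet can display it as a pseudo-marker (for example via `attr(value)` in generated content), so a keyword-in-context snippet taken from the middle of a transcription still carries its own line number.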
@rlskoeser rlskoeser self-assigned this Aug 31, 2022
@rlskoeser rlskoeser added the 🗜️ awaiting testing Implemented and ready to be tested label Sep 1, 2022
@richmanrachel

richmanrachel commented Sep 1, 2022

transcription line numbers (when present) should display with keywords in context if match is near the beginning of the document (won't always display if your match is from the middle of the text; is this acceptable?)

  • I'm afraid this isn't acceptable -- it's really helpful as a researcher to have all the line numbers on the search page so that I can quickly jot down where to find the term in my spreadsheet/notes without having to go into the full document.

Also, it looks a bit messy and confusing to have a mixture of numbered and un-numbered entries:
image

@rlskoeser
Contributor Author

ok! I wondered if it might not be sufficient, but wanted to know for sure before doing more work. Now that we're using proper ordered lists, we don't have the sequence number on each line by default, so we'll have to do a little bit of extra work to put it in so we can display it on the search version.

@rlskoeser rlskoeser added ⚠️ tested needs attention Has been through acceptance testing and needs additional work and removed 🗜️ awaiting testing Implemented and ready to be tested labels Sep 1, 2022
@richmanrachel

Okay, thanks so much!

Also, just an oddity that's happening now with the formatting for results that are too far into the text for line numbers - there's a first line and then an indentation:
image
https://test-geniza.cdh.princeton.edu/en/documents/?q=%D7%9E%D7%9E%D7%9C%D7%95%D7%9B%D7%9A&docdate_0=&docdate_1=&sort=relevance

@kseniaryzhova

@rlskoeser transcriptions which have numbered lines on the public site do not have them on the admin site. None of the documents with transcriptions that I've checked (I've clicked through about 10) have numbered transcription lines in the admin.
image
image
PGPID 9044

@rlskoeser
Contributor Author

ah, good catch, we forgot to update some styles in admin that are suppressing the numbers

@blms blms self-assigned this Sep 13, 2022
blms added a commit that referenced this issue Sep 14, 2022
blms added a commit that referenced this issue Sep 14, 2022
blms added a commit that referenced this issue Sep 14, 2022
blms added a commit that referenced this issue Sep 14, 2022
Refactor organization: move class out of models, use utility function
ref #1049
rlskoeser added a commit that referenced this issue Sep 15, 2022
…-lines

Add line numbers to transcription lines in indexed html (#1049)
@blms blms removed the ⚠️ tested needs attention Has been through acceptance testing and needs additional work label Sep 15, 2022
@rlskoeser rlskoeser added the 🗜️ awaiting testing Implemented and ready to be tested label Sep 15, 2022
@kseniaryzhova

@rlskoeser transcription keywords in context do not display the line number when the match is from the middle of the text - I took the following example from the middle of line 10 but the correct line number does not appear in the search results (for the same document):
image

@blms
Contributor

blms commented Sep 19, 2022

@kseniaryzhova Could you comment with the PGPID and the search query you used so I can test this and try to figure out why it's wrong? Thanks!

@rlskoeser rlskoeser added this to the CDH/PGP end of grant year 2 milestone Sep 19, 2022
@kseniaryzhova

@blms sorry I only just saw this! It's PGPID 1223 and I used the name "הנכבד בן כגק"

@kseniaryzhova

@rlskoeser works for me!

@rlskoeser rlskoeser removed the 🗜️ awaiting testing Implemented and ready to be tested label Sep 23, 2022