As a content admin, I want scholarship records exported to GitHub so that there is a publicly accessible, versioned copy of project data available for researchers. #1100

Closed
10 tasks done
rlskoeser opened this issue Sep 19, 2022 · 6 comments

@rlskoeser
Contributor

rlskoeser commented Sep 19, 2022

testing notes

test the following scenarios for exporting to CSV:

  • test exporting a subset of sources (scholarship records) from django admin
  • test exporting all sources from django admin
  • test exporting a subset of footnotes from django admin
  • test exporting all footnotes from django admin (likely slow because it includes transcription content; please report whether it finishes and whether the result is usable)

retesting notes

  • confirm that footnotes with no transcription content don't display "None" in the content field
  • confirm that opening the downloaded CSV in Excel automatically reads transcription content properly

dev notes

@rlskoeser rlskoeser added this to the CDH/PGP end of grant year 2 milestone Sep 19, 2022
@rlskoeser rlskoeser self-assigned this Nov 8, 2022
rlskoeser added a commit that referenced this issue Nov 9, 2022

* Rewrite source tabular export to use new exporter class

ref #1100 #278

* Document methods on new SourceQuerySet class

* Revise tests for admin csv export to not use mocks
@rlskoeser rlskoeser added the 🗜️ awaiting testing Implemented and ready to be tested label Nov 11, 2022
@kseniaryzhova

@rlskoeser so the footnote download did not include transcription content, only an indication of whether transcriptions exist or not, and the download was super quick (for all footnotes). Is the CSV for footnotes supposed to have actual transcription content?

@rlskoeser
Contributor Author

@kseniaryzhova I should have probably given you links, just to make sure we were talking about the same things.

Were you testing the footnotes export from this page? https://test-geniza.cdh.princeton.edu/admin/footnotes/footnote/

It looks to me like it does include the text content of the transcription — happy to remove it if it's not useful / important to have here or to have in this format.

And the sources export should be tested from this page: https://test-geniza.cdh.princeton.edu/admin/footnotes/source/

@kseniaryzhova

@rlskoeser I tried it again with the links you provided - sources look good, they have the URLs, etc. But I'm still not seeing the transcription content for footnotes.
image

@rlskoeser
Contributor Author

@kseniaryzhova so weird!

Would you try downloading the subset from this link and report back?
https://test-geniza.cdh.princeton.edu/admin/footnotes/footnote/?q=5454
select them and then use the action -> export selected to csv -> go

I'm getting None for content for the first one and transcription content for the second one. (We need to fix the None at least, but I'm still not sure if we should have the content here or not – if it continues to not work for you maybe we should drop it.)

I tried exporting all footnotes and I think it failed; it was definitely slow. So that slowness might be enough reason to drop the content from this export, since we have the content in the annotation backup repo already.

@rlskoeser
Contributor Author

discussed with @kseniaryzhova and finally figured out what's going on here:

  • Excel shows only one line per row by default, so it's hard to see whether there's any content (at most you probably only see an English-language label like recto or verso)
  • Excel is not reading the file as UTF-8, so transcription content shows up as garbage
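
To illustrate the second point, here is a minimal sketch of what happens when UTF-8 bytes are decoded with a legacy single-byte code page. The sample text and the choice of cp1252 are assumptions for illustration; which code page Excel actually falls back to depends on the system locale.

```python
# Hypothetical sample standing in for transcription content.
text = "שלום"

utf8_bytes = text.encode("utf-8")

# Assumption: without a byte order mark, the reader guesses a legacy
# single-byte encoding (cp1252 here) and decodes each byte separately.
# Bytes that cp1252 leaves undefined become the U+FFFD replacement character.
misread = utf8_bytes.decode("cp1252", errors="replace")

# Python's "utf-8-sig" codec prepends the byte order mark on encode
# and strips it again on decode.
with_bom = text.encode("utf-8-sig")
roundtrip = with_bom.decode("utf-8-sig")

print(repr(misread))
print(with_bom[:3])
```

Each Hebrew letter takes two bytes in UTF-8, so the misread string comes out twice as long as the original and contains no Hebrew at all.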

changes needed for this to be acceptable:
  • should not display None when there is no content
  • should have byte order mark so that Excel will automatically read as unicode
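
A rough sketch of both changes using only the Python standard library. This is not the project's exporter class; the function name, column names, and sample rows are hypothetical. Note that Python's `csv` module already writes `None` as an empty string, so a literal "None" in a cell would only appear if the value was converted to text before reaching the writer; the helper below covers that case as well.

```python
import csv
import io


def clean_cell(value):
    """Return an empty string for None; otherwise the value as text."""
    return "" if value is None else str(value)


def rows_to_csv_bytes(header, rows):
    """Serialize rows to CSV bytes that begin with a UTF-8 byte order mark."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(header)
    for row in rows:
        writer.writerow([clean_cell(value) for value in row])
    # "utf-8-sig" prepends the BOM (EF BB BF) so that spreadsheet
    # software can detect the encoding instead of guessing.
    return buffer.getvalue().encode("utf-8-sig")


# Hypothetical footnote rows: one without transcription content, one with.
data = rows_to_csv_bytes(
    ["location", "content"],
    [["recto", None], ["verso", "שלום"]],
)
```

Writing `data` to a file opened in binary mode (or returning it as the body of a download response) should give a CSV that opens with the correct encoding and leaves the content cell empty for the first row.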

@rlskoeser rlskoeser added ⚠️ tested needs attention Has been through acceptance testing and needs additional work 🗜️ awaiting testing Implemented and ready to be tested and removed 🗜️ awaiting testing Implemented and ready to be tested ⚠️ tested needs attention Has been through acceptance testing and needs additional work labels Nov 15, 2022
@kseniaryzhova

@rlskoeser works, closing!

@rlskoeser rlskoeser removed the 🗜️ awaiting testing Implemented and ready to be tested label Nov 21, 2022