As an admin, I want a way to reproducibly generate a full-text corpus of all public PPA content in order to support computational research on PPA materials. #556

quadrismegistus · 2023-10-30T18:29:36Z

Adapt Vineet's script to export plain text corpus

mnaydan · 2023-11-30T15:56:27Z

jerielizabeth · 2023-12-20T19:01:06Z

Skipped the test for works with no pages (id: uga1.32108002998303) because it was suppressed during staging setup.

jerielizabeth · 2023-12-20T20:04:18Z

Moving additional testing to the related bug issues. Any changes that require retesting this script should be batched because of the testing effort involved.

jerielizabeth · 2023-12-20T20:04:39Z

all tests passed!! 🎊

mnaydan mentioned this issue Nov 30, 2023

Complete first draft of script to pull full-text directly from Solr rather than local copy #561

Closed

quadrismegistus mentioned this issue Dec 1, 2023

Generate plain text corpus via export command #562

Merged

rlskoeser assigned quadrismegistus Dec 13, 2023

jerielizabeth closed this as completed Dec 20, 2023

rlskoeser changed the title ~~Adapt Vineet's script to export plain text corpus~~ As an admin, I want a way to reproducibly generate a full-text corpus of all public PPA content in order to support computational research on PPA materials. Jan 9, 2024

quadrismegistus commented Oct 30, 2023 • edited by rlskoeser

mnaydan commented Nov 30, 2023 • edited by jerielizabeth
