Peabody Notecard Pipeline
The peabody archeological museum has thousands of typewritten notecards that index all objects in their possession. In order to help them with their indexing, I created an automatic OCR (Optical Character Recognition) pipeline. It takes all the notecards and converts them to an easily searchable csv. I further then helped in the creation of an internal, django website that allows for correction by workduty students. That code is not currently posted, but I can if interest is expressed — a photo of the website is below.
"Name": 'Butt of arrowhead',
"Site": 'Squibnocket Cliff.',
"Locality": 'Squibnocket Head, southwest side of Martha's Vineyard, Mass.',
"Situation": 'On sand under shell just south of stake 1.', "Remarks": '', "Figured": ''}