
OllyButters/flatten-bl-xml


flatten-bl-xml

Prototype for flattening the XML produced from books scanned by the British Library.

This was part of the JISC-funded AMASED project (2015); see doi:10.6084/m9.figshare.1319503.v4 and doi:10.6084/m9.figshare.1480941.v6 for more info.

The idea here is to take an XML file representing a scanned page (ALTO format?), parse it, build the relevant table structures in opal using its API, and then import all of the data into those tables. This is a proof of principle, so it needs some tidying up (proper passwords, better error handling, etc.) before it could be used in production, but it worked on the sample of books we had.
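As a rough illustration of the parsing step only, here is a minimal sketch of flattening one page into rows, one row per word. It is not the code in this repository. It assumes the input really is ALTO (which the description above is unsure about), so the element names (`Page`, `TextLine`, `String`) and attributes (`CONTENT`, `HPOS`, `VPOS`) come from the ALTO schema, while `SAMPLE_ALTO`, `flatten_alto`, `local_name` and the row keys are hypothetical names made up for the example. The opal side (creating tables and importing) is left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written sample page; real BL files will be much larger.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout>
    <Page ID="P1">
      <PrintSpace>
        <TextBlock ID="B1">
          <TextLine ID="L1">
            <String CONTENT="British" HPOS="10" VPOS="20"/>
            <SP/>
            <String CONTENT="Library" HPOS="75" VPOS="20"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""


def local_name(tag):
    """Strip the '{namespace}' prefix so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def flatten_alto(xml_text):
    """Return a list of dicts, one per word (ALTO String element)."""
    root = ET.fromstring(xml_text)
    rows = []
    for page in root.iter():
        if local_name(page.tag) != "Page":
            continue
        for line in page.iter():
            if local_name(line.tag) != "TextLine":
                continue
            # Only String children are words; SP (space) elements are skipped.
            words = [el for el in line if local_name(el.tag) == "String"]
            for word_index, word in enumerate(words):
                rows.append({
                    "page_id": page.get("ID"),
                    "line_id": line.get("ID"),
                    "word_index": word_index,
                    "content": word.get("CONTENT"),
                    "hpos": word.get("HPOS"),
                    "vpos": word.get("VPOS"),
                })
    return rows


if __name__ == "__main__":
    for row in flatten_alto(SAMPLE_ALTO):
        print(row)
```

Each dict is a flat record that could become one row of a table, with the page and line IDs standing in for the nesting that the XML expressed.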

See https://github.com/obiba/opal for info on opal.

Olly Butters

About

First attempt at flattening a BL XML doc for opal
