GitHub - harrylloyd-bl/convert-a-card: Generating ingestible records from catalogue cards as part of the Convert-a-Card project

This repo is part of the Convert-a-Card project to convert catalogue cards, primarily from the Asian and African Studies Reading Room at the BL. The code here consumes XML files produced by Transkribus transcription of the cards, parses the XML to extract each card's title, author and shelfmark, then queries OCLC WorldCat to see if a matching record exists.
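As a rough illustration of the extraction step, the sketch below pulls labelled text out of a small PAGE-style XML string using only the standard library. It is not the code in `src/data/xml_extraction.py`: the sample XML, the label names (`title`, `author`, `shelfmark`), the assumption that labels live in a `custom="structure {type:...;}"` attribute on each `TextRegion`, and the function name `extract_labelled_text` are all assumptions made for the example, and real Transkribus exports may be laid out differently.

```python
import re
import xml.etree.ElementTree as ET

# Assumed PAGE namespace; check it against the files you actually have.
NS = {"pc": "http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15"}

# Hypothetical, heavily simplified card transcription with placeholder values.
SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="card_0001.jpg">
    <TextRegion id="r1" custom="readingOrder {index:0;} structure {type:shelfmark;}">
      <TextEquiv><Unicode>ABC.123</Unicode></TextEquiv>
    </TextRegion>
    <TextRegion id="r2" custom="readingOrder {index:1;} structure {type:title;}">
      <TextEquiv><Unicode>Example Title</Unicode></TextEquiv>
    </TextRegion>
    <TextRegion id="r3" custom="readingOrder {index:2;} structure {type:author;}">
      <TextEquiv><Unicode>Example Author</Unicode></TextEquiv>
    </TextRegion>
  </Page>
</PcGts>"""

# Captures the label in e.g. "structure {type:title;}".
LABEL_RE = re.compile(r"structure\s*\{[^}]*type:([^;}]+)")


def extract_labelled_text(xml_text):
    """Return a {label: text} dict for every labelled TextRegion."""
    root = ET.fromstring(xml_text)
    fields = {}
    for region in root.iterfind(".//pc:TextRegion", NS):
        match = LABEL_RE.search(region.get("custom", ""))
        unicode_el = region.find("pc:TextEquiv/pc:Unicode", NS)
        # Skip regions that carry no label or no transcribed text.
        if match and unicode_el is not None and unicode_el.text:
            fields[match.group(1).strip()] = unicode_el.text.strip()
    return fields


card = extract_labelled_text(SAMPLE)
print(card)
```

The resulting dictionary is the kind of per-card record that could then be used to build a WorldCat query.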

The prototype for this repo was created by Giorgia Tolfo and Victoria Morris and has been developed to a working stage by Harry Lloyd. Readme by HL.

Structure

├── README.md           <- The top-level README for developers using this project.  
├── data  
│   ├── processed       <- The final, canonical data sets  
│   └── raw             <- The original, immutable data dump.  
│  
├── notebooks           <- Jupyter notebooks.  
│  
├── reports             <- Generated analysis as HTML, PDF, LaTeX, etc.  
│   └── figures         <- Generated graphics and figures to be used in reporting  
│  
├── hide_env.txt        <- Hidden conda/mamba environment file, so that Streamlit builds its environment from requirements.txt; recreate locally via e.g. conda env create -f environment.yml
├── requirements.txt    <- pip freeze environment solely for Streamlit use
│  
├── src                 <- Source code for use in this project.  
│   ├── __init__.py     <- Makes src a Python module  
│   │  
│   ├── data            <- Scripts to download or generate data  
│   │   ├── oclc.py     <- Wrappers for ZOOM queries against OCLC WorldCat
│   │   └── xml_extraction.py   <- Extract labelled text from XML files
│
├── tests               <- pytest unit tests for src  
│
└── z3950               <- combined PyZ3950 and PyMARC modules to run Z3950 queries and display MARC records
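
To give a feel for the kind of query the ZOOM wrappers might send, here is a minimal sketch that builds a Z39.50 prefix query (PQF) string from a card's title and author. Whether `oclc.py` uses PQF or another query syntax is not stated here, so treat that as an assumption; the function name `build_pqf_query` is hypothetical too. The Bib-1 use attributes are standard: 4 is title and 1003 is author.

```python
def _quote(term):
    """Wrap a search term in double quotes, escaping any embedded quotes."""
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_pqf_query(title=None, author=None):
    """Build a PQF string that ANDs together whichever fields are present."""
    clauses = []
    if title:
        clauses.append(f"@attr 1=4 {_quote(title)}")      # Bib-1 use attribute 4: title
    if author:
        clauses.append(f"@attr 1=1003 {_quote(author)}")  # Bib-1 use attribute 1003: author
    if not clauses:
        raise ValueError("at least one of title or author is required")
    # PQF is prefix notation: the operator comes before its two operands.
    query = clauses[0]
    for clause in clauses[1:]:
        query = f"@and {query} {clause}"
    return query


print(build_pqf_query(title="Example Title", author="Example Author"))
```

Building the query as a plain string keeps this step testable without a network connection; sending it to WorldCat would be left to the ZOOM connection code in the repo.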

