vocab-apis

API resources

Here's a quick summary of the endpoints I tend to use, and some of their documentation.

vocabulary	endpoint	API Documentation
AAT	http://vocab.getty.edu/sparql	Getty Vocabularies: SPARQL endpoint
Europeana	https://www.europeana.eu/api/	Europeana Record API
FAST (read)	http://id.worldcat.org/fast	FAST Linked Data API
FAST (Autosuggest)	http://fast.oclc.org/searchfast/fastsuggest	FAST Linked Data API
FAST (SRUSearch)	http://id.worldcat.org/fast/search	FAST Linked Data API
FAST (search, actually the best results)	http://experimental.worldcat.org/fast/search	Not documented as an official endpoint, as far as I can tell
GeoNames	https://sws.geonames.org/	GeoNames Web Services Documentation
Internet Archive	http://archive.org/metadata/	Internet Archive Developer Portal
Library of Congress Authorities	http://id.loc.gov/authorities/	LOC Linked Data Service: Technical Center
VIAF	http://www.viaf.org/viaf/	VIAF Authority Cluster Resource

Python resources

python library	main purpose	docs
bs4	Parses HTML and XML	https://www.crummy.com/software/BeautifulSoup/bs4/doc/
pandas	Everything tabular data	https://pandas.pydata.org/docs/user_guide/index.html
requests	Sends HTTP requests	https://docs.python-requests.org/en/master/
rdflib	Parses RDF/XML, N3, NTriples, Turtle, etc.	https://rdflib.readthedocs.io/en/stable/index.html
xml.etree.ElementTree	Parses XML	https://docs.python.org/3/library/xml.etree.elementtree.html#module-xml.etree.ElementTree

Scripts

searchForStringMatch

authorizeFASTAndLCAuthorityHeadings.py

Starting data: A spreadsheet of string headings that have been searched in LCNAF and FAST, with a URI recorded wherever there was an exact match.
APIs: Library of Congress Authorities, FAST (SRUSearch)

Confirms the heading is authorized by retrieving the URIs and label from the APIs.

searchForStringMatchInFAST.py

Starting data: A spreadsheet with strings of possible FAST headings.
APIs: FAST (Autosuggest), FAST (SRUSearch)

Finds exact and close matches to FAST subject headings.
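As a rough illustration of the exact/close distinction, here is a minimal sketch that sorts candidate headings into "exact", "close", or "none". It is not the script's actual logic: the function name `classify_match`, the use of `difflib`, and the 0.85 threshold are all my own assumptions, and the candidate list stands in for whatever the FAST APIs return.

```python
from difflib import SequenceMatcher


def classify_match(query, candidates, threshold=0.85):
    """Return (kind, heading), where kind is 'exact', 'close', or 'none'."""
    normalized = query.strip().lower()
    best_heading, best_score = None, 0.0
    for heading in candidates:
        candidate = heading.strip().lower()
        # A case-insensitive identical string counts as an exact match.
        if candidate == normalized:
            return "exact", heading
        score = SequenceMatcher(None, normalized, candidate).ratio()
        if score > best_score:
            best_heading, best_score = heading, score
    # The threshold is an arbitrary illustrative cut-off, not a FAST rule.
    if best_score >= threshold:
        return "close", best_heading
    return "none", None


print(classify_match("Baltimor", ["Cats", "Baltimore"]))
```

A stricter or looser threshold changes how many headings end up needing manual review, so it is worth tuning against a sample of real data.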

getItemMetadata

getEuropeanaData.py

Starting data: Europeana item identifier as variable item.
APIs: Europeana

Downloads the item record in JSON-LD and saves it as the file "query.json".

getItemMetadataFromInternetArchive.py

Starting data: Internet Archive item identifier as variable internet_id.
APIs: Internet Archive

Downloads item record in JSON and saves metadata in CSV.
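The JSON-to-CSV step could look roughly like the sketch below. It assumes the record has a top-level "metadata" object whose values are either strings or lists, and it uses the standard library's `urllib` rather than `requests` so the example has no dependencies. The function names, the "|" separator, and the sample record are illustrative assumptions, not taken from the script.

```python
import csv
import io
import json
from urllib.request import urlopen


def fetch_item_record(internet_id):
    """Download the JSON record for one item (needs network access)."""
    with urlopen(f"https://archive.org/metadata/{internet_id}") as response:
        return json.load(response)


def metadata_to_csv(record, out_file):
    """Write the record's 'metadata' block as field,value rows."""
    writer = csv.writer(out_file)
    writer.writerow(["field", "value"])
    for field, value in record.get("metadata", {}).items():
        # Repeated fields arrive as lists; join them into one cell.
        if isinstance(value, list):
            value = "|".join(str(v) for v in value)
        writer.writerow([field, value])


# Hand-written sample record, not real Internet Archive data.
sample = {"metadata": {"identifier": "example_item", "subject": ["maps", "Baltimore"]}}
buffer = io.StringIO()
metadata_to_csv(sample, buffer)
print(buffer.getvalue())
```

With network access, `metadata_to_csv(fetch_item_record(internet_id), open_file)` would write the same layout for a real item.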

getPropertiesWikiData.py

Starting data: Entity ID from Wikidata.
APIs: Wikidata

Finds properties of entity and saves in CSV.
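To show what "finding properties" can mean in practice, here is a hedged sketch that flattens the claims of an entity into (property, value) rows ready for a CSV writer. It assumes the entity JSON has the shape served by Wikidata's Special:EntityData pages (for example https://www.wikidata.org/wiki/Special:EntityData/Q42.json). The function name `claims_to_rows` and the sample entity "Q0" are hypothetical.

```python
def claims_to_rows(entity_json, entity_id):
    """Flatten an entity's claims into (property_id, value) rows."""
    claims = entity_json["entities"][entity_id].get("claims", {})
    rows = []
    for property_id, statements in claims.items():
        for statement in statements:
            snak = statement.get("mainsnak", {})
            # 'novalue' and 'somevalue' snaks carry no datavalue.
            datavalue = snak.get("datavalue")
            if datavalue is None:
                continue
            value = datavalue["value"]
            # Item-valued claims are dicts that carry the target's Q-id.
            if isinstance(value, dict) and "id" in value:
                value = value["id"]
            rows.append((property_id, value))
    return rows


# Hand-written sample; "Q0" is a placeholder, not a real entity.
sample_entity = {
    "entities": {
        "Q0": {
            "claims": {
                "P31": [{"mainsnak": {"snaktype": "value",
                                      "datavalue": {"value": {"id": "Q5"}}}}],
                "P569": [{"mainsnak": {"snaktype": "novalue"}}],
            }
        }
    }
}
print(claims_to_rows(sample_entity, "Q0"))
```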

Convert

convertFASTAndVIAFIdentifiersToURI.py

Starting data: A spreadsheet with FAST or VIAF identifiers.
APIs: none

Converts FAST and VIAF identifiers to URIs.
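Since no API is involved, the conversion is plain string handling. The sketch below assumes FAST identifiers arrive either as bare numbers or with an "fst" prefix and leading zeros, and that the URI bases match the endpoints listed in the table above. The function name and the FAST identifier in the example are illustrative, not taken from the script.

```python
import re

FAST_BASE = "http://id.worldcat.org/fast/"
VIAF_BASE = "https://viaf.org/viaf/"


def identifier_to_uri(identifier, vocabulary):
    """Build a URI from a FAST or VIAF identifier."""
    identifier = str(identifier).strip()
    if vocabulary == "fast":
        # Assumed input shapes: 'fst01204155' or '1204155'.
        digits = re.sub(r"^fst", "", identifier, flags=re.IGNORECASE).lstrip("0")
        return FAST_BASE + digits
    if vocabulary == "viaf":
        return VIAF_BASE + identifier
    raise ValueError(f"unknown vocabulary: {vocabulary!r}")


print(identifier_to_uri("fst01204155", "fast"))
print(identifier_to_uri(149920363, "viaf"))
```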

convertLCNAFToGeonames.py

Starting data: A spreadsheet with geographic headings (from FAST or LCNAF).
APIs: FAST (read), Library of Congress Authorities, GeoNames

Converts geographic names from LCNAF to GeoNames identifiers. Example: Baltimore County (Md.) n79018713 is converted to Baltimore County https://www.geonames.org/4347790. It also builds the full hierarchical name from GeoNames: Baltimore County, Maryland, United States.
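Building the hierarchical name can be sketched as follows, assuming the hierarchy comes back as a list ordered from broadest to most specific, with a "name" and a feature code ("fcode") on each entry. The function name, the choice to skip the "AREA" and "CONT" levels, and the sample list are my assumptions for illustration.

```python
def build_hierarchical_name(hierarchy, skip_codes=("AREA", "CONT")):
    """Join place names from most specific to least specific."""
    kept = [
        place["name"]
        for place in hierarchy
        # Drop levels above the country (assumed codes for Earth/continent).
        if place.get("fcode") not in skip_codes
    ]
    return ", ".join(reversed(kept))


# Hand-written sample shaped like a hierarchy response, broadest first.
sample_hierarchy = [
    {"name": "Earth", "fcode": "AREA"},
    {"name": "North America", "fcode": "CONT"},
    {"name": "United States", "fcode": "PCLI"},
    {"name": "Maryland", "fcode": "ADM1"},
    {"name": "Baltimore County", "fcode": "ADM2"},
]
print(build_hierarchical_name(sample_hierarchy))
# -> Baltimore County, Maryland, United States
```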

convertLSCHToFAST.py

Starting data: A spreadsheet with Library of Congress Subject Headings.
APIs: FAST (read), Library of Congress Authorities

Converts LCSH to one or more FAST headings.

convertYearsToFASTDecades.py

Starting data: A spreadsheet with year dates from 1800s onwards.
APIs: none

Converts years into written out decades as given in FAST.
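A minimal sketch of the year-to-decade step, assuming the written-out form looks like "Nineteen sixties". I have not reproduced the exact strings FAST uses, so the first decade of each century and anything from 2000 onwards simply return None here. The names below are illustrative.

```python
CENTURY_WORDS = {18: "Eighteen", 19: "Nineteen"}
DECADE_WORDS = {
    1: "tens", 2: "twenties", 3: "thirties", 4: "forties", 5: "fifties",
    6: "sixties", 7: "seventies", 8: "eighties", 9: "nineties",
}


def year_to_decade_label(year):
    """Return e.g. 'Nineteen sixties' for 1965, or None if not covered."""
    century, remainder = divmod(int(year), 100)
    century_word = CENTURY_WORDS.get(century)
    decade_word = DECADE_WORDS.get(remainder // 10)
    # Decades this sketch does not know how to spell are left unconverted.
    if century_word is None or decade_word is None:
        return None
    return f"{century_word} {decade_word}"


print(year_to_decade_label(1965))
```

Returning None for uncovered years keeps those rows easy to filter out and check by hand against FAST.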

getAdditionalPropertiesFromIdentifiers

getAlternativeIdentifiersFromFAST.py

Starting data: A spreadsheet with FAST identifiers.
APIs: FAST (read)

Retrieves alternative identifiers from other authorities (VIAF, GeoNames, LCSH, etc.) given in FAST records.

getFacetForTerms_FAST.py

Starting data: A spreadsheet with FAST identifiers.
APIs: FAST (read)

Converts the FAST identifier to a link, gets the rdf.xml record, and extracts the facet information (topical, geographical, corporate name, meeting or event, personal name, uniform title, form, period).
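The extraction step might look something like this with `xml.etree.ElementTree`. It rests on an assumption I have not verified against the script: that the facet appears in the record as a `skos:inScheme` reference whose URI ends in `#facet-<Name>`. The sample record is hand-written with a placeholder identifier, not copied from FAST.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}
RDF_RESOURCE = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}resource"


def extract_facets(rdf_xml):
    """Return facet names found in skos:inScheme references."""
    root = ET.fromstring(rdf_xml)
    facets = []
    for scheme in root.iterfind(".//skos:inScheme", NS):
        resource = scheme.get(RDF_RESOURCE, "")
        # Assumed pattern: '...#facet-Geographic' marks the facet.
        if "#facet-" in resource:
            facets.append(resource.split("#facet-", 1)[1])
    return facets


# Hand-written sample with a placeholder identifier.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <rdf:Description rdf:about="http://id.worldcat.org/fast/0000000">
    <skos:inScheme rdf:resource="http://id.worldcat.org/fast/ontology/1.0/#fast"/>
    <skos:inScheme rdf:resource="http://id.worldcat.org/fast/ontology/1.0/#facet-Geographic"/>
    <skos:prefLabel>Example place</skos:prefLabel>
  </rdf:Description>
</rdf:RDF>"""

print(extract_facets(SAMPLE))
```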

getFacetForTerms_VIAF.py

Starting data: A spreadsheet of VIAF URIs formatted like https://viaf.org/viaf/149920363. The script won't work if there is a trailing slash (ex: https://viaf.org/viaf/149920363/).
APIs: VIAF, Library of Congress Authorities

Takes a list of VIAF URIs from a spreadsheet, finds the LCNAF authority record, and extracts the facet information from the rdf.xml record.
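Because the script expects URIs without a final "/", it can help to clean the spreadsheet column first. This is a small sketch of that clean-up, not part of the script itself; the function name is hypothetical.

```python
def normalize_viaf_uri(uri):
    """Strip surrounding whitespace and any trailing '/' from a VIAF URI."""
    return str(uri).strip().rstrip("/")


print(normalize_viaf_uri("https://viaf.org/viaf/149920363/"))
# -> https://viaf.org/viaf/149920363
```

With pandas, the same function could be applied to a whole column via `Series.map` before running the script.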

getLabelFromURI.py

Starting data: A spreadsheet with URIs from the FAST, Library of Congress Authorities, GeoNames, VIAF, or AAT vocabularies.
APIs: FAST (read), Library of Congress Authorities, GeoNames, VIAF, AAT

Retrieves the authorized heading or label from the correct vocabulary using the URIs.
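Choosing "the correct vocabulary" from a URI can be sketched as a lookup on the hostname. The hosts below are the ones that appear in the endpoint table and examples in this README. The mapping and function name are illustrative assumptions about how such a dispatch might work, not the script's code.

```python
from urllib.parse import urlparse

HOST_TO_VOCABULARY = {
    "id.worldcat.org": "FAST",
    "id.loc.gov": "Library of Congress Authorities",
    "sws.geonames.org": "GeoNames",
    "www.geonames.org": "GeoNames",
    "viaf.org": "VIAF",
    "www.viaf.org": "VIAF",
    "vocab.getty.edu": "AAT",
}


def vocabulary_for_uri(uri):
    """Return the vocabulary name for a URI, or None if the host is unknown."""
    host = urlparse(uri.strip()).netloc.lower()
    return HOST_TO_VOCABULARY.get(host)


print(vocabulary_for_uri("https://viaf.org/viaf/149920363"))
```

Returning None for unknown hosts lets the caller skip or flag rows instead of sending a request to the wrong API.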

getNameComponents_VIAF.py

Starting data: A spreadsheet of VIAF URIs formatted like https://viaf.org/viaf/149920363. The script won't work if there is a trailing slash (ex: https://viaf.org/viaf/149920363/).
APIs: VIAF, Library of Congress Authorities

Takes a list of VIAF URIs from a spreadsheet, finds the LCNAF authority record, and extracts the name components from the .marcxml.xml record.

jhu-library-applications/vocab-apis

Folders and files

Latest commit

History

Repository files navigation

vocab-apis

API resources

Python resources

Scripts

searchForStringMatch

getItemMetadata

Convert

getAdditionalPropertiesFromIdentifiers

Get URIs from authorized headings or labels

About

Topics

Resources

Stars

Watchers

Forks

Languages