MITLibraries/archivesspace-api-python-scripts

Python scripts used to perform various tasks with the ArchivesSpace API.

Authenticating to the API

All of these scripts require a secrets.py file in the same directory that must contain the following text:

baseURL='[ArchivesSpace API URL]'
user='[user name]'
password='[password]'
repository='[repository]'

This secrets.py file will be ignored according to the repository's .gitignore file so that ArchivesSpace login details will not be inadvertently exposed through GitHub.
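For orientation, the values in secrets.py are what a script needs in order to log in. The sketch below is not taken from these scripts. It assumes the standard ArchivesSpace login route (a POST to /users/[user name]/login with a password parameter, answered by a JSON body containing a 'session' token that is then sent in the X-ArchivesSpace-Session header) and uses only the standard library. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_login_request(base_url, user, password):
    """Build (but do not send) the login request for an ArchivesSpace backend."""
    url = f"{base_url.rstrip('/')}/users/{urllib.parse.quote(user)}/login"
    body = urllib.parse.urlencode({"password": password}).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def session_headers(session_token):
    """Headers to send with every later request once logged in."""
    return {
        "X-ArchivesSpace-Session": session_token,
        "Content-Type": "application/json",
    }


def log_in(base_url, user, password):
    """Send the login request and return headers for later calls.

    This needs a reachable ArchivesSpace server, so it is not called here.
    """
    request = build_login_request(base_url, user, password)
    with urllib.request.urlopen(request) as response:
        token = json.load(response)["session"]
    return session_headers(token)
```

With a secrets.py like the one above, a call would look like log_in(secrets.baseURL, secrets.user, secrets.password).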

If you are using both a development server and a production server, you can create a second secrets file with a different name (e.g. secretsProd.py) containing the production server information. When you run any of these scripts, you will be prompted to enter the name of an alternate secrets file (e.g. 'secretsProd', without '.py'). If you skip the prompt or mistype the file name, the script defaults to the information in the secrets.py file. This ensures that you access the production server only when you really intend to.
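The fallback behavior described above can be sketched roughly as follows. This is an illustration under assumptions, not the scripts' actual code: load_secrets is a hypothetical helper, and the prompt wording in the usage note is invented.

```python
import importlib


def load_secrets(requested, default="secrets"):
    """Import the requested secrets module, falling back to the default one.

    An empty answer (prompt skipped) or a name that cannot be imported
    (mistyped file name) both end up at the default module.
    """
    if requested:
        try:
            return importlib.import_module(requested)
        except ImportError:
            pass  # mistyped or missing file: fall through to the default
    return importlib.import_module(default)
```

A script would call it as secrets = load_secrets(input('Alternate secrets file name: ')) and then read secrets.baseURL, secrets.user, secrets.password, and secrets.repository.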

Scripts

addBibNumbersAndPost.py

Based on a specified CSV file of URIs and bib numbers, posts each bib number to the ['user_defined']['real_1'] field of the record specified by the URI.
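As a minimal sketch of the record change involved, the helpers below read URI and bib number pairs and set ['user_defined']['real_1'] on a record that has already been retrieved. The column names 'uri' and 'bib' and both function names are assumptions for illustration; the actual script may lay out its CSV differently.

```python
import csv
import io


def read_uri_bib_pairs(csv_text):
    """Yield (uri, bib number) pairs from CSV text.

    Assumes a header row with 'uri' and 'bib' columns (hypothetical names).
    """
    for row in csv.DictReader(io.StringIO(csv_text)):
        yield row["uri"], row["bib"]


def set_bib_number(record, bib_number):
    """Return a copy of the record with user_defined.real_1 set.

    Other user-defined fields are kept; the input record is not modified.
    """
    updated = dict(record)
    user_defined = dict(updated.get("user_defined") or {})
    user_defined["real_1"] = bib_number
    updated["user_defined"] = user_defined
    return updated
```

The updated record would then be posted back to its own URI.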

dateCheck.py

Retrieves 'begin,' 'end,' 'expression,' and 'date_type' for all dates associated with all resources in a repository.
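A rough sketch of the extraction step, assuming each resource's JSON carries its dates in a 'dates' array (as ArchivesSpace resource records do). The function name date_rows and the choice to include the resource URI in each row are assumptions.

```python
def date_rows(resource):
    """Return one row per date on a resource: uri, begin, end, expression, date_type.

    Missing keys become empty strings, which makes gaps easy to spot.
    """
    fields = ("begin", "end", "expression", "date_type")
    return [
        [resource.get("uri", "")] + [date.get(field, "") for field in fields]
        for date in resource.get("dates", [])
    ]
```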

eadToCsv.py

Based on a specified file name and a specified file path, extracts selected elements from an EAD XML file and prints them to a CSV file.

getAccessionUDFs.py

Retrieves all of the user-defined fields from all of the accessions in the specified repository.

getAccessions.py

Retrieves all of the accessions from a particular repository into a JSON file.

getAllArchivalObjectTitles.py

Retrieves titles from all archival objects in a repository. Upon running the script, you will be prompted to enter the resource ID (just the number, not the full URI).

getArchivalObjectCountByResource.py

Retrieves a count of archival objects associated with a particular resource. Upon running the script, you will be prompted to enter the resource ID (just the number, not the full URI).

getArchivalObjectsByResource.py

Extracts all of the archival objects associated with a particular resource. Upon running the script, you will be prompted to enter the resource ID (just the number, not the full URI).

getArchivalObjectRefIdsForResource.py

Extracts the title, URI, ref_id, date expression, and level for all archival objects associated with a particular resource. Upon running the script, you will be prompted to enter the resource ID (just the number, not the full URI).

getArrayPropertiesFromAgentsPeopleCSV.py

Retrieves specific properties, including properties that have arrays as values, from the JSON of ArchivesSpace agent_people records. In this example, the 'dates_of_existence' property contains an array that must be iterated over. This requires a second level of iteration with 'for j in range (...)' on line 20, in addition to the 'for i in range (...)' loop on line 19 that is also found in the getPropertiesFromAgentsPeopleCSV.py script. Like that script, it writes the properties' values into a CSV file, which is specified in the variable 'f' on line 17.
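The two levels of iteration can be sketched as below. This is an illustration rather than the script itself: it uses plain for loops where the script uses 'for i in range (...)' and 'for j in range (...)', the function name is hypothetical, and pulling 'expression' out of each date is an assumed choice of property.

```python
def existence_date_rows(agents):
    """Return one [uri, expression] row per entry in each agent's dates array.

    Agents with an empty or missing 'dates_of_existence' array produce no rows.
    """
    rows = []
    for agent in agents:  # outer loop: one pass per agent record
        for date in agent.get("dates_of_existence", []):  # inner loop: one pass per date
            rows.append([agent.get("uri", ""), date.get("expression", "")])
    return rows
```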

getPropertiesFromAgentsPeopleCSV.py

Retrieves specific properties from the JSON of ArchivesSpace agent_people records into a CSV file, which is specified in the variable 'f' on line 17. In this example, the script retrieves the 'uri,' 'sort_name,' 'authority_id,' and 'names' properties by iterating through the JSON records with the loop 'for i in range (...)' on line 19. The f.writerow(....) call on line 20 specifies which properties are retrieved from the JSON, and the f.writerow(....) call on line 18 specifies the header row of the CSV file.
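A minimal sketch of that header-then-rows pattern follows. It writes to any open text stream instead of a fixed file name, and the function name write_agents_csv is hypothetical; the property names are the ones listed above.

```python
import csv
import io


def write_agents_csv(agents, out):
    """Write a header row, then one row of selected properties per agent record."""
    columns = ["uri", "sort_name", "authority_id", "names"]
    f = csv.writer(out)
    f.writerow(columns)  # header row
    for i in range(len(agents)):  # one row per JSON record
        f.writerow([agents[i].get(column, "") for column in columns])


# Example run against an in-memory stream instead of a file on disk.
example_output = io.StringIO()
write_agents_csv(
    [{"uri": "/agents/people/1", "sort_name": "Doe, Jane", "authority_id": "n1", "names": []}],
    example_output,
)
```

Array-valued properties such as 'names' are written as their Python string form here; splitting them into separate cells needs the second loop used in getArrayPropertiesFromAgentsPeopleCSV.py.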

getPropertiesFromResources.py

Extracts select properties from all resources in the repository.

getPropertiesFromSingleResource.py

Based on user input, extracts select properties from the specified resource.

getResources.py

Retrieves all of the resources from a particular repository into a JSON file which is specified in variable 'f' on line 16. This GET script can be adapted to other record types by editing the 'endpoint' variable on line 13 (e.g. 'repositories/[repo ID]/accessions' or 'agents/corporate_entities').
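The general shape of such a GET script can be sketched as follows, assuming the ArchivesSpace convention that a listing endpoint returns every record ID when called with all_ids=true. Here get_json is a hypothetical callable that performs an authenticated GET against the base URL and returns parsed JSON; it is passed in so the logic can be tried without a server.

```python
import json


def fetch_all(endpoint, get_json):
    """Fetch every record under an endpoint such as '/repositories/2/resources'.

    First asks for the list of IDs, then requests each record individually.
    """
    ids = get_json(endpoint + "?all_ids=true")
    return [get_json(f"{endpoint}/{record_id}") for record_id in ids]


def save_records(records, path):
    """Write the collected records to a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(records, handle, indent=2)
```

Switching record types then only means changing the endpoint string, for example to '/agents/corporate_entities'.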

getSingleRecord.py

Based on user input, retrieves the single ArchivesSpace record with the specified 'uri.'

getTopContainerCountByResource.py

Retrieves a count of top containers associated with archival objects associated with a particular resource. Upon running the script, you will be prompted to enter the resource ID (just the number, not the full URI).

getTopContainerCountByResourceNoAOs.py

Retrieves a count of top containers directly associated (not through an archival object) with a particular resource. Upon running the script, you will be prompted to enter the resource ID (just the number, not the full URI).

getTopContainers.py

Retrieves all of the top containers from a particular repository into a JSON file.

getUrisAndIds.py

For the specified record type, retrieves the URI, the 'id_0,' 'id_1,' 'id_2,' and 'id_3' fields, and a concatenated version of all the 'id' fields.
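One way the concatenated identifier could be built is sketched below. The hyphen separator and the decision to skip empty parts are assumptions, as is the function name; the script may join the fields differently.

```python
def concatenated_id(record, separator="-"):
    """Join id_0 through id_3, skipping parts that are missing or empty."""
    parts = [record.get(f"id_{n}") for n in range(4)]
    return separator.join(part for part in parts if part)
```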

modifyDigitalObjectUrls.py

Based on user input, replaces a string in the URLs in both the 'Identifier' and 'File URI' fields for digital objects across the repository.
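A minimal sketch of the replacement on one digital object record, assuming the 'Identifier' field corresponds to 'digital_object_id' in the JSON and 'File URI' to 'file_uri' inside each entry of 'file_versions'. The function name is hypothetical.

```python
import copy


def replace_in_urls(digital_object, old, new):
    """Return a copy of the record with `old` replaced by `new` in its URLs.

    Touches the identifier and every file version's URI; the input is unchanged.
    """
    updated = copy.deepcopy(digital_object)
    if "digital_object_id" in updated:
        updated["digital_object_id"] = updated["digital_object_id"].replace(old, new)
    for file_version in updated.get("file_versions", []):
        if "file_uri" in file_version:
            file_version["file_uri"] = file_version["file_uri"].replace(old, new)
    return updated
```

Comparing the returned record with the original before posting is a cheap way to skip records the replacement did not change.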

postContainersFromCSV.py

Creates instances (consisting of top_containers) from a separate CSV file. The CSV file should have two columns, indicator and barcode. The directory where this file is stored must match the directory in the filePath variable. The script will prompt you first for the exact name of the CSV file, and then for the exact resource or accession to attach the containers to.
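As a rough illustration of turning the two-column CSV into top container records, the sketch below builds one JSON-ready dictionary per row. It assumes the file has no header row, that the columns appear in the order indicator, barcode, and that every container has the same type ('box' by default); the function name is hypothetical.

```python
import csv
import io


def top_containers_from_csv(csv_text, container_type="box"):
    """Build one top_container dictionary per CSV row (indicator, barcode)."""
    containers = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue  # skip blank or incomplete lines
        containers.append({
            "jsonmodel_type": "top_container",
            "type": container_type,
            "indicator": row[0].strip(),
            "barcode": row[1].strip(),
        })
    return containers
```

Each dictionary would be posted to the repository's top_containers endpoint, and the URI returned for it is what an instance on the resource or accession then points to.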

postContainerLinksToRecords.py

Based on user input, posts containers to a specified record based on a specified CSV file.

postContainerLinksToRecordsFromCSV.py

Based on user input, posts containers to a specified record based on a specified CSV file of top container and resource URIs.

postCorporateAgentsFromCSV.py

Based on user input, posts corporate agents based on a specified CSV file.

postFamilyAgentsFromCSV.py

Based on user input, posts family agents based on a specified CSV file.

postNew.py

Posts new records to a generic API endpoint based on the record type, 'agents/people' in this example. This script can be modified to accommodate other data types (e.g. 'repositories/[repo ID]/resources' or 'agents/corporate_entities'). It requires a properly formatted JSON file (specified where [JSON File] appears in the 'records' variable on line 13) for the particular ArchivesSpace record type you are trying to post.

postOverwrite.py

Overwrites existing ArchivesSpace records based on the 'uri' and can be used with any ArchivesSpace record type (e.g. resource, accession, subject, agent_people, agent_corporate_entity, archival_object, etc.). It requires a properly formatted JSON file (specified where [JSON File] appears in the 'records' variable on line 13) for the particular ArchivesSpace record type you are trying to post.
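The overwrite step amounts to posting each record's JSON back to its own 'uri'. The sketch below only builds the request so it can be inspected without a server; the function name is hypothetical, and headers is assumed to be a dictionary that already carries a valid X-ArchivesSpace-Session token.

```python
import json
import urllib.request


def build_overwrite_request(base_url, record, headers):
    """Build (but do not send) a POST of the record to its own URI."""
    url = base_url.rstrip("/") + record["uri"]
    body = json.dumps(record).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

Sending it with urllib.request.urlopen(request) and printing the JSON response for each record gives a simple log of what was overwritten.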
