Harvest data from ANU OAI-PMH #21

Closed
lalewis1 opened this issue Apr 16, 2024 · 4 comments

@lalewis1
Contributor

lalewis1 commented Apr 16, 2024

https://openresearch-repository.anu.edu.au/oai/request?verb=ListSets

Start with the ANU Thesis sets and then anything else with ANU in the title.

It is a DSpace server.
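A minimal sketch of the set selection described above: parse a ListSets response, keep the sets with ANU in the name, and put the thesis sets first. The sample XML, its set specs (`col_a`, `col_b`, `col_c`) and the function name are hypothetical, made up for illustration; only the standard OAI-PMH element names (`set`, `setSpec`, `setName`) are assumed from the protocol.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical ListSets response; the real one comes from the ListSets URL above.
SAMPLE_LIST_SETS = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListSets>
<set><setSpec>col_a</setSpec><setName>ANU Research Publications</setName></set>
<set><setSpec>col_b</setSpec><setName>Other Collection</setName></set>
<set><setSpec>col_c</setSpec><setName>ANU Theses</setName></set>
</ListSets></OAI-PMH>"""


def select_anu_sets(list_sets_xml: str) -> List[Tuple[str, str]]:
    """Return (setSpec, setName) pairs for sets with 'ANU' in the name, thesis sets first."""
    root = ET.fromstring(list_sets_xml)
    selected = []
    for set_el in root.iterfind(".//oai:set", OAI_NS):
        spec = set_el.findtext("oai:setSpec", default="", namespaces=OAI_NS)
        name = set_el.findtext("oai:setName", default="", namespaces=OAI_NS)
        if "ANU" in name:
            selected.append((spec, name))
    # Stable sort: names containing "thes" (Thesis/Theses) sort before the rest.
    selected.sort(key=lambda pair: "thes" not in pair[1].lower())
    return selected


for spec, name in select_anu_sets(SAMPLE_LIST_SETS):
    print(spec, name)
```

Matching on the substring "thes" is only a guess at how the thesis sets are named; adjust it to the actual set names returned by the server.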

@lalewis1 lalewis1 self-assigned this Apr 16, 2024
@lalewis1
Contributor Author

Data can be downloaded in batches per set as RDF using the following syntax:

https://openresearch-repository.anu.edu.au/oai/request?verb=ListRecords&resumptionToken=rdf///com_1885_1/0

Results are given in batches of 100, starting from the record offset specified at the end of the URL (0 here). com_1885_1 is the set identifier; sets are collections of records, such as the ANU Research set.
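The URL pattern above can be sketched as a small helper that builds one request URL per batch. The function names and the `total_records` argument are hypothetical. Note that the OAI-PMH specification treats resumption tokens as opaque, so building them by hand relies on the token layout this server happens to use; the more portable approach is to read the token out of each response, which `next_resumption_token` illustrates.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional

BASE_URL = "https://openresearch-repository.anu.edu.au/oai/request"
BATCH_SIZE = 100  # batch size reported in this thread
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def batch_url(set_spec: str, offset: int, prefix: str = "rdf") -> str:
    """Build the ListRecords URL for one batch of a set, starting at `offset`."""
    token = f"{prefix}///{set_spec}/{offset}"
    return f"{BASE_URL}?verb=ListRecords&resumptionToken={token}"


def batch_urls(set_spec: str, total_records: int) -> List[str]:
    """URLs covering records 0..total_records-1 in steps of BATCH_SIZE."""
    return [batch_url(set_spec, off) for off in range(0, total_records, BATCH_SIZE)]


def next_resumption_token(response_xml: str) -> Optional[str]:
    """Return the resumptionToken from a response, or None if absent or empty."""
    root = ET.fromstring(response_xml)
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=OAI_NS)
    return token.strip() or None


print(batch_url("com_1885_1", 0))
```

An empty or missing `resumptionToken` element signals the last batch, so a harvesting loop can stop when `next_resumption_token` returns `None`.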

@lalewis1
Contributor Author

Data harvested for all sets whose names start with ANU.
Initially the data has been converted using the same RDF mappings as given by the OAI-PMH server, with most things mapped as literals.

Mappings to be improved.

@lalewis1
Contributor Author

Mapped the dc:type literals to appropriate SDO.CreativeWork subclasses, bringing the data in line with the extracts from aries and the thesis dump.
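A rough sketch of what such a mapping could look like, using plain tuples for triples so it runs without an RDF library. The literal values in `TYPE_MAP`, the choice of the `dc/elements/1.1/` namespace, the fallback to `CreativeWork`, and the decision to replace the `dc:type` triple with an `rdf:type` triple are all assumptions for illustration, not the mapping actually used here. The schema.org classes named are real CreativeWork subclasses.

```python
from typing import Iterable, List, Tuple

Triple = Tuple[str, str, str]

SDO = "https://schema.org/"
DC_TYPE = "http://purl.org/dc/elements/1.1/type"  # assumed predicate IRI
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical dc:type literal values (lower-cased) and their schema.org classes.
TYPE_MAP = {
    "thesis": SDO + "Thesis",
    "journal article": SDO + "ScholarlyArticle",
    "book": SDO + "Book",
    "book chapter": SDO + "Chapter",
    "dataset": SDO + "Dataset",
}


def sdo_class_for(dc_type_literal: str) -> str:
    """Look up the schema.org class for a dc:type literal, defaulting to CreativeWork."""
    key = dc_type_literal.strip().lower()
    return TYPE_MAP.get(key, SDO + "CreativeWork")


def retype(triples: Iterable[Triple]) -> List[Triple]:
    """Replace each dc:type literal triple with an rdf:type triple; pass others through."""
    out = []
    for s, p, o in triples:
        if p == DC_TYPE:
            out.append((s, RDF_TYPE, sdo_class_for(o)))
        else:
            out.append((s, p, o))
    return out


# Hypothetical record IRI, for illustration only.
print(retype([("urn:example:record1", DC_TYPE, "Thesis")]))
```

Falling back to the generic `CreativeWork` class keeps unmapped literals typed rather than dropping them, which makes gaps in the table easy to find later.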

@lalewis1
Contributor Author

All pushed to GraphDB.
