# Querying metadata from DataCite via DOI
[DataCite website](https://datacite.org/)  
[Manual search](https://commons.datacite.org/)  

- Search for "git" in "Works"
    - Filter: Dataset
- Under **Downloads**, open the DataCite JSON (in Firefox)
- Inspect the dataset

[DataCite REST API](https://support.datacite.org/docs/api)  


In [26]:
import urllib.request
import json

base_url = "https://api.datacite.org/application/vnd.datacite.datacite+json/"
doi = "10.5281/zenodo.5563162"
full_url = base_url + doi

In [27]:
doi_json_dataset = urllib.request.urlopen(full_url).read()

In [28]:
doi_dataset = json.loads(doi_json_dataset)
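
The three steps above (build the URL, fetch the response, parse the JSON) can be wrapped in one function. The following is a minimal sketch and not part of the original notebook: the name `fetch_datacite_metadata` and its `opener` parameter are hypothetical, and it assumes the endpoint returns UTF-8 encoded JSON and answers an unknown DOI with an HTTP error status. The `opener` argument only exists so that the function can be tried without network access.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

BASE_URL = "https://api.datacite.org/application/vnd.datacite.datacite+json/"


def fetch_datacite_metadata(doi, opener=urllib.request.urlopen):
    """Return the DataCite JSON metadata for a DOI as a dict (hypothetical helper)."""
    # Percent-encode unusual characters; "/" is kept because quote() treats it as safe
    full_url = BASE_URL + urllib.parse.quote(doi)
    try:
        # The with-statement closes the connection once the body has been read
        with opener(full_url) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as error:
        # Assumption: an unknown DOI is answered with an HTTP error status
        raise LookupError(f"No metadata found for {doi} (HTTP {error.code})") from error
```

Calling `fetch_datacite_metadata("10.5281/zenodo.5563162")` should then give the same dictionary as the cells above.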

In [29]:
doi_dataset

{'id': 'https://doi.org/10.5281/zenodo.5563162',
 'doi': '10.5281/ZENODO.5563162',
 'url': 'https://zenodo.org/record/5563162',
 'types': {'ris': 'DATA',
  'bibtex': 'misc',
  'citeproc': 'dataset',
  'schemaOrg': 'Dataset',
  'resourceTypeGeneral': 'Dataset'},
 'creators': [{'name': 'Banda, Juan M.',
   'givenName': 'Juan M.',
   'familyName': 'Banda',
   'affiliation': [{'name': 'Georgia State University'}],
   'nameIdentifiers': [{'schemeUri': 'https://orcid.org',
     'nameIdentifier': 'https://orcid.org/0000-0001-8499-824X',
     'nameIdentifierScheme': 'ORCID'}]},
  {'name': 'Tekumalla, Ramya',
   'givenName': 'Ramya',
   'familyName': 'Tekumalla',
   'affiliation': [{'name': 'Georgia State University'}],
   'nameIdentifiers': [{'schemeUri': 'https://orcid.org',
     'nameIdentifier': 'https://orcid.org/0000-0002-1606-4856',
     'nameIdentifierScheme': 'ORCID'}]},
  {'name': 'Wang, Guanyu',
   'givenName': 'Guanyu',
   'familyName': 'Wang',
   'affiliation': [{'name': 'Universit

In [5]:
type(doi_dataset)

dict

In [6]:
doi_dataset.keys()

dict_keys(['id', 'doi', 'url', 'types', 'creators', 'titles', 'publisher', 'container', 'subjects', 'contributors', 'dates', 'publicationYear', 'language', 'identifiers', 'sizes', 'formats', 'version', 'rightsList', 'descriptions', 'geoLocations', 'fundingReferences', 'relatedIdentifiers', 'relatedItems', 'schemaVersion', 'providerId', 'clientId', 'agency', 'state'])
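
Indexing with square brackets raises a `KeyError` when a key is missing. As a hedged aside: assuming that not every DataCite record fills every field, `dict.get()` with a default value is a gentler way to read optional fields. The dictionary `sample_record` below is an abbreviated stand-in for `doi_dataset`, and `"fieldThatDoesNotExist"` is a made-up key used only for illustration.

```python
# Abbreviated stand-in for doi_dataset, using values shown in this notebook
sample_record = {
    "doi": "10.5281/ZENODO.5563162",
    "publicationYear": 2021,
}

# Existing key: get() behaves like sample_record["publicationYear"]
year = sample_record.get("publicationYear")

# Missing key: get() returns the default instead of raising KeyError
missing = sample_record.get("fieldThatDoesNotExist", "n/a")

print(year)
print(missing)
```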

## Access the year of the publication

In [7]:
# Access the year of the publication
print(doi_dataset["publicationYear"])

2021


## Access the title of the publication

In [8]:
# The titles are stored as a list that contains a dictionary
print(doi_dataset["titles"])

[{'title': 'A large-scale COVID-19 Twitter chatter dataset for open scientific research - an international collaboration'}]


In [9]:
# Get the first element of the list, which is a dictionary
print(doi_dataset["titles"][0])

{'title': 'A large-scale COVID-19 Twitter chatter dataset for open scientific research - an international collaboration'}


In [10]:
# Access the "title" key of that dictionary to get the title of the publication
print(doi_dataset["titles"][0]["title"])

A large-scale COVID-19 Twitter chatter dataset for open scientific research - an international collaboration


## Access the creators of the publication

In [11]:
# List the keys of the first entry in the creators list
doi_dataset["creators"][0].keys()

dict_keys(['name', 'givenName', 'familyName', 'affiliation', 'nameIdentifiers'])

In [12]:
# Access the name of the first creator
print(doi_dataset["creators"][0]["name"])

Banda, Juan M.


In [13]:
# Access the name of the next creator
print(doi_dataset["creators"][1]["name"])

Tekumalla, Ramya


In [15]:
# Access only the givenName
print(doi_dataset["creators"][0]["givenName"])

Juan M.
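
The same pattern of list index plus dictionary key also reaches the nested `nameIdentifiers` and `affiliation` entries shown in the output further up. A small sketch, assuming the identifiers follow the structure seen there; `first_creator` is a copy of the first creator entry, and `orcid_of` is a hypothetical helper name.

```python
# Copy of the first creator entry from the output shown earlier
first_creator = {
    "name": "Banda, Juan M.",
    "givenName": "Juan M.",
    "familyName": "Banda",
    "affiliation": [{"name": "Georgia State University"}],
    "nameIdentifiers": [{"schemeUri": "https://orcid.org",
                         "nameIdentifier": "https://orcid.org/0000-0001-8499-824X",
                         "nameIdentifierScheme": "ORCID"}],
}


def orcid_of(creator):
    """Return the creator's ORCID URL, or None if no ORCID entry is listed."""
    for identifier in creator.get("nameIdentifiers", []):
        if identifier.get("nameIdentifierScheme") == "ORCID":
            return identifier.get("nameIdentifier")
    return None


print(orcid_of(first_creator))
print(first_creator["affiliation"][0]["name"])
```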


In [16]:
# Same approach as above for a larger set of DOIs
# Exercise: Show doi, title, publisher
# Bonus exercise: publicationYear 
dois = ["10.6084/m9.figshare.155613",
        "10.6084/m9.figshare.153821.v1",
        "10.7490/f1000research.1115338.1",
        "10.5281/zenodo.2599866"]

for doi in dois:
    # Separate variable names keep the first dataset in doi_dataset intact
    record_json = urllib.request.urlopen(base_url + doi).read()
    record = json.loads(record_json)
    print(record["titles"][0]["title"])
    print("- " + doi)
    print("- " + record["publisher"])
    print("- " + str(record["publicationYear"]))
    print("-----------------")

git repository for paper on git and reproducible science
- 10.6084/m9.figshare.155613
- figshare
- 2013
-----------------
git can facilitate greater reproducibility and increased transparency in science
- 10.6084/m9.figshare.153821.v1
- figshare
- 2013
-----------------
The role of the German National Library for Life Sciences ZB MED in the approach to a FAIR Research Data Infrastructure in Agricultural Science embedded in the Life Sciences
- 10.7490/f1000research.1115338.1
- F1000 Research Limited
- 2018
-----------------
Nachnutzbare Awarenessmaterialien für Forschungsdatenmanagement (FDM)
- 10.5281/zenodo.2599866
- Zenodo
- 2019
-----------------
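
Instead of printing, the extracted fields can be collected for later use. The sketch below is an addition and not part of the exercise: `summarize` is a hypothetical helper, and `records` holds two hand-written dictionaries that mimic the fields used in the loop (values copied from the output above) so that the example runs without network access. It assumes `publisher` is a plain string, as in the output above.

```python
import csv
import io


def summarize(doi, record):
    """Pick the fields printed in the loop above into one flat dictionary."""
    return {
        "doi": doi,
        "title": record["titles"][0]["title"],
        "publisher": record["publisher"],
        "publicationYear": record["publicationYear"],
    }


# Hand-written stand-ins for two parsed API responses
records = {
    "10.6084/m9.figshare.155613": {
        "titles": [{"title": "git repository for paper on git and reproducible science"}],
        "publisher": "figshare",
        "publicationYear": 2013,
    },
    "10.5281/zenodo.2599866": {
        "titles": [{"title": "Nachnutzbare Awarenessmaterialien für Forschungsdatenmanagement (FDM)"}],
        "publisher": "Zenodo",
        "publicationYear": 2019,
    },
}

rows = [summarize(doi, record) for doi, record in records.items()]

# Write the rows as CSV into an in-memory buffer; a real file opened with
# newline="" (for example a hypothetical "dois.csv") would work the same way
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["doi", "title", "publisher", "publicationYear"])
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```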


## Just in case someone asks for all creators
**If nobody asks, please skip this part**

In [30]:
# Just in case someone asks for all creators: print every name
for creator in doi_dataset["creators"]:
    print(creator["name"])

Banda, Juan M.
Tekumalla, Ramya
Wang, Guanyu
Yu, Jingyuan
Liu, Tuo
Ding, Yuning
Artemova, Katya
Tutubalina, Elena
Chowell, Gerardo


In [31]:
# Just in case someone asks for all creators
# List comprehension: build a new list from an existing list
creator_names = [creator["name"] for creator in doi_dataset["creators"]]

In [32]:
creator_names

['Banda, Juan M.',
 'Tekumalla, Ramya',
 'Wang, Guanyu',
 'Yu, Jingyuan',
 'Liu, Tuo',
 'Ding, Yuning',
 'Artemova, Katya',
 'Tutubalina, Elena',
 'Chowell, Gerardo']
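
If the names are needed in given-name-first order, the `givenName` and `familyName` keys can be combined in the same kind of list comprehension. A brief sketch, assuming both keys are present for every creator, as in the entries shown earlier (this is not guaranteed in general); `sample_creators` is a shortened stand-in for `doi_dataset["creators"]`.

```python
# Shortened stand-in for doi_dataset["creators"], values copied from the output above
sample_creators = [
    {"name": "Banda, Juan M.", "givenName": "Juan M.", "familyName": "Banda"},
    {"name": "Tekumalla, Ramya", "givenName": "Ramya", "familyName": "Tekumalla"},
    {"name": "Wang, Guanyu", "givenName": "Guanyu", "familyName": "Wang"},
]

# Build "given family" strings, then join them into a single author line
full_names = [f'{c["givenName"]} {c["familyName"]}' for c in sample_creators]
author_line = "; ".join(full_names)
print(author_line)
```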