# Creating an institute profile 

This notebook builds an example institute profile from several data sources, here ROR and GBIF.

In [3]:
import json 
import urllib.request as ur
import requests
from pygbif import registry
from IPython.display import Markdown as md

In [114]:
# Fetch the ROR record for the Museum für Naturkunde.
rorurl = "https://api.ror.org/organizations/https://ror.org/052d1a351"
response = ur.urlopen(rorurl)
data = json.loads(response.read())
# print(data)
# Pick out the fields used in the profile below.
name = data['name']
rorid = data['id']
links = data['links']
aliases = data['aliases']
grid = data['external_ids']['GRID']['preferred']
wikidata = data['external_ids']['Wikidata']['all'][0]
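The lookups above assume that every key is present in the ROR response. A minimal sketch of a more defensive extraction follows. The helper name `extract_ror_profile` and the sample record are hypothetical; the field names are the ones used in the cell above, and the GRID value is the one that appears in the collection summary record later in this notebook.

```python
def extract_ror_profile(record):
    """Pull the profile fields used in this notebook out of a ROR record dict.

    Missing keys yield None (or an empty list) instead of raising KeyError.
    """
    external = record.get("external_ids") or {}
    grid = (external.get("GRID") or {}).get("preferred")
    wikidata_all = (external.get("Wikidata") or {}).get("all") or []
    if isinstance(wikidata_all, str):
        # Tolerate a single identifier given as a plain string.
        wikidata_all = [wikidata_all]
    links = record.get("links") or []
    return {
        "name": record.get("name"),
        "rorid": record.get("id"),
        "link": links[0] if links else None,
        "aliases": record.get("aliases") or [],
        "grid": grid,
        "wikidata": wikidata_all[0] if wikidata_all else None,
    }


# Hypothetical, abridged sample shaped like the response parsed above.
# It deliberately has no Wikidata entry.
sample = {
    "id": "https://ror.org/052d1a351",
    "name": "Museum für Naturkunde",
    "links": ["http://www.naturkundemuseum-berlin.de/en/"],
    "aliases": [],
    "external_ids": {"GRID": {"preferred": "grid.422371.1"}},
}
profile = extract_ror_profile(sample)
print(profile["name"], profile["grid"], profile["wikidata"])
```

With this approach a record that lacks, say, a Wikidata identifier still produces a usable profile, and the display cells can decide how to render a missing value.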

<b>

# PROFILE

In [149]:
md("<b> Museum Name: {}".format(name))

<b> Museum Name: Museum für Naturkunde

In [135]:
md("<b> ROR ID: {}".format(rorid))

<b> ROR ID: https://ror.org/052d1a351

In [136]:
md("<b> Museum Link: {}".format(links[0]))

<b> Museum Link: http://www.naturkundemuseum-berlin.de/en/

The GBIF example uses the dataset API, scoped to datasets published by the museum. Although this is not collection description data, it illustrates the power of APIs and standards.
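The per-dataset listing and the overall total can be computed by one small function that works on the `results` list of a dataset search response. This is only a sketch: the name `summarize_datasets` is hypothetical, and it assumes each result is a dict with a `title` and, usually, a `recordCount`, as in this notebook. A missing count is treated as 0, which is an assumption made here for illustration.

```python
def summarize_datasets(results):
    """Return ([(title, count), ...], total) for a list of dataset search results.

    A missing or null "recordCount" is counted as 0 so that a single
    incomplete record cannot break the total.
    """
    rows = []
    total = 0
    for item in results:
        count = item.get("recordCount") or 0
        rows.append((item.get("title", "(untitled)"), count))
        total += count
    return rows, total


# Two entries with titles and counts from this notebook's GBIF listing,
# plus a hypothetical entry that has no count.
sample_results = [
    {"title": "MfN - Heteroptera Collection", "recordCount": 103178},
    {"title": "MfN - Diptera Collection", "recordCount": 2970},
    {"title": "Hypothetical dataset without a count"},
]
rows, total = summarize_datasets(sample_results)
for title, count in rows:
    print(f"{title}  {count}")
print("Total:", total)
```

Keeping the total in a local variable named `total` also avoids shadowing Python's built-in `sum()`.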

In [141]:
md("<b> Collection Name  and Count")

<b> Collection Name  and Count

In [148]:
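# GBIF dataset search for the publisher "Museum für Naturkunde Berlin":
# https://www.gbif.org/dataset/search?publishing_org=10980920-6dad-11da-ad13-b8a03c50a862
jsondump = registry.dataset_search(publishing_org='10980920-6dad-11da-ad13-b8a03c50a862')
gbifresult = jsondump['results']
sum = 0  # running total of records; note that this name shadows the built-in sum()
# print(gbifresult)
for i in gbifresult:
    # Print each dataset title with its record count and add the count to the total.
    print(i['title'] + "  " + str(i['recordCount']))
    sum = i['recordCount'] + sum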

Neptune Deep-Sea Microfossil Occurrence Database  500808
MfN - Heteroptera Collection  103178
MfN - Fossil vertebrates IV  77414
Sphaeroceridae Collection  47499
MfN - Fossil invertebrates Ia  36845
Animal Sound Archive  35322
EDIT - ATBI in Mercantour/Alpi Marittime (France/Italy)  31680
MfN - Fossil invertebrates III  23319
MfN - Fossil Fish Collection  22516
MfN - Auchenorrhyncha Collection  17369
MfN - Fossil plants (Cenophytic)  16479
Anymals+plants - Citizen Science Data  15278
MfN - Fossil invertebrates IIb  14660
MfN - Fossil plants (Paleophytic)  13945
MfN - Mallophaga Collection  13736
SuLaMa bird occurence data  9210
EDIT - ATBI in Gemer area (Slovakia)  9140
MfN - Fossil plants (Mesophytic)  8532
MfN - Fossil vertebrates III  7346
Staatliches Naturhistorisches Museum Braunschweig - Coleoptera Collection  6821
Geologisch-Paläontologische Sammlung Universität Leipzig  5787
MfN - Fossil vertebrates V  4310
MfN - Phasmid Collection  3989
MfN - Diptera Collection  2970
MfN Fish 

In [147]:
md("<b>Total Specimens {}".format(sum))

<b>Total Specimens 1037274

At the moment we do not have an API to extract facilities and instrument information, but the profile above can be enhanced with such data. The following example is based on a test Cordra Digital Object (DO) repository.

In [162]:
#http://145.136.243.81:8080/objects/test/1980ff87b86230db8258

instr = "http://145.136.243.81:8080/objects/test/1980ff87b86230db8258"
response1 = ur.urlopen(instr)
data1 = json.loads(response1.read())
#print(data1)
instrumentName=data1['name']
instrumentDesc=data1['description']
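The instrument object is read from a test server, which may not always be reachable. A minimal sketch of a fetch helper with a timeout and basic error handling follows. The name `fetch_json` and the 10-second default are assumptions for illustration, and the demonstration uses a `data:` URL so that it runs without a network connection.

```python
import json
import urllib.request as ur


def fetch_json(url, timeout=10):
    """Fetch a URL and parse the body as JSON; return None on failure.

    OSError covers network errors and timeouts (URLError is a subclass);
    ValueError covers malformed URLs and invalid JSON.
    """
    try:
        with ur.urlopen(url, timeout=timeout) as response:
            return json.loads(response.read())
    except (OSError, ValueError):
        return None


# Offline demonstration: urllib resolves data: URLs without any network access.
demo_url = "data:application/json,%7B%22name%22%3A%22Scanning%20Electron%20Microscope%22%7D"
record = fetch_json(demo_url)
print(record["name"] if record else "object not available")
```

Returning `None` keeps the notebook running when the repository is down; a caller that needs to know why a request failed would log or re-raise the exception instead.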

## Facilities Information

In [154]:
md("<b> Museum Name: {}".format(name))

<b> Name: Museum für Naturkunde

In [157]:
md("<b> Instrument Name: {}".format(instrumentName))

<b> Instrument Name: Scanning Electron Microscope

In [164]:
md("<b> Instrument Description: {}".format(instrumentDesc))

<b> Instrument Description: FEI XL 30 ESEM. The new apparatus has a number of advantages, such as the possibility to work in wet mode and to record digitised images on floppy disks or CD ROM.


# Collection Summary 
Another schema, also stored as a Digital Object, holds overview information about a collection and the organization holding it.

In [43]:
collsummary = "http://145.136.243.81:8080/objects/?query=type:%22CollectionSummary%22"
response2 = ur.urlopen(collsummary)

data2 = json.loads(response2.read())

print("PRINT ONE RECORD")
print("++++++++++++++++")
print(json.dumps(data2['results'][4], indent=2))

print("++++++++++++++++")
print("++++++++++++++++")

print("NAME       " + "                    Total Count"    + " DNA-Bank    " + "    SEED-Bank")
collsumresult = data2['results']
for j in collsumresult: 
    
    print(j['content']['name'] + "  " + str(j['content']['total_specimens_all_coll']) + "   " +j['content']['dnabank']\
    + "              " + j['content']['seedbank'])
    

PRINT ONE RECORD
++++++++++++++++
{
  "id": "test/fa4522c39fb8c0358231",
  "type": "CollectionSummary",
  "content": {
    "id": "test/fa4522c39fb8c0358231",
    "name": "Museum fur Naturkunde",
    "abbv": "MfN",
    "grid": "https://www.grid.ac/institutes/grid.422371.1",
    "cetaf": "https://cetaf.org/berlin-natural-history-museum-0",
    "total_specimens_all_coll": 26351989,
    "dnabank": "yes",
    "seedbank": "no"
  },
  "metadata": {
    "createdOn": 1586209829913,
    "createdBy": "admin",
    "modifiedOn": 1586209829913,
    "modifiedBy": "admin",
    "txnId": 1586209829913175
  }
}
++++++++++++++++
++++++++++++++++
NAME                           Total Count DNA-Bank        SEED-Bank
Natural History Museum London  74460377   no              no
Naturhistorisches Museum Wien  28247450   yes              yes
Agentschap Plantentuin Meise  3800000   no              yes
Royal Belgian Institute of Natural Sciences  30650000   no              no
Museum fur Naturkunde  26351989   yes 
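The columns in the listing above drift because the names differ in length. A minimal sketch of an aligned plain-text table follows. The function name `format_summary_table` is hypothetical, and it assumes every record carries the four `content` fields used in the loop above.

```python
def format_summary_table(records):
    """Format CollectionSummary records as an aligned plain-text table."""
    rows = [
        (
            rec["content"]["name"],
            str(rec["content"]["total_specimens_all_coll"]),
            rec["content"]["dnabank"],
            rec["content"]["seedbank"],
        )
        for rec in records
    ]
    header = ("NAME", "Total Count", "DNA-Bank", "SEED-Bank")
    # Each column is as wide as its longest cell, header included.
    widths = [max(len(row[i]) for row in [header] + rows) for i in range(len(header))]
    lines = [
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in [header] + rows
    ]
    return "\n".join(lines)


# Two records with values taken from the output above; only the fields
# used by the table are included.
sample_records = [
    {"content": {"name": "Natural History Museum London",
                 "total_specimens_all_coll": 74460377,
                 "dnabank": "no", "seedbank": "no"}},
    {"content": {"name": "Agentschap Plantentuin Meise",
                 "total_specimens_all_coll": 3800000,
                 "dnabank": "no", "seedbank": "yes"}},
]
print(format_summary_table(sample_records))
```

Computing the widths from the data keeps the table aligned when new institutions with longer names are added.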