In [3]:
import pandas as pd
from connec_functions import GDB

### ICOS Carbon Portal

In [1]:
icos_url = "https://data.icos-cp.eu/portal/"

#### Notes/findings:
- 'classic' data portal: designed for human navigation rather than machine access
- semantics of the search fields are explained via '?' icons, but without reference to standard terms
- metadata includes links to
  - ICOS entities (e.g. description of ICOS project itself) 
  - license
  - ...
- metadata of a record is downloadable in various formats (including RDF/Turtle)
- RDF/Turtle file of the metadata:
  - limited description: not all information shown in the HTML view is expressed as triples
  - uses a self-defined ontology rather than community-standard ontologies such as PROV-O, DCAT, ...
    - not known to a machine up front
    - requires learning the data model, the location of the ontology file, ...
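To make the point about self-defined versus community ontologies concrete, the following is a minimal sketch that lists the namespaces declared in a Turtle document and separates a few well-known ones from the rest. The `WELL_KNOWN` table is a small hand-picked set (an assumption, not an authoritative registry), the function names are hypothetical, and the regex only handles simple one-line prefix declarations. The sample text contains just two prefix declarations for illustration; it is not the content of an actual ICOS record.

```python
import re

# Matches simple one-line declarations such as:
#   @prefix prov: <http://www.w3.org/ns/prov#> .
#   PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX_PATTERN = re.compile(
    r"^\s*(?:@prefix|PREFIX)\s+([A-Za-z][\w.-]*)?:\s*<([^>]*)>",
    re.MULTILINE | re.IGNORECASE,
)

# Hand-picked community namespaces (assumption: extend as needed).
WELL_KNOWN = {
    "http://www.w3.org/ns/prov#": "PROV-O",
    "http://www.w3.org/ns/dcat#": "DCAT",
    "http://purl.org/dc/terms/": "Dublin Core Terms",
    "http://www.w3.org/2000/01/rdf-schema#": "RDFS",
}


def declared_namespaces(turtle_text):
    """Return {prefix: namespace IRI} for every prefix declaration found."""
    return {m.group(1) or "": m.group(2) for m in PREFIX_PATTERN.finditer(turtle_text)}


def split_namespaces(turtle_text):
    """Split declared namespaces into (well-known, other) dictionaries."""
    known, other = {}, {}
    for prefix, iri in declared_namespaces(turtle_text).items():
        (known if iri in WELL_KNOWN else other)[prefix] = iri
    return known, other


# Illustrative input: two prefix declarations only.
sample_turtle = """
@prefix cpmeta: <http://meta.icos-cp.eu/ontologies/cpmeta/> .
@prefix prov: <http://www.w3.org/ns/prov#> .
"""

known, other = split_namespaces(sample_turtle)
print("community namespaces:", known)
print("other namespaces:", other)
```

Namespaces that land in the second dictionary are the ones a client would have to look up and learn before it can interpret the triples.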

### ICOS Carbon Portal SPARQL Endpoint

In [5]:
# Define the SPARQL endpoint URL
sparql_endpoint_url = "https://meta.icos-cp.eu/sparql"
gdb = GDB(sparql_endpoint_url, "endpoint_queries")
gdb.execute_to_df("general.sparql") 
#curl https://meta.icos-cp.eu/sparql -X POST --data 'query=PREFIX%20cpmeta%3A%20%3Chttp%3A%2F%2Fmeta.icos-cp.eu%2Fontologies%2Fcpmeta%2F%3E%0APREFIX%20rdfs%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0APREFIX%20cpst%3A%20%3Chttp%3A%2F%2Fmeta.icos-cp.eu%2Fontologies%2Fstationentry%2F%3E%0ASELECT%0A%28coalesce%28%3FtcName%2C%20%3FhoName%29%20as%20%3FName%29%0A%3FTheme%0A%28coalesce%28%3FtcClass%2C%20%3FhoClass%29%20as%20%3FClass%29%0A%28coalesce%28%3FtcCountry%2C%20%3FhoCountry%29%20as%20%3FCountry%29%0AFROM%20%3Chttp%3A%2F%2Fmeta.icos-cp.eu%2Fresources%2Fstationentry%2F%3E%0AFROM%20%3Chttp%3A%2F%2Fmeta.icos-cp.eu%2Fontologies%2Fstationentry%2F%3E%0AFROM%20%3Chttp%3A%2F%2Fmeta.icos-cp.eu%2Fresources%2Ficos%2F%3E%0AWHERE%20%7B%0A%09%3Fs%20cpst%3AhasCountry%20%3FhoCountry%20%3B%20cpst%3AhasLongName%20%3FhoName%20%3B%20cpst%3AhasStationClass%20%3FhoClass0%20%3B%20a%2Frdfs%3Alabel%20%3FTheme%20.%0A%09optional%7B%0A%09%09%3Fs%20cpst%3AhasProductionCounterpart%20%3Fprods%20.%0A%09%09bind%28iri%28%3Fprods%29%20as%20%3Fprod%29%20.%0A%09%09optional%7B%3Fprod%20cpmeta%3AhasName%20%3FtcName%20%7D%0A%09%09optional%7B%3Fprod%20cpmeta%3AhasStationClass%20%3FtcClass%7D%0A%09%09optional%7B%3Fprod%20cpmeta%3AcountryCode%20%3FtcCountry%7D%0A%09%7D%0A%09bind%28if%28contains%28%3FhoClass0%2C%20%27Ass%27%29%2C%20%27Associated%27%2C%20%3FhoClass0%29%20as%20%3FhoClass%29%0A%09filter%20exists%20%7B%3Fs%20cpst%3AhasShortName%20%5B%5D%7D%0A%7D%0Aorder%20by%20%3FTheme%20%3FName'
# Behaves differently from the ARGO-Ifremer SPARQL endpoint, which accepts: curl -X POST "https://sparql.ifremer.fr/argo/query" --data-urlencode "query=select ?s ?o ?p where{?s ?o ?p.} limit 10"

HTTPError: HTTP Error 501: Not Implemented
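The cause of the 501 response is not established here. As a point of comparison, the sketch below mirrors the commented curl command above using only the standard library: a POST with a form-encoded `query` field. The function names, the `Accept` header value and the timeout are assumptions added for illustration; whether the endpoint honours that `Accept` header has not been verified in this notebook.

```python
import json
import urllib.parse
import urllib.request

SPARQL_ENDPOINT_URL = "https://meta.icos-cp.eu/sparql"


def build_sparql_post(endpoint_url, query):
    """Build (but do not send) a form-encoded POST, like `curl -X POST --data 'query=...'`."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            # Assumption: ask for SPARQL JSON results.
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


def run_select(endpoint_url, query, timeout=60):
    """Send the query and return the decoded JSON response (performs a network call)."""
    request = build_sparql_post(endpoint_url, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# The curl command carries a URL-encoded query; unquote() makes it readable again.
encoded_line = "PREFIX%20cpmeta%3A%20%3Chttp%3A%2F%2Fmeta.icos-cp.eu%2Fontologies%2Fcpmeta%2F%3E"
print(urllib.parse.unquote(encoded_line))
```

Calling `run_select(SPARQL_ENDPOINT_URL, query_text)` would send the request; it is left out of the block so that nothing is executed against the endpoint by default.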

#### Notes/findings:
- RDF ~ data is standardized
- uses internally defined ontologies
  - requires knowing where the ontologies/vocabularies are located and learning them ~ less machine-interoperable
  - side note: prefixes are not registered on prefix.cc; registering them there (and in ontology catalogues/archives) would increase findability
- community-driven ontologies are also used (e.g. a DCAT description), but these carry limited information (other descriptions are also available)
- general queries take a long time to run
- there is a per-minute query quota
- uses external standard terms (e.g. MIME types via iana.org, licenses via CC URLs)
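Because of the per-minute quota, a client that issues many queries may want to pace itself. The following is a minimal sketch of a sliding-window limiter; the class name `MinuteQuota` is hypothetical and the limit passed to it is a placeholder, since the actual quota value is not recorded in these notes.

```python
import time
from collections import deque


class MinuteQuota:
    """Allow at most `max_per_minute` calls within any sliding 60-second window."""

    def __init__(self, max_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.max_per_minute = max_per_minute
        self.clock = clock    # injectable for testing
        self.sleep = sleep    # injectable for testing
        self._stamps = deque()  # timestamps of recent calls, oldest first

    def wait_time(self):
        """Drop timestamps older than 60 s and return the seconds to wait (0.0 if free)."""
        now = self.clock()
        while self._stamps and now - self._stamps[0] >= 60.0:
            self._stamps.popleft()
        if len(self._stamps) < self.max_per_minute:
            return 0.0
        return 60.0 - (now - self._stamps[0])

    def acquire(self):
        """Block until a slot is free, then record the call."""
        delay = self.wait_time()
        if delay > 0:
            self.sleep(delay)
            self.wait_time()  # prune entries that expired while sleeping
        self._stamps.append(self.clock())


# Placeholder limit; replace with the quota that actually applies.
quota = MinuteQuota(max_per_minute=10)
```

Calling `quota.acquire()` immediately before each query keeps the client under the configured limit, at the cost of blocking when the window is full.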