# Ocean Acidification

### References

* [OceanExpert metadata document](https://oceanexpert.org/document/26001)
* [OA ERDDAP server](https://erddap.oa.iode.org/erddap/index.html)
* An example query (used as a guide for the query below): https://github.com/iodepo/odis-arch/blob/schema-dev-df/code/SPARQL/baseQuery.rq
* SHACL shapes for potential reference: https://github.com/iodepo/odis-arch/tree/schema-dev-df/code/SHACL

### Need to

* Find datasets that have a distribution
* Connect those datasets to their provenance (prov)
* Validate the variable measured with SHACL
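
The first item in the list above could be sketched with pandas once search results are in a data frame. This is a minimal sketch, assuming the result columns are called `type` and `dist` as in the query used later in this notebook; the helper name `datasets_with_distribution` and the sample rows are hypothetical and only serve as illustration.

```python
import pandas as pd

SCHEMA_DATASET = "https://schema.org/Dataset"


def datasets_with_distribution(df):
    """Return rows typed schema:Dataset that also carry a distribution value."""
    is_dataset = df["type"] == SCHEMA_DATASET
    has_dist = df["dist"].notna()
    return df[is_dataset & has_dist].reset_index(drop=True)


# Hypothetical sample rows, made up for illustration only
sample = pd.DataFrame(
    {
        "s": ["urn:example:a", "urn:example:b", "urn:example:c"],
        "type": [SCHEMA_DATASET, "https://schema.org/CreativeWork", SCHEMA_DATASET],
        "dist": ["urn:example:dist-a", None, None],
    }
)

filtered = datasets_with_distribution(sample)
print(filtered)
```

Only the first sample row survives the filter, since it is the only one that is both a `Dataset` and has a non-null `dist`.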


In [1]:
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)

from SPARQLWrapper import SPARQLWrapper, JSON
import pandas as pd
# import dask, boto3
# import dask.dataframe as dd
import numpy as np
import json



In [2]:
sparqlep = "http://graph.oceaninfohub.org/blazegraph/namespace/oih/sparql"


In [3]:
def get_sparql_dataframe(service, query):
    """
    Helper function to convert SPARQL results into a Pandas data frame.
    """
    sparql = SPARQLWrapper(service)
    sparql.setQuery(query)
    sparql.setReturnFormat(JSON)
    # With the JSON return format set, convert() parses the response into a dict
    processed_results = sparql.query().convert()
    cols = processed_results['head']['vars']

    out = []
    for row in processed_results['results']['bindings']:
        item = []
        for c in cols:
            item.append(row.get(c, {}).get('value'))
        out.append(item)

    return pd.DataFrame(out, columns=cols)
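
To make the row-building loop in the helper easier to follow, here is a small offline sketch of the same logic applied to a SPARQL JSON results payload. The payload is hypothetical (made-up URNs and names); it only illustrates that a variable left unbound by an `OPTIONAL` clause is simply absent from a binding, so `row.get(c, {}).get('value')` yields `None` for it.

```python
import json

# Hypothetical SPARQL JSON results payload; the values are made up
payload = json.loads("""
{
  "head": {"vars": ["s", "name"]},
  "results": {"bindings": [
    {"s": {"type": "uri", "value": "urn:example:a"},
     "name": {"type": "literal", "value": "Example A"}},
    {"s": {"type": "uri", "value": "urn:example:b"}}
  ]}
}
""")

cols = payload["head"]["vars"]

# Same lookup as in the helper: missing variables become None
rows = [
    [binding.get(c, {}).get("value") for c in cols]
    for binding in payload["results"]["bindings"]
]
print(rows)
```

The second binding has no `name`, so its row carries `None` in that position. This is consistent with the sparsely populated optional columns in the `dfsc.info()` output.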

In [4]:
# Blazegraph full-text search (bds:search) for "ocean acidification";
# matchAllTerms "false" matches any term, and results are ranked by relevance score
rq_count = """PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX schema: <https://schema.org/>
PREFIX bds: <http://www.bigdata.com/rdf/search#>

SELECT DISTINCT  ?s ?url ?dist ?g ?type ?score ?name ?lit ?description ?headline
WHERE
{
   ?lit bds:search "ocean acidification" .
   ?lit bds:matchAllTerms "false" .
   ?lit bds:relevance ?score .
   graph ?g {
    ?s ?p ?lit .
    ?s rdf:type ?type .
    OPTIONAL { ?s schema:distribution ?dist .   }
    OPTIONAL { ?s schema:name ?name .   }
    OPTIONAL { ?s schema:headline ?headline .   }
    OPTIONAL { ?s schema:url ?url .   }
    OPTIONAL { ?s schema:description ?description .    }
  }

}
ORDER BY DESC(?score)
OFFSET 0
"""

In [5]:
dfsc = get_sparql_dataframe(sparqlep, rq_count)


In [8]:
dfsc.head(30)

Unnamed: 0,s,url,dist,g,type,score,name,lit,description,headline
0,oai:aquadocs.org:1834/22971,http://aquadocs.org/oai/request?/handle/1834/2...,,urn:gleaner.oih:summoned:invemardocuments:587d...,https://schema.org/CreativeWork,0.8838834764831843,A Comparison of the SNP Variation in the Calci...,Ocean Acidification,As the atmospheric levels of CO2 rise from hum...,
1,oai:aquadocs.org:1834/35724,http://aquadocs.org/oai/request?/handle/1834/3...,,urn:gleaner.oih:summoned:invemardocuments:587d...,https://schema.org/CreativeWork,0.8838834764831843,Production and carbonate dynamics of Halimeda ...,Ocean acidification,Ocean acidification poses a serious threat to ...,
2,oai:aquadocs.org:1834/36288,http://aquadocs.org/oai/request?/handle/1834/3...,,urn:gleaner.oih:summoned:invemardocuments:587d...,https://schema.org/CreativeWork,0.8838834764831843,Climate and oceanography of the Galapagos in t...,ocean acidification,With the likelihood that carbon dioxide and ot...,
3,oai:repository.oceanbestpractices.org:11329/1267,https://repository.oceanbestpractices.org/1132...,,urn:gleaner.oih:summoned:obps:b7186684bbf5a6b5...,https://schema.org/CreativeWork,0.8838834764831843,Arctic Ocean Acidification Assessment: 2018 Su...,Ocean acidification,- This document presents the Summary for Pol...,
4,oai:repository.oceanbestpractices.org:11329/851,https://repository.oceanbestpractices.org/1132...,,urn:gleaner.oih:summoned:obps:b7186684bbf5a6b5...,https://schema.org/CreativeWork,0.8838834764831843,Development of a Continuous Phytoplankton Cult...,Ocean acidification,- Around one third of all anthropogenic CO2 e...,
5,oai:repository.oceanbestpractices.org:11329/1189,https://repository.oceanbestpractices.org/1132...,,urn:gleaner.oih:summoned:obps:b7186684bbf5a6b5...,https://schema.org/CreativeWork,0.8838834764831843,Perspectives on in situ Sensors for Ocean Acid...,Ocean acidification,- As ocean acidification (OA) sensor technolo...,
6,oai:repository.oceanbestpractices.org:11329/1195,https://repository.oceanbestpractices.org/1132...,,urn:gleaner.oih:summoned:obps:b7186684bbf5a6b5...,https://schema.org/CreativeWork,0.8838834764831843,AMAP Assessment 2013: Arctic Ocean Acidification.,Ocean acidification,- This assessment report presents the results...,
7,oai:repository.oceanbestpractices.org:11329/1258,https://repository.oceanbestpractices.org/1132...,,urn:gleaner.oih:summoned:obps:b7186684bbf5a6b5...,https://schema.org/CreativeWork,0.8838834764831843,How to document - Ocean Acidification Data.,Ocean acidification,- The number of ocean acidification (OA) stud...,
8,oai:repository.oceanbestpractices.org:11329/1641,https://repository.oceanbestpractices.org/1132...,,urn:gleaner.oih:summoned:obps:b7186684bbf5a6b5...,https://schema.org/CreativeWork,0.8838834764831843,Ocean and Atmospheric Observations at the Remo...,Ocean acidification,"- For open ocean environments, it is rare to ...",
9,oai:repository.oceanbestpractices.org:11329/339,https://repository.oceanbestpractices.org/1132...,,urn:gleaner.oih:summoned:obps:b7186684bbf5a6b5...,https://schema.org/CreativeWork,0.8838834764831843,Guide to best practices for ocean acidificatio...,Ocean acidification,- Ocean acidification is an undisputed fact. ...,


In [7]:
dfsc.info()


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 98168 entries, 0 to 98167
Data columns (total 10 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   s            98168 non-null  object
 1   url          28786 non-null  object
 2   dist         2835 non-null   object
 3   g            98168 non-null  object
 4   type         98168 non-null  object
 5   score        98168 non-null  object
 6   name         23708 non-null  object
 7   lit          98168 non-null  object
 8   description  29983 non-null  object
 9   headline     0 non-null      object
dtypes: object(10)
memory usage: 7.5+ MB
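
The summary above shows every column, including `score`, stored as `object`, and several optional columns mostly empty. A possible follow-up, sketched here on a hypothetical stand-in frame rather than on `dfsc` itself, is to convert `score` to a numeric dtype and compute the share of non-null values per column.

```python
import pandas as pd

# Hypothetical stand-in for dfsc; the values are made up for illustration
demo = pd.DataFrame(
    {
        "s": ["urn:example:a", "urn:example:b", "urn:example:c"],
        "score": ["0.88", "0.5", "0.25"],
        "dist": ["urn:example:dist-a", None, None],
    }
)

# SPARQL JSON values arrive as strings, so convert score explicitly
demo["score"] = pd.to_numeric(demo["score"], errors="coerce")

# Fraction of rows with a value, per column
coverage = demo.notna().mean()

print(demo.dtypes)
print(coverage)
```

With a numeric `score`, sorting and thresholding on relevance behave numerically rather than lexically.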
