# Preparatory study for https://github.com/nvs-vocabs/OBISVocabs/issues/22

> Could I get some input about whether and how you would like to split biological parameters at the P02 group level 
> (i.e. the level above P01 that was originally designed to facilitate discovery of the P01). 
> Do you think this would be a good idea? or do you prefer to keep it as is? 
> Currently we only have one "bucket" P02 grouping code http://vocab.nerc.ac.uk/collection/P02/current/BPRP/ 
> which was created as a temporary measure to group together the codes we curate for the OBIS biological data. 
> We could create better categorisation to help discovery.

## basic Python setup

In [65]:
from pykg2tbl import DefaultSparqlBuilder, KGSource, QueryResult
from pathlib import Path
from pandas import DataFrame

THIS_PATH = Path().absolute()

# SPARQL endpoint of the NERC Vocabulary Server (NVS), wrapped as a knowledge-graph 'source'
NSV_ENDPOINT: str = "https://vocab.nerc.ac.uk/sparql/sparql"
NSV: KGSource = KGSource.build(NSV_ENDPOINT)

TEMPLATES_FOLDER = str(THIS_PATH)
GENERATOR = DefaultSparqlBuilder(templates_folder=TEMPLATES_FOLDER)

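OUT_PATH = THIS_PATH / "results"
# make sure the output folder exists before any to_csv() call writes into it
OUT_PATH.mkdir(parents=True, exist_ok=True)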

def generate_sparql(name: str, **variables) -> str:
    """Build the SPARQL text from the named template, applying the given variables."""
    return GENERATOR.build_syntax(name, **variables)


def execute_to_df(name: str, **variables) -> DataFrame:
    """Build the SPARQL from the named template, execute it against the
    endpoint, and return the result as a dataframe.
    """
    sparql = generate_sparql(name, **variables)
    result: QueryResult = NSV.query(sparql=sparql)
    return result.to_dataframe()
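
The template files themselves (`nsv-discovery.sparql` and friends) are not shown in this notebook. To give an idea of what `generate_sparql` produces, here is a minimal sketch of a parameterised discovery query. It is a hypothetical stand-in, not the actual template: it uses the standard-library `string.Template` instead of the template engine behind `DefaultSparqlBuilder`, and the query shape (matching on `skos:notation`, collection membership via `skos:member`) is an assumption about how NVS could be queried. Only the parameter names `sdnid`, `tgt_col` and `skosrel` are taken from the cells further down.

```python
from string import Template

# Hypothetical, simplified stand-in for a discovery template.
# NOT the real nsv-discovery.sparql; the query shape is an assumption.
DISCOVERY = Template("""\
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?id ?tgtid ?tgtlbl ?tgtterm WHERE {
  ?src skos:notation ?id ;
       skos:$skosrel ?tgtterm .
  <http://vocab.nerc.ac.uk/collection/$tgt_col/current/> skos:member ?tgtterm .
  ?tgtterm skos:notation ?tgtid ;
           skos:prefLabel ?tgtlbl .
  FILTER (str(?id) = "$sdnid")
}
""")


def render_discovery(sdnid: str, tgt_col: str = "P01", skosrel: str = "narrower") -> str:
    """Fill the sketch template with the source term, target collection and SKOS relation."""
    return DISCOVERY.substitute(sdnid=sdnid, tgt_col=tgt_col, skosrel=skosrel)


if __name__ == "__main__":
    print(render_discovery("SDN:P02::BPRP"))
```

Keeping the SKOS relation and the target collection as variables means one template can serve both the "narrower in P01" case and other combinations, which seems to be the idea behind the defaults mentioned in the later cells.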

## what does the NVS structure reveal about "SDN:P02::BPRP"?

In [21]:
sdnid = "SDN:P02::BPRP"
df = execute_to_df("nsv-term-narrowed.sparql", sdnid=sdnid)
df

| | id | lbl | nid | nlbl | ncolt |
|---|---|---|---|---|---|
| 0 | SDN:P02::BPRP | SeaDataNet biological format biotic parameters | SDN:P05::002 | biota | International Standards Organisation ISO19115 ... |
| 1 | SDN:P02::BPRP | SeaDataNet biological format biotic parameters | SDN:D01::D0100001 | Parameters | DCAT Themes for Linked Data Representation of ... |
| 2 | SDN:P02::BPRP | SeaDataNet biological format biotic parameters | SDN:P03::B027 | Other biological measurements | SeaDataNet Agreed Parameter Groups |
| 3 | SDN:P02::BPRP | SeaDataNet biological format biotic parameters | SDN:P23::MD002 | Marine Biodiversity | MEDIN Parameter Discipline Keywords |
| 4 | SDN:P02::BPRP | SeaDataNet biological format biotic parameters | SDN:P08::DS01 | Biological oceanography | SeaDataNet Parameter Disciplines |
| 5 | SDN:P02::BPRP | SeaDataNet biological format biotic parameters | SDN:L19::001 | discipline | SeaDataNet keyword types |
| 6 | SDN:P02::BPRP | SeaDataNet biological format biotic parameters | SDN:L19::SDNKG03 | parameter | SeaDataNet keyword types |
| 7 | SDN:P02::BPRP | SeaDataNet biological format biotic parameters | SDN:P22::31 | Habitats and biotopes | GEMET - INSPIRE themes, version 1.0 |


## to which P01 terms does this BPRP lead?

In [54]:
sdnid = "SDN:P02::BPRP"
# discover the terms in the P01 collection that are narrower than this one
df = execute_to_df("nsv-discovery.sparql", sdnid=sdnid)  # defaults: tgt_col="P01", skosrel="narrower"
df

| | id | tgtid | tgtlbl | tgtterm |
|---|---|---|---|---|
| 0 | SDN:P02::BPRP | SDN:P01::SDBIOL09 | Dry weight biomass of biological entity specif... | http://vocab.nerc.ac.uk/collection/P01/current... |
| 1 | SDN:P02::BPRP | SDN:P01::SDBIOL07 | Ash-free dry weight biomass of biological enti... | http://vocab.nerc.ac.uk/collection/P01/current... |
| 2 | SDN:P02::BPRP | SDN:P01::SDBIOL04 | Wet weight biomass of biological entity specif... | http://vocab.nerc.ac.uk/collection/P01/current... |
| 3 | SDN:P02::BPRP | SDN:P01::SDBIOL12 | Biomass as carbon of biological entity specifi... | http://vocab.nerc.ac.uk/collection/P01/current... |
| 4 | SDN:P02::BPRP | SDN:P01::DMAX7117 | Depth maximum of biological entity specified e... | http://vocab.nerc.ac.uk/collection/P01/current... |
| ... | ... | ... | ... | ... |
| 70 | SDN:P02::BPRP | SDN:P01::SPWLXX01 | Length of biological entity specified elsewher... | http://vocab.nerc.ac.uk/collection/P01/current... |
| 71 | SDN:P02::BPRP | SDN:P01::SDBIOL14 | Biovolume (calculated) of biological entity sp... | http://vocab.nerc.ac.uk/collection/P01/current... |
| 72 | SDN:P02::BPRP | SDN:P01::SAMPPROT | Sampling protocol | http://vocab.nerc.ac.uk/collection/P01/current... |
| 73 | SDN:P02::BPRP | SDN:P01::SPHLXX01 | Length of biological entity specified elsewher... | http://vocab.nerc.ac.uk/collection/P01/current... |
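
The `tgtterm` URIs are cut off in the display, but the compact `SDN:` ids and the full term URIs appear to map onto each other mechanically: the issue text gives `http://vocab.nerc.ac.uk/collection/P02/current/BPRP/` for `SDN:P02::BPRP`. A small sketch of that conversion, assuming the same pattern holds for every collection (the helper names `sdn_to_uri` and `uri_to_sdn` are made up for this notebook):

```python
import re

# Patterns inferred from the single id/URI pair quoted in the issue text;
# assumed (not verified here) to hold for all NVS collections.
_SDN_ID = re.compile(r"^SDN:(?P<col>[^:]+)::(?P<code>.+)$")
_NVS_URI = re.compile(
    r"^https?://vocab\.nerc\.ac\.uk/collection/(?P<col>[^/]+)/current/(?P<code>[^/]+)/?$"
)


def sdn_to_uri(sdnid: str) -> str:
    """Turn a compact id like 'SDN:P02::BPRP' into its term URI."""
    m = _SDN_ID.match(sdnid)
    if m is None:
        raise ValueError(f"not an SDN id: {sdnid!r}")
    return f"http://vocab.nerc.ac.uk/collection/{m['col']}/current/{m['code']}/"


def uri_to_sdn(uri: str) -> str:
    """Turn a term URI back into its compact SDN id."""
    m = _NVS_URI.match(uri)
    if m is None:
        raise ValueError(f"not an NVS term URI: {uri!r}")
    return f"SDN:{m['col']}::{m['code']}"


if __name__ == "__main__":
    print(sdn_to_uri("SDN:P02::BPRP"))
    print(uri_to_sdn("http://vocab.nerc.ac.uk/collection/P02/current/BPRP/"))
```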


## and which details of these are already captured in associated terms?

In [67]:
sdnid = "SDN:P02::BPRP"
# discover the terms in the P01 collection that are narrower than this one
# and add their details from other, broader collections
cols = ["P06", "S06", "S05", "S25", "S26"]
df = execute_to_df("nsv-discovery-detail.sparql", sdnid=sdnid, cols=cols)  # defaults: tgt_col="P01", skosrel="narrower"
df.to_csv(OUT_PATH / "bprp-measurement-details.csv", index=False)
df

| | tgtid | tgtlbl | P06_IDS | S06_IDS | S25_IDS | S26_IDS | S05_IDS |
|---|---|---|---|---|---|---|---|
| 0 | SDN:P01::SDBIOL03 | Ash-free dry weight biomass of biological enti... | SDN:P06::UGMS | SDN:S06::S0600086 | SDN:S25::BE007117 | SDN:S26::MAT00850 | |
| 1 | SDN:P01::ADBIOL01 | Abundance of biological entity specified elsew... | SDN:P06::NOPM | SDN:S06::S0600002 | SDN:S25::BE007117 | SDN:S26::MAT00640 | |
| 2 | SDN:P01::SDBIOL01 | Abundance of biological entity specified elsew... | SDN:P06::UCML | SDN:S06::S0600002 | SDN:S25::BE007117 | SDN:S26::MAT00640 | |
| 3 | SDN:P01::PMA08441 | Proportion coverage of macro vegetation (exclu... | SDN:P06::UPCT | SDN:S06::S0600171 | SDN:S25::BE008441 | SDN:S26::MAT00850 | |
| 4 | SDN:P01::DBPDBE04 | Likelihood category of death from predation of... | SDN:P06::XXXX | SDN:S06::S0600231 | SDN:S25::BE007117 | SDN:S26::MAT00906 | |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 70 | SDN:P01::AFDWBP01 | Ash-free dry weight biomass production rate of... | SDN:P06::UUDX | SDN:S06::S0600232 | SDN:S25::BE007117 | SDN:S26::MAT00850 | SDN:S05::S050214 |
| 71 | SDN:P01::SACFORN1 | Abundance category (SACFORN) of biological ent... | SDN:P06::XXXX | SDN:S06::S0600212 | SDN:S25::BE007117 | SDN:S26::MAT00906 | |
| 72 | SDN:P01::SDBIOL07 | Ash-free dry weight biomass of biological enti... | SDN:P06::UMMC | SDN:S06::S0600086 | SDN:S25::BE007117 | SDN:S26::MAT00640 | |
| 73 | SDN:P01::SPBLXX01 | Bill depth at gonys of biological entity speci... | SDN:P06::UXMM | SDN:S06::S0600102 | SDN:S25::BE007117 | SDN:S26::MAT00906 | |
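
In the rows displayed above, the two "Abundance of ..." terms share one `S06_IDS` value and the two "Ash-free dry weight biomass of ..." terms share another, which hints that this column might be a usable starting point for splitting BPRP into finer groups. A minimal sketch of that grouping, using only the nine rows visible in the output (not the full 74-row result) and plain standard-library code; whether S06 is the right axis to split on is exactly the open question of the issue, so treat this as one candidate view rather than a proposal:

```python
from collections import defaultdict

# (P01 term id, S06 id) pairs copied from the rows visible in the output above.
# This is a sample only; the elided rows are deliberately not guessed at.
SAMPLE = [
    ("SDN:P01::SDBIOL03", "SDN:S06::S0600086"),
    ("SDN:P01::ADBIOL01", "SDN:S06::S0600002"),
    ("SDN:P01::SDBIOL01", "SDN:S06::S0600002"),
    ("SDN:P01::PMA08441", "SDN:S06::S0600171"),
    ("SDN:P01::DBPDBE04", "SDN:S06::S0600231"),
    ("SDN:P01::AFDWBP01", "SDN:S06::S0600232"),
    ("SDN:P01::SACFORN1", "SDN:S06::S0600212"),
    ("SDN:P01::SDBIOL07", "SDN:S06::S0600086"),
    ("SDN:P01::SPBLXX01", "SDN:S06::S0600102"),
]


def group_by_component(pairs):
    """Map each component id to the list of P01 terms that reference it."""
    groups = defaultdict(list)
    for term_id, component_id in pairs:
        groups[component_id].append(term_id)
    return dict(groups)


if __name__ == "__main__":
    groups = group_by_component(SAMPLE)
    # largest candidate groups first
    for component_id, terms in sorted(groups.items(), key=lambda kv: -len(kv[1])):
        print(component_id, len(terms), terms)
```

The same function can be pointed at any of the other `*_IDS` columns (for example `S26_IDS`) to compare how evenly each one would partition the P01 terms.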
