# Framing where we get all the nested items
In this notebook we take the JSON-LD content for a file, or a group of files, and frame it so that the nested (referenced) items are filled in. We start by importing the relevant libraries and then obtaining the URL of the file in question.


In [1]:
# import the library
import cmipld, json

def jprint(x):
    print(json.dumps(x,indent=4))



#### Setting the file location (ID)

In [2]:
# for simplicity we can use the directory prefix
file = 'cmip6plus:experiment/1pctco2'

In [3]:
# get the full url path of the file.
url = cmipld.processor.resolve_prefix(file)

Substituting prefix:
cmip6plus: https://wcrp-cmip.github.io/CMIP6Plus_CVs/experiment/1pctco2
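The substitution printed above can be pictured as a simple lookup. The following is a minimal sketch only, not the `cmipld` implementation: the `PREFIXES` table and the `expand_prefix` helper are hypothetical names, and the two base URLs are inferred from the outputs shown in this notebook.

```python
# Hypothetical prefix table (an assumption for illustration); the base URLs
# are the ones that appear in this notebook's outputs.
PREFIXES = {
    "cmip6plus": "https://wcrp-cmip.github.io/CMIP6Plus_CVs/",
    "universal": "https://wcrp-cmip.github.io/WCRP-universe/",
}


def expand_prefix(identifier, prefixes=PREFIXES):
    """Replace a known 'prefix:' with its base URL; return anything else unchanged."""
    prefix, sep, rest = identifier.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + rest
    return identifier


url_sketch = expand_prefix("cmip6plus:experiment/1pctco2")
print(url_sketch)
```

An identifier that is already a full URL passes through untouched, because `https` is not a key in the table.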


## Using the CMIPLD wrapper
The easiest way to generate a corpus and frame items is to use the cmipld wrapper.

In [4]:
# loads the experiments graph and relevant dependencies.
experiments = cmipld.processor.EmbeddedFrame(url)

100%|██████████| 3/3 [00:00<00:00, 40.83it/s]


#### This contains our context, dependencies, and the corpus of all relevant graph files.

In [5]:
print( f'Context Loc: {experiments.context}')

print( f'Dependencies Loc: {experiments.dependencies}')

print( f'Graph with all of the above : {experiments.corpus}')

Context Loc: https://wcrp-cmip.github.io/CMIP6Plus_CVs/experiment/_context_
Dependencies Loc: ['https://wcrp-cmip.github.io/CMIP6Plus_CVs/experiment/graph.jsonld', 'https://wcrp-cmip.github.io/WCRP-universe/source-type/graph.jsonld', 'https://wcrp-cmip.github.io/WCRP-universe/activity/graph.jsonld']
Graph with all of the above : {'@graph': [[{'https://wcrp-cmip.github.io/CMIP6Plus_CVs/experiment/activity': [{'@id': 'https://wcrp-cmip.github.io/WCRP-universe/activity/cmip'}], 'https://wcrp-cmip.github.io/CMIP6Plus_CVs/experiment/description': [{'@value': 'DECK: 1pctCO2'}], 'https://wcrp-cmip.github.io/CMIP6Plus_CVs/experiment/end': [{'@value': -999}], '@id': 'https://wcrp-cmip.github.io/CMIP6Plus_CVs/experiment/1pctco2', 'https://wcrp-cmip.github.io/CMIP6Plus_CVs/experiment/label': [{'@value': '1pctCO2'}], 'https://wcrp-cmip.github.io/CMIP6Plus_CVs/experiment/min_number_yrs_per_sim': [{'@value': 150}], 'https://wcrp-cmip.github.io/CMIP6Plus_CVs/experiment/model_realms': [{'https://wcrp-

#### Finally we select our frame. 

In [6]:
frame = {'id':url}

# the EmbeddedFrame class applies the default context; the context may instead be specified separately.

result = experiments.frame(frame)

In [7]:
jprint(result)

{
    "@context": "https://wcrp-cmip.github.io/CMIP6Plus_CVs/experiment/_context_",
    "id": "cmip6plus:experiment/1pctco2",
    "type": [
        "wcrp:experiment",
        "cmip6plus"
    ],
    "activity": {
        "id": "universal:activity/cmip",
        "type": [
            "wcrp:activity",
            "universal"
        ],
        "description": "CMIP DECK: 1pctCO2, abrupt-4xCO2, amip, esm-piControl, esm-historical, historical, and piControl experiments",
        "label": "CMIP",
        "url": "https://gmd.copernicus.org/articles/9/1937/2016/gmd-9-1937-2016.pdf"
    },
    "description": "DECK: 1pctCO2",
    "end": -999,
    "label": "1pctCO2",
    "min_number_yrs_per_sim": 150,
    "model_realms": {
        "universal:source-type/aer": {
            "type": [
                "wcrp:source_type",
                "universal"
            ],
            "description": "aerosol treatment in an atmospheric model where concentrations are calculated based on emissions, transformatio

## Framing using PyLD only

This section explains what is happening inside the CMIPLD library in a bit more detail.

By generating a corpus we enable JSON-LD to fill in referenced (nested) entries easily. This does, however, require us to frame the results so that we extract only what we are interested in.
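The idea of filling referenced entries from a corpus can be sketched in plain Python. This is an illustration of the concept only and not how PyLD or `cmipld` implement framing; the `embed` helper and the two-node toy corpus (values abridged from the outputs in this notebook) are assumptions made for the example.

```python
import copy

# Toy corpus: the experiment only holds a reference to its activity.
corpus_nodes = [
    {
        "id": "cmip6plus:experiment/1pctco2",
        "label": "1pctCO2",
        "activity": {"id": "universal:activity/cmip"},  # reference only
    },
    {"id": "universal:activity/cmip", "label": "CMIP"},
]


def embed(node, index, seen=frozenset()):
    """Return a copy of node with id-only references replaced by the full node."""
    seen = seen | {node.get("id")}  # track visited ids to avoid reference cycles
    out = {}
    for key, value in node.items():
        is_ref = isinstance(value, dict) and set(value) == {"id"}
        if is_ref and value["id"] in index and value["id"] not in seen:
            out[key] = embed(index[value["id"]], index, seen)
        else:
            out[key] = copy.deepcopy(value)
    return out


index = {n["id"]: n for n in corpus_nodes}
framed = embed(index["cmip6plus:experiment/1pctco2"], index)
print(framed["activity"]["label"])
```

Without the activity node in the corpus, the reference would stay as a bare `{"id": ...}`, which is why the dependencies have to be collected first.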

#### Get dependencies

In [8]:
dep = cmipld.processor.depends(url,graph=True)
dep

['https://wcrp-cmip.github.io/CMIP6Plus_CVs/experiment/graph.jsonld',
 'https://wcrp-cmip.github.io/WCRP-universe/source-type/graph.jsonld',
 'https://wcrp-cmip.github.io/WCRP-universe/activity/graph.jsonld']

In [9]:
cmipld.processor.contextify('https://wcrp-cmip.github.io/CMIP6Plus_CVs/experiment/_context_')

'https://wcrp-cmip.github.io/CMIP6Plus_CVs/experiment/_context_'

##### Extract relevant graphs 

In [10]:
from p_tqdm import p_map
corpus = {'@graph':p_map(cmipld.jsonld.expand,dep)}

100%|██████████| 3/3 [00:00<00:00, 34.50it/s]


##### Get the data

We start by setting up our frame

In [11]:
context = cmipld.processor.contextify(url)
context

'https://wcrp-cmip.github.io/CMIP6Plus_CVs/experiment/_context_'

In [12]:
frame = {
    "@context": context,
    # we want the file with an id 1pctco2
    "id":url
}

Then we use it to extract entries from the JSON-LD corpus.

In [13]:
entry_1pctco2 = cmipld.jsonld.frame(corpus,frame)
jprint(entry_1pctco2)

{
    "@context": "https://wcrp-cmip.github.io/CMIP6Plus_CVs/experiment/_context_",
    "id": "cmip6plus:experiment/1pctco2",
    "type": [
        "wcrp:experiment",
        "cmip6plus"
    ],
    "activity": {
        "id": "universal:activity/cmip",
        "type": [
            "wcrp:activity",
            "universal"
        ],
        "description": "CMIP DECK: 1pctCO2, abrupt-4xCO2, amip, esm-piControl, esm-historical, historical, and piControl experiments",
        "label": "CMIP",
        "url": "https://gmd.copernicus.org/articles/9/1937/2016/gmd-9-1937-2016.pdf"
    },
    "description": "DECK: 1pctCO2",
    "end": -999,
    "label": "1pctCO2",
    "min_number_yrs_per_sim": 150,
    "model_realms": {
        "universal:source-type/aer": {
            "type": [
                "wcrp:source_type",
                "universal"
            ],
            "description": "aerosol treatment in an atmospheric model where concentrations are calculated based on emissions, transformatio

Here we can see that referenced entries from the universal repo have been substituted in.

In [14]:
jprint(entry_1pctco2['activity'])

{
    "id": "universal:activity/cmip",
    "type": [
        "wcrp:activity",
        "universal"
    ],
    "description": "CMIP DECK: 1pctCO2, abrupt-4xCO2, amip, esm-piControl, esm-historical, historical, and piControl experiments",
    "label": "CMIP",
    "url": "https://gmd.copernicus.org/articles/9/1937/2016/gmd-9-1937-2016.pdf"
}


## Specific framing
As was demonstrated in the documentation, we can make our framing notably more specific. Here we extract all experiments, and for the referenced fields we keep only their labels.

In [15]:
frame2 = {
    "@context": context,
    # match every node of this type instead of a single id
    # "id":url
    "type":"wcrp:experiment",
    "activity":{"@explicit":True, "label":{}},
    "parent_experiment":{"@explicit":True, "label":{}},
    "sub_parent":{"@explicit":True, "label":{}},
}
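The effect of `"@explicit": True` on a nested node can be imitated with a small filter. This is a conceptual sketch, not the framing algorithm itself: `apply_explicit` is a hypothetical helper, and it assumes that `id` and `type` are always retained while every other property must be named in the frame.

```python
def apply_explicit(node, wanted):
    """Keep 'id', 'type', and only the properties named in wanted."""
    keep = {"id", "type", *wanted}
    return {key: value for key, value in node.items() if key in keep}


# The fully embedded activity node, as printed earlier in this notebook.
activity = {
    "id": "universal:activity/cmip",
    "type": ["wcrp:activity", "universal"],
    "description": "CMIP DECK: 1pctCO2, abrupt-4xCO2, amip, esm-piControl, "
                   "esm-historical, historical, and piControl experiments",
    "label": "CMIP",
    "url": "https://gmd.copernicus.org/articles/9/1937/2016/gmd-9-1937-2016.pdf",
}

trimmed = apply_explicit(activity, wanted=["label"])
print(trimmed)
```

Properties that the frame does not mention at the top level, such as `description` or `model_realms` on the experiment itself, are unaffected, because `@explicit` is only set inside the nested entries.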

In [16]:
experiment_list = experiments.frame(frame2)
# or 
# experiment_list = cmipld.jsonld.frame(corpus,frame2)["@graph"]


print(f'Number of experiments found: {len(experiment_list)}\n')

for i in experiment_list:
    print(i['id'])

Number of experiments found: 50

cmip6plus:experiment/1pctco2
cmip6plus:experiment/abrupt-4xco2
cmip6plus:experiment/amip
cmip6plus:experiment/esm-hist
cmip6plus:experiment/esm-picontrol
cmip6plus:experiment/esm-picontrol-spinup
cmip6plus:experiment/esm-up2p0
cmip6plus:experiment/esm-up2p0-gwl2p0
cmip6plus:experiment/esm-up2p0-gwl2p0-50y-dn2p0
cmip6plus:experiment/esm-up2p0-gwl4p0
cmip6plus:experiment/esm-up2p0-gwl4p0-50y-dn2p0
cmip6plus:experiment/esm-up2p0-gwl4p0-50y-dn2p0-gwl2p0
cmip6plus:experiment/fut-aer
cmip6plus:experiment/fut-ghg
cmip6plus:experiment/fut-lu
cmip6plus:experiment/fut-sol
cmip6plus:experiment/fut-totalo3
cmip6plus:experiment/fut-volc
cmip6plus:experiment/hist-aer
cmip6plus:experiment/hist-ghg
cmip6plus:experiment/hist-lu
cmip6plus:experiment/hist-nat
cmip6plus:experiment/hist-nolu
cmip6plus:experiment/hist-piaer
cmip6plus:experiment/hist-pighg
cmip6plus:experiment/hist-pisol
cmip6plus:experiment/hist-pitotalo3
cmip6plus:experiment/hist-pivolc
cmip6plus:experiment
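When switching between the wrapper and a direct framing call, the result may arrive in different shapes: the commented alternative above indexes `"@graph"`, while a frame that matches one node printed a single object. The helper below is a hedged sketch for normalising these cases; `as_node_list` is a hypothetical name, and the shapes it handles are assumptions based on the outputs in this notebook.

```python
def as_node_list(framed):
    """Return framed nodes as a list, whatever shape the framing call produced."""
    if isinstance(framed, list):
        return framed  # already a list of nodes
    if "@graph" in framed:
        return list(framed["@graph"])  # several matches wrapped in @graph
    # a single matched node: drop the context and wrap it in a list
    return [{k: v for k, v in framed.items() if k != "@context"}]


# Toy inputs standing in for real framing results.
single = {"@context": "ctx", "id": "cmip6plus:experiment/1pctco2"}
many = {"@context": "ctx", "@graph": [{"id": "a"}, {"id": "b"}]}

print(len(as_node_list(single)), len(as_node_list(many)))
```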

In [17]:
jprint(experiment_list[0])

{
    "id": "cmip6plus:experiment/1pctco2",
    "type": [
        "wcrp:experiment",
        "cmip6plus"
    ],
    "activity": {
        "id": "universal:activity/cmip",
        "type": [
            "wcrp:activity",
            "universal"
        ],
        "label": "CMIP"
    },
    "description": "DECK: 1pctCO2",
    "end": -999,
    "label": "1pctCO2",
    "min_number_yrs_per_sim": 150,
    "model_realms": {
        "universal:source-type/aer": {
            "type": [
                "wcrp:source_type",
                "universal"
            ],
            "description": "aerosol treatment in an atmospheric model where concentrations are calculated based on emissions, transformation, and removal processes (rather than being prescribed or omitted entirely)",
            "is_required": [
                false,
                true
            ],
            "label": "AER"
        },
        "universal:source-type/aogcm": {
            "type": [
                "wcrp:source_type",
