### CCF API Use Case Documentation

[Click here](https://github.com/hubmapconsortium/ccf-ui/blob/main/ccf-api-usage.ipynb) to view the installation instructions and general documentation for the ccf-api Python module.

In this notebook we demonstrate one particular use case for the CCF API module.

#### First, import the necessary packages from ccf_openapi_client.

In [1]:
import ccf_openapi_client
from ccf_openapi_client.api import default_api

#### Import the other packages needed to produce the output.
The pandas module provides the DataFrame used to display and aggregate the results. <br>
The csv module is used to write the data to a CSV file.

In [2]:
import pandas as pd
import csv

#### Configuration
You'll need to point the host in the configuration to the instance you'd like to work with. 
More Info [here](https://github.com/hubmapconsortium/ccf-ui/blob/main/ccf-api-usage.ipynb).

In [3]:

configuration = ccf_openapi_client.Configuration(
    host = "https://r5i95k35v5.us-east-2.awsapprunner.com/v1"
)
api_client = ccf_openapi_client.ApiClient(configuration)

api_instance = default_api.DefaultApi(api_client)

##### Check Database Status
This is an optional step that can reduce wait times for the other methods.

In [4]:
import time

db_ready = False
result = None
# Poll the database status every 2 seconds until it reports 'Ready'
while not db_ready:
    result = api_instance.db_status()
    if result['status'] == 'Ready':
        db_ready = True
    else:
        print('Database not ready yet! Retrying...', result)
        time.sleep(2)
print('Database ready!\n', result)

Database ready!
 {'checkback': 3600000,
 'load_time': 169445,
 'message': 'Database successfully loaded',
 'status': 'Ready'}
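
The loop above retries forever if the database never becomes ready. A minimal sketch of the same polling idea with a timeout is shown below. The helper name `wait_until_ready`, the `'Loading'` status value, and the `fake_db_status` stand-in are assumptions made for illustration; in the notebook you would pass `api_instance.db_status` instead.

```python
import time

def wait_until_ready(get_status, interval=2.0, timeout=60.0):
    """Call get_status() until it reports 'Ready' or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = get_status()
        if result['status'] == 'Ready':
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f'Database not ready after {timeout} seconds: {result}')
        time.sleep(interval)

# Hypothetical stand-in for api_instance.db_status:
# reports 'Loading' twice (an assumed status value), then 'Ready'.
_responses = iter([{'status': 'Loading'}, {'status': 'Loading'}, {'status': 'Ready'}])

def fake_db_status():
    return next(_responses)

status = wait_until_ready(fake_db_status, interval=0.0)
print(status)
```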


### Our use case uses the kidney as the reference organ.
We are using the ontology link for kidney directly: http://purl.obolibrary.org/obo/UBERON_0002113

The scene method gives us information about the Anatomical Structures that the Kidney collides with. 

In [20]:
sex = "both"
ontology_terms = ["http://purl.obolibrary.org/obo/UBERON_0002113",]
sceneResult = None
try:
    sceneResult = api_instance.scene(sex=sex, ontology_terms=ontology_terms)
except ccf_openapi_client.ApiException as e:
    print("Exception when calling DefaultApi->scene: %s\n" % e)

We convert sceneResult into a dictionary keyed by entity_id, where each value is an array of the Anatomical Structures listed in that entity's `ccf_annotations`.

In [6]:
sceneEntityASDict = {}
for scene in sceneResult:
    if 'entity_id' in scene:
        if scene['entity_id'] not in sceneEntityASDict:
            sceneEntityASDict[scene['entity_id']] = []
        sceneEntityASDict[scene['entity_id']].extend(scene['ccf_annotations'])
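
To make the shape of the resulting dictionary concrete, here is a small sketch of the same grouping on made-up data, using `collections.defaultdict`. The sample nodes and their identifiers are hypothetical and only contain the two fields the loop above reads.

```python
from collections import defaultdict

# Hypothetical scene nodes; real nodes carry many more fields.
sample_scene = [
    {'entity_id': 'block-1', 'ccf_annotations': ['AS_A', 'AS_B']},
    {'entity_id': 'block-1', 'ccf_annotations': ['AS_C']},
    {'ccf_annotations': ['AS_A']},  # no entity_id, so it is skipped
    {'entity_id': 'block-2', 'ccf_annotations': ['AS_B']},
]

grouped = defaultdict(list)
for node in sample_scene:
    if 'entity_id' in node:
        # defaultdict creates the empty list on first access
        grouped[node['entity_id']].extend(node['ccf_annotations'])

grouped = dict(grouped)
print(grouped)
```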

We then use a SPARQL query to get the AS label, CT label, and CT IRI for the AS IRIs obtained in the previous step.

In [7]:
query = '''PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ccf: <http://purl.org/ccf/>

SELECT DISTINCT (STR(?asLabel) as ?as_label) (STR(?qlabel) as ?cell_label) ?as_iri ?cell_iri WHERE {
  ?cell_iri ccf:ccf_located_in ?as_iri .
  ?cell_iri rdfs:label ?qlabel .
  ?as_iri rdfs:label ?asLabel .

  FILTER (?as_iri in (%s))
}'''

purlIris = set()
# Collect all unique AS IRIs
for k in sceneEntityASDict.keys():
    for v in sceneEntityASDict[k]:
        purlIris.add(v)
        

purlString = ", ".join("<" + s + ">" for s in purlIris)
queryResponse = None
try:
    queryResponse = api_instance.sparql(query=query % purlString, format='application/json')
except ccf_openapi_client.ApiException as e:
    print("Exception when calling DefaultApi->sparql: %s\n" % e)
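
The `%s` placeholder in the query is filled with a comma-separated list of IRIs, each wrapped in angle brackets. The sketch below shows that substitution in isolation. The shortened `query_template` is an assumption made to keep the example small, and `sorted()` is added only so the generated text is deterministic.

```python
# Shortened, illustrative template; the notebook's query selects more fields.
query_template = 'SELECT ?as_iri WHERE { ?as_iri ?p ?o . FILTER (?as_iri in (%s)) }'

sample_iris = {
    'http://purl.obolibrary.org/obo/UBERON_0002113',
    'http://purl.obolibrary.org/obo/UBERON_0002015',
}

# Wrap each IRI in <...> and join with commas.
# A set has no fixed order, so sort for a reproducible query string.
iri_list = ', '.join(f'<{iri}>' for iri in sorted(sample_iris))
final_query = query_template % iri_list
print(final_query)
```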

We now reverse sceneEntityASDict so that each AS maps to the set of entity_ids that reference it.

In [8]:
def reverseObject(input_object):
    reversed_object = {}

    for key, values in input_object.items():
        for value in values:
            if value not in reversed_object:
                reversed_object[value] = set()
            reversed_object[value].add(key)
    return reversed_object

ASEntityDict = reverseObject(sceneEntityASDict)
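
As an illustration of what the reversal produces, the sketch below applies an equivalent function to a tiny made-up mapping. The name `reverse_mapping` and the sample identifiers are hypothetical; `dict.setdefault` is used here as a compact alternative to the explicit membership check.

```python
def reverse_mapping(mapping):
    """Turn {key: [values]} into {value: {keys}}."""
    reversed_mapping = {}
    for key, values in mapping.items():
        for value in values:
            reversed_mapping.setdefault(value, set()).add(key)
    return reversed_mapping

# Hypothetical input: two blocks that share one anatomical structure.
sample = {'block-1': ['AS_A', 'AS_B'], 'block-2': ['AS_B']}
reversed_sample = reverse_mapping(sample)
print(reversed_sample)
```

Here `AS_B` ends up mapped to both blocks, while `AS_A` maps only to `block-1`.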

Using the `tissue_blocks` method, we get the donor label information for each entity_id.

In [25]:
sex = "both"
ontology_terms = ["http://purl.obolibrary.org/obo/UBERON_0002113",]
tissueBlockResult = None
try:
    tissueBlockResult = api_instance.tissue_blocks(sex=sex, ontology_terms=ontology_terms)
except ccf_openapi_client.ApiException as e:
    print("Exception when calling DefaultApi->tissue_blocks: %s\n" % e)

# Convert the TissueBlock Object to a dict
tissueBlockResultList = list(map(lambda b: b.to_dict(), tissueBlockResult))

To make it easier to find the donor labels, we convert the above result to a dictionary that relates each entity_id to its donor_label.

In [28]:
blockDonorLabel = {}
for tissueBlock in tissueBlockResultList:
    blockDonorLabel[tissueBlock['@id']] = tissueBlock['donor']['label']
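
The direct indexing above raises a `KeyError` if a record has no donor label. Whether that can happen with real data is not stated here, so the following is only a defensive sketch on made-up records, using `dict.get` with an empty-string fallback.

```python
# Hypothetical tissue block records; the second one lacks a donor label.
sample_blocks = [
    {'@id': 'block-1', 'donor': {'label': 'Male, Age 65'}},
    {'@id': 'block-2', 'donor': {}},
]

# Fall back to '' when 'donor' or 'label' is missing.
labels = {
    block['@id']: block.get('donor', {}).get('label', '')
    for block in sample_blocks
}
print(labels)

# Looking up an unknown block id can use the same fallback.
unknown = labels.get('block-3', '')
```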

Now we create the report, using data from the above methods.

In [11]:
# A set can't be used to de-duplicate here because dicts are unhashable
mergedData = []
for response in queryResponse:
    entity_ids = ASEntityDict[response['as_iri']]
    for e in entity_ids:
        newResponse = response.copy()
        newResponse['block_id'] = e
        newResponse['donor_label'] = blockDonorLabel[e]
        if newResponse not in mergedData:
            mergedData.append(newResponse)
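
Checking `newResponse not in mergedData` scans the whole list for every row, which gets slow as the list grows. One possible alternative, sketched below on made-up data, keeps a set of hashable keys built from each row's items. It assumes every value in a row is hashable (strings, in this case); the function and variable names are hypothetical.

```python
def merge_rows(query_rows, as_to_blocks, block_to_donor):
    """Attach block id and donor label to each row, skipping exact duplicates."""
    merged, seen = [], set()
    for row in query_rows:
        for block_id in sorted(as_to_blocks.get(row['as_iri'], ())):
            new_row = {**row,
                       'block_id': block_id,
                       'donor_label': block_to_donor.get(block_id, '')}
            # A sorted tuple of items is hashable, unlike the dict itself.
            key = tuple(sorted(new_row.items()))
            if key not in seen:
                seen.add(key)
                merged.append(new_row)
    return merged

# Hypothetical inputs; the second query row duplicates the first.
rows = [
    {'as_iri': 'AS_A', 'cell_iri': 'CT_1'},
    {'as_iri': 'AS_A', 'cell_iri': 'CT_1'},
    {'as_iri': 'AS_B', 'cell_iri': 'CT_2'},
]
as_to_blocks = {'AS_A': {'block-1', 'block-2'}, 'AS_B': {'block-2'}}
block_to_donor = {'block-1': 'Male, Age 65', 'block-2': 'Female, Age 57'}

merged = merge_rows(rows, as_to_blocks, block_to_donor)
print(len(merged))
```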

This will save the output as a CSV file. <br>
You can change the name of the file on the first line of the next block.

In [29]:
fileName = "exampleFileName"
import csv

def save_to_csv(data, headers, filename):
    # headers maps each CSV column title to the key used in the data rows
    with open(filename, 'w', newline='') as file:
        writer = csv.writer(file)

        # Write the column titles (the keys of the headers dict)
        writer.writerow(headers)

        # Write one row per record, leaving missing values empty
        for obj in data:
            row = [obj.get(headers[header], '') for header in headers]
            writer.writerow(row)

header_mapping = {
    'Block Id': 'block_id',
    'AS': 'as_iri',
    'AS Label': 'as_label',
    'CT': 'cell_iri',
    'CT Label': 'cell_label',
    'Donor Label': 'donor_label'
}
# Use Python f-string syntax ({fileName}); '${fileName}' would add a literal '$' to the name
save_to_csv(mergedData, header_mapping, f'{fileName}.csv')
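
For comparison, the same title-to-key mapping can be handled by the standard library's `csv.DictWriter`. This is a sketch, not a replacement for the function above: it writes to an in-memory `io.StringIO` buffer so the result can be inspected, and the two sample rows and the shortened mapping are made up.

```python
import csv
import io

# Shortened, illustrative mapping of column title -> row key.
header_mapping = {'Block Id': 'block_id', 'AS Label': 'as_label'}

# Hypothetical rows: one with an extra key, one with a missing key.
rows = [
    {'block_id': 'block-1', 'as_label': 'kidney', 'extra': 'ignored'},
    {'block_id': 'block-2'},
]

buffer = io.StringIO()
writer = csv.DictWriter(
    buffer,
    fieldnames=list(header_mapping.values()),
    restval='',             # value written for missing keys
    extrasaction='ignore',  # silently drop keys not in fieldnames
)
# Write the display titles as the header row instead of the raw keys.
writer.writerow({key: title for title, key in header_mapping.items()})
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)
```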

In [13]:
df = pd.DataFrame(mergedData)
df.head(len(df))

| | cell_label | as_label | cell_iri | as_iri | block_id | donor_label |
|---|---|---|---|---|---|---|
| 0 | "T cell" | "kidney" | http://purl.obolibrary.org/obo/CL_0000084 | http://purl.obolibrary.org/obo/UBERON_0002113 | https://entity.api.hubmapconsortium.org/entiti... | Male, Age 65 |
| 1 | "T cell" | "kidney" | http://purl.obolibrary.org/obo/CL_0000084 | http://purl.obolibrary.org/obo/UBERON_0002113 | https://entity.api.hubmapconsortium.org/entiti... | Female, Age 57 |
| 2 | "T cell" | "kidney" | http://purl.obolibrary.org/obo/CL_0000084 | http://purl.obolibrary.org/obo/UBERON_0002113 | https://entity.api.hubmapconsortium.org/entiti... | Male, Age 66, BMI 29.1 |
| 3 | "T cell" | "kidney" | http://purl.obolibrary.org/obo/CL_0000084 | http://purl.obolibrary.org/obo/UBERON_0002113 | https://entity.api.hubmapconsortium.org/entiti... | Male, Age 41 |
| 4 | "T cell" | "kidney" | http://purl.obolibrary.org/obo/CL_0000084 | http://purl.obolibrary.org/obo/UBERON_0002113 | https://entity.api.hubmapconsortium.org/entiti... | Female, Age 45, BMI 22.6 |
| ... | ... | ... | ... | ... | ... | ... |
| 6106 | "capsule mesenchymal stromal cell" | "kidney capsule" | https://purl.org/ccf/ASCTB-TEMP_capsule-mesenc... | http://purl.obolibrary.org/obo/UBERON_0002015 | https://entity.api.hubmapconsortium.org/entiti... | Male, Age 66, BMI 31.4 |
| 6107 | "capsule mesenchymal stromal cell" | "kidney capsule" | https://purl.org/ccf/ASCTB-TEMP_capsule-mesenc... | http://purl.obolibrary.org/obo/UBERON_0002015 | https://entity.api.hubmapconsortium.org/entiti... | Male, Age 28, BMI 19.2 |
| 6108 | "capsule mesenchymal stromal cell" | "kidney capsule" | https://purl.org/ccf/ASCTB-TEMP_capsule-mesenc... | http://purl.obolibrary.org/obo/UBERON_0002015 | https://entity.api.hubmapconsortium.org/entiti... | Female, Age 55 |
| 6109 | "capsule mesenchymal stromal cell" | "kidney capsule" | https://purl.org/ccf/ASCTB-TEMP_capsule-mesenc... | http://purl.obolibrary.org/obo/UBERON_0002015 | https://entity.api.hubmapconsortium.org/entiti... | Female, Age 69, BMI 49.1 |


In [14]:
# Summary: count the unique ASs, CTs, and blocks in the report
AsCounts = df['as_iri'].value_counts()
CtCounts = df['cell_iri'].value_counts()
blockCounts = df['block_id'].value_counts()

print(f'SUMMARY:\n\
Number of Blocks: {len(blockCounts)},\n\
Number of Unique ASs identified: {len(AsCounts)},\n\
Number of Unique CTs identified: {len(CtCounts)}')

SUMMARY:
Number of Blocks: 83,
Number of Unique ASs identified: 8,
Number of Unique CTs identified: 68
