# Hackathon Mini Challenges Answer Key

1. Find all datasets with _assay_display_name_ `Slideseq [Salmon]` that have a female donor between the ages of 45 and 55

2. Find all the data products for these datasets and create a file matching the format of a HuBMAP CLT manifest.

3. Find datasets for a celltype with the Cells API.

4. Create or explore your own widgets!

In [None]:
# !pip install numpy pandas requests wheel hubmap_template_helper 
# !pip install hubmap_api_py_client

In [None]:
# Importing the required packages
import requests
import json
from hubmap_api_py_client import Client

from hubmap_template_helper import uuids as hth_uuids

## Mini Challenge #1: Find datasets

_In this challenge, we will use the search API to find specific datasets._

**Task: Find all datasets in the Portal with the assay_display_name “Slideseq [Salmon]” that have a female donor between the ages of 45 and 55.**

_Hint: This should give 12 datasets._

In [None]:
search_api = "https://search.api.hubmapconsortium.org/v3/portal/search"

hits = json.loads(
    requests.post(
        search_api,
        json={
            "size": 100,
            "query": {
                "bool": {
                    "must": [
                        {
                            "term": {
                                "assay_display_name.keyword": "Slideseq [Salmon]"
                            }
                        },
                        {
                            "term": {
                                "donor.mapped_metadata.sex.keyword": "Female"
                            }
                        },
                        {
                            "term": {
                                "donor.mapped_metadata.age_unit.keyword": "years"
                            }
                        },
                        {
                            "range": {
                                "donor.mapped_metadata.age_value": {
                                    "gte": 45,
                                    "lte": 55
                                }
                            }
                        }
                    ],
                    "filter": [
                        {
                            "bool": {
                                "must_not": {
                                    "exists": {
                                        "field": "next_revision_uuid"
                                    }
                                }
                            }
                        },
                        {
                            "term": {
                                "entity_type.keyword": "Dataset"
                            }
                        }
                    ]
                }
            },
            "_source": [
                "hubmap_id",
                "assay_display_name",
                "donor.mapped_metadata.sex",
                "donor.mapped_metadata.age_unit",
                "donor.mapped_metadata.age_value",
                "files"
            ],
            "sort": [
                {
                    "donor.mapped_metadata.age_value": {
                        "order": "asc"
                    }
                }
            ]
        }
    ).text
)['hits']['hits']

len(hits)

In [None]:
uuids = [hit['_id'] for hit in hits]
uuids
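
The request body above can also be assembled by a small function, which makes it easier to rerun the search with a different assay or age range. This is a minimal sketch: `build_dataset_query` is a hypothetical helper (not part of any HuBMAP package), and the field names are copied from the query above.

```python
def build_dataset_query(assay, sex, min_age, max_age, size=100):
    """Build a search request body (hypothetical helper, for illustration only)."""
    return {
        "size": size,
        "query": {
            "bool": {
                "must": [
                    {"term": {"assay_display_name.keyword": assay}},
                    {"term": {"donor.mapped_metadata.sex.keyword": sex}},
                    {"term": {"donor.mapped_metadata.age_unit.keyword": "years"}},
                    {
                        "range": {
                            "donor.mapped_metadata.age_value": {
                                "gte": min_age,
                                "lte": max_age,
                            }
                        }
                    },
                ],
                "filter": [
                    # Exclude datasets that have been superseded by a newer revision
                    {"bool": {"must_not": {"exists": {"field": "next_revision_uuid"}}}},
                    {"term": {"entity_type.keyword": "Dataset"}},
                ],
            }
        },
        "_source": ["hubmap_id", "assay_display_name", "files"],
    }


body = build_dataset_query("Slideseq [Salmon]", "Female", 45, 55)
```

Passing `body` as the `json` argument of `requests.post(search_api, ...)` and reading `['hits']['hits']` from the parsed response should select the same datasets as the cell above, although the results are unsorted and carry fewer `_source` fields.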

## Mini Challenge #2: Create manifest for CLT

_The HuBMAP CLT (Command Line Transfer) allows for easy downloading of files. For this, we need a manifest file with the desired files. In this challenge, we will create a manifest.txt. We won’t use the CLT, but you can try this out later._

_A manifest file looks like this:_

>HBM738.KGBN.464   leiden_cluster_rna.pdf

>HBM745.HTBD.332   expr.h5ad

>HBM277.SBVV.838   /   <- this downloads all files in the dataset


**Task: For the datasets in Challenge #1, find all files that are a data product and create a manifest file.**

_Hint: This should give 36 data products._


In [None]:
data_products = []
for hit in hits:
    # Fall back to an empty list in case a dataset has no "files" entry
    for file in hit["_source"].get("files", []):
        # Keep only files explicitly flagged as data products
        if file.get("is_data_product"):
            data_products.append([hit["_source"]["hubmap_id"], file["rel_path"]])


In [None]:
data_products[0:5]

In [None]:
with open("manifest.txt", "w") as f:
    for data_product in data_products:
        f.write(f"{data_product[0]} {data_product[1]}")
        f.write("\n")
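
Before writing the file, it can help to build the manifest lines in memory so they can be inspected or de-duplicated. The following is a minimal sketch, assuming the same `[hubmap_id, rel_path]` pairs as in `data_products`; `manifest_lines` is a hypothetical helper and the example pairs are taken from the sample manifest shown earlier.

```python
def manifest_lines(pairs):
    """Return one 'HUBMAP_ID path' line per unique (id, path) pair, keeping input order."""
    seen = set()
    lines = []
    for hubmap_id, rel_path in pairs:
        key = (hubmap_id, rel_path)
        if key in seen:
            continue  # skip pairs that were already emitted
        seen.add(key)
        lines.append(f"{hubmap_id} {rel_path}")
    return lines


example = [
    ["HBM738.KGBN.464", "leiden_cluster_rna.pdf"],
    ["HBM745.HTBD.332", "expr.h5ad"],
    ["HBM745.HTBD.332", "expr.h5ad"],  # duplicate entry, dropped by the helper
]
lines = manifest_lines(example)
```

The resulting list can then be written with `f.write("\n".join(lines) + "\n")`.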

## Mini Challenge #3: Use the Cells API

_In this challenge, we will use the Cells API: https://github.com/hubmapconsortium/hubmap-api-py-client/blob/main/examples/select_celltypes.md_

_Create a client like this:_

`client = Client('https://cells.api.hubmapconsortium.org/api/')`


**Task: Find all datasets that have celltype “CL:000057”. For each dataset, find all the cell types that it contains. Also find the assay type of each dataset. Create a heatmap of these results.**

_Hint: There should be 43 datasets that contain this celltype._

_Hint: Use the search API to find the assay type._


In [None]:
# endpoints
client = Client('https://cells.api.hubmapconsortium.org/api/')

In [None]:
# get the first cell type (CL:000057)
cell_type = client.select_celltypes().get_list()[0]['grouping_name']

# get the datasets with this celltype
uuids = [hit["uuid"] for hit in list(client.select_datasets(where="celltype",has=[cell_type]).get_list())]

In [None]:
# for each dataset, get the cell types it contains
# (the slice limits this to the first two datasets; remove it to process all of them)
mapping = {}
for uuid in uuids[0:2]:
    cell_type_uuids = list(client.select_celltypes(where="dataset", has=[uuid]).get_list())
    cells = [cell["grouping_name"] for cell in cell_type_uuids]
    mapping[uuid] = cells
mapping
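
The remaining parts of the task (assay types and the heatmap) are not covered by the cells above. The sketch below shows one possible way to prepare them, under stated assumptions: `build_assay_query` and `presence_matrix` are hypothetical helpers, the Elasticsearch `ids` query is assumed to work here because the search hits expose the dataset UUID as `_id`, and the toy mapping uses made-up identifiers rather than real results.

```python
def build_assay_query(dataset_uuids):
    """Request body asking the search API for the assay name of each dataset (assumed usage)."""
    return {
        "size": len(dataset_uuids),
        "query": {"ids": {"values": list(dataset_uuids)}},
        "_source": ["assay_display_name"],
    }


def presence_matrix(mapping):
    """Turn {dataset: [cell types]} into a 0/1 matrix: rows are datasets, columns are cell types."""
    cell_types = sorted({ct for cts in mapping.values() for ct in cts})
    rows = []
    for dataset in mapping:
        present = set(mapping[dataset])
        rows.append([1 if ct in present else 0 for ct in cell_types])
    return list(mapping), cell_types, rows


# Made-up identifiers, for illustration only
toy_mapping = {"dataset-a": ["CL:1", "CL:2"], "dataset-b": ["CL:2", "CL:3"]}
datasets, cell_types, rows = presence_matrix(toy_mapping)
assay_query = build_assay_query(datasets)
```

Here `rows` is a plain list of lists, so it can be handed to a plotting function such as matplotlib's `imshow` or seaborn's `heatmap`, with `datasets` and `cell_types` as the axis labels.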

## Mini Challenge #4: Widgets
Find your favourite widget or create a new widget (e.g., with anywidget) and share it with others.

You can find existing widgets created with anywidget at https://anywidget.dev/en/community/
