# **Tutorial 2: Get the Location of a Cell Type with Atlasapprox API**

This tutorial shows how to find the organs (locations) in which a cell type occurs, using the [Atlasapprox API](https://atlasapprox.readthedocs.io/en/latest/python/index.html). It walks through each step, from installation to querying and searching the results.

<hr>

## **Requirements and Installation**

To use the atlasapprox API, install the following Python packages with `pip` from your terminal:
- `requests`

- `pandas`

In [None]:
pip install pandas requests

To install *atlasapprox*, run the following `pip` command in your terminal.

In [None]:
pip install atlasapprox
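
After installation, it can be useful to confirm that the packages are visible to the Python interpreter you are running. The snippet below is a minimal sketch using only the standard library; the helper name `installed_version` is made up for this illustration and is not part of atlasapprox.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Report the status of each package this tutorial relies on
for name in ("requests", "pandas", "atlasapprox"):
    found = installed_version(name)
    print(f"{name}: {found if found else 'not installed'}")
```

If a package is reported as not installed, rerun the corresponding `pip install` command in the same environment.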



## **Getting Started**



First, import `atlasapprox` and `pandas`, then instantiate the `API` object.

In [None]:
import atlasapprox
import pandas as pd

# instantiate
api = atlasapprox.API()


### **Example 1: Average Gene Expression in Human Lung Tissue**

This example uses the `average` function to retrieve the average expression of five selected genes (*COL13A1*, *COL14A1*, *TGFBI*, *PDGFRA*, and *GZMA*) in human lung tissue.

The measurement type in this example is *gene_expression*.

In [None]:
avg_expr_lung = api.average(
    organism = "h_sapiens", 
    organ = "lung", 
    features = ["COL13A1", "COL14A1", "TGFBI", "PDGFRA", "GZMA"], 
    measurement_type = 'gene_expression'
)

# Display the result
avg_expr_lung

Unnamed: 0,neutrophil,basophil,monocyte,macrophage,dendritic,B,plasma,T,NK,plasmacytoid,...,capillary,CAP2,lymphatic,fibroblast,alveolar fibroblast,smooth muscle,vascular smooth muscle,pericyte,mesothelial,ionocyte
COL13A1,0.0,0.222863,0.0,0.000711,0.0,0.0,0.002205,0.0,0.029147,0.0,...,0.003937,0.0,0.0,0.005113,0.446961,0.0,0.131642,0.06796,0.0,0.0
COL14A1,0.0,0.0,0.001422,0.001362,0.0,0.0,0.002607,0.0,0.0,0.0,...,0.007525,0.026666,0.059648,1.110076,1.226022,1.033389,2.10846,0.03358,0.0,0.0
TGFBI,0.06515,0.111107,1.802062,1.252701,2.190132,0.0,0.083882,0.10046,0.32661,4.492828,...,0.045932,0.06761,0.521915,0.393191,0.175393,0.311884,0.258512,0.11901,0.404976,0.032419
PDGFRA,0.0,0.0,0.000965,0.002414,0.003172,0.0,0.0,0.005035,0.0,0.0,...,0.011427,0.00292,0.0,1.772957,3.724075,0.128634,0.059852,0.0,0.332479,0.0
GZMA,0.013437,0.142837,0.174047,0.029326,0.020453,0.025113,0.063292,9.006065,19.687157,0.0,...,0.044351,0.042996,0.073877,0.029919,0.081036,0.119041,0.0,0.460141,0.044982,0.058806


**Output:** The **avg_expr_lung** variable holds a *pandas.DataFrame* with the average expression of the specified genes across the cell types of human lung tissue. Each column is a **cell type** and each row is a **gene**. This output is referred to as the "original DataFrame" in the next sections.

These values are typically normalized counts, such as counts per ten thousand (cptt).
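
Because rows are genes and columns are cell types, standard pandas reductions can summarise the table. The sketch below rebuilds a small subset of the output above by hand (five of the displayed columns, values copied from the table) and finds, for each gene, the cell type with the highest average expression within that subset. The names `avg_expr_subset` and `summary` are illustrative only; with a live connection you would apply the same calls to `avg_expr_lung`.

```python
import pandas as pd

# Subset of the output above: rows are genes, columns are cell types
avg_expr_subset = pd.DataFrame(
    {
        "T": [0.0, 0.0, 0.10046, 0.005035, 9.006065],
        "NK": [0.029147, 0.0, 0.32661, 0.0, 19.687157],
        "plasmacytoid": [0.0, 0.0, 4.492828, 0.0, 0.0],
        "alveolar fibroblast": [0.446961, 1.226022, 0.175393, 3.724075, 0.081036],
        "vascular smooth muscle": [0.131642, 2.10846, 0.258512, 0.059852, 0.0],
    },
    index=["COL13A1", "COL14A1", "TGFBI", "PDGFRA", "GZMA"],
)

# For each gene (row), take the column label and the value of the maximum
summary = pd.DataFrame(
    {
        "top_cell_type": avg_expr_subset.idxmax(axis=1),
        "expression": avg_expr_subset.max(axis=1),
    }
)
print(summary)
```

The result only reflects the columns copied here; the full DataFrame contains more cell types, so the top cell type for a gene may differ when all columns are included.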


### **Example 2: Average Chromatin Accessibility in Human Lung Tissue**

This example uses the `average` function to retrieve the average chromatin accessibility at two genomic regions (*chr1:9955-10355* and *chr10:122199710-122200110*) in human lung tissue.

The measurement type in this example is *chromatin_accessibility*.

In [None]:
avg_chr_acc_lung = api.average(
    organism = "h_sapiens", 
    organ = "lung", 
    features = ["chr1:9955-10355", "chr10:122199710-122200110"], 
    measurement_type = 'chromatin_accessibility'
)

# Display the result
avg_chr_acc_lung

Unnamed: 0,mast,macrophage,alveolar macrophage,B,plasma,T,NK,AT1,AT2,club,ciliated,capillary,lymphatic,fibroblast,smooth muscle,vascular smooth muscle,pericyte,mesothelial,neuroendocrine
chr1:9955-10355,0.0,0.0079,0.01819,0.003077,0.017543,0.008494,0.014113,0.011353,0.010183,0.004265,0.0,0.008808,0.040392,0.009133,0.0,0.0,0.008808,0.0,0.0
chr10:122199710-122200110,0.0,0.001852,0.003558,0.006129,0.0,0.001337,0.0,0.00659,0.006129,0.001437,0.005302,0.002567,0.0,0.000776,0.0,0.0,0.004423,0.0,0.0


**Output:** The **avg_chr_acc_lung** variable holds a *pandas.DataFrame* with the average chromatin accessibility of the specified genomic regions across the cell types of human lung tissue. Each column is a **cell type** and each row is a **genomic region**.

These values are typically normalized counts, such as counts per ten thousand (cptt).

This output can also serve as the original data in later steps. Note that the atlasapprox API currently provides chromatin accessibility data only for human (*h_sapiens*).
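
Many entries in an accessibility table are zero, so a common first step is to list the cell types in which a region shows any signal. The sketch below copies six of the columns from the output above into a small DataFrame and filters each row with a boolean mask. The names `acc_subset` and `accessible` are illustrative only, and "greater than zero" is just one possible threshold.

```python
import pandas as pd

# Subset of the output above: rows are genomic regions, columns are cell types
acc_subset = pd.DataFrame(
    {
        "mast": [0.0, 0.0],
        "macrophage": [0.0079, 0.001852],
        "plasma": [0.017543, 0.0],
        "NK": [0.014113, 0.0],
        "ciliated": [0.0, 0.005302],
        "lymphatic": [0.040392, 0.0],
    },
    index=["chr1:9955-10355", "chr10:122199710-122200110"],
)

# For each region, keep the labels of the cell types with a non-zero value
accessible = {
    region: row[row > 0].index.tolist()
    for region, row in acc_subset.iterrows()
}

for region, cell_types in accessible.items():
    print(region, "->", cell_types)
```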


## **Getting the Location of a Cell Type**

First, instantiate the API object:

In [5]:
api = atlasapprox.API()

To get the organs (locations) where a cell type is found, we need the following parameters:

- organism (the organism to query).

- cell_type (the cell type to find organs for).

- measurement_type (the measurement type to query, for example *gene_expression* or *chromatin_accessibility*).

In [16]:
# Query the organs in which macrophages are found
celltype_location = api.celltype_location(
    organism = "h_sapiens",
    cell_type = "macrophage",
    measurement_type = "gene_expression"
)

# Wrap the organ names in a DataFrame and number the rows from 1
df_organs = pd.DataFrame(celltype_location, columns=['Organs'])
df_organs.index = df_organs.index + 1

df_organs

Unnamed: 0,Organs
1,bladder
2,blood
3,eye
4,fat
5,heart
6,kidney
7,liver
8,lung
9,lymphnode
10,mammary


This code wraps the organs returned by `celltype_location` in a pandas.DataFrame, with rows numbered from 1. Each row is an organ in which macrophages are found.
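
The same query can be repeated for several cell types and collected into one table. The following is a minimal sketch under stated assumptions: `locations_table` and `fake_locate` are hypothetical helpers written for this illustration, not part of atlasapprox. The lookup function is passed in as an argument, so the example runs offline with a stand-in; the macrophage organs are copied from the output above, while `example_cell` and `example_organ` are placeholders, not real atlas results.

```python
import pandas as pd


def locations_table(locate, organism, cell_types, measurement_type="gene_expression"):
    """Build a long-form table with one row per (cell type, organ) pair.

    `locate` is any callable that accepts the keyword arguments organism,
    cell_type and measurement_type and returns an iterable of organ names.
    """
    rows = []
    for cell_type in cell_types:
        organs = locate(
            organism=organism,
            cell_type=cell_type,
            measurement_type=measurement_type,
        )
        rows.extend({"cell_type": cell_type, "organ": organ} for organ in organs)
    return pd.DataFrame(rows, columns=["cell_type", "organ"])


# Offline stand-in for the API (placeholder data except for macrophage)
_STAND_IN = {
    "macrophage": ["bladder", "blood", "eye", "fat", "heart", "kidney",
                   "liver", "lung", "lymphnode", "mammary"],
    "example_cell": ["example_organ"],
}


def fake_locate(organism, cell_type, measurement_type):
    """Return canned organ names instead of querying the network."""
    return _STAND_IN[cell_type]


table = locations_table(fake_locate, "h_sapiens", ["macrophage", "example_cell"])
print(table.groupby("cell_type")["organ"].count())
```

With a live connection, passing `api.celltype_location` in place of `fake_locate` should work in the same way, since it takes the same three keyword arguments shown in the cell above.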

## **Useful methods**

### **Search if a Specific Organ Is Included**

In [25]:
# Organ name to look for in the 'Organs' column
search_organ = 'lymphnode'

# Keep only the rows whose organ name contains the search string
# (plain substring match, no regular expressions)
mask = df_organs['Organs'].astype(str).str.contains(search_organ, regex=False)
display_result = df_organs[mask]

display_result

Unnamed: 0,Organs
9,lymphnode
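
The search above matches substrings. When you want exact matches for several organs at once, `Series.isin` is an alternative. The sketch below rebuilds `df_organs` from the ten organs displayed earlier so that it runs on its own; the names `wanted`, `found` and `missing` are illustrative only.

```python
import pandas as pd

# Rebuild the table from the organs displayed above, numbered from 1
organs = ["bladder", "blood", "eye", "fat", "heart", "kidney",
          "liver", "lung", "lymphnode", "mammary"]
df_organs = pd.DataFrame(organs, columns=["Organs"])
df_organs.index = df_organs.index + 1

# Organs to check; matching is exact and case-sensitive
wanted = ["lung", "liver", "brain"]

# Rows whose organ is one of the requested names
found = df_organs[df_organs["Organs"].isin(wanted)]

# Requested names that do not appear in the table
missing = sorted(set(wanted) - set(df_organs["Organs"]))

print(found)
print("Not in the table:", missing)
```

Here "not in the table" is relative to the ten organs displayed above, not to the whole atlas.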
