# `AnnData` Conversion

The purpose of this notebook is to convert the cell table to [`AnnData`](https://anndata.readthedocs.io/en/latest/index.html) objects.

`AnnData` stands for Annotated Data, and is a data structure well suited to single-cell data. It is a multi-faceted object composed of matrices and DataFrames.

In [2]:
from dask.distributed import Client
from ark.utils.data_utils import ConvertToAnnData, load_anndatas
import os

In [3]:
Client(threads_per_worker=2)

Connection method: Cluster object, Cluster type: distributed.LocalCluster
Dashboard: http://127.0.0.1:8787/status
Workers: 5, Total threads: 10, Total memory: 64.00 GiB
Status: running, Using processes: True
Each worker: 2 threads, 12.80 GiB memory

In [4]:
base_dir = "../data/example_dataset/"

## 0. Download the Example Dataset

Here we are using the example data located in `/data/example_dataset/input_data`. To modify this notebook to run using your own data, simply change `base_dir` to point to your own sub-directory within the data folder.

* `base_dir`: the path to all of your imaging data. This directory will contain all of the data generated by this notebook, as well as the data previously generated by segmentation and cell clustering.
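
As a rough illustration of how `base_dir` is used, the sketch below derives the input and output locations that the later cells build from it and reports whether the cell table is already present. The helper `describe_base_dir` is hypothetical and not part of `ark`; the sub-paths are the ones used further down in this notebook.

```python
from pathlib import Path


def describe_base_dir(base_dir):
    """Hypothetical helper: list the paths this notebook derives from base_dir."""
    base = Path(base_dir)
    cell_table = (
        base / "segmentation" / "cell_table"
        / "cell_table_size_normalized_cell_labels.csv"
    )
    return {
        "cell_table": cell_table,
        # False until the example dataset (or your own data) is in place.
        "cell_table_exists": cell_table.is_file(),
        "anndata_dir": base / "anndata",
    }


info = describe_base_dir("../data/example_dataset/")
for name, value in info.items():
    print(f"{name}: {value}")
```

Running a check like this before the conversion can catch a mistyped `base_dir` early.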

In [5]:
from ark.utils.example_dataset import get_example_dataset

get_example_dataset(dataset="post_clustering", save_dir=base_dir, overwrite_existing=True)



## 1. Convert the Cell Table to `AnnData` Objects

- `cell_table_path`: The path to the cell table that you wish to convert to `AnnData` objects. 

In [None]:
cell_table_path = os.path.join(base_dir, "segmentation/cell_table/cell_table_size_normalized_cell_labels.csv")

- `markers`: The names of the markers to extract from the cell table. List each marker you want to use, or set it to `None` to extract all markers.
- `extra_obs_parameters`: The names of any extra columns in the cell table to extract. List each parameter you want to use, or set it to `None` to extract all parameters.
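
The "`None` means everything" convention described above can be sketched with plain pandas. This is only an illustration of the idea, not how `ConvertToAnnData` is implemented: the toy column names, the `METADATA_COLUMNS` list, and the `select_columns` helper are all assumptions made for this example.

```python
import pandas as pd

# Toy cell table; the column names are illustrative only.
cell_table = pd.DataFrame({
    "CD4": [0.1, 1.5],
    "CD8": [2.0, 0.0],
    "label": [1, 2],
    "fov": ["fov0", "fov0"],
    "area": [120, 95],
})

# Assumption for this sketch: these columns are metadata, the rest are markers.
METADATA_COLUMNS = ["label", "fov", "area"]


def select_columns(table, markers=None, extra_obs_parameters=None):
    """Hypothetical helper: return (marker, metadata) frames; None selects all."""
    if markers is None:
        markers = [c for c in table.columns if c not in METADATA_COLUMNS]
    if extra_obs_parameters is None:
        extra_obs_parameters = [c for c in table.columns if c in METADATA_COLUMNS]
    return table[markers], table[extra_obs_parameters]


# None for both arguments keeps every marker and every metadata column.
all_markers, all_obs = select_columns(cell_table)

# Explicit lists restrict the selection to the named columns.
only_cd4, only_area = select_columns(
    cell_table, markers=["CD4"], extra_obs_parameters=["area"]
)
```

Passing explicit lists is useful when the cell table carries columns you do not need downstream, since it keeps the resulting objects smaller.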

In [None]:
# markers = ["CD14", "CD163", "CD20", "CD3", "CD31", "CD4", "CD45", "CD68", "CD8", "CK17", "Collagen1", "ECAD",
#               "Fibronectin", "GLUT1", "H3K27me3", "H3K9ac", "HLADR", "IDO", "Ki67", "PD1", "SMA", "Vim"]
markers = None

In [None]:
convert_to_anndata = ConvertToAnnData(cell_table_path, markers=markers)

In [None]:
anndata_save_dir = os.path.join(base_dir, "anndata")

In [None]:
fov_adata_paths = convert_to_anndata.convert_to_adata(save_dir=anndata_save_dir)

In [None]:
fovs_ac = load_anndatas(anndata_dir=anndata_save_dir, join_obsm="inner")