# Tutorial 1 (Basic): How to get the whole cell dataset, whole FOV dataset, or whole feature dataset

The quilt data package (hipsc_single_cell_image_dataset) contains 216062 single cells segmented from 18186 fields of view (FOVs), with selected features calculated for each cell. In this tutorial, we will show:
* (1) how to get everything (ALERT! >13 TB), 
* (2) how to get all the single cell data (raw and segmentation),
* (3) how to get all FOV data (raw and segmentation),
* (4) how to get all the feature data

In [None]:
######### FOR Google Colab users only ########
### install necessary packages if in Colab ###
##############################################

############################################################
### make sure to restart runtime after running this step ###
############################################################
import subprocess
import sys


def run_subprocess_command(cmd):
    """Run a shell-style command string and print its stdout line by line."""
    process = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE)
    for line in process.stdout:
        print(line.decode().strip())
    # wait for the command to finish before starting the next one
    process.wait()


# True only when this notebook is running inside Google Colab
IN_COLAB = "google.colab" in sys.modules
colab_requirements = [
    "pip install urllib3==1.25.4",
    "pip install PyYAML==5.1",
    "pip install quilt3",
]
if IN_COLAB:
    for cmd in colab_requirements:
        run_subprocess_command(cmd)

In [None]:
import pandas as pd
import quilt3
from pathlib import Path

In [None]:
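# connect to quilt and browse the package manifest (no image data is downloaded here)
pkg = quilt3.Package.browse("aics/hipsc_single_cell_image_dataset", registry="s3://allencell")
# calling the metadata entry loads it as a pandas DataFrame
meta_df = pkg["metadata.csv"]()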

In [None]:
# take a quick look at the available columns
print(meta_df.columns)
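
The cell and FOV counts quoted in the introduction can be checked against the metadata table. The snippet below is a minimal sketch on a toy DataFrame: the column names `CellId` and `FOVId` and the helper `summarize_metadata` are assumptions for illustration, so check `meta_df.columns` for the actual names before running it on `meta_df`.

```python
import pandas as pd

# Toy stand-in for meta_df. The column names "CellId" and "FOVId" are
# assumptions; check meta_df.columns for the real ones.
toy_meta = pd.DataFrame(
    {
        "CellId": [1, 2, 3, 4],
        "FOVId": [10, 10, 11, 12],
    }
)


def summarize_metadata(df, fov_column="FOVId"):
    """Return (number of rows, number of distinct FOVs or None if the column is missing)."""
    n_cells = len(df)
    n_fovs = df[fov_column].nunique() if fov_column in df.columns else None
    return n_cells, n_fovs


n_cells, n_fovs = summarize_metadata(toy_meta)
print(f"{n_cells} cells from {n_fovs} FOVs")
```

If each row of `meta_df` describes one cell, `summarize_metadata(meta_df)` should reproduce the numbers stated above.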

## Example 1: Get everything

In [None]:
# Large file alert! The package size is > 13 TB
# change this to your own download directory
save_path = "C:/Projects/allen_cell_data/"
pkg.fetch(save_path)
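
Before starting a download of this size, it may help to estimate how much disk space a package (or a sub-package such as `pkg["crop_raw"]`) needs. The sketch below sums entry sizes from `(logical_key, entry)` pairs. It assumes that `pkg.walk()` yields such pairs and that each entry exposes a `size` attribute in bytes. The helpers `total_size_bytes` and `format_size` are hypothetical names, and the demo uses stand-in entries so that it runs without a network connection.

```python
from types import SimpleNamespace


def total_size_bytes(entries):
    """Sum the `size` attribute (in bytes) over (logical_key, entry) pairs."""
    return sum(entry.size or 0 for _, entry in entries)


def format_size(n_bytes):
    """Format a byte count using binary units, capped at TiB."""
    size = float(n_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if size < 1024 or unit == "TiB":
            return f"{size:.1f} {unit}"
        size /= 1024


# Stand-in entries for demonstration only (not real package content)
fake_entries = [
    ("crop_raw/a.tiff", SimpleNamespace(size=2048)),
    ("crop_seg/a.tiff", SimpleNamespace(size=1024)),
]
print(format_size(total_size_bytes(fake_entries)))

# With the real package (assumption: walk() yields (key, entry) pairs
# and each entry has a `size` attribute):
# print(format_size(total_size_bytes(pkg.walk())))
# print(format_size(total_size_bytes(pkg["crop_raw"].walk())))
```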

## Example 2: Get all single cell data (no FOV images)

In [None]:
# change this to your own download directory
save_path = Path("C:/Projects/allen_cell_data/")

# download single cell raw images (cell membrane dye, dna dye, structure)
pkg["crop_raw"].fetch(save_path / Path("crop_raw"))

# download single cell segmentation images (cell seg, nucleus seg, and structure seg)
pkg["crop_seg"].fetch(save_path / Path("crop_seg"))

# download the meta information
meta_df.to_csv(save_path / "meta_info.csv")

## Example 3: Get all FOV data (raw images and segmentations)

In [None]:
# change this to your own download directory
save_path = Path("C:/Projects/allen_cell_data/")

# download FOV images (cell membrane dye, dna dye, structure, brightfield)
pkg["fov_path"].fetch(save_path / Path("fov_path"))

# download cell and nuclear segmentation of each FOV
pkg["fov_seg_path"].fetch(save_path / Path("fov_seg_path"))

# download structure segmentation of each FOV
pkg["struct_seg_path"].fetch(save_path / Path("struct_seg_path"))

# download the meta information
meta_df.to_csv(save_path / "meta_info.csv")

## Example 4: Get all feature data (without downloading any images)

In [None]:
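# change this to your own download directory
save_path = Path("C:/Projects/allen_cell_data/")
# nothing is fetched in this example, so create the directory if it does not exist yet
save_path.mkdir(parents=True, exist_ok=True)

# save the features
# (extra columns with image file paths are included, but can be ignored when analyzing the features)
meta_df.to_csv(save_path / "features.csv")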
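
If the file path columns get in the way during analysis, they can be dropped by name. The following is a minimal sketch on a toy DataFrame. The helper `drop_path_columns`, the keyword list, and the feature column `cell_volume` are assumptions for illustration; the keywords are modeled on the package keys used in the examples above (`fov_path`, `fov_seg_path`, `struct_seg_path`, `crop_raw`, `crop_seg`), and the real column names should be checked with `meta_df.columns`.

```python
import pandas as pd


def drop_path_columns(df, keywords=("path", "crop_raw", "crop_seg")):
    """Return a copy of df without columns whose names contain any keyword."""
    keep = [col for col in df.columns if not any(k in str(col) for k in keywords)]
    return df[keep].copy()


# Toy stand-in for meta_df; all column names and values here are hypothetical.
toy = pd.DataFrame(
    {
        "CellId": [1, 2],
        "cell_volume": [1500.0, 1720.5],
        "fov_path": ["fov_path/a.tiff", "fov_path/b.tiff"],
        "crop_raw": ["crop_raw/1.tiff", "crop_raw/2.tiff"],
    }
)

features_only = drop_path_columns(toy)
print(list(features_only.columns))
```

Applied to the real table, `drop_path_columns(meta_df)` would return a features-only copy, provided the keyword list matches the actual path column names.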