# Archive MERFISH Experiment

## Goal
- Prepare a tar.gz file for a single MERFISH experiment
- Compress active CSV files
- Save transcripts in HDF5 format
- Save TIFF images in Zarr format with appropriate chunking

## Manual Preparation Steps Before Starting
### Put files in place
1. Create a directory named after the MERFISH experiment
2. In this directory, create two sub-directories:
    - raw: contains the exact content of the raw dir produced by the machine
    - output: contains region dir produced by the default analysis pipeline
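
The two steps above can be scripted. The following is a minimal sketch, not part of `merfishing`; the helper name `create_experiment_layout` and its parameters are hypothetical. It only creates the empty `raw/` and `output/` sub-directories, and copying the machine and pipeline output into them is still a manual step.

```python
from pathlib import Path


def create_experiment_layout(parent_dir, experiment_name):
    """Create <parent_dir>/<experiment_name>/ with empty raw/ and output/ sub-directories.

    Hypothetical helper for illustration; it does not copy any data.
    """
    experiment_dir = Path(parent_dir) / experiment_name
    for sub_dir in ("raw", "output"):
        # exist_ok=True makes the call safe to repeat on an existing layout
        (experiment_dir / sub_dir).mkdir(parents=True, exist_ok=True)
    return experiment_dir
```

For example, `create_experiment_layout(".", "202205231554_MouseSagittalM3S1_VMSC01101")` would create the experiment directory in the current working directory.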

### Example
A real example of the directory layout before any `merfishing` processing:
```
# Experiment dir
202205231554_MouseSagittalM3S1_VMSC01101/
├── output/
│   ├── region_0/
│   │   ├── 202205231554_MouseSagittalM3S1_VMSC01101_region_0.vzg
│   │   ├── cell_boundaries/  # contains HDF files for cell boundaries
│   │   ├── cell_by_gene.csv
│   │   ├── cell_metadata.csv
│   │   ├── detected_transcripts.csv
│   │   ├── images/  # contains all the TIFF files for DAPI, PolyT, and smFISH, if any
│   │   └── summary.png
│   └── region_1/
│       └── ...  # your experiment may contain multiple regions if you circled more than one
└── raw/
    ├── data/  # contains the very raw DAX files
    ├── low_resolution/
    ├── seg_preview/
    ├── settings/
    ├── analysis.json
    ├── codebook_0_MouseGene500V1_VA00117.csv  # the codebook used for this experiment
    ├── dataorganization.csv
    ├── EXPERIMENT_FINISHED
    ├── experiment.json
    ├── RAW_DATA_SYNC_FINISHED
    └── settings  # contains the experiment settings
```

```{note}
The `analysis` dir contains intermediate files created by the MERLin pipeline from the raw data; we do not archive it.
```
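
Because archiving modifies the experiment directory, it can be worth confirming that the layout matches the example above before starting. The sketch below is an illustration only: `check_experiment_layout` is a hypothetical helper, not a `merfishing` function, and it checks just the `raw/`, `output/`, and `region_*` directories shown above, not the individual files inside them.

```python
from pathlib import Path


def check_experiment_layout(experiment_dir):
    """Return a list of layout problems; an empty list means nothing was flagged.

    Hypothetical helper: it only looks for raw/, output/, and at least one
    output/region_* directory.
    """
    experiment_dir = Path(experiment_dir)
    problems = []
    for sub_dir in ("raw", "output"):
        if not (experiment_dir / sub_dir).is_dir():
            problems.append(f"missing sub-directory: {sub_dir}/")
    output_dir = experiment_dir / "output"
    if output_dir.is_dir():
        has_region = any(p.is_dir() for p in output_dir.glob("region_*"))
        if not has_region:
            problems.append("output/ contains no region_* directory")
    return problems
```

An empty return value only means these directories exist; it does not guarantee that the archive step will succeed.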

## Archive Process

In [None]:
from merfishing import ArchiveMerfishExperiment

In [None]:
# this is a small test dataset
experiment_dir = '/home/qzeng/project/merfish/analysis/202210271225_MouseHanqingB22_VMSC0110/'

```{caution}
The code below starts the archive process immediately, and data in `experiment_dir` will be modified.
```

In [None]:
# This step takes ~16 hours to run on a real 500-gene, 1 cm² experiment
ArchiveMerfishExperiment(experiment_dir)

## After Archive Process Finished

The archive code above generates a tar.gz file at `{experiment_dir}/{experiment_name}.tar.gz`. Archive this file for long-term data storage.

```{important}
Once the `tar.gz` file is successfully generated, the raw data will be deleted automatically to save space. Make sure you archive the `tar.gz` file properly.
```
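
Since the raw data no longer exists once the archive is written, one way to guard the `tar.gz` file is to confirm it can be read and to record a checksum before moving it. This is a minimal sketch using only the Python standard library; `verify_archive` is a hypothetical helper, not part of `merfishing`, and a readable member list is a basic sanity check rather than a full integrity guarantee.

```python
import hashlib
import tarfile
from pathlib import Path


def verify_archive(tar_path, chunk_size=1024 * 1024):
    """Return (number of members, SHA-256 hex digest) for a tar.gz file.

    Hypothetical helper for illustration. Opening or walking a file that is
    not a readable gzip-compressed tar archive raises an exception.
    """
    tar_path = Path(tar_path)
    # Walking the member list reads through the compressed tar data
    with tarfile.open(tar_path, "r:gz") as tar:
        n_members = sum(1 for _ in tar)
    # Hash the file in chunks so large archives do not need to fit in memory
    digest = hashlib.sha256()
    with open(tar_path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return n_members, digest.hexdigest()
```

Recording the digest next to the archive and recomputing it after the copy to long-term storage shows whether the file arrived unchanged.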