# Data loading

Here we demonstrate how chromatin tracing data are loaded and stored in memory. Briefly, data are loaded and stored by chromosome: for each chromosome, an `AnnData` object with $n$ rows and $p$ columns is created, where $n$ is the number of traces and $p$ is the number of imaging loci on that chromosome.
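
As a rough illustration of this per-chromosome layout (not the library's internals), the sketch below groups a few made-up spot records by chromosome and arranges a single coordinate axis into a trace-by-locus table, using only the standard library. The records, the `build_matrices` helper, and the use of `None` for loci missing from a trace are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical spot records: (trace_id, chrom, chrom_start, x)
spots = [
    ("t1", "chr1", 0, 1.0),
    ("t1", "chr1", 25000, 1.2),
    ("t2", "chr1", 0, 3.1),
    ("t3", "chr2", 0, 0.5),
    ("t3", "chr2", 25000, 0.7),
]


def build_matrices(records):
    """Return {chrom: (trace_ids, loci, matrix)}, matrix[i][j] = x of trace i at locus j."""
    by_chrom = defaultdict(list)
    for trace_id, chrom, start, x in records:
        by_chrom[chrom].append((trace_id, start, x))

    out = {}
    for chrom, rows in by_chrom.items():
        traces = sorted({r[0] for r in rows})  # n rows
        loci = sorted({r[1] for r in rows})    # p columns
        t_idx = {t: i for i, t in enumerate(traces)}
        l_idx = {l: j for j, l in enumerate(loci)}
        # None marks a locus with no spot in that trace.
        matrix = [[None] * len(loci) for _ in traces]
        for trace_id, start, x in rows:
            matrix[t_idx[trace_id]][l_idx[start]] = x
        out[chrom] = (traces, loci, matrix)
    return out


matrices = build_matrices(spots)
for chrom, (traces, loci, matrix) in matrices.items():
    print(chrom, f"n={len(traces)} traces, p={len(loci)} loci")
```

Each chromosome gets its own table, so chromosomes with different numbers of loci do not have to share a column layout.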

The data used in this notebook are the 25 kb subset from [Takei et al., 2021](https://www.science.org/doi/10.1126/science.abj1966). The formatted data (in FOF-CT core format, the 4DN standard format for chromatin tracing data; read more [here](https://fish-omics-format.readthedocs.io/en/latest/)) can be downloaded from the 4DN data portal with the following IDs: 4DNFIW4S8M6J (biological replicate 1), 4DNFI4LI6NNV (biological replicate 2), and 4DNFIDUJQDNO (biological replicate 3).

In [1]:
import arcfish as sf

## Loading one or multiple CSV files

Download the chromatin tracing data and place the files as follows:
```sh
..
├── data
│   ├── takei_science_2021
│   │   ├── 4DNFIW4S8M6J.csv
│   │   ├── 4DNFI4LI6NNV.csv
│   │   └── 4DNFIDUJQDNO.csv
├── tutorial
│   └── load_storage.ipynb
```
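
Before loading, it can help to confirm that the files are where the relative paths expect them. The sketch below is one way to do that with the standard library; the `missing_files` helper is hypothetical and not part of the package.

```python
from pathlib import Path

# File names taken from the directory layout above.
EXPECTED = ["4DNFIW4S8M6J.csv", "4DNFI4LI6NNV.csv", "4DNFIDUJQDNO.csv"]


def missing_files(data_dir, names=EXPECTED):
    """Return the expected file names that are not present in data_dir."""
    data_dir = Path(data_dir)
    return [name for name in names if not (data_dir / name).is_file()]


# The path is relative to the tutorial directory, as in the layout above.
missing = missing_files("../data/takei_science_2021")
print("missing:", missing or "none")
```
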
Then load the data with `sf.pp.FOF_CT_Loader`. 

In [2]:
loader = sf.pp.FOF_CT_Loader({
    "rep1": "../data/takei_science_2021/4DNFIW4S8M6J.csv",
    "rep2": "../data/takei_science_2021/4DNFI4LI6NNV.csv",
    "rep3": "../data/takei_science_2021/4DNFIDUJQDNO.csv",
}, nm_ratio={"X": 103, "Y": 103, "Z": 250})
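
Assuming `nm_ratio` gives the number of nanometers per coordinate unit along each axis (our reading of the parameter name, not something confirmed here), the conversion would amount to a per-axis multiplication. The sketch below shows that arithmetic on a made-up spot; the `to_nm` helper is hypothetical and is not the loader's implementation.

```python
# Same values as passed to the loader above.
nm_ratio = {"X": 103, "Y": 103, "Z": 250}


def to_nm(spot, ratio):
    """Scale each axis of a spot (dict with X, Y, Z) by its per-axis factor."""
    return {axis: spot[axis] * ratio[axis] for axis in ("X", "Y", "Z")}


# A made-up spot in raw coordinate units.
spot = {"X": 10.0, "Y": 2.0, "Z": 4.0}
print(to_nm(spot, nm_ratio))
```

Under this reading, a separate Z factor lets the axial step size differ from the lateral one.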

In [3]:
loader.info["rep1"]

{'FOF-CT_version': 'v0.1',
 'Table_namespace': '4dn_FOF-CT_core',
 'genome_assembly': 'GRCm38/mm10',
 'XYZ_unit': 'micron',
 'Software_Title': 'dna-seqfish-plus-tissue',
 'Software_Type': 'preprocess+process+decode',
 'Software_Authors': 'Takei, Yodai; Pierson, Nico; Shah, Sheel; White, Jonathan; Cai, Long"',
 'Software_Description': 'dna-seqfish-plus-tissue software was developed for processing the images and barcode calling for the DNA seqFISH+ experiment in tissue sections.',
 'Software_Repository': 'https://github.com/CaiGroup/dna-seqfish-plus-tissue',
 'Software_PreferredCitationID': 'https://www.science.org/doi/10.1126/science.abj1966',
 'lab_name': 'Cai',
 'experimenter_name': 'Yodai Takei',
 'experimenter_contact': 'ytakei@caltech.edu',
 'additional_tables': '4dn_FOF-CT_quality, 4dn_FOF-CT_rna, 4dn_FOF-CT_cell"',
 'columns': ['Spot_ID',
  'Trace_ID',
  'X',
  'Y',
  'Z',
  'Chrom',
  'Chrom_Start',
  'Chrom_End',
  'Cell_ID',
  'Extra_Cell_ROI_ID']}
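
The dictionary above appears to mirror the metadata header of the CSV file. As a minimal sketch of how such a header could be read, the code below assumes that header lines start with `#`, that keys and values are separated by `=` or `:`, and that the column list is wrapped in parentheses. These conventions, the example text, and the `read_header` helper are assumptions for illustration, not the loader's actual code.

```python
import io

# Made-up header in the style assumed above; not copied from a real file.
example = io.StringIO(
    "##FOF-CT_version=v0.1\n"
    "##XYZ_unit=micron\n"
    "#lab_name: Cai\n"
    "##columns=(Spot_ID, Trace_ID, X, Y, Z)\n"
    "1,1,0.1,0.2,0.3\n"
)


def read_header(handle):
    """Collect '#'-prefixed key/value lines until the first data row."""
    info = {}
    for line in handle:
        if not line.startswith("#"):
            break  # first data row: the header is over
        body = line.lstrip("#").strip()
        # Split on whichever separator ('=' or ':') appears first.
        positions = [i for i in (body.find("="), body.find(":")) if i != -1]
        if not positions:
            continue  # comment line without a key/value pair
        cut = min(positions)
        key, value = body[:cut].strip(), body[cut + 1:].strip()
        if key == "columns":
            value = [c.strip() for c in value.strip("()").split(",")]
        info[key] = value
    return info


info = read_header(example)
print(info["columns"])
```

Splitting on the first separator keeps values that themselves contain a colon, such as URLs, intact.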