# ASAS-SN Data

## Preparation:

   - Create a folder to hold the data (`data/asaasn`).
   - Download the V-band data (https://drive.google.com/drive/folders/1IAtztpddDeh5XOiuxmLWdLUaT_quXkug).
     Make sure the files `asassn_catalog_full.csv` and `asassnvarlc_vband_complete.zip` are in the data directory.
     (Do not unzip the light curve file!)
   - Download the g-band data (https://drive.google.com/drive/folders/1gxcIokRsw1eyPmbPZ0-C8blfRGItSOAu).
     Make sure you have the files `asassn_variables_x.csv` and `g_band_lcs-001.tar.gz`. Decompress this file, but do not untar it:

```bash
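# The file is gzip-compressed, so use gunzip (unzip only handles .zip archives).
# This leaves g_band_lcs-001.tar in place of the .tar.gz file.
gunzip g_band_lcs-001.tar.gz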
```
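Before loading anything, it can help to confirm that the downloads ended up where the loader will look. The sketch below is not part of the repository: `find_problems` is a hypothetical helper, and it assumes that all four files sit directly in one folder and that decompressing leaves a file named `g_band_lcs-001.tar`.

```python
from pathlib import Path
import tarfile
import zipfile

# Assumption: all four files live directly in the same data folder, and the
# g-band archive has already been decompressed to a plain .tar file.
EXPECTED_FILES = (
    "asassn_catalog_full.csv",
    "asassnvarlc_vband_complete.zip",
    "asassn_variables_x.csv",
    "g_band_lcs-001.tar",
)


def find_problems(data_dir):
    """Return a list of problems found in data_dir; an empty list means all good."""
    data_dir = Path(data_dir)
    problems = []
    for name in EXPECTED_FILES:
        path = data_dir / name
        if not path.is_file():
            problems.append(f"missing: {name}")
        elif name.endswith(".zip") and not zipfile.is_zipfile(str(path)):
            problems.append(f"not a zip archive: {name}")
        elif name.endswith(".tar") and not tarfile.is_tarfile(str(path)):
            problems.append(f"not a tar archive: {name}")
    return problems


if __name__ == "__main__":
    # Point this at wherever you created the data folder.
    for problem in find_problems("data/asaasn"):
        print(problem)
```

If nothing is printed, every expected file was found and the two archives have the expected container format.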

```python
from pathlib import Path

import numpy as np
from torch.utils.data import DataLoader

from core.dataset_multimodal import collate_fn, ASASSNVarStarDataset
```

```python
datapath = Path("data/data/asaasn")
ds = ASASSNVarStarDataset(
    datapath, 10, verbose=True, only_periodic=True, merge_type="inner",
    recalc_period=True, prime=True, use_bands=["v", "g"],
    only_sources_with_spectra=True,
)
```

```python
# What is the structure of the sample we just built?
for k, v in ds[0].items():
    s = v[0]
    if isinstance(s, (np.int64, int, float)):
        rez = (1,)
    elif isinstance(s, np.ndarray):
        rez = s.shape
    elif isinstance(s, list):
        if len(s) == 0:
            rez = "None"
        elif isinstance(s[0], tuple):
            rez = ", ".join(str(x.shape) for x in s[0])
        elif isinstance(s[0], (str, float, int)):
            rez = f"[{len(s)}]"
        else:
            rez = ", ".join(str(x.shape) for x in s)
    else:
        rez = "?"
    print(k, rez)
```

```python
train_dataloader = DataLoader(ds, batch_size=2, shuffle=True, collate_fn=collate_fn,
                              num_workers=4, pin_memory=True, multiprocessing_context="fork")
```

```python
batch = next(iter(train_dataloader))
```
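To see what came back, a small recursive summary can be handy. The sketch below does not assume anything about what `collate_fn` returns: `describe` is a hypothetical helper, not part of `core.dataset_multimodal`, and it only relies on arrays and tensors exposing a `.shape` attribute. The toy batch and its keys are made up for illustration.

```python
import numpy as np


def describe(obj, max_items=3):
    """Return a short string summarising the shapes and lengths inside obj."""
    if hasattr(obj, "shape"):
        # NumPy arrays and torch tensors both expose .shape
        return f"{type(obj).__name__}{tuple(obj.shape)}"
    if isinstance(obj, dict):
        inner = ", ".join(f"{k}: {describe(v, max_items)}" for k, v in obj.items())
        return "{" + inner + "}"
    if isinstance(obj, (list, tuple)):
        # Only describe the first few items so long lists stay readable.
        head = ", ".join(describe(x, max_items) for x in obj[:max_items])
        more = ", ..." if len(obj) > max_items else ""
        return f"{type(obj).__name__}[{len(obj)}]({head}{more})"
    return type(obj).__name__


if __name__ == "__main__":
    # Illustrative stand-in for a real batch; the keys here are invented.
    toy_batch = {"curves": [np.zeros((5, 3)), np.zeros((7, 3))], "names": ["a", "b"]}
    print(describe(toy_batch))
```

In the notebook, `print(describe(batch))` would give the same kind of summary for the real batch.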