# Tutorial: Saving strategies

This tutorial will teach you how to save your processed variables using different *SavingStrategies*.

We will use the variables loaded from the CDF files to show how to save them in different formats. We also extract additional variables so that there is more to save.

In [None]:
import logging
import sys
from datetime import datetime, timezone

from astropy import units as u

import el_paso as ep

logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)

extraction_infos = [
    ep.ExtractionInfo(
        result_key="Epoch",
        name_or_column="Epoch_Ele",
        unit=ep.units.cdf_epoch,
    ),
    ep.ExtractionInfo(
        result_key="FEDU",
        name_or_column="FEDU",
        unit=(u.cm**2 * u.s * u.sr * u.keV) ** (-1),
    ),
    ep.ExtractionInfo(
        result_key="xGEO",
        name_or_column="Position_Ele",
        unit=u.km,
    ),
]

start_time = datetime(2017, 7, 30, tzinfo=timezone.utc)
end_time = datetime(2017, 8, 1, 23, 59, 59, tzinfo=timezone.utc)

file_name_stem = "rbspa_rel04_ect-hope-pa-l3_YYYYMMDD_.{6}.cdf"

ep.download(
    start_time,
    end_time,
    save_path=".",
    download_url="https://spdf.gsfc.nasa.gov/pub/data/rbsp/rbspa/l3/ect/hope/pitchangle/rel04/YYYY/",
    file_name_stem=file_name_stem,
    file_cadence="daily",
    method="request",
    skip_existing=True,
)

variables = ep.extract_variables_from_files(
    start_time, end_time, "daily", data_path=".", file_name_stem=file_name_stem, extraction_infos=extraction_infos
)
variables

## Single file strategy

First, we want to save the variables using the *SingleFileStrategy*. This is the simplest way to save variables, as everything is simply put into one file. The file format is chosen based on the file extension. Possible formats are ".mat", ".pickle", and ".h5".

The units of the variables are not changed when using the SingleFileStrategy. 
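
To picture how a file extension can select the output format, here is a minimal sketch using only the standard library. The mapping `SUPPORTED_FORMATS` and the helper `describe_format` are hypothetical names introduced for illustration. They are not part of el_paso, and the library's internal dispatch may work differently.

```python
from pathlib import Path

# Hypothetical mapping for illustration; el_paso's internal dispatch may differ.
SUPPORTED_FORMATS = {
    ".mat": "MATLAB file",
    ".pickle": "Python pickle",
    ".h5": "HDF5 file",
}


def describe_format(file_path: str) -> str:
    """Return a human-readable format name based on the file extension."""
    suffix = Path(file_path).suffix
    if suffix not in SUPPORTED_FORMATS:
        msg = f"Unsupported file extension: {suffix!r}"
        raise ValueError(msg)
    return SUPPORTED_FORMATS[suffix]


print(describe_format("rbsp_hope_example.pickle"))  # prints "Python pickle"
```

Rejecting unknown extensions early, as this sketch does, avoids writing a file in a format nobody asked for.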

In [None]:
saving_strategy = ep.saving_strategies.SingleFileStrategy("rbsp_hope_example.pickle")
ep.save(
    variables, saving_strategy=saving_strategy, start_time=start_time, end_time=end_time, time_var=variables["Epoch"]
)

Let's inspect what was saved. The variables are converted into plain numpy arrays before saving. You can see that a *metadata* variable has been saved as well. We will take a closer look at metadata in a different tutorial.

In [None]:
import pickle

with open("rbsp_hope_example.pickle", "rb") as f:
    loaded_data = pickle.load(f)
    print("Keys: ", loaded_data.keys())
    print("Metadata: ", loaded_data["metadata"])
    print("xGEO[0,:]: ", loaded_data["xGEO"][0, :] * u.Unit(loaded_data["metadata"]["xGEO"]["unit"]))
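
The layout inspected above, with plain values stored next to a *metadata* entry that carries the unit as a string, can be reproduced in a small stand-alone sketch. The position values below are made up, and nested lists stand in for numpy arrays so that the example needs only the standard library. The exact contents of the metadata written by el_paso may be richer than shown here.

```python
import pickle

# Assumed layout: plain value arrays plus a "metadata" entry holding unit strings.
data_to_save = {
    "xGEO": [[7000.0, 0.0, 0.0], [0.0, 7000.0, 0.0]],  # made-up positions
    "metadata": {"xGEO": {"unit": "km"}},
}

# Serialize to bytes and back, standing in for writing and reading a .pickle file.
restored = pickle.loads(pickle.dumps(data_to_save))

# The unit travels with the data as a string and can be re-attached after loading.
unit = restored["metadata"]["xGEO"]["unit"]
first_row = restored["xGEO"][0]
print(f"xGEO[0]: {first_row} {unit}")
```

Storing the unit as a string keeps the file readable without astropy, while still allowing the unit to be rebuilt with `u.Unit(...)` as in the cell above.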

## Data-org strategy

All data at GFZ is stored according to a standard that we call DataOrg (Data Organization). The standard includes monthly files, where variables are split into separate files to reduce loading times. All files are saved as .mat files. When using the *DataOrgStrategy*, the variables are automatically sorted into the corresponding files based on their keys in the variables dictionary. Additionally, the variables are converted into the units prescribed by the standard.

To use it, we have to store the variables in a dictionary with certain keys, as specified by the standard.

In [None]:
variables_to_save = {
    "time": variables["Epoch"],
    "Flux": variables["FEDU"],
    "xGEO": variables["xGEO"],
}

saving_strategy = ep.saving_strategies.DataOrgStrategy(
    ".", mission="RBSP", satellite="rbspa", instrument="hope", kext="T89"
)
ep.save(variables_to_save, saving_strategy, start_time, end_time, time_var=variables["Epoch"])
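
Because the standard works with monthly files, it helps to see which calendar months a time range touches. The following sketch is an illustration only: `months_in_range` is a hypothetical helper, not an el_paso function, and it assumes that files are split on calendar-month boundaries.

```python
from datetime import datetime, timezone


def months_in_range(start: datetime, end: datetime) -> list[tuple[int, int]]:
    """List every (year, month) touched by the interval [start, end]."""
    months = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        months.append((year, month))
        # Advance by one month, rolling over into the next year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return months


start_time = datetime(2017, 7, 30, tzinfo=timezone.utc)
end_time = datetime(2017, 8, 1, 23, 59, 59, tzinfo=timezone.utc)
print(months_in_range(start_time, end_time))  # prints [(2017, 7), (2017, 8)]
```

The time range of this tutorial spans the end of July and the start of August 2017, so under the assumption above one would expect files for both months.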

Let's inspect again what has been saved. Flux files and xGEO files were written. Note that the metadata struct is saved again and that the units of xGEO were automatically converted from km to R_E.

In [None]:
from pathlib import Path
from scipy.io import loadmat

saved_files = sorted(Path("RBSP/rbspa/Processed_Mat_Files").glob("*.mat"))
print("Number of saved files:", len(saved_files))
print("Saved files:", saved_files)

# Load one of the saved files; index 1 is expected to be an xGEO file here.
xGEO_data = loadmat(saved_files[1], simplify_cells=True)

print("xGEO data keys:", xGEO_data.keys())
print("xGEO data metadata:", xGEO_data["metadata"]["xGEO"].keys())
print("xGEO[0,:]:", xGEO_data["xGEO"][0, :] * u.Unit(xGEO_data["metadata"]["xGEO"]["unit"]))
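
The conversion from km to R_E mentioned above is a division by the Earth radius. The sketch below shows the arithmetic with plain floats. The value `EARTH_RADIUS_KM = 6371.0` (a commonly used mean radius) is an assumption made for illustration; the DataOrg standard and el_paso may use a different value, such as the equatorial radius.

```python
# Assumption: a mean Earth radius of 6371 km; the standard may use another value.
EARTH_RADIUS_KM = 6371.0


def km_to_earth_radii(position_km: list[float]) -> list[float]:
    """Convert a position vector from kilometres to Earth radii."""
    return [component / EARTH_RADIUS_KM for component in position_km]


# Made-up position chosen so the result is easy to check by hand.
print(km_to_earth_radii([6371.0, 12742.0, 0.0]))  # prints [1.0, 2.0, 0.0]
```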