# Tutorial: Variable class

This tutorial shows how to use the *Variable* class and what it is for.

## Variables from the csv file

We load variables from the file example_orbit.csv to show how to handle time and position variables.

In [2]:
from datetime import datetime, timezone

from astropy import units as u

import el_paso as ep

start_time = datetime(2019, 7, 30, 17, tzinfo=timezone.utc)
end_time   = datetime(2019, 8, 3, 5, tzinfo=timezone.utc)

extraction_infos = [
    ep.ExtractionInfo(
        result_key="Epoch",
        name_or_column="DATETIME",
        unit=u.dimensionless_unscaled,
    ),
    ep.ExtractionInfo(
        result_key="alt",
        name_or_column="alt(km)",
        unit=u.km,
    ),
    ep.ExtractionInfo(
        result_key="lon",
        name_or_column="lon(deg)",
        unit=u.deg,
    ),
    ep.ExtractionInfo(
        result_key="lat",
        name_or_column="lat(deg)",
        unit=u.deg,
    ),
]

variables = ep.extract_variables_from_files(start_time, end_time, "single_file",
                                             data_path=".", file_name_stem="example_orbit.csv",
                                             extraction_infos=extraction_infos)
variables

Extracting variables ...
		Finished in 0.004 seconds


{'Epoch': Variable holding (1000,) data points with metadata: VariableMetadata(unit=Unit(dimensionless), original_cadence_seconds=0, source_files=['example_orbit.csv'], description='', processing_notes='', standard_name=''),
 'alt': Variable holding (1000,) data points with metadata: VariableMetadata(unit=Unit("km"), original_cadence_seconds=0, source_files=['example_orbit.csv'], description='', processing_notes='', standard_name=''),
 'lon': Variable holding (1000,) data points with metadata: VariableMetadata(unit=Unit("deg"), original_cadence_seconds=0, source_files=['example_orbit.csv'], description='', processing_notes='', standard_name=''),
 'lat': Variable holding (1000,) data points with metadata: VariableMetadata(unit=Unit("deg"), original_cadence_seconds=0, source_files=['example_orbit.csv'], description='', processing_notes='', standard_name='')}

Let's look at the Epoch variable first. In the end, we want the time variable in units of posixtime (also called a Unix timestamp: seconds since 1970-01-01 00:00:00 UTC).
In this case, the Epoch variable holds strings, so the first step is to convert them into datetimes. A dedicated function is available for this:

In [3]:
datetimes = ep.processing.convert_string_to_datetime(variables["Epoch"])

Next, we convert the datetimes into timestamps and store them in the Epoch variable. The data of a variable should always be changed through the set_data method, which also lets you specify a new unit (u.posixtime in this case). If the unit does not change, pass "same" as the second argument.

In [4]:
import numpy as np

posix_times = [t.timestamp() for t in datetimes]
variables["Epoch"].set_data(np.asarray(posix_times), u.posixtime)
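
As a standalone illustration of the two steps above, the following sketch uses only the standard library. The sample strings and their ISO 8601 format are assumptions made for this example; the actual format of the DATETIME column may differ. The strings are marked as UTC explicitly, because `timestamp()` interprets a datetime without timezone information in the local time zone of the machine.

```python
from datetime import datetime, timezone

# Hypothetical sample strings; the real DATETIME column may use another format.
time_strings = ["2019-07-30T17:00:00", "2019-07-30T17:01:00"]

# Parse the strings and mark them explicitly as UTC.
parsed = [datetime.fromisoformat(s).replace(tzinfo=timezone.utc) for s in time_strings]

# Seconds since 1970-01-01T00:00:00 UTC.
posix_times = [t.timestamp() for t in parsed]
print(posix_times)
```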

Next, we transform the GDZ coordinates (geodetic: altitude, latitude, longitude) to GEO (geographic Cartesian), which is the standard coordinate system used in *EL-PASO*, and store the result in a new variable. The data of the existing variables is retrieved with get_data(), which should always be used for this purpose.

In [5]:
from IRBEM import Coords

irbem_lib_path = "../IRBEM/libirbem.so"

xGDZ_arr = np.stack((variables["alt"].get_data(), variables["lat"].get_data(), variables["lon"].get_data())).T

model_coord = Coords(path=irbem_lib_path)

# Transform the GDZ positions (alt, lat, lon) to GEO at the given times
xGEO = model_coord.transform(datetimes, xGDZ_arr, ep.IRBEM_SYSAXIS_GDZ, ep.IRBEM_SYSAXIS_GEO)
xGEO_var = ep.Variable(original_unit=u.RE, data=xGEO)
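
To give an idea of what such a transformation does, here is a minimal sketch of the textbook geodetic-to-Cartesian formula on the WGS84 ellipsoid. It is not IRBEM's implementation, which may use a different reference ellipsoid, and the function name `geodetic_to_cartesian` is made up for this example. The result is in km, so it would still have to be divided by an Earth radius to be comparable with a variable declared in Earth radii.

```python
import math

# WGS84 ellipsoid constants (km).
WGS84_A = 6378.137                   # equatorial radius
WGS84_F = 1 / 298.257223563          # flattening
WGS84_E2 = WGS84_F * (2 - WGS84_F)   # first eccentricity squared


def geodetic_to_cartesian(alt_km, lat_deg, lon_deg):
    """Convert geodetic altitude/latitude/longitude to Earth-fixed x, y, z in km."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # Prime-vertical radius of curvature at this latitude.
    n = WGS84_A / math.sqrt(1 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + alt_km) * math.cos(lat) * math.cos(lon)
    y = (n + alt_km) * math.cos(lat) * math.sin(lon)
    z = (n * (1 - WGS84_E2) + alt_km) * math.sin(lat)
    return x, y, z


# A point on the equator at the prime meridian, at zero altitude.
print(geodetic_to_cartesian(0.0, 0.0, 0.0))
```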

## Variables from the cdf file

In [6]:
extraction_infos = [
    ep.ExtractionInfo(
        result_key="Epoch",
        name_or_column="Epoch_Ele",
        unit=u.tt2000,
    ),
    ep.ExtractionInfo(
        result_key="Energy_FEDU",
        name_or_column="HOPE_ENERGY_Ele",
        unit=u.eV,
    ),
    ep.ExtractionInfo(
        result_key="FEDU",
        name_or_column="FEDU",
        unit=(u.cm**2 * u.s * u.sr * u.keV) ** (-1),
    ),
]

start_time = datetime(2017, 7, 30, tzinfo=timezone.utc)
end_time = datetime(2017, 8, 1, 23, 59, 59, tzinfo=timezone.utc)

file_name_stem = "rbspa_rel04_ect-hope-pa-l3_YYYYMMDD_.{6}.cdf"

ep.download(start_time, end_time,
             save_path=".",
             download_url="https://spdf.gsfc.nasa.gov/pub/data/rbsp/rbspa/l3/ect/hope/pitchangle/rel04/YYYY/",
             file_name_stem=file_name_stem,
             file_cadence="daily",
             method="request",
             skip_existing=True)

variables = ep.extract_variables_from_files(start_time, end_time, "daily",
                                             data_path=".", file_name_stem=file_name_stem,
                                             extraction_infos=extraction_infos)


File already exists, skipping download: rbspa_rel04_ect-hope-pa-l3_20170730_v7.3.0.cdf
File already exists, skipping download: rbspa_rel04_ect-hope-pa-l3_20170731_v7.4.0.cdf
File already exists, skipping download: rbspa_rel04_ect-hope-pa-l3_20170801_v7.3.0.cdf
Extracting variables ...
Concatenating data for Epoch_Ele ...
Concatenating data for HOPE_ENERGY_Ele ...
Concatenating data for FEDU ...
Concatenating data for Epoch_Ele ...
Concatenating data for HOPE_ENERGY_Ele ...
Concatenating data for FEDU ...
		Finished in 0.191 seconds


Handling the epoch variable is sometimes troublesome, as different formats are in use. In cdf files, the epoch is usually stored in the *tt2000* format, which counts nanoseconds since J2000 (2000-01-01 12:00:00 Terrestrial Time). To make the conversion between tt2000 and posixtime easier, tt2000 is also defined as a custom astropy unit. The conversion from one astropy unit to another is handled by the *convert_to_unit* method.

If needed, you can get datetimes from the posixtimes afterwards.

In [7]:
variables["Epoch"].convert_to_unit(u.posixtime)
print([datetime.fromtimestamp(timestamp, timezone.utc) for timestamp in variables["Epoch"].get_data()])

[datetime.datetime(2017, 7, 30, 0, 0, 11, 507000, tzinfo=datetime.timezone.utc), datetime.datetime(2017, 7, 30, 0, 0, 34, 203000, tzinfo=datetime.timezone.utc), datetime.datetime(2017, 7, 30, 0, 0, 56, 899000, tzinfo=datetime.timezone.utc), datetime.datetime(2017, 7, 30, 0, 1, 19, 596000, tzinfo=datetime.timezone.utc), datetime.datetime(2017, 7, 30, 0, 1, 30, 945000, tzinfo=datetime.timezone.utc), datetime.datetime(2017, 7, 30, 0, 1, 42, 293000, tzinfo=datetime.timezone.utc), datetime.datetime(2017, 7, 30, 0, 2, 4, 990000, tzinfo=datetime.timezone.utc), datetime.datetime(2017, 7, 30, 0, 2, 27, 685000, tzinfo=datetime.timezone.utc), datetime.datetime(2017, 7, 30, 0, 2, 39, 34000, tzinfo=datetime.timezone.utc), datetime.datetime(2017, 7, 30, 0, 2, 50, 382000, tzinfo=datetime.timezone.utc), datetime.datetime(2017, 7, 30, 0, 3, 13, 79000, tzinfo=datetime.timezone.utc), datetime.datetime(2017, 7, 30, 0, 3, 35, 776000, tzinfo=datetime.timezone.utc), datetime.datetime(2017, 7, 30, 0, 3, 58, 4
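
The relation between the two time scales can be sketched with the standard library alone. This is not how *EL-PASO* implements the conversion; the function name `tt2000_to_posix` and the constants are introduced here for illustration. The sketch assumes a fixed offset of 69.184 s between Terrestrial Time and UTC (32.184 s plus 37 leap seconds), which holds for dates from 2017-01-01 onward as long as no further leap second is introduced. Earlier dates need a leap-second table.

```python
from datetime import datetime, timezone

J2000_AS_POSIX = 946_728_000       # 2000-01-01T12:00:00 read as a POSIX count
TT_MINUS_UTC_NS = 69_184_000_000   # 32.184 s + 37 leap seconds, in nanoseconds


def tt2000_to_posix(tt2000_ns):
    """Convert an integer TT2000 value (ns) to POSIX seconds (dates from 2017 on)."""
    # Integer arithmetic keeps the full nanosecond precision until the last step.
    seconds, nanoseconds = divmod(tt2000_ns - TT_MINUS_UTC_NS, 1_000_000_000)
    return J2000_AS_POSIX + seconds + nanoseconds / 1e9


# TT2000 value corresponding to 2017-07-30T00:00:00 UTC under the assumption above.
posix = tt2000_to_posix(554_644_869_184_000_000)
print(datetime.fromtimestamp(posix, timezone.utc))
```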

Data is retrieved from a Variable with the *get_data()* method. You can optionally specify the unit in which the data should be returned.

In [None]:
print("Energies in eV:", variables["Energy_FEDU"].get_data()[0,0:10])
print("Energies in MeV using get_data:", variables["Energy_FEDU"].get_data("MeV")[0,0:10])

variables["Energy_FEDU"].convert_to_unit("MeV")
print("Energies in MeV after converting:", variables["Energy_FEDU"].get_data()[0,0:10])


Energies in eV: [14.98455  16.81365  18.8538   21.17535  23.707949 26.592299 29.828398
 33.486603 37.566902 42.13965 ]
Energies in MeV using get_data: [1.4984550e-05 1.6813650e-05 1.8853800e-05 2.1175350e-05 2.3707949e-05
 2.6592299e-05 2.9828398e-05 3.3486602e-05 3.7566901e-05 4.2139647e-05]
Energies in MeV after converting: [1.4984550e-05 1.6813650e-05 1.8853800e-05 2.1175350e-05 2.3707949e-05
 2.6592299e-05 2.9828398e-05 3.3486602e-05 3.7566901e-05 4.2139647e-05]
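
The difference between the two calls is that get_data with a unit returns converted values and leaves the variable untouched, while convert_to_unit changes the stored data. The toy class below mimics this behaviour for three energy units. It is a hypothetical stand-in written for this tutorial, not the *EL-PASO* implementation, and `ToyVariable` and `TO_EV` are made-up names.

```python
# Conversion factors to eV for the units used in this toy example.
TO_EV = {"eV": 1.0, "keV": 1e3, "MeV": 1e6}


class ToyVariable:
    """Hypothetical stand-in for a unit-aware container (not the el_paso class)."""

    def __init__(self, data, unit):
        self._data = list(data)
        self.unit = unit

    def get_data(self, unit=None):
        """Return a copy of the data, optionally expressed in another unit."""
        if unit is None:
            return list(self._data)
        factor = TO_EV[self.unit] / TO_EV[unit]
        return [value * factor for value in self._data]

    def convert_to_unit(self, unit):
        """Convert the stored data in place and remember the new unit."""
        self._data = self.get_data(unit)
        self.unit = unit


energy = ToyVariable([1500.0, 2500.0], "eV")
print(energy.get_data("keV"), energy.unit)  # the stored unit is still eV
energy.convert_to_unit("keV")
print(energy.get_data(), energy.unit)       # the stored data is now in keV
```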


To change the data of a variable, use the *set_data()* method. So that units are never forgotten, you always have to specify the unit of the new data. If you are sure that the unit did not change, you can pass "same" instead.

In [10]:
old_data = variables["Energy_FEDU"].get_data()

# just multiply by 2 without changing units
variables["Energy_FEDU"].set_data(old_data*2, "same")
print(variables["Energy_FEDU"].get_data()[0,0:10])

# doing the conversion to keV manually
variables["Energy_FEDU"].set_data(old_data*1e3, u.keV)
print(variables["Energy_FEDU"].get_data()[0,0:10])



[0.0299691  0.0336273  0.0377076  0.0423507  0.0474159  0.0531846
 0.0596568  0.0669732  0.0751338  0.08427929]
[14.9845495 16.81365   18.8538    21.17535   23.707949  26.592299
 29.828398  33.486603  37.566902  42.139645 ]
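
The idea behind the mandatory unit argument can be sketched in a few lines. The class below is a hypothetical illustration of the "same" convention, not *EL-PASO* code, and `UnitTaggedData` is a made-up name. Because the unit has to be passed on every call, data and unit cannot drift apart silently.

```python
class UnitTaggedData:
    """Hypothetical container illustrating the "same" convention (not el_paso code)."""

    def __init__(self, data, unit):
        self.data = list(data)
        self.unit = unit

    def set_data(self, data, unit):
        """Replace the data; the unit argument is mandatory on purpose."""
        self.data = list(data)
        if unit != "same":
            self.unit = unit


energy = UnitTaggedData([0.5, 1.0], "MeV")
energy.set_data([v * 2 for v in energy.data], "same")   # rescaled, unit unchanged
energy.set_data([v * 1e3 for v in energy.data], "keV")  # manual MeV -> keV conversion
print(energy.data, energy.unit)
```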
