# How to create Data objects

The SciDataTool Python module has been created to **ease the handling of scientific data** and to considerably simplify plot commands. It unifies the extraction of relevant data (e.g. slices), whether they are stored in the time/space domain or in the frequency domain. The call to Fourier Transform functions is **transparent**, although it can still be parameterized through a dictionary.

This tutorial explains the **structure** of the `Data` classes, then shows **how to create axes and fields objects**.

The following example demonstrates the syntax to **quickly create a 2D data field** (airgap radial flux density) depending on time and angle:

In [1]:
# Add SciDataTool to the Python path
import sys
sys.path.append('../..')

# Import useful packages
from os.path import join
from numpy import pi, squeeze
from pandas import ExcelFile, read_excel

# Import SciDataTool modules
from SciDataTool import Data1D, DataTime

# Import scientific data
from SciDataTool.Tests import DATA_DIR
xls_file = ExcelFile(join(DATA_DIR, "tutorials_data.xlsx"))
# Note: the `squeeze` keyword of read_excel was removed in pandas 2.0; .squeeze("columns") is the documented replacement
time = read_excel(xls_file, sheet_name="time", header=None, nrows=1).squeeze("columns").to_numpy()
angle = read_excel(xls_file, sheet_name="angle", header=None, nrows=1).squeeze("columns").to_numpy()
field = read_excel(xls_file, sheet_name="Br", header=None, nrows=2016).squeeze("columns").to_numpy()

#---------------------------------------------------------------
# Create Data objects
Time = Data1D(name="time", unit="s", values=time)
Angle = Data1D(name="angle", unit="rad", values=angle)
Br = DataTime(
            name="Airgap radial flux density",
            unit="T",
            symbol="B_r",
            axes=[Time, Angle],
            values=field,
        )
#---------------------------------------------------------------

Your `Data` objects have been successfully created. Other features of the `SciDataTool` package are also available:
- reduce storage if an axis is regularly spaced
- reduce storage if the field presents a symmetry along one of its axes
- store a field in the frequency domain
- specify normalizations

These functionalities are described in the following sections.

## 1. Data class structure
The `Data` class is composed of:
- classes describing **axes**: `Data1D`, or `DataLinspace` if the axis is regularly spaced (see [section 2](#How-to-reduce-storage-if-an-axis-is-regularly-spaced))
- classes describing **fields** stored in the time/space domain (`DataTime`) or in the frequency domain (`DataFreq`)

The following UML summarizes this structure:

<div>
<img src="_static/UML_Data_Object.png" width="450"/>
</div>

The attributes in red are **mandatory**, those in gray are **optional**. To correctly fill the mandatory attributes, it is advised to follow these principles:
- `values` is a **numpy array**
- `axes` is a **list** of `Data1D` or `DataLinspace`
- `name` is a **string** giving a short description of the field or the axis
- `symbol` is a **string** giving the symbol of the field in LaTeX format
- `unit` is a **string** among the list: `[dimless, m, rad, °, g, s, min, h, Hz, rpm, degC, A, J, W, N, C, T, G, V, F, H, Ohm, At, Wb, Mx]`, optionally with a prefix `[k, h, da, d, c, m, etc.]`. Compound units are also available (e.g. `mm/s^2`). Using this LaTeX-style formatting is recommended for axis labelling. Other units can be added in [conversions.py](https://github.com/Eomys/SciDataTool/blob/master/Functions/conversions.py).
- for `Data1D` and `DataLinspace`, `name` + `[unit]` can be used to label axes
- for `DataTime` and `DataFreq`, `name` can be used as plot title, and `symbol` + `[unit]` as label
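The labelling conventions in the last two items can be sketched in plain Python. The helper names `axis_label` and `field_label` are hypothetical and are not part of SciDataTool; they only illustrate how `name`, `symbol` and `unit` combine into labels.

```python
def axis_label(name, unit):
    """Hypothetical helper: label of an axis, built as name + [unit]."""
    return f"{name} [{unit}]"


def field_label(symbol, unit):
    """Hypothetical helper: label of a field, built as LaTeX symbol + [unit]."""
    return f"${symbol}$ [{unit}]"


time_label = axis_label("time", "s")   # "time [s]"
br_label = field_label("B_r", "T")     # "$B_r$ [T]"
```

This is why `symbol` is best written in LaTeX format: it can then be rendered directly in a plot label.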

The following sections explain how to use the optional attributes to optimize storage.

## 2. How to reduce storage if an axis is regularly spaced
Axes often have a **regular distribution**, in which case using `DataLinspace` reduces storage.

A `DataLinspace` object has five properties instead of the `values` array: `initial`, `final`, `step` and `number` define the linspace vector (any 3 of these 4 suffice), and `include_endpoint` is a boolean indicating whether the final point should be included (default `False`).

In the following example, the angle vector is defined as a linspace:

In [2]:
from SciDataTool import DataLinspace

#---------------------------------------------------------------
# Create Data objects
Angle = DataLinspace(
            name="angle",
            unit="rad",
            symmetries={},
            initial=0,
            final=2*pi,
            number=2016,
        )
#---------------------------------------------------------------
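To make the "3 out of 4" rule concrete, here is a minimal sketch of how a regularly spaced vector can be rebuilt from `initial` plus two of `final`, `step` and `number`. The function `resolve_linspace` is hypothetical and is not SciDataTool's implementation; it assumes that `include_endpoint` behaves like the `endpoint` argument of `numpy.linspace`.

```python
def resolve_linspace(initial, final=None, step=None, number=None, include_endpoint=False):
    """Hypothetical helper: rebuild a regularly spaced axis from 3 of its 4 parameters."""
    if sum(p is not None for p in (final, step, number)) < 2:
        raise ValueError("provide at least two of: final, step, number")
    if number is None:
        # Count the intervals, rounding to absorb floating-point error
        intervals = round((final - initial) / step)
        number = intervals + 1 if include_endpoint else intervals
    elif step is None:
        # Assumption: without the endpoint, the range is split into `number` intervals
        intervals = number - 1 if include_endpoint else number
        step = (final - initial) / intervals
    return [initial + i * step for i in range(number)]


open_axis = resolve_linspace(0, final=1, number=4)
closed_axis = resolve_linspace(0, final=1, number=5, include_endpoint=True)
from_step = resolve_linspace(0, final=1, step=0.25)
```

Only a handful of scalars are stored regardless of the axis length, which is where the storage reduction comes from.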

## 3. How to reduce storage if a field presents a symmetry/periodicity
If a signal shows a **symmetry** or a **periodicity** along one or several of its axes, it is possible to store only the relevant part of the signal, and to save the information needed to rebuild it in the optional attribute `symmetries`. A repeating signal can be either periodic, $f(t+T)=f(t)$, or antiperiodic, $f(t+T)=-f(t)$. A symmetric signal can likewise be considered as a periodic signal of period $T=N/2$.

`symmetries` is a dictionary containing the **name of the axis** and a **dictionary** describing its symmetry (`{"period": n}` or `{"antiperiod": n}`, with *n* the number of periods in the complete signal). Note that the same `symmetries` dictionary must also be given to the field itself (`DataTime` or `DataFreq`).

In the following example, the time vector and the field are reduced to one third before being stored.

In [3]:
time_reduced = time[0:time.size//3]
field_reduced = field[0:time.size//3,:]

#---------------------------------------------------------------
# Create Data objects
Time_reduced = Data1D(name="time", unit="s", symmetries={"time": {"period": 3}}, values=time_reduced)
Br_reduced = DataTime(
            name="Airgap radial flux density",
            unit="T",
            symbol="B_r",
            axes=[Time_reduced, Angle],
            values=field_reduced,
            symmetries={"time": {"period": 3}},
        )
#---------------------------------------------------------------
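The idea behind `{"period": n}` and `{"antiperiod": n}` can be illustrated with a short, library-independent sketch that rebuilds a complete 1D signal from its stored part. The function `rebuild_signal` is hypothetical and is not SciDataTool's internal code; it only applies the two definitions $f(t+T)=f(t)$ and $f(t+T)=-f(t)$.

```python
def rebuild_signal(stored, symmetry):
    """Hypothetical helper: rebuild a full 1D signal from its stored part."""
    stored = list(stored)
    if "period" in symmetry:
        # Periodic: f(t + T) = f(t), so the stored part is simply repeated
        return stored * symmetry["period"]
    if "antiperiod" in symmetry:
        # Antiperiodic: f(t + T) = -f(t), so the sign flips at every repetition
        full = []
        for k in range(symmetry["antiperiod"]):
            sign = -1 if k % 2 else 1
            full.extend(sign * value for value in stored)
        return full
    return stored


periodic = rebuild_signal([1, 2], {"period": 3})
antiperiodic = rebuild_signal([1, 2], {"antiperiod": 2})
```

With `{"period": 3}`, only one third of the samples has to be stored, as in the example above.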

## 4. How to store a field in the frequency domain
If you prefer to store data in the frequency domain, for example because most post-processing will handle spectra, or because a small number of harmonics is enough to describe the signal and therefore reduces storage, the `DataFreq` class can be used.

The definition is similar to the `DataTime` one, with the difference that the axes now have to be **frequencies** or **wavenumbers** and a `DataFreq` object is created.

Since we want to be able to go back to the time/space domain, each frequency-domain axis name must have a corresponding time/space axis name. For the time being, the existing **correspondences** are:
  + `"time"` &harr; `"freqs"`
  + `"angle"` &harr; `"wavenumber"`

This list is to be expanded, and a way to manually add a correspondence will be implemented soon.

In the following example, a field is stored in a `DataFreq` object.

In [5]:
from SciDataTool import DataFreq

# Import scientific data
# Note: the `squeeze` keyword of read_excel was removed in pandas 2.0; .squeeze("columns") is the documented replacement
freqs = read_excel(xls_file, sheet_name="freqs", header=None, nrows=1).squeeze("columns").to_numpy()
wavenumber = read_excel(xls_file, sheet_name="wavenumber", header=None, nrows=1).squeeze("columns").to_numpy()
field_fft2 = read_excel(xls_file, sheet_name="Br_fft2", header=None, nrows=2016).squeeze("columns").to_numpy()

#---------------------------------------------------------------
# Create Data objects
Freqs = Data1D(name="freqs", unit="Hz", values=freqs)
Wavenumber = Data1D(name="wavenumber", unit="dimless", values=wavenumber)
Br_fft = DataFreq(
            name="Airgap radial flux density",
            unit="T",
            symbol="B_r",
            axes=[Freqs, Wavenumber],
            values=field_fft2,
        )
#---------------------------------------------------------------
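The storage argument for the frequency domain can be illustrated without SciDataTool. The sketch below uses a naive discrete Fourier transform written with the standard library only (the `dft` function and the test signal are illustrative assumptions, not the package's Fourier routines) and shows that a signal of 32 samples made of two harmonics is fully described by two complex coefficients.

```python
import cmath
import math


def dft(samples):
    """Naive discrete Fourier transform, normalized by the number of samples."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples)) / n
        for k in range(n)
    ]


n_samples = 32
# Test signal: harmonic 3 with amplitude 1 and harmonic 5 with amplitude 0.5
signal = [
    math.cos(2 * math.pi * 3 * i / n_samples) + 0.5 * math.cos(2 * math.pi * 5 * i / n_samples)
    for i in range(n_samples)
]
spectrum = dft(signal)

# Keep only the non-negative frequencies whose amplitude is not negligible
harmonics = {k: c for k, c in enumerate(spectrum[: n_samples // 2 + 1]) if abs(c) > 1e-9}
```

Here `harmonics` contains only the orders 3 and 5, so storing the spectrum takes far fewer values than storing the 32 time samples.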

## 5. How to specify normalizations (axes or field)
If you plan to **normalize** your field or its axes during certain post-processing steps (but not all), you might want to store the normalization values. To do so, you can use the `normalizations` attribute, which is a dictionary:
- for a normalization of the **field**, use `"ref"` (e.g. `{"ref": 0.8}`)
- for a normalization of an **axis**, use the name of the normalized axis unit (e.g. `{"elec_order": 60}`). There is no list of predefined normalized axis units; you simply have to make sure to request the same name when you extract data (see [How to extract slices](https://github.com/Eomys/SciDataTool/tree/master/Tutorials/tuto_Slices.ipynb))
- to **convert** to a unit which does not exist in the predefined units, and if there exists a proportionality relation, it is also possible to add it in the `normalizations` dictionary (e.g. `{"nameofmyunit": 154}`)

This dictionary can also be updated later.

The following example shows how to use `normalizations`:

In [6]:
#---------------------------------------------------------------
Br = DataTime(
            name="Airgap radial flux density",
            unit="T",
            symbol="B_r",
            axes=[Time, Angle],
            normalizations={"ref": 0.8, "elec_order": 60},
            values=field,
        )
Br.normalizations["space_order"] = 3
#---------------------------------------------------------------
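As a rough illustration of what such a dictionary is used for, the sketch below applies stored normalization values to plain Python lists. It assumes that normalizing means dividing by the stored value; this is an assumption made for the example, and the `normalize` helper is hypothetical rather than part of SciDataTool.

```python
normalizations = {"ref": 0.8, "elec_order": 60}


def normalize(values, key, normalizations):
    """Hypothetical helper: divide values by the normalization stored under `key`.

    Assumption: normalization is a simple division by the stored value.
    """
    reference = normalizations[key]
    return [value / reference for value in values]


# Field values in T, expressed relative to the reference value of 0.8 T
field_normalized = normalize([0.4, 0.8], "ref", normalizations)

# Frequencies in Hz, expressed as electrical orders
orders = normalize([60.0, 120.0], "elec_order", normalizations)

# The dictionary can be extended later, as with Br.normalizations above
normalizations["space_order"] = 3
```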

Now that the `Data` objects have been created, we can:
- [extract slices](https://github.com/Eomys/SciDataTool/blob/master/SciDataTool/Tutorials/tuto2_Slices.ipynb)
- [compare several fields](https://github.com/Eomys/SciDataTool/blob/master/SciDataTool/Tutorials/tuto3_Compare.ipynb)
- [perform advanced Fourier Transforms](https://github.com/Eomys/SciDataTool/blob/master/SciDataTool/Tutorials/tuto4_Fourier.ipynb)