Last edit: 2024-07-22
# Example notebook - How to use 🦜 in a jupyter notebook
Here you can see how *parrot* can be used to read in continuously recorded data (contained in the folder `example_data`) and how this data can be processed and plotted.

The second example notebook processes the same dataset in a more compact fashion.

In [1]:
# To locate the resources
from importlib import resources
# To read-in the raw data
import h5py
# To make intermediate plots
import matplotlib.pyplot as plt
from matplotlib.ticker import EngFormatter

import parrot

Activate inline plotting (can be changed to `%matplotlib qt5` or `%matplotlib notebook`)

In [2]:
%matplotlib inline

To get more information, we activate the `debug` mode of *parrot*. You can of course also leave it set to `False`. See the second example notebook for a more compact way of processing the data.

In [3]:
parrot.config.set_debug(True)

In [4]:
parrot.config.logger

<RootLogger root (INFO)>

Our example files are stored in the efficient binary format HDF5, which can be read by Python, MATLAB, and other programs.
If you want to learn more about this data format, check out this YouTube tutorial series:
https://www.youtube.com/watch?v=S74Kc8QYDac&list=PLPyhR4PdEeGYWHRhzmCP5stzfIha8bqVg

But for our current purpose, this knowledge is not necessary, since we just read the files in:

Check if example files in `example_data` are available, otherwise download from zenodo.org

In [5]:
parrot.example.init.run()

###
Could not find file light.h5.
Downloading file light.h5...
Downloaded correct file: light.h5
###
Could not find file dark1.h5.
Downloading file dark1.h5...
Downloaded correct file: dark1.h5
###
Could not find file dark2.h5.
Downloading file dark2.h5...
Downloaded correct file: dark2.h5


In [7]:
parrot.example.init

<module 'parrot.example.init' from 'C:\\Users\\Tim\\AppData\\Local\\anaconda3\\Lib\\site-packages\\parrot\\example\\init.py'>

In [6]:
def get_data(file_name):
    """We load the downloaded example files inside the parrot-module."""
    my_file = (resources.files(parrot.example.example_data) / file_name)
    with h5py.File(my_file, "r") as f:
        time = f["time"][:]
        position = f["position"][:]
        signal = f["signal"][:]
        data_dict = {"time": time, "position": position, "signal": signal}
    return data_dict

light = get_data("light.h5")
dark1 = get_data("dark1.h5")
dark2 = get_data("dark2.h5")

FileNotFoundError: [Errno 2] Unable to open file (unable to open file: name = 'C:\Users\Tim\AppData\Local\anaconda3\Lib\site-packages\parrot\example\example_data\light.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)

Each file (`light`, `dark1`, `dark2`) contains three arrays: `time`, (shaker) `position`, and `signal`.
It is important to include the square brackets with the colon (`[:]`). Without them we only get a view of the data, which is gone once the file is closed. With them, the array is fully read into memory.
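
The difference between a dataset handle and a fully read array can be seen in a minimal sketch. It writes a small throwaway file (the file name `demo.h5` and its content are hypothetical and unrelated to the example data) and compares both ways of accessing it:

```python
import os
import tempfile

import h5py
import numpy as np

# Write a tiny throwaway HDF5 file (hypothetical demo data, not the parrot example files)
demo_path = os.path.join(tempfile.mkdtemp(), "demo.h5")
with h5py.File(demo_path, "w") as f:
    f.create_dataset("time", data=np.arange(5, dtype=float))

with h5py.File(demo_path, "r") as f:
    handle = f["time"]      # h5py.Dataset: only a reference into the open file
    values = f["time"][:]   # numpy.ndarray: the data is copied into memory

# The in-memory copy is still usable after the file has been closed
print(type(values).__name__, values.sum())

# The dataset handle became invalid when the file was closed
print(bool(handle))
```

This is why `get_data` slices every dataset with `[:]` before leaving the `with` block.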

Each file is stored in a Python dictionary with three keys and three corresponding numpy arrays containing the values.

In [None]:
light

Let us take a closer look at the data and plot the position vs. time and the signal vs. time:

In [None]:
fig, ax = plt.subplots()
ax.plot(light["time"], light["position"], color="tab:blue")
ax.grid(True)
ax.xaxis.set_major_formatter(EngFormatter("s"))
ax.yaxis.set_major_formatter(EngFormatter("V"))
ax.set_ylabel("Position", color="tab:blue")
ax.set_xlabel("Lab time")

ax2 = ax.twinx()
ax2.plot(light["time"], light["signal"], color="tab:orange")
ax2.yaxis.set_major_formatter(EngFormatter("V"))
ax2.set_ylabel("Signal", color="tab:orange")
plt.show(block=False)

We see that we have a 60 s measurement, but the data density is too high to make out any details. Let us zoom in a bit more to see single THz traces:

In [None]:
fig, ax = plt.subplots()
ax.plot(light["time"], light["position"], color="tab:blue")
ax.grid(True)
ax.xaxis.set_major_formatter(EngFormatter("s"))
ax.yaxis.set_major_formatter(EngFormatter("V"))
ax.set_ylabel("Position", color="tab:blue")
ax.set_xlabel("Lab time")

ax2 = ax.twinx()
ax2.plot(light["time"], light["signal"], color="tab:orange")
ax2.yaxis.set_major_formatter(EngFormatter("V"))
ax2.set_ylabel("Signal", color="tab:orange")

ax.set_xlim([0,0.2])
plt.show(block=False)

We can see the sinusoidal motion (blue) of the oscillating delay line. It needs 100 ms for one period, meaning the shaker was running at 10 Hz. Since we see a THz trace when the shaker is moving forward as well as backward, we get double the number of traces, thus 20 THz traces / s.
As expected, the signal is mirrored when the shaker moves backward. Additionally, there is some "bending" of the baseline of the THz trace, which is repeatable. *parrot* can compensate for these systematic errors.
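
The numbers quoted above follow from simple arithmetic. A short sketch, using the 100 ms period read off the plot and the 60 s measurement duration (the variable names are only illustrative):

```python
period = 100e-3                            # s, one full shaker oscillation (read off the plot)
shaker_frequency = 1 / period              # Hz
traces_per_second = 2 * shaker_frequency   # one trace forward, one trace backward
measurement_time = 60                      # s, duration of the example measurement

expected_traces = traces_per_second * measurement_time
print(f"{shaker_frequency:.0f} Hz, {traces_per_second:.0f} traces/s, "
      f"about {expected_traces:.0f} traces in total")
```

The total is only an estimate, since incomplete traces at the beginning and end of the recording may be discarded.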

This systematic error correction works best when two dark measurements are supplied alongside the light measurement. A dark measurement uses exactly the same recording settings, only with the THz beam blocked. It measures the noise floor, which not only allows us to assess the system performance but also helps to correct systematic errors.

Since we already read in our data, we don't need the `Load` class of *parrot*. Instead, we can directly process the data with the `process` module.

There are multiple methods available to the user, depending on which dataset is available:
1. `thz_and_two_darks`
2. `thz_and_dark`
3. `thz_only`
4. `dark_only`

The only missing information is the conversion between the voltage recorded for the position channel and the corresponding delay in light time.
A `scale` factor needs to be supplied to convert [V] to [s]. The oscillating delay line used for this example data is the [APE scanDelay 50 ps](https://www.ape-berlin.de/en/optical-delay/#1500043794885-e05f0280-1597).

It has a calibrated voltage output of ±10 V (20 V peak-to-peak) for a corresponding delay of 50 ps.

In [None]:
scale = 50e-12 / 20
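
To get a feeling for this number, the following sketch converts a few position voltages to light-time delays. It assumes that the scale factor is applied as a plain multiplication, which is the natural reading of a [V] to [s] conversion but is not taken from the *parrot* source:

```python
scale = 50e-12 / 20   # s/V: 50 ps of delay spread over 20 V peak-to-peak

voltages = (-10.0, 0.0, 10.0)             # V, extremes and centre of the position output
delays = [v * scale for v in voltages]    # s, assuming a purely linear conversion

for v, d in zip(voltages, delays):
    print(f"{v:+5.1f} V -> {d * 1e12:+5.1f} ps")
```

The full ±10 V swing therefore covers a 50 ps delay window, from -25 ps to +25 ps.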

The last setting to discuss is the `debug` parameter. At various points inside *parrot*, messages with different priorities (`DEBUG`, `INFO`, and `WARNING`) are passed to an internal logger. When the parameter is not specified, only log messages of level `WARNING` are shown to the user. With `debug=True`, log messages of level `INFO` and higher are displayed.
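
The effect of the log level can be reproduced with Python's standard `logging` module alone. The sketch below uses a hypothetical logger named `demo`. Whether *parrot* configures its logger in exactly this way is an assumption.

```python
import io
import logging

stream = io.StringIO()
logger = logging.getLogger("demo")            # hypothetical logger, not parrot's own
logger.addHandler(logging.StreamHandler(stream))
logger.propagate = False                      # keep the demo output out of the root logger


def emit_all():
    """Send one message of each priority to the logger."""
    logger.debug("debug message")
    logger.info("info message")
    logger.warning("warning message")


# Comparable to the default behaviour: only WARNING and above pass
logger.setLevel(logging.WARNING)
emit_all()
warning_only = stream.getvalue().splitlines()

# Comparable to debug=True: INFO and above pass, DEBUG is still filtered
stream.seek(0)
stream.truncate(0)
logger.setLevel(logging.INFO)
emit_all()
info_and_up = stream.getvalue().splitlines()

print(warning_only)
print(info_and_up)
```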

*parrot* first analyzes the light dataset, processing the THz data from the forward and backward movement of the delay stage.
The delay of the position data is adjusted so that the single traces overlap and the standard deviation across all traces is minimized.
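
The idea behind this delay adjustment can be illustrated with a toy model. This is a simplified sketch with synthetic data and integer sample shifts, not the algorithm implemented in *parrot*. The sampling rate, pulse shape, lag, and all names are assumptions made for the illustration.

```python
import numpy as np

fs = 10_000                                   # sampling rate in Hz (assumed)
f_shaker = 10                                 # shaker frequency in Hz
t = np.arange(fs // f_shaker) / fs            # exactly one shaker period

true_position = np.sin(2 * np.pi * f_shaker * t)
signal = np.exp(-(true_position / 0.1) ** 2)  # toy pulse, depends on the true position only

true_lag = 7                                  # samples by which the recorded position lags (assumed)
recorded_position = np.roll(true_position, true_lag)


def trace_spread(shift):
    """Mean standard deviation between forward and backward trace for a trial shift."""
    position = np.roll(recorded_position, -shift)
    moving_forward = np.roll(position, -1) > position   # periodic forward difference
    grid = np.linspace(-0.8, 0.8, 201)                  # common position axis
    traces = []
    for mask in (moving_forward, ~moving_forward):
        order = np.argsort(position[mask])
        traces.append(np.interp(grid, position[mask][order], signal[mask][order]))
    return np.std(traces, axis=0).mean()


candidates = np.arange(-20, 21)
best_shift = candidates[np.argmin([trace_spread(s) for s in candidates])]
print(best_shift, true_lag)
```

With a wrong delay, the forward and backward traces are shifted in opposite directions along the position axis. The spread between them is smallest when the trial shift cancels the lag.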

In [None]:
data = parrot.process.thz_and_two_darks(light, 
                                        dark1, 
                                        dark2, 
                                        scale=scale, 
                                        debug=True)

Afterwards, the `data` dictionary contains three keys for the three different measurements as well as one "helper" key to keep track of which functions have been applied to the data (so far none):

In [None]:
data.keys()

All available information is combined within each key. The most important ones are `light_time`, `single_traces` and `average`. The first is a 1-D numpy array. The second is a 2-D numpy array whose shape is the number of interpolated sampling points times the number of single traces extracted from the continuous measurement. The last one, `average`, is another Python dictionary.
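
To make this layout more concrete, here is a sketch with synthetic arrays of the same kind. The sizes, the pulse shape, and the axis order (sampling points first, traces second) are assumptions based on the description above, not values taken from *parrot*.

```python
import numpy as np

n_samples, n_traces = 500, 40                       # hypothetical sizes
rng = np.random.default_rng(0)

light_time = np.linspace(0, 50e-12, n_samples)      # 1-D light-time axis in s
pulse = np.exp(-((light_time - 25e-12) / 1e-12) ** 2)

# One column per single trace: the same pulse plus independent noise
single_traces = pulse[:, None] + 0.05 * rng.standard_normal((n_samples, n_traces))

# Averaging over the trace axis yields one time-domain trace with less noise
average = single_traces.mean(axis=1)

print(light_time.shape, single_traces.shape, average.shape)
```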

In [None]:
data["light"].keys()

The key `average` contains the averaged dataset in `time_domain` and `frequency_domain`:

In [None]:
data["light"]["average"].keys()

In [None]:
fig, ax = plt.subplots()
for mode in ["light", "dark1", "dark2"]:
    ax.plot(data[mode]["light_time"], data[mode]["average"]["time_domain"], label=mode)
ax.grid(True)
ax.xaxis.set_major_formatter(EngFormatter("s"))
ax.yaxis.set_major_formatter(EngFormatter("V"))
ax.set_ylabel("Signal")
ax.set_xlabel("Light time")
ax.legend(loc="upper left")
plt.show(block=False)

### When using the `plot`-module of *parrot*, we receive an **error**:

In [None]:
parrot.plot.simple_multi_cycle(data)

*parrot* makes us aware that we have supplied two dark measurements but did not apply the systematic error correction from the `post_process_data` module to our data.

## Let us fix this problem:

In [None]:
data = parrot.post_process_data.correct_systematic_errors(data)

As we can see, another key was added to our `data`-dictionary, simply called `dark`. This `dark` dataset as well as the `light` dataset were corrected for systematic errors:

In [None]:
data.keys()

When taking a look at the key `applied_functions`, we can see that the applied function was also added to this list:

In [None]:
data["applied_functions"]

In [None]:
parrot.plot.simple_multi_cycle(data)

The three subplots show the time domain and two views of the frequency domain: the amplitude on a normalized linear scale and the power spectrum on a logarithmic scale in dB. Nevertheless, *parrot* warns us that we have not windowed the data yet.
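
The two frequency-domain views can be sketched with numpy for a synthetic trace. The sampling step, the toy pulse, and the normalization to the spectral maximum are assumptions. The exact conventions used by *parrot* may differ.

```python
import numpy as np

dt = 50e-15                                   # s, hypothetical step of the light-time axis
light_time = np.arange(1000) * dt
x = (light_time - 10e-12) / 0.3e-12
trace = x * np.exp(-x ** 2)                   # toy single-cycle pulse

frequency = np.fft.rfftfreq(trace.size, d=dt)          # Hz
amplitude = np.abs(np.fft.rfft(trace))

# Linear view: amplitude normalized to its maximum
amplitude_normalized = amplitude / amplitude.max()

# Logarithmic view: 20*log10(amplitude) equals 10*log10(power); the floor avoids log(0)
power_db = 20 * np.log10(np.maximum(amplitude_normalized, 1e-12))

peak_frequency = frequency[np.argmax(amplitude)]
print(f"spectral peak near {peak_frequency / 1e12:.2f} THz")
```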

The logarithmic frequency domain highlights the introduced artifacts: The amplitude at higher frequencies fluctuates less than the dark traces. Furthermore, there is a slight offset in amplitude between THz data and dark data.

## Let us fix this problem, too:

In [None]:
data = parrot.post_process_data.window(data)
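
What windowing does in general can be sketched with a generic Hann window from numpy. The window shape that *parrot* applies may be different, and the toy trace with its baseline drift is made up for this illustration.

```python
import numpy as np

n = 1000
x = np.arange(n)
pulse = np.exp(-((x - n / 2) / 10) ** 2)      # toy pulse in the middle of the time window
drift = 0.05 * x / n                          # toy baseline drift, mismatched at the edges
trace = pulse + drift

window = np.hanning(n)                        # generic Hann window (assumption)
windowed = trace * window                     # tapers both edges smoothly to zero

# Compare the mean spectral amplitude in the upper half of the frequency bins,
# where the toy pulse itself carries practically no energy
high = slice(n // 4, None)
leakage_raw = np.abs(np.fft.rfft(trace))[high].mean()
leakage_windowed = np.abs(np.fft.rfft(windowed))[high].mean()

print(leakage_raw > leakage_windowed)
```

Without the window, the jump between the last and the first sample leaks into all frequency bins. The taper removes this jump at the cost of slightly attenuating the signal away from the centre of the time window.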

In [None]:
data["applied_functions"]

In [None]:
parrot.plot.simple_multi_cycle(data)