# Calculating Integrated Power
In this notebook we describe how we calculate the total, LFE and HFE integrated power. For efficiency, the calculation is a two-step process:
- Perform linear interpolation of the measured AKR flux from the Waters 20223 mask between each pair of adjacent measured frequencies
- Integrate each line segment that falls within the chosen frequency range and sum the results
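The two steps above can be sketched for a single sweep with NumPy. This is a minimal illustration under stated assumptions, not the implementation in Wind_Waves: the function names `segment_coefficients` and `integrate_segments` and the frequency and flux values are hypothetical.

```python
import numpy as np

def segment_coefficients(freq, flux):
    """Slope and intercept of the straight line joining each adjacent pair of points."""
    freq = np.asarray(freq, dtype=float)
    flux = np.asarray(flux, dtype=float)
    slope = np.diff(flux) / np.diff(freq)
    intercept = flux[:-1] - slope * freq[:-1]
    return slope, intercept

def integrate_segments(freq, slope, intercept, fmin, fmax):
    """Sum the analytic integral of each line, clipped to [fmin, fmax]."""
    freq = np.asarray(freq, dtype=float)
    lo = np.clip(freq[:-1], fmin, fmax)
    hi = np.clip(freq[1:], fmin, fmax)
    # Integral of slope*f + intercept from lo to hi.
    # Segments entirely outside the range have lo == hi and contribute zero.
    return np.sum(0.5 * slope * (hi**2 - lo**2) + intercept * (hi - lo))

# Hypothetical sweep with three measured frequencies
freq = np.array([100.0, 200.0, 400.0])
flux = np.array([1.0, 3.0, 1.0])

slope, intercept = segment_coefficients(freq, flux)
total = integrate_segments(freq, slope, intercept, 100.0, 400.0)
partial = integrate_segments(freq, slope, intercept, 150.0, 300.0)
```

Storing the slope and intercept once means that several frequency ranges can be integrated later without repeating the interpolation, which is the motivation for splitting the calculation in two.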

## Linear Interpolation
In this example we use the linear_in_chunks function, which wraps linear_segments and applies it to chunks of sweeps at a time to limit memory use. linear_segments performs the linear interpolation per sweep.

The arguments of linear_in_chunks are the following:
- hdf5_file: the path and filename of the Waters mask dataset downloaded in Data_Download.ipynb, stored in pandas HDF format
- output_file: the path and filename of the new hdf5 file that will contain the equation of each linear function
- chunk_size: how many sweeps to process at once. Larger chunks are faster but use more memory.
- in_key: the key of the Waters mask dataset in hdf5_file
- out_key: the key to use for the new dataset in output_file
- linear_segments_kwargs: extra keyword arguments that are passed to the linear_segments function
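The effect of chunk_size can be illustrated with a small sketch that splits the unique sweep numbers into groups. The helper `sweep_chunks` is hypothetical and only shows the idea; how linear_in_chunks forms its chunks internally is not described here.

```python
import numpy as np

def sweep_chunks(sweeps, chunk_size):
    """Yield arrays of unique sweep numbers, at most chunk_size per array."""
    unique = np.unique(sweeps)  # sorted unique sweep numbers
    for start in range(0, len(unique), chunk_size):
        yield unique[start:start + chunk_size]

# Hypothetical sweep column with repeated sweep numbers (one row per frequency)
sweeps = np.array([3, 1, 1, 2, 5, 4, 4])
chunks = list(sweep_chunks(sweeps, chunk_size=2))
```

Only the rows belonging to one chunk need to be held in memory at a time, which is why smaller chunks trade speed for a lower memory footprint.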

Possible arguments that can be passed to the linear_segments function:
- time: the name of the column that contains time
- frequency: the name of the column that contains frequency
- sweep: the name of the column that contains the sweep number
- flux: the name of the column that contains the flux data
- preserve_cols: the columns you wish to keep
- preserve_funcs: the functions applied to preserve_cols within each sweep grouping to reduce them to one value per sweep. This is needed because each sweep number appears only once after linear interpolation. min, max and median can be provided as strings.
- preserve_col_suffix: the suffix to add to the preserve_cols column names. This is important if you want to apply multiple functions to the same columns.
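The role of preserve_cols, preserve_funcs and preserve_col_suffix can be sketched with a pandas groupby. This is an illustration only, assuming that each function is paired with one suffix; the data values are made up and the code is not taken from linear_segments.

```python
import pandas as pd

# Hypothetical input: several rows (frequencies) per sweep
df = pd.DataFrame({
    'SWEEP': [1, 1, 1, 2, 2],
    'Date_UTC': pd.to_datetime([
        '2000-01-01 00:00:00', '2000-01-01 00:00:10', '2000-01-01 00:00:20',
        '2000-01-01 00:03:00', '2000-01-01 00:03:10',
    ]),
    'Burst_Number': [7, 7, 7, 8, 8],
})

preserve_cols = ['Date_UTC', 'Burst_Number']
preserve_funcs = ['min', 'max']
preserve_col_suffix = ['_start', '_end']  # one suffix per function (assumed pairing)

pieces = []
for func, suffix in zip(preserve_funcs, preserve_col_suffix):
    # Reduce each preserved column to a single value per sweep
    reduced = df.groupby('SWEEP')[preserve_cols].agg(func).add_suffix(suffix)
    pieces.append(reduced)

# One row per sweep, with a distinct column for each function
preserved = pd.concat(pieces, axis=1).reset_index()
```

Without distinct suffixes the min and max results would share the same column names, which is why the suffix matters when more than one function is applied.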

In this example we provide the same file path but a different key, so the Waters mask and the linear interpolation are in the same file but are separate dataframes stored under different keys.

We also use chunks of 10,000 sweeps.


### Warning
The linear_in_chunks function appends to the hdf5 file at output_file and out_key. If that combination of file and key already exists, the new rows are appended to the existing dataframe. If you don't want this to happen, either delete the existing dataframe or provide a different combination.
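One way to check for, and optionally delete, an existing key before running is sketched below with pandas.HDFStore. The helpers `hdf_key_exists` and `remove_hdf_key` are hypothetical and not part of Wind_Waves. Opening a store requires the PyTables package.

```python
import os
import pandas as pd

def hdf_key_exists(path, key):
    """Return True if the file exists and already holds a dataset under key."""
    if not os.path.exists(path):
        return False
    with pd.HDFStore(path, mode='r') as store:
        return key in store

def remove_hdf_key(path, key):
    """Delete the dataset stored under key, if present, so a rerun starts clean."""
    if hdf_key_exists(path, key):
        with pd.HDFStore(path, mode='a') as store:
            store.remove(key)

# Example check against a file that does not exist (hypothetical filename)
already_there = hdf_key_exists('no_such_file_for_this_sketch.hdf5', 'linear_fit')
```

Calling `remove_hdf_key(output_file, out_key)` before linear_in_chunks would avoid appending a second copy of the results to an earlier run.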

In [None]:
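from Wind_Waves.integration_tools import linear_in_chunks

# Fit linear segments to the Waters mask in chunks of 10,000 sweeps
linear_in_chunks('../../Example_Data/Waters_mask.hdf5', '../../Example_Data/Revision_run_through/Waters_mask.hdf5',
                 chunk_size=10_000, in_key='main', out_key='linear_fit',
                 # Column names, passed through to linear_segments
                 time='Date_UTC', frequency='freq', flux='akr_flux_si_1au',
                 # Keep the minimum Date_UTC and Burst_Number of each sweep, with no suffix
                 preserve_cols=['Date_UTC', 'Burst_Number'], preserve_funcs=['min'], preserve_col_suffix=[''])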

Looping SWEEP chunks:   0% (0 of 166) |  | Elapsed Time: 0:00:00 ETA:  --:--:--
Looping SWEEP chunks:   0% (1 of 166) |  | Elapsed Time: 0:00:05 ETA:   0:15:54
Looping SWEEP chunks:   1% (2 of 166) |  | Elapsed Time: 0:00:10 ETA:   0:14:13
Looping SWEEP chunks:   1% (3 of 166) |  | Elapsed Time: 0:00:16 ETA:   0:15:02
Looping SWEEP chunks:   2% (4 of 166) |  | Elapsed Time: 0:00:21 ETA:   0:14:29
Looping SWEEP chunks:   3% (5 of 166) |  | Elapsed Time: 0:00:27 ETA:   0:14:26
Looping SWEEP chunks:   3% (6 of 166) |  | Elapsed Time: 0:00:32 ETA:   0:14:50
Looping SWEEP chunks:   4% (7 of 166) |  | Elapsed Time: 0:00:38 ETA:   0:13:41
Looping SWEEP chunks:   4% (8 of 166) |  | Elapsed Time: 0:00:43 ETA:   0:13:25
Looping SWEEP chunks:   5% (9 of 166) |  | Elapsed Time: 0:00:48 ETA:   0:12:58
Looping SWEEP chunks:   6% (10 of 166) | | Elapsed Time: 0:00:52 ETA:   0:12:05
Looping SWEEP chunks:   6% (11 of 166) | | Elapsed Time: 0:00:57 ETA:   0:12:03
Looping SWEEP chunks:   7% (12 of 166) |

'../../Example_Data/Revision_run_through/Waters_mask.hdf5'

## Total Integrated Power
Now that we have linear functions that provide a continuous measure of AKR flux, we can integrate them.

First we integrate across the AKR frequency limits found from the Fogg 2024 burst list. We will use the dataframe made in the Calculating_Frequency_Extension.ipynb notebook, which also includes the LFE and HFE thresholds for each substorm list.

Just as for the linear interpolation, we have a chunking wrapper called integrate_in_chunks that wraps the integrate function and applies it to chunks of sweeps at a time to prevent memory issues.

integrate_in_chunks takes the following arguments:
- lin_fit_hdf5: the path and filename of the hdf file containing the linear fits performed in the previous step
- output_file: the path and filename of the hdf file that will contain the newly calculated integrated power
- flims: the frequency limits. A tuple of the form (fmin, fmax) can be provided if constant frequency limits are to be used. Alternatively, as done in this example, a dataframe containing a sweep number column and columns for fmin and fmax can be provided, which defines different integration limits for each sweep.
- chunk_size: the number of sweeps in each chunk. Higher values are quicker but use more memory.
- in_key: the key in the lin_fit_hdf5 file that contains the linear fits
- out_key: the key in output_file where the integrated power will be placed
- integrate_kwargs: extra keyword arguments to be passed to integrate

Possible arguments that can be provided to integrate:
- sweep: the name of the column that contains the sweep number
- fmin: the name of the column that contains the lower frequency limits in the flims dataframe (unused if providing static frequency limits)
- fmax: the name of the column that contains the upper frequency limits in the flims dataframe (unused if providing static frequency limits)
- distance: the distance that the flux dataset is normalised to (the default is one astronomical unit, which is the standard)
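The two forms of flims described above can be sketched as a helper that returns one pair of limits per sweep. The function `limits_per_sweep` and the numbers in the example are hypothetical; the sketch shows the idea rather than how integrate handles flims internally.

```python
import pandas as pd

def limits_per_sweep(sweeps, flims, sweep='SWEEP', fmin='fmin', fmax='fmax'):
    """Return a dataframe with one row per sweep and columns 'fmin' and 'fmax'."""
    out = pd.DataFrame({sweep: sweeps})
    if isinstance(flims, tuple):
        # Static limits: the same (fmin, fmax) pair for every sweep
        lo, hi = flims
        return out.assign(fmin=lo, fmax=hi)
    # Variable limits: look up the chosen columns for each sweep number
    limits = flims[[sweep, fmin, fmax]].rename(columns={fmin: 'fmin', fmax: 'fmax'})
    return out.merge(limits, on=sweep, how='left')

# Static limits (hypothetical values)
static = limits_per_sweep([1, 2], (50.0, 500.0))

# Variable limits taken from a dataframe (hypothetical values)
table = pd.DataFrame({
    'SWEEP': [1, 2],
    'fmin': [30.0, 40.0],
    'fmax': [600.0, 700.0],
    'combined_fmax': [100.0, 150.0],
})
variable = limits_per_sweep([1, 2], table, fmax='combined_fmax')
```

Passing a different column name for fmin or fmax selects a different band from the same dataframe, which is how one limits table can serve several integrations.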

In [None]:
from Wind_Waves.integration_tools import integrate_in_chunks
import pandas as pd

# Per-sweep frequency limits from Calculating_Frequency_Extension.ipynb
flims = pd.read_csv('../../Example_Data/Frequency_Extension.csv', parse_dates=['Date_UTC'])
# Integrate the linear fits between the fmin and fmax columns of flims for each sweep
integrate_in_chunks('../../Example_Data/Waters_mask.hdf5',
                    '../../Example_Data/Waters_mask.hdf5', flims, chunk_size=10_000,
                    in_key='linear_fit', out_key='integrated_power_total', fmin='fmin', fmax='fmax',
                    sweep='SWEEP')

Looping SWEEP chunks:   0% (0 of 166) |  | Elapsed Time: 0:00:00 ETA:  --:--:--
Looping SWEEP chunks:   0% (1 of 166) |  | Elapsed Time: 0:00:07 ETA:   0:20:34
Looping SWEEP chunks:   1% (2 of 166) |  | Elapsed Time: 0:00:14 ETA:   0:18:56
Looping SWEEP chunks:   1% (3 of 166) |  | Elapsed Time: 0:00:20 ETA:   0:16:36
Looping SWEEP chunks:   2% (4 of 166) |  | Elapsed Time: 0:00:26 ETA:   0:16:16
Looping SWEEP chunks:   3% (5 of 166) |  | Elapsed Time: 0:00:32 ETA:   0:16:06
Looping SWEEP chunks:   3% (6 of 166) |  | Elapsed Time: 0:00:37 ETA:   0:13:03
Looping SWEEP chunks:   4% (7 of 166) |  | Elapsed Time: 0:00:41 ETA:   0:11:26
Looping SWEEP chunks:   4% (8 of 166) |  | Elapsed Time: 0:00:46 ETA:   0:11:28
Looping SWEEP chunks:   5% (9 of 166) |  | Elapsed Time: 0:00:51 ETA:   0:12:49
Looping SWEEP chunks:   6% (10 of 166) | | Elapsed Time: 0:00:55 ETA:   0:12:20
Looping SWEEP chunks:   6% (11 of 166) | | Elapsed Time: 0:01:00 ETA:   0:13:05
Looping SWEEP chunks:   7% (12 of 166) |

'../../Example_Data/Revision_run_through/Waters_mask.hdf5'

## Integrate LFEs and HFEs
Now we integrate over the LFE and HFE frequency ranges for each substorm list, based on the thresholds we found in the Calculating_Frequency_Extension.ipynb notebook. Each substorm list has a different key for the LFE and HFE integrated power datasets produced. We print the substorm list name to monitor progress.
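The pairing of output keys and limit columns for each band can be laid out as a small table of jobs. The helper `band_jobs` is hypothetical; the key and column names follow the naming pattern used in this notebook, where the LFE band runs from fmin to the list's fmax threshold and the HFE band runs from the list's fmin threshold to fmax.

```python
def band_jobs(sub_lists):
    """List the out_key, fmin column and fmax column for each LFE and HFE integration."""
    jobs = []
    for name in sub_lists:
        # LFE: from the overall lower limit up to the list-specific threshold
        jobs.append({'out_key': f'integrated_power_lfe_{name}',
                     'fmin': 'fmin', 'fmax': f'{name}_fmax'})
        # HFE: from the list-specific threshold up to the overall upper limit
        jobs.append({'out_key': f'integrated_power_hfe_{name}',
                     'fmin': f'{name}_fmin', 'fmax': 'fmax'})
    return jobs

jobs = band_jobs(['combined', 'newell', 'ohtani', 'sophie'])
for job in jobs:
    print(job['out_key'], job['fmin'], job['fmax'])
```

With four substorm lists this gives eight integrations, each written to its own key in the output file.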

In [None]:
for sub_list in ['combined', 'newell', 'ohtani', 'sophie']:
    print(sub_list)
    # LFE: integrate from fmin up to this substorm list's fmax threshold
    integrate_in_chunks('../../Example_Data/Waters_mask.hdf5',
                        '../../Example_Data/Waters_mask.hdf5', flims, chunk_size=10_000,
                        in_key='linear_fit', out_key=f'integrated_power_lfe_{sub_list}', fmin='fmin', fmax=f'{sub_list}_fmax',
                        sweep='SWEEP')
    # HFE: integrate from this substorm list's fmin threshold up to fmax
    integrate_in_chunks('../../Example_Data/Waters_mask.hdf5',
                        '../../Example_Data/Waters_mask.hdf5', flims, chunk_size=10_000,
                        in_key='linear_fit', out_key=f'integrated_power_hfe_{sub_list}', fmin=f'{sub_list}_fmin', fmax='fmax',
                        sweep='SWEEP')

combined


Looping SWEEP chunks:   0% (0 of 166) |  | Elapsed Time: 0:00:00 ETA:  --:--:--
Looping SWEEP chunks:   0% (1 of 166) |  | Elapsed Time: 0:00:06 ETA:   0:19:12
Looping SWEEP chunks:   1% (2 of 166) |  | Elapsed Time: 0:00:14 ETA:   0:20:15
Looping SWEEP chunks:   1% (3 of 166) |  | Elapsed Time: 0:00:20 ETA:   0:17:46
Looping SWEEP chunks:   2% (4 of 166) |  | Elapsed Time: 0:00:27 ETA:   0:17:12
Looping SWEEP chunks:   3% (5 of 166) |  | Elapsed Time: 0:00:33 ETA:   0:16:05
Looping SWEEP chunks:   3% (6 of 166) |  | Elapsed Time: 0:00:38 ETA:   0:13:22
Looping SWEEP chunks:   4% (7 of 166) |  | Elapsed Time: 0:00:42 ETA:   0:11:46
Looping SWEEP chunks:   4% (8 of 166) |  | Elapsed Time: 0:00:46 ETA:   0:11:06
Looping SWEEP chunks:   5% (9 of 166) |  | Elapsed Time: 0:00:51 ETA:   0:12:45
Looping SWEEP chunks:   6% (10 of 166) | | Elapsed Time: 0:00:56 ETA:   0:12:57
Looping SWEEP chunks:   6% (11 of 166) | | Elapsed Time: 0:01:01 ETA:   0:12:30
Looping SWEEP chunks:   7% (12 of 166) |

newell


Looping SWEEP chunks:   0% (0 of 166) |  | Elapsed Time: 0:00:00 ETA:  --:--:--
Looping SWEEP chunks:   0% (1 of 166) |  | Elapsed Time: 0:00:07 ETA:   0:21:08
Looping SWEEP chunks:   1% (2 of 166) |  | Elapsed Time: 0:00:15 ETA:   0:21:26
Looping SWEEP chunks:   1% (3 of 166) |  | Elapsed Time: 0:00:22 ETA:   0:17:54
Looping SWEEP chunks:   2% (4 of 166) |  | Elapsed Time: 0:00:28 ETA:   0:17:15
Looping SWEEP chunks:   3% (5 of 166) |  | Elapsed Time: 0:00:34 ETA:   0:17:06
Looping SWEEP chunks:   3% (6 of 166) |  | Elapsed Time: 0:00:40 ETA:   0:13:58
Looping SWEEP chunks:   4% (7 of 166) |  | Elapsed Time: 0:00:44 ETA:   0:11:50
Looping SWEEP chunks:   4% (8 of 166) |  | Elapsed Time: 0:00:49 ETA:   0:12:02
Looping SWEEP chunks:   5% (9 of 166) |  | Elapsed Time: 0:00:54 ETA:   0:14:16
Looping SWEEP chunks:   6% (10 of 166) | | Elapsed Time: 0:00:59 ETA:   0:13:43
Looping SWEEP chunks:   6% (11 of 166) | | Elapsed Time: 0:01:05 ETA:   0:14:49
Looping SWEEP chunks:   7% (12 of 166) |

ohtani


Looping SWEEP chunks:   0% (0 of 166) |  | Elapsed Time: 0:00:00 ETA:  --:--:--
Looping SWEEP chunks:   0% (1 of 166) |  | Elapsed Time: 0:00:07 ETA:   0:20:27
Looping SWEEP chunks:   1% (2 of 166) |  | Elapsed Time: 0:00:13 ETA:   0:17:29
Looping SWEEP chunks:   1% (3 of 166) |  | Elapsed Time: 0:00:19 ETA:   0:16:04
Looping SWEEP chunks:   2% (4 of 166) |  | Elapsed Time: 0:00:25 ETA:   0:15:14
Looping SWEEP chunks:   3% (5 of 166) |  | Elapsed Time: 0:00:30 ETA:   0:14:26
Looping SWEEP chunks:   3% (6 of 166) |  | Elapsed Time: 0:00:35 ETA:   0:12:50
Looping SWEEP chunks:   4% (7 of 166) |  | Elapsed Time: 0:00:39 ETA:   0:10:52
Looping SWEEP chunks:   4% (8 of 166) |  | Elapsed Time: 0:00:43 ETA:   0:10:38
Looping SWEEP chunks:   5% (9 of 166) |  | Elapsed Time: 0:00:48 ETA:   0:12:11
Looping SWEEP chunks:   6% (10 of 166) | | Elapsed Time: 0:00:53 ETA:   0:12:20
Looping SWEEP chunks:   6% (11 of 166) | | Elapsed Time: 0:00:57 ETA:   0:12:01
Looping SWEEP chunks:   7% (12 of 166) |

sophie


Looping SWEEP chunks:   0% (0 of 166) |  | Elapsed Time: 0:00:00 ETA:  --:--:--
Looping SWEEP chunks:   0% (1 of 166) |  | Elapsed Time: 0:00:07 ETA:   0:19:51
Looping SWEEP chunks:   1% (2 of 166) |  | Elapsed Time: 0:00:13 ETA:   0:17:37
Looping SWEEP chunks:   1% (3 of 166) |  | Elapsed Time: 0:00:19 ETA:   0:15:38
Looping SWEEP chunks:   2% (4 of 166) |  | Elapsed Time: 0:00:25 ETA:   0:15:18
Looping SWEEP chunks:   3% (5 of 166) |  | Elapsed Time: 0:00:30 ETA:   0:15:02
Looping SWEEP chunks:   3% (6 of 166) |  | Elapsed Time: 0:00:35 ETA:   0:12:14
Looping SWEEP chunks:   4% (7 of 166) |  | Elapsed Time: 0:00:39 ETA:   0:10:39
Looping SWEEP chunks:   4% (8 of 166) |  | Elapsed Time: 0:00:43 ETA:   0:10:52
Looping SWEEP chunks:   5% (9 of 166) |  | Elapsed Time: 0:00:48 ETA:   0:12:08
Looping SWEEP chunks:   6% (10 of 166) | | Elapsed Time: 0:00:52 ETA:   0:11:48
Looping SWEEP chunks:   6% (11 of 166) | | Elapsed Time: 0:00:57 ETA:   0:12:01
Looping SWEEP chunks:   7% (12 of 166) |