# Tutorial for Solar Wind Time Series Viewer

## Installation

- Create a conda environment

    `conda create --name tsv1 --file spec-file.txt`
    
- Note that the CDF library may be required

- Download the package

    `git clone https://github.com/huangzesen/Solar-Wind-Time-Series-Viewer`
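
Once the environment is active, it can help to confirm that the main dependencies are importable before starting. The sketch below is only a convenience check and not part of the package. The module list is an assumption based on the imports used later in this tutorial, and `cdflib` is assumed to be the Python CDF reader meant by the note above.

```python
from importlib.util import find_spec

# Module names are assumptions taken from the imports used in this tutorial;
# cdflib is assumed to be the CDF reader needed for the CDF files.
REQUIRED = ["numpy", "pandas", "matplotlib", "pyspedas", "pytplot", "cdflib"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

Any name reported as missing can then be installed into the `tsv1` environment.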

## Initialization

In [11]:
%load_ext autoreload
%autoreload 2
%matplotlib tk

import sys
# set pyspedas path (not necessary, as long as you have installed spedas)
sys.path.insert(0,"../pyspedas")
# add Solar Wind Time Series Viewer Path (path to the package)
sys.path.insert(0,"./Solar-Wind-Time-Series-Viewer")

import pyspedas
from pytplot import get_data
from pyspedas.utilities import time_string
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import clear_output
from pathlib import Path
import os
import pickle
from gc import collect

# from TSUtilities import FindIntervalInfo
from TimeSeriesViewer import TimeSeriesViewer

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


## Initialize the TimeSeriesViewer object

- sc: spacecraft code
    - sc = 0 : PSP (use spedas, ./psp_data required)
    - sc = 1 : Solar Orbiter (use spedas, ./solar_orbiter_data required)
    - sc = 2 : Helios-1
    - sc = 3 : Helios-2
    - sc = 4 : Ulysses (use spedas ./ulysses_data required)
    
    For the spacecraft that use spedas, the corresponding local data folder is required.

- rolling_rate: rolling window for averaged quantities, like V and B
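
To make the idea of a rolling window concrete, here is a minimal pandas sketch of a time-based rolling mean on synthetic data. It only illustrates what a `'1H'`-style window means and is not necessarily how the package computes its averages internally. The lowercase `"1h"` alias is used because recent pandas versions prefer it.

```python
import numpy as np
import pandas as pd

# Synthetic 10-minute series standing in for a quantity such as |B| or V.
index = pd.date_range("1975-08-01", periods=12, freq="10min")
series = pd.Series(np.arange(12, dtype=float), index=index)

# Time-based window: each value is the mean of the samples within the
# preceding hour (the window is closed on the right by default).
rolling_mean = series.rolling("1h").mean()

# The last window covers the six samples 6, 7, ..., 11.
print(rolling_mean.iloc[-1])  # 8.5
```

A longer window gives smoother averages at the cost of a slower response to real changes in the solar wind.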

In [2]:
t00 = pd.Timestamp("1975-07-01")
t10 = pd.Timestamp("1975-10-01")

# credentials for PSP: if you have them, fill in the None values
# credentials = {'psp':
#          {
#              'fields': {'username': None, 'password': None},
#              'sweap': {'username': None, 'password': None}
#          }
#     }
credentials = None

tsv = TimeSeriesViewer(sc = 2, 
                       start_time_0 = t00, 
                       end_time_0 = t10, 
                       rolling_rate = '1H', 
                       resolution = 5,
                       credentials = credentials
                      )

Initializing...
Current Directory: /Users/huangzesen/work/projects/select_turbulence_intervals/Tutorials
Preloading raw dataframe... This may take some time...
Current Settings...
verbose : True
Loading Helios-1 data from CDAWEB...


  'Vth': 0.128487*np.sqrt(data['Tp']), # vth[km/s] = 0.128487 * √Tp[K]



Done.
Final Settings...
verbose : True
Vx not in columns...!
Vy not in columns...!
Vz not in columns...!
Done.
Done.


## Make the plot

- t0, t1: the start and end time of the current interactive window

- tsv.p_funcs: the programs that are run when you press p inside a selected interval
    - Should be a dictionary whose keys are the programs you want to use
    - The values work as settings for the specific program
    - Currently supported:
        - "PSD": calculate the PSD, and leave a PSD dictionary in the selected intervals
        - "Struc_Func": calculate the 1st order structure function, and leave a struc_funcs dictionary in the selected intervals
        - more to come...

- tsv.resample_rate is the resolution of the interactive window

- tsv.connect() connects you to the interactive window

Note that the time series can be accessed via: tsv.dfts
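
For readers unfamiliar with the first-order structure function, the sketch below uses the textbook definition S1(lag) = <|x(t + lag) - x(t)|> on an evenly sampled array. It is a generic illustration under that assumption, not the package's own `Struc_Func` implementation, and the function name is hypothetical.

```python
import numpy as np


def structure_function_1st(x, lags):
    """First-order structure function S1(lag) = <|x[i + lag] - x[i]|>.

    x    : evenly sampled values (NaNs are ignored in the average)
    lags : positive integer lags, each smaller than len(x)
    """
    x = np.asarray(x, dtype=float)
    return np.array([np.nanmean(np.abs(x[lag:] - x[:-lag])) for lag in lags])


# Linear ramp: every increment over `lag` samples equals `lag`.
ramp = np.arange(100.0)
print(structure_function_1st(ramp, [1, 2, 4]))  # [1. 2. 4.]
```

For a vector quantity such as the magnetic field, the absolute value would typically be replaced by the magnitude of the vector increment.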

In [6]:
plt.close('all')
tsv.resample_rate = '10min'
t0 = pd.Timestamp("1975-08-01")
t1 = pd.Timestamp("1975-09-01")
# initialize the window
tsv.InitFigure(t0, t1)
# programs for key "p"
tsv.p_funcs = {'PSD':1, 'Struc_Func':1}
tsv.connect()

Preparing Time Series....
Finding Corresponding Time Series...
Processing the Time Series...
SC SW data not exist! Use RTN instead!
Done.
Load High Res Mag data for sc = 2
Required tstart = 1975-08-01 00:00:00, tend = 1975-09-01 00:00:00
Input tstart = 1975-07-31 14:00:00+00:00, tend = 1975-09-01 10:00:00+00:00
Returned tstart = 1975-07-31 15:03:18, tend = 1975-08-19 23:28:48
Final tstart = 1975-08-01 15:18:06, tend = 1975-08-19 23:28:48


## How to interact with the window and select intervals

**BE PATIENT: THE WINDOW CAN TAKE SOME TIME TO REACT**

- A *left click* on the interactive figure creates a red dashed line; two red dashed lines enclose a red shaded area, indicating a **selected** interval.

- A *right click* on the interactive figure creates a green dashed line; two green dashed lines enclose an area to **zoom in** on.

- Pressing *backspace* returns to the previous window (not recommended, as it can be buggy; reinitializing the window is recommended instead)

- Pressing *d* <u> when the cross hair is inside the selected interval </u> will **delete** the interval

- Pressing *p* <u> when the cross hair is inside the selected interval </u> will run the **programs** specified by tsv.p_funcs to derive diagnostics of the interval
    - when intervals overlap with each other, the program automatically chooses the shortest one
    - for each **program**, the derived diagnostics are saved in tsv.selected_intervals, which is a <u> list of dictionaries</u>.
    
- When the cross hair is inside the interval, the floating text shows some diagnostics of the selected interval

![](./figures/Figure_1.png)    
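
The rule for overlapping intervals (the shortest interval containing the cross hair wins) can be sketched in a few lines. This is a standalone illustration, assuming each interval is a dictionary with `start_time` and `end_time` keys; `pick_interval` is a hypothetical helper, not a function of the package.

```python
from datetime import datetime


def pick_interval(intervals, cursor):
    """Return the shortest interval that contains `cursor`, or None."""
    candidates = [
        iv for iv in intervals
        if iv["start_time"] <= cursor <= iv["end_time"]
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda iv: iv["end_time"] - iv["start_time"])


intervals = [
    {"start_time": datetime(1975, 8, 10), "end_time": datetime(1975, 8, 18)},
    {"start_time": datetime(1975, 8, 12), "end_time": datetime(1975, 8, 14)},
]
chosen = pick_interval(intervals, datetime(1975, 8, 13))
print(chosen["start_time"])  # 1975-08-12 00:00:00
```

The same function works with `pd.Timestamp` values, since they support the same comparisons and subtraction.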
    

## Print Diagnostics of the selected intervals

In [82]:
for i1 in range(len(tsv.selected_intervals)):
    try:
        rolling_std = np.sqrt(np.sum((
            tsv.selected_intervals[i1]['dfmag']['Btot'] - tsv.selected_intervals[i1]['dfmag']['Btot'].rolling('1H').mean()
        )**2)/len(tsv.selected_intervals[i1]['dfmag']['Btot']))

        std_np = np.sqrt(np.sum((
            tsv.selected_intervals[i1]['TimeSeries']['np'] - tsv.selected_intervals[i1]['TimeSeries']['np'].rolling('1H').mean()
        )**2)/len(tsv.selected_intervals[i1]['TimeSeries']['np']))/tsv.selected_intervals[i1]['TimeSeries']['np'].mean()

        print("<dBmod/Bmod> = %.4f, <sigma_c> = %.4f, <std_np> = %.4f, <vsw> = %.0f km/s, <vsw_std> = %.4f, <Rmax/Rmin> = %.2f" %(
            (rolling_std)/(tsv.selected_intervals[i1]['dfmag']['Btot'].mean()),
            tsv.selected_intervals[i1]['TimeSeries']['sigma_c'].mean(),
            std_np,
            tsv.selected_intervals[i1]['TimeSeries']['vsw'].mean(),
            tsv.selected_intervals[i1]['TimeSeries']['vsw'].std()/tsv.selected_intervals[i1]['TimeSeries']['vsw'].mean(),
            tsv.selected_intervals[i1]['TimeSeries']['Dist_au'].max()/tsv.selected_intervals[i1]['TimeSeries']['Dist_au'].min()
            )
        )
    except Exception:
        # skip intervals that lack the required data (e.g. missing columns)
        pass

<dBmod/Bmod> = 0.2490, <sigma_c> = 0.3332, <std_np> = 0.1461, <vsw> = 423 km/s, <vsw_std> = 0.1514, <Rmax/Rmin> = 1.07
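
The cell above applies the same expression to `Btot` and to `np`: the RMS deviation from a 1-hour rolling mean, divided by the mean of the series. A possible refactoring is sketched below. The helper name `normalized_rolling_rms` is hypothetical, and the input is assumed to be a pandas Series with a `DatetimeIndex`.

```python
import numpy as np
import pandas as pd


def normalized_rolling_rms(series, window="1h"):
    """RMS deviation from the rolling mean, divided by the series mean."""
    deviation = series - series.rolling(window).mean()
    rms = np.sqrt((deviation ** 2).sum() / len(series))
    return rms / series.mean()


# A constant series has no fluctuations around its rolling mean.
index = pd.date_range("1975-08-01", periods=24, freq="10min")
constant = pd.Series(5.0, index=index)
print(normalized_rolling_rms(constant))  # 0.0
```

With this helper, the `<dBmod/Bmod>` value for an interval `iv` would be `normalized_rolling_rms(iv['dfmag']['Btot'])`, and `<std_np>` would be `normalized_rolling_rms(iv['TimeSeries']['np'])`.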


## Save the intervals

The selected intervals are stored in

*tsv.selected_intervals*

which is a **list of dictionaries**. Each dictionary has many keys. The keys ['rects', 'lines1', 'lines2'] hold matplotlib objects and are not meant to be saved (keeping them can be very problematic). The other keys are useful:

- spacecraft: spacecraft code
- start_time, end_time: start and end time of the interval
- TimeSeries: tsv.dfts for the current interval
- dfmag: full resolution magnetic field data
- LTSWsettings: LoadTimeSeriesWrapper settings, used for PSP because different instruments are used at different times
- PSD, struc_funcs: diagnostics data

In [5]:
print(tsv.selected_intervals[0].keys())

dict_keys(['spacecraft', 'start_time', 'end_time', 'TimeSeries', 'rects', 'lines1', 'lines2', 'dfmag', 'LTSWsettings', 'PSD', 'struc_funcs'])
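
The `PSD` entry stores the spectral diagnostics computed by the package; its exact layout is not described here. As a generic illustration of what a power spectral density estimate involves, the sketch below computes a plain FFT periodogram of an evenly sampled series. This is not the package's own routine, which may window, smooth, or normalize differently, and the function name is hypothetical.

```python
import numpy as np


def periodogram(x, dt):
    """One-sided periodogram of an evenly sampled series.

    x  : samples
    dt : sampling interval in seconds
    Returns (frequencies in Hz, power per Hz).
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                      # remove the zero-frequency power
    n = x.size
    freqs = np.fft.rfftfreq(n, d=dt)
    power = np.abs(np.fft.rfft(x)) ** 2 * dt / n
    # Double the bins that have a negative-frequency twin so that
    # sum(power) * df equals the variance of x.
    if n % 2 == 0:
        power[1:-1] *= 2.0
    else:
        power[1:] *= 2.0
    return freqs, power


# Example: a 0.1 Hz sine sampled once per second.
t = np.arange(1000)
freqs, power = periodogram(np.sin(2 * np.pi * 0.1 * t), dt=1.0)
peak_frequency = freqs[np.argmax(power)]
print(peak_frequency)
```

The spectrum peaks at the frequency of the input sine, and the power integrated over frequency recovers the variance of the signal.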


In [7]:
# extract the useful keys
useful_keys = ['spacecraft', 'start_time', 'end_time', 'TimeSeries', 
               'dfmag', 'LTSWsettings', 'PSD', 'struc_funcs']

# save in a new dictionary
d = {}
for k in useful_keys:
    d[k] = tsv.selected_intervals[0][k]

Then save `d` somewhere using `pickle.dump`:

In [9]:
os.makedirs("intervals", exist_ok = True)
with open("intervals/tsv.pkl", 'wb') as f:
    pickle.dump(d, f)
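
The cells above save a single interval. To save every selected interval at once, one option is to drop the matplotlib keys from each dictionary and pickle the resulting list. The helper names and the file name below are hypothetical; only the key names come from this tutorial.

```python
import pickle
from pathlib import Path

# Keys holding matplotlib objects, as listed in this tutorial.
PLOT_KEYS = {"rects", "lines1", "lines2"}


def strip_plot_objects(interval):
    """Copy an interval dictionary without its matplotlib entries."""
    return {k: v for k, v in interval.items() if k not in PLOT_KEYS}


def save_intervals(intervals, path):
    """Pickle a list of intervals after dropping the matplotlib entries."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    cleaned = [strip_plot_objects(iv) for iv in intervals]
    with open(path, "wb") as f:
        pickle.dump(cleaned, f)
    return cleaned


# Intended usage (requires a TimeSeriesViewer session):
# save_intervals(tsv.selected_intervals, "intervals/all_intervals.pkl")
```

Excluding keys rather than listing the useful ones keeps any diagnostics that future programs may add to the dictionaries.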

## Import the intervals

The intervals can be imported to resume your work, or to export time series plots. 

**Note that if you press p again for a selected interval, the diagnostics will be replaced with new ones**

### Read the intervals

In [8]:
d = pd.read_pickle("intervals/tsv.pkl")

In [9]:
plt.close('all')
tsv.resample_rate = '5min'
t0 = pd.Timestamp("1975-08-01")
t1 = pd.Timestamp("1975-10-01")
tsv.InitFigure(t0, t1)
# this will import your existing intervals and plot them on top
tsv.ImportSelectedIntervals([d])
tsv.p_funcs = {'PSD':1, 'Struc_Func':1}
tsv.connect()

Preparing Time Series....
Finding Corresponding Time Series...
Processing the Time Series...
SC SW data not exist! Use RTN instead!
Done.
Load High Res Mag data for sc = 2
Required tstart = 1975-08-01 00:00:00, tend = 1975-10-01 00:00:00
Input tstart = 1975-07-31 14:00:00+00:00, tend = 1975-10-01 10:00:00+00:00
Returned tstart = 1975-07-31 15:03:18, tend = 1975-10-01 08:12:48
Final tstart = 1975-08-01 15:18:06, tend = 1975-09-30 23:59:54




## Export Interval Captures

In [10]:
plt.close('all')
tsv.resample_rate = '5min'

# change t0 to appropriate start time for the given interval
t0 = d['start_time']-pd.Timedelta('1d')
# change t1 to appropriate end time for the given interval
t1 = d['end_time']+pd.Timedelta('1d')

tsv.InitFigure(t0, t1)
# this will import your existing intervals and plot them on top
tsv.selected_intervals = []
tsv.ImportSelectedIntervals([d])

# save figure
os.makedirs("figures", exist_ok = True)
tsv.fig.set_constrained_layout(True)
tsv.fig.savefig("figures/example.png", dpi = 300)

# note that all the axes in the figure can be accessed via:
# tsv.axes
# it is possible to change the aesthetics

Preparing Time Series....
Finding Corresponding Time Series...
Processing the Time Series...
SC SW data not exist! Use RTN instead!
Done.
Load High Res Mag data for sc = 2
Required tstart = 1975-08-10 17:20:00, tend = 1975-08-18 12:41:00
Input tstart = 1975-08-10 07:20:00+00:00, tend = 1975-08-18 22:41:00+00:00
Returned tstart = 1975-08-10 15:45:42, tend = 1975-08-15 21:36:18
Final tstart = 1975-08-10 17:20:00, tend = 1975-08-15 21:36:18
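
When several intervals have been saved, the same export steps can be repeated in a loop. The sketch below only prepares the padded window and an output file name for each interval. `export_plan` and the file-name pattern are hypothetical, and the viewer calls in the trailing comment are the ones used in the cell above.

```python
from datetime import datetime, timedelta


def export_plan(intervals, pad=timedelta(days=1), folder="figures"):
    """Yield (interval, t0, t1, filename) with the window padded on both sides."""
    for i, iv in enumerate(intervals):
        t0 = iv["start_time"] - pad
        t1 = iv["end_time"] + pad
        yield iv, t0, t1, f"{folder}/interval_{i:03d}.png"


example = [{"start_time": datetime(1975, 8, 11), "end_time": datetime(1975, 8, 17)}]
for iv, t0, t1, filename in export_plan(example):
    print(t0, t1, filename)

# Intended usage with the viewer (requires a TimeSeriesViewer session):
# for iv, t0, t1, filename in export_plan(saved_intervals):
#     tsv.InitFigure(t0, t1)
#     tsv.selected_intervals = []
#     tsv.ImportSelectedIntervals([iv])
#     tsv.fig.savefig(filename, dpi=300)
```

The arithmetic also works when `start_time` and `end_time` are `pd.Timestamp` objects.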


## About PSP

In [None]:
t00 = pd.Timestamp("2022-05-01")
t10 = pd.Timestamp("2022-07-01")
credentials = {'psp':
         {
             'fields': {'username': None, 'password': None},
             'sweap': {'username': None, 'password': None}
         }
    }
tsv = TimeSeriesViewer(sc = 0, 
                       start_time_0 = t00, 
                       end_time_0 = t10, 
                       resample_rate = '5min', 
                       rolling_rate = '1H', 
                       resolution = 5,
                       credentials = credentials,
                       LTSWsettings = {'must_have_qtn':False, 'particle_mode':'span_only'}
                      )
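
To avoid typing usernames and passwords into the notebook, the credentials dictionary can be built from environment variables. This is a sketch only: the variable names (`PSP_FIELDS_USERNAME` and so on) are hypothetical, and only the dictionary shape is taken from the cell above. Unset variables yield `None`, matching the placeholder values used there.

```python
import os


def psp_credentials_from_env(env=None):
    """Build the PSP credentials dictionary from environment variables.

    The variable names are hypothetical; unset variables give None.
    """
    env = os.environ if env is None else env
    return {
        "psp": {
            "fields": {
                "username": env.get("PSP_FIELDS_USERNAME"),
                "password": env.get("PSP_FIELDS_PASSWORD"),
            },
            "sweap": {
                "username": env.get("PSP_SWEAP_USERNAME"),
                "password": env.get("PSP_SWEAP_PASSWORD"),
            },
        }
    }


credentials = psp_credentials_from_env()
```

The resulting dictionary can be passed as the `credentials` argument in place of the literal dictionary.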

The LTSWsettings have two main components:

{'must_have_qtn':False, 'particle_mode':'span_only'}

There are three options for particle_mode:

- span_only
- spc_only
- empirical

The empirical mode selects data as follows:

- encounter dates: https://sppgway.jhuapl.edu/index.php/encounters
- before encounter 9 (Perihelion: 2021-08-09/19:11), use SPC for solar wind speed
- at and after encounter 8, mix SPC and SPAN for solar wind speed
- prioritize QTN for density, then fill with SPC, then with SPAN

The 'must_have_qtn' option requires QTN data to be available for the interval.
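
The density priority described for the empirical mode (QTN first, then SPC, then SPAN) amounts to a gap-filling merge. A minimal pandas sketch on synthetic values is given below. `merge_density` is a hypothetical helper, and the package's actual merging logic may differ, for example in how the instruments are resampled onto a common time grid.

```python
import numpy as np
import pandas as pd


def merge_density(qtn, spc, span):
    """Prefer QTN; fill its gaps with SPC, then the remaining gaps with SPAN."""
    return qtn.combine_first(spc).combine_first(span)


# Synthetic densities on a shared time index; NaN marks missing data.
idx = pd.date_range("2022-05-01", periods=4, freq="1min")
qtn = pd.Series([10.0, np.nan, np.nan, np.nan], index=idx)
spc = pd.Series([11.0, 12.0, np.nan, np.nan], index=idx)
span = pd.Series([9.0, 9.5, 13.0, np.nan], index=idx)

merged = merge_density(qtn, spc, span)
print(merged.tolist())  # [10.0, 12.0, 13.0, nan]
```

With `must_have_qtn` enabled, the fallback to SPC and SPAN would not be enough on its own, since QTN data must then exist for the interval.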