## Description of Notebook

This notebook takes a closer look at the velocity distribution functions (VDFs) and the fields files in order to identify stable and unstable periods. It also prepares the data for machine learning. The partitioning into training and test data sets is not done here; it is handled separately.

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
import kineticsim_reader as kr
import pickle
import os
import random
from scipy.signal import savgol_filter
from tqdm import tqdm
from matplotlib.animation import FuncAnimation

In [2]:
simfiles = ['particles.d11_A0.5Hepp_beta0.5eps1e-4_256',\
    'particles.d11_A0.75Hepp_beta1_256',\
    'particles.d11_E11Ap3.3Aa2.0Vd0.42',\
    'particles.d11_E11Ap4.3Aa1.6',\
    'particles.d11_E11Ap4.3Aa1.6Vd0.32',\
    'particles.d11_E12Ap1.86Aa1.0Vd0.32_256_256x256',\
    'particles.d11_E12Ap1.86Aa1.0Vd0.32_512_256x256',\
    'particles.d11_He++A10_256_iden0eps0',\
    'particles.d11_He++v2_256_iden0eps1e-4t600',\
    'particles.d11_He++vd1.5_256_iden0eps1e-4',\
    'particles.d11_pv1.5_128_64_iden0eps1e-4_dx0.75_long',\
    'particles.d11_pv1Ap2Apb2betac0.214betab0.858_128_128x2_dx0.75_t3000',\
    'particles.d11_pv2a_128x3_iden0eps1e-4_dx0.75',\
    'particles.d11_pv2Ap1Ab1betac0.429betab0.858_128_128x2_dx0.75_t3000',\
    'particles.d11_pv2Ap1Ab2betac0.429betab0.858_128_128x2_dx0.75_t3000',\
    'particles.d11_pv2Ap2Apb2betac0.214betab0.858_128_128x2_dx0.75_t3000',\
    'particles.d11_pv2av2.3_128x3_iden0eps1e-4_dx0.75',\
    'particles.d11_pv2av2Ap1Aa1beta0.429_128_128x2_dx0.75_t3000',\
    'particles.d11_pv2av2_rdna0.03375_128x3_iden0eps1e-4_dx0.75_t6000',\
    'particles.d11_vap1.2Ap1Aa0.75_rdna_0.05',\
    'particles.d11_vap1.2Ap3.35Aa2.05rdna_0.007',\
    'particles.d11_vap1.5Ap1.5Aa1rdna_0.007']

fldfiles = ['fields.d10_A0.5Hepp_beta0.5eps1e-4_256',\
    'fields.d10_A0.75Hepp_beta1_256',\
    'fields.d10_E11Ap3.3Aa2.0Vd0.42',\
    'fields.d10_E11Ap4.3Aa1.6',\
    'fields.d10_E11Ap4.3Aa1.6Vd0.32',\
    'fields.d10_E12Ap1.86Aa1.0Vd0.32_256_256x256',\
    'fields.d10_E12Ap1.86Aa1.0Vd0.32_512_256x256',\
    'fields.d10_He++A10_256_iden0eps0',\
    'fields.d10_He++v2_256_iden0eps1e-4t600',\
    'fields.d10_He++vd1.5_256_iden0eps1e-4',\
    'fields.d10_pv1.5_128_64_iden0eps1e-4_dx0.75_long',\
    'fields.d10_pv1Ap2Apb2betac0.214betab0.858_128_128x2_dx0.75_t3000',\
    'fields.d10_pv2a_128x3_iden0eps1e-4_dx0.75',\
    'fields.d10_pv2Ap1Ab1betac0.429betab0.858_128_128x2_dx0.75_t3000',\
    'fields.d10_pv2Ap1Ab2betac0.429betab0.858_128_128x2_dx0.75_t3000',\
    'fields.d10_pv2Ap2Apb2betac0.214betab0.858_128_128x2_dx0.75_t3000',\
    'fields.d10_pv2av2.3_128x3_iden0eps1e-4_dx0.75',\
    'fields.d10_pv2av2Ap1Aa1beta0.429_128_128x2_dx0.75_t3000',\
    'fields.d10_pv2av2_rdna0.03375_128x3_iden0eps1e-4_dx0.75_t6000',\
    'fields.d10_vap1.2Ap1Aa0.75_rdna_0.05',\
    'fields.d10_vap1.2Ap3.35Aa2.05rdna_0.007',\
    'fields.d10_vap1.5Ap1.5Aa1rdna_0.007']

## More inspection needed

## Determination of stable and unstable VDFs

How to separate stable from unstable VDFs remains an open question overall. Inspection of the VDFs and the time-dependent parameters revealed cases in which the magnetic energy remains stable while the anisotropies and temperatures still change or are exchanged.

Given this uncertainty in the definition, we will implement several labeling approaches. First, separate labelings will be developed for the magnetic energy and the anisotropy using the following thresholds:
1. The simulation runs are classified: the VDFs are called unstable if the anisotropies or the perpendicular magnetic energy change by more than 0.1% per gyroperiod.
2. The simulation runs are classified: the VDFs are called unstable if the anisotropies or the perpendicular magnetic energy change by more than 0.5% per gyroperiod.
3. The simulation runs are classified: the VDFs are called unstable if the anisotropies or the perpendicular magnetic energy change by more than 1.0% per gyroperiod.
4. The simulation runs are NOT classified. Instead, a regression problem will be solved for both the anisotropies and the magnetic energies.
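
The three classification thresholds above could be applied side by side as in the minimal sketch below. It assumes that the time axis is expressed in gyroperiods and that the relative rate of change is normalized by the midpoint value of consecutive samples. The helper names `relative_rate` and `label_unstable` and the toy series are hypothetical and are not part of the notebook.

```python
import numpy as np

def relative_rate(values, times):
    """Relative rate of change between consecutive samples (midpoint-normalized)."""
    values = np.asarray(values, dtype=float)
    times = np.asarray(times, dtype=float)
    rate = np.diff(values) / np.diff(times)
    return 2.0 * rate / (values[1:] + values[:-1])

def label_unstable(rates, threshold):
    """Return 1 where any quantity changes faster than the threshold, else 0."""
    stacked = np.abs(np.vstack(rates))
    return (stacked > threshold).any(axis=0).astype(int)

# Hypothetical toy series (not simulation output); time assumed in gyroperiods
times = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
anisotropy = np.array([2.0, 1.98, 1.97, 1.968, 1.968])
me_perp = np.ones(5)

rates = [relative_rate(anisotropy, times), relative_rate(me_perp, times)]

# One label array per threshold: 0.1%, 0.5% and 1.0% per gyroperiod
thresholds = {"01": 0.001, "05": 0.005, "10": 0.010}
labels = {name: label_unstable(rates, thr) for name, thr in thresholds.items()}

for name, lab in labels.items():
    print(name, lab)
```

Keeping one label array per threshold makes it easy to compare how sensitive the stable/unstable split is to the chosen cutoff before committing to one of them.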

The figures look reasonable, except for the following cases, where the labeling has to be adjusted:
- Simulation run 'particles.d11_A0.5Hepp_beta0.5eps1e-4_256': magnetic energy should become stable after the first 4 simulation data points. The oscillations in magnetic energy are numerical.
- Simulation run 'particles.d11_A0.75Hepp_beta1_256': magnetic energy should become stable after the first 2 simulation data points. The oscillations in magnetic energy are numerical.
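
Numerical oscillations of this kind can be suppressed before any rate of change is computed. The sketch below illustrates the idea with `scipy.signal.savgol_filter` on a synthetic series; the helper name `smooth_energy`, the window fraction of one tenth, and the test signal are assumptions made for illustration only.

```python
import numpy as np
from scipy.signal import savgol_filter

def smooth_energy(energy, fraction=0.1, polyorder=3):
    """Savitzky-Golay smoothing with a window of about `fraction` of the series."""
    energy = np.asarray(energy, dtype=float)
    window = int(len(energy) * fraction)
    if window % 2 == 0:
        window += 1  # keep the window length odd
    return savgol_filter(energy, window, polyorder)

# Synthetic example (not simulation output): slow linear trend plus a fast oscillation
t = np.linspace(0.0, 100.0, 501)
trend = 1.0 + 0.01 * t
signal = trend + 0.05 * np.sin(2.0 * np.pi * t / 2.0)

smoothed = smooth_energy(signal)

# Compare the deviation from the trend away from the edges of the series
interior = slice(60, -60)
err_raw = np.max(np.abs(signal - trend)[interior])
err_smooth = np.max(np.abs(smoothed - trend)[interior])
print(err_raw, err_smooth)
```

The window has to span several oscillation periods for the filter to be effective, so a fixed fraction of the series length is only a reasonable default when the oscillation is fast compared with the run duration.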

In [11]:
def prepare_mldata_vdfmoments(simfile, fieldsfile):
    
    timep_array = np.load('./processing_results/' + simfile + '.timep_array.npy')
    anisotropies_p = np.load('./processing_results/' + simfile + '.anisotropies_p.npy')
    moments_p = np.load('./processing_results/' + simfile + '.moments_p.npy')
    anisotropies_he = np.load('./processing_results/' + simfile + '.anisotropies_he.npy')
    moments_he = np.load('./processing_results/' + simfile + '.moments_he.npy')
    
    if (simfile == 'particles.d11_pv1.5_128_64_iden0eps1e-4_dx0.75_long'):
        vdfp_array_p1 = np.load('./processing_results/' + simfile + '_p1.vdfp_array.npy')
        vdfhe_array_p1 = np.load('./processing_results/' + simfile + '_p1.vdfhe_array.npy')
        vdfp_array_p2 = np.load('./processing_results/' + simfile + '_p2.vdfp_array.npy')
        vdfhe_array_p2 = np.load('./processing_results/' + simfile + '_p2.vdfhe_array.npy')
        vdfp_array = np.concatenate((vdfp_array_p1, vdfp_array_p2))
        vdfhe_array = np.concatenate((vdfhe_array_p1, vdfhe_array_p2))
    else:
        vdfp_array = np.load('./processing_results/' + simfile + '.vdfp_array.npy')
        vdfhe_array = np.load('./processing_results/' + simfile + '.vdfhe_array.npy')
    
    timep_array_fields = np.load('./processing_results/' + fieldsfile + '.timing.npy')[:,1]
    me_perp = np.load('./processing_results/' + fieldsfile + '.me_perp.npy')
    me_tot = np.load('./processing_results/' + fieldsfile + '.me_tot.npy')
    # applying Savitzky-Golay smoothing to remove a periodic signal;
    # the window is about one tenth of the series length and is kept odd
    npoints = int(len(me_tot)/10)
    if npoints % 2 == 0:
        npoints = npoints + 1
    me_tot = savgol_filter(me_tot - me_tot[0], npoints, 3)
    me_perp = savgol_filter(me_perp, npoints, 3)
    
    # resampling to the timing of the VDFs
    me_tot = np.interp(timep_array,timep_array_fields,me_tot)
    me_perp = np.interp(timep_array,timep_array_fields,me_perp)
    
    # time derivatives (relative)
    dt_anisotropies_p = (anisotropies_p[1:]-anisotropies_p[:-1])/(timep_array[1:]-timep_array[:-1])
    dt_anisotropies_p = 2*(dt_anisotropies_p)/(anisotropies_p[1:]+anisotropies_p[:-1])
    dt_anisotropies_he = (anisotropies_he[1:]-anisotropies_he[:-1])/(timep_array[1:]-timep_array[:-1])
    dt_anisotropies_he = 2*(dt_anisotropies_he)/(anisotropies_he[1:]+anisotropies_he[:-1])
    dt_me_perp = (me_perp[1:]-me_perp[:-1])/(timep_array[1:]-timep_array[:-1])
    dt_me_perp = 2*(dt_me_perp)/(me_perp[1:]+me_perp[:-1])
    dt_me_tot = (me_tot[1:]-me_tot[:-1])/(timep_array[1:]-timep_array[:-1])
    dt_me_tot = 2*(dt_me_tot)/(me_tot[1:]+me_tot[:-1])
    
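    # declaring feature vectors and labels
    featurevector_allmoments = []
    labels_allmoments = []     # labels filled in the loop below (0.1% threshold)
    labels_allmoments_01 = []  # placeholders for the per-threshold labelings, not filled yet
    labels_allmoments_05 = []
    labels_allmoments_10 = []
    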
    for i in range(len(timep_array) - 1):
        # moments_p[i, 0:4, 0:2] flattened in row-major order: [0,0], [0,1], [1,0], ..., [3,1]
        subvector = list(moments_p[i, :4, :2].ravel())
        # same layout for moments_he
        subvector.extend(moments_he[i, :4, :2].ravel())
        subvector.append(anisotropies_p[i])
        subvector.append(anisotropies_he[i])
        
        # HERE IS A PLACE FOR MORE DESCRIPTORS
        
        
        subvector.append(np.log10(np.sum(vdfp_array[i,:,:])))
        subvector.append(np.log10(np.sum(vdfhe_array[i,:,:])))
        subvector.append(1)
        
        
        # HERE IS A PLACE FOR MORE LOGIC ON THRESHOLDS
        
        # omitting initial time moments for some simulations (labeled 0 for t < 200)
        early_stable_runs = ('particles.d11_A0.5Hepp_beta0.5eps1e-4_256',
                             'particles.d11_A0.75Hepp_beta1_256',
                             'particles.d11_vap1.2Ap1Aa0.75_rdna_0.05',
                             'particles.d11_vap1.2Ap3.35Aa2.05rdna_0.007',
                             'particles.d11_vap1.5Ap1.5Aa1rdna_0.007')
        if (simfile in early_stable_runs) and (timep_array[i] < 200):
            labels_allmoments.append(0)
            featurevector_allmoments.append(subvector)
            continue
        # instability condition: any relative rate of change above 0.001
        unstable = ((np.abs(dt_anisotropies_p[i]) > 0.001)
                    or (np.abs(dt_anisotropies_he[i]) > 0.001)
                    or (np.abs(dt_me_perp[i]) > 0.001))
        labels_allmoments.append(1 if unstable else 0)
        featurevector_allmoments.append(subvector)
            
            
    featurevector_allmoments = np.array(featurevector_allmoments, dtype='float')
    labels_allmoments = np.array(labels_allmoments, dtype='float')
    # note: timep_array is returned in full, so it has one more entry than the labels
    return featurevector_allmoments, labels_allmoments, timep_array
    
    
simfiles = ['particles.d11_A0.5Hepp_beta0.5eps1e-4_256',\
            'particles.d11_A0.75Hepp_beta1_256',\
            'particles.d11_E11Ap3.3Aa2.0Vd0.42',\
            'particles.d11_E11Ap4.3Aa1.6',\
            'particles.d11_E11Ap4.3Aa1.6Vd0.32',\
            'particles.d11_E12Ap1.86Aa1.0Vd0.32_256_256x256',\
            'particles.d11_E12Ap1.86Aa1.0Vd0.32_512_256x256',\
            'particles.d11_He++A10_256_iden0eps0',\
            'particles.d11_He++v2_256_iden0eps1e-4t600',\
            'particles.d11_He++vd1.5_256_iden0eps1e-4',\
            'particles.d11_pv1.5_128_64_iden0eps1e-4_dx0.75_long',\
            'particles.d11_pv2a_128x3_iden0eps1e-4_dx0.75',\
            'particles.d11_pv2av2_rdna0.03375_128x3_iden0eps1e-4_dx0.75_t6000',\
            'particles.d11_pv2av2.3_128x3_iden0eps1e-4_dx0.75',\
            'particles.d11_vap1.2Ap1Aa0.75_rdna_0.05',\
            'particles.d11_vap1.2Ap3.35Aa2.05rdna_0.007',\
            'particles.d11_vap1.5Ap1.5Aa1rdna_0.007']

fldfiles = ['fields.d10_A0.5Hepp_beta0.5eps1e-4_256',\
            'fields.d10_A0.75Hepp_beta1_256',\
            'fields.d10_E11Ap3.3Aa2.0Vd0.42',\
            'fields.d10_E11Ap4.3Aa1.6',\
            'fields.d10_E11Ap4.3Aa1.6Vd0.32',\
            'fields.d10_E12Ap1.86Aa1.0Vd0.32_256_256x256',\
            'fields.d10_E12Ap1.86Aa1.0Vd0.32_512_256x256',\
            'fields.d10_He++A10_256_iden0eps0',\
            'fields.d10_He++v2_256_iden0eps1e-4t600',\
            'fields.d10_He++vd1.5_256_iden0eps1e-4',\
            'fields.d10_pv1.5_128_64_iden0eps1e-4_dx0.75_long',\
            'fields.d10_pv2a_128x3_iden0eps1e-4_dx0.75',\
            'fields.d10_pv2av2_rdna0.03375_128x3_iden0eps1e-4_dx0.75_t6000',\
            'fields.d10_pv2av2.3_128x3_iden0eps1e-4_dx0.75',\
            'fields.d10_vap1.2Ap1Aa0.75_rdna_0.05',\
            'fields.d10_vap1.2Ap3.35Aa2.05rdna_0.007',\
            'fields.d10_vap1.5Ap1.5Aa1rdna_0.007']

featurevector_allmoments, labels_allmoments, timep_array = prepare_mldata_vdfmoments(simfiles[0], fldfiles[0])
print("ML data for the simulation " + simfiles[0] + " generated")
print("Number of data points: " + str(len(labels_allmoments)))
featurevector_allmoments_all = np.copy(featurevector_allmoments)
labels_allmoments_all = np.copy(labels_allmoments)
timep_array_all = np.copy(timep_array)
np.save('./mldata_vdfmoments/'+simfiles[0]+'.mldata_moments.npy', featurevector_allmoments)
np.save('./mldata_vdfmoments/'+simfiles[0]+'.mldata_labels.npy', labels_allmoments)
np.save('./mldata_vdfmoments/'+simfiles[0]+'.mldata_timep.npy', timep_array)

for i in range(1, len(simfiles)):
    featurevector_allmoments, labels_allmoments, timep_array = prepare_mldata_vdfmoments(simfiles[i], fldfiles[i])
    print("ML data for the simulation " + simfiles[i] + " generated")
    print("Number of data points: " + str(len(labels_allmoments)))
    np.save('./mldata_vdfmoments/'+simfiles[i]+'.mldata_moments.npy', featurevector_allmoments)
    np.save('./mldata_vdfmoments/'+simfiles[i]+'.mldata_labels.npy', labels_allmoments)
    np.save('./mldata_vdfmoments/'+simfiles[i]+'.mldata_timep.npy', timep_array)
    featurevector_allmoments_all = np.concatenate((featurevector_allmoments_all, featurevector_allmoments))
    labels_allmoments_all = np.concatenate((labels_allmoments_all, labels_allmoments))
    timep_array_all = np.concatenate((timep_array_all, timep_array))
    
np.save('./mldata_vdfmoments/allsimulations.mldata_moments.npy', featurevector_allmoments_all)
np.save('./mldata_vdfmoments/allsimulations.mldata_labels.npy', labels_allmoments_all)
np.save('./mldata_vdfmoments/allsimulations.mldata_timep.npy', timep_array_all)

ML data for the simulation particles.d11_A0.5Hepp_beta0.5eps1e-4_256 generated
Number of data points: 80
ML data for the simulation particles.d11_A0.75Hepp_beta1_256 generated
Number of data points: 48
ML data for the simulation particles.d11_E11Ap3.3Aa2.0Vd0.42 generated
Number of data points: 48
ML data for the simulation particles.d11_E11Ap4.3Aa1.6 generated
Number of data points: 48
ML data for the simulation particles.d11_E11Ap4.3Aa1.6Vd0.32 generated
Number of data points: 48
ML data for the simulation particles.d11_E12Ap1.86Aa1.0Vd0.32_256_256x256 generated
Number of data points: 50
ML data for the simulation particles.d11_E12Ap1.86Aa1.0Vd0.32_512_256x256 generated
Number of data points: 50
ML data for the simulation particles.d11_He++A10_256_iden0eps0 generated
Number of data points: 48
ML data for the simulation particles.d11_He++v2_256_iden0eps1e-4t600 generated
Number of data points: 92
ML data for the simulation particles.d11_He++vd1.5_256_iden0eps1e-4 generated
Number of d