### Prerequisites

Before running this notebook, ensure the following:

- The files `classes.ipynb` and `utilities.py` are located in the same directory as this notebook.
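- The following third-party libraries are installed in your Python environment:

  - `numpy`
  - `seaborn`
  - `matplotlib`
  - `mplcursors`
  - `pandas`
  - `scipy`

- The notebook also uses the standard-library modules `importlib`, `bisect`, `sys`, `pickle`, `time`, and `random`. These ship with Python and need no installation.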

In [None]:
import importlib
import utilities
%run classes.ipynb

import bisect
import pickle
import random
import sys
import time

import numpy as np
import matplotlib.pyplot as plt
import mplcursors
import pandas as pd
import scipy.signal as sig
from matplotlib.lines import Line2D
from scipy.fft import fft, fftfreq, fftshift, rfft, rfftfreq
from scipy.stats import linregress


# Please indicate here the path to your data

In [None]:
set_path("/Users/gangler/data/GRS1915+105/classified_lcs")  # <my_path>

### Dataset

The dataset used in this notebook can be downloaded from the following link:  
[GRS 1915+105 Hand-Annotated RXTE Light Curves](https://figshare.com/articles/dataset/GRS_1915_105_Hand-Annotated_RXTE_Light_Curves/4220409?file=6886539)

For more details about the dataset, please refer to the `README.md` file.

> **Note (TODO):** Documentation for the dataset will be added to the README or to a dedicated notebook.

---

### Precomputed Mappings

The JSON files corresponding to the dictionaries `class2key` and `key2class` are already included in this repository.

- `class2key.json`: maps each class label to a list of light curve indices belonging to that class.
- `key2class.json`: maps each light curve index to its corresponding class label.

These files were generated from the dataset annotations and are provided here for convenience. You can load them directly to avoid rebuilding these mappings from scratch.
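
As a minimal sketch of how the two mappings relate, the snippet below inverts a `class2key`-style dictionary into a `key2class`-style one. The toy labels and indices are invented for illustration, and `invert_class2key` is a hypothetical helper, not part of `utilities`. It also shows that JSON object keys are always strings, which matters if the indices are used as integer keys elsewhere.

```python
import json

def invert_class2key(class2key):
    """Build an index -> label mapping from a label -> [indices] mapping."""
    key2class = {}
    for label, indices in class2key.items():
        for index in indices:
            key2class[index] = label
    return key2class

# Toy data, invented for illustration only.
toy_class2key = {"rho": [181, 432], "kappa": [7]}
toy_key2class = invert_class2key(toy_class2key)

# JSON object keys are always strings, so integer keys come back as strings
# after a dump/load round trip.
round_tripped = json.loads(json.dumps(toy_key2class))
```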


In [None]:
key2class=utilities.loadjson("key2class")
class2key=utilities.loadjson("class2key")

### Visualization Setup

Now, we initialize an object from the `Visualizer` class. This class contains all the plotting functions and is completely **independent** from the data computations.

In [None]:
visualizer=Visualizer()

### Data Loading and Normalization

The following code performs two main tasks:

1. **Load and store data efficiently:**  
   To avoid loading data multiple times, we create a dictionary `data` that stores the `Data` objects for all light curve indices in the `"rho"` class.

2. **Normalize the data:**  
   For each light curve in the `"rho"` class, we calculate a normalization factor called `alpha` using a utility function. Then, we create a dictionary `normalizeddata` that stores the normalized data corresponding to each light curve index.

This approach ensures that data loading and normalization are done once, improving efficiency when working with the dataset.
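
A minimal sketch of the normalization step, assuming that `alpha` is the reciprocal of the standard deviation (as the Shaplet parameter notes in this notebook describe it) and that normalization subtracts the mean before scaling. The helper names and the count values are hypothetical; the actual `utilities.findalpha` and `normalizedData` may be implemented differently.

```python
import numpy as np

def find_alpha_sketch(series):
    """Return 1 / (standard deviation) of the series (assumed definition of alpha)."""
    return 1.0 / np.std(series)

def z_normalize_sketch(series):
    """Subtract the mean and scale by alpha, giving zero mean and unit variance."""
    series = np.asarray(series, dtype=float)
    return (series - series.mean()) * find_alpha_sketch(series)

counts = np.array([10.0, 12.0, 14.0, 16.0, 18.0])  # made-up values
normalized = z_normalize_sketch(counts)
```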

In [None]:
# Store the Data objects in one dictionary so that each light curve is loaded only once.
data = {}
for i in class2key["rho"]:
    data[i] = Data(i)

# Store the normalization factor (called alpha) and the normalized data for each light curve.
alpha = {}
normalizeddata = {}
for i in class2key["rho"]:
    alpha[i] = utilities.findalpha(data[i].getcolumn("low"))
    normalizeddata[i] = normalizedData(i)

## Data

### Before starting, make sure that the name of the data storage file in the classes notebook is correct

In [None]:
lc181=data[181]

In [None]:
# Get the "low" column:
lc181low=lc181.low()

In [None]:
# Plot the "low" column of light curve 181:
utilities.show(lc181low)

## Shaplets

In [None]:
# Shaplet parameters:
#   lcnumber, column, start, length, alpha=-1, mean=-1, isaclean=False, series=None, isnormalized=False
#
# alpha is the coefficient used for z-normalization, 1 / (standard deviation).
# mean is the mean of the shaplet.
# alpha and mean are stored in the shaplet so that, if it is normalized,
# the original parameters can be retrieved.
#
# isaclean is set to True if the shaplet is averaged.
# isnormalized is set to True if the shaplet is normalized.
#
# For this tutorial, every parameter that has a default is left at its default value.
sh = Shaplet(181, "low", 0, 300)

## Comparator

If one class in the classes notebook is essential, it is this one.
This is where all the computations are done, mainly comparing a shaplet against a column of a light curve.
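
To illustrate the kind of comparison involved, here is a minimal sketch that slides a shaplet along a series and computes the mean squared difference at every offset. This is an assumption about the general idea only: the actual `Comparator.chi2` is defined in `classes.ipynb` and may use a different normalization or weighting. The function name and the toy arrays are hypothetical.

```python
import numpy as np

def sliding_chi2_sketch(series, shaplet):
    """Mean squared difference between the shaplet and every same-length window."""
    series = np.asarray(series, dtype=float)
    shaplet = np.asarray(shaplet, dtype=float)
    # One row per offset; each row is a window of len(shaplet) samples.
    windows = np.lib.stride_tricks.sliding_window_view(series, len(shaplet))
    return np.mean((windows - shaplet) ** 2, axis=1)

series = np.array([0.0, 1.0, 2.0, 1.0, 0.0, 1.0, 2.0, 1.0])  # toy light curve
shaplet = np.array([1.0, 2.0, 1.0])                          # toy shaplet
chi2_curve = sliding_chi2_sketch(series, shaplet)
```

Offsets where the curve drops to a small value are the positions where the series resembles the shaplet.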


In [None]:
# Comparator takes (Data, column, shaplet): a Data object, the column of the data
# on which the comparison is done, and a Shaplet object.
comp = Comparator(lc181, "low", sh)

# Main Comparator functionality:

# Get the chi2 difference:
chi2 = comp.chi2()

# Get the minima:
minimas = comp.minimas()

# Find the positions of the minima:
minimaspositions = comp.minimaspositions()

# NOTE: The most costly operation is the chi2 computation. Once it is done, the result is
#       saved in the Comparator object. minimaspositions() and minimas() can be called
#       directly; they run the chi2 computation themselves.
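
One way to locate minima in a chi2 curve is `scipy.signal.argrelmin`, which the dictionary-building code in this notebook also uses. The sketch below runs it on a made-up curve; whether `Comparator.minimas` and `Comparator.minimaspositions` work this way is an assumption, since their implementation lives in `classes.ipynb`.

```python
import numpy as np
from scipy.signal import argrelmin

chi2_curve = np.array([5.0, 3.0, 1.0, 3.0, 5.0, 4.0, 2.0, 4.0, 6.0])  # made-up values

# order=2: a point counts as a minimum only if it is strictly smaller than
# the 2 neighbours on each side.
minima_positions = argrelmin(chi2_curve, order=2)[0]
minima_values = chi2_curve[minima_positions]
```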

## Visualizer: 

In [None]:
# Once the chi2 is computed, we superpose the shaplet on the light curve at the points of resemblance.
# For this, we pass the comparator to the superpose function of the visualizer.
visualizer.superpose(comp)

## Dictionary building

In [None]:
# Code for dictionary building, shaplet definition, and light curve encoding analysis.

In [None]:
# Length of the shaplet; this is one of the free parameters.
# For more about the free parameters, see the free parameters table below.
shapletlength = 300

# List of shaplets to be filled:
shaplets: list[Shaplet] = []

# Initial shaplet (shaplet 0). Any shaplet can be chosen: lcnumber and start can be picked at random.
# The algorithm should work regardless of the initial shaplet.
lcnumber = 432
start = 0
shaplet0 = Shaplet(lcnumber, "low", start, shapletlength, isnormalized=True)

In [None]:
utilities.show(shaplet0.series)
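
The selection step inside the dictionary-building function that follows can be sketched on toy data: take the pointwise minimum of all chi2 curves, mask a neighbourhood around each local minimum, and pick the position with the largest remaining value. This is a simplified illustration, not the notebook's function: it handles a single light curve, does not accumulate avoidance regions across calls, and the function name and toy curves are hypothetical.

```python
import numpy as np
from scipy.signal import argrelmin

def pick_worst_covered_position(chi2_curves, order, avoidance_size):
    """Return (position, value) of the largest envelope value outside avoidance regions."""
    # Envelope: at each position, the smallest chi2 over all shaplets so far.
    envelope = np.min(np.asarray(chi2_curves, dtype=float), axis=0)

    # Local minima of the envelope are positions that are already well matched.
    minima = argrelmin(envelope, order=order)[0]

    # Mask a half-open neighbourhood around each minimum, mirroring range(lo, hi).
    allowed = np.ones(len(envelope), dtype=bool)
    for m in minima:
        allowed[max(0, m - avoidance_size):min(m + avoidance_size, len(envelope))] = False

    if not allowed.any():
        return None
    candidates = np.flatnonzero(allowed)
    best = candidates[np.argmax(envelope[candidates])]
    return int(best), float(envelope[best])

# Toy chi2 curves for two shaplets, invented for illustration.
curve_a = [4, 1, 4, 9, 8, 7, 6, 5, 6]
curve_b = [5, 2, 5, 6, 9, 9, 9, 1, 9]
result = pick_worst_covered_position([curve_a, curve_b], order=1, avoidance_size=1)
```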

In [None]:
# Dictionary tracking, for each light curve, the intervals where shaplets were actually taken.
# Possible optimization: store only the lower and upper bounds of each avoidance region
# and check whether a point falls inside one of these intervals, instead of storing every index.

from scipy.signal import argrelmin
from collections import defaultdict

avoidanceregion = {}

def builddictionarynew(shaplets: list, k: list,comparators: dict):
    print("Size of the dictionary:", len(shaplets))

    if len(shaplets) > 50:
        print("More than 50 shaplets reached.")
        return

    global selected_intervals
    key = -1
    best_shaplet = None
    best_chi2_list = None
    best_interval = None

    for i in class2key["rho"]:
        
        if i not in comparators:
            comparators[i] = []

        if i not in avoidanceregion:
            avoidanceregion[i] = set()
            
            
        comparator = Comparator(normalizeddata[i], "low", shaplets[-1])
        comparators[i].append(comparator)

        
        comparatorlist = comparators[i]
        chi2list = [comparator.chi2() for comparator in comparatorlist]
        chi2matrix = np.array(chi2list)

        # Build the combined curve: minchi2 = min(chi2_1, chi2_2, ...) at each position.
        minimumchi2 = chi2matrix.min(axis=0)

        # Find the local minima of the combined curve.
        min_window = shapletlength // 3
        minimaindeces = argrelmin(minimumchi2, order=min_window)[0]
        # Disabled cut at 3:
        # minimaindeces = [i for i in minimaindeces if minimumchi2[i] < 3]

        # Add boundaries if needed (disabled):
        # if len(minimaindeces) < 2:
        #     continue
        # minima = np.insert(minima, 0, 0)
        # minima = np.append(minima, len(minimumchi2) - 1)

        
        # Half-width of the avoidance region around each minimum:
        avoidancesize = len(shaplets[0]) // 3

        for minim in minimaindeces:
            avoidanceregion[i] |= set(range(max(0, minim - avoidancesize), min(minim + avoidancesize, len(minimumchi2))))

        # Positions outside the avoidance regions, and the combined chi2 at those positions.
        # TODO: replace this scan with a set difference.
        valid_indices = [j for j in range(len(minimumchi2)) if j not in avoidanceregion[i]]
        validminchi2 = [minimumchi2[j] for j in valid_indices]

        # Largest combined chi2 outside the avoidance regions (-1 if no position is left).
        chi2max = -1
        if len(validminchi2) > 0:
            local_max_idx_valid = np.argmax(validminchi2)
            chi2max = validminchi2[local_max_idx_valid]
            # Map back to an index in the full minimumchi2 array.
            local_max_idx = valid_indices[local_max_idx_valid]

        # TODO: revisit the disabled cut on minimaindeces (minimumchi2[i] < 3).

        # Skip low-significance results (cut at 3).
        if chi2max < 3:
            continue

        if chi2max > key:
            key = chi2max
            best_chi2_list = chi2matrix
            mean = getmean(i, local_max_idx, shapletlength)
            alpha = utilities.findalpha(data[i].getcolumn("low"))
            best_shaplet = Shaplet(i, "low", local_max_idx, shapletlength, mean=mean, alpha=alpha, isnormalized=True)

    if best_shaplet:
        print("Selected shaplet with chi2:", key)
        shaplets.append(best_shaplet)
        chi2selected.append(best_chi2_list)
        k.append(key)
        builddictionarynew(shaplets, k, comparators)
    else:
        print("No suitable shaplet found in this pass.")
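
The optimization mentioned in the comments above the function (storing only the lower and upper bounds of each avoidance region) could look like the sketch below. It keeps sorted, non-overlapping half-open intervals and uses the standard-library `bisect` module for insertion and lookup. `AvoidanceIntervals` is a hypothetical class, not part of this notebook or of `classes.ipynb`.

```python
import bisect

class AvoidanceIntervals:
    """Sorted, non-overlapping half-open intervals [start, stop)."""

    def __init__(self):
        self.starts = []
        self.stops = []

    def add(self, start, stop):
        """Insert [start, stop), merging with intervals it overlaps or touches."""
        lo = bisect.bisect_left(self.stops, start)   # first interval with stop >= start
        hi = bisect.bisect_right(self.starts, stop)  # one past the last with start <= stop
        if lo < hi:
            start = min(start, self.starts[lo])
            stop = max(stop, self.stops[hi - 1])
        # Replace the merged intervals (or insert, when lo == hi) with the new one.
        self.starts[lo:hi] = [start]
        self.stops[lo:hi] = [stop]

    def __contains__(self, point):
        # Find the last interval starting at or before the point, then check its end.
        i = bisect.bisect_right(self.starts, point) - 1
        return i >= 0 and point < self.stops[i]

regions = AvoidanceIntervals()
regions.add(10, 20)
regions.add(30, 40)
regions.add(18, 32)  # bridges the two intervals above into [10, 40)
```

Membership checks such as `25 in regions` then cost a binary search over the interval bounds instead of a lookup in a set holding every index.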
