# Importing Data

In [1]:
import pandas as pd

In [2]:
FILE_PATH = '/content/drive/MyDrive/01 - Iniciação Científica/02 - Datasets/csv_files/EN2_STAR_CHR_0102890318_20070206T133547_20070402T070302.csv'

In [3]:
df = pd.read_csv(FILE_PATH)
df.head()

Unnamed: 0,DATEBARTT,WHITEFLUXSYS
0,54138.073885,219929.3
1,54138.079811,220816.39
2,54138.085737,220129.64
3,54138.091662,219876.34
4,54138.097588,219744.33


In [4]:
import numpy as np

x = df.DATEBARTT.to_numpy()
y = df.WHITEFLUXSYS.to_numpy()

### Light curve before filtering

In [14]:
import plotly.express as px

fig = px.line(df, x=x, y=y, title='Light Curve before filtering')
fig.show()

# Data normalization

Before applying the Butterworth filter, we have to:
- Create artificial borders
- Multiply the data by $(-1)^{i}$
- Apply zero padding

## 1. Artificial borders

Our first step is to add artificial borders to the array. In our tests we noticed that applying the Butterworth filter can distort the first and last values of the array.

To avoid these unexpected values, we add some points at the beginning and at the end of the array, so that the distortion falls on these artificial borders. Afterwards, we only have to cut them off to return to the original array.

So as not to modify the array too much, the function `artifical_borders` adds 15 points at the beginning with the same value as the first element, `y[0]`, and 15 points at the end with the same value as the last element, `y[-1]`.

In [9]:
def artifical_borders(array, num):
  # Repeat the first value `num` times before the array
  # and the last value `num` times after it.
  aux_pre = np.full(num, array[0], dtype=float)
  aux_pos = np.full(num, array[-1], dtype=float)

  return np.concatenate((aux_pre, array, aux_pos)).ravel()

In [None]:
# Number of points to add at each end of the array

param = 15

In [13]:
x_artifical_borders = artifical_borders(x, param)
y_artifical_borders = artifical_borders(y, param)

print(len(x_artifical_borders) == len(y_artifical_borders)) # Should be True

True
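
As a cross-check, the same edge extension can be reproduced with NumPy's built-in `np.pad` using `mode='edge'`, which repeats the first and last values. The sketch below uses a small made-up array rather than the light curve; `signal` and `num_points` are illustrative names that do not appear elsewhere in this notebook.

```python
import numpy as np

# Toy signal standing in for the flux array (values are made up).
signal = np.array([3.0, 5.0, 4.0, 6.0])
num_points = 2  # number of repeated points added at each end

# mode='edge' repeats the first value before the array
# and the last value after it.
extended = np.pad(signal, num_points, mode='edge')
print(extended)  # [3. 3. 3. 5. 4. 6. 6. 6.]

# Slicing the borders off again recovers the original signal.
restored = extended[num_points:len(extended) - num_points]
print(np.array_equal(restored, signal))  # True
```

Slicing with `len(extended) - num_points` instead of `-num_points` keeps the expression valid even when `num_points` is 0.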


## 2. Multiplying data by $(-1)^{i}$

Next, we multiply each value of the array by the factor $(-1)^{i}$, where $i$ is the position index.

To understand why, we have to look at some properties of the Fourier transform.

It is known that (in the exponents, $i$ is the imaginary unit):
$$
f(x)\cdot e^{i 2 \pi (u_0 x / M)} \iff F(u-u_0) \text{ and } f(x-x_0) \iff F(u)\cdot e^{-i 2 \pi (u x_0 / M)}
$$


where $\iff$ denotes a Fourier transform pair.


Taking $u_{0} = M/2$, we have:
$$
e^{i 2 \pi (u_0x / M)} = e^{i 2 \pi (\frac{Mx}{2M})} = e^{i \pi x}
$$

By Euler's identity, $e^{i \pi} = -1$, so for integer $x$:
$$
e^{i 2 \pi (u_0x / M)} = (-1)^{x}
$$

Therefore, taking also $x_0 = M/2$,

$$
f(x)\cdot (-1)^{x} \iff F(u-M/2) \text{ and } f(x-M/2) \iff F(u) \cdot (-1)^{u}
$$

<br /> Because of these properties, we multiply the data by $(-1)^{i}$ before calculating the Fourier transform in order to center the spectrum: the zero frequency appears at the center of the frequency axis, and higher frequencies lie farther from that center.
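
The shift property can be checked numerically. The sketch below is only an illustration on a short, made-up array of even length: it compares the DFT of the sign-alternated signal with `np.fft.fftshift` applied to the DFT of the original signal, which moves the zero frequency to index $M/2$.

```python
import numpy as np

rng = np.random.default_rng(0)
signal = rng.normal(size=8)                # made-up values, even length
n = np.arange(len(signal))
signs = np.where(n % 2 == 0, 1.0, -1.0)    # (-1)**n for each index n

# DFT of the signal multiplied by (-1)**n ...
spectrum_modulated = np.fft.fft(signal * signs)
# ... versus the ordinary DFT with its halves swapped.
spectrum_shifted = np.fft.fftshift(np.fft.fft(signal))

print(np.allclose(spectrum_modulated, spectrum_shifted))  # True

# The zero-frequency term (the sum of the samples) now sits at index M/2.
print(np.isclose(spectrum_modulated[len(signal) // 2], signal.sum()))  # True
```

The equality holds for even lengths, which is always the case after the zero padding step doubles the array length.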


In [6]:
def multiplying_by_minus_one_to_index(array):
  # Alternating signs +1, -1, +1, ... equal to (-1)**i for each index i
  signs = np.where(np.arange(len(array)) % 2 == 0, 1.0, -1.0)

  return np.asarray(array) * signs

In [15]:
# Only the y array is filtered, so x is not multiplied;
# for consistent naming we still call it x_multiplied

x_multiplied = x_artifical_borders
y_multiplied = multiplying_by_minus_one_to_index(y_artifical_borders)

# Plotting the graph
fig = px.scatter(x=x_multiplied, y=y_multiplied, title='Light Curve Multiplied')
fig.show()

## 3. Zero Padding

A good practice before filtering data with a Butterworth filter is to apply a procedure called zero padding. It consists of extending a signal with zeros, that is, it maps a length-$N$ signal to a length-$M > N$ signal, where $M$ need not be a multiple of $N$.


Zero padding, $f_{zp}$, has the following definition: <br />

<br />
$$
f_{zp}(m) \triangleq \begin{cases}
   f(m), &\text{if } |m| < N/2 \\
   0, &\text{otherwise}
\end{cases}
$$

where $m = 0, \pm1, \pm2, \pm3, \ldots, \pm M_{h}$, with $M_h \triangleq (M-1)/2$ for $M$ odd, and $M_h \triangleq M/2-1$ for $M$ even. The `padding` function below takes the simpler route of appending $N$ zeros after the signal, so that $M = 2N$.

<br /> Multiplying two discrete Fourier transforms corresponds to a circular (periodic) convolution, because the DFT treats the signal as periodic.
Padding keeps this implicit periodicity from showing up in the filtered data: the part of the result that would wrap around falls on the added zeros instead of on the signal.
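
The wraparound effect can be seen on a tiny example. The sketch below uses made-up values for `signal` and `kernel` (neither comes from the light curve) and compares the result of multiplying DFTs, with and without padding, against NumPy's ordinary linear convolution.

```python
import numpy as np

signal = np.array([1.0, 2.0, 3.0, 4.0])   # made-up values
kernel = np.array([1.0, 1.0, 1.0])        # made-up smoothing kernel

# Reference: ordinary (linear) convolution, length 4 + 3 - 1 = 6.
linear = np.convolve(signal, kernel)

# Without padding: the product of length-4 DFTs is a circular
# convolution, so the tail of the result wraps around to the start.
n = len(signal)
circular = np.fft.ifft(np.fft.fft(signal, n) * np.fft.fft(kernel, n)).real

# With padding to twice the length, the tail lands on the added zeros.
m = 2 * n
padded = np.fft.ifft(np.fft.fft(signal, m) * np.fft.fft(kernel, m)).real

print(np.allclose(circular, linear[:n]))           # False: wraparound
print(np.allclose(padded[:len(linear)], linear))   # True
```

Passing a length as the second argument of `np.fft.fft` zero-pads the input to that length before transforming.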

In [16]:
def padding(array):
  return np.append(array, np.zeros(len(array)))

In [45]:
x_padding = padding(x_multiplied)
y_padding = padding(y_multiplied)

print("The original length was:", len(x), ", and now we have:", len(x_padding))

print(len(x_padding) == len(y_padding)) # Should be True

The original length was: 9228 , and now we have: 18516
True


In [29]:
# Plotting the graph

fig = px.scatter(x=x_padding, y=y_padding, title='Light Curve Padded')
fig.show()

# The padding created many points equal to (0, 0)

Now, our data is ready to be filtered.

The next step is to calculate the Fourier Transform.

# Fourier Transform


$$
\hat{f}(\omega) \equiv F(\omega) \equiv \mathcal{F}\{f(t)\} = \int_{-\infty}^{\infty} f(t)\, e^{-j \omega t}\, dt
$$

In [25]:
def fourier_transform(array):
  return np.fft.fft(array)

In [33]:
x_fft = x_padding
y_fft = fourier_transform(y_padding)
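
For sampled data, `np.fft.fft` evaluates the discrete counterpart of the integral above, $X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N}$. The sketch below checks this on a short, made-up array; `naive_dft` is a hypothetical helper written only for this comparison and is far slower than the FFT.

```python
import numpy as np

def naive_dft(values):
  """Evaluate X[k] = sum_n x[n] * exp(-2j*pi*k*n/N) directly (O(N^2))."""
  values = np.asarray(values, dtype=complex)
  size = len(values)
  n = np.arange(size)
  k = n.reshape(-1, 1)
  # Row k of this matrix holds exp(-2j*pi*k*n/N) for every n.
  kernel = np.exp(-2j * np.pi * k * n / size)
  return kernel @ values

samples = np.array([1.0, 0.0, -1.0, 0.5, 2.0])  # made-up values
print(np.allclose(naive_dft(samples), np.fft.fft(samples)))  # True
```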

# Butterworth Filter

Finally, we multiply the result of the Fourier transform by the filter's function, in this case the Butterworth transfer function $H_{n}(j\omega)$.

$$
G_{n}(\omega) = |H_{n}(j\omega)| = \frac{1}{\sqrt{1 + \big(\frac{\omega}{\omega_c}\big)^{2n}}}
$$

where:

- $G_n$ is the gain of an $n$-th order Butterworth low-pass filter
- $H_n$ is the transfer function
- $j$ is the imaginary unit
- $n$ is the order of the filter
- $\omega$ is the angular frequency [rad/s]
- $\omega_c$ is the cutoff frequency [rad/s]

In [35]:
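# Placeholder: the spectrum is passed through unchanged here;
# no Butterworth gain is multiplied in this cell
x_butter = x_fft
y_butter = y_fft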
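
In the cell above, `y_butter` is simply set equal to `y_fft`, and the notebook does not state a cutoff frequency or a filter order. The following is therefore only a hedged sketch of how the gain $G_n$ could be applied to a centered spectrum. The helper `butterworth_gain` and its parameters `cutoff_bins` and `order` are assumptions made for illustration, the cutoff is expressed as a distance from the center in DFT bins rather than in rad/s, and the test signal is made up.

```python
import numpy as np

def butterworth_gain(length, cutoff_bins, order):
  """Low-pass Butterworth gain for a spectrum centered at index length // 2.

  cutoff_bins and order are hypothetical parameters: the cutoff is
  expressed as a distance from the center, in DFT bins.
  """
  distance = np.abs(np.arange(length) - length // 2)
  return 1.0 / np.sqrt(1.0 + (distance / cutoff_bins) ** (2 * order))

# Made-up example: a slow sine plus a fast one, 64 samples (even length).
n = np.arange(64)
slow = np.sin(2 * np.pi * 2 * n / 64)
fast = 0.5 * np.sin(2 * np.pi * 20 * n / 64)
signal = slow + fast
signs = np.where(n % 2 == 0, 1.0, -1.0)   # (-1)**n centers the spectrum

spectrum = np.fft.fft(signal * signs)
gain = butterworth_gain(len(spectrum), cutoff_bins=5.0, order=4)

# Multiply by the gain, transform back, and undo the sign alternation.
filtered = np.real(np.fft.ifft(spectrum * gain)) * signs

print(gain[32])                                  # 1.0 at the center
print(np.allclose(filtered, slow, atol=0.01))    # True: fast sine removed
```

Because the gain depends only on the distance from the center, it is symmetric in frequency, so the inverse transform of a real signal stays real up to round-off.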

# Inverse Fourier Transform

To go back to the time domain, we apply the inverse Fourier transform, $\mathcal{F}^{-1}$, to the product $\mathcal{F}\{f(t)\} \cdot H_{n}(j\omega)$.

$$
f(t) \equiv \mathcal{F}^{-1}\{F(\omega)\} = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(\omega)\, e^{j \omega t}\, d\omega
$$

where:
- $\omega = 2\pi f$ is the angular frequency (here $f$ is the ordinary frequency, not the signal $f(t)$)

In [36]:
def inverse_fourier_transform(array):
  return np.fft.ifft(array)

In [40]:
x_ifft = x_fft
y_ifft = np.real(inverse_fourier_transform(y_butter))

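# Only the real part is kept: for a real input signal, and as long as the
# filter gain is real and symmetric about the center of the spectrum,
# the imaginary part left after the inverse transform is numerical round-off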

Now we need to undo all the modifications made in the Data normalization section.

# Undoing normalization procedures

## Re-multiplying data by $(-1)^{i}$

In [41]:
x1 = x_ifft
y1 = multiplying_by_minus_one_to_index(y_ifft)

## Removing Zero Padding

In [46]:
x2 = x1[:len(x1) // 2]
y2 = y1[:len(y1) // 2]

print("The previous length was:", len(x1), ", and now we have:", len(x2))

print(len(x2) == len(y2)) # Should be True

The previous length was: 18516 , and now we have: 9258
True


## Cutting artificial borders

In [49]:
param = param # still 15; must match the number of border points added earlier

In [50]:
# Drop the `param` artificial points from the start, then from the end
x3 = x2[param:]
x4 = x3[:len(x3) - param]

y3 = y2[param:]
y4 = y3[:len(y3) - param]
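
A quick way to gain confidence in the preparation and undo steps is a round trip with no filtering at all: if the gain is 1 everywhere, the output should match the input. The sketch below is a self-contained illustration on made-up values; `round_trip` is a hypothetical helper that mirrors the steps of this notebook but is not part of it.

```python
import numpy as np

def round_trip(values, num_border):
  """Run every preparation step and its inverse with a unit-gain filter."""
  # 1. Artificial borders: repeat the first and last values.
  extended = np.pad(values, num_border, mode='edge')
  # 2. Multiply by (-1)**i.
  signs = np.where(np.arange(len(extended)) % 2 == 0, 1.0, -1.0)
  modulated = extended * signs
  # 3. Zero padding to twice the length.
  padded = np.append(modulated, np.zeros(len(modulated)))
  # Transform, apply a gain of 1 everywhere (no filtering), transform back.
  recovered = np.real(np.fft.ifft(np.fft.fft(padded) * 1.0))
  # Remove the zero padding, re-multiply by (-1)**i, cut the borders.
  unpadded = recovered[:len(recovered) // 2]
  demodulated = unpadded * signs
  return demodulated[num_border:len(demodulated) - num_border]

flux = np.array([10.0, 12.0, 11.0, 13.0, 12.5, 11.5])  # made-up values
print(np.allclose(round_trip(flux, num_border=3), flux))  # True
```

Any difference between input and output in such a test would point to a bookkeeping error in the padding or slicing rather than to the filter itself.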

# Filtering results

In [52]:
# Plotting the graph

fig = px.line(x=x4, y=y4, title='Light Curve Filtered')
fig.show()