# Quantile Regression instead of Fourier Transform

Anton Antonov   
December 2019
December 2024

-----

## Introduction

As stated in [the previous answer](https://mathematica.stackexchange.com/a/191675/34008):

> I am guessing that you are trying to fit the data to a sum of sinusoids, and use frequency analysis to provide guesses for the parameters.


Below is an answer that uses Quantile Regression for a somewhat brute-force identification of the significant Sin and Cos expansion terms.
(The coefficients of the basis functions found by Quantile Regression are used instead of, say, `Periodogram`.)
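
For readers new to the method, here is a minimal sketch of the loss that quantile regression minimizes, often called the pinball (or check) loss. It is not taken from the Regressionizer package; the function names `pinball_loss`, `total_loss`, and `best_constant` and the sample values are illustrative assumptions. For probability 0.5 the loss is half the absolute deviation, so the best constant fit is a median of the data and a single large outlier does not drag it away.

```python
def pinball_loss(residual, q):
    """Pinball (check) loss of one residual y - f(x) at probability q."""
    return q * residual if residual >= 0 else (q - 1) * residual


def total_loss(ys, c, q):
    """Total pinball loss when every y is predicted by the constant c."""
    return sum(pinball_loss(y - c, q) for y in ys)


def best_constant(ys, q):
    """Brute-force search over the data values for the constant with least loss."""
    return min(sorted(ys), key=lambda c: total_loss(ys, c, q))


# Illustrative data with one large outlier.
ys = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = best_constant(ys, 0.5)
```

Here `median_fit` is the sample median, whereas the mean of the same values is pulled far toward the outlier.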

The computations are done with the [package "Regressionizer"](https://pypi.org/project/Regressionizer/).

-------

## Setup

In [None]:
from Regressionizer import *
from OutlierIdentifiers import *

import numpy as np
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import plotly.subplots as sp

import inspect

In [None]:
template='plotly_dark'
data_color='darkgray'

----

## Data

In [None]:
dirName = "."
fileName = dirName + "/fourier-transform-data.csv.zip"
dfRawData = pd.read_csv(fileName, compression='zip')
dfRawData = dfRawData.sort_values(by=dfRawData.columns[0])

dfRawData

-----

## Procedure outline

1. Make a reference fit with an appropriate B-spline basis.

2. Compute a Quantile Regression fit with a sufficiently large set of Sin/Cos basis functions.

   -  Use suitable ranges for frequency factors and phase offsets.

3. Find the most significant contributors to the fit of step 2.

   - Pick the obvious outliers.

4. Compute a Quantile Regression fit with the Sin/Cos functions found in the previous step.

5. Examine the results and, if needed, repeat steps 2-4 with different function bases or Quantile Regression parameters.
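
The steps above can be sketched on synthetic data as follows. This is only an illustration under stated assumptions: the data is made up, the basis uses integer frequencies without phase offsets, and ordinary least squares (`np.linalg.lstsq`) stands in for the quantile regression fit so that the example depends only on NumPy. The names `fit`, `labels`, and `selected` are hypothetical.

```python
import numpy as np

# Synthetic data (an assumption for illustration): two known sinusoids.
x = np.linspace(0, 2 * np.pi, 200, endpoint=False)
y = 2.0 * np.sin(2 * x) + 0.5 * np.cos(5 * x)

# Step 2: a deliberately large Sin/Cos basis, with labels kept alongside.
labels = ["1"] + [f"{name}({k}x)" for k in range(1, 7) for name in ("sin", "cos")]
funcs = [lambda t: np.ones_like(t)] + [
    (lambda t, k=k, f=f: f(k * t)) for k in range(1, 7) for f in (np.sin, np.cos)
]


def fit(funcs, x, y):
    """Least-squares coefficients; a stand-in for the quantile regression fit."""
    design = np.column_stack([f(x) for f in funcs])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs


coeffs = fit(funcs, x, y)

# Step 3: keep the terms with the largest absolute coefficients.
top = sorted(np.argsort(np.abs(coeffs))[-2:].tolist())
selected = [labels[i] for i in top]

# Step 4: refit with the reduced basis.
coeffs_small = fit([funcs[i] for i in top], x, y)
```

With noise-free data the reduced fit recovers the two generating terms; on real data the selection in step 3 usually needs inspection rather than a fixed count.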


-----

## Fit with B-splines

In this section we do a fit with a B-spline basis for reference.

In [None]:
qrObj = (Regressionizer(dfRawData.to_numpy())
   .echo_data_summary()
   .quantile_regression(70, [0.5,])
   .plot(template = template, width = 1000, height = 400)
   )

In [None]:
qrObj.take_value().show()

Here we take the fitted regression quantile:

In [None]:
qFunc = qrObj.take_regression_quantiles().get(0.5)
qFunc

-----

## Search for a Sin/Cos model

Let us make a large number of basis functions based on the Fourier expansion:

In [None]:
# Constant term plus Sin/Cos pairs over frequency factors h and phase offsets b
bFuncs = [lambda x: 1] + [
    func
    for h in np.arange(1.3, 6, 0.14) for b in np.arange(0, 1.1, 0.5)
    for func in (lambda x, b=b, h=h, f='sin': np.sin(b + h * x),
                 lambda x, b=b, h=h, f='cos': np.cos(b + h * x))]

len(bFuncs)
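
The `b=b, h=h` default arguments in the lambdas above matter: without them every lambda would look up `b` and `h` only when called and would therefore see the last values of the loops. A minimal sketch of the difference, using a made-up list `phases` and the standard `math` module:

```python
import math

phases = [0.0, 0.5, 1.0]

# Late binding: every lambda looks up `b` when called, so all see the last value.
late = [lambda x: math.sin(b + x) for b in phases]

# Default-argument capture: each lambda stores its own `b` at definition time.
bound = [lambda x, b=b: math.sin(b + x) for b in phases]

late_values = [f(0.0) for f in late]
bound_values = [f(0.0) for f in bound]
```

All entries of `late_values` are equal, while `bound_values` holds one distinct value per phase.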

Here is a fit with that basis:

In [None]:
qrObj2 = (Regressionizer(dfRawData.to_numpy())
   .echo_data_summary()
   .quantile_regression_fit(funcs=bFuncs, probs=[0.5,])
   .plot(template = template, width = 1000, height = 400)
   )

In [None]:
qrObj2.take_value().show()

Here we take the regression function from the monad object:

In [None]:
qFunc2 = qrObj2.take_regression_quantiles().get(0.5)
qFunc2
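
The method name `take_lp_solutions` used below suggests that the fit is obtained from a linear program. As an assumption about the approach, not a statement about the package internals, the textbook linear-programming form of quantile regression for data points $(x_i, y_i)$, basis functions $f_1, \dots, f_m$, and probability $q$ is:

$$
\begin{aligned}
\min_{c,\,u,\,v}\quad & \sum_{i=1}^{n} \bigl( q\,u_i + (1-q)\,v_i \bigr) \\
\text{subject to}\quad & \sum_{j=1}^{m} c_j f_j(x_i) + u_i - v_i = y_i, \quad i = 1,\dots,n, \\
& u_i \ge 0,\; v_i \ge 0 .
\end{aligned}
$$

At the optimum, $u_i$ and $v_i$ are the positive and negative parts of the residual at $x_i$, so positive residuals are weighted by $q$ and negative ones by $1-q$. The $c_j$ are the basis coefficients whose magnitudes are examined next.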

Here we can examine the most significant terms of the fit with the Sin/Cos basis:

In [None]:
x = np.arange(len(bFuncs))  # index of each basis function
y = [abs(t) for t in qrObj2.take_lp_solutions()[0]]

# Create the scatter plot
fig = go.Figure()
fig.add_trace(go.Scatter(x=x, y=y, mode='markers', marker=dict(color='Orange', size=10)))

# Update layout
fig.update_layout(title='Absolute values of the fitted coefficients',
                  xaxis_title='Basis function index',
                  yaxis_title='Absolute coefficient',
                  template = template
                  )

# Show the plot
fig.show()
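
Step 3 of the procedure outline suggests picking the obvious outliers among the absolute coefficients. One simple way to do that programmatically is a threshold at the median plus a multiple of the median absolute deviation (MAD). The sketch below uses only the standard library and is not the API of the OutlierIdentifiers package; the function name `top_outlier_positions`, the factor `k=5.0`, and the coefficient values are all illustrative assumptions.

```python
from statistics import median


def top_outlier_positions(values, k=5.0):
    """Indices of values above median + k * MAD (median absolute deviation)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    threshold = med + k * mad
    return [i for i, v in enumerate(values) if v > threshold]


# Made-up absolute coefficients: mostly small, two clearly dominant.
abs_coeffs = [0.02, 0.01, 1.90, 0.03, 0.02, 0.04, 0.75, 0.01, 0.03, 0.02]
positions = top_outlier_positions(abs_coeffs)
```

The returned positions can be used to index the list of basis functions. If more than half of the values coincide, the MAD is zero and the threshold collapses to the median, so the factor or the rule would need adjusting in that case.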


Let us compare the two fits:

In [None]:
# Uniformly spaced x-values over the data range
x = np.linspace(dfRawData.iloc[0, 0], dfRawData.iloc[-1, 0], 100)

# Create the traces
trace1 = go.Scatter(
    x=x,
    y=[qFunc(t) for t in x],
    mode='lines',
    name='B-splines fit',
    line=dict(color='blue')
)

trace2 = go.Scatter(
    x=x,
    y=[qFunc2(t) for t in x],
    mode='lines',
    name='Sin/Cos fit',
    line=dict(color='red')
)

# Create the figure object
fig = go.Figure(data=[trace1, trace2])

# Update layout
fig.update_layout(title='Comparison plot',
                  xaxis_title='x',
                  yaxis_title='y',
                  template = template, width = 1000, height = 400
                  )

# Show the plot
fig.show()

-----

## Re-do the fit with a more informed basis

Here we select the Sin/Cos terms with the largest absolute coefficients:

In [None]:
yAbs = [abs(t) for t in qrObj2.take_lp_solutions()[0]]
pos = np.argsort(yAbs)[-8:]
print("pos: ", pos)
bFuncs3 = [bFuncs[i] for i in pos]

In [None]:
bFuncs3
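
Displaying a list of lambdas shows only function objects, not which frequency and phase each one uses. Because the basis lambdas carry `b`, `h`, and `f` as default arguments, those values can be read back from the standard `__defaults__` attribute. The sketch below builds a small basis in the same way as `bFuncs` (with made-up frequencies and phases, and `math` instead of NumPy); the helper name `describe` is hypothetical.

```python
import math

# A small basis built the same way as `bFuncs`: parameters stored as defaults.
basis = [lambda x: 1] + [
    func
    for h in (1.0, 2.0)
    for b in (0.0, 0.5)
    for func in (
        lambda x, b=b, h=h, f="sin": math.sin(b + h * x),
        lambda x, b=b, h=h, f="cos": math.cos(b + h * x),
    )
]


def describe(func):
    """Return (kind, phase, frequency) read from the lambda's default arguments."""
    if func.__defaults__ is None:  # the constant term has no defaults
        return ("const", None, None)
    b, h, f = func.__defaults__
    return (f, b, h)


descriptions = [describe(basis[i]) for i in (0, 3, 6)]
```

Applied to the selected functions, for example `[describe(f) for f in bFuncs3]`, this would list the kind, phase offset, and frequency factor of each retained term.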

Here is a "manually" made basis, kept for comparison (the fit below uses `bFuncs3`):

In [None]:
bFuncs4 = [
    lambda x: 1, 
    lambda t: np.sin(0. + 1.8*t), lambda t: np.sin(0. + 2.5*t), 
    lambda t: np.cos(1. + 1.8*t), lambda t: np.sin(0.5 + 1.8*t), lambda t: np.cos(1. + 4.1*t)]

Here we do the fit:

In [None]:
qrObj3 = (Regressionizer(dfRawData.to_numpy())
   .echo_data_summary()
   .quantile_regression_fit(funcs=bFuncs3, probs=[0.5,])
   .plot(template = template, width = 1000, height = 400)
   )

In [None]:
qrObj3.take_value().show()

Take the fitted regression quantile:

In [None]:
qFunc3 = qrObj3.take_regression_quantiles().get(0.5)
qFunc3

Again, let us compare with the reference fit:

In [None]:
# Create the traces
trace1 = go.Scatter(
    x=x,
    y=[qFunc(t) for t in x],
    mode='lines',
    name='B-splines fit',
    line=dict(color='blue')
)

trace2 = go.Scatter(
    x=x,
    y=[qFunc3(t) for t in x],
    mode='lines',
    name='Largest Sin/Cos fit',
    line=dict(color='red')
)

# Create the figure object
fig = go.Figure(data=[trace1, trace2])

# Update layout
fig.update_layout(title='Comparison plot',
                  xaxis_title='x',
                  yaxis_title='y',
                  template = template, width = 1000, height = 400
                  )

# Show the plot
fig.show()

-----

## Extension

Here we evaluate the last fit on an x-range that extends to 1.5 times the largest x-value in the data, to see how the Sin/Cos model behaves beyond the data:

In [None]:
# Create the traces (columns are taken by position, as in the rest of the notebook)
trace1 = go.Scatter(
    x=dfRawData.iloc[:, 0].to_numpy(),
    y=dfRawData.iloc[:, 1].to_numpy(),
    mode='markers',
    name='data',
    marker=dict(color=data_color)
)

# Uniformly spaced x-values, up to 1.5 times the largest data x-value
xLonger = np.linspace(dfRawData.iloc[0, 0], 1.5 * dfRawData.iloc[-1, 0], 100)

trace2 = go.Scatter(
    x=xLonger,
    y=[qFunc3(t) for t in xLonger],
    mode='lines',
    name='Largest Sin/Cos fit',
    line=dict(color='red')
)

# Create the figure object
fig = go.Figure(data=[trace1, trace2])

# Update layout
fig.update_layout(title='Extension plot',
                  xaxis_title='x',
                  yaxis_title='y',
                  template = template, width = 1000, height = 400
                  )

# Show the plot
fig.show()

----

## References

[MSE1] ["Fourier Transform to help guess with NonLinearModelFit"](https://mathematica.stackexchange.com/q/191617/34008), 
(2019), 
[Mathematica.StackExchange](https://mathematica.stackexchange.com/).