# Chebyshev polynomials

Anton Antonov  
June 2024   
December 2024

-----

## Introduction

### TL;DR

- Chebyshev polynomials can be computed exactly
- The "Math::Fitting" package produces functors
- The fitting is done with a function basis
- Matrix formulas are used to compute the fit (linear regression)
- A real-life example is shown with weather temperature data

Let us list the full set of features and corresponding packages:

- ["JavaScript::Google::Charts"](https://raku.land/zef:antononcube/JavaScript::Google::Charts)
    - Scatter plots
    - Time series data visualization
- ["Math::Polynomial::Chebyshev"](https://raku.land/zef:antononcube/Math::Polynomial::Chebyshev)
    - Polynomial basis
    - Both recursive and trigonometric methods of computation
    - The recursive method provides exact (bignum) integers for the numerators and denominators
- ["Math::Fitting"](https://raku.land/zef:antononcube/Math::Fitting)
    - Linear regression (i.e. fitting) with function bases
    - Gives functors as results
    - Multiple properties of the functors can be retrieved

- ["Data::TypeSystem"](https://raku.land/zef:antononcube/Data::TypeSystem)
    - Summary of data types

- ["Data::Summarizers"](https://raku.land/zef:antononcube/Data::Summarizers)
    - Summary of data values

-----

## Setup

In [158]:
use Math::Matrix;
use Math::Polynomial::Chebyshev;
use Math::Fitting;

use Data::Reshapers;
use Data::Summarizers;
use Data::Generators;
use Data::Importers;

use JavaScript::D3;
use JavaScript::Google::Charts;

use Hash::Merge;
use LLM::Configurations;

### Google Charts

In [159]:
#% javascript
google.charts.load('current', {'packages':['corechart']});
google.charts.load('current', {'packages':['gauge']});
google.charts.load('current', {'packages':['wordtree']});
google.charts.load('current', {'packages':['geochart']});
google.charts.load('current', {'packages':['table']});
google.charts.load('current', {'packages':['line']});
google.charts.setOnLoadCallback(function() {
    console.log('Google Charts library loaded');
});


In [160]:
my $format = 'html';
my $titleTextStyle = { color => 'Ivory', fontSize => 16 };
my $backgroundColor = '#1F1F1F';
my $legendTextStyle = { color => 'Silver' };
my $legend = { position => "none", textStyle => {fontSize => 14, color => 'Silver'} };

my $hAxis = { title => 'x', titleTextStyle => { color => 'Silver' }, textStyle => { color => 'Gray'}, logScale => False, format => 'decimal'};
my $vAxis = { title => 'y', titleTextStyle => { color => 'Silver' }, textStyle => { color => 'Gray'}, logScale => False, format => 'decimal'};

my $annotations = {textStyle => {color => 'Silver', fontSize => 10}};
my $chartArea = {left => 50, right => 50, top => 50, bottom => 50, width => '90%', height => '90%'};

{bottom => 50, height => 90%, left => 50, right => 50, top => 50, width => 90%}

-------

## Computation granularity

In [161]:
chebyshev-t(3, 0.3)

-0.792
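
The value above can be checked by hand: $T_3(x) = 4 x^3 - 3 x$ gives $4 \cdot 0.027 - 0.9 = -0.792$. Here is a minimal core-Raku sketch of the two computation methods listed in the introduction, the three-term recurrence and the trigonometric formula. The names `cheb-recursive` and `cheb-trig` are hypothetical; this is an illustration, not the implementation used by "Math::Polynomial::Chebyshev".

```raku
# Hypothetical helper: recurrence T_{k+1}(x) = 2 x T_k(x) - T_{k-1}(x),
# starting from T_0(x) = 1 and T_1(x) = x.
sub cheb-recursive(Int:D $n where * >= 0, $x) {
    return 1 if $n == 0;
    my ($prev, $curr) = 1, $x;
    for 2..$n {
        ($prev, $curr) = $curr, 2 * $x * $curr - $prev;
    }
    $curr
}

# Hypothetical helper: T_n(x) = cos(n * acos(x)), valid for -1 <= x <= 1.
sub cheb-trig(Int:D $n where * >= 0, $x) { cos($n * acos($x)) }

say cheb-recursive(3, 0.3);   # -0.792 (exact, since 0.3 is a Rat)
say cheb-trig(3, 0.3);        # a floating point value close to -0.792
```

With a `Rat` argument the recurrence stays in rational arithmetic, while the trigonometric formula always goes through floating point numbers.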

In [162]:
my $k = 12;

# Whatever goes to 'recursive'
my $method = 'recursive'; # 'trig'

my @x = (-1.0, -0.99 ... 1.0);
say '@x.elems : ', @x.elems;

my @data  = @x.map({ [$_, chebyshev-t($k, $_, :$method)]});
my @data1 = chebyshev-t($k, @x);

say deduce-type(@data);
say deduce-type(@data1);

@x.elems : 201
Vector((Any), 201)
Vector((Any), 201)


In [163]:
sink records-summary(@data.map(*.tail) <<->> @data1)

+-------------+
| numerical   |
+-------------+
| Median => 0 |
| Mean   => 0 |
| 3rd-Qu => 0 |
| Min    => 0 |
| Max    => 0 |
| 1st-Qu => 0 |
+-------------+


-----

## Precision

We can compute the exact Chebyshev polynomial values at given points using `FatRat` numbers:

In [164]:
my $v = chebyshev-t(100, <1/4>.FatRat, method => 'recursive')

0.9908630290911637341902191815340830456

In [165]:
say $v.numerator;
say $v.denominator;

2512136227142750476878317151377
2535301200456458802993406410752
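
The denominator above is exactly $2^{101}$. The following core-Raku sketch (the helper `cheb-recursive` is hypothetical, not the package's code) suggests why `FatRat` matters here: with a plain `Rat` argument, Raku by default falls back to floating point `Num` once a denominator no longer fits in 64 bits.

```raku
# Hypothetical helper: recurrence T_{k+1}(x) = 2 x T_k(x) - T_{k-1}(x)
sub cheb-recursive(Int:D $n where * >= 0, $x) {
    return 1 if $n == 0;
    my ($prev, $curr) = 1, $x;
    for 2..$n {
        ($prev, $curr) = $curr, 2 * $x * $curr - $prev;
    }
    $curr
}

# FatRat keeps arbitrary precision numerators and denominators
my $exact = cheb-recursive(100, FatRat.new(1, 4));
say $exact.^name;                     # FatRat
say $exact.denominator == 2 ** 101;   # True

# A plain Rat degrades to Num when the denominator outgrows 64 bits
my $approx = cheb-recursive(100, 1/4);
say $approx.^name;                    # Num
```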


-----

## Plots

### Single polynomial

In [166]:
#%html
my $n = 6;
my @data = chebyshev-t($n, (-1, -0.98 ... 1).List);
js-google-charts('LineChart', @data, 
    title => "Chebyshev-T($n) polynomial", 
    :$titleTextStyle, :$backgroundColor, :$chartArea, :$hAxis, :$vAxis,
    width => 800, 
    div-id => 'poly1', :$format,
    :png-button)

### Basis

In [167]:
my $n = 8;
my @data = (-1, -0.98 ... 1).map(-> $x { [x => $x, |(0..$n).map({ $_.Str => chebyshev-t($_, $x, :$method) }) ].Hash });

deduce-type(@data):tally;

Tuple([Assoc(Atom((Str)), Atom((Int)), 10) => 1, Struct([0, 1, 2, 3, 4, 5, 6, 7, 8, x], [Int, Rat, Rat, Rat, Rat, Rat, Rat, Rat, Rat, Rat]) => 100], 101)

In [168]:
#%html
js-google-charts('LineChart', @data,
    column-names => ['x', |(0..$n)».Str],
    title => 'Chebyshev T polynomials, 0 .. ' ~ $n,
    :$titleTextStyle,
    width => 800, 
    height => 400,
    :$backgroundColor, :$hAxis, :$vAxis,
    legend => merge-hash($legend, %(position => 'right')),
    chartArea => merge-hash($chartArea, %(right => 100)),
    div-id => 'cheb' ~ $n,
    :$format,
    :png-button)

-----

## Text plot

*Text plots always work!*

But we have to convert the data into [long form](https://en.wikipedia.org/wiki/Wide_and_narrow_data) first:

In [169]:
my @dataLong = to-long-format(@data, <x>).sort(*<Variable x>);
deduce-type(@dataLong):tally

Tuple([Struct([Value, Variable, x], [Int, Str, Int]) => 9, Struct([Value, Variable, x], [Int, Str, Rat]) => 100, Struct([Value, Variable, x], [Rat, Str, Rat]) => 800], 909)
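
To make the wide-to-long conversion concrete, here is a minimal core-Raku sketch on a hypothetical two-row dataset. It only mimics the shape of the result (records with the keys `x`, `Variable`, and `Value`); it is not how `to-long-format` from "Data::Reshapers" is implemented.

```raku
# Hypothetical "wide" dataset: one column per polynomial index,
# with T_1(x) = x and T_2(x) = 2x² - 1 evaluated at x = 0 and x = 1
my @wide =
    { x => 0, '1' => 0, '2' => -1 },
    { x => 1, '1' => 1, '2' =>  1 };

# "Long" form: one record per (x, Variable) combination
my @long = @wide.map(-> %row {
    |%row.grep(*.key ne 'x').map({ %( x => %row<x>, Variable => .key, Value => .value ) })
});

say @long.elems;   # 4
```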

Here is a sample:

In [170]:
#% html
@dataLong.pick(10)
==> {.sort(*<Variable x>)}()
==> to-html(field-names => <Variable x Value>)

Variable,x,Value
0,-0.54,1.0
0,0.88,1.0
2,-0.82,0.3448
2,-0.4,-0.68
3,-0.44,0.979264
4,-0.86,-0.5407347
4,0.76,-0.9518259
6,-0.7,0.059968
6,-0.42,0.8572349358
8,0.02,0.9872255836193


In [171]:
to-pretty-table(@dataLong.pick(6), field-names => <Variable x Value>)

+----------+-----------+-----------+
| Variable |     x     |   Value   |
+----------+-----------+-----------+
|    7     |  0.360000 | -0.534332 |
|    6     |  0.380000 |  0.694685 |
|    2     |  0.540000 | -0.416800 |
|    3     |  0.440000 | -0.979264 |
|    6     |  0.360000 |  0.596241 |
|    7     | -0.120000 |  0.745996 |
+----------+-----------+-----------+

Here is the text plot:

In [172]:
my @chebInds = 1, 2, 3, 4;
my @dataLong3 = @dataLong.grep({ $_<Variable>.Int ∈ @chebInds }).classify(*<Variable>).map({ $_.key => $_.value.map(*<x Value>).Array }).sort(*.key)».value;
say @chebInds Z=> <* □ ▽ ❍>; 
text-list-plot(@dataLong3, width => 100, height => 25)

(1 => * 2 => □ 3 => ▽ 4 => ❍)


+----+---------------------+---------------------+---------------------+---------------------+-----+      
|                                                                                                  |      
+    ❍                  ▽▽▽▽▽▽▽▽               ❍❍❍❍❍❍                                       *❍     +  1.00
|     □              ▽▽▽        ▽▽▽         ❍❍❍      ❍❍❍                               ***** □     |      
|      □□          ▽▽              ▽▽     ❍❍            ❍                          ****    □□▽     |      
|     ❍  □        ▽▽                 ▽▽▽ ❍               ❍❍                   *****       □ ▽❍     |      
|         □      ▽                     ❍❍▽                 ❍❍             *****          □         |      
+          □□   ▽                     ❍  ▽▽                  ❍        ****             □□  ▽       +  0.50
|      ❍    □  ▽                     ❍     ▽                  ❍  *****                □   ▽ ❍      |      
|            ▽▽                    ❍❍

-----

## Fitting

Here we generate "measurements data" with noise:

In [173]:
my @temptimelist = 0.1, 0.2 ... 20;
my @tempvaluelist = @temptimelist.map({ sin($_) / $_ }) Z+ (1..200).map({ (3.rand - 1.5) * 0.02 });
my @data1 = @temptimelist Z @tempvaluelist;
@data1 = @data1.deepmap({ .Num });

deduce-type(@data1)

Vector(Vector(Atom((Numeric)), 2), 200)

In [174]:
my @data2 = @data1.map({ my @a = $_.clone; @a[0] = @a[0] / max(@temptimelist); @a });

deduce-type(@data2)

Vector(Vector(Atom((Numeric)), 2), 200)

Here is a plot of that data:

In [175]:
#% html
js-google-charts("Scatter", @data2, 
    title => 'Measurements data with noise',
    :$backgroundColor, :$hAxis, :$vAxis,
    :$titleTextStyle, :$chartArea,
    width => 800, 
    div-id => 'data', :$format,
    :png-button)

Make a function that rescales from $[0, 1]$ to $[-1, 1]$:

In [176]:
my &rescale = { ($_ - 0.5) * 2 };

-> ;; $_? is raw = OUTER::<$_> { #`(Block|3693696912712) ... }

Here is a list of basis functions:

In [177]:
my @basis = (^16).map({ chebyshev-t($_) o &rescale });
@basis.elems

16

**Remark:** The function composition operator `o` is used above: the argument is first rescaled and then passed to the Chebyshev polynomial.
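
As a small illustration of that composition, here is a core-Raku sketch in which a hand-written $T_2(x) = 2 x^2 - 1$ stands in for a basis function. The names `&t2` and `&basis-fn` are hypothetical.

```raku
# Rescale from [0, 1] to [-1, 1], as in the cell above
my &rescale = { ($_ - 0.5) * 2 };

# Stand-in for a basis function: T_2(x) = 2x² - 1 written out by hand
my &t2 = { 2 * $_ ** 2 - 1 };

# (&t2 o &rescale)(x) is the same as t2(rescale(x))
my &basis-fn = &t2 o &rescale;

say basis-fn(0);     # 1   (0 is rescaled to -1)
say basis-fn(0.5);   # -1  (0.5 is rescaled to 0)
say basis-fn(1);     # 1   (1 is rescaled to 1)
```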

Here we compute a linear model fit with those functions:

In [178]:
my &lm = linear-model-fit(@data2, :@basis)

Math::Fitting::FittedModel(type => linear, data => (200, 2), response-index => 1, basis => 16)

Here are the best fit parameters:

In [179]:
&lm('BestFitParameters')

[0.18229835566379843 -0.33649703967905475 0.29275703577089335 -0.20205204191706475 0.12107000357769826 0.0015976053691418679 -0.04995003901105369 0.0961501998214358 -0.06441480527740266 -0.03247239885679499 0.032679354019726145 0.009192362514135112 -0.013115829627142082 0.0015899116057931514 0.0023856644901223347 0.0025034755988654614]

Here is a plot of those parameters:

In [180]:
#% html
js-google-charts("Bar", &lm('BestFitParameters'), 
    :!horizontal,
    title => 'Best fit parameters',
    :$backgroundColor, 
    hAxis => merge-hash($hAxis, {title => 'Basis function index'}), 
    vAxis => merge-hash($vAxis, {title => 'Coefficient'}), 
    :$titleTextStyle, :$chartArea,
    width => 800, 
    div-id => 'bestFitParams', :$format,
    :png-button)

We can see from the plot that using more than 12 basis functions for that data does not improve the fit, since the coefficients after the 12th index are very small.

Now, let us plot the data and the fit. First we prepare the plot data:

In [181]:
my @fit = @data2.map(*.head)».&lm;
my @plotData = transpose([@data2.map(*.head).Array, @data2.map(*.tail).Array, @fit]);
@plotData = @plotData.map({ <x data fit>.Array Z=> $_.Array })».Hash;

deduce-type(@plotData)

Vector(Assoc(Atom((Str)), Atom((Numeric)), 3), 200)

Here is the plot:

In [182]:
#% html
js-google-charts('ComboChart', 
    @plotData, 
    title => 'Data and fit',
    column-names => <x data fit>,
    :$backgroundColor, :$titleTextStyle, :$hAxis, :$vAxis,
    seriesType => 'scatter',
    series => {
        0 => {type => 'scatter', pointSize => 2, opacity => 0.1, color => 'Gray'},
        1 => {type => 'line'}
    },
    legend => merge-hash($legend, %(position => 'bottom')),
    :$chartArea,
    width => 800, 
    div-id => 'fit1', :$format,
    :png-button)

Here is a summary of the absolute residuals of the last fit:

In [183]:
sink records-summary( (@fit <<->> @data2.map(*.tail))».abs )

+----------------------------------+
| numerical                        |
+----------------------------------+
| Median => 0.014927930525971196   |
| Max    => 0.03793943510040818    |
| 3rd-Qu => 0.0212995075359793     |
| 1st-Qu => 0.008590129662035885   |
| Mean   => 0.014966829595828597   |
| Min    => 0.00021515849622921746 |
+----------------------------------+
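
As a rough plausibility check, assume the fit is close to the noise-free curve $\sin(t)/t$, so that the residuals are essentially the injected noise. The noise term `(3.rand - 1.5) * 0.02` is uniform on $[-a, a]$ with $a = 0.03$, and the mean absolute value of such a variable $U$ is:

$$
\operatorname{E}|U| = \frac{1}{2 a} \int_{-a}^{a} |u| \, du = \frac{a}{2} = 0.015
$$

That is in line with the mean absolute residual of about 0.01497 in the summary. Individual residuals can exceed 0.03 (the maximum is about 0.038) because the fitted curve itself deviates slightly from the noise-free one.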


----

## Condition number

The formula with which the [Ordinary Least Squares (OLS)](https://en.wikipedia.org/wiki/Ordinary_least_squares) fit is computed is:

$$
\beta = (X^T \cdot X)^{-1} \cdot X^T \cdot y
$$
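
To see the formula at work on something small, here is a core-Raku sketch that fits a straight line $y = \beta_0 + \beta_1 x$ through three hypothetical points by forming $X^T \cdot X$ and $X^T \cdot y$ explicitly and solving the $2 \times 2$ system with Cramer's rule. It is an illustration only; it does not claim to reproduce how "Math::Fitting" solves the system.

```raku
# Hypothetical toy data lying exactly on the line y = 1 + 2 x
my @points = (0, 1), (1, 3), (2, 5);

# Design matrix rows are [1, x] (basis functions 1 and x); response vector y
my @X = @points.map({ [1, .[0]] });
my @y = @points.map(*.[1]);

# Gram matrix X^T X (2 x 2) and right-hand side X^T y
my @g = (^2).map(-> $i { [(^2).map(-> $j { [+] @X.map({ .[$i] * .[$j] }) })] });
my @r = (^2).map(-> $i { [+] (@X.map(*.[$i]) Z* @y) });

# Solve (X^T X) beta = X^T y with Cramer's rule
my $det  = @g[0][0] * @g[1][1] - @g[0][1] * @g[1][0];
my @beta = (@r[0] * @g[1][1] - @g[0][1] * @r[1]) / $det,
           (@g[0][0] * @r[1] - @r[0] * @g[1][0]) / $det;

say @beta;   # [1 2]
```

The rational arithmetic is exact here, so the intercept 1 and the slope 2 are recovered without rounding.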

Let us look into the condition number of the "normal matrix" (or "Gram matrix") $X^T \cdot X$. First, we get the design matrix:

In [184]:
my @a = &lm.design-matrix();
my $X = Math::Matrix.new(@a);
$X.size

(200 16)

Here is the Gram matrix:

In [185]:
my $g = $X.transposed dot $X;
$g.size

(16 16)

And here is the [condition number](https://en.wikipedia.org/wiki/Condition_number) of that matrix:

In [186]:
$g.condition

88.55110861577737

The condition number is small (below $10^2$), so we conclude that it is fine to use that design matrix.

**Remark:** For a system of linear equations in matrix form $A x = b$, the condition number of $A$, $\kappa (A)$, is defined to be the maximum ratio of the relative error in $x$ to the relative error in $b$.

**Remark:** Typically, if the condition number is $\kappa (A)=10^{d}$, we can expect to lose as many as $d$ digits of accuracy 
in addition to any loss caused by the numerical method (due to precision issues in arithmetic calculations.)
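
Applied to the condition number computed above, that rule of thumb can be sketched with core Raku as follows (the variable names are hypothetical):

```raku
my $kappa  = 88.55110861577737;   # the value of $g.condition reported above
my $digits = log10($kappa);       # d in kappa = 10^d

say $digits.round(0.01);          # 1.95
```

So we can expect to lose as many as about two digits of accuracy, which is harmless for double precision numbers.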

**Remark:** A very "Raku way" to define an ill-conditioned matrix is to say that it is "almost not of full rank," or that "its inverse almost does not exist."
 

-----

## Temperature data

Let us redo the whole workflow with real-life data -- weather temperature data for four consecutive years for Greenville, South Carolina, USA. 
(Where the [Perl and Raku Conference 2025](https://www.perl.com/article/get-ready-for-the-2025-perl-and-raku-conference/) is going to be held.)

Here we ingest the time series data:

In [212]:
#my $url = 'https://raw.githubusercontent.com/antononcube/RakuForPrediction-blog/refs/heads/main/Data/dsTemperature-Greenville-SC-USA.csv';
my $url = '/Volumes/Macintosh HD/Users/antonov/RakuForPrediction-blog/Data/dsTemperature-Greenville-SC-USA.csv';
my @dsTemperature = data-import($url, headers => 'auto');
@dsTemperature = @dsTemperature.deepmap({ $_ ~~ / ^ \d+ '-' / ?? DateTime.new($_) !! $_.Numeric });
deduce-type(@dsTemperature):tally

Tuple([Struct([AbsoluteTime, Date, Temperature], [Int, DateTime, Int]) => 91, Struct([AbsoluteTime, Date, Temperature], [Int, DateTime, Rat]) => 1371], 1462)

Show data summary:

In [213]:
sink records-summary(@dsTemperature, field-names => <Date AbsoluteTime Temperature>)

+--------------------------------+----------------------+----------------------+
| Date                           | AbsoluteTime         | Temperature          |
+--------------------------------+----------------------+----------------------+
| Min    => 2018-01-01T00:00:37Z | Min    => 3723753600 | Min    => -5.72      |
| 1st-Qu => 2019-01-01T00:00:37Z | 1st-Qu => 3755289600 | 1st-Qu => 10.5       |
| Mean   => 2020-01-01T12:00:37Z | Mean   => 3786868800 | Mean   => 17.0535499 |
| Median => 2020-01-01T12:00:37Z | Median => 3786868800 | Median => 17.94      |
| 3rd-Qu => 2021-01-01T00:00:37Z | 3rd-Qu => 3818448000 | 3rd-Qu => 24.11      |
| Max    => 2022-01-01T00:00:37Z | Max    => 3849984000 | Max    => 29.89      |
+--------------------------------+----------------------+----------------------+


Here is a plot:

In [220]:
#% html
js-google-charts("Scatter", @dsTemperature.map(*<Date Temperature>), 
    title => 'Temperature of Greenville, SC, USA',
    :$backgroundColor,
    hAxis => merge-hash($hAxis, {title => 'Time', format => 'M/yy'}), 
    vAxis => merge-hash($vAxis, {title => 'Temperature, ℃'}), 
    :$titleTextStyle, :$chartArea,
    width => 1200, 
    height => 400, 
    div-id => 'tempData', :$format,
    :png-button)

Here is a fit -- note the rescaling:

In [190]:
my ($min, $max) = @dsTemperature.map(*<AbsoluteTime>).Array.&{ (.min, .max) }();

(3723753600 3849984000)

In [191]:
my &rescale-time = { -($max + $min) / ($max - $min) + (2 * $_) / ($max - $min)};
my @basis = (^16).map({ chebyshev-t($_) o &rescale-time });
@basis.elems

16
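
The rescaling function maps the time interval $[\min, \max]$ linearly onto $[-1, 1]$, the interval on which the Chebyshev polynomials are used as a basis. Here is a quick core-Raku check with the `$min` and `$max` values shown above:

```raku
my ($min, $max) = 3723753600, 3849984000;   # values from the cell above
my &rescale-time = { -($max + $min) / ($max - $min) + (2 * $_) / ($max - $min) };

say rescale-time($min);                 # -1
say rescale-time($max);                 # 1
say rescale-time(($min + $max) / 2);    # 0
```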

In [192]:
my &lm-temp = linear-model-fit(@dsTemperature.map(*<AbsoluteTime Temperature>), :@basis)

Math::Fitting::FittedModel(type => linear, data => (1462, 2), response-index => 1, basis => 16)

Here is a plot of the time series and the fit:

In [222]:
#% html
my @fit = @dsTemperature.map(*<AbsoluteTime>)».&lm-temp;
my @plotData = transpose([@dsTemperature.map(*<Date>).Array, @dsTemperature.map(*<Temperature>).Array, @fit]);
@plotData = @plotData.map({ <x data fit>.Array Z=> $_.Array })».Hash;

js-google-charts('ComboChart', 
    @plotData, 
    title => 'Temperature data and Least Squares fit',
    column-names => <x data fit>,
    :$backgroundColor, :$titleTextStyle,
    hAxis => merge-hash($hAxis, {title => 'Time', format => 'M/yy'}), 
    vAxis => merge-hash($vAxis, {title => 'Temperature, ℃'}), 
    seriesType => 'scatter',
    series => {
        0 => {type => 'scatter', pointSize => 2, opacity => 0.1, color => 'Gray'},
        1 => {type => 'line'}
    },
    legend => merge-hash($legend, %(position => 'bottom')),
    :$chartArea,
    width => 1200, 
    height => 400, 
    div-id => 'tempDataFit', :$format,
    :png-button)

-----

## Conclusion

At this point it should be clear that Raku is fully equipped to do regression analysis.