# Linear Fitting

**Problem:** Find `a` and `b` in the equation `y=a+b×x`, given the vectors `x` and `y`, using the [least-squares method](https://en.wikipedia.org/wiki/Linear_least_squares). Also, determine the corresponding [R-squared](https://en.wikipedia.org/wiki/Coefficient_of_determination) value.

Test data:

In [124]:
x←⍳25
y←10+5×x
]plot y x

This data is too perfect. Let's add some noise:

In [125]:
y+←(5-⍨10×?0⍴⍨≢y)
]plot y x

The slope is easy to determine if we already know the value of `a` (10 in our test data):

In [126]:
⊢b←(y-10)⌹x
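
For reference, this is the one-parameter least-squares problem. With the intercept fixed at a known value $a$, minimizing the sum of squared residuals over $b$ alone has a closed form, which is what `(y-10)⌹x` evaluates when `x` is treated as a single-column matrix (the index $i$ and the length $n$ are notation introduced here, not names from the code):

$$
\min_b \sum_{i=1}^{n} \bigl(y_i - a - b\,x_i\bigr)^2
\quad\Longrightarrow\quad
b = \frac{\sum_{i=1}^{n} x_i\,(y_i - a)}{\sum_{i=1}^{n} x_i^2}
$$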

We can also use `⌹` to fit several variables. To fit the value of `a`, we consider an additional variable with a constant value of 1, as explained in the [documentation for `⌹`](https://help.dyalog.com/latest/index.htm#Language/Primitive%20Functions/Matrix%20Divide.htm). We just need to modify our right argument from

In [127]:
(10↑x),'...'

into

In [128]:
(10↑1,⍪x)⋄'...'

(a vector is treated by `⌹` as a column vector). Then:

In [129]:
⊢a b←y⌹1,⍪x
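
In matrix terms, a sketch of what this expression solves, assuming the columns of `1,⍪x` are linearly independent (the symbols $X$, $\beta$, $\bar{x}$ and $\bar{y}$ are notation introduced here, not names from the code):

$$
X = \begin{pmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix},
\qquad
\beta = \begin{pmatrix} a \\ b \end{pmatrix}
= \arg\min_{\beta} \lVert y - X\beta \rVert^2
= (X^{\mathsf T} X)^{-1} X^{\mathsf T} y
$$

For a single predictor this reduces to the familiar formulas

$$
b = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2},
\qquad
a = \bar{y} - b\,\bar{x}
$$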

We still need to determine the [R-squared](https://en.wikipedia.org/wiki/Coefficient_of_determination) value, which can be calculated as one minus the ratio of two sums of squares: the sum of squared differences between the observed and estimated values, divided by the sum of squared differences between the observed values and their mean:

In [130]:
1-(+/×⍨y-a+b×x)÷+/×⍨(⊢-+/÷≢)y
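
Written out, the expression above corresponds to the following formula, where $\hat{y}_i = a + b\,x_i$ is the fitted value and $\bar{y}$ is the mean of `y` (this notation is introduced here for illustration):

$$
R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}
$$

In the APL code, `+/×⍨y-a+b×x` is the numerator and `+/×⍨(⊢-+/÷≢)y` is the denominator; the train `(⊢-+/÷≢)` subtracts the mean from each element.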

Our function to find `a`, `b` and `rsq` could be defined as:

In [131]:
LF←{(1-(+/×⍨⍺-a+b×⍵)÷+/×⍨(⊢-+/÷≢)⍺),⍨a b←⍺⌹1,⍪⍵}
y LF x

We may want to calculate R-squared values when fitting other functions too (how to fit those other functions is left as an exercise for the reader). In those cases, an `_R2` operator that takes the fitted function as its operand might be more convenient:

In [132]:
_R2←{1-(+/×⍨⍺-⍺⍺⍵)÷+/×⍨(⊢-+/÷≢)⍺}
LF←{(⍺(a+b∘×)_R2⍵),⍨a b←⍺⌹1,⍪⍵}
y LF x

It could also be useful to perform fits in which the value of `a` is known a priori (the most obvious use case is forcing the line to pass through the origin, (0,0)). That sounds like a task for an optional left argument, but our function already takes a left argument. Let's first modify it to take both `y` and `x` as a two-element right argument, using the reduce operator to apply the dyadic functions between them:

In [133]:
LF←{((a+b∘×)_R2/⍵),⍨a b←⊃⌹∘(1,⍪⍤⊢)/⍵}
⊢a b _←LF y x
]plot ((a+b×x) y) x -type=Line

When a left argument is given, we use it as the intercept `a` (the value of `y` at `x=0`) and fit only the slope:

In [134]:
LF←{⍺,((⍺+b∘×)_R2/⍵),⍨b←⌹/⍵-⍺ 0}
⊢ad bd _←10 LF y x
⊢az bz _← 0 LF y x
]plot ((ad+bd×x) (az+bz×x) y y) x
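
One consequence worth keeping in mind when reading these results: with a fixed intercept $a_0$ (notation introduced here), the residual sum of squares cannot be smaller than that of the unconstrained fit, so the reported R-squared is never higher than the unconstrained one. It becomes negative when the constrained line fits worse than the horizontal line at the mean $\bar{y}$:

$$
\sum_i (y_i - a_0 - b\,x_i)^2 > \sum_i (y_i - \bar{y})^2
\quad\Longrightarrow\quad
R^2 < 0
$$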

After combining these two functions into one that takes an optional argument, this is our final result:

In [135]:
_R2←{1-(+/×⍨⍺-⍺⍺⍵)÷+/×⍨(⊢-+/÷≢)⍺}
LF←{F←{((a+b∘×)_R2/⍺),⍨a b←⍵} ⋄ 0=⎕NC'⍺': ⍵F⊃⌹∘(1,⍪⍤⊢)/⍵ ⋄ ⍵F⍺(⌹/⍵-⍺ 0)}
10 LF y x
   LF y x