# Corrected Convex Nonparametric Least Squares (C2NLS)

Corrected convex nonparametric least squares (`C2NLS`) is a nonparametric variant 
of the `COLS` model in which nonparametric least squares subject to monotonicity and 
concavity constraints replaces the first-stage parametric OLS regression. The `C2NLS`
model assumes that the regression function $f$ is monotonically increasing and globally 
concave, that the inefficiencies $\varepsilon$ are independently and identically distributed 
with mean $\mu$ and finite variance $\sigma^2$, and that the inefficiencies $\varepsilon$ are 
uncorrelated with the inputs $\bf X$.

Like `COLS`, the `C2NLS` method is implemented in two stages, which can be stated as follows:

* First stage: Estimate $E(y_i|x_i)$ by solving the following CNLS problem. Denote the CNLS 
residuals by $\varepsilon^{CNLS}_i$.
    \begin{align*}
      & \underset{\alpha, \beta, \varepsilon}{\min} \sum_{i=1}^n\varepsilon_i^2 \\
      & \text{s.t.} \\
      &  y_i = \alpha_i + \beta_i^{'}X_i + \varepsilon_i \quad \forall i \\
      &  \alpha_i + \beta_i^{'}X_i \le \alpha_j + \beta_j^{'}X_i  \quad  \forall i, j\\
      &  \beta_i \ge 0 \quad  \forall i \\
    \end{align*}

* Second stage: Shift the residuals analogously to the `COLS` procedure; 
the `C2NLS` efficiency estimator is
    \begin{align*}
        \hat{\varepsilon_i}^{C2NLS}= \varepsilon_i^{CNLS} - \max_h \varepsilon_h^{CNLS},
    \end{align*}

where values of $\hat{\varepsilon_i}^{C2NLS}$ lie in $(-\infty, 0]$, with 0 
indicating efficient performance. Similarly, we adjust the CNLS intercepts $\alpha_i$ as
    \begin{align*}
        \hat{\alpha_i}^{C2NLS}= \alpha_i^{CNLS} + \max_h \varepsilon_h^{CNLS},
    \end{align*}

where $\alpha_i^{CNLS}$ is the optimal intercept for firm $i$ in the above CNLS problem
and $\hat{\alpha_i}^{C2NLS}$ is the `C2NLS` estimator. Slope coefficients $\beta_i$ 
for `C2NLS` are obtained directly as the optimal solution to the CNLS problem.
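
The second-stage shift described above can be sketched in a few lines. This is a minimal, dependency-free illustration rather than PyStoned's implementation: `c2nls_shift` is a hypothetical helper, and the residuals and intercepts are invented values standing in for a CNLS solution.

```python
def c2nls_shift(residuals, alphas):
    """Apply the C2NLS second-stage shift to CNLS residuals and intercepts."""
    shift = max(residuals)                    # max_h eps_h^CNLS
    eps_hat = [e - shift for e in residuals]  # shifted residuals, all <= 0
    alpha_hat = [a + shift for a in alphas]   # shifted intercepts
    return eps_hat, alpha_hat


# Invented CNLS output for three firms (illustrative values only).
eps_cnls = [-2.0, 3.0, 1.0]
alpha_cnls = [10.0, 12.0, 11.0]

eps_c2nls, alpha_c2nls = c2nls_shift(eps_cnls, alpha_cnls)
print(eps_c2nls)    # [-5.0, 0.0, -2.0]
print(alpha_c2nls)  # [13.0, 15.0, 14.0]
```

Because the same constant is subtracted from every residual and added to every intercept, the sum $\alpha_i + \varepsilon_i$ is unchanged for each firm, so the first-stage equality constraint still holds after the shift.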

### Stage 1: Estimate $E(y_i | x_i)$ by solving the CNLS problem.

In [1]:
# import packages
from pystoned import CNLS
from pystoned.constant import CET_ADDI, FUN_PROD, OPT_LOCAL, RTS_VRS
from pystoned.dataset import load_Finnish_electricity_firm

In [2]:
# import Finnish electricity distribution firms data
data = load_Finnish_electricity_firm(x_select=['OPEX', 'CAPEX'],
                                     y_select=['Energy'])

In [3]:
# define and solve the CNLS model
model = CNLS.CNLS(y=data.y, x=data.x, z=None, cet=CET_ADDI, fun=FUN_PROD, rts=RTS_VRS)
model.optimize(OPT_LOCAL)

Optimizing locally.
Estimating the additive model locally with mosek solver
Problem
  Name                   :                 
  Objective sense        : min             
  Type                   : QO (quadratic optimization problem)
  Constraints            : 7921            
  Cones                  : 0               
  Scalar variables       : 445             
  Matrix variables       : 0               
  Integer variables      : 0               

Optimizer started.
Quadratic to conic reformulation started.
Quadratic to conic reformulation terminated. Time: 0.00    
Presolve started.
Linear dependency checker started.
Linear dependency checker terminated.
Eliminator started.
Freed constraints in eliminator : 89
Eliminator terminated.
Eliminator started.
Freed constraints in eliminator : 0
Eliminator terminated.
Eliminator - tries                  : 2                 time                   : 0.00            
Lin. dep.  - tries                  : 1                 time               
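
Once the solver finishes, it can be worth confirming that a candidate first-stage solution satisfies the monotonicity and concavity constraints of the CNLS problem. The following is a rough sketch under stated assumptions: `satisfies_cnls_constraints` is a hypothetical helper (not part of PyStoned), and the toy intercepts and slopes are invented, chosen as tangent lines of $\sqrt{x}$ so that the constraints hold by construction.

```python
def satisfies_cnls_constraints(alpha, beta, x, tol=1e-8):
    """Check beta_i >= 0 and alpha_i + beta_i'x_i <= alpha_j + beta_j'x_i for all i, j."""
    n = len(alpha)

    def plane(j, xi):
        # Value of firm j's hyperplane at input vector xi
        return alpha[j] + sum(b * v for b, v in zip(beta[j], xi))

    for i in range(n):
        # Monotonicity: every slope coefficient must be nonnegative
        if any(b < -tol for b in beta[i]):
            return False
        # Concavity: firm i's own hyperplane must be the lowest one at x_i
        own = plane(i, x[i])
        if any(own > plane(j, x[i]) + tol for j in range(n)):
            return False
    return True


# Toy single-input data: tangent lines of sqrt(x) at x = 1, 4, 9 (invented values)
x = [[1.0], [4.0], [9.0]]
beta = [[0.5], [0.25], [1.0 / 6.0]]
alpha = [0.5, 1.0, 1.5]
print(satisfies_cnls_constraints(alpha, beta, x))  # True
```

The check is quadratic in the number of firms, which mirrors the $n^2$ concavity constraints in the problem statement.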

### Stage 2: Shift the residuals analogous to the COLS procedure.

In [4]:
# print the shifted residuals
print(model.get_adjusted_residual())

[ -682.18627782  -677.969821    -701.60761871 -1030.2948489
  -693.08377375  -578.36903065  -708.25622215  -693.42362295
  -680.23132592  -622.48956836  -393.87697554     0.
  -699.61389997  -749.39181381  -668.87734352  -604.78653799
  -685.9227902   -709.46028649  -719.51861143  -707.16174125
  -630.62441581  -592.37586803  -656.90311764  -655.43314542
  -681.52553922  -702.22047859  -717.24492203 -1030.45481384
  -612.52628325  -701.08668868  -688.3351596   -463.37794252
  -641.78653376  -711.19536983  -703.16858258  -880.72513246
  -748.67420776  -673.11511236  -675.62803193  -753.99127172
  -681.35866624  -691.35475106  -916.13015122  -672.80929117
  -667.72699191  -699.38787096  -746.59843122  -684.62298705
  -735.33082163  -413.86974571  -677.45967495  -697.03469196
  -674.7275703   -640.52952774  -682.28718016  -329.85557671
  -515.39337149  -644.38251161  -650.9909378   -778.51604415
  -646.59500263 -1283.42370244  -482.28284296  -657.97281652
  -706.13925932  -694.5136014   -

In [5]:
# print the shifted intercept
print(model.get_adjusted_alpha())

[656.44793389 656.39703997 656.51998647 712.844865   656.44540447
 661.63001773 656.46299411 661.58900868 656.48056444 661.51304923
 768.46812234 770.2484122  656.54873873 661.57464456 781.76269274
 661.71330543 656.41884416 656.46658428 656.47660857 661.70200268
 662.09473856 658.23705641 654.1172952  661.75208799 656.42608435
 656.53921294 656.53759745 712.84511062 656.41979367 661.58111205
 656.47576784 783.71056405 661.6725422  661.60768741 659.73033992
 661.71716281 661.68000792 656.41882376 656.39351987 661.75594332
 661.7225447  656.39134903 712.84557504 656.4622658  656.4900164
 656.44081156 661.74709981 656.39226184 661.71754466 712.83607904
 656.44142718 661.67653474 656.50863699 661.72247562 656.47130572
 701.69573976 664.99688194 656.50128979 657.28868025 661.61628086
 654.84097662 712.84522883 661.75526607 665.53138328 656.54964294
 656.41773531 661.65737976 661.75929358 661.69535426 661.71268946
 712.84519736 661.68785702 712.84557152 656.4118352  656.59474032
 664.552926
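
With shifted intercepts and first-stage slopes in hand, the frontier can be evaluated at an arbitrary input vector. The sketch below assumes the usual minimum-of-hyperplanes representation of a concave piecewise-linear function, $\hat{f}(x) = \min_j \{\hat{\alpha}_j + \beta_j^{'}x\}$, which this page does not state explicitly. `frontier_value` is a hypothetical helper, not a PyStoned function, and the numbers are invented.

```python
def frontier_value(alpha_hat, beta, x_new):
    """Evaluate min_j (alpha_hat_j + beta_j'x_new), the lower envelope of the hyperplanes."""
    return min(
        a + sum(b * v for b, v in zip(b_j, x_new))
        for a, b_j in zip(alpha_hat, beta)
    )


# Invented shifted intercepts and slopes for three firms, single input
alpha_hat = [0.5, 1.0, 1.5]
beta = [[0.5], [0.25], [1.0 / 6.0]]
print(frontier_value(alpha_hat, beta, [4.0]))  # 2.0
```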