# Corrected Convex Nonparametric Least Squares (C2NLS)

Corrected convex nonparametric least squares (`C2NLS`) is a nonparametric variant 
of the COLS model in which nonparametric least squares subject to monotonicity and 
concavity constraints replaces the first-stage parametric OLS regression. The `C2NLS`
model assumes that the regression function $f$ is monotonically increasing and globally 
concave, that the inefficiencies $\varepsilon$ are independently and identically 
distributed with mean $\mu$ and finite variance $\sigma^2$, and that the inefficiencies 
$\varepsilon$ are uncorrelated with the inputs $\bf X$.

Like `COLS`, the `C2NLS` method is implemented in two stages, which can be stated as follows:

* First stage: Estimate $E(y_i|x_i)$ by solving the following CNLS problem. Denote the CNLS 
residuals by $\varepsilon^{CNLS}_i$.
    \begin{align*}
      & \underset{\alpha, \beta, \varepsilon}{\min} \sum_{i=1}^n \varepsilon_i^2 \\
      & \text{s.t.} \\
      &  y_i = \alpha_i + \beta_i^{'}X_i + \varepsilon_i \quad \forall i \\
      &  \alpha_i + \beta_i^{'}X_i \le \alpha_j + \beta_j^{'}X_i \quad \forall i, j \\
      &  \beta_i \ge 0 \quad \forall i
    \end{align*}

* Second stage: Shift the residuals analogous to the `COLS` procedure; 
the `C2NLS` efficiency estimator is
    \begin{align*}
        \hat{\varepsilon_i}^{C2NLS} = \varepsilon_i^{CNLS} - \max_h \varepsilon_h^{CNLS},
    \end{align*}

where values of $\hat{\varepsilon_i}^{C2NLS}$ lie in $(-\infty, 0]$, with 0 
indicating efficient performance (the largest CNLS residual is shifted to zero and 
all others become negative). Similarly, we adjust the CNLS intercepts $\alpha_i$ as
    \begin{align*}
        \hat{\alpha_i}^{C2NLS}= \alpha_i^{CNLS} + \max_h \varepsilon_h^{CNLS},
    \end{align*}

where $\alpha_i^{CNLS}$ is the optimal intercept for firm $i$ in the above CNLS problem
and $\hat{\alpha_i}^{C2NLS}$ is the `C2NLS` estimator. Slope coefficients $\beta_i$ 
for `C2NLS` are obtained directly as the optimal solution to the CNLS problem.
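The second-stage shift can be sketched in a few lines of plain Python. This is only an illustration of the two formulas above, not the `pystoned` implementation: the helper names `c2nls_shift` and `frontier` and the toy numbers are made up for this example. The `frontier` helper additionally assumes the usual piecewise-linear representation of a concave CNLS fit, $\min_h (\alpha_h + \beta_h^{'}x)$, which is not stated in the text above.

```python
def c2nls_shift(residuals, alphas):
    """Apply the C2NLS second-stage shift to CNLS residuals and intercepts."""
    shift = max(residuals)                           # max_h eps_h^CNLS
    shifted_residuals = [e - shift for e in residuals]
    shifted_alphas = [a + shift for a in alphas]
    return shifted_residuals, shifted_alphas


def frontier(x, alphas, betas):
    """Evaluate min_h (alpha_h + beta_h' x) at the input vector x (assumed representation)."""
    return min(
        a + sum(b_k * x_k for b_k, x_k in zip(b, x))
        for a, b in zip(alphas, betas)
    )


# Toy first-stage results for three firms with one input (hypothetical values).
eps = [-1.5, 0.5, 2.0]
alpha = [1.0, 2.0, 4.0]
beta = [[2.0], [1.0], [0.5]]

eps_c2nls, alpha_c2nls = c2nls_shift(eps, alpha)
print(eps_c2nls)    # [-3.5, -1.5, 0.0]
print(alpha_c2nls)  # [3.0, 4.0, 6.0]

# The slopes are unchanged, so the whole fitted function moves up by the shift.
print(frontier([1.0], alpha, beta))        # 3.0
print(frontier([1.0], alpha_c2nls, beta))  # 5.0
```

In this toy case the third firm has the largest CNLS residual, so its shifted residual is 0 and it is the one labeled efficient.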

### Stage 1: Estimate $E(y_i | x_i)$ by solving the CNLS problem.

In [1]:
# import packages
from pystoned import CNLS
from pystoned.constant import CET_ADDI, FUN_PROD, OPT_LOCAL, RTS_VRS
from pystoned.dataset import load_Finnish_electricity_firm

In [2]:
# import Finnish electricity distribution firms data
data = load_Finnish_electricity_firm(x_select=['Energy', 'Length', 'Customers'],
                                        y_select=['TOTEX'])

In [3]:
# define and solve the CNLS model
model = CNLS.CNLS(y=data.y, x=data.x, z=None, cet=CET_ADDI, fun=FUN_PROD, rts=RTS_VRS)
model.optimize(OPT_LOCAL)

Invalid email address.

Estimating the additive model locally with mosek solver
Problem
  Name                   :                 
  Objective sense        : min             
  Type                   : QO (quadratic optimization problem)
  Constraints            : 7921            
  Cones                  : 0               
  Scalar variables       : 534             
  Matrix variables       : 0               
  Integer variables      : 0               

Optimizer started.
Quadratic to conic reformulation started.
Quadratic to conic reformulation terminated. Time: 0.00    
Presolve started.
Linear dependency checker started.
Linear dependency checker terminated.
Eliminator started.
Freed constraints in eliminator : 89
Eliminator terminated.
Eliminator started.
Freed constraints in eliminator : 0
Eliminator terminated.
Eliminator - tries                  : 2                 time                   : 0.00            
Lin. dep.  - tries                  : 1                 time           
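The constraint count in the solver log is consistent with the CNLS formulation: one regression equality per firm plus one concavity inequality per ordered pair of distinct firms. The short sketch below does this arithmetic. It assumes a sample of $n = 89$ firms (inferred from the log, not stated above), that the trivial $i = j$ concavity constraints are skipped, and that the sign restrictions $\beta_i \ge 0$ are handled as variable bounds rather than constraints. The function name is hypothetical.

```python
def cnls_constraint_count(n):
    """Count CNLS constraints for n firms under the assumptions stated above."""
    regression = n           # y_i = alpha_i + beta_i' x_i + eps_i, one per firm
    concavity = n * (n - 1)  # one inequality per ordered pair (i, j) with i != j
    return regression + concavity


print(cnls_constraint_count(89))  # 7921
```

Because the count grows as $n^2$, the first stage becomes noticeably more expensive to solve as the number of firms increases.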

### Stage 2: Shift the residuals analogous to the COLS procedure.

In [4]:
# print the shifted residuals
print(model.get_adjusted_residual())

[-1922.52880811 -1832.15878068 -1717.35278479 -1733.50335564
 -1689.45633249 -2518.29793279 -1909.92398045 -1591.39988474
 -1031.67987726 -1781.51233128 -1460.09811629 -1990.0040647
 -1544.30460643 -1366.68924317     0.         -1716.26450761
 -1784.20685941 -1812.55485868 -1901.48390242 -2484.85213411
 -1812.64299218 -2816.00597835 -1911.46323307 -2608.34945324
 -1782.60012297 -1903.80108355 -1573.74740625 -1989.97689055
 -1148.83465403 -1934.5659302  -2012.55283775  -226.65128925
 -2152.05190696 -1384.40075066 -1922.42815817  -935.23150418
 -2648.84765592 -2215.99490563 -2054.99620092 -2429.59805568
 -1746.16723176 -1849.14869398 -1323.63295724 -2117.43007483
 -2187.25414872 -1838.36022747 -1890.36140173 -1843.08143328
 -2972.5730599  -3894.38005468 -1715.82040352 -1921.26525038
 -1622.80734428  -604.54211916 -1884.70652929 -4722.26832814
 -3262.78561332 -1920.31908072 -2439.89996428 -1697.7659896
 -2010.27234391 -1942.13616358 -1889.92072672 -2156.93426808
 -1459.003058   -2003.4436

In [5]:
# print the shifted intercept
print(model.get_adjusted_alpha())

[ 2005.27898149  1942.14738339  1942.11176381  3180.67567736
  2067.1089848   2077.48812335  1942.04825026  2029.28791305
  1998.34673801  2077.50795851  3456.81619027 26535.79658587
  2012.16426231  2019.91428478  4282.326056    2079.27506145
  1942.05534343  1976.46592326  1942.30180489  1942.05154382
  2407.97994793  2076.73934104  1927.91396222  1942.04487397
  1983.28940469  1993.56793391  1984.781446    4366.4979848
  2063.97415077  1942.03415237  1993.52784431  3187.05252087
  2077.50888475  1964.5550187   1942.05115206  2375.91219713
  2110.23900108  1942.0418049   1942.01650239  1942.05676928
  1942.18242237  1943.24249579  2077.22487213  1993.57758436
  1994.27261142  2120.89322989  1942.06735983  1894.67501003
  1942.0522202   2077.52934445  1988.11442779  2075.8600962
  1987.93939093  2077.51859963  1997.74776918  2077.5247139
  2077.50988563  2013.98175543  2005.4001376   2065.5702009
  1533.6085544   3843.23833918  2077.52783366  1942.07108152
  1968.79749763  1943.485862