# Solar Energy Improvement

***Example of experimentation for improvement***

The objective is to maximize the **collection efficiency** and the **energy delivery efficiency**.
Outcomes:

    y1 - collection efficiency
    y2 - energy delivery efficiency

And the factors (variables) considered relevant to the study are:

    A - total daily insolation (sunlight received)
    B - storage tank capacity
    C - water flow rate
    D - solar intermittency

Each of the 4 factors is tested at 2 levels (-1 and 1), so a full factorial design needs 2^4 = 16 experiments. (A fractional factorial design can reduce this number, at the cost of confounding some effects.)
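As a minimal sketch of where the 16 runs come from, base R's `expand.grid` can enumerate every combination of the coded levels. The names `design` and `half_fraction` are illustrative, and the half fraction shown (keeping the runs where D = ABC) is one common choice assumed here, not something taken from the original study.

```r
# Full 2^4 factorial design in coded units; expand.grid varies A fastest,
# then B, C and D (the usual "standard order").
design <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1), D = c(-1, 1))
nrow(design)  # number of runs in the full design

# Illustrative half fraction (assumption): keep only the runs with D = A*B*C.
# This halves the number of experiments, but the effect of D can then
# no longer be separated from the A:B:C interaction.
half_fraction <- design[design$D == design$A * design$B * design$C, ]
nrow(half_fraction)
```

The row order produced by `expand.grid` matches the order in which the factor vectors are typed in by hand in the next cell.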

The original data is here: yint.org/solar-panel-study

In [None]:
A <- c(-1, 1, -1, 1, -1, 1, -1, 1, -1, 1, -1, 1, -1, 1, -1, 1)
B <- c(-1, -1, 1, 1, -1, -1, 1, 1, -1, -1, 1, 1, -1, -1, 1, 1)
C <- c(-1, -1, -1, -1, 1, 1, 1, 1, -1, -1, -1, -1, 1, 1, 1, 1)
D <- c(-1, -1, -1, -1, -1, -1, -1, -1, 1, 1, 1, 1, 1, 1, 1, 1)

y1 <- c(43.5, 51.3, 35, 38.4, 44.9, 52.4, 39.7, 41.3, 50.2, 50.2, 37.5, 39.2, 43, 51.9, 39.9, 41.6)
y2 <- c(82, 83.7, 61.7, 100, 82.1, 84.1, 67.7, 100, 82, 86.3, 66, 100, 82.2, 89.8, 68.8, 100) 

The value ***-1*** stands for the lower tested level and ***1*** for the higher tested level. (Other coded values may also be used.)
In other cases, ***-1*** and ***1*** may also encode qualitative factors (e.g. -1 for red, 1 for green; -1 for running without a jacket, 1 for running with a jacket).
These are normalized (coded) values, which can be converted back to the real values or qualities.
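The conversion from coded to real values is a linear rescaling around the midpoint of the tested range. The sketch below assumes a helper named `decode` (hypothetical, not part of any package) and made-up low and high values; the real levels used in the study are not given in this notebook.

```r
# Convert coded levels (-1 / +1) back to real units.
# `low` and `high` are the real values tested at the -1 and +1 levels.
decode <- function(coded, low, high) {
  center     <- (high + low) / 2   # real value at coded level 0
  half_range <- (high - low) / 2   # real change per one coded unit
  center + coded * half_range
}

# Hypothetical example: a factor tested at 2 (low) and 6 (high) units.
decode(c(-1, 0, 1), low = 2, high = 6)
```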

In [None]:
collection_efficiency <-  lm(y1 ~ A*B*C*D) 
summary(collection_efficiency)

And the summary given is:

In [None]:
Call:
lm(formula = y1 ~ A * B * C * D)

Residuals:
ALL 16 residuals are 0: no residual degrees of freedom!

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  43.7500         NA      NA       NA
A             2.0375         NA      NA       NA
B            -4.6750         NA      NA       NA
C             0.5875         NA      NA       NA
D             0.4375         NA      NA       NA
A:B          -0.9875         NA      NA       NA
A:C           0.4250         NA      NA       NA
B:C           0.9625         NA      NA       NA
A:D          -0.5000         NA      NA       NA
B:D           0.0375         NA      NA       NA
C:D          -0.6750         NA      NA       NA
A:B:C        -0.6500         NA      NA       NA
A:B:D         0.3000         NA      NA       NA
A:C:D         0.6875         NA      NA       NA
B:C:D         0.3250         NA      NA       NA
A:B:C:D      -0.4625         NA      NA       NA

Residual standard error: NaN on 0 degrees of freedom
Multiple R-squared:      1,	Adjusted R-squared:    NaN 
F-statistic:   NaN on 15 and 0 DF,  p-value: NA


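This gives the model y1 = 43.75 + 2.04A - 4.68B + ... (read the term names and the Estimate column). Because 16 coefficients are fitted to 16 runs, no residual degrees of freedom remain, which is why the standard errors and p-values are NA.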

With this model, the outcome can be predicted for any combination of factor levels (A, B, ...).
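A prediction can be sketched with `predict()`. The block below rebuilds the data in a data frame so that it runs on its own; the names `design`, `model_y1` and `new_run` are illustrative, and the chosen factor setting is only an example.

```r
# Rebuild the design and response so the sketch is self-contained.
design <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1), D = c(-1, 1))
design$y1 <- c(43.5, 51.3, 35, 38.4, 44.9, 52.4, 39.7, 41.3,
               50.2, 50.2, 37.5, 39.2, 43, 51.9, 39.9, 41.6)

# Same full model as above, fitted from the data frame.
model_y1 <- lm(y1 ~ A * B * C * D, data = design)

# Example setting: high insolation (A), small tank (B), high flow (C), low D.
new_run <- data.frame(A = 1, B = -1, C = 1, D = -1)
predicted <- predict(model_y1, newdata = new_run)
predicted
```

Because the full model has as many coefficients as runs, the prediction at a tested corner simply reproduces the measured value for that run. Predictions at intermediate coded values (for example A = 0.5) are interpolations and should be treated with more caution.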

Judging by the magnitude of the coefficients, factors B (-4.68) and A (2.04) are the most important. Solar intermittency (D, 0.44) has the smallest main effect, and all interactions are below 1 in magnitude, so they matter much less than A and B.
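One way to check this reading is to sort the coefficients by absolute value, which is roughly the ordering a Pareto plot of the effects displays. This is a minimal sketch in base R; `effects_y1` and `ranked` are illustrative names.

```r
# Rebuild the design and response so the sketch is self-contained.
design <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1), D = c(-1, 1))
design$y1 <- c(43.5, 51.3, 35, 38.4, 44.9, 52.4, 39.7, 41.3,
               50.2, 50.2, 37.5, 39.2, 43, 51.9, 39.9, 41.6)
model_y1 <- lm(y1 ~ A * B * C * D, data = design)

# Drop the intercept, then order the remaining terms by magnitude.
effects_y1 <- coef(model_y1)[-1]
ranked <- effects_y1[order(abs(effects_y1), decreasing = TRUE)]
round(ranked, 3)
```

With no residual degrees of freedom there is no formal significance test, so a ranking like this is a judgment about relative size rather than a statistical conclusion.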

In [None]:
energy_delivery_efficiency <- lm(y2 ~ A*B*C*D) 
summary(energy_delivery_efficiency)

And the summary given is:

In [None]:
Call:
lm(formula = y2 ~ A * B * C * D)

Residuals:
ALL 16 residuals are 0: no residual degrees of freedom!

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  8.353e+01         NA      NA       NA
A            9.462e+00         NA      NA       NA
B           -5.000e-01         NA      NA       NA
C            8.125e-01         NA      NA       NA
D            8.625e-01         NA      NA       NA
A:B          7.513e+00         NA      NA       NA
A:C         -3.250e-01         NA      NA       NA
B:C          2.875e-01         NA      NA       NA
A:D          1.750e-01         NA      NA       NA
B:D         -1.875e-01         NA      NA       NA
C:D          1.907e-15         NA      NA       NA
A:B:C       -7.750e-01         NA      NA       NA
A:B:D       -8.500e-01         NA      NA       NA
A:C:D        3.875e-01         NA      NA       NA
B:C:D       -4.000e-01         NA      NA       NA
A:B:C:D      1.250e-02         NA      NA       NA

Residual standard error: NaN on 0 degrees of freedom
Multiple R-squared:      1,	Adjusted R-squared:    NaN 
F-statistic:   NaN on 15 and 0 DF,  p-value: NA

This gives the model y2 = 83.53 + 9.46A + ... + 7.51AB, where the A:B interaction is almost as large as the main effect of A.
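The large A:B term means that the effect of A depends strongly on the level of B. As a rough illustration, the mean of y2 can be tabulated for each combination of A and B (averaging over C and D); `cell_means` and `effect_of_A` are illustrative names.

```r
# Coded levels of A and B in the same run order as above, plus y2.
A  <- rep(c(-1, 1), times = 8)
B  <- rep(c(-1, -1, 1, 1), times = 4)
y2 <- c(82, 83.7, 61.7, 100, 82.1, 84.1, 67.7, 100,
        82, 86.3, 66, 100, 82.2, 89.8, 68.8, 100)

# Mean response in each A x B cell (rows: A, columns: B).
cell_means <- tapply(y2, list(A = A, B = B), mean)
cell_means

# Change in y2 per coded unit of A, separately at each level of B.
effect_of_A <- (cell_means["1", ] - cell_means["-1", ]) / 2
effect_of_A
```

At the high level of B the effect of A is about 9.46 + 7.51 = 16.98 per coded unit, while at the low level of B it is only about 9.46 - 7.51 = 1.95, which matches the A and A:B coefficients in the summary.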

In [None]:
library(pid)
paretoPlot(collection_efficiency) 

<img src="files/collection_efficiency.png"> 

In [None]:
library(pid)
paretoPlot(energy_delivery_efficiency)

<img src="files/energy_delivery_efficiency.png">