### Unit II. Regressions and dimensionality reduction.

## Logistic Regression Analysis.

- ROC curves.

#### Reading the data

I can read the table from a [*csv (comma-separated values)*](https://en.wikipedia.org/wiki/Comma-separated_values) file (for example, one exported from Excel) using [`readtable` from *DataFrames*](http://juliastats.github.io/DataFrames.jl/stable/).

In [1]:
using DataFrames
survey = readtable("survey.csv")
head(survey)

Unnamed: 0,Sex,WrHnd,NWHnd,WHnd,Fold,Pulse,Clap,Exer,Smoke,Height,MI,Age
1,Female,18.5,18.0,Right,R on L,92.0,Left,Some,Never,173.0,Metric,18.25
2,Male,19.5,20.5,Left,R on L,104.0,Left,,Regul,177.8,Imperial,17.583
3,Male,18.0,13.3,Right,L on R,87.0,Neither,,Occas,,,16.917
4,Male,18.8,18.9,Right,R on L,,Neither,,Never,160.0,Metric,20.333
5,Male,20.0,20.0,Right,Neither,35.0,Right,Some,Never,165.0,Metric,23.667
6,Female,18.0,17.7,Right,L on R,64.0,Right,Some,Never,172.72,Imperial,21.0


For this analysis, `NaN` ([*not a number*](https://en.wikipedia.org/wiki/NaN)) values can be treated as missing values: `NA` (*not available*).

In [2]:
survey = readtable("survey.csv", nastrings=["","NA","NaN"])
head(survey)

Unnamed: 0,Sex,WrHnd,NWHnd,WHnd,Fold,Pulse,Clap,Exer,Smoke,Height,MI,Age
1,Female,18.5,18.0,Right,R on L,92.0,Left,Some,Never,173.0,Metric,18.25
2,Male,19.5,20.5,Left,R on L,104.0,Left,,Regul,177.8,Imperial,17.583
3,Male,18.0,13.3,Right,L on R,87.0,Neither,,Occas,,,16.917
4,Male,18.8,18.9,Right,R on L,,Neither,,Never,160.0,Metric,20.333
5,Male,20.0,20.0,Right,Neither,35.0,Right,Some,Never,165.0,Metric,23.667
6,Female,18.0,17.7,Right,L on R,64.0,Right,Some,Never,172.72,Imperial,21.0


### [Logistic Regression](https://en.wikipedia.org/wiki/Logistic_regression)

Suppose that, starting from the survey data, we want to build a model that classifies men and women according to their height.  
We could assign a numeric value to men and women (`0`, `1`) and then fit a linear regression that predicts those values.  
Since we cannot collect new data (for example, we cannot afford a second survey), we will do this first exploration on a portion of the *dataset*.  

In [3]:
subset = survey[:,[:Sex,:Height]] # Keep only the columns of interest
head(subset)

Unnamed: 0,Sex,Height
1,Female,173.0
2,Male,177.8
3,Male,
4,Male,160.0
5,Male,165.0
6,Female,172.72


In [4]:
complete_cases!(subset) # Drop the rows that contain at least one NA
head(subset)

Unnamed: 0,Sex,Height
1,Female,173.0
2,Male,177.8
3,Male,160.0
4,Male,165.0
5,Female,172.72
6,Male,182.88


I assign a numeric value to men and women (`0` for *Male*, `1` for *Female*).

In [5]:
subset[:Sex] = [ sex == "Male" ? 0 : 1 for sex in subset[:Sex]]
head(subset)

Unnamed: 0,Sex,Height
1,1,173.0
2,0,177.8
3,0,160.0
4,0,165.0
5,1,172.72
6,0,182.88


In [7]:
N = nrow(subset)

208

In [61]:
div(N,30)

6

To avoid [**post hoc hypotheses**](https://en.wikipedia.org/wiki/Testing_hypotheses_suggested_by_the_data), and to be able to train our classifier without [**overfitting**](https://en.wikipedia.org/wiki/Overfitting), we will use a <a href="https://en.wikipedia.org/wiki/Cross-validation_(statistics)">**cross-validation**</a> strategy. Here it will be a 6-fold cross-validation (so that each test group has close to 30 points), stratified (to preserve the proportion of men and women in each group of data), using [MLBase](http://mlbasejl.readthedocs.io/en/latest/crossval.html).
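
The idea behind a stratified k-fold split can be sketched in plain Julia. This is only an illustration, not how MLBase implements `StratifiedKfold`: the helper `stratified_folds` and the toy labels are hypothetical, and the snippet assumes Julia 1.x with the `Random` standard library.

```julia
using Random

# Hypothetical helper: assign every observation to one of k test folds,
# dealing the (shuffled) indices of each class round-robin so that each
# fold keeps roughly the class proportions of the whole sample.
function stratified_folds(labels, k)
    folds = [Int[] for _ in 1:k]
    next = 0
    for class in unique(labels)
        idx = shuffle(findall(l -> l == class, labels))
        for i in idx
            push!(folds[next % k + 1], i)
            next += 1
        end
    end
    return folds
end

labels = vcat(fill(0, 12), fill(1, 12))   # toy data: 12 zeros and 12 ones
folds = stratified_folds(labels, 6)

# The training indices for a fold are all the indices outside its test group
train_1 = setdiff(1:length(labels), folds[1])
```

With these toy labels every test fold receives two observations of each class, and each training set contains the remaining 20 indices.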


In [32]:
using MLBase

In [54]:
cv = StratifiedKfold(subset[:Sex], 6)

MLBase.StratifiedKfold(208,[[111,110,58,122,129,10,62,206,142,60  …  131,97,101,104,192,144,26,37,152,132],[112,137,160,118,180,161,32,207,44,84  …  162,94,76,154,130,183,59,141,165,135]],6,[17.0,17.666666666666668])

In [55]:
índices_cv = collect(cv)

6-element Array{Any,1}:
 [1,2,3,4,5,6,7,8,9,11  …  196,197,199,200,201,202,203,204,205,208]  
 [1,2,3,4,6,8,9,10,11,12  …  198,199,200,201,203,204,205,206,207,208]
 [2,3,4,5,6,7,9,10,11,12  …  196,197,198,200,201,202,204,206,207,208]
 [1,2,3,4,5,7,8,10,11,12  …  197,198,199,200,202,203,204,205,206,207]
 [1,2,5,6,7,8,9,10,13,14  …  197,198,199,201,202,203,205,206,207,208]
 [1,3,4,5,6,7,8,9,10,11  …  199,200,201,202,203,204,205,206,207,208] 

In [56]:
índices_exploración = índices_cv[1]
exploración = subset[índices_exploración,:]

describe(exploración)

Sex
Min      0.0
1st Qu.  0.0
Median   0.0
Mean     0.4913294797687861
3rd Qu.  1.0
Max      1.0
NAs      0
NA%      0.0%

Height
Min      150.0
1st Qu.  165.0
Median   170.18
Mean     172.06820809248555
3rd Qu.  180.0
Max      200.0
NAs      0
NA%      0.0%



In [57]:
using Plots
pyplot(size=(400,200))

Plots.PyPlotBackend()

In [58]:
using GLM
using Distributions

In [59]:
lineal = glm(Sex ~ Height, exploración, Normal(), IdentityLink())

DataFrames.DataFrameRegressionModel{GLM.GeneralizedLinearModel{GLM.GlmResp{Array{Float64,1},Distributions.Normal{Float64},GLM.IdentityLink},GLM.DensePredChol{Float64,Base.LinAlg.Cholesky{Float64,Array{Float64,2}}}},Float64}

Formula: Sex ~ 1 + Height

Coefficients:
               Estimate  Std.Error  z value Pr(>|z|)
(Intercept)     6.21358   0.502169  12.3735   <1e-34
Height       -0.0332557 0.00291363 -11.4139   <1e-29


In [60]:
reg_plot = scatter(exploración, :Height, :Sex)
Plots.abline!(reg_plot, coef(lineal)[2], coef(lineal)[1])

residuos = exploración[:Sex] - predict(lineal)
normal = Normal(0,std(residuos))
res_plot = histogram(residuos, bins=10, normed=true)
plot!(res_plot, res -> pdf(normal, res), minimum(residuos), maximum(residuos))

plot(reg_plot, res_plot, legend=false)

In previous classes we saw that, when a statistical test is not available in Julia, we can use the tests implemented in R through RCall. It is also possible to access from Julia the tests implemented in **Python** using the [**PyCall**](https://github.com/stevengj/PyCall.jl) library. Here, for example, we test the normality of the residuals with the [Shapiro test implemented in **SciPy**](http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.shapiro.html), and whether the mean of the residuals is 0 with the [one-sample Student's t-test](http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.stats.ttest_1samp.html).

In [62]:
using PyCall

In [63]:
@pyimport scipy.stats as stats

In [64]:
stats.shapiro(residuos) # W, P

(0.9760165810585022,0.004342208616435528)

In [70]:
stats.ttest_1samp(residuos, 0) # T, P

(0-dimensional Array{Float64,0}:
1.46592e-14,0.9999999999999883)

When trying to predict a binary dependent variable, the assumption of normally distributed residuals is easily violated (as happens here). An alternative is to replace the linear prediction with a **sigmoid function** that predicts the probability of a 1, so that every prediction above 0.5 is taken as a 1 and the rest as a 0.  

[**Sigmoid function**](https://en.wikipedia.org/wiki/Sigmoid_function):  

$$S(t) = \frac{1}{1+e^{-t}} = \frac{e^{t}}{e^{t}+1}$$


In [117]:
S(t) = 1/(1+exp(-t))

S (generic function with 1 method)

In [121]:
plot(S,-5, 5, legend=false)

We obtain the [**logistic function**](https://es.wikipedia.org/wiki/Funci%C3%B3n_log%C3%ADstica) by replacing $t$ with a linear equation:

$$L(x) = \frac{1}{1+e^{-(\alpha+\beta_{1}X_{1})}}$$

In this context we will use it to compute the probability that our dependent variable $Y$ (`:Sex`) takes one of the categories (*Female* in this case):

In [169]:
Py(x,α,β) = 1/(1+exp(-(α+β*x)))

Py (generic function with 1 method)

In [170]:
anim = @animate for β=filter!(x -> x != 0, collect(-10:10))
    plot(x -> Py(x,0,β), -5, 5, label=β)
end
gif(anim, "logit_beta.gif", fps = 1)

INFO: Saved animation to /home/dzea/.julia/v0.4/AnalisisDeDatosCualitativos/material/logit_beta.gif


In [171]:
anim = @animate for α=-5:1:5
    β = 2
    inflexión = -α/β
    plot(x -> Py(x,α,β), -5, 5, label=α)
    vline!([inflexión], label=inflexión)
    hline!([0.5], label="0.5")
end
gif(anim, "logit_alpha.gif", fps = 1)

INFO: Saved animation to /home/dzea/.julia/v0.4/AnalisisDeDatosCualitativos/material/logit_alpha.gif


The [*odds* (or **chances**)](https://en.wikipedia.org/wiki/Odds) of an event are obtained by dividing the probability that the event happens by the probability that it does not happen. 


$$ \text{odds} = \frac{P_{Y}}{1-P_{Y}} $$

In [184]:
odds(x,α,β) = Py(x,α,β)/(1-Py(x,α,β))

odds (generic function with 1 method)

The [**logit**](https://en.wikipedia.org/wiki/Logit) function is the inverse of the logistic function; it is computed as the natural logarithm of the *odds*. It lets us transform the logistic function back into the linear function it contains.

$$ \text{logit} = \ln(\text{odds}) = \alpha+\beta_{1}X_{1} $$

In [185]:
log_odds(x,α,β) = log(odds(x,α,β)) # logit

log_odds (generic function with 1 method)
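
As a quick numerical check of that inverse relationship, the sketch below redefines the three functions and compares the logit of the logistic probability with the linear predictor. The parameter values are arbitrary and chosen only for illustration; the snippet assumes Julia 1.x.

```julia
Py(x, α, β) = 1 / (1 + exp(-(α + β * x)))
odds(x, α, β) = Py(x, α, β) / (1 - Py(x, α, β))
log_odds(x, α, β) = log(odds(x, α, β))

α, β = -1.0, 2.0   # arbitrary illustrative values, not fitted to the survey

for x in (-2.0, 0.0, 0.5, 3.0)
    # log_odds should agree with α + β*x up to floating-point error
    println(x, " => ", log_odds(x, α, β), " vs ", α + β * x)
end

# At x = -α/β the linear predictor is 0, so the probability is 0.5 and the odds are 1
x0 = -α / β
println(Py(x0, α, β), " ", odds(x0, α, β))
```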

In [198]:
anim = @animate for α=-5:1:5
    β = 1
    plot(x -> Py(x,α,β), -5, 5, label="Py", ylim=(0,1))
    plot!(x -> odds(x,α,β), -5, 5, label="odds")
    plot!(x -> log_odds(x,α,β), -5, 5, label="logit")
end
gif(anim, "logit.gif", fps = 1)

INFO: Saved animation to /home/dzea/.julia/v0.4/AnalisisDeDatosCualitativos/material/logit.gif


Since our dependent variable $Y$ has only two possible values (0, 1), it follows a [**Bernoulli**](https://en.wikipedia.org/wiki/Bernoulli_trial) distribution. The *logit* [**link function**](https://en.wikipedia.org/wiki/Generalized_linear_model#Link_function) is used to transform the expected value of $Y$ in our [**GLM**](https://en.wikipedia.org/wiki/Logistic_regression#As_a_generalized_linear_model).

In [213]:
logística = glm(Sex ~ Height, exploración, Bernoulli(), LogitLink())

DataFrames.DataFrameRegressionModel{GLM.GeneralizedLinearModel{GLM.GlmResp{Array{Float64,1},Distributions.Bernoulli,GLM.LogitLink},GLM.DensePredChol{Float64,Base.LinAlg.Cholesky{Float64,Array{Float64,2}}}},Float64}

Formula: Sex ~ 1 + Height

Coefficients:
              Estimate Std.Error  z value Pr(>|z|)
(Intercept)      40.97   6.39326  6.40832    <1e-9
Height       -0.239204 0.0373992 -6.39597    <1e-9
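
Coefficients like these are maximum-likelihood estimates. A minimal sketch of such a fit, assuming a plain Newton-Raphson iteration on the Bernoulli log-likelihood, is shown below. It is not the code GLM.jl runs; `fit_logistic` and the toy data are hypothetical, and the snippet assumes Julia 1.x with the `LinearAlgebra` standard library.

```julia
using LinearAlgebra

# Hypothetical helper: fit P(y = 1 | x) = 1 / (1 + exp(-(α + β*x)))
# by Newton-Raphson on the Bernoulli log-likelihood.
function fit_logistic(x, y; iterations = 25)
    X = hcat(ones(length(x)), x)          # design matrix with an intercept column
    θ = zeros(2)                          # θ = [α, β], starting at zero
    for _ in 1:iterations
        p = 1 ./ (1 .+ exp.(-(X * θ)))    # current predicted probabilities
        gradient = X' * (y .- p)          # gradient of the log-likelihood
        W = Diagonal(p .* (1 .- p))       # Bernoulli variances p(1 - p)
        θ += (X' * W * X) \ gradient      # Newton step
    end
    return θ
end

# Toy data (not from the survey); the classes overlap, so the estimate is finite
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [0.0, 0.0, 1.0, 0.0, 1.0, 1.0]
α, β = fit_logistic(x, y)
println((α, β))
```

At the solution the gradient is zero, which means the predicted probabilities sum to the observed number of ones.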


In [214]:
coef(logística)

2-element Array{Float64,1}:
 40.97    
 -0.239204

In [215]:
α = coef(logística)[1]
β = coef(logística)[2]

-0.2392042418014837

In [216]:
scatter(exploración, :Height, :Sex, alpha=0.25, legend=false)
plot!(x -> Py(x,α,β))
vline!([-α/β])
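
The vertical line marks the height at which the predicted probability equals 0.5. A short derivation, using the rounded coefficients printed above, shows where `-α/β` comes from:

$$P_{Y}(x) = \frac{1}{1+e^{-(\alpha+\beta x)}} = 0.5 \iff e^{-(\alpha+\beta x)} = 1 \iff \alpha+\beta x = 0 \iff x = -\frac{\alpha}{\beta}$$

$$x = -\frac{40.97}{-0.239204} \approx 171.3$$

Because $\beta$ is negative here, heights above this value get $P_{Y} < 0.5$ and are classified as *Male*.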

In [217]:
using RCall

In [221]:
exploracion = exploración # R does not support Unicode identifiers

R"""
logistica <- glm(Sex ~ Height, $exploracion, family = binomial(link = "logit"))
"""

RCall.RObject{RCall.VecSxp}

Call:  glm(formula = Sex ~ Height, family = binomial(link = "logit"), 
    data = `#JL`$exploracion)

Coefficients:
(Intercept)       Height  
    40.9700      -0.2392  

Degrees of Freedom: 172 Total (i.e. Null);  171 Residual
Null Deviance:	    239.8 
Residual Deviance: 142.6 	AIC: 146.6


In [222]:
R"summary(logistica)"

RCall.RObject{RCall.VecSxp}

Call:
glm(formula = Sex ~ Height, family = binomial(link = "logit"), 
    data = `#JL`$exploracion)

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-2.80271  -0.48368  -0.07345   0.64121   2.13373  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  40.9700     6.1023   6.714 1.90e-11 ***
Height       -0.2392     0.0357  -6.701 2.07e-11 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 239.78  on 172  degrees of freedom
Residual deviance: 142.59  on 171  degrees of freedom
AIC: 146.59

Number of Fisher Scoring iterations: 5



### How good is our prediction? [*Confusion matrix*](https://en.wikipedia.org/wiki/Confusion_matrix)

In [273]:
predicción = predict(logística) # Py

clasificación = predicción .>= 0.5 # true: Female, false: Male

using FreqTables

tabla = freqtable(exploración[:Sex], clasificación)

2x2 NamedArray{Int64,2}:
Dim1 ╲ Dim2 │ false   true
────────────┼─────────────
0           │    69     19
1           │    14     71

|            |      Nₚ    |     Pₚ     |
|------------|------------|------------|
|   **N**        |    TN      |   FP       |
|   **P**        |    FN      |   TP       |

In [274]:
P = tabla ./ sum(tabla,2) # P(prediction | truth): each row sums to 1



2x2 Array{Float64,2}:
 0.784091  0.215909
 0.164706  0.835294

|            |      Nₚ    |     Pₚ     |
|------------|------------|------------|
|   **N**        |      TNR, [Specificity](https://en.wikipedia.org/wiki/Sensitivity_and_specificity)    |   FPR      |
|   **P**        |      FNR   |   TPR, [Sensitivity, *Recall*](https://en.wikipedia.org/wiki/Precision_and_recall)       |

In [275]:
P = tabla ./ sum(tabla,1) # P(truth | prediction): each column sums to 1



2x2 Array{Float64,2}:
 0.831325  0.211111
 0.168675  0.788889

|            |      Nₚ    |     Pₚ     |
|------------|------------|------------|
|   **N**        |      Negative predictive value (NPV)    |   False discovery rate (FDR)      |
|   **P**        |      False omission rate (FOR)   |   Positive predictive value (PPV), Precision       |
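
The same rates can be recomputed by hand from the four counts of the confusion matrix. The sketch below hard-codes the counts shown above (rows are the truth, columns the prediction) and uses only base Julia 1.x; the variable names are chosen for illustration.

```julia
# Counts copied from the confusion matrix above
TN, FP = 69, 19   # true class 0 (Male): predicted false / true
FN, TP = 14, 71   # true class 1 (Female): predicted false / true

TPR = TP / (TP + FN)   # sensitivity / recall
TNR = TN / (TN + FP)   # specificity
PPV = TP / (TP + FP)   # precision
NPV = TN / (TN + FN)   # negative predictive value

println((TPR = TPR, TNR = TNR, PPV = PPV, NPV = NPV))
```

Row-wise rates (TPR, TNR) condition on the truth, while column-wise rates (PPV, NPV) condition on the prediction.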

### [ROC](https://en.wikipedia.org/wiki/Receiver_operating_characteristic)

In [245]:
# Pkg.clone("https://github.com/JuliaML/MLPlots.jl.git")
using ROCAnalysis
using MLPlots

In [246]:
objetivo    = predicción[ exploración[:Sex] .== 1 ]
no_objetivo = predicción[ exploración[:Sex] .== 0 ]

88-element Array{Float64,1}:
 0.173575  
 0.936869  
 0.817771  
 0.0586539 
 0.29096   
 0.735538  
 0.0586539 
 0.00996765
 0.20276   
 0.00996765
 0.102654  
 0.102654  
 0.0454961 
 ⋮         
 0.244174  
 0.980309  
 0.342646  
 0.936869  
 0.817771  
 0.817771  
 0.575736  
 0.0179795 
 0.516517  
 0.0361671 
 0.0361671 
 0.102654  

In [254]:
R = ROCAnalysis.roc(objetivo, no_objetivo)

ROCAnalysis.Roc{Float64}([1.0,0.670455,0.579545,0.443182,0.431818,0.352273,0.306818,0.295455,0.284091,0.261364  …  0.125,0.113636,0.0909091,0.0909091,0.0340909,0.0340909,0.0113636,0.0113636,0.0,0.0],[0.0,0.0,0.0117647,0.0117647,0.0235294,0.0235294,0.0352941,0.0705882,0.0705882,0.0941176  …  0.364706,0.4,0.458824,0.529412,0.635294,0.8,0.823529,0.952941,0.964706,1.0],Bool[true,true,false,true,false,true,true,false,false,false  …  false,false,false,false,true,false,true,false,true,true],[0.00103654,0.102654,0.110392,0.166819,0.173575,0.278298,0.29096,0.342646,0.398356,0.414513  …  0.686478,0.704707,0.735538,0.75814,0.817771,0.850753,0.936869,0.949624,0.980309,0.985966],[-Inf,-2.96105,-2.96105,-2.04476,-2.04476,-1.35161,0.0346856,0.0346856,0.0346856,0.0346856  …  1.19784,1.19784,1.19784,1.19784,1.19784,2.11413,2.11413,2.51959,2.51959,Inf])

In [255]:
plot(R, xlab="1 - Specificity", ylab="Sensitivity")

In [251]:
1- auc(R)

0.8880347593582888
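
The area under the ROC curve can also be read as the probability that a randomly chosen target scores higher than a randomly chosen non-target, with ties counting one half. The sketch below computes that quantity directly on made-up scores; `pairwise_auc` is a hypothetical helper, not part of ROCAnalysis, and the snippet assumes Julia 1.x.

```julia
# Hypothetical helper: AUC as the fraction of (target, non-target) pairs
# that are ranked correctly, with ties counting one half.
function pairwise_auc(target, nontarget)
    wins = 0.0
    for t in target, n in nontarget
        wins += t > n ? 1.0 : (t == n ? 0.5 : 0.0)
    end
    return wins / (length(target) * length(nontarget))
end

target    = [0.9, 0.8, 0.6, 0.4]   # made-up scores for the positive class
nontarget = [0.7, 0.3, 0.2, 0.1]   # made-up scores for the negative class
println(pairwise_auc(target, nontarget))   # 14 of the 16 pairs are ranked correctly
```

This brute-force version costs one comparison per pair, which is fine for a couple of hundred observations.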

#### Training

In [295]:
coeficientes = []

entrenar(indices) = glm(Sex ~ Height, subset[indices,:], Bernoulli(), LogitLink())

function evaluar(modelo, indices)
    α = coef(modelo)[1]
    β = coef(modelo)[2]
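    push!(coeficientes, [α, β]) # Store the coefficients of this fold
    limite = -α/β # Height at which Py = 0.5
    real = subset[indices,:Sex] .== 0 # Actual males (the positive class here)
    predicción = subset[indices,:Height] .>= limite # Predicted males (β < 0 here, so taller means Py < 0.5)
    100*sum(predicción & real)/sum(predicción) # PPV = TP / Pₚ, as a percentage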
end

scores = cross_validate(
    entrenar,
    evaluar,
    nrow(subset), # N
    cv) # stratified 6-fold cross-validation

6-element Array{Float64,1}:
  83.3333
  94.4444
  82.3529
  66.6667
 100.0   
  83.3333

In [296]:
mean_and_std(scores)

(85.02178649237473,11.514063314387228)

In [304]:
coeficientes = hcat(coeficientes...)'

6x2 Array{Float64,2}:
 40.97    -0.239204
 38.8941  -0.226742
 43.3611  -0.252254
 48.6654  -0.283875
 43.7918  -0.254528
 44.9678  -0.262037

In [317]:
α, β = (mean(coeficientes, 1)...)

(43.44169427458678,-0.2531068206454732)

In [319]:
plt = scatter(exploración, :Height, :Sex, alpha=0.25, legend=false, ylim=(0,1))
for i in 1:size(coeficientes,1)
    alpha = coeficientes[i,1]
    beta = coeficientes[i,2]
    plot!(plt, x -> Py(x,alpha,beta), alpha=0.5, c=colorant"gray")
    vline!(plt, [-alpha/beta], alpha=0.5, c=colorant"gray")
end
plot!(plt, x -> Py(x,α,β), c=colorant"green")
vline!(plt, [-α/β], c=colorant"green")

### Multiple independent variables

In [325]:
complete_cases!(survey)
survey[:Sex] = survey[:Sex] .== "Female"
head(survey)

Unnamed: 0,Sex,WrHnd,NWHnd,WHnd,Fold,Pulse,Clap,Exer,Smoke,Height,MI,Age
1,True,18.5,18.0,Right,R on L,92,Left,Some,Never,173.0,Metric,18.25
2,False,19.5,20.5,Left,R on L,104,Left,,Regul,177.8,Imperial,17.583
3,False,20.0,20.0,Right,Neither,35,Right,Some,Never,165.0,Metric,23.667
4,True,18.0,17.7,Right,L on R,64,Right,Some,Never,172.72,Imperial,21.0
5,False,17.7,17.7,Right,L on R,83,Right,Freq,Never,182.88,Imperial,18.833
6,True,17.0,17.3,Right,R on L,74,Right,Freq,Never,157.0,Metric,35.833


In [326]:
glm(Sex ~ Pulse + Height + Age, survey, Binomial(), LogitLink())

DataFrames.DataFrameRegressionModel{GLM.GeneralizedLinearModel{GLM.GlmResp{Array{Float64,1},Distributions.Binomial,GLM.LogitLink},GLM.DensePredChol{Float64,Base.LinAlg.Cholesky{Float64,Array{Float64,2}}}},Float64}

Formula: Sex ~ 1 + Pulse + Height + Age

Coefficients:
               Estimate Std.Error  z value Pr(>|z|)
(Intercept)     47.4692   7.29924  6.50331   <1e-10
Pulse        0.00373047 0.0186393 0.200141   0.8414
Height        -0.273027 0.0409187 -6.67242   <1e-10
Age            -0.03933  0.038796 -1.01376   0.3107


In [328]:
using RCall 

R"""
logistica <- glm(Sex ~ Pulse + Height + Age, $survey, family = binomial(link = "logit"))
"""

RCall.RObject{RCall.VecSxp}

Call:  glm(formula = Sex ~ Pulse + Height + Age, family = binomial(link = "logit"), 
    data = `#JL`$survey)

Coefficients:
(Intercept)        Pulse       Height          Age  
   47.46920      0.00373     -0.27303     -0.03933  

Degrees of Freedom: 167 Total (i.e. Null);  164 Residual
Null Deviance:	    232.9 
Residual Deviance: 124.2 	AIC: 132.2


In [329]:
R"summary(logistica)"

RCall.RObject{RCall.VecSxp}

Call:
glm(formula = Sex ~ Pulse + Height + Age, family = binomial(link = "logit"), 
    data = `#JL`$survey)

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-3.08694  -0.47345   0.02795   0.64308   1.86980  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept) 47.46920    7.30124   6.502 7.95e-11 ***
Pulse        0.00373    0.01864   0.200    0.841    
Height      -0.27303    0.04093  -6.671 2.55e-11 ***
Age         -0.03933    0.03880  -1.014    0.311    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 232.9  on 167  degrees of freedom
Residual deviance: 124.2  on 164  degrees of freedom
AIC: 132.2

Number of Fisher Scoring iterations: 6



### [Stepwise regression](https://en.wikipedia.org/wiki/Stepwise_regression)

> In statistics, stepwise regression includes regression models in which the choice of predictive variables is carried out by an automatic procedure.

In [331]:
R"""
logistica <- step(glm(Sex ~ Pulse + Height + Age, $survey, family = binomial(link = "logit")))
"""

Start:  AIC=132.2
Sex ~ Pulse + Height + Age

         Df Deviance    AIC
- Pulse   1   124.24 130.24
- Age     1   125.43 131.43
<none>        124.20 132.20
- Height  1   231.86 237.86

Step:  AIC=130.24
Sex ~ Height + Age

         Df Deviance    AIC
- Age     1   125.56 129.56
<none>        124.24 130.24
- Height  1   232.42 236.42

Step:  AIC=129.56
Sex ~ Height

         Df Deviance    AIC
<none>        125.56 129.56
- Height  1   232.90 234.90


RCall.RObject{RCall.VecSxp}

Call:  glm(formula = Sex ~ Height, family = binomial(link = "logit"), 
    data = `#JL`$survey)

Coefficients:
(Intercept)       Height  
    46.5011      -0.2705  

Degrees of Freedom: 167 Total (i.e. Null);  166 Residual
Null Deviance:	    232.9 
Residual Deviance: 125.6 	AIC: 129.6


#### AIC

> The Akaike information criterion (AIC) is a measure of the relative quality of a statistical model for a given set of data. As such, AIC provides a means for model selection.  
AIC deals with the trade-off between the [**goodness of fit of the model**](https://en.wikipedia.org/wiki/Goodness_of_fit) and the complexity of the model. It is founded on information entropy: it offers a relative estimate of the information lost when a given model is used to represent the process that generates the data.  
Given a set of candidate models for the data, the preferred model is the one with the minimum AIC value. Hence AIC not only rewards goodness of fit, but also includes a penalty that is an increasing function of the number of estimated parameters. This penalty discourages [**overfitting**](https://en.wikipedia.org/wiki/Overfitting) (increasing the number of free parameters in the model improves the goodness of fit, regardless of the number of free parameters in the data-generating process).
>> - [Akaike information criterion, *wiki*](https://en.wikipedia.org/wiki/Akaike_information_criterion) 
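
As a worked check of the penalty term, assume the usual definition of AIC with $k$ estimated parameters and maximized likelihood $\hat{L}$ (the formula itself is not stated in the quoted text). For a 0/1 response the residual deviance equals $-2\ln\hat{L}$, so:

$$\mathrm{AIC} = 2k - 2\ln\hat{L} = \text{residual deviance} + 2k$$

$$\begin{aligned} \text{Sex} \sim \text{Pulse} + \text{Height} + \text{Age}\ (k=4) &: 124.20 + 8 = 132.20 \\ \text{Sex} \sim \text{Height} + \text{Age}\ (k=3) &: 124.24 + 6 = 130.24 \\ \text{Sex} \sim \text{Height}\ (k=2) &: 125.56 + 4 = 129.56 \end{aligned}$$

These values match the ones reported by `step` above: dropping `Pulse` and `Age` barely increases the deviance, so the smaller penalty makes the `Height`-only model the preferred one.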