In [None]:
!ls data

# Chapter 5 Resampling Methods
+ **Resampling** - repeatedly drawing samples from a training set and refitting a model on each sample - to obtain additional information about the fitted model.

+ Common resampling methods: bootstrapping and cross-validation.
+ Cross-validation can be used to estimate the test error in order to evaluate model performance (**model assessment**) or to select an appropriate level of flexibility (**model selection**).
+ Bootstrapping provides a measure of the accuracy of a parameter estimate or of a statistical learning method.

## Cross Validation
### Validation Set Approach
+ Usually a test set is not available, so a simple strategy to create one is to split the available data into a training set and a test set (the **validation or hold-out set**).
+ Divide the data in half, with the observations in each set selected at random.
+ To assess model performance, quantitative responses usually use MSE; categorical responses can use the error rate, area under the curve, F1 score, a weighting of the confusion matrix, etc.
+ Drawbacks of this technique:
    + the validation estimate of the test error rate can be highly variable, depending on which observations fall in each set.
    + the validation set error rate may tend to overestimate the test error rate, because the model is fit on only part of the data.
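
The first drawback can be illustrated with a minimal sketch. It assumes synthetic data (not the Auto data set) and a hypothetical helper `validation_mse`; only the random split changes between runs, yet the validation MSE moves around.

```python
import numpy as np

def validation_mse(x, y, seed):
    """Fit a line on a random half of the data; return the MSE on the other half."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))            # random ordering of the row indices
    half = len(x) // 2
    train, valid = idx[:half], idx[half:]    # two disjoint halves
    # np.polyfit returns coefficients from highest degree to lowest
    slope, intercept = np.polyfit(x[train], y[train], deg=1)
    pred = intercept + slope * x[valid]
    return np.mean((y[valid] - pred) ** 2)

# Synthetic data: an assumption made for illustration only
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
y = 3.0 + 2.0 * x + rng.normal(scale=2.0, size=200)

# Same data, same model, five different random splits
mses = [validation_mse(x, y, seed) for seed in range(5)]
print(mses)
```

The spread of the printed values is the variability that the bullet above refers to.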



In [None]:
from IPython.display import Image
Image('images/pw41.png', width =500)

#### Example


In [None]:
import pandas as pd
import numpy as np
import sklearn.linear_model as skl_lm
import matplotlib.pyplot as plt
%matplotlib inline


In this section, we'll explore the use of the validation set approach in order to estimate the
test error rates that result from fitting various linear models on the ${\tt Auto}$ data set.

In [None]:
df1 = pd.read_csv('data/auto.csv', na_values='?').dropna()
df1.info()

We begin by using the ${\tt sample()}$ function to split the set of observations
into two halves, by selecting a random subset of 196 observations out of
the original 392 observations. We refer to these observations as the training
set.

We'll use the ${\tt random\_state}$ parameter in order to set a seed for
${\tt python}$’s random number generator, so that you'll obtain precisely the same results each time. It is generally a good idea to set a random seed when performing an analysis such as cross-validation
that contains an element of randomness, so that the results obtained can be reproduced precisely at a later time.

In [None]:
train_df = df1.sample(196, random_state = 1)
# The rows that were not sampled into train_df form the test set
test_df = df1.drop(train_df.index)

X_train = train_df['horsepower'].values.reshape(-1,1)
y_train = train_df['mpg']
X_test = test_df['horsepower'].values.reshape(-1,1)
y_test = test_df['mpg']

We then use ${\tt LinearRegression()}$ to fit a linear regression to predict ${\tt mpg}$ from ${\tt horsepower}$ using only
the observations corresponding to the training set.

In [None]:
lm = skl_lm.LinearRegression()
model = lm.fit(X_train, y_train)


We now use the ${\tt predict()}$ function to estimate the response for the test
observations, and we use ${\tt sklearn}$ to calculate the MSE.

In [None]:
pred = model.predict(X_test)

from sklearn.metrics import mean_squared_error

MSE = mean_squared_error(y_test, pred)
    
print(MSE)

Therefore, the estimated test MSE for the linear regression fit is 23.36. We
can use the ${\tt PolynomialFeatures()}$ function to estimate the test error for the quadratic
and cubic regressions.

In [None]:
from sklearn.preprocessing import PolynomialFeatures

# Quadratic
poly = PolynomialFeatures(degree=2)
X_train2 = poly.fit_transform(X_train)
X_test2 = poly.transform(X_test)  # reuse the transformer fitted on the training data

model = lm.fit(X_train2, y_train)
print(mean_squared_error(y_test, model.predict(X_test2)))

# Cubic
poly = PolynomialFeatures(degree=3)
X_train3 = poly.fit_transform(X_train)
X_test3 = poly.transform(X_test)  # reuse the transformer fitted on the training data

model = lm.fit(X_train3, y_train)
print(mean_squared_error(y_test, model.predict(X_test3)))

These error rates are 20.25 and 20.33, respectively. If we choose a different
training set instead, then we will obtain somewhat different errors on the
validation set. We can test this out by setting a different random seed:

In [None]:
##Choose a different training set

train_df = df1.sample(196, random_state = 2)
# The rows that were not sampled into train_df form the test set
test_df = df1.drop(train_df.index)

X_train = train_df['horsepower'].values.reshape(-1,1)
y_train = train_df['mpg']
X_test = test_df['horsepower'].values.reshape(-1,1)
y_test = test_df['mpg']

# Linear
model = lm.fit(X_train, y_train)
print(mean_squared_error(y_test, model.predict(X_test)))

# Quadratic
poly = PolynomialFeatures(degree=2)
X_train2 = poly.fit_transform(X_train)
X_test2 = poly.transform(X_test)  # reuse the transformer fitted on the training data

model = lm.fit(X_train2, y_train)
print(mean_squared_error(y_test, model.predict(X_test2)))

# Cubic
poly = PolynomialFeatures(degree=3)
X_train3 = poly.fit_transform(X_train)
X_test3 = poly.transform(X_test)  # reuse the transformer fitted on the training data

model = lm.fit(X_train3, y_train)
print(mean_squared_error(y_test, model.predict(X_test3)))

These results are consistent with our previous findings: a model that
predicts ${\tt mpg}$ using a quadratic function of ${\tt horsepower}$ performs better than
a model that involves only a linear function of ${\tt horsepower}$, and there is
little evidence in favor of a model that uses a cubic function of ${\tt horsepower}$.

### Leave One Out Cross Validation
+ LOOCV holds out a single observation as the test set and uses the other $n-1$ observations to fit the model.
+ $n$ different models are fit, leaving out each observation once, and the error is averaged over these $n$ trials.
$$\textrm{CV}_{(n)} = \frac{1}{n}\sum_{i=1}^n{\textrm{MSE}_i}$$
+ LOOCV has far less bias than the validation set approach, so it tends not to overestimate the test error rate.
+ Each model is fit on nearly all the data, and there is no randomness in the splits since each observation is left out exactly once.
+ It is computationally expensive, especially with large $n$ or a complex model.
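
The formula for $\textrm{CV}_{(n)}$ can be spelled out as a plain loop. This is a minimal sketch assuming synthetic data and a simple straight-line fit; the helper name `loocv_mse` is hypothetical and not part of any library.

```python
import numpy as np

def loocv_mse(x, y):
    """Leave-one-out CV estimate of the test MSE for a straight-line fit."""
    n = len(x)
    sq_errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                     # every observation except the i-th
        slope, intercept = np.polyfit(x[keep], y[keep], deg=1)
        pred_i = intercept + slope * x[i]            # predict the held-out point
        sq_errors[i] = (y[i] - pred_i) ** 2          # MSE_i for a single observation
    return sq_errors.mean()                          # CV_(n) = average of the n MSE_i

# Synthetic data: an assumption made for illustration only
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 1.0 + 0.5 * x + rng.normal(size=50)

cv_n = loocv_mse(x, y)
print(cv_n)
```

Because every observation is left out exactly once, running the function again on the same data gives the same value; the cost is $n$ separate model fits.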



In [None]:
Image('images/pw42.png', width =500)

#### Example

In [None]:
model = lm.fit(X_train, y_train)

from sklearn.model_selection import cross_val_score, LeaveOneOut
loo = LeaveOneOut()
X = df1['horsepower'].values.reshape(-1,1)
y = df1['mpg'].values.reshape(-1,1)
loo.get_n_splits(X)

from sklearn.model_selection import KFold

# Setting n_splits to the number of observations makes KFold equivalent to LOOCV
crossvalidation = KFold(n_splits=len(X), random_state=None, shuffle=False)

scores = cross_val_score(model, X, y, scoring="neg_mean_squared_error", cv=crossvalidation,
 n_jobs=1)

print("Folds: " + str(len(scores)) + ", MSE: " + str(np.mean(np.abs(scores))) + ", STD: " + str(np.std(scores)))


We can repeat this procedure for increasingly complex polynomial fits. 

To automate the process, we use a `for` loop that iteratively fits polynomial regressions of order i = 1 to i = 5 and computes the associated cross-validation error.

In [None]:
for i in range(1,6):
    poly = PolynomialFeatures(degree=i)
    X_current = poly.fit_transform(X)
    model = lm.fit(X_current, y)
    scores = cross_val_score(model, X_current, y, scoring="neg_mean_squared_error", cv=crossvalidation,
 n_jobs=1)
    
    print("Degree-"+str(i)+" polynomial MSE: " + str(np.mean(np.abs(scores))) + ", STD: " + str(np.std(scores)))



### k-fold cross validation
+ Similar to LOOCV, but each held-out set contains more than one observation.
+ Here, $k$ is the number of partitions (folds) of your sample, so if you have $n=1000$ observations and $k = 10$, then each fold will contain 100 observations.
+ In each round, 900 observations act as the training set and the remaining 100 act as the test set.
+ Compute the MSE on each held-out fold of 100 observations and take the average.
$$\textrm{CV}_{(k)} = \frac{1}{k}\sum_{i=1}^k{\textrm{MSE}_i}$$
+ LOOCV is a special case of k-fold CV whenever $k=n$.
+ Computationally inexpensive compared to LOOCV.
+ Some variability from how the observations are divided into folds, whereas the LOOCV splits are deterministic.
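
The $n=1000$, $k=10$ arithmetic above can be checked with a minimal sketch. It assumes synthetic data and two hypothetical helpers, `kfold_indices` and `kfold_mse`, written by hand rather than taken from `sklearn`.

```python
import numpy as np

def kfold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k roughly equal folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n), k)

def kfold_mse(x, y, k, seed=0):
    """k-fold CV estimate of the test MSE for a straight-line fit."""
    fold_mse = []
    for test_idx in kfold_indices(len(x), k, seed):
        # Train on everything that is not in the current fold
        train_idx = np.setdiff1d(np.arange(len(x)), test_idx)
        slope, intercept = np.polyfit(x[train_idx], y[train_idx], deg=1)
        pred = intercept + slope * x[test_idx]
        fold_mse.append(np.mean((y[test_idx] - pred) ** 2))   # MSE_i for this fold
    return np.mean(fold_mse)                                  # CV_(k)

folds = kfold_indices(1000, 10)
print([len(f) for f in folds])    # size of each of the 10 folds

# Synthetic data: an assumption made for illustration only
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=1000)
y = 1.0 + 0.5 * x + rng.normal(size=1000)

cv_k = kfold_mse(x, y, k=10)
print(cv_k)
```

Only 10 models are fit here, compared with the 1000 that LOOCV would need on the same data.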



In [None]:
Image('images/pw43.png', width =500)

#### Example

In [None]:
crossvalidation = KFold(n_splits=10, shuffle=False)

for i in range(1,11):
    poly = PolynomialFeatures(degree=i)
    X_current = poly.fit_transform(X)
    model = lm.fit(X_current, y)
    scores = cross_val_score(model, X_current, y, scoring="neg_mean_squared_error", cv=crossvalidation,
 n_jobs=1)
    
    print("Degree-"+str(i)+" polynomial MSE: " + str(np.mean(np.abs(scores))) + ", STD: " + str(np.std(scores)))

### bias-variance tradeoff between LOOCV and k-folds
+ Since each LOOCV model trains on nearly all the data ($n-1$ observations), its estimate of the test error rate is less biased than the k-fold estimate.
+ LOOCV will have higher variance, since the $n$ fitted models are trained on almost identical data and are therefore very highly correlated with one another.
+ The mean of many highly correlated quantities has higher variance than the mean of less correlated ones, so the LOOCV estimate of the test error rate (which is what CV is measuring) will vary more than the k-fold estimate, which averages fewer models that are less correlated with one another.
+ A value of $k$ between 5 and 10 is a good rule of thumb that balances the trade-off between bias and variance.

#### Example: Default Data

In [None]:
df2 = pd.read_csv('data/default.csv', na_values='?').dropna()
df2.describe()

In [None]:
df2.head()

First we'll try just holding out a random 20% of the data:

In [None]:
import statsmodels.formula.api as smf
import statsmodels.api as sm
from sklearn.metrics import confusion_matrix, classification_report

for i in range(1,11):
    train_df2 = df2.sample(8000, random_state = i)
    test_df2 = df2[~df2.isin(train_df2)].dropna(how = 'all')
    
    # Fit a logistic regression to predict default using balance
    model = smf.glm('default~balance', data=train_df2, family=sm.families.Binomial())
    result = model.fit()
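    # With a string response, statsmodels appears to model P(default == "No") (the first level),
    # so a predicted value below 0.5 is labeled "Yes"
    predictions_nominal = [ "Yes" if x < 0.5 else "No" for x in result.predict(test_df2)]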
    print("----------------")
    print("Random Seed = " + str(i) + "")
    print("----------------")
    print(confusion_matrix(test_df2["default"], 
                       predictions_nominal))
    print(classification_report(test_df2["default"], 
                            predictions_nominal, 
                            digits = 3))
    print()
    

### Precision Score
The precision is intuitively the ability of the classifier not to label as positive a sample that is negative.
TP – True Positives
FP – False Positives

Precision – Accuracy of positive predictions.
Precision = TP/(TP + FP)

### Recall Score
The recall is intuitively the ability of the classifier to find all the positive samples.
FN – False Negatives

Recall (aka sensitivity or true positive rate): Fraction of positives that were correctly identified.
Recall = TP/(TP+FN)

### F1 Score
F1 Score (aka F-Score or F-Measure) – A helpful metric for comparing two classifiers. The F1 score takes into account both precision and recall. It is the harmonic mean of precision and recall.

F1 = 2 x (precision x recall)/(precision + recall)

The F1 score reaches its best value at 1 and worst score at 0.

The F1 score is a special case of the more general F-beta score, which weights recall more than precision by a factor of beta. beta == 1.0 (the F1 score) means recall and precision are equally important.

### Support
The support is the number of occurrences of each class in y_true.
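
The four quantities reported by `classification_report` can be recomputed by hand from the definitions above. This is a minimal sketch on made-up labels; the helper `precision_recall_f1` is hypothetical and only meant to mirror the formulas.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, F1 and support for one class from raw labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    support = sum(1 for t in y_true if t == positive)   # occurrences of the class in y_true
    return precision, recall, f1, support

# Made-up labels for illustration: 4 positives, 6 negatives
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

precision, recall, f1, support = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1, support)
```

Here TP = 2, FP = 1 and FN = 2, so precision is 2/3, recall is 1/2, F1 is 4/7 and the support of the positive class is 4.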



# Exercise 

Build a logistic model on the full Default dataset and then run 5-fold cross-validation to get a more accurate estimate of your test error rate:

In [None]:
?np.ones


In [None]:
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate, cross_val_predict
from sklearn import metrics

df2 = pd.read_csv('data/default.csv', na_values='?').dropna()

df2['Yes'] = (df2['default'] == 'Yes').astype(int)

X = np.column_stack((np.ones(len(df2)), df2['balance']))
y=df2['Yes'].values

lr=LogisticRegression()
predicted = cross_val_predict(lr, X, y, cv=5)
print(metrics.accuracy_score(y, predicted))
print(confusion_matrix(y,predicted))
print(metrics.classification_report(y, predicted)) 

# Bootstrap
+ Can be used to quantify the uncertainty associated with a given estimator or statistical learning method.
+ The bootstrap approach allows us to use a computer to emulate the process of obtaining new sample sets, so that we can estimate the variability of an estimated parameter without generating additional samples.
+ Rather than repeatedly obtaining independent data sets from the population, we instead obtain distinct data sets by repeatedly sampling observations from the original data set.
+ We randomly select $n$ observations from the data set in order to produce a bootstrap data set.
+ The sampling is performed with replacement, which means that the same observation can occur more than once in the bootstrap data set.
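
Drawing one bootstrap data set amounts to drawing $n$ row indices with replacement. The sketch below assumes a tiny made-up data set and a hypothetical helper `bootstrap_sample`.

```python
import numpy as np

def bootstrap_sample(data, rng):
    """Return one bootstrap data set and the indices that produced it."""
    n = len(data)
    idx = rng.integers(0, n, size=n)   # n indices drawn with replacement from 0..n-1
    return data[idx], idx

# Made-up observations for illustration only
data = np.array([4.3, 2.1, 5.3, 1.9, 3.8])
rng = np.random.default_rng(0)

boot, idx = bootstrap_sample(data, rng)
print(idx)    # repeated indices mean the same observation was drawn more than once
print(boot)
```

The bootstrap data set has the same size as the original, but some observations may appear several times and others not at all.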




In [None]:
Image('images/pw44.png', width =500)

#### Example:
+ Suppose that we wish to invest a fixed sum of money in two financial assets that yield returns of $X$ and $Y$ , respectively, where $X$ and $Y$ are random quantities. 
+ We will invest a fraction $\alpha$ of our money in $X$, and the remaining $1 - \alpha$ in $Y$. 
+ Since there is variability associated with the returns on these two assets, we wish to choose $\alpha$ to minimize the total risk, or variance, of our investment.
$$\hat{\alpha} = \frac{\hat{\sigma}^2_Y - \hat{\sigma}_{XY}}{\hat{\sigma}^2_X + \hat{\sigma}^2_Y - 2\hat{\sigma}_{XY}}$$
+ 100 pairs of returns for the investments $X$ and $Y$ are simulated, to estimate $\hat{\sigma}^2_X, \hat{\sigma}^2_Y$ and $\hat{\sigma}_{XY}$, to obtain $\hat{\alpha}$.
+ This process is repeated 1000 times, resulting in 1000 estimates for $\alpha$.
+ $0.532 \leq \hat{\alpha} \leq 0.657$.
+ $\bar{\alpha} = 0.5996$ and $\textrm{SE}(\hat{\alpha}) = 0.083$.
+ However, this procedure cannot be applied in practice, because for real data we cannot generate new samples from the original population.
+ By using the bootstrap technique, we can instead resample the observed data repeatedly.
+ The $i$th bootstrap data set, $Z^{*i}$, can be used to produce a new estimate for $\alpha$, denoted $\hat{\alpha}^{*i}$.
+ The standard error of these $B$ bootstrap estimates is
$$\textrm{SE}_B(\hat{\alpha}) = \sqrt{\frac{1}{B-1}\sum_{r=1}^B{\left( \hat{\alpha}^{*r} - \frac{1}{B} \sum_{r^\prime = 1}^B{\hat{\alpha}^{*r^\prime}}\right)^2}}$$
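
The formula for $\textrm{SE}_B(\hat{\alpha})$ can be written out directly. This is a minimal sketch assuming simulated returns; the variances 1 and 1.25 and the covariance 0.5 are values chosen for the simulation, and the helper names `alpha_hat`, `bootstrap_estimates` and `bootstrap_se` are hypothetical.

```python
import numpy as np

def alpha_hat(xy):
    """Estimate alpha from an (n, 2) array whose columns are the returns X and Y."""
    cov = np.cov(xy[:, 0], xy[:, 1])   # 2x2 matrix: [[var_X, cov_XY], [cov_XY, var_Y]]
    return (cov[1, 1] - cov[0, 1]) / (cov[0, 0] + cov[1, 1] - 2 * cov[0, 1])

def bootstrap_estimates(xy, B=1000, seed=0):
    """Recompute alpha_hat on B bootstrap data sets drawn from the rows of xy."""
    rng = np.random.default_rng(seed)
    n = len(xy)
    return np.array([alpha_hat(xy[rng.integers(0, n, size=n)]) for _ in range(B)])

def bootstrap_se(estimates):
    """SE_B: square root of the sum of squared deviations divided by B - 1."""
    est = np.asarray(estimates, dtype=float)
    return np.sqrt(np.sum((est - est.mean()) ** 2) / (len(est) - 1))

# Simulated returns: 100 pairs (assumed parameters, for illustration only)
rng = np.random.default_rng(1)
xy = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.5], [0.5, 1.25]], size=100)

estimates = bootstrap_estimates(xy)
se_b = bootstrap_se(estimates)
print(alpha_hat(xy), se_b)
```

The hand-written `bootstrap_se` is the same quantity as `np.std(estimates, ddof=1)`; it is spelled out only to match the formula term by term.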


#### Example

In [None]:
portfolio_df = pd.read_csv('data/portfolio.csv')
portfolio_df.head()

To illustrate the use of the bootstrap on this data, we must first create
a function, `alpha()`, which takes as input the data and outputs the estimate for $\alpha$ (described in more detail on page 187).

In [None]:
def alpha(X,Y):
    # np.cov returns the 2x2 covariance matrix; entry [0, 1] is Cov(X, Y)
    cov = np.cov(X, Y)
    return (cov[1, 1] - cov[0, 1]) / (cov[0, 0] + cov[1, 1] - 2 * cov[0, 1])

This function returns, or outputs, an estimate for $\alpha$ based on applying
(5.7) to the observations passed in as `X` and `Y`. For instance, the
following command tells `python` to estimate $\alpha$ using all 100 observations.

In [None]:
X = portfolio_df.X[0:100]
y = portfolio_df.Y[0:100]
print(alpha(X,y))

The next command uses the `sample()` function to randomly select 100 observations
from the range 1 to 100, with replacement. This is equivalent
to constructing a new bootstrap data set and recomputing $\hat{\alpha}$ based on the
new data set.

In [None]:
dfsample = portfolio_df.sample(frac=1, replace=True)
X = dfsample.X[0:100]
y = dfsample.Y[0:100]
print(alpha(X,y))

**Note: scikit-learn no longer ships a bootstrap cross-validation class (it was deprecated and then removed), so here we resample directly with pandas.**

We can implement a bootstrap analysis by performing this command many
times, recording all of the corresponding estimates for $\alpha$, and computing the resulting standard deviation. Below we produce $1,000$ bootstrap estimates for $\alpha$:

In [None]:
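def bstrap(df, B=1000):
    estimates = []
    for i in range(B):
        # Draw a bootstrap sample of the same size as df, with replacement
        dfsample = df.sample(frac=1, replace=True)
        estimates.append(alpha(dfsample.X, dfsample.Y))
    # Mean of the bootstrap estimates and their standard deviation (the bootstrap SE)
    print(np.mean(estimates, axis=0), np.std(estimates, axis=0, ddof=1))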
    
bstrap(portfolio_df)

The final output shows that using the original data, $\hat{\alpha} = 0.58$

In [None]:
Image('images/pw45.png', width =800)

Left: A Histogram of the estimates of $\alpha$ obtained by generating 1,000 simulated data sets from the true population. 

Center: A histogram of the estimates of $\alpha$ obtained from 1,000 bootstrap samples from a single data set.

Right: The estimates of $\alpha$ displayed in the left and center panels are shown as boxplots. 

In each panel, the pink line indicates the true value of α.

#### Example
The bootstrap approach can be used to assess the variability of the coefficient
estimates and predictions from a statistical learning method. 

Here we use the bootstrap approach in order to assess the variability of the
estimates for $\beta_0$ and $\beta_1$, the intercept and slope terms for the linear regression
model that uses horsepower to predict mpg in the Auto data set.

We will compare the estimates obtained using the bootstrap to those obtained
using the formulas for $SE(\hat{\beta}_0)$ and $SE(\hat{\beta}_1)$ described in Section 3.1.2.
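
The comparison described here can be sketched end to end. The example below assumes synthetic data generated from a straight line (not the Auto data set), hypothetical helper names, and the standard simple-regression formulas for $SE(\hat{\beta}_0)$ and $SE(\hat{\beta}_1)$.

```python
import numpy as np

def ols_coefs(x, y):
    """Return [intercept, slope] of the least-squares line."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return np.array([intercept, slope])

def bootstrap_coef_se(x, y, B=1000, seed=0):
    """Bootstrap standard errors of the intercept and slope."""
    rng = np.random.default_rng(seed)
    n = len(x)
    coefs = np.empty((B, 2))
    for b in range(B):
        idx = rng.integers(0, n, size=n)        # resample (x, y) pairs with replacement
        coefs[b] = ols_coefs(x[idx], y[idx])
    return coefs.std(axis=0, ddof=1)

def formula_se(x, y):
    """Standard errors from the usual simple linear regression formulas."""
    n = len(x)
    b0, b1 = ols_coefs(x, y)
    resid = y - (b0 + b1 * x)
    sigma2 = np.sum(resid ** 2) / (n - 2)       # estimate of the noise variance
    sxx = np.sum((x - x.mean()) ** 2)
    se_b0 = np.sqrt(sigma2 * (1.0 / n + x.mean() ** 2 / sxx))
    se_b1 = np.sqrt(sigma2 / sxx)
    return np.array([se_b0, se_b1])

# Synthetic data: an assumption made for illustration only
rng = np.random.default_rng(2)
x = rng.uniform(50, 200, size=300)
y = 40.0 - 0.15 * x + rng.normal(scale=4.0, size=300)

boot_se = bootstrap_coef_se(x, y)
form_se = formula_se(x, y)
print("bootstrap:", boot_se)
print("formula:  ", form_se)
```

On data that really do follow a linear model with constant noise variance, the two sets of standard errors should be close. The formulas rely on those assumptions and the bootstrap does not, so the two can differ when the linear model fits poorly.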



In [None]:
from sklearn.utils import resample

# Treat '?' as missing and drop those rows, as in the earlier Auto example
auto_df = pd.read_csv('data/auto.csv', na_values='?').dropna()

auto_df.describe()


In [None]:
lm = skl_lm.LinearRegression()
X = auto_df['horsepower'].values.reshape(-1,1)
y = auto_df['mpg']
clf = lm.fit(X,y)
print(clf.coef_, clf.intercept_)

In [None]:
from sklearn.metrics import mean_squared_error

Xsamp, ysamp = resample(X, y, n_samples=1000)
clf = lm.fit(Xsamp,ysamp)
print('Intercept: ' + str(clf.intercept_) + " Coef: " + str(clf.coef_))

# Exercise 5

In [None]:
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
import pandas as pd

In [None]:
default = pd.read_csv('data/default.csv')
default['student_yes'] = (default['student'] == 'Yes').astype('int')
default['default_yes'] = (default['default'] == 'Yes').astype('int')

In [None]:
default.head()

In [None]:
X = default[['balance', 'income']]
y = default['default_yes']

# No Validation set

### Sklearn

In [None]:
# Notice how tol must be changed to less than default value or convergence won't happen
# Use a high value of C to remove regularization
model = LogisticRegression(C=100000, tol=.0000001)
model.fit(X, y)
model.intercept_, model.coef_

### Statsmodels
Coefficients are similar

In [None]:
import statsmodels.formula.api as smf

In [None]:
result = smf.logit(formula='default_yes ~ balance + income', data=default).fit()

In [None]:
smf.logit?

In [None]:
result.summary()

### Error without validation set
These are in-sample predictions, so the cells below report the training accuracy (one minus the training error) for both sklearn and statsmodels. The two are equivalent.

In [None]:
(model.predict(X) == y).mean()

In [None]:
((result.predict(X) > .5) * 1 == y).mean()

## With validation set

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y)

In [None]:
model = LogisticRegression(C=100000, tol=.0000001)
model.fit(X_train, y_train)
model.intercept_, model.coef_

In [None]:
X_train_sm = X_train.join(y_train)

In [None]:
result = smf.logit(formula='default_yes ~ balance + income', data=X_train_sm).fit()
result.summary()

In [None]:
result.fit?

In [None]:
# Nearly the same as training set. So not too much over fitting has happened
(model.predict(X_test) == y_test).mean(), ((result.predict(X_test) > .5) * 1 == y_test).mean()

Validation error of only .0272

In [None]:
# c) repeat for 3 different validation sets
model = LogisticRegression(C=100000, tol=.0000001)

for i in range(3):
    X_train, X_test, y_train, y_test = train_test_split(X, y)
    model.fit(X_train, y_train)
    
    X_train_sm = X_train.join(y_train)
    result = smf.logit(formula='default_yes ~ balance + income', data=X_train_sm).fit()
    print((model.predict(X_test) == y_test).mean(), ((result.predict(X_test) > .5) * 1 == y_test).mean())

In [None]:
# d) include student in model
X = default[['balance', 'income', 'student_yes']]
y = default['default_yes']

model = LogisticRegression(C=100000, tol=.0000001)

for i in range(3):
    X_train, X_test, y_train, y_test = train_test_split(X, y)
    model.fit(X_train, y_train)
    
    X_train_sm = X_train.join(y_train)
    result = smf.logit(formula='default_yes ~ balance + income + student_yes', data=X_train_sm).fit()
    print((model.predict(X_test) == y_test).mean(), ((result.predict(X_test) > .5) * 1 == y_test).mean())

Looks like error rate is very similar

## Exercise 6
Computing standard errors of the coefficients of a logistic regression using the bootstrap

In [None]:
result = smf.logit(formula='default_yes ~ balance + income', data=default).fit()
result.summary()

In [None]:
df_params = pd.DataFrame(columns=['Intercept', 'balance', 'income'])
for i in range(100):
    default_sample = default.sample(len(default), replace=True)
    result_sample = smf.logit(formula='default_yes ~ balance + income', data=default_sample).fit(disp=0)
    df_params = pd.concat([df_params, pd.DataFrame([result_sample.params])], ignore_index=True)


In [None]:
# bootstrap parameters and standard error
df_params.mean(), df_params.std()

In [None]:
# model parameters and standard error
result.params, result.bse

Standard errors are a wee bit higher in bootstrap

# 7
a) Fit Logistic Regression with Lag1, Lag2

In [None]:
weekly = pd.read_csv('data/weekly.csv')

In [None]:
weekly['Direction_Up'] = (weekly['Direction'] == 'Up').astype(int)

In [None]:
weekly.head()

In [None]:
X = weekly[['Lag1', 'Lag2']]
y = weekly['Direction_Up']

In [None]:
model = LogisticRegression(C=100000, tol=.0000001)
model.fit(X, y)

In [None]:
model.intercept_, model.coef_

In [None]:
# accuracy
(model.predict(X) == y).mean()

### b) Fit without first observation

In [None]:
# model is different but nearly identical
model.fit(X.iloc[1:], y.iloc[1:])
model.intercept_, model.coef_, (model.predict(X) == y).mean()

In [None]:
# c
# wrong prediction
model.predict([X.iloc[0]]), y[0]

In [None]:
# d
errors = np.zeros(len(X))
for i in range(len(X)):
    leave_out  = ~X.index.isin([i])
    model.fit(X[leave_out], y[leave_out])
    if model.predict([X.iloc[i]]) != y[i]:
        errors[i] = 1

In [None]:
# e
errors.mean()

# 8

In [None]:
np.random.seed(1)
x = np.random.randn(100)
e = np.random.randn(100)
y = x - 2*x**2 + e

In [None]:
y.shape

In [None]:
plt.scatter(x, y);

In [None]:
df = pd.DataFrame(np.array([np.ones(len(x)), x, x ** 2, x ** 3, x ** 4, y]).T, columns=['b0', 'x', 'x2', 'x3', 'x4', 'y'])
df.head()

In [None]:
from sklearn.linear_model import LinearRegression

In [None]:
X = df.iloc[:, :5]
y = df['y']
model = LinearRegression()
errors = np.zeros((len(X), 4))
for i in range(len(X)):
    leave_out  = ~X.index.isin([i])
    for j in range(4):
        model.fit(X.iloc[leave_out, :j+2], y[leave_out])
        errors[i, j] = (model.predict([X.iloc[i, :j+2]]) - y[i]) ** 2

In [None]:
# each error here is average error for linear, quadratic, cubic and quartic model.
# Looks like it stabilizes at quadratic.
errors.mean(axis=0)

In [None]:
# again with different seed. 
np.random.seed(2)
x = np.random.randn(100)
e = np.random.randn(100)
y = x - 2*x**2 + e
df = pd.DataFrame(np.array([np.ones(len(x)), x, x ** 2, x ** 3, x ** 4, y]).T, columns=['b0', 'x', 'x2', 'x3', 'x4', 'y'])


X = df.iloc[:, :5]
y = df['y']
model = LinearRegression()
errors = np.zeros((len(X), 4))
for i in range(len(X)):
    leave_out  = ~X.index.isin([i])
    for j in range(4):
        model.fit(X.iloc[leave_out, :j+2], y[leave_out])
        errors[i, j] = (model.predict([X.iloc[i, :j+2]]) - y[i]) ** 2

# quite a different average error. But again stabilizes at quadratic which makes sense
errors.mean(axis=0)

### f 
Since the error doesn't improve after the quadratic model, it's likely that the
coefficients for x3 and x4 would not be statistically significant.

# 9

In [None]:
boston = pd.read_csv('data/boston.csv')
boston.head()

In [None]:
#a
boston['medv'].mean()

In [None]:
#b 
# standard error of the mean
boston['medv'].std() / np.sqrt(len(boston))

In [None]:
#c
# bootstrap estimate of the standard error of the mean
means = [boston['medv'].sample(n = len(boston), replace=True).mean() for i in range(1000)]
np.std(means)

In [None]:
#d
se = np.std(means)
boston['medv'].mean() - 2 * se, boston['medv'].mean() + 2 * se

http://stackoverflow.com/questions/15033511/compute-a-confidence-interval-from-sample-data

In [None]:
import scipy.stats as st

In [None]:
st.t.interval(0.95, len(boston['medv'])-1, loc=np.mean(boston['medv']), scale=st.sem(boston['medv']))

In [None]:
#e
boston['medv'].median()

In [None]:
#f
medians = [boston['medv'].sample(n = len(boston), replace=True).median() for i in range(1000)]
np.std(medians)

In [None]:
#g
boston['medv'].quantile(.1)

In [None]:
#h
quantile_10 = [boston['medv'].sample(n = len(boston), replace=True).quantile(.1) for i in range(1000)]
np.std(quantile_10)