The [Jackknife](https://en.wikipedia.org/wiki/Jackknife_resampling) is a *cross-validation* technique, and hence a resampling technique. It is especially useful for [bias](https://en.wikipedia.org/wiki/Bias_of_an_estimator) and [variance](https://en.wikipedia.org/wiki/Variance) estimation.

<h3>General Jackknife process</h3>

For a given statistic $\, \theta \,$ (e.g., mean, variance, correlation, regression coefficient, etc.):

1. Compute the statistic using the **full sample**. The jackknife can be used either to re-estimate the same statistic or, through the *jackknife replicates*, to estimate properties of that statistic such as its variance, bias, or standard error. For example, if the original estimator $T(X)$ is biased, its jackknife estimate can reduce that bias (bias correction).
2. Compute **leave-one-out** estimates: remove one observation at a time and recompute the statistic.
3. Use these **jackknife replicates** to estimate, for example, the variance, standard error, or bias of the statistic.
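
The three steps above can be sketched as a small helper. This is a minimal illustration rather than a reference implementation: the function name `jackknife`, its return values, and the variable `data` are assumptions made for this example. It uses the standard jackknife formulas, where the bias estimate is $(n-1)(\bar{\theta}_{(\cdot)} - \hat{\theta})$ and the variance estimate is $\frac{n-1}{n}\sum_j (\hat{\theta}_{(j)} - \bar{\theta}_{(\cdot)})^2$, with $\bar{\theta}_{(\cdot)}$ the mean of the leave-one-out replicates.

```python
import numpy as np


def jackknife(x, statistic):
    """Hypothetical helper: jackknife bias, bias-corrected estimate and
    standard error of `statistic` evaluated on the 1-D sample `x`."""
    x = np.asarray(x)
    n = x.shape[0]

    # Step 1: statistic on the full sample
    theta_full = statistic(x)

    # Step 2: leave-one-out replicates (drop observation j, recompute)
    replicates = np.array(
        [statistic(np.delete(x, j, axis=0)) for j in range(n)]
    )
    theta_dot = replicates.mean()

    # Step 3: use the replicates to estimate bias and standard error
    bias = (n - 1) * (theta_dot - theta_full)
    corrected = theta_full - bias
    se = np.sqrt((n - 1) / n * np.sum((replicates - theta_dot) ** 2))
    return theta_full, bias, corrected, se


# Illustrative sample (assumed for this sketch)
data = np.array([4.17, 5.58, 5.18, 6.11, 4.5, 4.61, 5.17, 4.53, 5.33, 5.14])

# np.var uses ddof=0 by default, i.e. the biased plug-in variance
theta_full, bias, corrected, se = jackknife(data, np.var)
```

With the plug-in variance as the statistic, the bias-corrected value `corrected` should coincide with the unbiased sample variance `np.var(data, ddof=1)`, which makes this a convenient sanity check for the helper.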

<h3>Estimating variance with Jackknife</h3>

In [16]:
import numpy as np

In [17]:
X = np.array([4.17, 5.58, 5.18, 6.11, 4.5, 4.61, 5.17, 4.53, 5.33, 5.14])

In [18]:
# Number of observations in the sample
n = X.shape[0]

In [19]:
# Full-sample estimate (\hat{T})
orig_estimator = np.mean(X)

# Leave-one-out estimates (\hat{T}^{(j)}):
# remove the j-th observation and recompute the statistic on the remaining n - 1 values
loo_estimates = np.array([np.mean(np.delete(X, j)) for j in range(n)])

# Pseudo-values: n * \hat{T} - (n - 1) * \hat{T}^{(j)}, written in an equivalent form
pv = orig_estimator + (n - 1) * (orig_estimator - loo_estimates)

# The jackknife estimator is the mean of the pseudo values
jackknife_estimator = np.mean(pv)

In [20]:
jackknife_estimator

np.float64(5.031999999999993)

In [21]:
np.mean(X)

np.float64(5.031999999999999)
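
The two outputs above agree up to floating-point rounding because the sample mean is already unbiased, so the jackknife has no bias to remove. The variance itself can be estimated from the same leave-one-out replicates. The cell below is a sketch using the standard jackknife variance formula $\frac{n-1}{n}\sum_j (\hat{T}^{(j)} - \bar{T}^{(\cdot)})^2$; the names `jackknife_variance`, `jackknife_se` and `variance_from_pv` are introduced here for illustration only.

```python
import numpy as np

X = np.array([4.17, 5.58, 5.18, 6.11, 4.5, 4.61, 5.17, 4.53, 5.33, 5.14])
n = X.shape[0]

# Leave-one-out estimates of the mean
loo_estimates = np.array([np.mean(np.delete(X, j)) for j in range(n)])

# Jackknife variance and standard error of the sample mean
jackknife_variance = (n - 1) / n * np.sum(
    (loo_estimates - loo_estimates.mean()) ** 2
)
jackknife_se = np.sqrt(jackknife_variance)

# Equivalent route through the pseudo-values:
# sample variance of the pseudo-values divided by n
orig_estimator = np.mean(X)
pv = orig_estimator + (n - 1) * (orig_estimator - loo_estimates)
variance_from_pv = np.var(pv, ddof=1) / n
```

For the mean, both routes should reproduce the textbook variance of the sample mean, `np.var(X, ddof=1) / n`. For statistics without a closed-form variance, the same replicate-based formula still applies.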