# Bias of an estimator

 - The bias of an estimator (or `bias function`) is the `difference between this estimator's expected value and the true value` of the parameter being estimated. An estimator or decision rule with `zero bias is called unbiased`.
 
 
 - In statistics, `"bias" is an objective property of an estimator.`
 - Bias is a distinct `concept from consistency`: consistent estimators converge in probability to the true value of the parameter, but may be biased or unbiased.


- All else being equal, an unbiased estimator is preferable to a biased estimator, although in practice, biased estimators (with generally small bias) are frequently used.
- When a biased estimator is used, bounds on its bias are typically calculated.
- A `biased estimator may be used for various reasons: because an unbiased estimator does not exist without further assumptions about a population`; because an estimator is difficult to compute (as in unbiased estimation of standard deviation); because a biased estimator may be unbiased with respect to different measures of central tendency; because a biased estimator gives a lower value of some loss function (particularly mean squared error) compared with unbiased estimators (notably in shrinkage estimators); or because in some cases being unbiased is too strong a condition, and the only unbiased estimators are not useful.


- Bias can also be measured with respect to the median, rather than the mean (expected value), in which case one distinguishes median-unbiased from the usual mean-unbiasedness property.
- Mean-unbiasedness is not preserved under non-linear transformations, though median-unbiasedness is; for example, the square root of the unbiased sample variance is a biased estimator of the population standard deviation.
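
The distinction between bias and consistency noted above can be illustrated with a small simulation. This is only a sketch under assumed settings (standard normal data, the sample sizes, and the number of repetitions are all choices made here, not taken from the notes): the uncorrected sample variance is biased for every fixed n, yet its bias and spread both shrink as n grows, which is what consistency looks like in practice.

```python
import numpy as np

rng = np.random.default_rng(2)
sigma2 = 1.0          # assumed true population variance
repetitions = 20_000  # assumed number of repeated samples per sample size

biases = {}
for n in (5, 50, 500):
    # Each row is one sample of size n from N(0, sigma2).
    x = rng.normal(0.0, np.sqrt(sigma2), size=(repetitions, n))
    est = x.var(axis=1, ddof=0)        # uncorrected variance (divide by n)
    biases[n] = est.mean() - sigma2    # Monte Carlo estimate of the bias
    print(f"n={n:4d}  bias={biases[n]:+.4f}  "
          f"theory={-sigma2 / n:+.4f}  spread={est.std():.4f}")
```

The printed bias should track the theoretical value of -σ²/n up to simulation noise, so it is nonzero at every n but tends to zero.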

## Definition

![image-2.png](attachment:image-2.png)

    Bias function:
    Bias = expected value of the estimator - true value of the parameter
   ![image.png](attachment:image.png)

Suppose a random variable X has population standard deviation σ. We draw m samples of size n and compute the sample standard deviation of each one:

    sample 1 : x1, x2, x3, ..., xn  ->  s1
    sample 2 : x1, x2, x3, ..., xn  ->  s2
        .
        .
        .
    sample m : x1, x2, x3, ..., xn  ->  sm

The mean of the s_i approximates E[s], which we would like to be very close to the population standard deviation σ.

Ideally, E[s] = σ. The bias of s as an estimator of σ is the gap between the two:

    Bias_σ(s) = E[s] - σ
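
The repeated-sampling picture above can be sketched in code. This is a minimal illustration, assuming normally distributed data and arbitrary values for σ, the sample size, and the number of samples (none of these come from the notes). It averages many sample standard deviations to approximate E[s] and then subtracts σ to estimate the bias.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 2.0            # assumed population standard deviation
n = 5                  # assumed size of each sample
num_samples = 200_000  # assumed number of repeated samples

# Each row is one sample x1, ..., xn drawn from N(0, sigma^2).
samples = rng.normal(loc=0.0, scale=sigma, size=(num_samples, n))

# One sample standard deviation s_i per row (n - 1 in the denominator).
s = samples.std(axis=1, ddof=1)

expected_s = s.mean()        # Monte Carlo approximation of E[s]
bias_s = expected_s - sigma  # bias = expected value - true value
print(f"mean of s_i = {expected_s:.4f}, sigma = {sigma}, bias = {bias_s:.4f}")
```

Under these assumptions the estimated bias comes out negative: even with the n - 1 correction, s tends to underestimate σ, because taking a square root is a non-linear transformation.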


## Example

### Sample variance


- The sample variance of a random variable demonstrates two aspects of estimator bias.
- First, the naive estimator is biased, which can be corrected by a scale factor.
- Second, the unbiased estimator is not optimal in terms of mean squared error (MSE). A different scale factor minimizes the MSE, resulting in a biased estimator with lower MSE than the unbiased estimator.


- Concretely, the naive estimator sums the squared deviations and divides by n, which is biased. Dividing instead by n − 1 yields an unbiased estimator. 

- Conversely, MSE can be minimized by dividing by a different number (depending on distribution), but this results in a biased estimator.

- This number is always larger than n − 1, so this is known as a shrinkage estimator, as it "shrinks" the unbiased estimator towards zero; for the normal distribution the optimal value is n + 1.
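
The trade-off between bias and MSE described above can be checked numerically. The following is a sketch only, assuming normal data with arbitrarily chosen variance, sample size, and repetition count. It divides the same sum of squared deviations by n - 1, n, and n + 1 and compares the resulting bias and MSE.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma2 = 4.0           # assumed true population variance
n = 10                 # assumed sample size
repetitions = 200_000  # assumed number of repeated samples

x = rng.normal(0.0, np.sqrt(sigma2), size=(repetitions, n))

# Sum of squared deviations from each sample's own mean.
sum_sq = ((x - x.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)

results = {}
for divisor in (n - 1, n, n + 1):
    est = sum_sq / divisor
    bias = est.mean() - sigma2
    mse = ((est - sigma2) ** 2).mean()
    results[divisor] = (bias, mse)
    print(f"divide by {divisor:2d}: bias = {bias:+.4f}, MSE = {mse:.4f}")
```

With normal data, the divisor n - 1 should give a bias near zero but the largest MSE of the three, while n + 1 should give the smallest MSE at the cost of a negative bias.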


Suppose X1, ..., Xn are independent and identically distributed (i.i.d.) random variables with expectation μ and variance σ². The sample mean and uncorrected sample variance are defined as

![image-2.png](attachment:image-2.png)

![image-2.png](attachment:image-2.png)
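
Assuming the images above show the standard definitions (an assumption, since the formulas are not available as text here), the sample mean and uncorrected sample variance, and the resulting bias, can be written out as follows:

$$
\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i,
\qquad
S^2 = \frac{1}{n}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2
$$

$$
\operatorname{E}[S^2]
= \operatorname{E}\!\left[\frac{1}{n}\sum_{i=1}^{n}(X_i-\mu)^2 - (\bar{X}-\mu)^2\right]
= \sigma^2 - \frac{\sigma^2}{n}
= \frac{n-1}{n}\,\sigma^2
$$

$$
\operatorname{Bias}(S^2) = \operatorname{E}[S^2] - \sigma^2 = -\frac{\sigma^2}{n}
$$

The middle step uses the identity $\sum_i (X_i-\bar{X})^2 = \sum_i (X_i-\mu)^2 - n(\bar{X}-\mu)^2$ together with $\operatorname{Var}(\bar{X}) = \sigma^2/n$. Multiplying $S^2$ by $n/(n-1)$, which is the same as dividing the sum of squares by $n-1$, removes the bias.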