# What are percentiles?

* A percentile is the value below which a given percentage of the observations in a group fall. For example, the 25th percentile is the value below which 25% of the observations may be found. The 50th percentile is the median. The 75th percentile is the value below which 75% of the observations may be found. The 100th percentile is the maximum value.

EXAMPLE:

Suppose you scored 45 out of 100 on a test. How would you rate your performance? Good or Bad?

> Is it bad? (because you scored less than 50%)
>
> What if the questions were really hard and everyone else scored poorly?
>
> What if the time provided was insufficient to complete the test?

EXAMPLE: Suppose you scored 45 out of 100 on a test. Out of 100 students, only 2 scored greater than 45. How would you rate your performance? Good or Bad?

> Your performance looks good now.
>
> You can proudly say that you are at the 98th percentile of the class (98% of the students scored less than or equal to your score).
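
The statement above can be checked with a short sketch. The class scores below are hypothetical (97 students below 45, you at 45, and 2 above), and `percent_at_or_below` is an illustrative helper name, not a library function.

```python
# Hypothetical class of 100 scores: 97 students below 45, you at 45, 2 above.
scores = [30] * 97 + [45] + [60] * 2
my_score = 45


def percent_at_or_below(values, x):
    """Percentage of values that are less than or equal to x."""
    return 100 * sum(v <= x for v in values) / len(values)


rank = percent_at_or_below(scores, my_score)
print(rank)  # 98.0
```

The raw score of 45 stays the same; only its position relative to the other scores changes how it should be read.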

EXAMPLE: A university conducts a written test for 25 students and decides to interview those students whose score is above the 70th percentile.

![image.png](attachment:image.png)

**Can you identify which students will be called for an interview?**

![image.png](attachment:image-2.png)

![image.png](attachment:image-3.png)

The pth percentile of a sample is a value such that p% of the values in the data are less than or equal to this value.

PROCEDURE
* Sort the data in ascending order
* Compute location of the pth percentile
> Location of the pth percentile ($L_p$) = $\frac{p}{100} * (n+1) = \frac{70}{100} * (25+1) = 18.2$
* The 70th percentile lies at the location 18.2
  
![image.png](attachment:image-4.png)

* The 70th percentile should be between 56 (the 18th value) and 59 (the 19th value): greater than 56, but closer to 56 than to 59 because the fractional part of the location is 0.2
> $56 + 0.2 * (59 - 56) = 56.6$


## Procedure for Computing Percentiles

* Sort the data in ascending order
* Compute location of the pth percentile
  > $L_p = \frac{p}{100} * (n+1)$
* Let $i_p$ be the integer part of $L_p$
* Let $f_p$ be the fractional part of $L_p$
* Compute the pth percentile
  > $p^{th}$ percentile $(Y_p)$ = $x_{i_p} + f_p * (x_{i_p+1} - x_{i_p})$
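
The procedure above can be sketched in Python as follows. This is a minimal sketch, not a reference implementation: `percentile_method1` is an illustrative name, the clamping at the two ends is an assumption the text does not cover, and the score list is hypothetical. Only the 18th to 21st values (56, 59, 60, 62) are taken from the worked examples; the rest are invented so the list has 25 entries.

```python
def percentile_method1(data, p):
    """p-th percentile using L_p = p/100 * (n + 1) with linear interpolation."""
    xs = sorted(data)  # step 1: sort ascending
    n = len(xs)
    # Step 2: split L_p into integer part i_p and fractional part f_p.
    # divmod keeps the integer part exact when p is a whole number.
    i_p, rem = divmod(p * (n + 1), 100)
    i_p = int(i_p)
    f_p = rem / 100
    # Assumption (not covered in the text): clamp locations outside 1..n.
    if i_p < 1:
        return xs[0]
    if i_p >= n:
        return xs[-1]
    # Step 3: x_{i_p} is xs[i_p - 1] because Python lists are 0-indexed.
    return xs[i_p - 1] + f_p * (xs[i_p] - xs[i_p - 1])


# Hypothetical scores of 25 students (only positions 18-21 come from the text).
scores = [20, 24, 27, 30, 32, 35, 38, 40, 41, 43, 45, 47, 48,
          50, 52, 53, 55, 56, 59, 60, 62, 65, 70, 74, 80]

y70 = percentile_method1(scores, 70)
y80 = percentile_method1(scores, 80)
print(round(y70, 1), round(y80, 1))  # 56.6 61.6
```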

EXAMPLE: A university conducts a written test for 25 students and decides to interview those students whose score is above the 70th percentile.

![image.png](attachment:image.png)

$Y_{70} = 56.6$

> The university will invite only the 7 students whose score was greater than 56.6

EXAMPLE: Suppose the university changes its decision and now wants to invite only students who scored above the 80th percentile (p = 80).
> $L_p = \frac{p}{100}*(n+1) = \frac{80}{100} * (25+1) = 20.8$

![image.png](attachment:image-2.png)

> $i_p = 20, f_p = 0.8$
> 
> $Y_{80} = x_{20} + f_p * (x_{20+1} - x_{20}) = 61.6$

* Why did we have to compute $Y_p$ when $L_p$ was enough to identify the shortlisted students?
* $L_p$ is enough to identify the shortlisted students, but the university also needs to declare a cutoff score, so we need to compute $Y_p$ as well.
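
A small sketch of this point: selecting by position (everything after position $i_p$ in the sorted list) and selecting by the cutoff score $Y_p$ give the same shortlist. The score list is hypothetical; only the 18th to 21st values and the numbers $i_p = 20$ and $Y_{80} = 61.6$ come from the example above.

```python
# Hypothetical sorted scores of 25 students (only positions 18-21 come from the text).
scores = [20, 24, 27, 30, 32, 35, 38, 40, 41, 43, 45, 47, 48,
          50, 52, 53, 55, 56, 59, 60, 62, 65, 70, 74, 80]

i_p = 20       # integer part of L_80 = 20.8
cutoff = 61.6  # Y_80 from the worked example

# Positions i_p + 1 .. n in 1-based terms are scores[i_p:] in Python.
by_position = scores[i_p:]
# Students whose score is strictly above the declared cutoff.
by_cutoff = [s for s in scores if s > cutoff]

print(by_position == by_cutoff)  # True
print(by_cutoff)                 # [62, 65, 70, 74, 80]
```

The position is enough to pick the students, while the cutoff is the number that can be announced without publishing the whole ranking.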

EXAMPLE: (special case $f_p = 0$)

* Suppose there were only 24 students and p = 80

![image.png](attachment:image-3.png)

> $L_p = \frac{p}{100}*(n+1) = \frac{80}{100} * (24+1) = 20$
> 
> $i_p = 20, f_p = 0$
>
> $Y_{80} = x_{20} = 60$
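
The three locations used so far can be reproduced with a short sketch. `location` is an illustrative helper name; it only splits $L_p$ into $i_p$ and $f_p$ and needs no score data.

```python
def location(p, n):
    """Return (i_p, f_p) for L_p = p/100 * (n + 1)."""
    # divmod on p * (n + 1) avoids rounding noise in the integer part.
    i_p, rem = divmod(p * (n + 1), 100)
    return int(i_p), rem / 100


print(location(70, 25))  # (18, 0.2)
print(location(80, 25))  # (20, 0.8)
print(location(80, 24))  # (20, 0.0)
```

When $f_p = 0$ the interpolation term vanishes, so $Y_p$ is exactly the value at position $i_p$.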



## Alternate Procedure for Computing Percentiles

### Method 1

* Sort the data in ascending order
* Compute location of the pth percentile
  > $L_p = \frac{p}{100} * (n+1)$
* Let $i_p$ be the integer part of $L_p$
* Let $f_p$ be the fractional part of $L_p$
* Compute the pth percentile
  > $p^{th}$ percentile $(Y_p)$ = $x_{i_p} + f_p * (x_{i_p+1} - x_{i_p})$

### Method 2

* Sort the data in ascending order
* Compute location of the pth percentile
  > $L_p = \frac{p}{100} * (n)$

  > Note the use of $n$ instead of $n+1$.
* Let $i_p$ be the integer part of $L_p$
* If $L_p$ is an integer, then $Y_p = \frac{x_{L_p}+x_{L_p +1}}{2}$
* If $L_p$ is not an integer, then $Y_p = x_{i_p + 1}$
  
EXAMPLE: Marks of 25 students and p = 70 or p = 80.
> $L_{70} = \frac{70}{100} * 25 = 17.5$
>
> $L_{80} = \frac{80}{100} * 25 = 20$

![image.png](attachment:image.png)

> $Y_{70} = x_{17+1} = x_{18} = 56$
>
> $Y_{80} = \frac{x_{20} + x_{21}}{2} = \frac{60+62}{2} = 61$ 

> Method 2: The pth percentile is a value such that at least p% of the values in the data are less than or equal to it and at least (100-p)% of the values are greater than or equal to it.
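
Method 2 can be sketched as follows. `percentile_method2` is an illustrative name, the handling of p = 0 and p = 100 is an assumption the text does not cover, and the score list is hypothetical apart from the 18th to 21st values (56, 59, 60, 62) used in the example.

```python
def percentile_method2(data, p):
    """p-th percentile using L_p = p/100 * n (no interpolation)."""
    xs = sorted(data)
    n = len(xs)
    # i_p is the integer part of L_p; rem == 0 means L_p is an integer.
    i_p, rem = divmod(p * n, 100)
    i_p = int(i_p)
    # Assumption (not covered in the text): clamp at the two ends.
    if i_p >= n:
        return xs[-1]
    if rem == 0:
        if i_p < 1:
            return xs[0]
        # Integer location: average x_{L_p} and x_{L_p + 1}.
        return (xs[i_p - 1] + xs[i_p]) / 2
    # Non-integer location: take x_{i_p + 1}, which is xs[i_p] (0-indexed).
    return xs[i_p]


# Hypothetical scores of 25 students (only positions 18-21 come from the text).
scores = [20, 24, 27, 30, 32, 35, 38, 40, 41, 43, 45, 47, 48,
          50, 52, 53, 55, 56, 59, 60, 62, 65, 70, 74, 80]

y70 = percentile_method2(scores, 70)
y80 = percentile_method2(scores, 80)
print(y70, y80)  # 56 61.0
```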

### Method 3
* Sort the data in ascending order
* Compute location of the pth percentile
  > $L_p = \frac{p}{100} * (n+1)$ (same as Method 1)
* Let $i_p$ be the integer part of $L_p$
* Compute the pth percentile
  > $p^{th}$ percentile $(Y_p)$ = $x_{i_p} + 0.5 * (x_{i_p+1} - x_{i_p})$
  >
  > This is the same as Method 1 except that 0.5 is used instead of $f_p$, so it is an approximate version of Method 1.
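
Method 3 can be sketched in the same way. `percentile_method3` is an illustrative name, and the end clamping is an assumption. For an integer $L_p$ the sketch returns $x_{L_p}$, following the comparison table in this section rather than the steps above, which do not single out that case. The score list is hypothetical apart from the 18th to 21st values.

```python
def percentile_method3(data, p):
    """p-th percentile using L_p = p/100 * (n + 1) and a fixed 0.5 weight."""
    xs = sorted(data)
    n = len(xs)
    i_p, rem = divmod(p * (n + 1), 100)
    i_p = int(i_p)
    # Assumption (not covered in the text): clamp locations outside 1..n.
    if i_p < 1:
        return xs[0]
    if i_p >= n:
        return xs[-1]
    if rem == 0:
        # Integer location: take x_{L_p}, as listed in the comparison table.
        return xs[i_p - 1]
    # Non-integer location: midpoint of x_{i_p} and x_{i_p + 1}.
    return xs[i_p - 1] + 0.5 * (xs[i_p] - xs[i_p - 1])


# Hypothetical scores of 25 students (only positions 18-21 come from the text).
scores = [20, 24, 27, 30, 32, 35, 38, 40, 41, 43, 45, 47, 48,
          50, 52, 53, 55, 56, 59, 60, 62, 65, 70, 74, 80]

y70 = percentile_method3(scores, 70)
y80 = percentile_method3(scores, 80)
print(y70, y80)  # 57.5 61.0
```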

**Comparison**

![image.png](attachment:image-2.png)

|                                    | Method 1                                      | Method 2                             | Method 3                                      |
|------------------------------------|-----------------------------------------------|--------------------------------------|-----------------------------------------------|
| Location formula                   | $L_p = \frac{p}{100} * (n+1)$                 | $L_p = \frac{p}{100} * (n)$          | $L_p = \frac{p}{100} * (n+1)$                 |
| **Case 1** $L_p$ is an integer     | $Y_p = x_{L_p}$                               | $Y_p = \frac{x_{L_p}+x_{L_p +1}}{2}$ | $Y_p = x_{L_p}$                               |
| **Case 2** $L_p$ is not an integer | $Y_p = x_{i_p} + f_p * (x_{i_p+1} - x_{i_p})$ | $Y_p = x_{i_p + 1}$                  | $Y_p = x_{i_p} + 0.5 * (x_{i_p+1} - x_{i_p})$ |
| p = 70                             | $L_{70} = 18.2$, $Y_{70} = 56.6$              | $L_{70} = 17.5$, $Y_{70} = 56$       | $L_{70} = 18.2$, $Y_{70} = 57.5$              |
| p = 80                             | $L_{80} = 20.8$, $Y_{80} = 61.6$              | $L_{80} = 20$, $Y_{80} = 61$         | $L_{80} = 20.8$, $Y_{80} = 61$                |


In practice, use the first method (Method 1).
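
As a cross-check, Python's standard library offers `statistics.quantiles`, whose `"exclusive"` method places cut points with the same $(n+1)$ location rule and linear interpolation as Method 1. The sketch below assumes Python 3.8 or later and uses a hypothetical score list in which only the 18th to 21st values (56, 59, 60, 62) come from the examples above.

```python
import statistics

# Hypothetical scores of 25 students (only positions 18-21 come from the text).
scores = [20, 24, 27, 30, 32, 35, 38, 40, 41, 43, 45, 47, 48,
          50, 52, 53, 55, 56, 59, 60, 62, 65, 70, 74, 80]

# n=100 returns the 99 cut points for percentiles 1..99,
# so the p-th percentile sits at index p - 1.
cuts = statistics.quantiles(scores, n=100, method="exclusive")
y70, y80 = cuts[69], cuts[79]
print(round(y70, 1), round(y80, 1))  # 56.6 61.6
```

Other libraries may default to a different rule, so it is worth checking which method a given function uses before comparing cutoffs.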