# 02: Measures of Central Tendency

Measures of central tendency are numbers or words that describe, in the most general terms, the middle or typical value of a distribution.

#### Three types of averages (or measures of central tendency)

1. **Mode** - The value of the most frequent score. It usually indicates the typical score.
2. **Median** - The middle value when observations are ordered from least to most. It always has a percentile rank of 50.
3. **Mean** - The sum of all scores divided by the number of scores. It always describes the balance point of a distribution, i.e. the point where the sum of the positive deviations equals, in absolute value, the sum of the negative deviations.

<br />

$$ 
\text{Mean} = \dfrac{\text{sum of all scores}}{\text{number of scores}}
$$

<br />
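As a quick check of the balance-point property, the sketch below sums the deviations above and below the mean. It uses the retirement ages from this chapter's example problems; the variable names (`scores`, `deviations`, `balanced`) are illustrative and not part of the original notebook.

```python
import math

# Retirement ages used in this chapter's example problems
scores = [60, 63, 45, 63, 65, 70, 55, 63, 60, 65, 63]

mean = sum(scores) / len(scores)

# Deviation of each score from the mean
deviations = [x - mean for x in scores]

# Total distance above the mean and total distance below it
positive_sum = sum(d for d in deviations if d > 0)
negative_sum = sum(d for d in deviations if d < 0)

print(round(mean, 2))
print(round(positive_sum, 2))
print(round(negative_sum, 2))

# The two sums cancel, up to floating-point rounding
balanced = math.isclose(positive_sum, -negative_sum, abs_tol=1e-9)
print(balanced)
```

The comparison uses a small tolerance because the mean is not exactly representable as a floating-point number, so the two sums may differ in the last few digits.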

## Mode

- *Bimodal* describes any distribution with two obvious peaks 
- *Multimodal* describes distributions with more than two peaks. 

In [78]:
def find_mode(dist):
    """Return all the modes in the distribution."""
    # Count how often each score occurs, tracking the highest count seen
    max_count = 0
    count = dict()
    for s in dist:
        if s in count:
            count[s] += 1
        else:
            count[s] = 1
        if count[s] > max_count:
            max_count = count[s]

    # Every score that reaches the highest count is a mode
    modes = []
    for s, c in count.items():
        if c == max_count:
            modes.append(s)
    return modes

#### Example Problem 
Determine the mode for the following retirement ages:

60, 63, 45, 63, 65, 70, 55, 63, 60, 65, 63.

In [81]:
ages = [60, 63, 45, 63, 65, 70, 55, 63, 60, 65, 63]
find_mode(ages)[0]

63

#### Example Problem
The owner of a new car conducts six gas mileage tests and obtains the following results, expressed in miles per gallon: 26.3, 28.7, 27.4, 26.6, 27.4, 26.9. Find
the mode for these data.

In [83]:
mpg = [26.3, 28.7, 27.4, 26.6, 27.4, 26.9]
find_mode(mpg)[0]

27.4
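The results above can be cross-checked against Python's standard library. The sketch below assumes Python 3.8 or later, where `statistics.multimode` returns every most-frequent value in order of first appearance. The `two_peaks` list is made-up data, added only to show a bimodal case.

```python
from statistics import multimode

ages = [60, 63, 45, 63, 65, 70, 55, 63, 60, 65, 63]
mpg = [26.3, 28.7, 27.4, 26.6, 27.4, 26.9]

# Hypothetical scores in which two values tie for most frequent
two_peaks = [1, 1, 2, 3, 3]

print(multimode(ages))       # [63]
print(multimode(mpg))        # [27.4]
print(multimode(two_peaks))  # [1, 3]
```

Returning a list rather than a single value keeps ties visible, which matters when a distribution is bimodal or multimodal.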

#### Example Problem

To the question “During your lifetime, how often have you changed your permanent residence?” a group of 18 college students replied as follows: 1, 3, 4, 1, 0, 2, 5, 8, 0,
2, 3, 4, 7, 11, 0, 2, 3, 3. Find the mode, median, and mean.

## Median 

In [84]:
def find_median(dist):
    """Find the middle value of a distribution"""

    values = sorted(dist)

    # No median for an empty list
    median = None
    if len(values) >= 1:

        # Zero-based index of the middle score (odd length)
        # or of the upper of the two middle scores (even length)
        n = len(values) // 2

        # If length is even
        if len(values) % 2 == 0:
            # Average the two middle scores
            median = (values[n-1] + values[n]) / 2
        # If length is odd
        else:
            # Value at the middle is the median
            median = values[n]
    return median


#### Example Problem:

Find the median for the following retirement ages: 60, 63, 45, 63, 65, 70, 55, 63, 60, 65, 63.

In [85]:
ages = [60, 63, 45, 63, 65, 70, 55, 63, 60, 65, 63]
find_median(ages)

63

#### Example Problem:

Find the median for the following gas mileage tests: 26.3, 28.7, 27.4, 26.6, 27.4, 26.9

In [87]:
mpg = [26.3, 28.7, 27.4, 26.6, 27.4, 26.9]
find_median(mpg)

27.15
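To make the odd and even cases explicit, the sketch below locates the middle position by hand with zero-based indexing and compares the result with the standard library's `statistics.median`. The index variable names are illustrative and not part of the original notebook.

```python
import math
from statistics import median

ages = [60, 63, 45, 63, 65, 70, 55, 63, 60, 65, 63]  # 11 scores (odd)
mpg = [26.3, 28.7, 27.4, 26.6, 27.4, 26.9]           # 6 scores (even)

# Odd number of scores: the median is the single middle score
sorted_ages = sorted(ages)
middle_index = len(sorted_ages) // 2
ages_median = sorted_ages[middle_index]
print(ages_median, median(ages))

# Even number of scores: the median is the average of the two middle scores
sorted_mpg = sorted(mpg)
upper_index = len(sorted_mpg) // 2
lower_index = upper_index - 1
mpg_median = (sorted_mpg[lower_index] + sorted_mpg[upper_index]) / 2
print(mpg_median, median(mpg))
```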

## Mean 

- *Population* - a complete set of scores. 
- *Sample* - a subset of scores from a population. 
- *Sample Size ($n$)* - the total number of scores in the sample. 
- *Population Size ($N$)* - the total number of scores in the population. 

**Sample Mean ($\bar{X}$)** - The balance point for a sample, found by dividing the sum of the values of all scores in the sample by the number of scores in the sample. 

<br />

$$
\bar{X} = \dfrac{\sum{X}}{n}
$$

<br />

**Population Mean ($\mu$)** - The balance point for a population, found by dividing the sum of all scores in the population by the number of scores in the population. 

<br />

$$
\mu = \dfrac{\sum{X}}{N}
$$

<br />

The mean reflects the value of all scores, not just those that are middle ranked (as with the median), or those that occur most frequently (as with the mode). 

In [21]:
def find_mean(dist):
    """Return the sum of all scores divided by the number of scores"""
    return sum(dist)/len(dist)

#### Example Problem:

Find the mean for the following retirement ages: 60, 63, 45, 63, 65, 70, 55, 63, 60, 65, 63.

In [88]:
ages = [60, 63, 45, 63, 65, 70, 55, 63, 60, 65, 63]
find_mean(ages)

61.09090909090909

#### Example Problem:

Find the mean for the following gas mileage tests: 26.3, 28.7, 27.4, 26.6, 27.4, 26.9

In [89]:
mpg = [26.3, 28.7, 27.4, 26.6, 27.4, 26.9]
find_mean(mpg)

27.21666666666667
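The difference between $\mu$ and $\bar{X}$ can be sketched by treating one list as the whole population and drawing a random sample from it. Everything in this example is hypothetical: the population of 1,000 retirement ages is randomly generated, and the sample size of 25 and the seed are arbitrary choices.

```python
import random

# Seeded generator so the sketch is repeatable
rng = random.Random(42)

# Hypothetical population: made-up retirement ages for 1,000 people
population = [rng.randint(45, 75) for _ in range(1000)]

# Population mean: sum of all scores divided by the population size N
N = len(population)
mu = sum(population) / N

# Sample mean: sum of the sampled scores divided by the sample size n
n = 25
sample = rng.sample(population, n)
x_bar = sum(sample) / n

print(f"population mean (mu) = {mu:.2f}, N = {N}")
print(f"sample mean (X-bar)  = {x_bar:.2f}, n = {n}")
```

The two means use the same formula and differ only in which scores are summed, so the sample mean will usually land near the population mean without matching it exactly.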

## Which Average 

- If the distribution is not skewed, the values of the mode, median and mean are similar, and any of them can be used to describe central tendency of the distribution. 
- If the distribution is skewed, the values of the three averages can differ appreciably. 
- The mean is very sensitive to extreme scores, or outliers. 
- Always report both the median and the mean if a distribution is skewed. 
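The sensitivity of the mean to extreme scores can be illustrated with the gas mileage data. In the sketch below, the reading of 80 mpg is a hypothetical outlier added for illustration and is not part of the original data. The standard library's `statistics.mean` and `statistics.median` stand in for the helper functions defined in this chapter.

```python
from statistics import mean, median

mpg = [26.3, 28.7, 27.4, 26.6, 27.4, 26.9]

# Hypothetical extreme score: one faulty test reading of 80 mpg
mpg_with_outlier = mpg + [80.0]

mean_shift = mean(mpg_with_outlier) - mean(mpg)
median_shift = median(mpg_with_outlier) - median(mpg)

print(f"mean:   {mean(mpg):.2f} -> {mean(mpg_with_outlier):.2f}")
print(f"median: {median(mpg):.2f} -> {median(mpg_with_outlier):.2f}")
print(f"shift in mean = {mean_shift:.2f}, shift in median = {median_shift:.2f}")
```

A single extreme score moves the mean by several miles per gallon, while the median moves by only a fraction of one.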

<br />

1. Positively skewed distribution 

$$ Mean > Median > Mode $$

2. Negatively skewed distribution 

$$ Mean < Median < Mode $$
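These orderings describe the typical pattern rather than a guarantee for every data set. The sketch below checks them on two small, made-up sets of scores (`positive_skew` and `negative_skew` are hypothetical), using the standard library's `statistics` module.

```python
from statistics import mean, median, multimode

# Hypothetical scores with a long tail toward high values (positive skew)
positive_skew = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]

# Hypothetical scores with a long tail toward low values (negative skew)
negative_skew = [1, 7, 11, 12, 13, 13, 14, 14, 14, 15]

for name, scores in [("positive", positive_skew), ("negative", negative_skew)]:
    # multimode returns a list; each of these data sets has a single mode
    mode = multimode(scores)[0]
    print(f"{name}: mean={mean(scores)}, median={median(scores)}, mode={mode}")
```

In the first set the mean (4.6) exceeds the median (3), which exceeds the mode (2); in the second the order reverses, with mean 11.4, median 13, and mode 14.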

<br />

- The **Mean** is the single most preferred average for quantitative data. It cannot, however, be used with qualitative data. 
- The **Mode** can always be used with qualitative data. 
- The **Median** can be used whenever it is possible to order qualitative data from least to most, that is, when the level of measurement is ordinal. For ordinal data, the median is the class that has a percentile rank of 50. 
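For ordinal data, one way to find the median class is to accumulate the class frequencies in rank order until they reach 50 percent of the responses. The sketch below assumes a hypothetical four-level rating scale and made-up responses; neither comes from the original notebook.

```python
from collections import Counter

# Hypothetical ordinal scale, listed from least to most
order = ["poor", "fair", "good", "excellent"]

# Hypothetical survey responses
responses = ["good", "fair", "excellent", "good", "poor",
             "good", "fair", "excellent", "good", "fair"]

counts = Counter(responses)
total = len(responses)

# Walk through the classes in rank order, accumulating their frequencies
cumulative = 0
median_class = None
for category in order:
    cumulative += counts[category]
    # The first class whose cumulative proportion reaches 50% holds the median
    if cumulative / total >= 0.5:
        median_class = category
        break

print(median_class)  # good
```

If the cumulative proportion lands on exactly 50 percent at a class boundary, the median falls between two classes; this sketch does not handle that case separately.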

#### Example 

College students were surveyed about where they would most like to spend their spring break: Daytona Beach (DB), Cancun, Mexico (C), South Padre Island (SP),
Lake Havasu (LH), or other (O). The results were as follows:

In [90]:
destinations = ['DB', 'DB', 'C', 'LH', 'DB', 'C', 'SP', 'LH', 'DB', 'O', 'O', 'SP', 'C', 'DB', 'LH', 'DB', 'C', 'DB', 'O', 'DB']
find_mode(destinations)

['DB']

#### Example

To the question “During your lifetime, how often have you changed your permanent residence?” a group of 18 college students replied as follows: 1, 3, 4, 1, 0, 2, 5, 8, 0,
2, 3, 4, 7, 11, 0, 2, 3, 3. Find the mode, median, and mean.

In [91]:
res_change = [1, 3, 4, 1, 0, 2, 5, 8, 0, 2, 3, 4, 7, 11, 0, 2, 3, 3]

res_change

[1, 3, 4, 1, 0, 2, 5, 8, 0, 2, 3, 4, 7, 11, 0, 2, 3, 3]

In [92]:
mean = find_mean(res_change)

mean

3.2777777777777777

In [93]:
median = find_median(res_change)
median

3.0

In [94]:
mode = find_mode(res_change)
mode

[3]