# Measures of Central Tendency - Quiz

## Objectives
You will be able to:
* Understand and describe the significance of measuring the central tendency of continuous data
* Understand the formula and intuition behind the mean, median, mode, and modal class
* Compare the mean, median, and mode, along with histograms, to explain the central tendency of a given data set

### Exercise 1
Calculate the mean, median and mode for this data set: 
```
19, 18, 21, 16, 15, 17, 20, 18
```
Compare the three measures and comment on the shape of this distribution.

In [6]:
number_list = sorted([19, 18, 21, 16, 15, 17, 20, 18])

def calculate_mean(numbers):
    # Use the length of the list passed in, not a module-level variable
    return sum(numbers) / len(numbers)

def calculate_median(numbers):
    # Sort a copy so the result does not depend on the caller's ordering
    ordered = sorted(numbers)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    # Even count: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

def calculate_mode(numbers):
    # Count occurrences; on a tie, the value seen first is returned
    num_frequency = {}
    for number in numbers:
        num_frequency[number] = num_frequency.get(number, 0) + 1
    return max(num_frequency, key=num_frequency.get)

def three_m(numbers):
    mean = calculate_mean(numbers)
    median = calculate_median(numbers)
    mode = calculate_mode(numbers)
    return {'mean': mean, 'median': median, 'mode': mode}

three_m(number_list)

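# The mean, median, and mode are all equal (18), which suggests a roughly symmetric distribution with no skew.
# This is consistent with a normal distribution, but eight values are too few to confirm one.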

{'mean': 18.0, 'median': 18.0, 'mode': 18}
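
As a cross-check, the same three measures can be computed with Python's standard-library `statistics` module. This is a sketch for verification rather than part of the exercise solution; the variable names are illustrative.

```python
import statistics

data = [19, 18, 21, 16, 15, 17, 20, 18]

# mean: arithmetic average of the values
mean = statistics.mean(data)
# median: middle value of the sorted data (average of the two middle values for an even count)
median = statistics.median(data)
# multimode: every value that shares the highest frequency, so ties stay visible
modes = statistics.multimode(data)

print(mean, median, modes)
```

`multimode` returns a list, which makes it clear whether the data set has a single mode or several.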

### Exercise 2

Calculate the mean, median, and mode for the given distribution. Which of these measures does not describe the "middle" of this data set, and why?
```
100, 99, 97, 97, 96, 98, 95, 72
```

In [5]:
number_list = sorted([100, 99, 97, 97, 96, 98, 95, 72])

three_m(number_list)

# The outlier 72 pulls the mean (94.25) below every other value in the list, so the mean
# does not describe the "middle" of this data set; the median and mode (both 97) do.

{'mean': 94.25, 'median': 97.0, 'mode': 97}
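
To see how much a single outlier moves each measure, the data can be summarized with and without the value 72. This is a minimal sketch using the standard-library `statistics` module; the names `scores` and `without_outlier` are illustrative.

```python
import statistics

scores = [100, 99, 97, 97, 96, 98, 95, 72]
# Same data with the outlier removed
without_outlier = [score for score in scores if score != 72]

for label, values in (("with 72", scores), ("without 72", without_outlier)):
    print(label, statistics.mean(values), statistics.median(values))
```

Dropping 72 raises the mean by more than three points, while the median stays at 97. That sensitivity is why the mean is the measure that fails to describe the "middle" here.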

### Exercise 3
On the first three days after opening his bookshop, Joe sold 15, 18, and 16 books. He had hoped to sell 17 books every day. How many books does he need to sell on the next day to reach a mean of 17 books per day?

In [3]:
books_sold = [15, 16, 18]
# A mean of 17 over 4 days needs 17 * 4 = 68 books in total; subtract the 49 already sold
17 * (len(books_sold) + 1) - sum(books_sold)

19
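
The calculation above rearranges the definition of the mean: if `x` is the next value, then `(sum + x) / (n + 1) = target`, so `x = target * (n + 1) - sum`. A small sketch of that rearrangement follows; `required_next_value` is a hypothetical helper name, not part of the original solution.

```python
def required_next_value(values, target_mean):
    """Return the next value needed so the mean of all values equals target_mean."""
    required_total = target_mean * (len(values) + 1)
    return required_total - sum(values)

books_sold = [15, 18, 16]
next_day = required_next_value(books_sold, 17)
print(next_day)

# Check: appending the result gives the target mean
assert sum(books_sold + [next_day]) / (len(books_sold) + 1) == 17
```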

### Exercise 4
The histograms show the amount of time (hours per day) spent on Facebook by 46 middle school girls and 40 middle school boys from a school in San Francisco. A total of 50 boys and 50 girls took the survey; 4 girls and 10 boys did not use Facebook at all.
Each histogram uses a bin width of 0.25 hours.

![](boys.png)
![](girls.png)

Looking at these histograms, answer the following questions.

*Hint: For most parts, you will have to locate the required bins and count their frequencies.*

#### How many boys spend more than 1.5 hours/day on Facebook?


In [8]:
# Frequencies per 0.25-hour bin, starting at 0
# (the first entry matches the 10 boys and 4 girls who do not use Facebook)
boy_times = [10, 1, 6, 9, 4, 7, 5, 5, 3]
girl_times = [4, 1, 4, 3, 2, 2, 0, 5, 4, 3, 5, 1, 2, 1, 1, 6, 6]

# Bin keys in hours: 0.0, 0.25, ..., 4.0
keys = [i / 4 for i in range(len(girl_times))]

# Pad the boys' frequencies with zeros so both lists cover the same bins
boy_times.extend([0] * (len(girl_times) - len(boy_times)))

# Map each bin key to its frequency
boys = dict(zip(keys, boy_times))
girls = dict(zip(keys, girl_times))

# Sum the frequencies of the bins keyed above 1.5 hours
boy_count = sum(boys[key] for key in keys if key > 1.5)

boy_count

8

#### Compare the percentage of boys and girls that spend more than zero but less than 1 hour/day on Facebook.

In [9]:
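girl_count = 0
boy_count = 0

# Count the bins keyed 0.25 through 1.0. This assumes each key marks the upper edge
# of its bin, so the bin keyed 1.0 holds times up to 1 hour.
for key in keys:
    if 0 < key <= 1.0:
        girl_count += girls[key]
        boy_count += boys[key]

# Proportions of all 50 surveyed girls and all 50 surveyed boys
girl_percent = girl_count / 50
boy_percent = boy_count / 50

# 20% of the girls and 40% of the boys fall in this range
print(girl_percent)
print(boy_percent)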

0.2
0.4


#### Find the bin where the median of the boys' data set lies.

In [10]:
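# boy_times holds bin frequencies, not individual times, so find the first bin (key > 0)
# whose cumulative frequency reaches half of the 40 boys shown in the histogram.
next(key for key in keys if sum(boys[k] for k in keys if 0 < k <= key) >= 40 / 2)

1.0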
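
Because the histogram data are frequencies per bin rather than individual times, the bin containing the median can be located by accumulating frequencies until half of the observations are covered. The following is a minimal sketch under that approach; `median_bin` is a hypothetical helper, and the counts are the boys' frequencies used in this exercise, with each key taken as the label of its bin.

```python
def median_bin(freq_by_bin):
    """Return the key of the first bin whose cumulative frequency reaches half the total."""
    half = sum(freq_by_bin.values()) / 2
    cumulative = 0
    for key in sorted(freq_by_bin):
        cumulative += freq_by_bin[key]
        if cumulative >= half:
            return key
    raise ValueError("no observations")

# Boys' counts per 0.25-hour bin (key 0.0 holds the 10 non-users)
boy_counts = [10, 1, 6, 9, 4, 7, 5, 5, 3]
boys_by_bin = {i / 4: count for i, count in enumerate(boy_counts)}

# Restrict to the 40 boys shown in the histogram
users_only = {key: count for key, count in boys_by_bin.items() if key > 0}

print(median_bin(users_only))   # the 40 boys who use Facebook
print(median_bin(boys_by_bin))  # all 50 surveyed boys
```

Whether the 10 non-users are included changes which bin contains the median, so the population should be stated alongside the result.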

#### Based on the given data, what can you conclude about the Facebook usage habits of boys and girls?

In [16]:
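# boy_times and girl_times hold bin frequencies, so weight each bin key by its count
# to estimate the mean hours/day over all 50 respondents in each group.
print("Boys:", sum(key * boys[key] for key in keys) / 50)
print("Girls:", sum(key * girls[key] for key in keys) / 50)

# Boys are less likely to use Facebook (10 non-users versus 4), and their estimated mean usage
# (about 0.9 hours/day) is well below the girls' (about 2.1 hours/day). The boys' counts stop
# at the 2.0 bin, while the girls' extend to 4.0.

Boys: 0.9
Girls: 2.14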
