## Mean

The mean, also known as the average, is a fundamental concept in statistics. It is calculated by adding up all the values in a dataset and dividing that sum by the number of values. For a dataset with n values (x₁, x₂, ..., xₙ), the mean (μ) is:

μ = (x₁ + x₂ + ... + xₙ) / n

In sigma notation, the same formula is written as:

μ = (Σx) / n

Where:

- μ is the mean (average).
- Σx is the sum of all individual data points in the dataset.
- n is the number of data points in the dataset.

In [4]:
# Sample dataset
data = [85, 90, 78, 92, 50]

# Calculate the mean
mean = sum(data) / len(data)

# Print the result
print("Mean:", mean)

Mean: 79.0
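
As a cross-check on the manual calculation, the same result can be obtained from Python's standard library. This is a minimal sketch that assumes Python 3.8 or later, where `statistics.fmean` is available; the name `mean_check` is introduced here only for illustration.

```python
from statistics import fmean

# Same sample dataset as in the cell above
data = [85, 90, 78, 92, 50]

# fmean converts the values to floats and returns their arithmetic mean
mean_check = fmean(data)

print("Mean (statistics.fmean):", mean_check)
```

Both approaches divide the sum of the values by their count, so they should agree on this dataset.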


## Mean with Frequency

Suppose we have the following data representing the values and their corresponding frequencies:

| Value | Frequency |
|-------|-----------|
|  10   |     3     |
|  15   |     5     |
|  20   |     2     |
|  25   |     4     |


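The mean is the sum of each value multiplied by its frequency, divided by the total frequency:

Mean = (Σ(value * frequency)) / (Σfrequency)

For the table above:

Mean = (10*3 + 15*5 + 20*2 + 25*4) / (3 + 5 + 2 + 4) = 245 / 14 = 17.5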

In [5]:
# Data: Values and Frequencies
data = [
    (10, 3),
    (15, 5),
    (20, 2),
    (25, 4)
]

# Calculate the weighted sum and sum of frequencies
weighted_sum = sum(value * frequency for value, frequency in data)
total_frequency = sum(frequency for _, frequency in data)

# Calculate the mean
mean = weighted_sum / total_frequency

# Print the result
print("Mean:", mean)


Mean: 17.5
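
One way to see why the frequency formula works is to expand the table back into a plain list, repeating each value as many times as its frequency, and then take the ordinary mean. The sketch below does this with the same (value, frequency) pairs; the name `expanded` is an illustrative choice, and `statistics.fmean` assumes Python 3.8 or later.

```python
from statistics import fmean

# Same (value, frequency) pairs as in the table above
data = [(10, 3), (15, 5), (20, 2), (25, 4)]

# Repeat each value as many times as its frequency indicates
expanded = [value for value, frequency in data for _ in range(frequency)]

# The ordinary mean of the expanded list equals the frequency-weighted mean
expanded_mean = fmean(expanded)

print("Expanded data:", expanded)
print("Mean:", expanded_mean)
```

The expanded list has 14 entries, matching the total frequency, and its mean agrees with the weighted calculation.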


## Median
The median is a statistical measure that represents the middle value of a dataset when it's arranged in ascending or descending order. In other words, it's the value that separates the higher half from the lower half of the data.

The median is particularly useful when a dataset may contain outliers or is not symmetrically distributed (for example, housing prices, exam scores, or age distributions).
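
To illustrate the effect of an outlier, the sketch below reuses the sample dataset from the Mean section, where 50 sits well below the other values, and compares the mean with the median. The second list, `more_extreme`, is a hypothetical variation made up for this illustration: it replaces 50 with 5 to show how each measure reacts.

```python
from statistics import mean, median

# Sample dataset from the Mean section; 50 is far below the other values
scores = [85, 90, 78, 92, 50]

# Hypothetical variation: make the low value even more extreme
more_extreme = [85, 90, 78, 92, 5]

print("Mean:", mean(scores), "Median:", median(scores))
print("Mean:", mean(more_extreme), "Median:", median(more_extreme))
```

Making the outlier more extreme lowers the mean from 79 to 70, while the median stays at 85 because the middle value of the sorted data does not change.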


To find the median:

- Sort the data: arrange the dataset in ascending or descending order.
- Odd number of values: the median is the middle value.
- Even number of values: the median is the average of the two middle values.


In [8]:
# Odd number of values - 7 numbers
dataset_odd = [7, 2, 1, 6, 4, 5, 3]
sorted_data = sorted(dataset_odd)

# Get the value at the middle index of the sorted data
median_odd = sorted_data[len(sorted_data) // 2]
print("Median:", median_odd)

Median: 4


In [10]:
def calculate_median(data):
    sorted_data = sorted(data)
    print("Sorted data:", sorted_data)
    n = len(sorted_data)

    # Odd number of values: the median is the single middle value
    if n % 2 == 1:
        return sorted_data[n // 2]

    # Even number of values: find the indices of the two middle values
    middle_index1 = n // 2 - 1
    middle_index2 = n // 2

    # Get the values at the middle indices
    middle_value1 = sorted_data[middle_index1]
    middle_value2 = sorted_data[middle_index2]

    # Calculate the median by averaging the two middle values
    median = (middle_value1 + middle_value2) / 2

    return median

# Example dataset with an even number of values
dataset_even = [8, 2, 1, 6, 4, 5, 3, 7]

median_even = calculate_median(dataset_even)
print("Median:", median_even)


Sorted data: [1, 2, 3, 4, 5, 6, 7, 8]
Median: 4.5
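
The manual results above can be cross-checked against `statistics.median` from Python's standard library, which handles both the odd and the even case. This is a small sketch using the same two datasets; the `_check` variable names are illustrative.

```python
from statistics import median

# Same datasets as in the cells above
dataset_odd = [7, 2, 1, 6, 4, 5, 3]
dataset_even = [8, 2, 1, 6, 4, 5, 3, 7]

# median() sorts a copy of the data internally, so the inputs can be unsorted
median_odd_check = median(dataset_odd)
median_even_check = median(dataset_even)

print("Median (odd):", median_odd_check)
print("Median (even):", median_even_check)
```

For the odd-length dataset the function returns the middle value, and for the even-length dataset it averages the two middle values, matching the steps described earlier.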
