# <a id='toc1_'></a>[Measures of Central Tendency](#toc0_)

A measure of central tendency is a single value that attempts to describe a set of data by identifying the central position within that set of data. As such, measures of central tendency are sometimes called measures of central location. They are also classified as summary statistics. 

The goal of a measure of central tendency is to provide a summary of the data with a single number that represents the center or middle of its distribution. 

The most common measures of central tendency are:

1. **Mean (Arithmetic Mean):** The sum of all the data entries divided by the number of entries. This is the most common measure of central tendency.

2. **Median:** The middle value when the data entries are arranged in ascending (or descending) order. If there is an even number of observations, the median is the average of the two middle numbers.

3. **Mode:** The most frequently occurring value in a data set.

Other measures of central tendency include the geometric mean, harmonic mean, weighted mean, and midrange. Each is appropriate in different situations, depending on the nature of the data and the context of the analysis.

**Table of contents**<a id='toc0_'></a>    
- [Measures of Central Tendency](#toc1_)    
- [Most commonly used Measures of Central Tendency](#toc2_)    
  - [**Arithmetic Mean**](#toc2_1_)    
  - [**Median**](#toc2_2_)    
  - [**Mode**](#toc2_3_)    
- [Less Common Arithmetic Measures of Central Tendency](#toc3_)    
  - [**Weighted Mean**](#toc3_1_)    
  - [**Midrange**](#toc3_2_)    
  - [**Truncated Mean (or Trimmed Mean)**](#toc3_3_)    
  - [**Winsorized Mean**](#toc3_4_)    
- [Less Common Geometric Measures of Central Tendency](#toc4_)    
  - [**Geometric Mean**](#toc4_1_)    
  - [**Geometric Median**](#toc4_2_)    
  - [**Harmonic Mean**](#toc4_3_)    
  - [**Quadratic Mean (or Root Mean Square)**](#toc4_4_)    

<!-- vscode-jupyter-toc-config
	numbering=false
	anchor=true
	flat=false
	minLevel=1
	maxLevel=6
	/vscode-jupyter-toc-config -->
<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->

# <a id='toc2_'></a>[Most commonly used Measures of Central Tendency](#toc0_)

## <a id='toc2_1_'></a>[**Arithmetic Mean**](#toc0_)

**Definition:** The arithmetic mean, often simply called the mean, is the average of a set of numbers. It is calculated by adding up all the numbers in the set and then dividing by the count of numbers.

**Formula:** If we have a set of numbers $x_1, x_2, ..., x_n$, the arithmetic mean is given by:

$$
\text{Arithmetic Mean} = \frac{x_1 + x_2 + ... + x_n}{n}
$$

**Intuition:** The arithmetic mean represents a "flattening out" of the numbers. It is the single value that, if every number in the set were replaced by it, would leave the total of the set unchanged.

**Use Case:** The arithmetic mean is used in a wide range of applications, from basic mathematics and statistics to economics and data analysis. It provides a simple measure of central tendency, giving a quick sense of the "middle" of a data set.

**Cautions:** The arithmetic mean can be heavily influenced by outliers. A single very large or very small value can significantly affect the mean, potentially giving a misleading impression of the data.

**Assumptions:** The arithmetic mean assumes that all numbers in the set are equally important and contribute equally to the total.

**Example:** Consider the numbers 6, 11, and 7. The arithmetic mean is calculated as follows:

1. Add the numbers: 6 + 11 + 7 = 24
2. Divide by the count of numbers (there are 3): 24 / 3 = 8

So, the arithmetic mean of 6, 11, and 7 is 8.
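
The same calculation can be sketched in Python. The helper name `arithmetic_mean` is an illustrative choice, not part of any library; `statistics.mean` is the standard-library equivalent and is included only as a cross-check.

```python
from statistics import mean


def arithmetic_mean(values):
    """Sum of the values divided by how many values there are."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


data = [6, 11, 7]
manual = arithmetic_mean(data)   # (6 + 11 + 7) / 3
library = mean(data)             # standard-library cross-check
print(manual, library)
```

Both values should agree with the worked example above.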




## <a id='toc2_2_'></a>[**Median**](#toc0_)

**Definition:** The median is the middle value in a data set when the values are arranged in ascending or descending order. If there is an even number of values, the median is the average of the two middle values.

**Formula:** For a set of numbers $x_1, x_2, ..., x_n$ arranged in ascending order, the median is given by:

- $x_{(n+1)/2}$ when n is odd
- $(x_{n/2} + x_{n/2 + 1}) / 2$ when n is even

**Intuition:** The median represents the "middle" of a data set, with half of the values being less than or equal to the median and half being greater than or equal to it. It is far less affected by outliers or extreme values than the mean.

**Use Case:** The median is often used in situations where the data may be skewed or have outliers. It provides a measure of central tendency that is more robust to these issues than the mean.

**Cautions:** The median depends only on the order of the values and on the one or two values in the middle; it ignores how far the other data points lie from the center. Therefore, it may not accurately represent the data if the values are widely spread or have multiple modes.

**Assumptions:** The median assumes that the data can be ordered, that is, that the values are at least ordinal. It does not assume that all values are equally important.

**Example:** Consider the numbers 3, 5, and 8. Arranged in ascending order, they are 3, 5, 8. As there are 3 numbers, which is odd, the median is the middle number, which is 5.
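
A minimal Python sketch of the odd and even cases from the formula above follows. The function name `median_of` is hypothetical, and the indices are shifted by one because Python lists are 0-based; `statistics.median` is the standard-library equivalent.

```python
from statistics import median


def median_of(values):
    """Middle value of the sorted data; mean of the two middle values if n is even."""
    if not values:
        raise ValueError("the median of an empty data set is undefined")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: x_{(n+1)/2} in 1-based notation is ordered[mid] in 0-based indexing.
        return ordered[mid]
    # Even n: average x_{n/2} and x_{n/2 + 1}, i.e. ordered[mid - 1] and ordered[mid].
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_case = median_of([8, 3, 5])        # sorted: 3, 5, 8
even_case = median_of([8, 3, 5, 10])   # sorted: 3, 5, 8, 10
print(odd_case, even_case, median([8, 3, 5, 10]))
```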



## <a id='toc2_3_'></a>[**Mode**](#toc0_)

**Definition:** The mode is the value that appears most frequently in a data set. A data set may have one mode, more than one mode, or no mode at all.

**Formula:** The mode is the value $x_i$ that maximizes the frequency function $f(x_i)$, where $f(x_i)$ is the number of times $x_i$ appears in the data set.

**Intuition:** The mode represents the most common or popular value in a data set. It can be used to identify the most typical value.

**Use Case:** The mode is often used in situations where the most common value is of interest. This could be in market research, where the most popular product is important, or in voting, where the most popular candidate wins.

**Cautions:** The mode is not a robust measure of central tendency: a small change in the data can change which value is the most frequent. If the data set has multiple modes or no mode, it may not provide a useful measure of central tendency.

**Assumptions:** The mode does not assume that the data is ordered or that all values are equally important.

**Example:** Consider the numbers 3, 5, 5, and 8. The number 5 appears twice, which is more frequently than any other number, so the mode is 5.
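
As a rough sketch, the mode can be found in Python by counting occurrences. The helper `modes` is a hypothetical name; it returns every value tied for the highest count, so a data set in which all values occur once returns all of them (a convention chosen here, whereas the text above calls that case "no mode"). `statistics.multimode` (Python 3.8+) follows the same convention.

```python
from collections import Counter
from statistics import multimode


def modes(values):
    """Return all values that share the highest frequency, in first-seen order."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]


single = modes([3, 5, 5, 8])        # one mode
tied = modes([1, 1, 2, 2, 3])       # two modes (bimodal)
print(single, tied, multimode([3, 5, 5, 8]))
```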

# <a id='toc3_'></a>[Less Common Arithmetic Measures of Central Tendency](#toc0_)

## <a id='toc3_1_'></a>[**Weighted Mean**](#toc0_)

**Definition:** The weighted mean, also known as the weighted average, is a measure of central tendency that assigns different weights to different values in the data set. These weights represent the importance or frequency of the values.

**Formula:** For a set of numbers $x_1, x_2, ..., x_n$ with corresponding weights $w_1, w_2, ..., w_n$, the weighted mean is given by:

$$
\text{Weighted Mean} = \frac{w_1x_1 + w_2x_2 + ... + w_nx_n}{w_1 + w_2 + ... + w_n}
$$

**Intuition:** The weighted mean gives more importance to some values than to others. It is useful when the values should not all contribute equally to the average.

**Use Case:** The weighted mean is often used in situations where some data points contribute more to the final average than others. This could be in grading systems, where different assignments have different weights, or in finance, where different investments have different levels of risk.

**Cautions:** The weighted mean depends on the weights assigned to the values. If the weights do not accurately reflect the importance or frequency of the values, the weighted mean can be misleading.

**Assumptions:** The weighted mean assumes that the weights accurately represent the importance or frequency of the values.

**Example:** Consider the numbers 3, 5, and 8 with weights 2, 3, and 5 respectively. The weighted mean is calculated as follows:

1. Calculate the sum of the products of the numbers and their weights: 2 × 3 + 3 × 5 + 5 × 8 = 6 + 15 + 40 = 61
2. Calculate the sum of the weights: 2 + 3 + 5 = 10
3. Divide the sum of the products by the sum of the weights: 61 / 10 = 6.1

So, the weighted mean of 3, 5, and 8 with weights 2, 3, and 5 is 6.1.
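
The steps above can be sketched as a small Python function. The name `weighted_mean` and its error handling are illustrative assumptions rather than a reference implementation.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    weighted_sum = sum(w * x for x, w in zip(values, weights))
    return weighted_sum / total_weight


result = weighted_mean([3, 5, 8], [2, 3, 5])   # 61 / 10
print(result)
```

With equal weights the function reduces to the ordinary arithmetic mean.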




## <a id='toc3_2_'></a>[**Midrange**](#toc0_)

**Definition:** The midrange is a measure of central tendency that is the arithmetic mean of the maximum and minimum values of a data set.

**Formula:** Midrange = (Maximum value + Minimum value) / 2

**Intuition:** The midrange gives the midpoint of the extreme values in a dataset.

**Use Case:** The midrange is sometimes used as a quick estimate of central tendency, for example in quality control processes, because it requires only the maximum and minimum values.

**Cautions:** The midrange is very sensitive to outliers and may not represent the data well if the dataset is skewed.

**Assumptions:** Assumes that the maximum and minimum values are representative of the data set.

**Real World Example:** In weather forecasting, the midrange might be used to give an average daily temperature forecast, calculated from the day's predicted high and low temperatures.
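
A minimal Python sketch of the midrange follows. The temperatures are made-up values for illustration, and the second call shows how strongly a single outlier moves the result.

```python
def midrange(values):
    """Arithmetic mean of the smallest and largest values."""
    if not values:
        raise ValueError("the midrange of an empty data set is undefined")
    return (min(values) + max(values)) / 2


# Hypothetical forecast: low of 12 degrees and high of 24 degrees.
daily = midrange([12, 24])

# One extreme value drags the midrange far from the bulk of the data.
with_outlier = midrange([3, 5, 8, 100])
print(daily, with_outlier)
```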



## <a id='toc3_3_'></a>[**Truncated Mean (or Trimmed Mean)**](#toc0_)

**Definition:** The truncated mean or trimmed mean involves calculating the mean after discarding given parts of a probability distribution or sample at the high and low end, and typically discarding an equal amount of both.

**Formula:** The formula depends on the percentage of observations you decide to trim. If you decide to trim 10% of observations from both ends, you would discard the lowest 10% and highest 10% of values, then calculate the mean of the remaining values.

**Intuition:** The truncated mean helps to reduce the impact of outliers on the calculated mean.

**Use Case:** Truncated mean is often used in sports events, where the highest and lowest scores are discarded to prevent bias. It's also used in robust estimation of location parameters.

**Cautions:** The choice of the percentage to trim can be arbitrary and may impact the result.

**Assumptions:** Assumes that the outliers are at the high and low ends of the data set.

**Real World Example:** In figure skating, the judges' scores often use a truncated mean. The highest and lowest scores are discarded and the mean of the remaining scores determines the result.
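
One way to sketch the trimming procedure in Python is shown below. The function name `trimmed_mean`, the integer `percent` parameter, and the choice to round the number of discarded values down are all assumptions made for this example; the scores are invented.

```python
def trimmed_mean(values, percent):
    """Mean after dropping `percent` % of the values from each end (rounded down)."""
    if not values:
        raise ValueError("the trimmed mean of an empty data set is undefined")
    if not 0 <= percent < 50:
        raise ValueError("percent must be at least 0 and below 50")
    ordered = sorted(values)
    k = len(ordered) * percent // 100      # values to discard at each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


# Hypothetical data with one extreme value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
plain = sum(data) / len(data)              # ordinary mean, pulled up by 100
trimmed = trimmed_mean(data, 10)           # drops 1 and 100

# Five hypothetical judges: 20 % per end removes the lowest and highest score.
judges = trimmed_mean([5.5, 5.0, 6.0, 9.5, 6.0], 20)
print(plain, trimmed, judges)
```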




## <a id='toc3_4_'></a>[**Winsorized Mean**](#toc0_)

**Definition:** The Winsorized mean is a measure of central tendency which involves modifying the data by replacing extreme values with certain percentiles (the "trimming" points) before calculating the mean.

**Formula:** The formula depends on the percentage of observations you decide to Winsorize. If you decide to Winsorize 10% of observations from both ends, you would replace the lowest 10% of values with the smallest of the remaining values, and replace the highest 10% of values with the largest of the remaining values, then calculate the mean of the resulting values.

**Intuition:** The Winsorized mean is a way to reduce the influence of outliers on the mean.

**Use Case:** Winsorized mean is often used in robust statistical techniques, particularly in situations where you want to resist outliers but do not want to entirely discard data points.

**Cautions:** The choice of the percentage to Winsorize can be arbitrary and may impact the result.

**Assumptions:** Assumes that the outliers are at the high and low ends of the data set.

**Real World Example:** In economics, the Winsorized mean can be used to calculate average income or wealth, as these distributions are often skewed by a small number of very high values.
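
The following Python sketch illustrates Winsorizing under the same assumptions as a simple trimming scheme: an integer `percent` per end, rounded down. The function name `winsorized_mean` and the data are hypothetical.

```python
def winsorized_mean(values, percent):
    """Mean after clamping `percent` % of the values at each end to the nearest kept value."""
    if not values:
        raise ValueError("the Winsorized mean of an empty data set is undefined")
    if not 0 <= percent < 50:
        raise ValueError("percent must be at least 0 and below 50")
    ordered = sorted(values)
    n = len(ordered)
    k = n * percent // 100                 # values to replace at each end
    if k > 0:
        low = ordered[k]                   # smallest value that is kept
        high = ordered[n - k - 1]          # largest value that is kept
        ordered[:k] = [low] * k
        ordered[n - k:] = [high] * k
    return sum(ordered) / n


# Hypothetical data: 2 is replaced by 4 and 50 is replaced by 8.
data = [2, 4, 5, 5, 6, 6, 7, 7, 8, 50]
plain = sum(data) / len(data)
winsorized = winsorized_mean(data, 10)
print(plain, winsorized)
```

Unlike trimming, the number of data points stays the same; only the extreme values are pulled inward.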


# <a id='toc4_'></a>[Less Common Geometric Measures of Central Tendency](#toc0_)


## <a id='toc4_1_'></a>[**Geometric Mean**](#toc0_)

**Definition:** The geometric mean is a type of average where we multiply the numbers together and then take a root (square root for two numbers, cube root for three numbers, etc.). 

**Formula:** For a set of numbers $x_1, x_2, ..., x_n$, the geometric mean is given by:

$$
\text{Geometric Mean} = \sqrt[n]{x_1 \times x_2 \times ... \times x_n}
$$

**Intuition:** The geometric mean averages on a multiplicative scale, which can be useful when comparing quantities with very different ranges. It is the value that, if each of the n numbers in the set were replaced by it, would leave the product of the set unchanged.

**Use Case:** The geometric mean is often used in finance and investing, where it can give a more accurate picture of returns over time than the arithmetic mean. It's also used in geometry and in situations where the numbers being averaged are ratios or rates.

**Cautions:** Like the arithmetic mean, the geometric mean can be influenced by outliers. Because it involves multiplication rather than addition, it is less affected by very large values than the arithmetic mean, but it is strongly pulled down by values close to zero, and a single zero makes the geometric mean zero.

**Assumptions:** The geometric mean assumes that the numbers in the set are all positive. It also assumes that the numbers are all of the same unit or dimension, since it wouldn't make sense to multiply numbers of different dimensions together.

**Example:** Consider the numbers 2 and 18. The geometric mean is calculated as follows:

1. First, we multiply them: 2 × 18 = 36
2. Then, as there are two numbers, we take the square root: √36 = 6

So, the geometric mean of 2 and 18 is 6.
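
A possible Python sketch of this calculation is given below. The helper `geo_mean` is a hypothetical name; it averages logarithms instead of multiplying directly, which is mathematically equivalent for positive numbers and avoids overflow on long inputs. `statistics.geometric_mean` (Python 3.8+) is used as a cross-check.

```python
import math
from statistics import geometric_mean


def geo_mean(values):
    """n-th root of the product, computed as exp(mean of the logarithms)."""
    if not values:
        raise ValueError("the geometric mean of an empty data set is undefined")
    if any(x <= 0 for x in values):
        raise ValueError("all values must be positive")
    return math.exp(sum(math.log(x) for x in values) / len(values))


manual = geo_mean([2, 18])            # square root of 36
library = geometric_mean([2, 18])     # standard-library cross-check
print(manual, library)
```

Because of floating-point rounding, the printed results may differ from 6 in the last digits.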




## <a id='toc4_2_'></a>[**Geometric Median**](#toc0_)

**Definition:** The geometric median is a generalization of the median to higher dimensions. It is a point that minimizes the sum of distances to the sample points.

**Formula:** For sample points $x_1, x_2, ..., x_n$, the geometric median is the point $y$ that minimizes $\sum_{i=1}^{n} \lVert x_i - y \rVert$. There is no closed-form solution in general; it is typically computed using iterative methods such as Weiszfeld's algorithm.

**Intuition:** The geometric median gives a central point in multi-dimensional data.

**Use Case:** The geometric median is used in multivariate statistics and in the social sciences.

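**Cautions:** The geometric median is not always unique: when all points lie on a single line and their number is even, every point between the two middle points can minimize the sum of distances. It also has to be approximated numerically, so the result depends on the stopping tolerance of the iterative method.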

**Assumptions:** Assumes that the data points are in a metric space (usually Euclidean).

**Real World Example:** In location science, the problem of locating a facility to minimize the sum of Euclidean distances to the clients is called the Fermat-Weber problem, and the solution is given by the geometric median.
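
As an illustration of the iterative approach, here is a simplified sketch of Weiszfeld's algorithm in plain Python. The function names, the centroid starting point, the tolerances, and the way a zero distance is handled (the coinciding point is skipped) are simplifying assumptions for this example, not a production-grade implementation.

```python
import math


def _distance(p, q):
    """Euclidean distance between two points given as coordinate sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def geometric_median(points, tol=1e-10, max_iter=1000):
    """Approximate the point minimizing the sum of Euclidean distances to `points`."""
    if not points:
        raise ValueError("at least one point is required")
    dim = len(points[0])
    # Start from the centroid (coordinate-wise arithmetic mean).
    current = [sum(p[i] for p in points) / len(points) for i in range(dim)]
    for _ in range(max_iter):
        numerator = [0.0] * dim
        denominator = 0.0
        for p in points:
            d = _distance(p, current)
            if d < 1e-12:
                continue  # simplification: skip a point that coincides with the estimate
            weight = 1.0 / d
            denominator += weight
            for i in range(dim):
                numerator[i] += weight * p[i]
        if denominator == 0.0:
            return current  # every point coincides with the estimate
        updated = [v / denominator for v in numerator]
        if _distance(updated, current) < tol:
            return updated
        current = updated
    return current


# Hypothetical client locations; for a convex quadrilateral the geometric
# median is where the diagonals cross, here (0.5, 0.5).
clients = [(0, 0), (1, 0), (10, 10), (0, 1)]
site = geometric_median(clients)
print(site)
```

Note how the distant point (10, 10) pulls the centroid to (2.75, 2.75) but barely moves the geometric median.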



## <a id='toc4_3_'></a>[**Harmonic Mean**](#toc0_)

**Definition:** The harmonic mean is the reciprocal of the average of the reciprocals of a set of numbers. 

**Formula:** For a set of numbers $x_1, x_2, ..., x_n$, the harmonic mean is given by:

$$
\text{Harmonic Mean} = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + ... + \frac{1}{x_n}}
$$

**Intuition:** The harmonic mean is a kind of average that is useful when dealing with rates or ratios. It gives more weight to smaller values, so it can be useful when you want to find an average rate that takes into account both fast and slow rates.

**Use Case:** The harmonic mean is often used in situations where you are averaging rates or ratios. For example, if you are calculating average speed and you travel the same distance at different speeds, the harmonic mean gives the correct average speed.

**Cautions:** The harmonic mean can be heavily influenced by small values. A single very small value can significantly lower the harmonic mean. Also, it's undefined if any of the values are zero.

**Assumptions:** The harmonic mean assumes that all numbers in the set are positive. It also assumes that the numbers are all of the same unit or dimension.

**Example:** Consider the numbers 1, 2, and 4. The harmonic mean is calculated as follows:

1. First, calculate the reciprocals: 1/1 = 1, 1/2 = 0.5, 1/4 = 0.25
2. Then, find the average of the reciprocals: (1 + 0.5 + 0.25) / 3 = 0.5833...
3. Finally, take the reciprocal of that average: 1 / 0.5833... = 1.714...

So, the harmonic mean of 1, 2, and 4 is approximately 1.714.
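
The same steps can be sketched in Python. The helper name `harm_mean` and the speeds in the second example are illustrative assumptions; `statistics.harmonic_mean` is the standard-library equivalent.

```python
from statistics import harmonic_mean


def harm_mean(values):
    """Number of values divided by the sum of their reciprocals."""
    if not values:
        raise ValueError("the harmonic mean of an empty data set is undefined")
    if any(x <= 0 for x in values):
        raise ValueError("all values must be positive")
    return len(values) / sum(1 / x for x in values)


manual = harm_mean([1, 2, 4])          # 3 / 1.75
library = harmonic_mean([1, 2, 4])     # standard-library cross-check

# Hypothetical trip: the same distance driven at 40 km/h and then at 60 km/h.
average_speed = harm_mean([40, 60])    # about 48 km/h, not the arithmetic 50
print(manual, library, average_speed)
```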


## <a id='toc4_4_'></a>[**Quadratic Mean (or Root Mean Square)**](#toc0_)

**Definition:** The quadratic mean (also known as the root mean square) is a statistical measure of the magnitude of a set of numbers. It gives a sense for the "typical" magnitude of the numbers.

**Formula:** For a set of numbers $x_1, x_2, ..., x_n$, the quadratic mean is given by:

$$
\text{Quadratic Mean} = \sqrt{\frac{x_1^2 + x_2^2 + ... + x_n^2}{n}}
$$

**Intuition:** The quadratic mean is a measure of the magnitude of the numbers in a data set and is always greater than or equal to the arithmetic mean.

**Use Case:** The quadratic mean is often used in physics and electrical engineering, where it is used to calculate the root mean square current or voltage in an alternating current (AC) circuit.

**Cautions:** The quadratic mean is sensitive to outliers, as it squares each value before averaging.

**Assumptions:** Assumes that all numbers in the set are real numbers.

**Real World Example:** In physics, the RMS value is used to find the effective or DC-equivalent voltage or current of an AC signal.
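
To make the formula concrete, the sketch below computes the quadratic mean in Python and applies it to a sampled sine wave. The function name, the peak value of 10, and the sample count are assumptions chosen for illustration; for a sine wave sampled evenly over one full period, the result should be close to the peak divided by √2.

```python
import math


def quadratic_mean(values):
    """Square root of the mean of the squared values (root mean square)."""
    if not values:
        raise ValueError("the quadratic mean of an empty data set is undefined")
    return math.sqrt(sum(x * x for x in values) / len(values))


# Hypothetical AC signal: one full period of a sine wave with a peak of 10.
peak = 10.0
n_samples = 1000
samples = [peak * math.sin(2 * math.pi * k / n_samples) for k in range(n_samples)]
rms = quadratic_mean(samples)

# Signs do not cancel: the arithmetic mean of [1, -1] is 0, but its RMS is 1.
symmetric = quadratic_mean([1, -1])
print(rms, peak / math.sqrt(2), symmetric)
```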