------------
### Winsorized mean
- The `Winsorized mean` is a variation of the arithmetic mean that reduces the impact of outliers in a dataset by `capping`, or "winsorizing", extreme values. It is closely related to, but not the same as, the trimmed mean.

- Instead of discarding outliers (as a trimmed mean does) or switching to a different robust statistic such as the median, the Winsorized mean `replaces` each value beyond a chosen cutoff with the cutoff itself.
- This limits the influence of outliers while still using every observation, so the sample size is unchanged.
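
For reference, one common textbook form can be written as follows, assuming the sample is sorted as $x_{(1)} \le \dots \le x_{(n)}$ and $k$ values are winsorized in each tail. The symbols $k$ and $\bar{x}_W$ are notation introduced here for illustration, not names used by the code in this section.

$$
\bar{x}_W = \frac{1}{n}\left[(k+1)\,x_{(k+1)} + \sum_{i=k+2}^{n-k-1} x_{(i)} + (k+1)\,x_{(n-k)}\right]
$$

Percentile-based implementations can differ slightly from this form, because an interpolated percentile need not coincide with an observed value.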

In [6]:
import numpy as np

In [7]:
# Define a dataset with outliers
data = [12, 15, 18, 21, 50, 55, 60, 70, 75, 200]

In [8]:
# Set the fraction to winsorize in each tail (0.10 = 10% per tail)
lower_percentile = 0.10
upper_percentile = 0.10

In [9]:
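# Calculate the cutoffs: the 10th and 90th percentiles of the data.
# np.percentile interpolates linearly by default, so a cutoff may fall
# between two observations rather than on an observed value.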
lower_cutoff = np.percentile(data,      lower_percentile  * 100)
upper_cutoff = np.percentile(data, (1 - upper_percentile) * 100)
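
To see where these cutoffs come from, here is a minimal sketch that reproduces the linear-interpolation rule in plain Python. It assumes the default (`linear`) method of `np.percentile`. The helper name `linear_percentile` is hypothetical, introduced only for illustration.

```python
import math

def linear_percentile(values, q):
    """Percentile q (0-100) using linear interpolation between closest ranks.

    Hypothetical helper; assumed to mirror np.percentile's default method.
    """
    ordered = sorted(values)
    pos = (q / 100) * (len(ordered) - 1)  # fractional index into the sorted data
    lo = math.floor(pos)                  # index of the observation just below
    hi = math.ceil(pos)                   # index of the observation just above
    frac = pos - lo                       # how far to move from lo towards hi
    return ordered[lo] + frac * (ordered[hi] - ordered[lo])

data = [12, 15, 18, 21, 50, 55, 60, 70, 75, 200]

# 10th percentile: index 0.9, i.e. 90% of the way from 12 to 15 (about 14.7)
print(linear_percentile(data, 10))
# 90th percentile: index 8.1, i.e. 10% of the way from 75 to 200 (about 87.5)
print(linear_percentile(data, 90))
```

Neither cutoff is an observed value, which is why the clipped data later contains numbers that were not in the original list.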

In [10]:
# Winsorize the data with np.clip: values below lower_cutoff are raised to
# lower_cutoff, values above upper_cutoff are lowered to upper_cutoff,
# and everything in between is left unchanged.
winsorized_data = np.clip(data, lower_cutoff, upper_cutoff)

In [11]:
# Calculate the Winsorized mean
winsorized_mean = np.mean(winsorized_data)

In [12]:
# Print the result
print("Winsorized Mean:", winsorized_mean)

Winsorized Mean: 46.61999999999999
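
The result above comes from clipping at interpolated percentiles. A common alternative replaces the `k` smallest and `k` largest observations with their nearest retained neighbours, so every capped value is one that actually occurs in the data. The following is a minimal sketch of that rank-based variant, assuming `k = int(proportion * n)`. The function name `rank_winsorized_mean` is hypothetical and not part of any library.

```python
def rank_winsorized_mean(values, proportion):
    """Mean after capping the k lowest and k highest values,
    where k = int(proportion * n), at their nearest retained neighbours.

    Hypothetical helper for illustration only.
    """
    ordered = sorted(values)
    n = len(ordered)
    k = int(proportion * n)  # number of values capped in each tail
    if 2 * k >= n:
        raise ValueError("proportion too large for this sample size")
    low, high = ordered[k], ordered[n - k - 1]  # observed values used as caps
    capped = [min(max(v, low), high) for v in ordered]
    return sum(capped) / n

data = [12, 15, 18, 21, 50, 55, 60, 70, 75, 200]

print("Plain mean:", sum(data) / len(data))                             # 57.6
print("Rank-based Winsorized mean:", rank_winsorized_mean(data, 0.10))  # 45.4
```

With this dataset the rank-based variant caps 12 at 15 and 200 at 75, giving 45.4. That is slightly lower than the percentile-based 46.62, because the interpolated upper cutoff of 87.5 lies above the largest retained observation.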
