# Measures of shape - Skewness and Kurtosis

## Skewness



Skewness is the __degree of distortion (a measure of asymmetry)__ relative to a symmetrical bell curve such as the normal distribution. It measures the lack of symmetry in a data distribution and indicates whether extreme values sit in one tail rather than the other. A symmetrical distribution has a skewness of 0.
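As a rough illustration of the definition above, the sketch below computes the moment coefficient of skewness (third central moment divided by the second central moment raised to the power 1.5) using only the standard library. The function name `skewness` and the sample lists are made up for this example. Library routines may apply a bias correction, so their results can differ slightly on small samples.

```python
from statistics import fmean


def skewness(values):
    """Biased (population) moment coefficient of skewness: m3 / m2**1.5."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical samples, chosen only to show the sign of the result.
symmetric = [1, 2, 3, 4, 5, 6, 7]
right_tailed = [1, 2, 2, 3, 3, 3, 20]       # one large value stretches the right tail
left_tailed = [-x for x in right_tailed]    # mirror image: long left tail

print("symmetric   :", skewness(symmetric))     # zero for a symmetric sample
print("right tailed:", skewness(right_tailed))  # positive
print("left tailed :", skewness(left_tailed))   # negative
```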

There are two types of Skewness: Positive and Negative.

![](1_skewness.PNG)

__Positive Skewness__ is when the tail on the right side of the distribution is longer or fatter than the tail on the left. For a typical unimodal distribution, the mean and median will be greater than the mode.  
__Negative Skewness__ is when the tail on the left side of the distribution is longer or fatter than the tail on the right. For a typical unimodal distribution, the mean and median will be less than the mode.
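The ordering of mean, median and mode can be checked on a small sample. The numbers below are invented for illustration (`right_skewed` is not real data), and the ordering is typical of unimodal data rather than guaranteed for every data set.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed sample: most values are small, a few are large.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]

# Mirror the sample around 8 to get a left-skewed version of the same shape.
left_skewed = [16 - x for x in right_skewed]

for name, data in [("right-skewed", right_skewed), ("left-skewed", left_skewed)]:
    print(name, "mean:", mean(data), "median:", median(data), "mode:", mode(data))

# right-skewed: mode (2) < median (3) < mean (4.6)
# left-skewed : mean (11.4) < median (13) < mode (14)
```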

### So, when is the skewness too much?
The rule of thumb seems to be:
- If the skewness is between -0.5 and 0.5, the data are fairly symmetrical.
- If the skewness is between -1 and -0.5 (negatively skewed) or between 0.5 and 1 (positively skewed), the data are moderately skewed.
- If the skewness is less than -1 (negatively skewed) or greater than 1 (positively skewed), the data are highly skewed.
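The rule of thumb above can be written as a small helper. This is only a sketch: the function name `describe_skewness` is hypothetical, and assigning the boundary values (exactly 0.5 or 1) to the lower band is an assumption, since the rule does not say which side they belong to.

```python
def describe_skewness(skew):
    """Label a skewness value using the rule-of-thumb thresholds 0.5 and 1."""
    magnitude = abs(skew)
    if magnitude <= 0.5:
        return "fairly symmetrical"
    direction = "positively" if skew > 0 else "negatively"
    if magnitude <= 1:
        return f"moderately skewed ({direction})"
    return f"highly skewed ({direction})"


for value in (0.2, -0.7, 1.8):
    print(value, "->", describe_skewness(value))
```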

![](1_skew_right.PNG)


> If the peak of the distribution is to the left of the average value, the distribution is positively skewed. This would mean that many houses were being sold for less than the average value.

> If the peak of the distribution is to the right of the average value, the distribution is negatively skewed. This would mean that many houses were being sold for more than the average value.

![](1_skewness_code.PNG)

![](1_skewness_log.PNG)

Skewed data can reduce the performance of machine learning models: __it affects the regression intercept and the coefficients associated with the model.__ Therefore, we apply a log transformation to reduce the skewness.

A transformation may be used to reduce skewness.  A distribution that is symmetric or nearly so is often easier to handle and
interpret than a skewed distribution. More specifically, a normal or Gaussian distribution is often regarded as ideal as it is assumed by many statistical methods.

> To reduce right skewness, take roots or logarithms or reciprocals (roots are weakest). This is the commonest problem in practice.

> To reduce left skewness, take squares or cubes or higher powers.
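The effect of these transformations can be seen on simulated data. The sketch below is illustrative only: the samples are synthetic (a lognormal sample for right skew and a Beta(5, 1) sample for left skew, both chosen as assumptions), and `skewness` is a hypothetical helper using the moment definition. How much a given transformation helps depends on the data.

```python
import math
import random
from statistics import fmean


def skewness(values):
    """Biased (population) moment coefficient of skewness: m3 / m2**1.5."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Right-skewed, strictly positive sample (assumed lognormal for illustration).
right_skewed = [rng.lognormvariate(0, 1) for _ in range(5000)]
raw_right = skewness(right_skewed)
sqrt_right = skewness([math.sqrt(x) for x in right_skewed])  # weaker correction
log_right = skewness([math.log(x) for x in right_skewed])    # stronger correction

# Left-skewed sample on (0, 1) (assumed Beta(5, 1) for illustration).
left_skewed = [rng.betavariate(5, 1) for _ in range(5000)]
raw_left = skewness(left_skewed)
squared_left = skewness([x ** 2 for x in left_skewed])
cubed_left = skewness([x ** 3 for x in left_skewed])

print("right skew: raw, sqrt, log     ->", raw_right, sqrt_right, log_right)
print("left skew : raw, squared, cubed ->", raw_left, squared_left, cubed_left)
```

Logarithms and square roots need positive (or non-negative) inputs, so data containing zeros or negative values would need a shift or a different transformation first.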

__Important: If you log-transform the response variable, it is usually advisable to also log-transform the feature variables that are skewed.__



***

## Kurtosis

Kurtosis is all about the tails of the distribution, not its peakedness or flatness. In effect, it measures how prone the distribution is to producing outliers.

__High kurtosis (+ve excess kurtosis)__ in a data set is an indicator that the data has heavy tails or outliers. If the kurtosis is high, we need to investigate why there are so many outliers.

__Low kurtosis (-ve excess kurtosis)__ in a data set is an indicator that the data has light tails or a lack of outliers. If we get low kurtosis (too good to be true), we also need to investigate and trim the dataset of unwanted results.

This heaviness or lightness in the tails is judged relative to the normal distribution, and it often goes along with data that looks more peaked or flatter than the normal curve. The normal distribution has a kurtosis of 3, so if your values are close to that then your graph’s tails are nearly normal. These distributions are called mesokurtic. The +ve and -ve values above refer to excess kurtosis, which is kurtosis minus 3, so a normal distribution has an excess kurtosis of 0.
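To make the difference between kurtosis and excess kurtosis concrete, here is a minimal sketch that computes both from the fourth central moment. The helper `kurtosis` and its `excess` parameter are hypothetical names, and the three simulated samples (normal, uniform, Laplace) are assumptions picked to show normal, light and heavy tails.

```python
import random
from statistics import fmean


def kurtosis(values, excess=True):
    """Biased (population) kurtosis m4 / m2**2; subtract 3 for excess kurtosis."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    k = m4 / m2 ** 2
    return k - 3 if excess else k


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000

normal = [rng.gauss(0, 1) for _ in range(n)]       # mesokurtic reference
uniform = [rng.uniform(-1, 1) for _ in range(n)]   # light tails
# The difference of two exponential draws follows a Laplace distribution (heavy tails).
laplace = [rng.expovariate(1) - rng.expovariate(1) for _ in range(n)]

for name, sample in [("normal", normal), ("uniform", uniform), ("laplace", laplace)]:
    print(name, "kurtosis:", kurtosis(sample, excess=False),
          "excess:", kurtosis(sample))
```

When using a library routine, it is worth checking whether it reports kurtosis or excess kurtosis, since the two differ by exactly 3.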


> The main difference between skewness and kurtosis is that the skewness refers to the degree of symmetry, whereas the kurtosis refers to the degree of presence of outliers in the distribution.

Reference:
1. [Skew and Kurtosis: 2 Important Statistics terms you need to know in Data Science](https://codeburst.io/2-important-statistics-terms-you-need-to-know-in-data-science-skewness-and-kurtosis-388fef94eeaa)
2. [Transforming Skewed Data](https://towardsdatascience.com/transforming-skewed-data-73da4c2d0d16)
3. [How to deal with Skewed Dataset in Machine Learning?](https://becominghuman.ai/how-to-deal-with-skewed-dataset-in-machine-learning-afd2928011cc)
4. [Transformations and their need in ML](http://fmwww.bc.edu/repec/bocode/t/transint.html)
5. [Understanding Descriptive Statistics](https://towardsdatascience.com/understanding-descriptive-statistics-c9c2b0641291)
6. [Statistics How To - Kurtosis](https://www.statisticshowto.datasciencecentral.com/probability-and-statistics/statistics-definitions/kurtosis-leptokurtic-platykurtic/)

FYI

__Measure of central tendency:__ Mean, Median, Mode, AM, GM, HM

__Measure of dispersion:__ Range, Variance, Standard deviation

__Measure of shape:__ Skewness, Kurtosis