In [None]:
# numpy and seaborn are used as np and sns in the cells below
import numpy as np
import seaborn as sns
from scipy.stats import skew, skewtest, norm, kurtosis

## Skewness is of two types:


- **Positive skewness:** When the tail on the right side of the distribution is longer or fatter, the data is positively 
  skewed. For positively skewed data, typically **mean > median > mode**. Transformations that can reduce positive skewness:


     1. Log transformation
     2. Root transformation
        a. Square root transformation
        b. Cube root transformation
     3. Reciprocal transformation
     4. Box-Cox transformation
     
- **Negative skewness:** When the tail on the left side of the distribution is longer or fatter, the distribution 
  is negatively skewed. For negatively skewed data, typically **mean < median < mode**. Transformations that can reduce negative skewness:

     1. Square transformation
     2. Cube transformation
     3. Higher powers

  As a rule of thumb:
  
  * If the **skewness** is **between -0.5 and +0.5**, the data is **fairly symmetrical**.
  
  * If the **skewness** is **between -1 and -0.5 or between 0.5 and 1**, the data is **moderately skewed**.
  
  * If the **skewness** is **less than -1 or greater than +1**, the data is **heavily skewed**.
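The thresholds above can be sketched as a small helper. This is a minimal illustration, not part of the original notebook: the function name `describe_skewness` is hypothetical, the sample is synthetic, and treating the boundary values (exactly 0.5 or 1) as belonging to the milder category is an assumption, since the rule of thumb does not say.

```python
import numpy as np
import pandas as pd

def describe_skewness(value):
    """Label a skewness value using the rule-of-thumb thresholds.

    Assumption: boundary values (exactly 0.5 or 1) fall into the milder category.
    """
    magnitude = abs(value)
    if magnitude <= 0.5:
        return "fairly symmetrical"
    if magnitude <= 1:
        return "moderately skewed"
    return "heavily skewed"

# Synthetic right-skewed sample (hypothetical data, not the notebook's dataset)
rng = np.random.default_rng(0)
sample = pd.Series(rng.exponential(scale=2.0, size=1000))

sample_skew = sample.skew()
print("skewness:", sample_skew, "->", describe_skewness(sample_skew))

# For a right-skewed sample the mean usually sits above the median
print("mean:", sample.mean(), "median:", sample.median())
```

The same helper can be applied to the result of any `.skew()` call in the cells that follow.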

### Positive skewness

#### 1. Log Transformation

In [None]:
# performing the log transformation using numpy
log_mileage = np.log(data['mileage'])
log_mileage

In [None]:
# checking the skewness after the log-transformation
log_mileage.skew()

In [None]:
# visualising the log-transformed values with a density plot
sns.distplot(log_mileage, hist = True)

#### 2. Root Transformation
- 2.1 Square root transformation
- 2.2 Cube root transformation

In [None]:
# calculating the square root for data['mileage'] column
sqrt_mileage = np.sqrt(data['mileage'])
sqrt_mileage

In [None]:
# calculating the skewness after the square root transformation to see how it has changed
sqrt_mileage.skew()

In [None]:
# calculating the cube root of the data['mileage'] column
cube_root_mileage = np.cbrt(data['mileage'])
cube_root_mileage

In [None]:
# calculating the skewness after the cube root transformation
cube_root_mileage.skew()

#### 3. Reciprocal Transformation

In [None]:
# calculating the reciprocal of the data['mileage'] column
recipr_mileage = np.reciprocal(data['mileage'])
recipr_mileage

In [None]:
# calculating the skewness after the reciprocal transformation
recipr_mileage.skew()
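To see the transformations above side by side, the sketch below applies each of them to the same sample and collects the resulting skewness values. It is only an illustration: the data is a synthetic, strictly positive, right-skewed sample standing in for a column such as `data['mileage']`, and the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic, strictly positive, right-skewed sample (not the notebook's dataset)
rng = np.random.default_rng(0)
values = pd.Series(rng.lognormal(mean=3.0, sigma=0.8, size=1000))

# Apply each transformation covered above to the same sample
transforms = {
    "original": values,
    "log": np.log(values),
    "square root": np.sqrt(values),
    "cube root": np.cbrt(values),
    "reciprocal": np.reciprocal(values),
}

# Collect the skewness of every version in one Series for comparison
skewness = pd.Series({name: series.skew() for name, series in transforms.items()})
print(skewness)
```

Which transformation brings the skewness closest to 0 depends on the data, so it is worth comparing them on the actual column rather than assuming one always works best.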

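#### 4. Box-Cox Transformation

- Your **data must be strictly positive**: the Box-Cox transformation is not defined for negative values, and it does **not** work if the feature contains **0**.


- The **Box-Cox transformation** is controlled by a single parameter, **lambda**, which sets the power applied to the data (**lambda = 0** corresponds to the log transformation).


- You can import it from the SciPy library (`scipy.stats.boxcox`), but to check the skew with `.skew()` you'll need to convert the resulting NumPy array to a pandas Series.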
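The steps described above might look like the following sketch. It uses `scipy.stats.boxcox`, which returns the transformed array together with the fitted lambda when no lambda is supplied. The input here is a synthetic, strictly positive sample used as a stand-in for a column such as `data['mileage']`, and the variable names are hypothetical.

```python
import numpy as np
import pandas as pd
from scipy.stats import boxcox

# Synthetic, strictly positive, right-skewed sample (not the notebook's dataset)
rng = np.random.default_rng(0)
values = pd.Series(rng.lognormal(mean=3.0, sigma=0.8, size=1000))

# With no lambda supplied, boxcox fits lambda and returns (transformed array, lambda)
transformed, fitted_lambda = boxcox(values.to_numpy())

# Convert the NumPy array back to a pandas Series so that .skew() is available
boxcox_values = pd.Series(transformed, index=values.index)

print("fitted lambda:", fitted_lambda)
print("skewness before:", values.skew())
print("skewness after:", boxcox_values.skew())
```

If the column contains zeros or negative values, `boxcox` raises an error, so such data would need to be shifted or handled with a different transformation first.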

### Negative skewness

#### 1. Squares Transformation

In [None]:
# calculating the square of the df['Marks'] column
Square_marks = np.square(df['Marks'])
Square_marks

In [None]:
# checking the skewness
Square_marks.skew()

In [None]:
# plotting the density and histogram plot
sns.distplot(Square_marks, hist=True)

#### 2. Cubes Transformation

In [None]:
# calculating the cube of the df['Marks'] column
cube_marks = np.power(df['Marks'], 3)
cube_marks

In [None]:
# calculating the skewness
cube_marks.skew()

In [None]:
# plotting the density and histogram plot
sns.distplot(cube_marks, hist= True)

#### 3. Higher Powers

In [None]:
# calculating a higher power (power = 4) of the df['Marks'] column
higher_power_4 = np.power(df['Marks'], 4)
higher_power_4

In [None]:
# calculating the skewness
higher_power_4.skew()

In [None]:
# plotting the density and histogram
sns.distplot(higher_power_4, hist = True)

In [None]:
# applying a higher power (power = 5) and calculating the skewness
higher_power_5 = np.power(df['Marks'], 5)
higher_power_5.skew()

In [None]:
# applying a higher power (power = 6) and calculating the skewness
higher_power_6 = np.power(df['Marks'], 6)
higher_power_6.skew()
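The power transformations above can also be compared in a single loop. The sketch below is an illustration on synthetic data: the left-skewed sample of scores between 0 and 100 is a hypothetical stand-in for `df['Marks']`, and the names `marks` and `skew_by_power` are assumptions.

```python
import numpy as np
import pandas as pd

# Synthetic left-skewed scores in the range 0-100 (not the notebook's dataset)
rng = np.random.default_rng(0)
marks = pd.Series(100 * rng.beta(a=5.0, b=1.5, size=1000))

# Skewness of the original values (power 1) and of powers 2 to 6
skew_by_power = pd.Series(
    {power: np.power(marks, power).skew() for power in range(1, 7)}
)
skew_by_power.index.name = "power"
print(skew_by_power)
```

Raising the power too far can push the skewness past 0 and make the data positively skewed, so the power whose skewness is closest to 0 is usually the one to keep.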