# Set Up

In [None]:
using DataFrames, Normalize

# Main

Get the path of the desired dataset.
- Windows
    1. Right-click the file.
    2. Click Properties.
    3. Copy the path in Location.
- Mac
    1. Right-click the file.
    2. Click Get Info.
    3. Copy the path in Where.

Type the path between the first set of quotes. If the dataset is an Excel or OpenDocument spreadsheet, type the sheet name between the second set of quotes.

In [None]:
df = tabular_to_dataframe("<dataset directory>", "<sheet name>")
describe(df)

Choose the variables to analyze, including any grouping variables.

List each chosen variable, prefixed with a colon, between the innermost brackets.

In [None]:
# Select all rows of the chosen columns; current DataFrames.jl requires the row index (:).
sample = df[:, [:variable1, :variable2,]]
describe(sample)

Convert columns containing strings to floating-point numbers (decimals).

In [None]:
string_to_float!(sample);
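
As an illustration of what such a conversion involves, the following minimal sketch parses a vector of strings with Base Julia's `tryparse`. It is not the implementation of `string_to_float!`; the sample values and the choice to map unparsable entries to `NaN` are assumptions made for this example.

```julia
# Hypothetical raw column values (illustrative only).
raw = ["1.5", "2", "abc"]

# tryparse returns `nothing` when a string is not a valid number;
# `something` then falls back to NaN (an assumption for this sketch).
parsed = [something(tryparse(Float64, s), NaN) for s in raw]
```

Here `parsed` is a `Vector{Float64}` whose last entry is `NaN` because `"abc"` cannot be parsed.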

Replace each `missing` in the data with `NaN` so that skewness and kurtosis can be calculated.

In [None]:
missing_to_nan!(sample);
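
The effect of this replacement can be sketched on a single column with Base Julia's `coalesce`. This is only an illustration under assumed data, not necessarily how `missing_to_nan!` works internally.

```julia
# Hypothetical column with a missing entry (illustrative only).
col = [1.5, missing, 3.0]

# coalesce returns its first non-missing argument, so broadcasting it
# swaps each missing for NaN and leaves the other values unchanged.
filled = coalesce.(col, NaN)
```

The result no longer has `Missing` in its element type, so functions that only accept plain floating-point vectors can consume it.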

Optionally, group the data. Prefix each grouping variable with a colon between the brackets.

If you run the cell below, replace `sample` with `gd` in later cells.

In [None]:
gd = groupby(sample, [:grouping1,])

Display details about skewness and kurtosis of the data.

In [None]:
print_skewness_kurtosis(sample)
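
For reference, skewness and excess kurtosis can be sketched from central moments using only the `Statistics` standard library. The helper names `moment_skewness` and `excess_kurtosis` are hypothetical, and the population (biased) formulas used here are an assumption; `print_skewness_kurtosis` may use a different estimator.

```julia
using Statistics

# Skewness as the third central moment divided by the
# second central moment raised to the power 3/2.
function moment_skewness(x)
    m = mean(x)
    m2 = mean((x .- m) .^ 2)
    m3 = mean((x .- m) .^ 3)
    return m3 / m2^1.5
end

# Excess kurtosis: fourth central moment over the squared second
# central moment, minus 3 (the value for a normal distribution).
function excess_kurtosis(x)
    m = mean(x)
    m2 = mean((x .- m) .^ 2)
    m4 = mean((x .- m) .^ 4)
    return m4 / m2^2 - 3
end

symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]      # illustrative data
right_skewed = [1.0, 1.0, 1.0, 10.0]       # illustrative data

moment_skewness(symmetric)     # zero for symmetric data
moment_skewness(right_skewed)  # positive for a long right tail
```

A positive skewness points toward the transformations under Positive Skew, and a negative one toward those under Negative Skew.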

Attempt once to normalize the data.

In [None]:
results = normalize(sample)
print_findings(results)

## Transformations

_min_ – minimum value in the data <br> 
_max_ – maximum value in the data

### Positive Skew

square root: $\sqrt{x}$


add and square root: $\sqrt{x + 1 - min}$


invert: $\frac{1}{x}$


add and invert: $\frac{1}{x + 1 - min}$


square and invert: $\frac{1}{x^2}$


add, square, and invert: $\frac{1}{x^2 + 1 - min^2}$


square root and invert: $\frac{1}{\sqrt{x}}$


add, square root, and invert: $\frac{1}{\sqrt{x + 1 - min}}$


square root, add, and invert: $\frac{1}{\sqrt{x} + 1 - \sqrt{min}}$


log base 10: $\log_{10}(x)$


add and log base 10: $\log_{10}(x + 1 - min)$


natural log: $\ln(x)$


add and natural log: $\ln(x + 1 - min)$
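
The "add" variants shift the data by $1 - min$ so that the smallest value becomes 1 before the root, logarithm, or inverse is taken. The sketch below applies a few of the formulas above to assumed sample data with broadcasting; the variable names are hypothetical and this is not the package's own code.

```julia
# Illustrative data containing a negative value and a zero,
# for which sqrt(x) or log10(x) alone would fail or be infinite.
x = [-3.0, 0.0, 5.0, 21.0]
lo = minimum(x)

# Shift so that the minimum maps to 1.
shifted = x .+ 1 .- lo          # [1.0, 4.0, 9.0, 25.0]

add_sqrt = sqrt.(shifted)       # add and square root
add_invert = 1 ./ shifted       # add and invert
add_log10 = log10.(shifted)     # add and log base 10
```

After the shift every value is at least 1, so the square root and logarithm are defined and the inverse never divides by zero.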

### Negative Skew

square: $x^2$

cube: $x^3$

antilog: $10^x$

reflect and invert: $\frac{1}{max + 1 - x}$

reflect and square root: $\sqrt{max + 1 - x}$

reflect and log base 10: $\log_{10}(max + 1 - x)$
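
The "reflect" variants first compute $max + 1 - x$, which flips the data so that the largest value becomes 1. A minimal sketch on assumed sample data, with hypothetical variable names:

```julia
# Illustrative data with a long left tail.
x = [0.0, 16.0, 21.0, 24.0]
hi = maximum(x)

# Reflect so that the maximum maps to 1 and smaller values grow.
reflected = hi .+ 1 .- x            # [25.0, 9.0, 4.0, 1.0]

reflect_sqrt = sqrt.(reflected)     # reflect and square root
reflect_invert = 1 ./ reflected     # reflect and invert
reflect_log10 = log10.(reflected)   # reflect and log base 10
```

Reflection turns the long left tail into a right tail, which the root, logarithm, or inverse then compresses.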

### Stretch Skew

logit: $\log_{10}\left|\frac{x}{1 - x}\right|$

add and logit: $\log_{10}\left|\frac{x + 0.25}{1 - (x + 0.25)}\right|$
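
As a rough sketch of the two formulas above, the following defines them as one-line Julia functions. The names `logit10` and `add_logit10` are hypothetical and are not taken from the package.

```julia
# logit: base-10 log of the absolute odds x / (1 - x).
logit10(x) = log10(abs(x / (1 - x)))

# add and logit: shift x by 0.25 before applying the logit.
add_logit10(x) = logit10(x + 0.25)

logit10(0.5)      # 0.0, since the odds are exactly 1
logit10(0.0)      # -Inf, since the odds are 0
add_logit10(0.0)  # finite, equal to log10(0.25 / 0.75)
```

The shift keeps the result finite at `x = 0`, but it moves the singularity rather than removing it: the shifted form divides by zero at `x = 0.75`.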